Add test_workflow_document_updated_does_not_overwrite_filename to
verify that run_workflows (DOCUMENT_UPDATED path) does not revert a
DB filename that was updated by a concurrent bulk_update_documents
task's update_filename_and_move_files call.
The test replicates the race window by:
- Updating the DB filename directly (simulating BUD-1 completing)
- Mocking refresh_from_db so the stale in-memory filename persists
- Asserting the DB filename is not clobbered after run_workflows
Relates to: https://github.com/paperless-ngx/paperless-ngx/issues/12386
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Uses a custom transport to resolve the slim chance of a DNS rebinding affecting the webhook
* Fix WebhookTransport hostname resolution and validation
* Fix test failures
* Lint
* Keep all internal logic inside WebhookTransport
* Fix test failure
* Update handlers.py
* Update handlers.py
---------
Co-authored-by: Trenton H <797416+stumpylog@users.noreply.github.com>
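The DNS rebinding concern above can be illustrated with a minimal sketch: resolve the hostname once, validate every returned address, and then connect to the validated address rather than resolving again. This is not the actual `WebhookTransport` code; `resolve_and_validate` and the injectable `resolver` parameter are hypothetical names used only for illustration, and rejecting every non-global address is an assumption about the policy.

```python
import ipaddress
import socket
from typing import Callable, List


def _system_resolver(hostname: str) -> List[str]:
    """Resolve a hostname to its IP addresses using the system resolver."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def resolve_and_validate(
    hostname: str,
    resolver: Callable[[str], List[str]] = _system_resolver,
) -> str:
    """Resolve once, reject non-public addresses, return the IP to connect to.

    Connecting to the returned IP (instead of resolving the hostname a
    second time) closes the window in which a rebinding attack could swap
    the DNS answer between validation and connection.
    """
    addresses = resolver(hostname)
    if not addresses:
        raise ValueError(f"could not resolve {hostname}")
    for address in addresses:
        # Assumed policy: any loopback, private, or link-local answer is rejected.
        if not ipaddress.ip_address(address).is_global:
            raise ValueError(f"{hostname} resolves to non-public address {address}")
    return addresses[0]
```

The resolver is passed in as a parameter so the validation logic can be exercised without network access.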
* tests: general cleanup and fixes for running under docker
This now allows tests to be run under a locally built or production
docker image with something like:
`docker run --rm -v $PWD:/usr/src/paperless --entrypoint=bash paperlessngx/paperless-ngx:latest -c "uv run pytest"`
Specific fixes:
- fix unreachable code around `assertRaises` blocks
- fix `assertInt` typos
- fix `str(e)` vs `str(e.exception)` issues
- skip permission-based checks when root (in a docker container)
- catch `OSError` problems when instantiating `INotify` and
skip inotify-based tests when it's unavailable.
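As an illustration of the `assertRaises` fixes listed above, here is a minimal standalone sketch. The `parse_port` helper and the test class are hypothetical and do not come from the paperless-ngx test suite; they only show the two pitfalls in isolation.

```python
import unittest


def parse_port(value: str) -> int:
    """Hypothetical helper used only for this illustration."""
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class TestParsePort(unittest.TestCase):
    def test_out_of_range(self):
        with self.assertRaises(ValueError) as ctx:
            parse_port("70000")
            # Anything placed here is unreachable: the call above raises.

        # Assertions belong after the block. The raised exception is
        # available as ctx.exception; str(ctx) is only the repr of the
        # context manager and never contains the message.
        self.assertIn("out of range", str(ctx.exception))
        self.assertNotIn("out of range", str(ctx))
```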
* Reverts most files to dev while keeping the exception assert fixes
---------
Co-authored-by: Trenton H <797416+stumpylog@users.noreply.github.com>
This alters the retry/backoff logic in the init-wait-for-db script to be more
optimistic about database availability. During regular deployment and
operations of paperless-ngx, it's common to restart the application server with
the database instance already running, so we should optimize for this case.
Instead of unconditionally delaying 5 seconds between each connection attempt,
start with a minimum delay of 1 second and increase the delay linearly with
each attempt, maxing out at 10 seconds. This makes the retry count-based
failure mode less practical, so instead we just use a timeout-based approach.*
*NOTE: the original implementation would have an effective timeout of 25s. This
alters the behavior to 60s.
Additionally, this removes an unnecessary 5s delay that was injected in the
postgres case. The script uses a more comprehensive connection check for
postgres than it does mariadb, so if anything this 5s delay after getting an
"ok" response from the DB was extra unnecessary in the postgres case.
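The retry behaviour described above can be sketched roughly as follows. This is a Python illustration, not the script itself: `wait_for_db`, its parameters, and the 1-second step per attempt are assumptions (the message only says the delay grows linearly from 1 second to a 10-second cap within a 60-second timeout).

```python
import time
from typing import Callable


def wait_for_db(
    check: Callable[[], bool],
    timeout: float = 60.0,
    min_delay: float = 1.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it succeeds or `timeout` seconds have passed."""
    deadline = clock() + timeout
    attempt = 0
    while True:
        # Try immediately: the database is usually already running.
        if check():
            return True
        attempt += 1
        # Linear backoff: 1s, 2s, 3s, ... capped at max_delay.
        delay = min(min_delay * attempt, max_delay)
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the overall deadline.
        sleep(min(delay, remaining))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.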
* chore(devcontainer): drop read-only host .gitconfig bind mount
The bind mount prevented adjusting git config inside the dev container, and VS Code Dev Containers already copies the host .gitconfig automatically, making the mount unnecessary. This restores the ability to manage git settings within the container.
* chore(gitignore): ignore .pnpm-store folder for pnpm package management
Add .pnpm-store/ to .gitignore to prevent local pnpm package store from being tracked by git when using the devcontainer.
* docs(development): clarify VS Code devcontainer setup steps for Windows
Add instructions on how to overcome some issues caused by using Windows as the host system.
This helps prevent excessive processing times on very large documents
by limiting the text analyzed during date parsing, tag prediction,
and correspondent matching.
If a document exceeds 1.2M characters, its text is cropped to 1M characters.
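A minimal sketch of that limit, assuming the crop keeps the beginning of the text (the message does not say which part is kept). The constant and function names are hypothetical.

```python
# Thresholds taken from the commit message above.
CROP_THRESHOLD_CHARS = 1_200_000
CROPPED_LENGTH_CHARS = 1_000_000


def limit_text_for_analysis(text: str) -> str:
    """Return text unchanged unless it exceeds the threshold, else crop it."""
    if len(text) <= CROP_THRESHOLD_CHARS:
        return text
    # Assumption: keep the leading characters; the real code may differ.
    return text[:CROPPED_LENGTH_CHARS]
```

The gap between the two numbers means a document slightly over 1M characters is left untouched, so only clearly oversized documents are cropped.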
* Security: prevent XSS with storage path template rendering
* Security: prevent XSS svg uploads
* Security: force attachment disposition for logo
* Add suggestions from code review
* Improve SVG validation with allowlist for tags and attributes
* Explicitly set the HOME environment for the migrations to fix issue with certificates
* Defines the HOME globally when we're running as root for startup
### ⚠️ Please remember: issues are for *bugs* only! ⚠️
That is, something you believe affects every single user of Paperless-ngx, not just you. If you're not sure, start with one of the other options below.
That is, something you believe affects every single user of Paperless-ngx (and the demo, for example), not just you. If you are not sure, start with one of the other options below.
Also, note that **Paperless-ngx does not perform OCR or archive file creation itself**, those are handled by other tools. Problems with OCR or archive versions of specific files should likely be raised 'upstream', see https://github.com/ocrmypdf/OCRmyPDF/issues or https://github.com/tesseract-ocr/tesseract/issues
  - type: markdown
@@ -59,6 +59,12 @@ body:
      label: Browser logs
      description: Logs from the web browser related to your issue, if needed
      render: bash
  - type: textarea
    id: logs_services
    attributes:
      label: Services logs
      description: Logs from other services (or containers) related to your issue, if needed. For example, the database or redis logs.
@@ -35,8 +35,8 @@ NOTE: PRs that do not address the following will not be merged, please do not sk
- [ ] I have read & agree with the [contributing guidelines](https://github.com/paperless-ngx/paperless-ngx/blob/main/CONTRIBUTING.md).
- [ ] If applicable, I have included testing coverage for new code in this PR, for [backend](https://docs.paperless-ngx.com/development/#testing) and / or [front-end](https://docs.paperless-ngx.com/development/#testing-and-code-style) changes.
- [ ] If applicable, I have tested my code for new features & regressions on both mobile & desktop devices, using the latest version of major browsers.
- [ ] If applicable, I have tested my code for breaking changes & regressions on both mobile & desktop devices, using the latest version of major browsers.
- [ ] If applicable, I have checked that all tests pass, see [documentation](https://docs.paperless-ngx.com/development/#back-end-development).
- [ ] I have run all `pre-commit` hooks, see [documentation](https://docs.paperless-ngx.com/development/#code-formatting-with-pre-commit-hooks).
- [ ] I have made corresponding changes to the documentation as needed.
- [ ] I have checked my modifications for any breaking changes.
- [ ] In the description of the PR above I have disclosed the use of AI tools in the coding of this PR.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. See our [contributing guidelines](https://github.com/paperless-ngx/paperless-ngx/blob/dev/CONTRIBUTING.md#automatic-repository-maintenance) for more details.
days-before-pr-stale: 14
days-before-pr-close: 7
stale-pr-message: ""
stale-pr-label: stale
exempt-pr-labels: 'notable'
close-pr-message: >
  This pull request has been automatically closed because it has not had recent activity. Thank you for your contributions. Please open a new pull request or discussion if you would like to continue working on this change.
lock-threads:
  name: 'Lock Old Threads'
  if: github.repository_owner == 'paperless-ngx'
@@ -42,26 +44,20 @@ jobs:
discussion-inactive-days: '30'
log-output: true
issue-comment: >
  This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns. See our [contributing guidelines](https://github.com/paperless-ngx/paperless-ngx/blob/dev/CONTRIBUTING.md#automatic-repository-maintenance) for more details.
pr-comment: >
  This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns. See our [contributing guidelines](https://github.com/paperless-ngx/paperless-ngx/blob/dev/CONTRIBUTING.md#automatic-repository-maintenance) for more details.
discussion-comment: >
  This discussion has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion for related concerns. See our [contributing guidelines](https://github.com/paperless-ngx/paperless-ngx/blob/dev/CONTRIBUTING.md#automatic-repository-maintenance) for more details.
If you feel like contributing to the project, please do! Bug fixes and improvements are always welcome.
⚠️ Please note: Pull requests that implement a new feature or enhancement _should almost always target an existing feature request_ with evidence of community interest and discussion. This is in order to balance the work of implementing and maintaining new features / enhancements. Pull requests that are opened without meeting this requirement may not be merged.
If you want to implement something big:
- Please start a discussion about that in the issues! Maybe something similar is already in development and we can make it happen together.
- As above, please start with a discussion! Maybe something similar is already in development and we can make it happen together.
- When making additions to the project, consider if the majority of users will benefit from your change. If not, you're probably better off forking the project.
- Also consider if your change will get in the way of other users. A good change is a change that enhances the experience of some users who want that change and does not affect users who do not care about the change.
- Please see the [paperless-ngx merge process](#merging-prs) below.
@@ -37,6 +39,8 @@ Before you can run `pytest`, ensure to [properly set up your local environment](
Once you have submitted a **P**ull **R**equest it will be reviewed, approved, and merged by one or more community members of any team. Automated code tests and formatting checks must be passed.
Important: Pull requests that implement a new feature or enhancement _should almost always target an existing feature request_ with evidence of community interest and discussion. This is in order to balance the work of implementing and maintaining new features / enhancements. Instead of opening a PR which does not meet this requirement, please open a feature request instead, to gather feedback from both users and the project maintainers.
## Non-Trivial Requests
PRs deemed `non-trivial` will go through a stricter review process before being merged into `dev`. This is to ensure code quality and complete functionality (free of side effects).
@@ -81,7 +85,7 @@ Some notes about translation:
If a language has already been added, and you would like to contribute new translations or change existing translations, please read the "Translation" section in the README.md file for further details on that.
If you would like the project to be translated to another language, first head over to https://crwd.in/paperless-ngx to check if that language has already been enabled for translation.
If you would like the project to be translated to another language, first head over to https://crowdin.com/project/paperless-ngx to check if that language has already been enabled for translation.
If not, please request the language to be added by creating an issue on GitHub. The issue should contain:
- English name of the language (the localized name can be added on Crowdin).
@@ -109,28 +113,12 @@ Paperless-ngx is a community project. We do our best to delegate permission and
## Structure
As of writing, there are 21 members in paperless-ngx. 4 of these people have complete administrative privileges to the repo:
There are currently 2 members in paperless-ngx with complete administrative privileges to the repo:
There are 5 teams collaborating on specific tasks within paperless-ngx:
There are other members who occasionally contribute but we are actively seeking more dedicated maintainers of the project. Please reach out if you are interested.
- @paperless-ngx/test (General testing for larger PRs)
## Permissions
All team members are notified when mentioned or assigned to a relevant issue or pull request. Additionally, each team has slightly different access to paperless-ngx:
- The **test** team has no special permissions.
- The **issues** team has `triage` access. This means they can organize issues and pull requests.
- The **backend**, **frontend**, and **ci-cd** teams have `write` access. This means they can approve PRs and push code, containers, releases, and more.
## Joining
@@ -141,13 +129,13 @@ The admins occasionally invite contributors directly if we believe having them o
# Automatic Repository Maintenance
The Paperless-ngx team appreciates all effort and interest from the community in filing bug reports, creating feature requests, sharing ideas and helping other
community members. That said, in an effort to keep the repository organized and managebale the project uses automatic handling of certain areas:
community members. That said, in an effort to keep the repository organized and manageable the project uses automatic handling of certain areas:
- Issues that cannot be reproduced will be marked 'stale' after 7 days of inactivity and closed after 14 further days of inactivity.
- Issues, pull requests and discussions that are closed will be locked after 30 days of inactivity.
- Discussions with a marked answer will be automatically closed.
- Discussions in the 'General' or 'Support' categories will be closed after 180 days of inactivity.
- Feature requests that do not meet the following thresholds will be closed: 180 days of inactivity, < 5 "up-votes" after 180 days, < 20 "up-votes" after 1 year or < 80 "up-votes" at 2 years.
- Feature requests that do not meet the following thresholds will be closed: 180 days of inactivity with less than 80 "up-votes", < 5 "up-votes" after 180 days, < 20 "up-votes" after 1 year or < 40 "up-votes" at 2 years.
In all cases, threads can be re-opened by project maintainers and, of course, users can always create a new discussion for related concerns.
Finally, remember that all information remains searchable and 'closed' feature requests can still serve as inspiration for new features.
@@ -83,7 +83,7 @@ People interested in continuing the work on paperless-ngx are encouraged to reac
## Translation
Paperless-ngx is available in many languages that are coordinated on Crowdin. If you want to help out by translating paperless-ngx into your language, please head over to https://crwd.in/paperless-ngx, and thank you! More details can be found in [CONTRIBUTING.md](https://github.com/paperless-ngx/paperless-ngx/blob/main/CONTRIBUTING.md#translating-paperless-ngx).
Paperless-ngx is available in many languages that are coordinated on Crowdin. If you want to help out by translating paperless-ngx into your language, please head over to https://crowdin.com/project/paperless-ngx, and thank you! More details can be found in [CONTRIBUTING.md](https://github.com/paperless-ngx/paperless-ngx/blob/main/CONTRIBUTING.md#translating-paperless-ngx).
In general, paperless does not require a specific version of PostgreSQL or MariaDB and it is
Paperless-ngx is compatible with Django-supported versions of PostgreSQL and MariaDB and it is generally
safe to update them to newer versions. However, you should always take a backup and follow
the instructions from your database's documentation for how to upgrade between major versions.
!!! note
As of Paperless-ngx v2.18, the minimum supported version of PostgreSQL is 14.
For PostgreSQL, refer to [Upgrading a PostgreSQL Cluster](https://www.postgresql.org/docs/current/upgrading.html).
For MariaDB, refer to [Upgrading MariaDB](https://mariadb.com/kb/en/upgrading/)
@@ -306,7 +310,7 @@ in dedicated folders according to their nature: `archive`, `originals`,
If `-sm` or `--split-manifest` is provided, information about document
will be placed in individual json files, instead of a single JSON file. The main
manifest.json will still contain application wide information (e.g. tags, correspondent,
documenttype, etc)
If `-z` or `--zip` is provided, the export will be a zip file
in the target directory, named according to the current local date or the
@@ -333,7 +337,7 @@ must be provided to import. If this value is lost, the export cannot be imported
The document importer takes the export produced by the [Document
exporter](#exporter) and imports it into paperless.
The importer works just like the exporter. You point it at a directory,
The importer works just like the exporter. You point it at a directory or the generated .zip file,
and the script does the rest of the work:
```shell
```shell
@@ -351,9 +355,6 @@ When you use the provided docker compose script, put the export inside
the `export` folder in your paperless source directory. Specify
`../export` as the `source`.
Note that .zip files (as can be generated from the exporter) are not supported. You must unzip them into
the target directory first.
!!! note
Importing from a previous version of Paperless may work, but for best
@@ -460,6 +461,22 @@ of the index and usually makes queries faster and also ensures that the
autocompletion works properly. This command is regularly invoked by the
task scheduler.
### Clearing the database read cache
If the database read cache is enabled, **you must run this command** after making any changes to the database outside the application context.
This includes operations such as restoring a database backup or executing SQL statements like UPDATE, INSERT, DELETE, ALTER, CREATE, or DROP.
Failing to invalidate the cache after such modifications can lead to stale data being served from the cache, and **may cause data corruption** or inconsistent behavior in the application.
Use the following management command to clear the cache:
```
python3 manage.py invalidate_cachalot
```
!!! info
The database read cache is based on Django-Cachalot. You can refer to their [documentation](https://django-cachalot.readthedocs.io/en/latest/quickstart.html#manage-py-command).
| `DOCUMENT_ID` | Database primary key of the document |
| `DOCUMENT_FILE_NAME` | Formatted filename, not including paths |
| `DOCUMENT_TYPE` | The document type (if any) |
| `DOCUMENT_CREATED` | Date & time when document created |
| `DOCUMENT_MODIFIED` | Date & time when document was last modified |
| `DOCUMENT_ADDED` | Date & time when document was added |
@@ -430,8 +431,138 @@ This allows for complex logic to be included in the format, including [logical s
and [filters](https://jinja.palletsprojects.com/en/3.1.x/templates/#id11) to manipulate the [variables](#filename-format-variables)
provided. The template is provided as a string, potentially multiline, and rendered into a single line.
In addition, the entire Document instance is available to be utilized in a more advanced way, as well as some variables which only make sense to be accessed
In addition, a limited `document` object is available for advanced templates.
with more complex logic.
This object includes common metadata fields such as `id`, `pk`, `title`, `content`, `page_count`, `created`, `added`, `modified`, `mime_type`,
`checksum`, `archive_checksum`, `archive_serial_number`, `filename`, `archive_filename`, and `original_filename`.
Related values are available as nested objects with limited fields, for example `document.correspondent.name`.
#### Custom Jinja2 Filters
##### Custom Field Access
The `get_cf_value` filter retrieves a value from custom field data with optional default fallback.
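To make the "optional default fallback" concrete, here is a plain-Python sketch of what such a filter might look like. The argument order and the shape of the custom field data (a mapping from field name to a dict with a `"value"` key) are assumptions for illustration and may not match the real filter.

```python
from typing import Any, Mapping, Optional


def get_cf_value(
    custom_fields: Mapping[str, Mapping[str, Any]],
    name: str,
    default: Optional[str] = None,
) -> Optional[Any]:
    """Return the value of the named custom field, or `default` if unset."""
    field = custom_fields.get(name)
    # Fall back when the field is missing entirely or has no value.
    if field is not None and field.get("value") is not None:
        return field["value"]
    return default
```

With this shape, a template can render a placeholder for documents that lack the field instead of failing or printing `None`.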
Paperless is able to utilize barcodes for automatically performing some tasks.
Paperless is able to utilize barcodes for automatically performing some tasks. Barcodes are only supported for PDF documents or TIFF, [if enabled](configuration.md#PAPERLESS_CONSUMER_BARCODE_TIFF_SUPPORT).
At this time, the library utilized for detection of barcodes supports the following types:
: Caches the database read query results into Redis. This can significantly improve application response times by caching database queries, at the cost of slightly increased memory usage.
Defaults to `false`.
!!! danger
**Do not modify the database outside the application while it is running.**
This includes actions such as restoring a backup, upgrading the database, or performing manual inserts. All external modifications must be done **only when the application is stopped**.
After making any such changes, you **must invalidate the DB read cache** using the `invalidate_cachalot` management command.
: Specifies how long (in seconds) read data should be cached.
Allowed values are between `1` (one second) and `31536000` (one year). Defaults to `3600` (one hour).
!!! warning
A high TTL increases memory usage over time. Memory may be used until end of TTL, even if the cache is invalidated with the `invalidate_cachalot` command.
In case of an out-of-memory (OOM) situation, Redis may stop accepting new data — including cache entries, scheduled tasks, and documents to consume.
If your system has limited RAM, consider configuring a dedicated Redis instance for the read cache, with a memory limit and the eviction policy set to `allkeys-lru`.
For more details, refer to the [Redis eviction policy documentation](https://redis.io/docs/latest/develop/reference/eviction/), and see the `PAPERLESS_READ_CACHE_REDIS_URL` setting to specify a separate Redis broker.
Specifies which language Paperless should use when parsing dates from documents.
This should be a language code supported by the dateparser library,
for example: "en", or a combination such as "en+de".
Locales are also supported (e.g., "en-AU").
Multiple languages can be combined using "+", for example: "en+de" or "en-AU+de".
For valid values, refer to the list of supported languages and locales in the [dateparser documentation](https://dateparser.readthedocs.io/en/latest/supported_locales.html).
Set this to match the languages in which most of your documents are written.
If not set, Paperless will attempt to infer the language(s) from the OCR configuration (`PAPERLESS_OCR_LANGUAGE`).
!!! note
This format differs from the `PAPERLESS_OCR_LANGUAGE` setting, which uses ISO 639-2 codes (3 letters, e.g., "eng+deu" for Tesseract OCR).
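A short sketch of how a value such as `en-AU+de` could be split into plain language codes and locales. The function name is hypothetical, and treating any code containing `-` as a locale is an assumption; the dateparser library does accept separate `languages` and `locales` lists, but how Paperless passes them is not shown here.

```python
from typing import List, Tuple


def split_date_languages(setting: str) -> Tuple[List[str], List[str]]:
    """Split "en-AU+de" style values into (languages, locales)."""
    languages: List[str] = []
    locales: List[str] = []
    for code in setting.split("+"):
        code = code.strip()
        if not code:
            continue
        # Assumption: a hyphen marks a locale such as "en-AU".
        target = locales if "-" in code else languages
        if code not in target:
            target.append(code)
    return languages, locales
```

For example, `split_date_languages("en-AU+de")` yields `de` as a language and `en-AU` as a locale, whereas the OCR setting would express similar intent with three-letter codes such as `eng+deu`.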
- use the included `scripts/start_services.sh` to use docker to fire
- use the included `scripts/start_services.sh` to use Docker to fire
up a redis instance (and some other services such as tika,
up a Redis instance (and some other services such as Tika,
gotenberg and a database server) or
Gotenberg and a database server) or
- spin up a bare redis container
- spin up a bare Redis container
```
```
docker run -d -p 6379:6379 --restart unless-stopped redis:latest
docker run -d -p 6379:6379 --restart unless-stopped redis:latest
@@ -147,7 +147,7 @@ $ ng build --configuration production
### Testing
- Run `pytest` in the `src/` directory to execute all tests. This also
generates a HTML coverage report. When runnings test, `paperless.conf`
generates a HTML coverage report. When running tests, `paperless.conf`
is loaded as well. However, the tests rely on the default
configuration. This is not ideal. But for now, make sure no settings
except for DEBUG are overridden when testing.
@@ -338,13 +338,13 @@ LANGUAGES = [
## Building the documentation
The documentation is built using material-mkdocs, see their [documentation](https://squidfunk.github.io/mkdocs-material/reference/).
The documentation is built using Zensical, see their [documentation](https://zensical.org/docs/).
If you want to build the documentation locally, this is how you do it:
1. Build the documentation
```bash
```bash
$ uv run mkdocs build --config-file mkdocs.yml
$ uv run zensical build
```
```
_alternatively..._
@@ -355,10 +355,10 @@ If you want to build the documentation locally, this is how you do it:
something.
```bash
```bash
$ uv run mkdocs serve
$ uv run zensical serve
```
```
## Building the Docker image
## Building the Docker image {#docker_build}
The docker image is primarily built by the GitHub actions workflow, but
it can be faster when developing to build and tag an image locally.
@@ -470,9 +470,14 @@ To get started:
2. VS Code will prompt you with "Reopen in container". Do so and wait for the environment to start.
3. Initialize the project by running the task **Project Setup: Run all Init Tasks**. This
3. In case your host operating system is Windows:
- The Source Control view in Visual Studio Code might show: "The detected Git repository is potentially unsafe as the folder is owned by someone other than the current user." Use "Manage Unsafe Repositories" to fix this.
- Git might have detected modifications for all files, because Windows uses CRLF line endings. Run `git checkout .` in the container's terminal to fix this issue.
4. Initialize the project by running the task **Project Setup: Run all Init Tasks**. This
will initialize the database tables and create a superuser. Then you can compile the front end
will initialize the database tables and create a superuser. Then you can compile the front end
for production or run the frontend in debug mode.
for production or run the frontend in debug mode.
4. The project is ready for debugging: start either the fullstack debug or individual debug
5. The project is ready for debugging: start either the fullstack debug or individual debug
processes. To spin up the project without debugging, run the task **Project Start: Run all Services**.
processes. To spin up the project without debugging, run the task **Project Start: Run all Services**.
@@ -30,7 +34,7 @@ physical documents into a searchable online archive so you can keep, well, _less
- Utilizes the open-source Tesseract engine to recognize more than 100 languages.
- Utilizes the open-source Tesseract engine to recognize more than 100 languages.
- Documents are saved as PDF/A format which is designed for long term storage, alongside the unaltered originals.
- Documents are saved as PDF/A format which is designed for long term storage, alongside the unaltered originals.
- Uses machine-learning to automatically add tags, correspondents and document types to your documents.
- Uses machine-learning to automatically add tags, correspondents and document types to your documents.
- Supports PDF documents, images, plain text files, Office documents (Word, Excel, Powerpoint, and LibreOffice equivalents)[^1] and more.
- Supports PDF documents, images, plain text files, Office documents (Word, Excel, PowerPoint, and LibreOffice equivalents)[^1] and more.
- Paperless stores your documents plain on disk. Filenames and folders are managed by paperless and their format can be configured freely with different configurations assigned to different documents.
- Paperless stores your documents plain on disk. Filenames and folders are managed by paperless and their format can be configured freely with different configurations assigned to different documents.
- **Beautiful, modern web application** that features:
- **Beautiful, modern web application** that features:
- Customizable dashboard with statistics.
- Customizable dashboard with statistics.
@@ -197,7 +201,7 @@ People interested in continuing the work on paperless-ngx are encouraged to reac
### Translation
### Translation
Paperless-ngx is available in many languages that are coordinated on [Crowdin](https://crwd.in/paperless-ngx). If you want to help out by translating paperless-ngx into your language, please head over to the [Paperless-ngx project at Crowdin](https://crwd.in/paperless-ngx), and thank you!
Paperless-ngx is available in many languages that are coordinated on [Crowdin](https://crowdin.com/project/paperless-ngx). If you want to help out by translating paperless-ngx into your language, please head over to the [Paperless-ngx project at Crowdin](https://crowdin.com/project/paperless-ngx), and thank you!
You can go multiple routes to setup and run Paperless:
# Installation
- [Use the script to setup a Docker install](#docker_script)
!!! tip "Quick Start"
- [Use the Docker compose templates](#docker)
- [Build the Docker image yourself](#docker_build)
- [Install Paperless-ngx directly on your system manually ("bare metal")](#bare_metal)
- A user-maintained list of commercial hosting providers can be found [in the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Related-Projects)
The Docker routes are quick & easy. These are the recommended routes.
This configures all the stuff from the above automatically so that it
just works and uses sensible defaults for all configuration options.
Here you find a cheat-sheet for docker beginners: [CLI
_If piping into a shell directly from the internet makes you nervous, inspect [the script](https://github.com/paperless-ngx/paperless-ngx/blob/main/install-paperless-ngx.sh) first!_
!!! note
## Overview
macOS users will need to install [gnu-sed](https://formulae.brew.sh/formula/gnu-sed) with support
Choose the installation route that best fits your setup:
for running as `sed` as well as [wget](https://formulae.brew.sh/formula/wget).
| [Installation script](#docker_script) | Fastest first-time setup with guided prompts (recommended for most users) | Low |
| [Docker Compose templates](#docker) | Manual control over compose files and settings | Medium |
| [Bare metal](#bare_metal) | Advanced setups, packaging, and development-adjacent workflows | High |
| [Hosted providers (wiki)](https://github.com/paperless-ngx/paperless-ngx/wiki/Related-Projects#hosting-providers) | Managed hosting options maintained by the community — check details carefully | Varies |
1. Make sure that Docker and Docker Compose are [installed](https://docs.docker.com/engine/install/){:target="\_blank"}.
For most users, Docker is the best option. It is faster to set up,
easier to maintain, and ships with sensible defaults.
2. Go to the [/docker/compose directory on the project
The bare-metal route gives you more control, but it requires manual
installation and operation of all components. It is usually best suited
for advanced users and contributors.
!!! info
Because [superuser](usage.md#superusers) accounts have full access to all objects and documents, you may want to create a separate user account for daily use,
or "downgrade" your superuser account to a normal user account after setup.
## Installation Script {#docker_script}
Paperless-ngx provides an interactive script for Docker Compose setups.
It asks a few configuration questions, then creates the required files,
pulls the image, starts the containers, and creates your [superuser](usage.md#superusers)
account. In short, it automates the [Docker Compose setup](#docker) described below.
#### Prerequisites
- Docker and Docker Compose must be [installed](https://docs.docker.com/engine/install/){:target="\_blank"}.
- macOS users will need [GNU sed](https://formulae.brew.sh/formula/gnu-sed) with support for running as `sed` as well as [wget](https://formulae.brew.sh/formula/wget).
The main branch always reflects the latest stable version.
!!! warning
2. Copy one of the `docker/compose/docker-compose.*.yml` to
It is not possible to run the container rootless if additional languages are specified via `PAPERLESS_OCR_LANGUAGES`.
`docker-compose.yml` in the root folder, depending on which database
backend you want to use. Copy `docker-compose.env` into the project
root as well.
3. In the `docker-compose.yml` file, find the line that instructs
If you want to run Paperless as a rootless container, set `user:` in `docker-compose.yml` to the UID and GID of your host user (use `id -u` and `id -g` to find these values). The container process starts directly as that user with no internal privilege remapping:
Docker Compose to pull the paperless image from Docker Hub:
```yaml
```yaml
webserver:
webserver:
image: ghcr.io/paperless-ngx/paperless-ngx:latest
image: ghcr.io/paperless-ngx/paperless-ngx:latest
```
user: '1000:1000'
```
and replace it with a line that instructs Docker Compose to build
Do not combine this with `USERMAP_UID` or `USERMAP_GID`, which are intended for the non-rootless case described in step 3.
the image from the current working directory instead:
```yaml
**File systems without inotify support (e.g. NFS)**
webserver:
build:
context: .
```
4. Follow the [Docker setup](#docker) above except when asked to run
Some file systems, such as NFS network shares, don't support file system
`docker compose pull` to pull the image, run
notifications with `inotify`. When the consumption directory is on such a
file system, Paperless-ngx will not pick up new files with the default
configuration. Use [`PAPERLESS_CONSUMER_POLLING`](configuration.md#PAPERLESS_CONSUMER_POLLING)
to enable polling and disable inotify. See [here](configuration.md#polling).
```shell-session
## Bare Metal Install {#bare_metal}
docker compose build
```
instead to build the image.
#### Prerequisites
### Bare Metal Route {#bare_metal}
- Paperless runs on Linux only, Windows is not supported.
- Python 3 is required with versions 3.10 - 3.12 currently supported. Newer versions may work, but some dependencies may not be fully compatible.
Paperless runs on linux only. The following procedure has been tested on
#### Installation
a minimal installation of Debian/Buster, which is the current stable
release at the time of writing. Windows is not and will never be
supported.
Paperless requires Python 3. At this time, 3.10 - 3.12 are tested versions.
1. Install dependencies. Paperless requires the following packages:
Newer versions may work, but some dependencies may not fully support newer versions.
Support for older Python versions may be dropped as they reach end of life or as newer versions
are released, dependency support is confirmed, etc.
1. Install dependencies. Paperless requires the following packages.
- `python3`
- `python3`
- `python3-pip`
- `python3-pip`
@@ -254,8 +233,8 @@ are released, dependency support is confirmed, etc.
2. Install `redis` >= 6.0 and configure it to start automatically.
2. Install `redis` >= 6.0 and configure it to start automatically.
3. Optional. Install `postgresql` and configure a database, user and
3. Optional: Install `postgresql` and configure a database, user, and
password for paperless. If you do not wish to use PostgreSQL,
password for Paperless-ngx. If you do not wish to use PostgreSQL,
MariaDB and SQLite are available as well.
MariaDB and SQLite are available as well.
!!! note
!!! note
@@ -264,61 +243,60 @@ are released, dependency support is confirmed, etc.
extension](https://code.djangoproject.com/wiki/JSON1Extension) is
extension](https://code.djangoproject.com/wiki/JSON1Extension) is
enabled. This is usually the case, but not always.
enabled. This is usually the case, but not always.
4. Create a system user with a new home folder under which you wish
4. Create a system user with a new home folder in which you want
1. Set the environment variable [`PAPERLESS_REDIS`](configuration.md#PAPERLESS_REDIS) so it points to
1. Set the environment variable [`PAPERLESS_REDIS`](configuration.md#PAPERLESS_REDIS) so it points to
the new Redis container
the new Redis container.
4. Update user mapping
4. Update user mapping.
1. If set, change the environment variable `PUID` to `USERMAP_UID`
1. If set, change the environment variable `PUID` to `USERMAP_UID`.
1. If set, change the environment variable `PGID` to `USERMAP_GID`
1. If set, change the environment variable `PGID` to `USERMAP_GID`.
5. Update configuration paths
5. Update configuration paths.
1. Set the environment variable [`PAPERLESS_DATA_DIR`](configuration.md#PAPERLESS_DATA_DIR) to `/config`
1. Set the environment variable [`PAPERLESS_DATA_DIR`](configuration.md#PAPERLESS_DATA_DIR) to `/config`.
6. Update media paths
6. Update media paths.
1. Set the environment variable [`PAPERLESS_MEDIA_ROOT`](configuration.md#PAPERLESS_MEDIA_ROOT) to
1. Set the environment variable [`PAPERLESS_MEDIA_ROOT`](configuration.md#PAPERLESS_MEDIA_ROOT) to
`/data/media`
`/data/media`.
7. Update timezone
7. Update timezone.
1. Set the environment variable [`PAPERLESS_TIME_ZONE`](configuration.md#PAPERLESS_TIME_ZONE) to the same
1. Set the environment variable [`PAPERLESS_TIME_ZONE`](configuration.md#PAPERLESS_TIME_ZONE) to the same
value as `TZ`
value as `TZ`.
8. Modify the `image:` to point to
8. Modify `image:` to point to
`ghcr.io/paperless-ngx/paperless-ngx:latest` or a specific version
`ghcr.io/paperless-ngx/paperless-ngx:latest` or a specific version
if preferred.
if preferred.
9. Start the containers as before, using `docker compose`.
9. Start the containers as before, using `docker compose`.
## Moving data from SQLite to PostgreSQL or MySQL/MariaDB {#sqlite_to_psql}
## Running Paperless-ngx on less powerful devices {#less-powerful-devices data-toc-label="Less Powerful Devices"}
The best way to migrate between database types is to perform an [export](administration.md#exporter) and then
Paperless runs on Raspberry Pi. Some tasks can be slow on lower-powered
[import](administration.md#importer) into a clean installation of Paperless-ngx.
hardware, but a few settings can improve performance:
## Moving back to Paperless
Let's say you migrated to Paperless-ngx and used it for a while, but
decided that you don't like it and want to move back (if you do, send
me a mail about what part you didn't like!). You can totally do that
with a few simple steps.
Paperless-ngx modified the database schema slightly; however, these
changes can be reverted while keeping your current data, so that your
current data will be compatible with original Paperless. Thumbnails
were also changed from PNG to WEBP format and will need to be
re-generated.
Execute this:
```shell-session
$ cd /path/to/paperless
$ docker compose run --rm webserver migrate documents 0023
```
Or without docker:
```shell-session
$ cd /path/to/paperless/src
$ python3 manage.py migrate documents 0023
```
After regenerating thumbnails, you'll need to clear your cookies
(Paperless-ngx comes with updated dependencies that do cookie-processing
differently) and probably your cache as well.
# Considerations for less powerful devices {#less-powerful-devices}
Paperless runs on Raspberry Pi. However, some things are rather slow on
the Pi and configuring some options in paperless can help improve
performance immensely:
- Stick with SQLite to save some resources. See [troubleshooting](troubleshooting.md#log-reports-creating-paperlesstask-failed)
- Stick with SQLite to save some resources. See [troubleshooting](troubleshooting.md#log-reports-creating-paperlesstask-failed)
if you encounter issues with SQLite locking.
if you encounter issues with SQLite locking.
- If you do not need the filesystem-based consumer, consider disabling it
- If you do not need the filesystem-based consumer, consider disabling it
entirely by setting [`PAPERLESS_CONSUMER_DISABLE`](configuration.md#PAPERLESS_CONSUMER_DISABLE) to `true`.
entirely by setting [`PAPERLESS_CONSUMER_DISABLE`](configuration.md#PAPERLESS_CONSUMER_DISABLE) to `true`.
- Consider setting [`PAPERLESS_OCR_PAGES`](configuration.md#PAPERLESS_OCR_PAGES) to 1, so that paperless will
- Consider setting [`PAPERLESS_OCR_PAGES`](configuration.md#PAPERLESS_OCR_PAGES) to 1, so that Paperless
only OCR the first page of your documents. In most cases, this page
OCRs only the first page of your documents. In most cases, this page
contains enough information to be able to find it.
contains enough information to be able to find it.
- [`PAPERLESS_TASK_WORKERS`](configuration.md#PAPERLESS_TASK_WORKERS) and [`PAPERLESS_THREADS_PER_WORKER`](configuration.md#PAPERLESS_THREADS_PER_WORKER) are
- [`PAPERLESS_TASK_WORKERS`](configuration.md#PAPERLESS_TASK_WORKERS) and [`PAPERLESS_THREADS_PER_WORKER`](configuration.md#PAPERLESS_THREADS_PER_WORKER) are
configured to use all cores. The Raspberry Pi models 3 and up have 4
configured to use all cores. The Raspberry Pi models 3 and up have 4
cores, meaning that paperless will use 2 workers and 2 threads per
cores, meaning that Paperless will use 2 workers and 2 threads per
worker. This may result in sluggish response times during
worker. This may result in sluggish response times during
consumption, so you might want to lower these settings (example: 2
consumption, so you might want to lower these settings (example: 2
workers and 1 thread to always have some computing power left for
workers and 1 thread to always have some computing power left for
other tasks).
other tasks).
- Keep [`PAPERLESS_OCR_MODE`](configuration.md#PAPERLESS_OCR_MODE) at its default value `skip` and consider
- Keep [`PAPERLESS_OCR_MODE`](configuration.md#PAPERLESS_OCR_MODE) at its default value `skip` and consider
OCR'ing your documents before feeding them into paperless. Some
OCRing your documents before feeding them into Paperless. Some
scanners are able to do this!
scanners are able to do this!
- Set [`PAPERLESS_OCR_SKIP_ARCHIVE_FILE`](configuration.md#PAPERLESS_OCR_SKIP_ARCHIVE_FILE) to `with_text` to skip archive
- Set [`PAPERLESS_OCR_SKIP_ARCHIVE_FILE`](configuration.md#PAPERLESS_OCR_SKIP_ARCHIVE_FILE) to `with_text` to skip archive
file generation for already ocr'ed documents, or `always` to skip it
file generation for already OCRed documents, or `always` to skip it
for all documents.
for all documents.
- If you want to perform OCR on the device, consider using
- If you want to perform OCR on the device, consider using
`PAPERLESS_OCR_CLEAN=none`. This will speed up OCR times and use
`PAPERLESS_OCR_CLEAN=none`. This will speed up OCR times and use
less memory at the expense of slightly worse OCR results.
less memory at the expense of slightly worse OCR results.
- If using docker, consider setting [`PAPERLESS_WEBSERVER_WORKERS`](configuration.md#PAPERLESS_WEBSERVER_WORKERS) to 1. This will save some memory.
- If using Docker, consider setting [`PAPERLESS_WEBSERVER_WORKERS`](configuration.md#PAPERLESS_WEBSERVER_WORKERS) to 1. This will save some memory.
- Consider setting [`PAPERLESS_ENABLE_NLTK`](configuration.md#PAPERLESS_ENABLE_NLTK) to false, to disable the
- Consider setting [`PAPERLESS_ENABLE_NLTK`](configuration.md#PAPERLESS_ENABLE_NLTK) to false, to disable the
more advanced language processing, which can take more memory and
more advanced language processing, which can take more memory and
processing time.
processing time.
@@ -727,17 +668,19 @@ For details, refer to [configuration](configuration.md).
Updating the
Updating the
[automatic matching algorithm](advanced_usage.md#automatic-matching) takes quite a bit of time. However, the update mechanism
[automatic matching algorithm](advanced_usage.md#automatic-matching) takes quite a bit of time. However, the update mechanism
checks if your data has changed before doing the heavy lifting. If you
checks if your data has changed before doing the heavy lifting. If you
experience the algorithm taking too much cpu time, consider changing the
experience the algorithm taking too much CPU time, consider changing the
schedule in the admin interface to daily. You can also manually invoke
schedule in the admin interface to daily. You can also manually invoke
the task by changing the date and time of the next run to today/now.
the task by changing the date and time of the next run to today/now.
The actual matching of the algorithm is fast and works on Raspberry Pi
The actual matching of the algorithm is fast and works on Raspberry Pi
as well as on any other device.
as well as on any other device.
# Using nginx as a reverse proxy {#nginx}
## Additional considerations
Please see [the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Using-a-Reverse-Proxy-with-Paperless-ngx#nginx) for user-maintained documentation of using nginx with Paperless-ngx.
**Using a reverse proxy with Paperless-ngx**
# Enhancing security {#security}
Please see [the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Using-a-Reverse-Proxy-with-Paperless-ngx#nginx) for user-maintained documentation on using nginx with Paperless-ngx.
Please see [the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Using-Security-Tools-with-Paperless-ngx) for user-maintained documentation of how to configure security tools like Fail2ban with Paperless-ngx.
**Enhancing security**
Please see [the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Using-Security-Tools-with-Paperless-ngx) for user-maintained documentation on configuring security tools like Fail2ban with Paperless-ngx.
As an example, if you are running Paperless-ngx from any Ubuntu or
As an example, if you are running Paperless-ngx from any Ubuntu or
Debian box, and your documents are written in Spanish you may need to
Debian box, and your documents are written in Spanish you may need to
@@ -130,7 +130,7 @@ command:
- 'gotenberg'
- 'gotenberg'
- '--chromium-disable-javascript=true'
- '--chromium-disable-javascript=true'
- '--chromium-allow-list=file:///tmp/.*'
- '--chromium-allow-list=file:///tmp/.*'
- '--api-timeout=60'
- '--api-timeout=60s'
```
```
## Permission denied errors in the consumption directory
## Permission denied errors in the consumption directory
@@ -335,7 +335,7 @@ You may see errors when deleting documents like:
Data too long for column 'transaction_id' at row 1
Data too long for column 'transaction_id' at row 1
```
```
This error can occur in installations which have upgraded from a version of Paperless-ngx that used Django 4 (Paperless-ngx versions prior to v2.13.0) with a MariaDB/MySQL database. Due to the backawards-incompatible change in Django 5, the column "documents_document.transaction_id" will need to be re-created, which can be done with a one-time run of the following management command:
This error can occur in installations which have upgraded from a version of Paperless-ngx that used Django 4 (Paperless-ngx versions prior to v2.13.0) with a MariaDB/MySQL database. Due to the backwards-incompatible change in Django 5, the column "documents_document.transaction_id" will need to be re-created, which can be done with a one-time run of the following management command:
Paperless-ngx is an application that manages your personal documents. With
Paperless-ngx is an application that manages your personal documents. With
the (optional) help of a document scanner (see [the scanners wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations)), Paperless-ngx transforms your unwieldy
the (optional) help of a document scanner (see [the scanners wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations)), Paperless-ngx transforms your unwieldy
@@ -30,6 +34,9 @@ Each document has data fields that you can assign to them:
- A _document type_ is used to demarcate the type of a document such
- A _document type_ is used to demarcate the type of a document such
as letter, bank statement, invoice, contract, etc. It is used to
as letter, bank statement, invoice, contract, etc. It is used to
identify what a document is about.
identify what a document is about.
- The document _storage path_ is the location where the document files
are stored. See [Storage Paths](advanced_usage.md#storage-paths) for
more information.
- The _date added_ of a document is the date the document was scanned
- The _date added_ of a document is the date the document was scanned
into paperless. You cannot and should not change this date.
into paperless. You cannot and should not change this date.
- The _date created_ of a document is the date the document was
- The _date created_ of a document is the date the document was
@@ -89,7 +96,17 @@ and more. These areas allow you to view, add, edit, delete and manage permission
for these objects. You can also manage saved views, mail accounts, mail rules,
for these objects. You can also manage saved views, mail accounts, mail rules,
workflows and more from the management sections.
workflows and more from the management sections.
## Adding documents to paperless
### Nested Tags
Paperless-ngx v2.19 introduces support for nested tags, allowing you to create a
hierarchy of tags, which may be useful for organizing your documents. Tags can
have a 'parent' tag, creating a tree-like structure, to a maximum depth of 5. When
a tag is added to a document, all of its parent tags are also added automatically
and similarly, when a tag is removed from a document, all of its child tags are
also removed. Additionally, assigning a parent to an existing tag will automatically
update all documents that have this tag assigned, adding the parent tag as well.
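The propagation rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the Paperless-ngx implementation: the `Tag` class and the `add_tag` and `remove_tag` helpers are hypothetical names, and the sketch assumes that "depth" counts the tags on the path from the root tag down to the tag itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MAX_DEPTH = 5  # maximum nesting depth stated in the text


@dataclass(eq=False)  # eq=False keeps identity hashing so tags can live in sets
class Tag:
    """Hypothetical stand-in for a tag; not the real Paperless-ngx model."""

    name: str
    parent: Tag | None = None
    children: list[Tag] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.parent is not None:
            # Assumption: depth counts the tags from the root down to this tag.
            if len(self.parent.ancestors()) + 2 > MAX_DEPTH:
                raise ValueError(f"nesting deeper than {MAX_DEPTH} is not allowed")
            self.parent.children.append(self)

    def ancestors(self) -> list[Tag]:
        """Return all parents, nearest first."""
        found, node = [], self.parent
        while node is not None:
            found.append(node)
            node = node.parent
        return found

    def descendants(self) -> list[Tag]:
        """Return all children, grandchildren, and so on."""
        found = []
        for child in self.children:
            found.append(child)
            found.extend(child.descendants())
        return found


def add_tag(document_tags: set[Tag], tag: Tag) -> None:
    """Adding a tag also adds every parent tag."""
    document_tags.add(tag)
    document_tags.update(tag.ancestors())


def remove_tag(document_tags: set[Tag], tag: Tag) -> None:
    """Removing a tag also removes every child tag."""
    document_tags.discard(tag)
    document_tags.difference_update(tag.descendants())
```

For example, with a hypothetical hierarchy `finance > invoices > paid`, adding `paid` to a document also adds `invoices` and `finance`, and later removing `invoices` also removes `paid` while leaving `finance` in place.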
## Adding documents to Paperless-ngx
Once you've got Paperless set up, you need to start feeding documents
Once you've got Paperless set up, you need to start feeding documents
into it. When adding documents to paperless, it will perform the
into it. When adding documents to paperless, it will perform the
@@ -115,7 +132,8 @@ following operations on your documents:
No matter which options you choose, Paperless will always store the
No matter which options you choose, Paperless will always store the
original document that it found in the consumption directory or in the
original document that it found in the consumption directory or in the
mail and will never overwrite that document. Archived versions are
mail and will never overwrite that document (except when using certain
document actions, which make that clear). Archived versions are
stored alongside the original versions. Any files found in the
stored alongside the original versions. Any files found in the
consumption directory will be stored inside the Paperless-ngx file
consumption directory will be stored inside the Paperless-ngx file
structure and will not be retained in the consumption directory.
structure and will not be retained in the consumption directory.
@@ -159,7 +177,7 @@ process.
Please see [the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Related-Projects) for a user-maintained list of related projects and
Please see [the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Related-Projects) for a user-maintained list of related projects and
software (e.g. for mobile devices) that is compatible with Paperless-ngx.
software (e.g. for mobile devices) that is compatible with Paperless-ngx.
### Email {#usage-email}
### Incoming Email {#incoming-mail}
You can tell paperless-ngx to consume documents from your email
You can tell paperless-ngx to consume documents from your email
accounts. This is a very flexible and powerful feature, if you regularly
accounts. This is a very flexible and powerful feature, if you regularly
@@ -247,6 +265,10 @@ different means. These are as follows:
Paperless is set up to check your mails every 10 minutes. This can be
Paperless is set up to check your mails every 10 minutes. This can be
configured via [`PAPERLESS_EMAIL_TASK_CRON`](configuration.md#PAPERLESS_EMAIL_TASK_CRON)
configured via [`PAPERLESS_EMAIL_TASK_CRON`](configuration.md#PAPERLESS_EMAIL_TASK_CRON)
#### Processed Mail
Paperless keeps track of emails it has processed in order to avoid processing the same mail multiple times. This uses the message `UID` provided by the mail server, which should be unique for each message. You can view and manage processed mails from the web UI under Mail > Processed Mails. If you need to re-process a message, you can delete the corresponding processed mail entry, which will allow Paperless-ngx to process the email again the next time the mail fetch task runs.
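The bookkeeping described above can be pictured with a small Python sketch. It only illustrates the idea and does not reflect how Paperless-ngx stores processed mail: `ProcessedMailLog` and its methods are hypothetical names, and keying entries by account name plus `UID` is an assumption made for the example.

```python
from __future__ import annotations


class ProcessedMailLog:
    """Hypothetical in-memory record of processed messages (illustration only)."""

    def __init__(self) -> None:
        # Assumption: a message is identified by (account name, server UID).
        self._seen: set[tuple[str, str]] = set()

    def should_process(self, account: str, uid: str) -> bool:
        """A message is processed only if no entry exists for it yet."""
        return (account, uid) not in self._seen

    def mark_processed(self, account: str, uid: str) -> None:
        """Record the message so later fetch runs skip it."""
        self._seen.add((account, uid))

    def forget(self, account: str, uid: str) -> None:
        """Delete the entry, so the next fetch run processes the message again."""
        self._seen.discard((account, uid))
```

Deleting an entry (here `forget`) corresponds to removing a processed mail entry in the web UI: the message becomes eligible again the next time the fetch task runs.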
#### OAuth Email Setup
#### OAuth Email Setup
Paperless-ngx supports OAuth2 authentication for Gmail and Outlook email accounts. To set up an email account with OAuth2, you will need to create a 'developer' app with the respective provider and obtain the client ID and client secret and set the appropriate [configuration variables](configuration.md#email_oauth). You will also need to set either [`PAPERLESS_OAUTH_CALLBACK_BASE_URL`](configuration.md#PAPERLESS_OAUTH_CALLBACK_BASE_URL) or [`PAPERLESS_URL`](configuration.md#PAPERLESS_URL) to the correct value for the OAuth2 flow to work correctly.
Paperless-ngx supports OAuth2 authentication for Gmail and Outlook email accounts. To set up an email account with OAuth2, you will need to create a 'developer' app with the respective provider and obtain the client ID and client secret and set the appropriate [configuration variables](configuration.md#email_oauth). You will also need to set either [`PAPERLESS_OAUTH_CALLBACK_BASE_URL`](configuration.md#PAPERLESS_OAUTH_CALLBACK_BASE_URL) or [`PAPERLESS_URL`](configuration.md#PAPERLESS_URL) to the correct value for the OAuth2 flow to work correctly.
@@ -260,6 +282,31 @@ Once setup, navigating to the email settings page in Paperless-ngx will allow yo
You can also submit a document using the REST API, see [POSTing documents](api.md#file-uploads)
You can also submit a document using the REST API, see [POSTing documents](api.md#file-uploads)
for details.
for details.
## Sharing documents from Paperless-ngx
Paperless-ngx supports sharing documents with other users by assigning them [permissions](#object-permissions)
to the document. Document files can also be shared externally via [share links](#share-links), [email](#email-sharing)
or using [email](#workflow-action-email) or [webhook](#workflow-action-webhook) actions in workflows.
### Share Links
"Share links" are shareable public links to files and can be created and managed under the 'Send' button on the document detail screen.
- Share links do not require a user to log in and thus link directly to a file.
- Links are unique and are of the form `{paperless-url}/share/{randomly-generated-slug}`.
- Links can optionally have an expiration time set.
- After a link expires or is deleted, users will be redirected to the regular paperless-ngx login.
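The properties listed above (a unique random slug, the `{paperless-url}/share/{slug}` form, and an optional expiration) can be sketched as follows. This is a hedged illustration rather than the actual implementation: the `ShareLink` class, its field names, and the slug length are assumptions chosen for the example.

```python
from __future__ import annotations

import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ShareLink:
    """Hypothetical share link; names and slug length are illustrative only."""

    base_url: str
    expires_at: datetime | None = None  # None means the link never expires
    # Assumption: a URL-safe random token serves as the unique slug.
    slug: str = field(default_factory=lambda: secrets.token_urlsafe(32))

    @property
    def url(self) -> str:
        """Build a link of the form {paperless-url}/share/{slug}."""
        return f"{self.base_url.rstrip('/')}/share/{self.slug}"

    def is_valid(self, now: datetime | None = None) -> bool:
        """A link is valid while it has no expiration or has not yet expired."""
        now = now or datetime.now(timezone.utc)
        return self.expires_at is None or now < self.expires_at
```

In this sketch, a request for an invalid link would be redirected to the login page instead of serving the file.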
!!! tip
If your paperless-ngx instance is behind a reverse-proxy you may want to create an exception to bypass any authentication layers that are part of your setup in order to make links truly publicly-accessible. Of course, do so with caution.
### Email Sharing {#email-sharing}
Paperless-ngx supports directly sending documents via email. If an email server has been [configured](configuration.md#email-sending),
the "Send" button on the document detail page will include an "Email" option. You can also share files via email automatically by using
a [workflow action](#workflow-action-email).
## Permissions
## Permissions
Permissions in Paperless-ngx are based around ['global' permissions](#global-permissions) as well as
Permissions in Paperless-ngx are based around ['global' permissions](#global-permissions) as well as
@@ -301,6 +348,11 @@ permissions can be granted to limit access to certain parts of the UI (and corre
Superusers can access all parts of the front and backend application as well as any and all objects. Superuser status can only be granted by another superuser.
Superusers can access all parts of the front and backend application as well as any and all objects. Superuser status can only be granted by another superuser.
!!! tip
Because superuser accounts can see all objects and documents, you may want to use a regular account for day-to-day use. Additional superuser accounts can
be created via the [CLI](administration.md#create-superuser) or granted superuser status from an existing superuser account.
#### Admin Status
#### Admin Status
Admin status (Django 'staff status') grants access to viewing the paperless logs and the system status dialog
Admin status (Django 'staff status') grants access to viewing the paperless logs and the system status dialog
@@ -312,25 +364,25 @@ Global permissions define what areas of the app and API endpoints users can acce
determine if a user can create, edit, delete or view _any_ documents, but individual documents themselves
determine if a user can create, edit, delete or view _any_ documents, but individual documents themselves
| UISettings | Add, edit, delete or view the UI settings that are used by the web app.<br/>:warning: **Users that will access the web UI must be granted at least _View_ permissions.** |
| UISettings | Add, edit, delete or view the UI settings that are used by the web app.<br/>:warning: **Users that will access the web UI must be granted at least _View_ permissions.** |
| User | Add, edit, delete or view Users. |
| User | Add, edit, delete or view Users. |
| Workflow | Add, edit, delete or view Workflows.<br/>Note that Workflows are global, in other words all users who can access workflows have access to the same set of them. |
| Workflow | Add, edit, delete or view Workflows.<br/>Note that Workflows are global; all users who can access workflows see the same set. Workflows have other permission implications — see [Workflow permissions](#workflow-permissions). |
#### Detailed Explanation of Object Permissions {#object-permissions}
@@ -369,9 +421,9 @@ fields and permissions, which will be merged.
### Workflow Triggers
#### Types
#### Types {#workflow-trigger-types}
Currently, there are three events that correspond to workflow trigger 'types':
Currently, there are four events that correspond to workflow trigger 'types':
1. **Consumption Started**: _before_ a document is consumed, so events can include filters by source (mail, consumption
folder or API), file path, file name, mail rule
@@ -379,11 +431,12 @@ Currently, there are three events that correspond to workflow trigger 'types':
but the document content has been extracted and metadata such as document type, tags, etc. have been set, so these can now
be used for filtering.
3. **Document Updated**: when a document is updated. Similar to 'added' events, triggers can include filtering by content matching,
tags, doc type, or correspondent.
tags, doc type, correspondent or storage path.
4. **Scheduled**: a scheduled trigger that can be used to run workflows at a specific time. The date used can be either the document
added, created, updated date or you can specify a (date) custom field. You can also specify a day offset from the date.
added, created, updated date or you can specify a (date) custom field. You can also specify a day offset from the date (positive
offsets will trigger after the date, negative offsets will trigger before).
The following flow diagram illustrates the three document trigger types:
The following flow diagram illustrates the four document trigger types:
```mermaid
flowchart TD
@@ -399,6 +452,10 @@ flowchart TD
'Updated'
trigger(s)"}
scheduled{"Documents
matching
trigger(s)"}
A[New Document] --> consumption
consumption --> |Yes| C[Workflow Actions Run]
consumption --> |No| D
@@ -411,6 +468,11 @@ flowchart TD
updated --> |Yes| J[Workflow Actions Run]
updated --> |No| K
J --> K[Document Saved]
L[Scheduled Task Check<br/>hourly at :05] --> M[Get All Scheduled Triggers]
M --> scheduled
scheduled --> |Yes| N[Workflow Actions Run]
scheduled --> |No| O[Document Saved]
N --> O
```
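The day offset described for Scheduled triggers can be read as plain date arithmetic. The sketch below is only an illustration of that reading, not paperless-ngx's implementation; the helper name `scheduled_run_date` and its parameters are hypothetical.

```python
from datetime import date, timedelta


def scheduled_run_date(reference: date, offset_days: int) -> date:
    """Return the day a scheduled trigger would fire (hypothetical helper).

    `reference` stands in for the added, created, updated or custom field date.
    A positive offset lands after the reference date, a negative one before it.
    """
    return reference + timedelta(days=offset_days)


created = date(2024, 3, 1)
print(scheduled_run_date(created, 7))   # 2024-03-08
print(scheduled_run_date(created, -7))  # 2024-02-23
```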
#### Filters {#workflow-trigger-filters}
@@ -418,22 +480,32 @@ flowchart TD
Workflows allow you to filter by:
- Source, e.g. documents uploaded via consume folder, API (& the web UI) and mail fetch
- File name, including wildcards e.g. \*.pdf will apply to all pdfs
- File name, including wildcards e.g. \*.pdf will apply to all pdfs.
- File path, including wildcards. Note that enabling `PAPERLESS_CONSUMER_RECURSIVE` would allow, for
example, automatically assigning documents to different owners based on the upload directory.
- Mail rule. Choosing this option will force 'mail fetch' to be the workflow source.
- Content matching (`Added` and `Updated` triggers only). Filter document content using the matching settings.
- Content matching (`Added`, `Updated` and `Scheduled` triggers only). Filter document content using the matching settings.
- Tags (`Added` and `Updated` triggers only). Filter for documents with any of the specified tags
- Document type (`Added` and `Updated` triggers only). Filter documents with this doc type
- Correspondent (`Added` and `Updated` triggers only). Filter documents with this correspondent
There are also 'advanced' filters available for `Added`, `Updated` and `Scheduled` triggers:
- Any Tags: Filter for documents with any of the specified tags.
- All Tags: Filter for documents with all of the specified tags.
- No Tags: Filter for documents with none of the specified tags.
- Document type: Filter documents with this document type.
- Not Document types: Filter documents without any of these document types.
- Correspondent: Filter documents with this correspondent.
- Not Correspondents: Filter documents without any of these correspondents.
- Storage path: Filter documents with this storage path.
- Not Storage paths: Filter documents without any of these storage paths.
- Custom field query: Filter documents with a custom field query (the same as used for the document list filters).
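To make the filter semantics above concrete, here is a minimal sketch of how a wildcard file name filter and the Any / All / No Tags filters could be evaluated. It assumes case-insensitive shell-style wildcards and plain set logic; the function names are hypothetical and this is not taken from the paperless-ngx source.

```python
from fnmatch import fnmatch


def filename_matches(filename: str, pattern: str) -> bool:
    """Shell-style wildcard match such as '*.pdf' (case-insensitivity is an assumption)."""
    return fnmatch(filename.lower(), pattern.lower())


def tags_match(
    doc_tags: set[str],
    any_of: frozenset[str] = frozenset(),
    all_of: frozenset[str] = frozenset(),
    none_of: frozenset[str] = frozenset(),
) -> bool:
    """Combine 'Any Tags', 'All Tags' and 'No Tags' (hypothetical helper).

    An empty filter is treated as "not set" and therefore always passes.
    """
    if any_of and not (doc_tags & any_of):
        return False  # none of the "any" tags is present
    if all_of and not all_of <= doc_tags:
        return False  # at least one required tag is missing
    if none_of and (doc_tags & none_of):
        return False  # a forbidden tag is present
    return True


print(filename_matches("Invoice.PDF", "*.pdf"))
print(tags_match({"inbox", "tax"}, any_of=frozenset({"tax", "bank"}), none_of=frozenset({"archived"})))
```

Treating an unset filter as a pass keeps the filters independent, so a trigger that only sets "No Tags" does not accidentally require any other tag.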
### Workflow Actions
#### Types
#### Types {#workflow-action-types}
The following workflow action types are available:
##### Assignment
##### Assignment {#workflow-action-assignment}
"Assignment" actions can assign:
@@ -443,7 +515,7 @@ The following workflow action types are available:
- View and / or edit permissions to users or groups
- Custom fields. Note that no value for the field will be set
##### Removal
##### Removal {#workflow-action-removal}
"Removal" actions can remove either all of or specific sets of the following:
@@ -452,7 +524,7 @@ The following workflow action types are available:
- View and / or edit permissions
- Custom fields
##### Email
##### Email {#workflow-action-email}
"Email" actions can send documents via email. This action requires a mail server to be [configured](configuration.md#email-sending). You can specify:
@@ -460,7 +532,7 @@ The following workflow action types are available:
- The subject and body of the email, which can include placeholders, see [placeholders](usage.md#workflow-placeholders) below
- Whether to include the document as an attachment
##### Webhook
##### Webhook {#workflow-action-webhook}
"Webhook" actions send a POST request to a specified URL. You can specify:
@@ -469,44 +541,66 @@ The following workflow action types are available:
- Encoding for the request body, either JSON or form data
- The request headers as key-value pairs
For security reasons, webhooks can be limited to specific ports and disallowed from connecting to local URLs. See the relevant
[configuration settings](configuration.md#workflow-webhooks) to change this behavior. If you are allowing non-admins to create workflows,
you may want to adjust these settings to prevent abuse.
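As a rough illustration of the kind of restriction described above, the following sketch rejects webhook URLs on disallowed ports or pointing at local addresses. It is a simplified assumption of how such a check could look, not the paperless-ngx implementation: `webhook_url_allowed`, `allowed_ports` and `allow_internal` are hypothetical names, and host names are not resolved here, so this alone would not protect against DNS rebinding.

```python
import ipaddress
from urllib.parse import urlsplit


def webhook_url_allowed(url: str, allowed_ports: set[int], allow_internal: bool = False) -> bool:
    """Hypothetical pre-flight check for a webhook target URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    if allowed_ports and port not in allowed_ports:
        return False
    if allow_internal:
        return True
    host = parts.hostname
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. A real check would resolve the name and test
        # every resulting address at connection time.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)


print(webhook_url_allowed("https://example.com/hook", {80, 443}))
print(webhook_url_allowed("http://127.0.0.1:8000/hook", {8000}))
```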
#### Workflow placeholders
Some workflow text can include placeholders but the available options differ depending on the type of
workflow trigger. This is because at the time of consumption (when the text is to be set), no automatic tags etc. have been
applied. You can use the following placeholders with any trigger type:
- `{correspondent}`: assigned correspondent name
- `{document_type}`: assigned document type name
- `{owner_username}`: assigned owner username
- `{added}`: added datetime
- `{added_year}`: added year
- `{added_year_short}`: added year
- `{added_month}`: added month
- `{added_month_name}`: added month name
- `{added_month_name_short}`: added month short name
- `{added_day}`: added day
- `{added_time}`: added time in HH:MM format
- `{original_filename}`: original file name without extension
- `{filename}`: current file name without extension
Titles can be assigned by workflows using [Jinja templates](https://jinja.palletsprojects.com/en/3.1.x/templates/).
This allows for complex logic to be used to generate the title, including [logical structures](https://jinja.palletsprojects.com/en/3.1.x/templates/#list-of-control-structures)
and [filters](https://jinja.palletsprojects.com/en/3.1.x/templates/#id11).
The template is provided as a string.
Using Jinja2 Templates is also useful for [Date localization](advanced_usage.md#Date-Localization) in the title.
The available inputs differ depending on the type of workflow trigger.
This is because at the time of consumption (when the text is to be set), no automatic tags etc. have been
applied. You can use the following placeholders in the template with any trigger type:
- `{{correspondent}}`: assigned correspondent name
- `{{document_type}}`: assigned document type name
- `{{owner_username}}`: assigned owner username
- `{{added}}`: added datetime
- `{{added_year}}`: added year
- `{{added_year_short}}`: added year
- `{{added_month}}`: added month
- `{{added_month_name}}`: added month name
- `{{added_month_name_short}}`: added month short name
- `{{added_day}}`: added day
- `{{added_time}}`: added time in HH:MM format
- `{{original_filename}}`: original file name without extension
- `{{filename}}`: current file name without extension (for "added" workflows this may not be final yet, you can use `{{original_filename}}`)
- `{{doc_title}}`: current document title
The following placeholders are only available for "added" or "updated" triggers
- `{created}`: created datetime
- `{created_year}`: created year
- `{created_year_short}`: created year
- `{created_month}`: created month
- `{created_month_name}`: created month name
- `{created_month_name_short}`: created month short name
- `{created_day}`: created day
- `{created_time}`: created time in HH:MM format
- `{doc_url}`: URL to the document in the web UI. Requires the `PAPERLESS_URL` setting to be set.
- `{{created}}`: created datetime
- `{{created_year}}`: created year
- `{{created_year_short}}`: created year
- `{{created_month}}`: created month
- `{{created_month_name}}`: created month name
- `{{created_month_name_short}}`: created month short name
- `{{created_day}}`: created day
- `{{created_time}}`: created time in HH:MM format
- `{{doc_url}}`: URL to the document in the web UI. Requires the `PAPERLESS_URL` setting to be set.
##### Examples
```jinja2
{{ created | localize_date('MMMM', 'en_US') }}
<!-- Output: "January" -->
{{ added | localize_date('MMMM', 'de_DE') }}
<!-- Output: "Juni" --> # codespell:ignore
```
### Workflow permissions
All users who have application permissions for editing workflows can see the same set
of workflows. In other words, workflows themselves intentionally do not have an owner or permissions.
Given their potentially far-reaching capabilities, you may want to restrict access to workflows.
Given their potentially far-reaching capabilities, including changing the permissions of existing documents, you may want to restrict access to workflows.
Upon migration, existing installs will grant access to workflows to users who can add
documents (and superusers who can always access all parts of the app).
@@ -544,27 +638,16 @@ The following custom field types are supported:
- `Document Link`: reference(s) to other document(s) displayed as links, automatically creates a symmetrical link in reverse
- `Select`: a pre-defined list of strings from which the user can choose
## Share Links
Paperless-ngx added the ability to create shareable links to files in version 2.0. You can find the button for this on the document detail screen.
- Share links do not require a user to login and thus link directly to a file.
- Links are unique and are of the form `{paperless-url}/share/{randomly-generated-slug}`.
- Links can optionally have an expiration time set.
- After a link expires or is deleted users will be redirected to the regular paperless-ngx login.
!!! tip
If your paperless-ngx instance is behind a reverse-proxy you may want to create an exception to bypass any authentication layers that are part of your setup in order to make links truly publicly-accessible. Of course, do so with caution.
## PDF Actions
Paperless-ngx supports four basic editing operations for PDFs (these operations currently cannot be performed on non-PDF files):
Paperless-ngx supports basic editing operations for PDFs (these operations currently cannot be performed on non-PDF files). When viewing an individual document you can
open the 'PDF Editor' to use a simple UI for re-arranging, rotating, deleting pages and splitting documents.
- Merging documents: available when selecting multiple documents for 'bulk editing'.
- Rotating documents: available when selecting multiple documents for 'bulk editing' and from an individual document's details page.
- Rotating documents: available when selecting multiple documents for 'bulk editing' and via the pdf editor on an individual document's details page.
- Splitting documents: available from an individual document's details page.
- Splitting documents: via the pdf editor on an individual document's details page.
- Deleting pages: available from an individual document's details page.
- Deleting pages: via the pdf editor on an individual document's details page.
- Re-arranging pages: via the pdf editor on an individual document's details page.
!!! important
@@ -582,7 +665,7 @@ When you first delete a document it is moved to the 'trash' until either it is e
You can set how long documents remain in the trash before being automatically deleted with [`PAPERLESS_EMPTY_TRASH_DELAY`](configuration.md#PAPERLESS_EMPTY_TRASH_DELAY), which defaults
to 30 days. Until the file is actually deleted (e.g. the trash is emptied), all files and database content remains intact and can be restored at any point up until that time.
Additionally you may configure a directory where deleted files are moved to when they the trash is emptied with [`PAPERLESS_EMPTY_TRASH_DIR`](configuration.md#PAPERLESS_EMPTY_TRASH_DIR).
Additionally you may configure a directory where deleted files are moved to when the trash is emptied with [`PAPERLESS_EMPTY_TRASH_DIR`](configuration.md#PAPERLESS_EMPTY_TRASH_DIR).
Note that the empty trash directory only stores the original file, the archive file and all database information is permanently removed once a document is fully deleted.
<a class="btn btn-sm btn-link" title="Read the documentation about this setting" i18n-title [href]="getDocsUrl(option.config_key)" target="_blank" referrerpolicy="no-referrer">
<i-bs name="info-circle"></i-bs>
</a>
</h6>
</h6>
<a class="btn btn-sm btn-link" title="Read the documentation about this setting" i18n-title [href]="getDocsUrl(option.config_key)" target="_blank" referrerpolicy="no-referrer">