Compare commits


1 Commit

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Trenton H | 2fbf975903 | Experimetns with using magika instead of magic for mime detection | 2026-04-23 09:09:33 -07:00 |
105 changed files with 7708 additions and 7224 deletions
+606 -301
File diff suppressed because it is too large
+5250 -3762
File diff suppressed because it is too large
-24
@@ -1,29 +1,5 @@
# Changelog
## paperless-ngx 2.20.15
### Security
- Resolve [GHSA-96jx-fj7m-qh6x](https://github.com/paperless-ngx/paperless-ngx/security/advisories/GHSA-8c6x-pfjq-9gr7)
### Bug Fixes
- Fix: use only allauth login/logout endpoints [@shamoon](https://github.com/shamoon) ([#12639](https://github.com/paperless-ngx/paperless-ngx/pull/12639))
- Fix: correctly scope mail account enumeration [@shamoon](https://github.com/shamoon) ([#12636](https://github.com/paperless-ngx/paperless-ngx/pull/12636))
- Fix: prevent intermediate change event when CustomFieldQueryAtom operator changes type [@ggouzi](https://github.com/ggouzi) ([#12597](https://github.com/paperless-ngx/paperless-ngx/pull/12597))
- Fix: reject invalid requests to API notes endpoint [@ggouzi](https://github.com/ggouzi) ([#12582](https://github.com/paperless-ngx/paperless-ngx/pull/12582))
### All App Changes
<details>
<summary>4 changes</summary>
- Fix: use only allauth login/logout endpoints [@shamoon](https://github.com/shamoon) ([#12639](https://github.com/paperless-ngx/paperless-ngx/pull/12639))
- Fix: correctly scope mail account enumeration [@shamoon](https://github.com/shamoon) ([#12636](https://github.com/paperless-ngx/paperless-ngx/pull/12636))
- Fix: prevent intermediate change event when CustomFieldQueryAtom operator changes type [@ggouzi](https://github.com/ggouzi) ([#12597](https://github.com/paperless-ngx/paperless-ngx/pull/12597))
- Fix: reject invalid requests to API notes endpoint [@ggouzi](https://github.com/ggouzi) ([#12582](https://github.com/paperless-ngx/paperless-ngx/pull/12582))
</details>
## paperless-ngx 2.20.14
### Bug Fixes
+14 -22
@@ -2014,57 +2014,49 @@ suggestions. This setting is required to be set to true in order to use the AI f
#### [`PAPERLESS_AI_LLM_EMBEDDING_BACKEND=<str>`](#PAPERLESS_AI_LLM_EMBEDDING_BACKEND) {#PAPERLESS_AI_LLM_EMBEDDING_BACKEND}
: The embedding backend to use for RAG. This can be either "openai-like" or "huggingface". The
"openai-like" backend uses an OpenAI-compatible embeddings API.
: The embedding backend to use for RAG. This can be either "openai" or "huggingface".
Defaults to None.
#### [`PAPERLESS_AI_LLM_EMBEDDING_MODEL=<str>`](#PAPERLESS_AI_LLM_EMBEDDING_MODEL) {#PAPERLESS_AI_LLM_EMBEDDING_MODEL}
: The model to use for the embedding backend for RAG. This can be set to any of the embedding
models supported by the current embedding backend. If not supplied, defaults to
"text-embedding-3-small" for the OpenAI-compatible backend and
"sentence-transformers/all-MiniLM-L6-v2" for Huggingface.
: The model to use for the embedding backend for RAG. This can be set to any of the embedding models supported by the current embedding backend. If not supplied, defaults to "text-embedding-3-small" for OpenAI and "sentence-transformers/all-MiniLM-L6-v2" for Huggingface.
Defaults to None.
#### [`PAPERLESS_AI_LLM_BACKEND=<str>`](#PAPERLESS_AI_LLM_BACKEND) {#PAPERLESS_AI_LLM_BACKEND}
: The AI backend to use. This can be either "openai-like" or "ollama". If set to "ollama", the AI
features will be run locally on your machine. If set to "openai-like", the AI features will use
an OpenAI-compatible API endpoint, including OpenAI itself and compatible providers. This
setting is required to be set to use the AI features.
: The AI backend to use. This can be either "openai" or "ollama". If set to "ollama", the AI
features will be run locally on your machine. If set to "openai", the AI features will be run
using the OpenAI API. This setting is required to be set to use the AI features.
Defaults to None.
!!! note
Remote AI providers may be paid services. If you use a hosted OpenAI-compatible API, you
are responsible for any usage charges incurred by Paperless-ngx features, and your
document data will be sent to the provider you configure.
The OpenAI API is a paid service. You will need to set up an OpenAI account and
will be charged for usage incurred by Paperless-ngx features and your document data
will (of course) be sent to the OpenAI API. Paperless-ngx does not endorse the use of the
OpenAI API in any way.
Paperless-ngx does not endorse any specific provider. Refer to your provider's terms of
service and privacy policy, and use at your own risk.
Refer to the OpenAI terms of service, and use at your own risk.
#### [`PAPERLESS_AI_LLM_MODEL=<str>`](#PAPERLESS_AI_LLM_MODEL) {#PAPERLESS_AI_LLM_MODEL}
: The model to use for the AI backend, i.e. "gpt-3.5-turbo", "gpt-4" or any of the models supported
by the current backend. If not supplied, defaults to "gpt-3.5-turbo" for the OpenAI-compatible
backend and "llama3.1" for Ollama.
: The model to use for the AI backend, i.e. "gpt-3.5-turbo", "gpt-4" or any of the models supported by the
current backend. If not supplied, defaults to "gpt-3.5-turbo" for OpenAI and "llama3.1" for Ollama.
Defaults to None.
#### [`PAPERLESS_AI_LLM_API_KEY=<str>`](#PAPERLESS_AI_LLM_API_KEY) {#PAPERLESS_AI_LLM_API_KEY}
: The API key to use for the AI backend. This is typically required for the OpenAI-compatible
backend (optional for others).
: The API key to use for the AI backend. This is required for the OpenAI backend (optional for others).
Defaults to None.
#### [`PAPERLESS_AI_LLM_ENDPOINT=<str>`](#PAPERLESS_AI_LLM_ENDPOINT) {#PAPERLESS_AI_LLM_ENDPOINT}
: The endpoint / url to use for the AI backend. This is required for the Ollama backend and may be
used with the OpenAI-compatible backend to target a custom provider or local gateway.
: The endpoint / url to use for the AI backend. This is required for the Ollama backend (optional for others).
Defaults to None.
+2 -8
@@ -302,19 +302,13 @@ Paperless-ngx includes several features that use AI to enhance the document mana
!!! warning
Remember that Paperless-ngx will send document content to the AI provider you have configured,
so consider the privacy implications of using these features, especially if using a remote
model or API provider instead of the default local model.
Remember that Paperless-ngx will send document content to the AI provider you have configured, so consider the privacy implications of using these features, especially if using a remote model (e.g. OpenAI), instead of the default local model.
The AI features work by creating an embedding of the text content and metadata of documents, which is then used for various tasks such as similarity search and question answering. This uses the FAISS vector store.
### AI-Enhanced Suggestions
If enabled, Paperless-ngx can use an AI LLM model to suggest document titles, dates, tags,
correspondents and document types for documents. This feature will always be "opt-in" and does not
disable the existing classifier-based suggestion system. Currently, both remote
(via OpenAI-compatible APIs) and local (via Ollama) models are supported, see
[configuration](configuration.md#ai) for details.
If enabled, Paperless-ngx can use an AI LLM model to suggest document titles, dates, tags, correspondents and document types for documents. This feature will always be "opt-in" and does not disable the existing classifier-based suggestion system. Currently, both remote (via the OpenAI API) and local (via Ollama) models are supported, see [configuration](configuration.md#ai) for details.
### Document Chat
+2 -16
@@ -30,25 +30,11 @@
"**/.idea": true,
"**/.venv": true,
"**/.coverage": true,
"**/coverage.json": true,
"htmlcov/": true,
"coverage.xml": true,
"junit.xml": true
"**/coverage.json": true
},
"python.languageServer": "Pylance",
"python.defaultInterpreterPath": "${workspaceFolder:paperless-ngx}/.venv/bin/python3",
"python.analysis.extraPaths": ["${workspaceFolder:paperless-ngx}/src"],
"python.defaultInterpreterPath": ".venv/bin/python3",
"python.analysis.inlayHints.pytestParameters": true,
"python.testing.pytestEnabled": true,
"python.testing.unittestEnabled": false,
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.fixAll.ruff": "explicit",
"source.organizeImports.ruff": "explicit"
}
}
},
"extensions": {
"recommendations": ["ms-python.python", "charliermarsh.ruff", "editorconfig.editorconfig"],
+6 -9
@@ -1,6 +1,6 @@
[project]
name = "paperless-ngx"
version = "2.20.15"
version = "2.20.14"
description = "A community-supported supercharged document management system: scan, index and archive all your physical documents"
readme = "README.md"
requires-python = ">=3.11"
@@ -53,10 +53,11 @@ dependencies = [
"langdetect~=1.0.9",
"llama-index-core>=0.14.12",
"llama-index-embeddings-huggingface>=0.6.1",
"llama-index-embeddings-openai-like>=0.2.2",
"llama-index-embeddings-openai>=0.5.1",
"llama-index-llms-ollama>=0.9.1",
"llama-index-llms-openai-like>=0.7.1",
"llama-index-llms-openai>=0.6.13",
"llama-index-vector-stores-faiss>=0.5.2",
"magika>=1.0.2",
"nltk~=3.9.1",
"ocrmypdf~=17.4.0",
"openai>=1.76",
@@ -66,14 +67,13 @@ dependencies = [
"python-dotenv~=1.2.1",
"python-gnupg~=0.5.4",
"python-ipware~=3.0.0",
"python-magic~=0.4.27",
"rapidfuzz~=3.14.0",
"redis[hiredis]~=5.2.1",
"regex>=2025.9.18",
"scikit-learn~=1.8.0",
"sentence-transformers>=4.1",
"setproctitle~=1.3.4",
"tantivy~=0.26.0",
"tantivy>=0.25.1",
"tika-client~=0.11.0",
"torch~=2.11.0",
"watchfiles>=1.1.1",
@@ -143,8 +143,7 @@ typing = [
"types-python-dateutil",
"types-pytz",
"types-redis",
"types-regex",
"types-setuptools"
"types-setuptools",
]
[tool.uv]
@@ -179,8 +178,6 @@ respect-gitignore = true
fix = true
show-fixes = true
output-format = "grouped"
[tool.ruff.format]
line-ending = "lf"
[tool.ruff.lint]
# https://docs.astral.sh/ruff/rules/
extend-select = [
+95 -131
@@ -1834,7 +1834,7 @@
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">54</context>
<context context-type="linenumber">53</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/custom-fields-query-dropdown/custom-fields-query-dropdown.component.html</context>
@@ -1865,40 +1865,40 @@
<context context-type="linenumber">37</context>
</context-group>
</trans-unit>
<trans-unit id="4880728824338713664" datatype="html">
<source>Filter by</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">55</context>
</context-group>
</trans-unit>
<trans-unit id="424356320420294719" datatype="html">
<source>All types</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">60</context>
<context context-type="linenumber">58</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">209</context>
<context context-type="linenumber">202</context>
</context-group>
</trans-unit>
<trans-unit id="131016739441837046" datatype="html">
<source>All sources</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">69</context>
<context context-type="linenumber">67</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">221</context>
<context context-type="linenumber">214</context>
</context-group>
</trans-unit>
<trans-unit id="4880728824338713664" datatype="html">
<source>Filter by</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">76</context>
</context-group>
</trans-unit>
<trans-unit id="6849725902312323996" datatype="html">
<source>Reset filters</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">99</context>
<context context-type="linenumber">98</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-list.component.html</context>
@@ -1913,18 +1913,18 @@
<source>{VAR_PLURAL, plural, =1 {1 task} other {<x id="INTERPOLATION"/> tasks}}</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">121</context>
<context context-type="linenumber">108</context>
</context-group>
</trans-unit>
<trans-unit id="8953033926734869941" datatype="html">
<source>Name</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">143</context>
<context context-type="linenumber">130</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">49</context>
<context context-type="linenumber">48</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/trash/trash.component.html</context>
@@ -2031,7 +2031,7 @@
<source>Created</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">144</context>
<context context-type="linenumber">131</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/dates-dropdown/dates-dropdown.component.html</context>
@@ -2062,21 +2062,21 @@
<source>Results</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">146</context>
<context context-type="linenumber">133</context>
</context-group>
</trans-unit>
<trans-unit id="314315645942131479" datatype="html">
<source>Info</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">148</context>
<context context-type="linenumber">135</context>
</context-group>
</trans-unit>
<trans-unit id="3193976279273491157" datatype="html">
<source>Actions</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">149</context>
<context context-type="linenumber">136</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/trash/trash.component.html</context>
@@ -2147,18 +2147,18 @@
<source>click for full output</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">200</context>
<context context-type="linenumber">187</context>
</context-group>
</trans-unit>
<trans-unit id="1536087519743707362" datatype="html">
<source>Dismiss</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">213</context>
<context context-type="linenumber">200</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">310</context>
<context context-type="linenumber">298</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-detail/document-detail.component.ts</context>
@@ -2173,28 +2173,28 @@
<source>Open Document</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">218</context>
<context context-type="linenumber">205</context>
</context-group>
</trans-unit>
<trans-unit id="5404759957685833020" datatype="html">
<source>Result message</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">231</context>
<context context-type="linenumber">218</context>
</context-group>
</trans-unit>
<trans-unit id="6621329748219109148" datatype="html">
<source>Duplicate</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">238</context>
<context context-type="linenumber">225</context>
</context-group>
</trans-unit>
<trans-unit id="7593555694782789615" datatype="html">
<source>Open</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">246</context>
<context context-type="linenumber">233</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
@@ -2225,77 +2225,77 @@
<source>Input data</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">256</context>
<context context-type="linenumber">243</context>
</context-group>
</trans-unit>
<trans-unit id="1585185618099050920" datatype="html">
<source>Result data</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">262</context>
<context context-type="linenumber">249</context>
</context-group>
</trans-unit>
<trans-unit id="7976920528153858271" datatype="html">
<source>No tasks match the current filters.</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">284</context>
<context context-type="linenumber">271</context>
</context-group>
</trans-unit>
<trans-unit id="2525230676386818985" datatype="html">
<source>Result</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">50</context>
<context context-type="linenumber">49</context>
</context-group>
</trans-unit>
<trans-unit id="2952127331314675199" datatype="html">
<source>Needs attention</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">55</context>
<context context-type="linenumber">54</context>
</context-group>
</trans-unit>
<trans-unit id="3307214558906712688" datatype="html">
<source>In progress</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">56</context>
<context context-type="linenumber">55</context>
</context-group>
</trans-unit>
<trans-unit id="1371178633976505412" datatype="html">
<source>Recently completed</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">57</context>
<context context-type="linenumber">56</context>
</context-group>
</trans-unit>
<trans-unit id="2936452645673957909" datatype="html">
<source>Consume File</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">66</context>
<context context-type="linenumber">65</context>
</context-group>
</trans-unit>
<trans-unit id="5622386857946132209" datatype="html">
<source>Train Classifier</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">70</context>
<context context-type="linenumber">69</context>
</context-group>
</trans-unit>
<trans-unit id="776309955507105337" datatype="html">
<source>Sanity Check</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">74</context>
<context context-type="linenumber">73</context>
</context-group>
</trans-unit>
<trans-unit id="7502272564743467653" datatype="html">
<source>Mail Fetch</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">76</context>
<context context-type="linenumber">75</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
@@ -2306,56 +2306,56 @@
<source>LLM Index</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">77</context>
<context context-type="linenumber">76</context>
</context-group>
</trans-unit>
<trans-unit id="6402092370576716734" datatype="html">
<source>Empty Trash</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">80</context>
<context context-type="linenumber">79</context>
</context-group>
</trans-unit>
<trans-unit id="4173754487295857311" datatype="html">
<source>Check Workflows</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">84</context>
<context context-type="linenumber">83</context>
</context-group>
</trans-unit>
<trans-unit id="2065831805515896240" datatype="html">
<source>Bulk Update</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">88</context>
<context context-type="linenumber">87</context>
</context-group>
</trans-unit>
<trans-unit id="5062348912423871240" datatype="html">
<source>Reprocess Document</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">92</context>
<context context-type="linenumber">91</context>
</context-group>
</trans-unit>
<trans-unit id="6903696804811148799" datatype="html">
<source>Build Share Link</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">96</context>
<context context-type="linenumber">95</context>
</context-group>
</trans-unit>
<trans-unit id="6581973658756561124" datatype="html">
<source>Bulk Delete</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">100</context>
<context context-type="linenumber">99</context>
</context-group>
</trans-unit>
<trans-unit id="9172233176401579786" datatype="html">
<source>Scheduled</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">110</context>
<context context-type="linenumber">109</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
@@ -2366,7 +2366,7 @@
<source>Web UI</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">112</context>
<context context-type="linenumber">111</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
@@ -2377,7 +2377,7 @@
<source>API Upload</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">115</context>
<context context-type="linenumber">114</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
@@ -2388,77 +2388,77 @@
<source>Folder Consume</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">119</context>
<context context-type="linenumber">118</context>
</context-group>
</trans-unit>
<trans-unit id="1456901843819961615" datatype="html">
<source>Email Consume</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">123</context>
<context context-type="linenumber">122</context>
</context-group>
</trans-unit>
<trans-unit id="29832309535656200" datatype="html">
<source>System</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">125</context>
<context context-type="linenumber">124</context>
</context-group>
</trans-unit>
<trans-unit id="2233560223291461480" datatype="html">
<source>Manual</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">126</context>
<context context-type="linenumber">125</context>
</context-group>
</trans-unit>
<trans-unit id="5404910960991552159" datatype="html">
<source>Dismiss selected</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">233</context>
<context context-type="linenumber">226</context>
</context-group>
</trans-unit>
<trans-unit id="9169677036332103838" datatype="html">
<source>Dismiss visible</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">234</context>
<context context-type="linenumber">227</context>
</context-group>
</trans-unit>
<trans-unit id="3169751690815214293" datatype="html">
<source>Confirm Dismiss</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">307</context>
<context context-type="linenumber">295</context>
</context-group>
</trans-unit>
<trans-unit id="5029621907742319073" datatype="html">
<source>Dismiss <x id="PH" equiv-text="tasks.size"/> tasks?</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">308</context>
<context context-type="linenumber">296</context>
</context-group>
</trans-unit>
<trans-unit id="3597309129998924778" datatype="html">
<source>Error dismissing tasks</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">319</context>
<context context-type="linenumber">304</context>
</context-group>
</trans-unit>
<trans-unit id="2132179171926568807" datatype="html">
<source>Error dismissing task</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">331</context>
<context context-type="linenumber">313</context>
</context-group>
</trans-unit>
<trans-unit id="8760066891202884337" datatype="html">
<source>Duplicate of document #<x id="PH" equiv-text="this.duplicateDocumentId(task)"/></source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">421</context>
<context context-type="linenumber">408</context>
</context-group>
</trans-unit>
<trans-unit id="3418677553313974490" datatype="html">
@@ -3560,14 +3560,14 @@
<source>Ask a question about this document...</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/chat/chat/chat.component.ts</context>
<context context-type="linenumber">42</context>
<context context-type="linenumber">37</context>
</context-group>
</trans-unit>
<trans-unit id="224049052163028286" datatype="html">
<source>Ask a question about a document...</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/chat/chat/chat.component.ts</context>
<context context-type="linenumber">43</context>
<context context-type="linenumber">38</context>
</context-group>
</trans-unit>
<trans-unit id="8700121026680200191" datatype="html">
@@ -4622,23 +4622,23 @@
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">197</context>
<context context-type="linenumber">172</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">231</context>
<context context-type="linenumber">206</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">265</context>
<context context-type="linenumber">240</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">275</context>
<context context-type="linenumber">250</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">313</context>
<context context-type="linenumber">288</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/toast/toast.component.html</context>
@@ -6422,7 +6422,7 @@
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">332</context>
<context context-type="linenumber">307</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/manage/mail/mail.component.html</context>
@@ -7131,143 +7131,93 @@
<context context-type="linenumber">127</context>
</context-group>
</trans-unit>
<trans-unit id="1413806771452363882" datatype="html">
<source>Recent Task Activity <x id="START_TAG_SPAN" ctype="x-span" equiv-text="&lt;span class=&quot;small text-muted fw-light&quot;&gt;"/>(<x id="INTERPOLATION" equiv-text="{{status.tasks.summary.days}}"/> days)<x id="CLOSE_TAG_SPAN" ctype="x-span" equiv-text="&lt;/span&gt;"/></source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">145</context>
</context-group>
</trans-unit>
<trans-unit id="3448462145758383019" datatype="html">
<source>Total</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">150</context>
</context-group>
</trans-unit>
<trans-unit id="3521084103654700903" datatype="html">
<source>Successful</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">154</context>
</context-group>
</trans-unit>
<trans-unit id="7256395947475975935" datatype="html">
<source>Failed</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">158</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/data/share-link-bundle.ts</context>
<context context-type="linenumber">44</context>
</context-group>
</trans-unit>
<trans-unit id="4416413576346763682" datatype="html">
<source>Pending</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">162</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/data/share-link-bundle.ts</context>
<context context-type="linenumber">41</context>
</context-group>
</trans-unit>
<trans-unit id="2000433102211963975" datatype="html">
<source>No recent tasks</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">167</context>
</context-group>
</trans-unit>
<trans-unit id="2041675390931385838" datatype="html">
<source>Health</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">178</context>
<context context-type="linenumber">153</context>
</context-group>
</trans-unit>
<trans-unit id="31377277941774469" datatype="html">
<source>Search Index</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">182</context>
<context context-type="linenumber">157</context>
</context-group>
</trans-unit>
<trans-unit id="4089509911694721896" datatype="html">
<source>Last Updated</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">195</context>
<context context-type="linenumber">170</context>
</context-group>
</trans-unit>
<trans-unit id="46628344485199198" datatype="html">
<source>Classifier</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">200</context>
<context context-type="linenumber">175</context>
</context-group>
</trans-unit>
<trans-unit id="9127131074422113272" datatype="html">
<source>Run Task</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">222</context>
<context context-type="linenumber">197</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">256</context>
<context context-type="linenumber">231</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">304</context>
<context context-type="linenumber">279</context>
</context-group>
</trans-unit>
<trans-unit id="6096684179126491743" datatype="html">
<source>Last Trained</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">229</context>
<context context-type="linenumber">204</context>
</context-group>
</trans-unit>
<trans-unit id="6427836860962380759" datatype="html">
<source>Sanity Checker</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">234</context>
<context context-type="linenumber">209</context>
</context-group>
</trans-unit>
<trans-unit id="6578747070254776938" datatype="html">
<source>Last Run</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">263</context>
<context context-type="linenumber">238</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">311</context>
<context context-type="linenumber">286</context>
</context-group>
</trans-unit>
<trans-unit id="5921685253729220446" datatype="html">
<source>WebSocket Connection</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">268</context>
<context context-type="linenumber">243</context>
</context-group>
</trans-unit>
<trans-unit id="8998179362936748717" datatype="html">
<source>OK</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">272</context>
<context context-type="linenumber">247</context>
</context-group>
</trans-unit>
<trans-unit id="3804349597565969872" datatype="html">
<source>AI Index</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">281</context>
<context context-type="linenumber">256</context>
</context-group>
</trans-unit>
<trans-unit id="6732151329960766506" datatype="html">
@@ -10867,6 +10817,13 @@
<context context-type="linenumber">328</context>
</context-group>
</trans-unit>
<trans-unit id="4416413576346763682" datatype="html">
<source>Pending</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/data/share-link-bundle.ts</context>
<context context-type="linenumber">41</context>
</context-group>
</trans-unit>
<trans-unit id="9155387182259025015" datatype="html">
<source>Processing</source>
<context-group purpose="location">
@@ -10881,6 +10838,13 @@
<context context-type="linenumber">43</context>
</context-group>
</trans-unit>
<trans-unit id="7256395947475975935" datatype="html">
<source>Failed</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/data/share-link-bundle.ts</context>
<context context-type="linenumber">44</context>
</context-group>
</trans-unit>
<trans-unit id="1055686627716339120" datatype="html">
<source>Archive</source>
<context-group purpose="location">
+1 -1
@@ -1,6 +1,6 @@
{
"name": "paperless-ngx-ui",
"version": "2.20.15",
"version": "2.20.14",
"scripts": {
"preinstall": "npx only-allow pnpm",
"ng": "ng",
@@ -23,7 +23,7 @@
<div class="visually-hidden" i18n>Loading...</div>
}
<div class="task-controls mb-3 btn-toolbar align-items-center" role="toolbar">
<div class="task-controls mb-3 btn-toolbar">
<div class="task-view-scope btn-group btn-group-sm me-3" role="group">
<input
type="radio"
@@ -52,8 +52,6 @@
}
</div>
<div class="ms-3 me-2 text-muted"><ng-container i18n>Filter by</ng-container>:</div>
<div ngbDropdown>
<button class="btn btn-sm btn-outline-primary me-3" ngbDropdownToggle>{{selectedTaskTypeLabel}}</button>
<div class="dropdown-menu shadow" ngbDropdownMenu>
@@ -75,6 +73,7 @@
<div class="form-inline d-flex align-items-center flex-grow-1 task-search">
<div class="input-group input-group-sm flex-fill w-auto flex-nowrap">
<span class="input-group-text text-muted" i18n>Filter by</span>
<div ngbDropdown>
<button class="btn btn-sm btn-outline-primary" ngbDropdownToggle>{{filterTargetName}}</button>
<div class="dropdown-menu shadow" ngbDropdownMenu>
@@ -99,18 +98,6 @@
<i-bs width="1em" height="1em" name="x"></i-bs><small i18n>Reset filters</small>
</button>
}
<ngb-pagination
class="ms-md-3 mb-0"
[pageSize]="pageSize"
[collectionSize]="totalTasks"
[page]="page"
[maxSize]="5"
[rotate]="true"
size="sm"
aria-label="Tasks pagination"
(pageChange)="setPage($event)">
</ngb-pagination>
</div>
<ng-template let-tasks="tasks" let-section="section" #tasksTemplate>
@@ -124,10 +111,10 @@
</div>
<div class="card border table-responsive mb-3">
<table class="table table-striped align-middle shadow-sm mb-0 tasks-table">
<table class="table table-striped align-middle shadow-sm mb-0">
<thead>
<tr>
<th scope="col" class="select-col">
<th scope="col">
<div class="form-check">
<input
type="checkbox"
@@ -140,19 +127,19 @@
<label class="form-check-label" for="all-tasks-{{section}}"><span class="visually-hidden">Check all</span></label>
</div>
</th>
<th scope="col" class="name-col-header" i18n>Name</th>
<th scope="col" class="d-none d-lg-table-cell created-col" i18n>Created</th>
<th scope="col" i18n>Name</th>
<th scope="col" class="d-none d-lg-table-cell" i18n>Created</th>
@if (sectionShowsResults(section)) {
<th scope="col" class="d-none d-lg-table-cell results-col" i18n>Results</th>
<th scope="col" class="d-none d-lg-table-cell" i18n>Results</th>
}
<th scope="col" class="d-table-cell d-lg-none" i18n>Info</th>
<th scope="col" class="actions-col" i18n>Actions</th>
<th scope="col" i18n>Actions</th>
</tr>
</thead>
<tbody>
@for (task of tasks; track task.id) {
<tr (click)="toggleSelected(task); $event.stopPropagation();" (keydown)="toggleSelected(task); $event.stopPropagation();">
<td class="select-col">
<td>
<div class="form-check">
<input
type="checkbox"
@@ -174,9 +161,9 @@
<span>{{ task.trigger_source_display }}</span>
</div>
</td>
<td class="d-none d-lg-table-cell created-col">{{ task.date_created | customDate:'short' }}</td>
<td class="d-none d-lg-table-cell">{{ task.date_created | customDate:'short' }}</td>
@if (sectionShowsResults(section)) {
<td class="d-none d-lg-table-cell results-col">
<td class="d-none d-lg-table-cell">
@if (taskHasLongResultMessage(task)) {
<div class="result" (click)="expandTask(task); $event.stopPropagation();"
[ngbPopover]="resultPopover" popoverClass="shadow small mobile" triggers="mouseenter:mouseleave" container="body">
@@ -207,23 +194,23 @@
<i-bs width="1.2em" height="1.2em" name="info-circle"></i-bs>
</button>
</td>
<td scope="row" class="actions-col">
<td scope="row">
<div class="btn-group" role="group">
<button class="btn btn-sm btn-outline-secondary" (click)="dismissTask(task); $event.stopPropagation();" *pngxIfPermissions="{ action: PermissionAction.Change, type: PermissionType.PaperlessTask }">
<i-bs name="check" class="me-1"></i-bs><ng-container i18n>Dismiss</ng-container>
</button>
<ng-container *pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.Document }">
@if (task.related_document_ids?.[0]) {
<a class="btn btn-sm btn-outline-primary" [routerLink]="['/documents', task.related_document_ids[0]]" (click)="dismissTask(task)">
<button class="btn btn-sm btn-outline-primary" (click)="dismissAndGo(task); $event.stopPropagation();">
<i-bs name="file-text" class="me-1"></i-bs><ng-container i18n>Open Document</ng-container>
</a>
</button>
}
</ng-container>
</div>
</td>
</tr>
<tr>
<td class="px-2 py-0" [class.border-0]="expandedTask !== task.id" [attr.colspan]="sectionShowsResults(section) ? 5 : 4">
<td class="p-0" [class.border-0]="expandedTask !== task.id" [attr.colspan]="sectionShowsResults(section) ? 5 : 4">
<div #collapse="ngbCollapse" [ngbCollapse]="expandedTask !== task.id" class="task-detail-panel bg-darker small mb-0">
<div class="p-2 p-lg-3 ms-lg-3">
@if (taskHasResultMessage(task)) {
@@ -16,39 +16,6 @@ pre {
cursor: pointer;
}
.tasks-table {
width: 100%;
}
@media (min-width: 992px) {
.tasks-table {
table-layout: fixed;
}
.tasks-table .select-col {
width: 3rem;
}
.tasks-table .created-col {
width: 13rem;
white-space: nowrap;
}
.tasks-table .results-col {
width: 24%;
}
.tasks-table .actions-col {
width: 18rem;
white-space: nowrap;
}
.tasks-table .name-col,
.tasks-table .results-col {
overflow: hidden;
}
}
.btn .spinner-border-sm {
width: 0.8rem;
height: 0.8rem;
@@ -63,6 +30,8 @@ pre {
.input-group .dropdown .btn {
border-top-right-radius: 0;
border-bottom-right-radius: 0;
border-top-left-radius: 0;
border-bottom-left-radius: 0;
}
.z-10 {
@@ -19,7 +19,6 @@ import {
PaperlessTaskTriggerSource,
PaperlessTaskType,
} from 'src/app/data/paperless-task'
import { Results } from 'src/app/data/results'
import { IfPermissionsDirective } from 'src/app/directives/if-permissions.directive'
import { PermissionsGuard } from 'src/app/guards/permissions.guard'
import { CustomDatePipe } from 'src/app/pipes/custom-date.pipe'
@@ -149,11 +148,6 @@ const tasks: PaperlessTask[] = [
},
]
const paginatedTasks: Results<PaperlessTask> = {
count: tasks.length,
results: tasks,
}
describe('TasksComponent', () => {
let component: TasksComponent
let fixture: ComponentFixture<TasksComponent>
@@ -202,25 +196,9 @@ describe('TasksComponent', () => {
component = fixture.componentInstance
jest.useFakeTimers()
fixture.detectChanges()
httpTestingController
.expectOne(
(req) =>
req.url === `${environment.apiBaseUrl}tasks/` &&
req.params.get('acknowledged') === 'false' &&
req.params.get('page_size') === '1000'
)
.flush(paginatedTasks)
httpTestingController
.expectOne(
(req) =>
req.url === `${environment.apiBaseUrl}tasks/` &&
req.params.get('acknowledged') === 'false' &&
req.params.get('page_size') === '25' &&
req.params.get('page') === '1'
)
.flush(paginatedTasks)
.expectOne(`${environment.apiBaseUrl}tasks/?acknowledged=false`)
.flush(tasks)
})
it('should display task sections with counts', () => {
@@ -316,40 +294,6 @@ describe('TasksComponent', () => {
).toBeTruthy()
})
it('should render pagination controls next to the task filter', () => {
fixture.detectChanges()
const controls = fixture.debugElement.query(By.css('.task-controls'))
const search = controls.query(By.css('.task-search'))
const pagination = controls.query(By.css('ngb-pagination'))
expect(search).not.toBeNull()
expect(pagination).not.toBeNull()
})
it('should load a different task page when pagination changes', () => {
component.setPage(2)
const pageTwoTasks = {
count: 30,
results: [tasks[0]],
}
httpTestingController
.expectOne(
(req) =>
req.url === `${environment.apiBaseUrl}tasks/` &&
req.params.get('acknowledged') === 'false' &&
req.params.get('page_size') === '25' &&
req.params.get('page') === '2'
)
.flush(pageTwoTasks)
expect(component.page).toBe(2)
expect(component.totalTasks).toBe(30)
expect(component.pagedTasks).toEqual([tasks[0]])
})
it('should expose stable task type options and disable empty ones', () => {
expect(component.taskTypeOptions.map((option) => option.value)).toContain(
PaperlessTaskType.TrainClassifier
@@ -526,28 +470,18 @@ describe('TasksComponent', () => {
component.toggleSection(TaskSection.NeedsAttention, {
target: { checked: false },
} as unknown as PointerEvent)
} as PointerEvent)
expect(component.selectedTasks).toEqual(new Set())
})
it('should support dismiss and open a document', () => {
const dismissSpy = jest.spyOn(component, 'dismissTask')
fixture.detectChanges()
const openDocumentLink = fixture.debugElement
.queryAll(By.css('a'))
.find((link) => link.nativeElement.textContent.includes('Open Document'))
expect(openDocumentLink).not.toBeNull()
openDocumentLink.triggerEventHandler(
'click',
new MouseEvent('click', { ctrlKey: true })
)
expect(dismissSpy).toHaveBeenCalledWith(tasks[3])
const routerSpy = jest.spyOn(router, 'navigate')
component.dismissAndGo(tasks[3])
expect(routerSpy).toHaveBeenCalledWith([
'documents',
tasks[3].related_document_ids?.[0],
])
})
it('should auto refresh, allow toggle', () => {
@@ -1,12 +1,11 @@
import { JsonPipe, NgTemplateOutlet } from '@angular/common'
import { Component, inject, OnDestroy, OnInit } from '@angular/core'
import { FormsModule, ReactiveFormsModule } from '@angular/forms'
import { Router, RouterLink } from '@angular/router'
import { Router } from '@angular/router'
import {
NgbCollapseModule,
NgbDropdownModule,
NgbModal,
NgbPaginationModule,
NgbPopoverModule,
} from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
@@ -138,10 +137,8 @@ const TRIGGER_SOURCE_OPTIONS: Array<{
FormsModule,
ReactiveFormsModule,
NgTemplateOutlet,
RouterLink,
NgbCollapseModule,
NgbDropdownModule,
NgbPaginationModule,
NgbPopoverModule,
NgxBootstrapIconsModule,
],
@@ -164,10 +161,6 @@ export class TasksComponent
public selectedTasks: Set<number> = new Set()
public expandedTask: number
public autoRefreshEnabled: boolean = true
public readonly pageSize = 25
public page: number = 1
public totalTasks: number = 0
public pagedTasks: PaperlessTask[] = []
public selectedSection: TaskSection = TaskSection.All
public selectedTaskType: PaperlessTaskType | null = null
public selectedTriggerSource: PaperlessTaskTriggerSource | null = null
@@ -261,7 +254,6 @@ export class TasksComponent
ngOnInit() {
this.tasksService.reload()
this.reloadPage()
timer(5000, 5000)
.pipe(
filter(() => this.autoRefreshEnabled),
@@ -269,7 +261,6 @@ export class TasksComponent
)
.subscribe(() => {
this.tasksService.reload()
this.reloadPage(false)
})
this.filterDebounce
@@ -279,10 +270,7 @@ export class TasksComponent
distinctUntilChanged(),
filter((query) => !query.length || query.length > 2)
)
.subscribe((query) => {
this._filterText = query
this.clearSelection()
})
.subscribe((query) => (this._filterText = query))
}
ngOnDestroy() {
@@ -312,9 +300,6 @@ export class TasksComponent
modal.componentInstance.buttonsEnabled = false
modal.close()
this.tasksService.dismissTasks(tasks).subscribe({
next: () => {
this.reloadPage(false)
},
error: (e) => {
this.toastService.showError($localize`Error dismissing tasks`, e)
modal.componentInstance.buttonsEnabled = true
@@ -324,9 +309,6 @@ export class TasksComponent
})
} else if (tasks.size === 1) {
this.tasksService.dismissTasks(tasks).subscribe({
next: () => {
this.reloadPage(false)
},
error: (e) =>
this.toastService.showError($localize`Error dismissing task`, e),
})
@@ -334,6 +316,11 @@ export class TasksComponent
}
}
dismissAndGo(task: PaperlessTask) {
this.dismissTask(task)
this.router.navigate(['documents', task.related_document_ids?.[0]])
}
expandTask(task: PaperlessTask) {
this.expandedTask = this.expandedTask == task.id ? undefined : task.id
}
@@ -434,7 +421,7 @@ export class TasksComponent
}
tasksForSection(section: TaskSection): PaperlessTask[] {
let tasks = this.pagedTasks.filter((task) =>
let tasks = this.tasksService.allFileTasks.filter((task) =>
this.taskBelongsToSection(task, section)
)
@@ -446,7 +433,7 @@ export class TasksComponent
}
sectionCount(section: TaskSection): number {
return this.pagedTasks.filter((task) =>
return this.tasksService.allFileTasks.filter((task) =>
this.taskBelongsToSection(task, section)
).length
}
@@ -494,16 +481,6 @@ export class TasksComponent
this.selectedTasks.clear()
}
setPage(page: number) {
if (this.page === page) {
return
}
this.page = page
this.clearSelection()
this.reloadPage()
}
public resetFilter() {
this._filterText = ''
}
@@ -599,39 +576,10 @@ export class TasksComponent
? this.sections
: [this.selectedSection]
return this.pagedTasks.filter(
return this.tasksService.allFileTasks.filter(
(task) =>
sections.some((section) => this.taskBelongsToSection(task, section)) &&
this.taskMatchesFilters(task, { taskType, triggerSource })
)
}
private reloadPage(resetToFirstPage: boolean = false) {
if (resetToFirstPage) {
this.page = 1
}
this.loading = true
this.tasksService
.list(this.page, this.pageSize, { acknowledged: false })
.pipe(first(), takeUntil(this.unsubscribeNotifier))
.subscribe({
next: (result) => {
this.pagedTasks = result.results
this.totalTasks = result.count
this.loading = false
if (
this.page > 1 &&
this.pagedTasks.length === 0 &&
this.totalTasks > 0
) {
this.page -= 1
this.reloadPage()
}
},
error: () => {
this.loading = false
},
})
}
}
@@ -8,21 +8,10 @@
<div class="chat-messages font-monospace small">
@for (message of messages; track message) {
<div class="message d-flex flex-row small" [class.justify-content-end]="message.role === 'user'">
<div class="p-2 m-2" [class.bg-dark]="message.role === 'user'">
<span>
{{ message.content }}
@if (message.isStreaming) { <span class="blinking-cursor">|</span> }
</span>
@if (message.role === 'assistant' && message.references?.length) {
<div class="chat-references list-group mt-3">
@for (reference of message.references; track reference.id) {
<a class="list-group-item list-group-item-action text-primary" [routerLink]="['/documents', reference.id]">
<i-bs width="0.9em" height="0.9em" name="file-text" class="me-1"></i-bs><span>{{ reference.title }}</span>
</a>
}
</div>
}
</div>
<span class="p-2 m-2" [class.bg-dark]="message.role === 'user'">
{{ message.content }}
@if (message.isStreaming) { <span class="blinking-cursor">|</span> }
</span>
</div>
}
<div #scrollAnchor></div>
@@ -7,10 +7,6 @@
overflow-y: auto;
}
.chat-references {
font-family: var(--bs-font-sans-serif);
}
.dropdown-toggle::after {
display: none;
}
@@ -3,13 +3,9 @@ import { provideHttpClientTesting } from '@angular/common/http/testing'
import { ElementRef } from '@angular/core'
import { ComponentFixture, TestBed } from '@angular/core/testing'
import { NavigationEnd, Router } from '@angular/router'
import { RouterTestingModule } from '@angular/router/testing'
import { allIcons, NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { Subject } from 'rxjs'
import {
CHAT_METADATA_DELIMITER,
ChatService,
} from 'src/app/services/chat.service'
import { ChatService } from 'src/app/services/chat.service'
import { ChatComponent } from './chat.component'
describe('ChatComponent', () => {
@@ -22,11 +18,7 @@ describe('ChatComponent', () => {
beforeEach(async () => {
TestBed.configureTestingModule({
imports: [
NgxBootstrapIconsModule.pick(allIcons),
RouterTestingModule,
ChatComponent,
],
imports: [NgxBootstrapIconsModule.pick(allIcons), ChatComponent],
providers: [
provideHttpClient(withInterceptorsFromDi()),
provideHttpClientTesting(),
@@ -92,57 +84,6 @@ describe('ChatComponent', () => {
expect(component.messages[1].isStreaming).toBe(false)
})
it('should parse references from the metadata trailer without showing it', () => {
component.input = 'Hello'
component.sendMessage()
mockStream$.next(
`Hi there${CHAT_METADATA_DELIMITER}{"references":[{"id":42,"title":"Bread Recipe"}]}`
)
jest.advanceTimersByTime(1000)
expect(component.messages[1].content).toBe('Hi there')
expect(component.messages[1].references).toEqual([
{ id: 42, title: 'Bread Recipe' },
])
})
it('should render document reference links under assistant messages', () => {
component.input = 'Hello'
component.sendMessage()
mockStream$.next(
`Hi there${CHAT_METADATA_DELIMITER}{"references":[{"id":42,"title":"Bread Recipe"}]}`
)
jest.advanceTimersByTime(1000)
fixture.detectChanges()
const link = fixture.nativeElement.querySelector('.chat-references a')
expect(link.textContent).toContain('Bread Recipe')
expect(link.getAttribute('href')).toContain('/documents/42')
})
it('should remove delimiter fragments that were already streamed', () => {
component.input = 'Hello'
component.sendMessage()
mockStream$.next(`Hi there${CHAT_METADATA_DELIMITER.slice(0, 8)}`)
jest.advanceTimersByTime(1000)
expect(component.messages[1].content).toBe(
`Hi there${CHAT_METADATA_DELIMITER.slice(0, 8)}`
)
mockStream$.next(
`Hi there${CHAT_METADATA_DELIMITER}{"references":[{"id":42,"title":"Bread Recipe"}]}`
)
jest.advanceTimersByTime(1000)
expect(component.messages[1].content).toBe('Hi there')
expect(component.messages[1].references).toEqual([
{ id: 42, title: 'Bread Recipe' },
])
})
it('should handle errors during streaming', () => {
component.input = 'Hello'
component.sendMessage()
@@ -1,21 +1,16 @@
import { Component, ElementRef, inject, OnInit, ViewChild } from '@angular/core'
import { FormsModule, ReactiveFormsModule } from '@angular/forms'
import { NavigationEnd, Router, RouterModule } from '@angular/router'
import { NavigationEnd, Router } from '@angular/router'
import { NgbDropdownModule } from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { filter, map } from 'rxjs'
import {
ChatMessage,
ChatService,
parseChatResponse,
} from 'src/app/services/chat.service'
import { ChatMessage, ChatService } from 'src/app/services/chat.service'
@Component({
selector: 'pngx-chat',
imports: [
FormsModule,
ReactiveFormsModule,
RouterModule,
NgxBootstrapIconsModule,
NgbDropdownModule,
],
@@ -75,24 +70,13 @@ export class ChatComponent implements OnInit {
this.messages.push(assistantMessage)
this.loading = true
let lastVisibleContent = ''
let lastPartialLength = 0
this.chatService.streamChat(this.documentId, this.input).subscribe({
next: (chunk) => {
const nextResponse = parseChatResponse(chunk)
if (nextResponse.content.length < lastVisibleContent.length) {
this.resetTypewriter(assistantMessage, nextResponse.content)
lastVisibleContent = nextResponse.content
} else {
const visibleDelta = nextResponse.content.substring(
lastVisibleContent.length
)
lastVisibleContent = nextResponse.content
this.enqueueTypewriter(visibleDelta, assistantMessage)
}
assistantMessage.references = nextResponse.references
const delta = chunk.substring(lastPartialLength)
lastPartialLength = chunk.length
this.enqueueTypewriter(delta, assistantMessage)
},
error: () => {
assistantMessage.content += '\n\n⚠️ Error receiving response.'
@@ -109,13 +93,6 @@ export class ChatComponent implements OnInit {
this.input = ''
}
private resetTypewriter(message: ChatMessage, content: string): void {
this.typewriterBuffer = []
this.typewriterActive = false
message.content = content
this.scrollToBottom()
}
enqueueTypewriter(chunk: string, message: ChatMessage): void {
if (!chunk) return
@@ -272,7 +272,7 @@ export class PngxPdfViewerComponent
if (!this.hasRenderedPage) {
return
}
const query = this.searchQuery?.trim()
const query = this.searchQuery.trim()
if (query === this.lastFindQuery) {
return
}
@@ -142,31 +142,6 @@
}
</ng-template>
</dd>
<dt i18n>Recent Task Activity <span class="small text-muted fw-light">({{status.tasks.summary.days}} days)</span></dt>
<dd class="mb-0">
@if (status.tasks.summary.total_count > 0) {
<ul class="list-group border-light mt-2">
<li class="list-group-item list-group-item-action d-flex justify-content-between align-items-center">
<span class="small"><ng-container i18n>Total</ng-container>:</span>
<span class="badge bg-light rounded-pill">{{status.tasks.summary.total_count}}</span>
</li>
<li class="list-group-item list-group-item-action d-flex justify-content-between align-items-center">
<span class="small"><ng-container i18n>Successful</ng-container>:</span>
<span class="badge bg-primary rounded-pill">{{status.tasks.summary.success_count}}</span>
</li>
<li class="list-group-item list-group-item-action d-flex justify-content-between align-items-center">
<span class="small"><ng-container i18n>Failed</ng-container>:</span>
<span class="badge bg-danger rounded-pill">{{status.tasks.summary.failure_count}}</span>
</li>
<li class="list-group-item list-group-item-action d-flex justify-content-between align-items-center">
<span class="small"><ng-container i18n>Pending</ng-container>:</span>
<span class="badge bg-warning rounded-pill">{{status.tasks.summary.pending_count}}</span>
</li>
</ul>
} @else {
<span class="small text-muted" i18n>No recent tasks</span>
}
</dd>
</dl>
</div>
</div>
@@ -71,13 +71,6 @@ const status: SystemStatus = {
llmindex_status: SystemStatusItemStatus.OK,
llmindex_last_modified: new Date().toISOString(),
llmindex_error: null,
summary: {
days: 30,
total_count: 12,
pending_count: 1,
success_count: 10,
failure_count: 1,
},
},
}
@@ -54,10 +54,10 @@
@if (isFinished(status)) {
<div>
@if (status.documentId) {
<a class="btn btn-sm btn-outline-primary btn-open" [routerLink]="['/documents', status.documentId]" (click)="dismiss(status)">
<button class="btn btn-sm btn-outline-primary btn-open" routerLink="/documents/{{status.documentId}}" (click)="dismiss(status)">
<small i18n>Open document</small>
<i-bs name="arrow-right-short"></i-bs>
</a>
</button>
}
</div>
}
+2 -2
@@ -55,12 +55,12 @@ export const ConfigCategory = {
}
export const LLMEmbeddingBackendConfig = {
OPENAI_LIKE: 'openai-like',
OPENAI: 'openai',
HUGGINGFACE: 'huggingface',
}
export const LLMBackendConfig = {
OPENAI_LIKE: 'openai-like',
OPENAI: 'openai',
OLLAMA: 'ollama',
}
-7
@@ -47,13 +47,6 @@ export interface SystemStatus {
llmindex_status: SystemStatusItemStatus
llmindex_last_modified: string // ISO date string
llmindex_error: string
summary: {
days: number
total_count: number
pending_count: number
success_count: number
failure_count: number
}
}
websocket_connected?: SystemStatusItemStatus // added client-side
}
+1 -23
@@ -9,11 +9,7 @@ import {
} from '@angular/common/http/testing'
import { TestBed } from '@angular/core/testing'
import { environment } from 'src/environments/environment'
import {
CHAT_METADATA_DELIMITER,
ChatService,
parseChatResponse,
} from './chat.service'
import { ChatService } from './chat.service'
describe('ChatService', () => {
let service: ChatService
@@ -59,22 +55,4 @@ describe('ChatService', () => {
partialText: mockResponse,
} as any)
})
it('should parse chat references from the metadata trailer', () => {
const parsed = parseChatResponse(
`Answer text${CHAT_METADATA_DELIMITER}{"references":[{"id":1,"title":"Document 1"}]}`
)
expect(parsed.content).toBe('Answer text')
expect(parsed.references).toEqual([{ id: 1, title: 'Document 1' }])
})
it('should hide incomplete metadata trailer from the visible content', () => {
const parsed = parseChatResponse(
`Answer text${CHAT_METADATA_DELIMITER}{"references"`
)
expect(parsed.content).toBe('Answer text')
expect(parsed.references).toBeUndefined()
})
})
-40
@@ -11,46 +11,6 @@ export interface ChatMessage {
role: 'user' | 'assistant'
content: string
isStreaming?: boolean
references?: ChatReference[]
}
export interface ChatReference {
id: number
title: string
}
export interface ParsedChatResponse {
content: string
references?: ChatReference[]
}
export const CHAT_METADATA_DELIMITER = '\n\n__PAPERLESS_CHAT_METADATA__'
export function parseChatResponse(response: string): ParsedChatResponse {
const delimiterIndex = response.indexOf(CHAT_METADATA_DELIMITER)
if (delimiterIndex === -1) {
return { content: response }
}
const metadataString = response.slice(
delimiterIndex + CHAT_METADATA_DELIMITER.length
)
try {
const metadata = JSON.parse(metadataString) as {
references?: ChatReference[]
}
return {
content: response.slice(0, delimiterIndex),
references: metadata.references ?? [],
}
} catch {
return {
content: response.slice(0, delimiterIndex),
}
}
}
@Injectable({
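The removed helper's behavior — split the visible answer from its JSON metadata trailer, and tolerate a trailer that is still streaming in — can be sketched in Python (names hypothetical, mirroring the TypeScript shown in the hunk above):

```python
import json

# Delimiter value copied from the frontend constant in the hunk above.
CHAT_METADATA_DELIMITER = "\n\n__PAPERLESS_CHAT_METADATA__"


def parse_chat_response(response: str) -> dict:
    """Split a streamed chat answer from its JSON metadata trailer."""
    idx = response.find(CHAT_METADATA_DELIMITER)
    if idx == -1:
        return {"content": response}
    metadata_str = response[idx + len(CHAT_METADATA_DELIMITER):]
    try:
        metadata = json.loads(metadata_str)
        return {
            "content": response[:idx],
            "references": metadata.get("references", []),
        }
    except json.JSONDecodeError:
        # Trailer still streaming in: hide the partial metadata from the user.
        return {"content": response[:idx]}
```

This matches the two spec cases in the removed tests: a complete trailer yields references, an incomplete one is hidden from the visible content.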
+9 -29
@@ -1,8 +1,4 @@
import {
HttpRequest,
provideHttpClient,
withInterceptorsFromDi,
} from '@angular/common/http'
import { provideHttpClient, withInterceptorsFromDi } from '@angular/common/http'
import {
HttpTestingController,
provideHttpClientTesting,
@@ -41,21 +37,16 @@ describe('TasksService', () => {
it('calls tasks api endpoint on reload', () => {
tasksService.reload()
const req = httpTestingController.expectOne(
(req: HttpRequest<unknown>) =>
req.url === `${environment.apiBaseUrl}tasks/` &&
req.params.get('acknowledged') === 'false' &&
req.params.get('page_size') === '1000'
`${environment.apiBaseUrl}tasks/?acknowledged=false`
)
expect(req.request.method).toEqual('GET')
req.flush({ count: 0, results: [] })
})
it('does not call tasks api endpoint on reload if already loading', () => {
tasksService.loading = true
tasksService.reload()
httpTestingController.expectNone(
(req: HttpRequest<unknown>) =>
req.url === `${environment.apiBaseUrl}tasks/`
`${environment.apiBaseUrl}tasks/?acknowledged=false`
)
})
@@ -71,13 +62,8 @@ describe('TasksService', () => {
req.flush([])
// reload is then called
httpTestingController
.expectOne(
(req: HttpRequest<unknown>) =>
req.url === `${environment.apiBaseUrl}tasks/` &&
req.params.get('acknowledged') === 'false' &&
req.params.get('page_size') === '1000'
)
.flush({ count: 0, results: [] })
.expectOne(`${environment.apiBaseUrl}tasks/?acknowledged=false`)
.flush([])
})
it('groups mixed task types by status when reloading', () => {
@@ -138,13 +124,10 @@ describe('TasksService', () => {
tasksService.reload()
const req = httpTestingController.expectOne(
(req: HttpRequest<unknown>) =>
req.url === `${environment.apiBaseUrl}tasks/` &&
req.params.get('acknowledged') === 'false' &&
req.params.get('page_size') === '1000'
`${environment.apiBaseUrl}tasks/?acknowledged=false`
)
req.flush({ count: mockTasks.length, results: mockTasks })
req.flush(mockTasks)
expect(tasksService.allFileTasks).toHaveLength(5)
expect(tasksService.completedFileTasks).toHaveLength(2)
@@ -190,13 +173,10 @@ describe('TasksService', () => {
tasksService.reload()
const req = httpTestingController.expectOne(
(req: HttpRequest<unknown>) =>
req.url === `${environment.apiBaseUrl}tasks/` &&
req.params.get('acknowledged') === 'false' &&
req.params.get('page_size') === '1000'
`${environment.apiBaseUrl}tasks/?acknowledged=false`
)
req.flush({ count: mockTasks.length, results: mockTasks })
req.flush(mockTasks)
expect(tasksService.needsAttentionTasks).toHaveLength(2)
expect(tasksService.needsAttentionTasks.map((task) => task.status)).toEqual(
+4 -27
@@ -1,13 +1,12 @@
import { HttpClient } from '@angular/common/http'
import { Injectable, inject } from '@angular/core'
import { Observable, Subject } from 'rxjs'
import { first, map, takeUntil, tap } from 'rxjs/operators'
import { first, takeUntil, tap } from 'rxjs/operators'
import {
PaperlessTask,
PaperlessTaskStatus,
PaperlessTaskType,
} from 'src/app/data/paperless-task'
import { Results } from 'src/app/data/results'
import { environment } from 'src/environments/environment'
@Injectable({
@@ -18,7 +17,6 @@ export class TasksService {
private baseUrl: string = environment.apiBaseUrl
private endpoint: string = 'tasks'
private readonly defaultReloadPageSize = 1000
public loading: boolean = false
@@ -71,13 +69,9 @@ export class TasksService {
this.loading = true
this.http
.get<Results<PaperlessTask>>(`${this.baseUrl}${this.endpoint}/`, {
params: {
acknowledged: 'false',
page_size: this.defaultReloadPageSize,
},
})
.pipe(map((r) => r.results))
.get<PaperlessTask[]>(
`${this.baseUrl}${this.endpoint}/?acknowledged=false`
)
.pipe(takeUntil(this.unsubscribeNotifer), first())
.subscribe((r) => {
this.fileTasks = r
@@ -85,23 +79,6 @@ export class TasksService {
})
}
public list(
page: number,
pageSize: number,
extraParams?: Record<string, string | number | boolean>
): Observable<Results<PaperlessTask>> {
return this.http.get<Results<PaperlessTask>>(
`${this.baseUrl}${this.endpoint}/`,
{
params: {
page,
page_size: pageSize,
...extraParams,
},
}
)
}
public dismissTasks(task_ids: Set<number>): Observable<any> {
return this.http
.post(`${this.baseUrl}tasks/acknowledge/`, {
@@ -103,13 +103,7 @@ export class CustomFieldQueryAtom extends CustomFieldQueryElement {
return super.operator
}
constructor(
queryArray: [number, string, string | string[] | number[]] = [
null,
null,
null,
]
) {
constructor(queryArray: [number, string, string] = [null, null, null]) {
super(CustomFieldQueryElementType.Atom)
;[this._field, this._operator, this._value] = queryArray
}
+1 -1
@@ -6,7 +6,7 @@ export const environment = {
apiVersion: '10', // match src/paperless/settings.py
appTitle: 'Paperless-ngx',
tag: 'prod',
version: '2.20.15',
version: '2.20.14',
webSocketHost: window.location.host,
webSocketProtocol: window.location.protocol == 'https:' ? 'wss:' : 'ws:',
webSocketBaseUrl: base_url.pathname + 'ws/',
+1 -2
@@ -218,8 +218,7 @@ def set_llm_suggestions_cache(
timeout: int = CACHE_50_MINUTES,
) -> None:
"""
Cache LLM-generated suggestions using a backend-specific identifier
(e.g. 'openai-like:gpt-4').
Cache LLM-generated suggestions using a backend-specific identifier (e.g. 'openai:gpt-4').
"""
doc_key = get_suggestion_cache_key(document_id)
cache.set(
+3 -3
@@ -8,7 +8,6 @@ from pathlib import Path
from typing import TYPE_CHECKING
from typing import Final
import magic
from django.conf import settings
from django.contrib.auth.models import User
from django.db import transaction
@@ -52,6 +51,7 @@ from documents.utils import compute_checksum
from documents.utils import copy_basic_file_stats
from documents.utils import copy_file_with_basic_stats
from documents.utils import run_subprocess
from paperless import mime_detection
from paperless.config import OcrConfig
from paperless.models import ArchiveFileGenerationChoices
from paperless.parsers import ParserContext
@@ -424,7 +424,7 @@ class ConsumerPlugin(
# Determine the parser class.
mime_type = magic.from_file(self.working_copy, mime=True)
mime_type = mime_detection.from_file(self.working_copy)
self.log.debug(f"Detected mime type: {mime_type}")
@@ -446,7 +446,7 @@ class ConsumerPlugin(
],
logger=self.log,
)
mime_type = magic.from_file(self.working_copy, mime=True)
mime_type = mime_detection.from_file(self.working_copy)
self.log.debug(f"Detected mime type after qpdf: {mime_type}")
# Save the original file for later
self.unmodified_original = (
+3 -2
@@ -4,10 +4,11 @@ from enum import IntEnum
from pathlib import Path
from typing import TypedDict
import magic
from guardian.shortcuts import get_groups_with_perms
from guardian.shortcuts import get_users_with_perms
from paperless import mime_detection
@dataclasses.dataclass
class DocumentMetadataOverrides:
@@ -184,7 +185,7 @@ class ConsumableDocument:
# Get the file type once at init
# Note this function isn't called when the object is unpickled
self.mime_type = magic.from_file(self.original_file, mime=True)
self.mime_type = mime_detection.from_file(self.original_file)
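The new `paperless.mime_detection` module's `from_buffer`/`from_file` surface (its real implementation, presumably wrapping magika per the commit message, is not shown in this diff) can be illustrated with a toy signature-based sniffer:

```python
from pathlib import Path

# Toy magic-number table; the real module delegates to a proper detector.
_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def from_buffer(data: bytes) -> str:
    """Return the MIME type of in-memory file content."""
    for prefix, mime in _SIGNATURES:
        if data.startswith(prefix):
            return mime
    return "application/octet-stream"


def from_file(path: Path) -> str:
    """Return the MIME type of a file on disk."""
    return from_buffer(path.read_bytes()[:16])
```

The call sites in the hunks above only change `magic.from_file(path, mime=True)` to `mime_detection.from_file(path)`, so the shim needs exactly this two-function surface.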
class ConsumeFileDuplicateResult(TypedDict):
@@ -6,7 +6,6 @@ import tempfile
from itertools import islice
from pathlib import Path
from typing import TYPE_CHECKING
from typing import Any
from allauth.mfa.models import Authenticator
from allauth.socialaccount.models import SocialAccount
@@ -69,7 +68,7 @@ from paperless_mail.models import MailRule
def serialize_queryset_batched(
queryset: "QuerySet[Any]",
queryset: "QuerySet",
*,
batch_size: int = 500,
) -> "Generator[list[dict], None, None]":
@@ -365,7 +364,7 @@ class Command(CryptMixin, PaperlessCommand):
# 2. Create manifest, containing all correspondents, types, tags, storage paths
# note, documents and ui_settings
manifest_key_to_object_query: dict[str, QuerySet[Any]] = {
manifest_key_to_object_query: dict[str, QuerySet] = {
"correspondents": Correspondent.objects.all(),
"tags": Tag.objects.all(),
"document_types": DocumentType.objects.all(),
+1 -1
@@ -261,7 +261,7 @@ def get_objects_for_user_owner_aware(
Model: Any,
*,
include_deleted: bool = False,
) -> QuerySet[Any]:
) -> QuerySet:
"""
Returns objects the user owns, are unowned, or has explicit perms.
When include_deleted is True, soft-deleted items are also included.
+142 -90
@@ -3,6 +3,7 @@ from __future__ import annotations
import logging
import re
import threading
from collections import Counter
from datetime import UTC
from datetime import datetime
from enum import StrEnum
@@ -10,7 +11,6 @@ from typing import TYPE_CHECKING
from typing import Self
from typing import TypedDict
from typing import TypeVar
from typing import cast
import filelock
import regex
@@ -36,7 +36,7 @@ from documents.utils import identity
if TYPE_CHECKING:
from pathlib import Path
from django.contrib.auth.models import AbstractUser
from django.contrib.auth.base_user import AbstractBaseUser
from django.db.models import QuerySet
from documents.models import Document
@@ -169,16 +169,9 @@ class WriteBatch:
def __init__(self, backend: TantivyBackend, lock_timeout: float):
self._backend = backend
self._lock_timeout = lock_timeout
self._raw_writer: tantivy.IndexWriter | None = None
self._writer = None
self._lock = None
@property
def _writer(self) -> tantivy.IndexWriter:
assert self._raw_writer is not None, (
"WriteBatch not entered; use as context manager"
)
return self._raw_writer
def __enter__(self) -> Self:
if self._backend._path is not None:
lock_path = self._backend._path / ".tantivy.lock"
@@ -190,7 +183,7 @@ class WriteBatch:
f"Could not acquire index lock within {self._lock_timeout}s",
) from e
self._raw_writer = self._backend._index.writer()
self._writer = self._backend._index.writer()
return self
def __exit__(self, exc_type, exc_val, exc_tb):
@@ -200,9 +193,9 @@ class WriteBatch:
self._backend._index.reload()
# Explicitly delete writer to release tantivy's internal lock.
# On exception the uncommitted writer is simply discarded.
if self._raw_writer is not None:
del self._raw_writer
self._raw_writer = None
if self._writer is not None:
del self._writer
self._writer = None
finally:
if self._lock is not None:
self._lock.release()
@@ -229,9 +222,24 @@ class WriteBatch:
self._writer.add_document(doc)
def remove(self, doc_id: int) -> None:
"""Remove a document from the batch by its primary key."""
"""
Remove a document from the batch by its primary key.
Uses range_query instead of term_query to work around a tantivy-py bug
where Python integers are inferred as i64, producing Terms that never
match u64 fields.
TODO: Replace with term_query("id", doc_id) once
https://github.com/quickwit-oss/tantivy-py/pull/642 lands.
"""
self._writer.delete_documents_by_query(
tantivy.Query.term_query(self._backend._schema, "id", doc_id),
tantivy.Query.range_query(
self._backend._schema,
"id",
tantivy.FieldType.Unsigned,
doc_id,
doc_id,
),
)
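The workaround replaces an exact-match term query with the inclusive range `[doc_id, doc_id]`; over integers the two are equivalent, as this toy in-memory "index" (hypothetical helpers, not tantivy API) demonstrates:

```python
def match_term(ids: list[int], key: int) -> list[int]:
    """Exact-match lookup, the semantics term_query is meant to have."""
    return [i for i in ids if i == key]


def match_range(ids: list[int], low: int, high: int) -> list[int]:
    """Inclusive range lookup; [k, k] degenerates to equality."""
    return [i for i in ids if low <= i <= high]
```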
@@ -280,18 +288,8 @@ class TantivyBackend:
# path=None → in-memory index (for tests)
# path=some_dir → on-disk index (for production)
self._path = path
self._raw_index: tantivy.Index | None = None
self._raw_schema: tantivy.Schema | None = None
@property
def _index(self) -> tantivy.Index:
assert self._raw_index is not None, "Index not open; call open() first"
return self._raw_index
@property
def _schema(self) -> tantivy.Schema:
assert self._raw_schema is not None, "Schema not open; call open() first"
return self._raw_schema
self._index = None
self._schema = None
def open(self) -> None:
"""
@@ -301,14 +299,14 @@ class TantivyBackend:
version or language changes. Registers custom tokenizers after opening.
Safe to call multiple times - subsequent calls are no-ops.
"""
if self._raw_index is not None:
if self._index is not None:
return # pragma: no cover
if self._path is not None:
self._raw_index = open_or_rebuild_index(self._path)
self._index = open_or_rebuild_index(self._path)
else:
self._raw_index = tantivy.Index(build_schema())
register_tokenizers(self._raw_index, settings.SEARCH_LANGUAGE)
self._raw_schema = self._raw_index.schema
self._index = tantivy.Index(build_schema())
register_tokenizers(self._index, settings.SEARCH_LANGUAGE)
self._schema = self._index.schema
def close(self) -> None:
"""
@@ -316,12 +314,12 @@ class TantivyBackend:
Safe to call multiple times - subsequent calls are no-ops.
"""
self._raw_index = None
self._raw_schema = None
self._index = None
self._schema = None
def _ensure_open(self) -> None:
"""Ensure the index is open before operations."""
if self._raw_index is None:
if self._index is None:
self.open() # pragma: no cover
def _parse_query(
@@ -341,7 +339,7 @@ class TantivyBackend:
def _apply_permission_filter(
self,
query: tantivy.Query,
user: AbstractUser | None,
user: AbstractBaseUser | None,
) -> tantivy.Query:
"""Wrap a query with a permission filter if the user is not a superuser."""
if user is not None:
@@ -528,6 +526,15 @@ class TantivyBackend:
Use this when you already know which documents to display (from
search_ids + ORM filtering) and just need highlight data.
Note: Each doc_id requires an individual index lookup because tantivy-py
does not yet expose a batch fast-field read API. This is acceptable for
page-sized batches (typically 25 docs) but should not be called with
thousands of IDs.
TODO: When https://github.com/quickwit-oss/tantivy-py/pull/641 lands,
the per-doc range_query lookups here can be replaced with a single
collect_u64_fast_field("id", doc_addresses) call.
Args:
query: The search query (used for snippet generation)
doc_ids: Ordered list of document IDs to generate hits for
@@ -564,42 +571,32 @@ class TantivyBackend:
notes_text_query = user_query
searcher = self._index.searcher()
# Fetch all requested docs in a single search: user_query MUST match
# and exactly the requested IDs MUST match (OR of term_queries).
id_filter = tantivy.Query.boolean_query(
[
(
tantivy.Occur.Should,
tantivy.Query.term_query(self._schema, "id", did),
)
for did in doc_ids
],
)
batch_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, user_query),
(tantivy.Occur.Must, id_filter),
],
)
batch_results = searcher.search(batch_query, limit=len(doc_ids))
result_addrs = [addr for _score, addr in batch_results.hits]
result_ids = cast("list[int]", searcher.fast_field_values("id", result_addrs))
addr_by_id: dict[int, tuple[float, tantivy.DocAddress]] = {
doc_id: (score, addr)
for (score, addr), doc_id in zip(batch_results.hits, result_ids)
}
snippet_generator = None
notes_snippet_generator = None
hits: list[SearchHit] = []
for rank, doc_id in enumerate(doc_ids, start=rank_start):
if doc_id not in addr_by_id:
# Look up document by ID, scoring against the user query so that
# the returned SearchHit carries a real BM25 relevance score.
id_query = tantivy.Query.range_query(
self._schema,
"id",
tantivy.FieldType.Unsigned,
doc_id,
doc_id,
)
scored_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, user_query),
(tantivy.Occur.Must, id_query),
],
)
results = searcher.search(scored_query, limit=1)
if not results.hits:
continue
score, doc_address = addr_by_id[doc_id]
score, doc_address = results.hits[0]
actual_doc = searcher.doc(doc_address)
doc_dict = actual_doc.to_dict()
@@ -650,7 +647,7 @@ class TantivyBackend:
def search_ids(
self,
query: str,
user: AbstractUser | None,
user: AbstractBaseUser | None,
*,
sort_field: str | None = None,
sort_reverse: bool = False,
@@ -704,16 +701,16 @@ class TantivyBackend:
if threshold is not None:
all_hits = [hit for hit in all_hits if hit[1] >= threshold]
return cast(
"list[int]",
searcher.fast_field_values("id", [doc_addr for doc_addr, *_ in all_hits]),
)
# TODO: Replace with searcher.collect_u64_fast_field("id", addrs) once
# https://github.com/quickwit-oss/tantivy-py/pull/641 lands — eliminates
# one stored-doc fetch per result (~80% reduction in search_ids latency).
return [searcher.doc(doc_addr).to_dict()["id"][0] for doc_addr, *_ in all_hits]
def autocomplete(
self,
term: str,
limit: int,
user: AbstractUser | None = None,
user: AbstractBaseUser | None = None,
) -> list[str]:
"""
Get autocomplete suggestions for search queries.
@@ -741,25 +738,69 @@ class TantivyBackend:
searcher = self._index.searcher()
permission_query = None
# Build a prefix query on autocomplete_word so we only scan docs
# containing words that start with the prefix, not the entire index.
# tantivy regex is implicitly anchored; .+ avoids the empty-match
# error that .* triggers. We OR with term_query to also match the
# exact prefix as a complete word.
escaped = re.escape(normalized_term)
prefix_query = tantivy.Query.boolean_query(
[
(
tantivy.Occur.Should,
tantivy.Query.term_query(
self._schema,
"autocomplete_word",
normalized_term,
),
),
(
tantivy.Occur.Should,
tantivy.Query.regex_query(
self._schema,
"autocomplete_word",
f"{escaped}.+",
),
),
],
)
# Intersect with permission filter so autocomplete words from
# invisible documents don't leak to other users.
if user is not None and not user.is_superuser:
permission_query = build_permission_filter(self._schema, user)
final_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, prefix_query),
(tantivy.Occur.Must, build_permission_filter(self._schema, user)),
],
)
else:
final_query = prefix_query
matches = searcher.terms_with_prefix(
"autocomplete_word",
normalized_term,
permission_query,
limit,
results = searcher.search(final_query, limit=searcher.num_docs)
# Count how many visible documents each matching word appears in.
word_counts: Counter[str] = Counter()
for _score, doc_address in results.hits:
stored_doc = searcher.doc(doc_address)
doc_dict = stored_doc.to_dict()
if "autocomplete_word" in doc_dict:
for word in doc_dict["autocomplete_word"]:
if word.startswith(normalized_term):
word_counts[word] += 1
# Sort by document frequency descending; break ties alphabetically.
matches = sorted(
word_counts,
key=lambda w: (-word_counts[w], w),
)
return [x[0] for x in matches]
return matches[:limit]
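The new ranking loop reduces to a pure-Python pattern: count, across visible documents, the stored words that start with the typed prefix, then order suggestions by document frequency with an alphabetical tie-break. A sketch (hypothetical function name; `docs_words` stands in for the stored `autocomplete_word` values of each hit):

```python
from collections import Counter


def rank_suggestions(
    docs_words: list[list[str]],
    prefix: str,
    limit: int,
) -> list[str]:
    word_counts: Counter[str] = Counter()
    for words in docs_words:
        # Mirror the hunk above: each matching entry in a document's
        # stored word list increments that word's count.
        for word in words:
            if word.startswith(prefix):
                word_counts[word] += 1
    # Frequency descending, then alphabetical to break ties.
    return sorted(word_counts, key=lambda w: (-word_counts[w], w))[:limit]
```

The `(-count, word)` sort key is what makes the tie-break deterministic, which the old `terms_with_prefix` call handled internally.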
def more_like_this_ids(
self,
doc_id: int,
user: AbstractUser | None,
user: AbstractBaseUser | None,
*,
limit: int | None = None,
) -> list[int]:
@@ -780,7 +821,13 @@ class TantivyBackend:
self._ensure_open()
searcher = self._index.searcher()
id_query = tantivy.Query.term_query(self._schema, "id", doc_id)
id_query = tantivy.Query.range_query(
self._schema,
"id",
tantivy.FieldType.Unsigned,
doc_id,
doc_id,
)
results = searcher.search(id_query, limit=1)
if not results.hits:
@@ -804,9 +851,14 @@ class TantivyBackend:
# Fetch one extra to account for excluding the original document
results = searcher.search(final_query, limit=effective_limit + 1)
addrs = [addr for _score, addr in results.hits]
all_ids = cast("list[int]", searcher.fast_field_values("id", addrs))
ids = [rid for rid in all_ids if rid != doc_id]
# TODO: Replace with collect_u64_fast_field("id", addrs) once
# https://github.com/quickwit-oss/tantivy-py/pull/641 lands.
ids = []
for _score, doc_address in results.hits:
result_doc_id = searcher.doc(doc_address).to_dict()["id"][0]
if result_doc_id != doc_id:
ids.append(result_doc_id)
return ids[:limit] if limit is not None else ids
def batch_update(self, lock_timeout: float = 30.0) -> WriteBatch:
@@ -854,9 +906,9 @@ class TantivyBackend:
register_tokenizers(new_index, settings.SEARCH_LANGUAGE)
# Point instance at the new index so _build_tantivy_doc uses it
old_index, old_schema = self._raw_index, self._raw_schema
self._raw_index = new_index
self._raw_schema = new_index.schema
old_index, old_schema = self._index, self._schema
self._index = new_index
self._schema = new_index.schema
try:
writer = new_index.writer()
@@ -870,8 +922,8 @@ class TantivyBackend:
new_index.reload()
except BaseException: # pragma: no cover
# Restore old index on failure so the backend remains usable
self._raw_index = old_index
self._raw_schema = old_schema
self._index = old_index
self._schema = old_schema
raise
+38 -3
@@ -410,6 +410,9 @@ def normalize_query(query: str) -> str:
raise ValueError("Query too complex to process (normalization timed out)")
_MAX_U64 = 2**64 - 1 # u64 max — used as inclusive upper bound for "any owner" range
def build_permission_filter(
schema: tantivy.Schema,
user: AbstractBaseUser,
@@ -429,16 +432,48 @@ def build_permission_filter(
Returns:
Tantivy query that filters results to visible documents
Implementation Notes:
- Uses range_query instead of term_query for owner_id/viewer_id to work
around a tantivy-py bug where Python ints are inferred as i64, causing
term_query to return no hits on u64 fields.
TODO: Replace with term_query once
https://github.com/quickwit-oss/tantivy-py/pull/642 lands.
- Uses range_query(owner_id, 1, MAX_U64) as an "owner exists" check
because exists_query is not yet available in tantivy-py 0.25.
TODO: Replace with exists_query("owner_id") once that is exposed in
a tantivy-py release.
- Uses disjunction_max_query to combine permission clauses with OR logic
"""
owner_any = tantivy.Query.exists_query("owner_id")
owner_any = tantivy.Query.range_query(
schema,
"owner_id",
tantivy.FieldType.Unsigned,
1,
_MAX_U64,
)
no_owner = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, tantivy.Query.all_query()),
(tantivy.Occur.MustNot, owner_any),
],
)
owned = tantivy.Query.term_query(schema, "owner_id", user.pk)
shared = tantivy.Query.term_query(schema, "viewer_id", user.pk)
owned = tantivy.Query.range_query(
schema,
"owner_id",
tantivy.FieldType.Unsigned,
user.pk,
user.pk,
)
shared = tantivy.Query.range_query(
schema,
"viewer_id",
tantivy.FieldType.Unsigned,
user.pk,
user.pk,
)
return tantivy.Query.disjunction_max_query([no_owner, owned, shared])
+2 -5
@@ -4,8 +4,6 @@ import json
import logging
import shutil
from typing import TYPE_CHECKING
from typing import Final
from typing import cast
import tantivy
from django.conf import settings
@@ -15,8 +13,7 @@ if TYPE_CHECKING:
logger = logging.getLogger("paperless.search")
# v1 - Initial tantivy schema format
SCHEMA_VERSION: Final[int] = 1
SCHEMA_VERSION = 1
def build_schema() -> tantivy.Schema:
@@ -175,7 +172,7 @@ def open_or_rebuild_index(index_dir: Path | None = None) -> tantivy.Index:
Opened Tantivy index (caller must register custom tokenizers)
"""
if index_dir is None:
index_dir = cast("Path", settings.INDEX_DIR)
index_dir = settings.INDEX_DIR
if not index_dir.exists():
return tantivy.Index(build_schema())
if needs_rebuild(index_dir):
+20 -19
@@ -11,7 +11,6 @@ from typing import Any
from typing import Literal
from typing import TypedDict
import magic
from django.conf import settings
from django.contrib.auth.models import Group
from django.contrib.auth.models import User
@@ -49,6 +48,8 @@ from rest_framework.exceptions import PermissionDenied
from rest_framework.fields import SerializerMethodField
from rest_framework.filters import OrderingFilter
from paperless import mime_detection
if settings.AUDIT_LOG_ENABLED:
from auditlog.context import set_actor
@@ -214,7 +215,7 @@ class SetPermissionsMixin:
set_permissions_for_object(permissions, object)
class SerializerWithPerms(serializers.Serializer[dict[str, Any]]):
class SerializerWithPerms(serializers.Serializer):
def __init__(self, *args, **kwargs) -> None:
self.user = kwargs.pop("user", None)
self.full_perms = kwargs.pop("full_perms", False)
@@ -961,12 +962,20 @@ def _get_viewable_duplicates(
return duplicates.filter(id__in=allowed)
class DuplicateDocumentSummarySerializer(serializers.Serializer[dict[str, Any]]):
class DuplicateDocumentSummarySerializer(serializers.Serializer):
id = serializers.IntegerField()
title = serializers.CharField()
deleted_at = serializers.DateTimeField(allow_null=True)
class DocumentVersionInfoSerializer(serializers.Serializer):
id = serializers.IntegerField()
added = serializers.DateTimeField()
version_label = serializers.CharField(required=False, allow_null=True)
checksum = serializers.CharField(required=False, allow_null=True)
is_root = serializers.BooleanField()
class _DocumentVersionInfo(TypedDict):
id: int
added: datetime
@@ -975,14 +984,6 @@ class _DocumentVersionInfo(TypedDict):
is_root: bool
class DocumentVersionInfoSerializer(serializers.Serializer[_DocumentVersionInfo]):
id = serializers.IntegerField()
added = serializers.DateTimeField()
version_label = serializers.CharField(required=False, allow_null=True)
checksum = serializers.CharField(required=False, allow_null=True)
is_root = serializers.BooleanField()
@extend_schema_serializer(
deprecate_fields=["created_date"],
)
@@ -1532,7 +1533,7 @@ class SavedViewSerializer(OwnedObjectSerializer):
return saved_view
class DocumentListSerializer(serializers.Serializer[dict[str, list[int]]]):
class DocumentListSerializer(serializers.Serializer):
documents = serializers.ListField(
required=True,
label="Documents",
@@ -2085,7 +2086,7 @@ class BulkEditSerializer(
return attrs
class PostDocumentSerializer(serializers.Serializer[dict[str, Any]]):
class PostDocumentSerializer(serializers.Serializer):
created = serializers.DateTimeField(
label="Created",
allow_null=True,
@@ -2159,7 +2160,7 @@ class PostDocumentSerializer(serializers.Serializer[dict[str, Any]]):
def validate_document(self, document):
document_data = document.file.read()
mime_type = magic.from_buffer(document_data, mime=True)
mime_type = mime_detection.from_buffer(document_data)
if not is_mime_type_supported(mime_type):
if (
@@ -2262,7 +2263,7 @@ class PostDocumentSerializer(serializers.Serializer[dict[str, Any]]):
return created.date()
class DocumentVersionSerializer(serializers.Serializer[dict[str, Any]]):
class DocumentVersionSerializer(serializers.Serializer):
document = serializers.FileField(
label="Document",
write_only=True,
@@ -2278,7 +2279,7 @@ class DocumentVersionSerializer(serializers.Serializer[dict[str, Any]]):
validate_document = PostDocumentSerializer().validate_document
class DocumentVersionLabelSerializer(serializers.Serializer[dict[str, str | None]]):
class DocumentVersionLabelSerializer(serializers.Serializer):
version_label = serializers.CharField(
label="Version label",
required=True,
@@ -2484,7 +2485,7 @@ class TaskSerializerV10(OwnedObjectSerializer):
read_only_fields = fields
class TaskSerializerV9(serializers.ModelSerializer[PaperlessTask]):
class TaskSerializerV9(serializers.ModelSerializer):
"""Task serializer for API v9 backwards compatibility.
Maps old field names to the new model fields so existing clients continue
@@ -2609,7 +2610,7 @@ class TaskSerializerV9(serializers.ModelSerializer[PaperlessTask]):
return list(qs.values("id", "title", "deleted_at"))
class TaskSummarySerializer(serializers.Serializer[dict[str, Any]]):
class TaskSummarySerializer(serializers.Serializer):
task_type = serializers.CharField()
total_count = serializers.IntegerField()
pending_count = serializers.IntegerField()
@@ -2622,7 +2623,7 @@ class TaskSummarySerializer(serializers.Serializer[dict[str, Any]]):
last_failure = serializers.DateTimeField(allow_null=True)
class RunTaskSerializer(serializers.Serializer[dict[str, str]]):
class RunTaskSerializer(serializers.Serializer):
task_type = serializers.ChoiceField(
choices=PaperlessTask.TaskType.choices,
label="Task Type",
+6 -6
@@ -16,7 +16,7 @@ from documents.models import StoragePath
from documents.models import Tag
class CorrespondentFactory(DjangoModelFactory[Correspondent]):
class CorrespondentFactory(DjangoModelFactory):
class Meta:
model = Correspondent
@@ -25,7 +25,7 @@ class CorrespondentFactory(DjangoModelFactory[Correspondent]):
matching_algorithm = MatchingModel.MATCH_NONE
class DocumentTypeFactory(DjangoModelFactory[DocumentType]):
class DocumentTypeFactory(DjangoModelFactory):
class Meta:
model = DocumentType
@@ -34,7 +34,7 @@ class DocumentTypeFactory(DjangoModelFactory[DocumentType]):
matching_algorithm = MatchingModel.MATCH_NONE
class TagFactory(DjangoModelFactory[Tag]):
class TagFactory(DjangoModelFactory):
class Meta:
model = Tag
@@ -44,7 +44,7 @@ class TagFactory(DjangoModelFactory[Tag]):
is_inbox_tag = False
class StoragePathFactory(DjangoModelFactory[StoragePath]):
class StoragePathFactory(DjangoModelFactory):
class Meta:
model = StoragePath
@@ -56,7 +56,7 @@ class StoragePathFactory(DjangoModelFactory[StoragePath]):
matching_algorithm = MatchingModel.MATCH_NONE
class DocumentFactory(DjangoModelFactory[Document]):
class DocumentFactory(DjangoModelFactory):
class Meta:
model = Document
@@ -68,7 +68,7 @@ class DocumentFactory(DjangoModelFactory[Document]):
storage_path = None
class PaperlessTaskFactory(DjangoModelFactory[PaperlessTask]):
class PaperlessTaskFactory(DjangoModelFactory):
class Meta:
model = PaperlessTask
@@ -4,7 +4,6 @@ from __future__ import annotations
import io
from typing import TYPE_CHECKING
from typing import Any
import pytest
from django.core.management import CommandError
@@ -123,7 +122,7 @@ def mock_queryset():
This verifies we use .count() instead of len() for querysets.
"""
class MockQuerySet(QuerySet[Any]):
class MockQuerySet(QuerySet):
def __init__(self, items: list):
self._items = items
self.count_called = False
@@ -148,7 +147,7 @@ def mock_queryset():
class TestProcessResult:
"""Tests for the ProcessResult dataclass."""
def test_success_result(self) -> None:
def test_success_result(self):
result = ProcessResult(item=1, result=2, error=None)
assert result.item == 1
@@ -156,7 +155,7 @@ class TestProcessResult:
assert result.error is None
assert result.success is True
def test_error_result(self) -> None:
def test_error_result(self):
error = ValueError("test error")
result = ProcessResult(item=1, result=None, error=error)
@@ -170,7 +169,7 @@ class TestProcessResult:
class TestPaperlessCommandArguments:
"""Tests for argument parsing behavior."""
def test_progress_bar_argument_added_by_default(self) -> None:
def test_progress_bar_argument_added_by_default(self):
command = SimpleCommand()
parser = command.create_parser("manage.py", "simple")
@@ -180,14 +179,14 @@ class TestPaperlessCommandArguments:
options = parser.parse_args([])
assert options.no_progress_bar is False
def test_progress_bar_argument_not_added_when_disabled(self) -> None:
def test_progress_bar_argument_not_added_when_disabled(self):
command = NoProgressBarCommand()
parser = command.create_parser("manage.py", "noprogress")
options = parser.parse_args([])
assert not hasattr(options, "no_progress_bar")
def test_processes_argument_added_when_multiprocessing_enabled(self) -> None:
def test_processes_argument_added_when_multiprocessing_enabled(self):
command = MultiprocessCommand()
parser = command.create_parser("manage.py", "multiprocess")
@@ -197,7 +196,7 @@ class TestPaperlessCommandArguments:
options = parser.parse_args([])
assert options.processes >= 1
def test_processes_argument_not_added_when_multiprocessing_disabled(self) -> None:
def test_processes_argument_not_added_when_multiprocessing_disabled(self):
command = SimpleCommand()
parser = command.create_parser("manage.py", "simple")
@@ -232,7 +231,7 @@ class TestPaperlessCommandExecute:
*,
no_progress_bar_flag: bool,
expected: bool,
) -> None:
):
command = SimpleCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -242,10 +241,7 @@ class TestPaperlessCommandExecute:
assert command.no_progress_bar is expected
def test_no_progress_bar_always_true_when_not_supported(
self,
base_options: dict,
) -> None:
def test_no_progress_bar_always_true_when_not_supported(self, base_options: dict):
command = NoProgressBarCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -266,7 +262,7 @@ class TestPaperlessCommandExecute:
base_options: dict,
processes: int,
expected: int,
) -> None:
):
command = MultiprocessCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -287,7 +283,7 @@ class TestPaperlessCommandExecute:
self,
base_options: dict,
invalid_count: int,
) -> None:
):
command = MultiprocessCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -297,10 +293,7 @@ class TestPaperlessCommandExecute:
with pytest.raises(CommandError, match="--processes must be at least 1"):
command.execute(**options)
def test_process_count_defaults_to_one_when_not_supported(
self,
base_options: dict,
) -> None:
def test_process_count_defaults_to_one_when_not_supported(self, base_options: dict):
command = SimpleCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -319,7 +312,7 @@ class TestGetIterableLength:
self,
simple_command: SimpleCommand,
mock_queryset,
) -> None:
):
"""Should call .count() on Django querysets rather than len()."""
queryset = mock_queryset([1, 2, 3, 4, 5])
@@ -328,16 +321,13 @@ class TestGetIterableLength:
assert result == 5
assert queryset.count_called is True
def test_uses_len_for_sized(self, simple_command: SimpleCommand) -> None:
def test_uses_len_for_sized(self, simple_command: SimpleCommand):
"""Should use len() for sequences and other Sized types."""
result = simple_command._get_iterable_length([1, 2, 3, 4])
assert result == 4
def test_returns_none_for_unsized_iterables(
self,
simple_command: SimpleCommand,
) -> None:
def test_returns_none_for_unsized_iterables(self, simple_command: SimpleCommand):
"""Should return None for generators and other iterables without len()."""
result = simple_command._get_iterable_length(x for x in [1, 2, 3])
@@ -348,7 +338,7 @@ class TestGetIterableLength:
class TestTrack:
"""Tests for the track() method."""
def test_with_progress_bar_disabled(self, simple_command: SimpleCommand) -> None:
def test_with_progress_bar_disabled(self, simple_command: SimpleCommand):
simple_command.no_progress_bar = True
items = ["a", "b", "c"]
@@ -356,7 +346,7 @@ class TestTrack:
assert result == items
def test_with_progress_bar_enabled(self, simple_command: SimpleCommand) -> None:
def test_with_progress_bar_enabled(self, simple_command: SimpleCommand):
simple_command.no_progress_bar = False
items = [1, 2, 3]
@@ -364,7 +354,7 @@ class TestTrack:
assert result == items
def test_with_explicit_total(self, simple_command: SimpleCommand) -> None:
def test_with_explicit_total(self, simple_command: SimpleCommand):
simple_command.no_progress_bar = False
def gen():
@@ -374,7 +364,7 @@ class TestTrack:
assert result == [1, 2, 3]
def test_with_generator_no_total(self, simple_command: SimpleCommand) -> None:
def test_with_generator_no_total(self, simple_command: SimpleCommand):
def gen():
yield from [1, 2, 3]
@@ -382,7 +372,7 @@ class TestTrack:
assert result == [1, 2, 3]
def test_empty_iterable(self, simple_command: SimpleCommand) -> None:
def test_empty_iterable(self, simple_command: SimpleCommand):
result = list(simple_command.track([]))
assert result == []
@@ -392,7 +382,7 @@ class TestTrack:
simple_command: SimpleCommand,
mock_queryset,
mocker: MockerFixture,
) -> None:
):
"""Verify track() uses .count() for querysets."""
simple_command.no_progress_bar = False
queryset = mock_queryset([1, 2, 3])
@@ -413,7 +403,7 @@ class TestProcessParallel:
def test_sequential_processing_single_process(
self,
multiprocess_command: MultiprocessCommand,
) -> None:
):
multiprocess_command.process_count = 1
items = [1, 2, 3, 4, 5]
@@ -428,7 +418,7 @@ class TestProcessParallel:
def test_sequential_processing_handles_errors(
self,
multiprocess_command: MultiprocessCommand,
) -> None:
):
multiprocess_command.process_count = 1
items = [1, 2, 0, 4] # 0 causes ZeroDivisionError
@@ -448,7 +438,7 @@ class TestProcessParallel:
self,
multiprocess_command: MultiprocessCommand,
mocker: MockerFixture,
) -> None:
):
multiprocess_command.process_count = 2
items = [1, 2, 3]
@@ -465,7 +455,7 @@ class TestProcessParallel:
self,
multiprocess_command: MultiprocessCommand,
mocker: MockerFixture,
) -> None:
):
multiprocess_command.process_count = 2
items = [1, 2, 0, 4]
@@ -477,7 +467,7 @@ class TestProcessParallel:
assert len(failures) == 1
assert failures[0].item == 0
def test_empty_items(self, multiprocess_command: MultiprocessCommand) -> None:
def test_empty_items(self, multiprocess_command: MultiprocessCommand):
results = list(multiprocess_command.process_parallel(_double_value, []))
assert results == []
@@ -485,7 +475,7 @@ class TestProcessParallel:
def test_result_contains_original_item(
self,
multiprocess_command: MultiprocessCommand,
) -> None:
):
items = [10, 20, 30]
results = list(multiprocess_command.process_parallel(_double_value, items))
@@ -498,7 +488,7 @@ class TestProcessParallel:
self,
multiprocess_command: MultiprocessCommand,
mocker: MockerFixture,
) -> None:
):
"""Verify single process uses sequential path (important for testing)."""
multiprocess_command.process_count = 1
@@ -514,7 +504,7 @@ class TestProcessParallel:
self,
multiprocess_command: MultiprocessCommand,
mocker: MockerFixture,
) -> None:
):
"""Verify multiple processes uses parallel path."""
multiprocess_command.process_count = 2
@@ -16,7 +16,7 @@ pytestmark = [pytest.mark.search, pytest.mark.django_db]
class TestWriteBatch:
"""Test WriteBatch context manager functionality."""
def test_rolls_back_on_exception(self, backend: TantivyBackend) -> None:
def test_rolls_back_on_exception(self, backend: TantivyBackend):
"""Batch operations must rollback on exception to preserve index integrity."""
doc = Document.objects.create(
title="Rollback Target",
@@ -43,7 +43,7 @@ class TestSearch:
def test_text_mode_limits_default_search_to_title_and_content(
self,
backend: TantivyBackend,
) -> None:
):
"""Simple text mode must not match metadata-only fields."""
doc = Document.objects.create(
title="Invoice document",
@@ -71,7 +71,7 @@ class TestSearch:
def test_title_mode_limits_default_search_to_title_only(
self,
backend: TantivyBackend,
) -> None:
):
"""Title mode must not match content-only terms."""
doc = Document.objects.create(
title="Invoice document",
@@ -93,7 +93,7 @@ class TestSearch:
def test_text_mode_matches_partial_term_substrings(
self,
backend: TantivyBackend,
) -> None:
):
"""Simple text mode should support substring matching within tokens."""
doc = Document.objects.create(
title="Account access",
@@ -117,7 +117,7 @@ class TestSearch:
def test_text_mode_does_not_match_on_partial_term_overlap(
self,
backend: TantivyBackend,
) -> None:
):
"""Simple text mode should not match documents that merely share partial fragments."""
doc = Document.objects.create(
title="Adobe Acrobat PDF Files",
@@ -135,7 +135,7 @@ class TestSearch:
def test_text_mode_anchors_later_query_tokens_to_token_starts(
self,
backend: TantivyBackend,
) -> None:
):
"""Multi-token simple search should not match later tokens in the middle of a word."""
exact_doc = Document.objects.create(
title="Z-Berichte 6",
@@ -170,7 +170,7 @@ class TestSearch:
def test_text_mode_ignores_queries_without_searchable_tokens(
self,
backend: TantivyBackend,
) -> None:
):
"""Simple text mode should safely return no hits for symbol-only strings."""
doc = Document.objects.create(
title="Guide",
@@ -187,7 +187,7 @@ class TestSearch:
def test_title_mode_matches_partial_term_substrings(
self,
backend: TantivyBackend,
) -> None:
):
"""Title mode should support substring matching within title tokens."""
doc = Document.objects.create(
title="Password guide",
@@ -210,7 +210,7 @@ class TestSearch:
== 1
)
def test_sort_field_ascending(self, backend: TantivyBackend) -> None:
def test_sort_field_ascending(self, backend: TantivyBackend):
"""Searching with sort_reverse=False must return results in ascending ASN order."""
for asn in [30, 10, 20]:
doc = Document.objects.create(
@@ -231,7 +231,7 @@ class TestSearch:
asns = [Document.objects.get(pk=doc_id).archive_serial_number for doc_id in ids]
assert asns == [10, 20, 30]
def test_sort_field_descending(self, backend: TantivyBackend) -> None:
def test_sort_field_descending(self, backend: TantivyBackend):
"""Searching with sort_reverse=True must return results in descending ASN order."""
for asn in [30, 10, 20]:
doc = Document.objects.create(
@@ -256,7 +256,7 @@ class TestSearch:
class TestSearchIds:
"""Test lightweight ID-only search."""
def test_returns_matching_ids(self, backend: TantivyBackend) -> None:
def test_returns_matching_ids(self, backend: TantivyBackend):
"""search_ids must return IDs of all matching documents."""
docs = []
for i in range(5):
@@ -282,7 +282,7 @@ class TestSearchIds:
assert set(ids) == {d.pk for d in docs}
assert other.pk not in ids
def test_respects_permission_filter(self, backend: TantivyBackend) -> None:
def test_respects_permission_filter(self, backend: TantivyBackend):
"""search_ids must respect user permission filtering."""
owner = User.objects.create_user("ids_owner")
other = User.objects.create_user("ids_other")
@@ -303,7 +303,7 @@ class TestSearchIds:
backend.search_ids("secret", user=other, search_mode=SearchMode.QUERY) == []
)
def test_respects_fuzzy_threshold(self, backend: TantivyBackend, settings) -> None:
def test_respects_fuzzy_threshold(self, backend: TantivyBackend, settings):
"""search_ids must apply the same fuzzy threshold as search()."""
doc = Document.objects.create(
title="threshold test",
@@ -316,7 +316,7 @@ class TestSearchIds:
ids = backend.search_ids("unique", user=None, search_mode=SearchMode.QUERY)
assert ids == []
def test_returns_ids_for_text_mode(self, backend: TantivyBackend) -> None:
def test_returns_ids_for_text_mode(self, backend: TantivyBackend):
"""search_ids must work with TEXT search mode."""
doc = Document.objects.create(
title="text mode doc",
@@ -332,7 +332,7 @@ class TestSearchIds:
class TestRebuild:
"""Test index rebuilding functionality."""
def test_with_iter_wrapper_called(self, backend: TantivyBackend) -> None:
def test_with_iter_wrapper_called(self, backend: TantivyBackend):
"""Index rebuild must pass documents through iter_wrapper for progress tracking."""
seen = []
@@ -349,7 +349,7 @@ class TestRebuild:
class TestAutocomplete:
"""Test autocomplete functionality."""
def test_basic_functionality(self, backend: TantivyBackend) -> None:
def test_basic_functionality(self, backend: TantivyBackend):
"""Autocomplete must return words matching the given prefix."""
doc = Document.objects.create(
title="Invoice from Microsoft Corporation",
@@ -362,10 +362,7 @@ class TestAutocomplete:
results = backend.autocomplete("micro", limit=10)
assert "microsoft" in results
def test_results_ordered_by_document_frequency(
self,
backend: TantivyBackend,
) -> None:
def test_results_ordered_by_document_frequency(self, backend: TantivyBackend):
"""Autocomplete results must be ordered by document frequency to prioritize common terms."""
# "payment" appears in 3 docs; "payslip" in 1 — "pay" prefix should
# return "payment" before "payslip".
@@ -393,10 +390,7 @@ class TestAutocomplete:
class TestMoreLikeThis:
"""Test more like this functionality."""
def test_more_like_this_ids_excludes_original(
self,
backend: TantivyBackend,
) -> None:
def test_more_like_this_ids_excludes_original(self, backend: TantivyBackend):
"""more_like_this_ids must return IDs of similar documents, excluding the original."""
doc1 = Document.objects.create(
title="Important document",
@@ -427,11 +421,11 @@ class TestSingleton:
yield
reset_backend()
def test_returns_same_instance_on_repeated_calls(self, index_dir) -> None:
def test_returns_same_instance_on_repeated_calls(self, index_dir):
"""Singleton pattern: repeated calls to get_backend() must return the same instance."""
assert get_backend() is get_backend()
def test_reinitializes_when_index_dir_changes(self, tmp_path, settings) -> None:
def test_reinitializes_when_index_dir_changes(self, tmp_path, settings):
"""Backend singleton must reinitialize when INDEX_DIR setting changes for test isolation."""
settings.INDEX_DIR = tmp_path / "a"
(tmp_path / "a").mkdir()
@@ -444,7 +438,7 @@ class TestSingleton:
assert b1 is not b2
assert b2._path == tmp_path / "b"
def test_reset_forces_new_instance(self, index_dir) -> None:
def test_reset_forces_new_instance(self, index_dir):
"""reset_backend() must force creation of a new backend instance on next get_backend() call."""
b1 = get_backend()
reset_backend()
@@ -455,7 +449,7 @@ class TestSingleton:
class TestFieldHandling:
"""Test handling of various document fields."""
def test_none_values_handled_correctly(self, backend: TantivyBackend) -> None:
def test_none_values_handled_correctly(self, backend: TantivyBackend):
"""Document fields with None values must not cause indexing errors."""
doc = Document.objects.create(
title="Test Doc",
@@ -470,10 +464,7 @@ class TestFieldHandling:
assert len(backend.search_ids("test", user=None)) == 1
def test_custom_fields_include_name_and_value(
self,
backend: TantivyBackend,
) -> None:
def test_custom_fields_include_name_and_value(self, backend: TantivyBackend):
"""Custom fields must be indexed with both field name and value for structured queries."""
field = CustomField.objects.create(
name="Invoice Number",
@@ -495,10 +486,7 @@ class TestFieldHandling:
assert len(backend.search_ids("invoice", user=None)) == 1
def test_select_custom_field_indexes_label_not_id(
self,
backend: TantivyBackend,
) -> None:
def test_select_custom_field_indexes_label_not_id(self, backend: TantivyBackend):
"""SELECT custom fields must index the human-readable label, not the opaque option ID."""
field = CustomField.objects.create(
name="Category",
@@ -526,7 +514,7 @@ class TestFieldHandling:
assert len(backend.search_ids("custom_fields.value:invoice", user=None)) == 1
assert len(backend.search_ids("custom_fields.value:opt_abc", user=None)) == 0
def test_none_custom_field_value_not_indexed(self, backend: TantivyBackend) -> None:
def test_none_custom_field_value_not_indexed(self, backend: TantivyBackend):
"""Custom field instances with no value set must not produce an index entry."""
field = CustomField.objects.create(
name="Optional",
@@ -548,7 +536,7 @@ class TestFieldHandling:
assert len(backend.search_ids("custom_fields.value:none", user=None)) == 0
def test_notes_include_user_information(self, backend: TantivyBackend) -> None:
def test_notes_include_user_information(self, backend: TantivyBackend):
"""Notes must be indexed with user information when available for structured queries."""
user = User.objects.create_user("notewriter")
doc = Document.objects.create(
@@ -578,7 +566,7 @@ class TestHighlightHits:
def test_highlights_simple_text_mode_returns_html_string(
self,
backend: TantivyBackend,
) -> None:
):
"""Simple text search should still produce content highlights for exact-token hits."""
doc = Document.objects.create(
title="Highlight Test",
@@ -595,10 +583,7 @@ class TestHighlightHits:
assert "content" in highlights
assert "<b>" in highlights["content"]
def test_highlights_content_returns_html_string(
self,
backend: TantivyBackend,
) -> None:
def test_highlights_content_returns_html_string(self, backend: TantivyBackend):
"""highlight_hits must return HTML strings (from Snippet.to_html()), not Snippet objects."""
doc = Document.objects.create(
title="Highlight Test",
@@ -622,10 +607,7 @@ class TestHighlightHits:
f"Expected HTML with <b> tags, got: {content_highlight!r}"
)
def test_highlights_notes_returns_html_string(
self,
backend: TantivyBackend,
) -> None:
def test_highlights_notes_returns_html_string(self, backend: TantivyBackend):
"""Note highlights must be HTML strings via notes_text companion field.
The notes JSON field does not support tantivy SnippetGenerator; the
@@ -660,12 +642,12 @@ class TestHighlightHits:
f"Expected HTML with <b> tags, got: {note_highlight!r}"
)
def test_empty_doc_list_returns_empty_hits(self, backend: TantivyBackend) -> None:
def test_empty_doc_list_returns_empty_hits(self, backend: TantivyBackend):
"""highlight_hits with no doc IDs must return an empty list."""
hits = backend.highlight_hits("anything", [])
assert hits == []
def test_no_highlights_when_no_match(self, backend: TantivyBackend) -> None:
def test_no_highlights_when_no_match(self, backend: TantivyBackend):
"""Documents not matching the query should not appear in results."""
doc = Document.objects.create(
title="Unrelated",
@@ -79,60 +79,60 @@ class TestMigrateFulltextQueryFieldPrefixes(TestMigrations):
value="note:something",
)
def test_note_prefix_rewritten(self) -> None:
def test_note_prefix_rewritten(self):
self.rule_note.refresh_from_db()
self.assertEqual(self.rule_note.value, "notes.note:invoice")
def test_custom_field_prefix_rewritten(self) -> None:
def test_custom_field_prefix_rewritten(self):
self.rule_cf.refresh_from_db()
self.assertEqual(self.rule_cf.value, "custom_fields.value:amount")
def test_combined_query_rewritten(self) -> None:
def test_combined_query_rewritten(self):
self.rule_combined.refresh_from_db()
self.assertEqual(
self.rule_combined.value,
"notes.note:invoice AND custom_fields.value:total",
)
def test_parenthesized_groups(self) -> None:
def test_parenthesized_groups(self):
self.rule_parens.refresh_from_db()
self.assertEqual(
self.rule_parens.value,
"(notes.note:invoice OR notes.note:receipt)",
)
def test_plus_prefix(self) -> None:
def test_plus_prefix(self):
self.rule_plus.refresh_from_db()
self.assertEqual(self.rule_plus.value, "+notes.note:foo")
def test_minus_prefix(self) -> None:
def test_minus_prefix(self):
self.rule_minus.refresh_from_db()
self.assertEqual(self.rule_minus.value, "-notes.note:bar")
def test_boosted(self) -> None:
def test_boosted(self):
self.rule_boost.refresh_from_db()
self.assertEqual(self.rule_boost.value, "notes.note:test^2")
def test_no_match_unchanged(self) -> None:
def test_no_match_unchanged(self):
self.rule_no_match.refresh_from_db()
self.assertEqual(self.rule_no_match.value, "title:hello content:world")
def test_word_boundary_no_false_positive(self) -> None:
def test_word_boundary_no_false_positive(self):
self.rule_denote.refresh_from_db()
self.assertEqual(self.rule_denote.value, "denote:foo")
def test_already_migrated_idempotent(self) -> None:
def test_already_migrated_idempotent(self):
self.rule_already_migrated.refresh_from_db()
self.assertEqual(self.rule_already_migrated.value, "notes.note:foo")
def test_already_migrated_cf_idempotent(self) -> None:
def test_already_migrated_cf_idempotent(self):
self.rule_already_migrated_cf.refresh_from_db()
self.assertEqual(self.rule_already_migrated_cf.value, "custom_fields.value:bar")
def test_null_value_no_crash(self) -> None:
def test_null_value_no_crash(self):
self.rule_null.refresh_from_db()
self.assertIsNone(self.rule_null.value)
def test_non_fulltext_rule_untouched(self) -> None:
def test_non_fulltext_rule_untouched(self):
self.rule_other_type.refresh_from_db()
self.assertEqual(self.rule_other_type.value, "note:something")
@@ -24,7 +24,13 @@ class TestDocumentAdmin(DirectoriesMixin, TestCase):
backend = get_backend()
searcher = backend._index.searcher()
results = searcher.search(
tantivy.Query.term_query(backend._schema, "id", doc.pk),
tantivy.Query.range_query(
backend._schema,
"id",
tantivy.FieldType.Unsigned,
doc.pk,
doc.pk,
),
limit=1,
)
if results.hits:
@@ -94,7 +100,7 @@ class TestTagAdmin(DirectoriesMixin, TestCase):
self.tag_admin = TagAdmin(model=Tag, admin_site=AdminSite())
@patch("documents.tasks.bulk_update_documents")
def test_parent_tags_get_added(self, mock_bulk_update) -> None:
def test_parent_tags_get_added(self, mock_bulk_update):
document = Document.objects.create(title="test")
parent = Tag.objects.create(name="parent")
child = Tag.objects.create(name="child")
@@ -173,28 +179,3 @@ class TestPaperlessAdmin(DirectoriesMixin, TestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
superuser.refresh_from_db()
self.assertEqual(superuser.first_name, "Updated")
def test_superuser_can_only_be_deleted_by_superuser(self):
superuser = User.objects.create_superuser(username="superuser", password="test")
user = User.objects.create(
username="test",
is_superuser=False,
is_staff=True,
)
delete_user_perm = Permission.objects.get(codename="delete_user")
user.user_permissions.add(delete_user_perm)
self.client.force_login(user)
response = self.client.delete(f"/api/users/{superuser.pk}/")
self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
self.assertEqual(
response.content.decode(),
"Superusers can only be deleted by other superusers",
)
self.assertTrue(User.objects.filter(pk=superuser.pk).exists())
self.client.logout()
self.client.force_login(superuser)
response = self.client.delete(f"/api/users/{superuser.pk}/")
self.assertEqual(response.status_code, status.HTTP_204_NO_CONTENT)
self.assertFalse(User.objects.filter(pk=superuser.pk).exists())
@@ -91,7 +91,6 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
- app_title and app_logo are included
"""
config = ApplicationConfiguration.objects.first()
assert config is not None
config.app_title = "Fancy New Title"
config.app_logo = "/logo/example.jpg"
config.save()
@@ -126,7 +125,6 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
assert config is not None
self.assertEqual(config.color_conversion_strategy, ColorConvertChoices.RGB)
def test_api_update_config_empty_fields(self) -> None:
@@ -152,7 +150,6 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
assert config is not None
self.assertEqual(config.user_args, None)
self.assertEqual(config.language, None)
self.assertEqual(config.barcode_tag_mapping, None)
@@ -190,7 +187,6 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
self.assertIn("image/jpeg", response["Content-Type"])
config = ApplicationConfiguration.objects.first()
assert config is not None
old_logo = config.app_logo
self.assertTrue(Path(old_logo.path).exists())
self.client.patch(
@@ -237,7 +233,6 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
assert config is not None
with Image.open(config.app_logo.path) as stored_logo:
stored_exif = stored_logo.getexif()
@@ -273,7 +268,6 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
assert config is not None
with Image.open(config.app_logo.path) as stored_logo:
stored_text = stored_logo.text
@@ -792,7 +786,6 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
- llm_api_key is set to None
"""
config = ApplicationConfiguration.objects.first()
assert config is not None
config.llm_api_key = "1234567890"
config.save()
@@ -833,7 +826,6 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
- LLM index is triggered to update
"""
config = ApplicationConfiguration.objects.first()
assert config is not None
config.ai_enabled = False
config.llm_embedding_backend = None
config.save()
@@ -848,7 +840,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
json.dumps(
{
"ai_enabled": True,
"llm_embedding_backend": "openai-like",
"llm_embedding_backend": "openai",
},
),
content_type="application/json",
@@ -15,7 +15,6 @@ from documents.models import Document
from documents.models import DocumentType
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import SampleDirMixin
from documents.tests.utils import read_streaming_response
class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
@@ -69,7 +68,7 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response["Content-Type"], "application/zip")
with zipfile.ZipFile(io.BytesIO(read_streaming_response(response))) as zipf:
with zipfile.ZipFile(io.BytesIO(response.content)) as zipf:
self.assertEqual(len(zipf.filelist), 2)
self.assertIn("2021-01-01 document A.pdf", zipf.namelist())
self.assertIn("2020-03-21 document B.jpg", zipf.namelist())
@@ -90,7 +89,7 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response["Content-Type"], "application/zip")
with zipfile.ZipFile(io.BytesIO(read_streaming_response(response))) as zipf:
with zipfile.ZipFile(io.BytesIO(response.content)) as zipf:
self.assertEqual(len(zipf.filelist), 2)
self.assertIn("2021-01-01 document A.pdf", zipf.namelist())
self.assertIn("2020-03-21 document B.pdf", zipf.namelist())
@@ -111,7 +110,7 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response["Content-Type"], "application/zip")
with zipfile.ZipFile(io.BytesIO(read_streaming_response(response))) as zipf:
with zipfile.ZipFile(io.BytesIO(response.content)) as zipf:
self.assertEqual(len(zipf.filelist), 3)
self.assertIn("originals/2021-01-01 document A.pdf", zipf.namelist())
self.assertIn("archive/2020-03-21 document B.pdf", zipf.namelist())
@@ -145,7 +144,7 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response["Content-Type"], "application/zip")
with zipfile.ZipFile(io.BytesIO(read_streaming_response(response))) as zipf:
with zipfile.ZipFile(io.BytesIO(response.content)) as zipf:
self.assertEqual(len(zipf.filelist), 2)
self.assertIn("2021-01-01 document A.pdf", zipf.namelist())
@@ -158,14 +157,13 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
self.assertEqual(f.read(), zipf.read("2021-01-01 document A_01.pdf"))
def test_compression(self) -> None:
response = self.client.post(
self.client.post(
self.ENDPOINT,
json.dumps(
{"documents": [self.doc2.id, self.doc2b.id], "compression": "lzma"},
),
content_type="application/json",
)
response.close()
@override_settings(FILENAME_FORMAT="{correspondent}/{title}")
def test_formatted_download_originals(self) -> None:
@@ -205,7 +203,7 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response["Content-Type"], "application/zip")
with zipfile.ZipFile(io.BytesIO(read_streaming_response(response))) as zipf:
with zipfile.ZipFile(io.BytesIO(response.content)) as zipf:
self.assertEqual(len(zipf.filelist), 2)
self.assertIn("a space name/Title 2 - Doc 3.jpg", zipf.namelist())
self.assertIn("test/This is Doc 2.pdf", zipf.namelist())
@@ -251,7 +249,7 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response["Content-Type"], "application/zip")
with zipfile.ZipFile(io.BytesIO(read_streaming_response(response))) as zipf:
with zipfile.ZipFile(io.BytesIO(response.content)) as zipf:
self.assertEqual(len(zipf.filelist), 2)
self.assertIn("somewhere/This is Doc 2.pdf", zipf.namelist())
self.assertIn("somewhere/Title 2 - Doc 3.pdf", zipf.namelist())
@@ -300,7 +298,7 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response["Content-Type"], "application/zip")
with zipfile.ZipFile(io.BytesIO(read_streaming_response(response))) as zipf:
with zipfile.ZipFile(io.BytesIO(response.content)) as zipf:
self.assertEqual(len(zipf.filelist), 3)
self.assertIn("originals/bill/This is Doc 2.pdf", zipf.namelist())
self.assertIn("archive/statement/Title 2 - Doc 3.pdf", zipf.namelist())
@@ -918,7 +918,7 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
],
)
def test_api_selection_data_requires_view_permission(self) -> None:
def test_api_selection_data_requires_view_permission(self):
self.doc2.owner = self.user
self.doc2.save()
@@ -276,9 +276,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
doc.refresh_from_db()
_cf_1 = doc.custom_fields.first()
assert _cf_1 is not None
self.assertEqual(_cf_1.value, None)
self.assertEqual(doc.custom_fields.first().value, None)
@mock.patch("documents.signals.handlers.process_cf_select_update.apply_async")
def test_custom_field_update_offloaded_once(self, mock_delay) -> None:
@@ -569,9 +567,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(CustomFieldInstance.objects.count(), 1)
_cf_2 = doc.custom_fields.first()
assert _cf_2 is not None
self.assertEqual(_cf_2.value, "test value")
self.assertEqual(doc.custom_fields.first().value, "test value")
# Update
resp = self.client.patch(
@@ -588,9 +584,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(CustomFieldInstance.objects.count(), 1)
_cf_3 = doc.custom_fields.first()
assert _cf_3 is not None
self.assertEqual(_cf_3.value, "a new test value")
self.assertEqual(doc.custom_fields.first().value, "a new test value")
def test_delete_custom_field_instance(self) -> None:
"""
@@ -656,9 +650,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
self.assertEqual(CustomFieldInstance.objects.count(), 1)
self.assertEqual(Document.objects.count(), 1)
self.assertEqual(len(doc.custom_fields.all()), 1)
_cf_4 = doc.custom_fields.first()
assert _cf_4 is not None
self.assertEqual(_cf_4.value, date_value)
self.assertEqual(doc.custom_fields.first().value, date_value)
def test_custom_field_validation(self) -> None:
"""
@@ -1070,15 +1062,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(CustomFieldInstance.objects.count(), 4)
_cf_5 = doc2.custom_fields.first()
assert _cf_5 is not None
self.assertEqual(_cf_5.value, [1])
_cf_6 = doc3.custom_fields.first()
assert _cf_6 is not None
self.assertEqual(_cf_6.value, [1])
_cf_7 = doc4.custom_fields.first()
assert _cf_7 is not None
self.assertEqual(_cf_7.value, [1])
self.assertEqual(doc2.custom_fields.first().value, [1])
self.assertEqual(doc3.custom_fields.first().value, [1])
self.assertEqual(doc4.custom_fields.first().value, [1])
# Add links appends if necessary
resp = self.client.patch(
@@ -1095,9 +1081,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
_cf_8 = doc4.custom_fields.first()
assert _cf_8 is not None
self.assertEqual(_cf_8.value, [1, 3])
self.assertEqual(doc4.custom_fields.first().value, [1, 3])
# Remove one of the links, removed on other doc
resp = self.client.patch(
@@ -1114,15 +1098,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
_cf_9 = doc2.custom_fields.first()
assert _cf_9 is not None
self.assertEqual(_cf_9.value, [1])
_cf_10 = doc3.custom_fields.first()
assert _cf_10 is not None
self.assertEqual(_cf_10.value, [1, 4])
_cf_11 = doc4.custom_fields.first()
assert _cf_11 is not None
self.assertEqual(_cf_11.value, [3])
self.assertEqual(doc2.custom_fields.first().value, [1])
self.assertEqual(doc3.custom_fields.first().value, [1, 4])
self.assertEqual(doc4.custom_fields.first().value, [3])
# Removes the field entirely
resp = self.client.patch(
@@ -1134,15 +1112,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
_cf_12 = doc2.custom_fields.first()
assert _cf_12 is not None
self.assertEqual(_cf_12.value, [])
_cf_13 = doc3.custom_fields.first()
assert _cf_13 is not None
self.assertEqual(_cf_13.value, [4])
_cf_14 = doc4.custom_fields.first()
assert _cf_14 is not None
self.assertEqual(_cf_14.value, [3])
self.assertEqual(doc2.custom_fields.first().value, [])
self.assertEqual(doc3.custom_fields.first().value, [4])
self.assertEqual(doc4.custom_fields.first().value, [3])
# If field exists on target doc but value is None
doc5 = Document.objects.create(
@@ -1167,9 +1139,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
_cf_15 = doc5.custom_fields.first()
assert _cf_15 is not None
self.assertEqual(_cf_15.value, [1])
self.assertEqual(doc5.custom_fields.first().value, [1])
def test_documentlink_patch_requires_change_permission_on_target_documents(
self,
@@ -1351,9 +1321,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
results = response.data["results"]
self.assertEqual(results[0]["document_count"], 0)
def test_patch_document_invalid_date_custom_field_returns_validation_error(
self,
) -> None:
def test_patch_document_invalid_date_custom_field_returns_validation_error(self):
"""
GIVEN:
- A date custom field
@@ -18,7 +18,6 @@ from documents.filters import EffectiveContentFilter
from documents.filters import TitleContentFilter
from documents.models import Document
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import read_streaming_response
if TYPE_CHECKING:
from pathlib import Path
@@ -450,19 +449,19 @@ class TestDocumentVersioningApi(DirectoriesMixin, APITestCase):
f"/api/documents/{root.id}/download/?version={version.id}",
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(resp), b"version")
self.assertEqual(resp.content, b"version")
resp = self.client.get(
f"/api/documents/{root.id}/preview/?version={version.id}",
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(resp), b"version")
self.assertEqual(resp.content, b"version")
resp = self.client.get(
f"/api/documents/{root.id}/thumb/?version={version.id}",
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(resp), b"thumb")
self.assertEqual(resp.content, b"thumb")
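One side of this hunk drains download/preview/thumb responses through a `read_streaming_response` helper instead of reading `resp.content` directly. The helper's name comes from the `documents.tests.utils` import above; its body is not shown in the diff, so the following is only a minimal sketch of what such a helper could look like:

```python
def read_streaming_response(response) -> bytes:
    """Join the chunks of a streaming HTTP response into one bytes value.

    Sketch only: a Django StreamingHttpResponse exposes an iterable
    streaming_content and does not support .content, so tests have to
    drain the chunks explicitly before comparing the body.
    """
    return b"".join(
        chunk if isinstance(chunk, bytes) else chunk.encode()
        for chunk in response.streaming_content
    )
```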
def test_metadata_version_param_uses_version(self) -> None:
root = Document.objects.create(
+20 -20
@@ -49,7 +49,6 @@ from documents.models import WorkflowTrigger
from documents.signals.handlers import run_workflows
from documents.tests.utils import ConsumeTaskMixin
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import read_streaming_response
class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
@@ -324,16 +323,19 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
f.write(content_thumbnail)
response = self.client.get(f"/api/documents/{doc.pk}/download/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(response), content)
self.assertEqual(response.content, content)
response = self.client.get(f"/api/documents/{doc.pk}/preview/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(response), content)
self.assertEqual(response.content, content)
response = self.client.get(f"/api/documents/{doc.pk}/thumb/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(response), content_thumbnail)
self.assertEqual(response.content, content_thumbnail)
def test_document_actions_with_perms(self) -> None:
"""
@@ -384,15 +386,12 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
response = self.client.get(f"/api/documents/{doc.pk}/download/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
response.close()
response = self.client.get(f"/api/documents/{doc.pk}/preview/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
response.close()
response = self.client.get(f"/api/documents/{doc.pk}/thumb/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
response.close()
@override_settings(FILENAME_FORMAT="")
def test_download_with_archive(self) -> None:
@@ -413,24 +412,28 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
f.write(content_archive)
response = self.client.get(f"/api/documents/{doc.pk}/download/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(response), content_archive)
self.assertEqual(response.content, content_archive)
response = self.client.get(
f"/api/documents/{doc.pk}/download/?original=true",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(response), content)
self.assertEqual(response.content, content)
response = self.client.get(f"/api/documents/{doc.pk}/preview/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(response), content_archive)
self.assertEqual(response.content, content_archive)
response = self.client.get(
f"/api/documents/{doc.pk}/preview/?original=true",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(response), content)
self.assertEqual(response.content, content)
@override_settings(FILENAME_FORMAT="")
def test_download_follow_formatting(self) -> None:
@@ -453,21 +456,18 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
# Without follow_formatting, should use public filename
response = self.client.get(f"/api/documents/{doc.pk}/download/")
self.assertIn("none.pdf", response["Content-Disposition"])
response.close()
# With follow_formatting, should use actual filename on disk
response = self.client.get(
f"/api/documents/{doc.pk}/download/?follow_formatting=true",
)
self.assertIn("archived.pdf", response["Content-Disposition"])
response.close()
# With follow_formatting and original, should use source filename
response = self.client.get(
f"/api/documents/{doc.pk}/download/?original=true&follow_formatting=true",
)
self.assertIn("my_document.pdf", response["Content-Disposition"])
response.close()
def test_document_actions_not_existing_file(self) -> None:
doc = Document.objects.create(
@@ -1168,7 +1168,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
self.assertIn("all", response.data)
self.assertCountEqual(response.data["all"], [d.id for d in docs])
def test_default_ordering_uses_id_as_tiebreaker(self) -> None:
def test_default_ordering_uses_id_as_tiebreaker(self):
"""
GIVEN:
- Documents sharing the same created date
@@ -2156,7 +2156,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
match_tags,
match_document_types,
match_storage_paths,
) -> None:
):
doc = Document.objects.create(
title="test",
mime_type="application/pdf",
@@ -2193,7 +2193,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
match_document_types,
match_storage_paths,
mocked_load,
) -> None:
):
"""
GIVEN:
- Request for suggestions for a document
@@ -2276,7 +2276,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
def test_get_suggestions_dates_disabled(
self,
mock_get_date_parser: mock.MagicMock,
) -> None:
):
"""
GIVEN:
- NUMBER_OF_SUGGESTED_DATES = 0 (disables feature)
@@ -3409,7 +3409,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
def test_create_share_link_requires_view_permission_for_document(self) -> None:
def test_create_share_link_requires_view_permission_for_document(self):
"""
GIVEN:
- A user with add_sharelink but without view permission on a document
@@ -3457,7 +3457,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
self.assertEqual(create_resp.status_code, status.HTTP_201_CREATED)
self.assertEqual(create_resp.data["document"], doc.pk)
def test_next_asn(self) -> None:
def test_next_asn(self):
"""
GIVEN:
- Existing documents with ASNs, highest owned by user2
+1 -1
@@ -933,7 +933,7 @@ class TestApiUser(DirectoriesMixin, APITestCase):
returned_user1 = User.objects.get(pk=user1.pk)
self.assertEqual(returned_user1.is_superuser, False)
def test_only_superusers_can_create_or_alter_staff_status(self) -> None:
def test_only_superusers_can_create_or_alter_staff_status(self):
"""
GIVEN:
- Existing user account
+14 -68
@@ -79,14 +79,14 @@ class TestApiSchema(APITestCase):
class TestTasksSummarySchema:
"""tasks_summary_retrieve: response must be an array of TaskSummarySerializer."""
def test_summary_response_is_array(self, api_schema: SchemaGenerator) -> None:
def test_summary_response_is_array(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200["type"] == "array", (
"tasks_summary_retrieve response must be type:array"
)
def test_summary_items_have_total_count(self, api_schema: SchemaGenerator) -> None:
def test_summary_items_have_total_count(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
items = resp_200.get("items", {})
@@ -100,10 +100,7 @@ class TestTasksSummarySchema:
"summary items must have 'total_count' (TaskSummarySerializer)"
)
def test_summary_days_parameter_constraints(
self,
api_schema: SchemaGenerator,
) -> None:
def test_summary_days_parameter_constraints(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
params = {p["name"]: p for p in op.get("parameters", [])}
assert "days" in params, "days query parameter must be declared"
@@ -115,14 +112,14 @@ class TestTasksSummarySchema:
class TestTasksActiveSchema:
"""tasks_active_retrieve: response must be an array of TaskSerializerV10."""
def test_active_response_is_array(self, api_schema: SchemaGenerator) -> None:
def test_active_response_is_array(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/active/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200["type"] == "array", (
"tasks_active_retrieve response must be type:array"
)
def test_active_items_ref_named_schema(self, api_schema: SchemaGenerator) -> None:
def test_active_items_ref_named_schema(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/active/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
items = resp_200.get("items", {})
@@ -136,11 +133,7 @@ class TestMetadataSchema:
"""Metadata component: array fields and optional archive fields."""
@pytest.mark.parametrize("field", ["original_metadata", "archive_metadata"])
def test_metadata_field_is_array(
self,
api_schema: SchemaGenerator,
field: str,
) -> None:
def test_metadata_field_is_array(self, api_schema: SchemaGenerator, field: str):
props = api_schema["components"]["schemas"]["Metadata"]["properties"]
assert props[field]["type"] == "array", (
f"{field} should be type:array, not type:object"
@@ -151,7 +144,7 @@ class TestMetadataSchema:
self,
api_schema: SchemaGenerator,
field: str,
) -> None:
):
props = api_schema["components"]["schemas"]["Metadata"]["properties"]
items = props[field]["items"]
ref = items.get("$ref", "")
@@ -173,7 +166,7 @@ class TestMetadataSchema:
"archive_metadata",
],
)
def test_archive_field_not_required(self, api_schema, field) -> None:
def test_archive_field_not_required(self, api_schema, field):
schema = api_schema["components"]["schemas"]["Metadata"]
required = schema.get("required", [])
assert field not in required
@@ -186,7 +179,7 @@ class TestMetadataSchema:
class TestStoragePathTestSchema:
"""storage_paths_test_create: response must be a string, not a StoragePath object."""
def test_test_action_response_is_string(self, api_schema: SchemaGenerator) -> None:
def test_test_action_response_is_string(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/storage_paths/test/"]["post"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200.get("type") == "string", (
@@ -196,7 +189,7 @@ class TestStoragePathTestSchema:
def test_test_action_request_uses_storage_path_test_serializer(
self,
api_schema: SchemaGenerator,
) -> None:
):
op = api_schema["paths"]["/api/storage_paths/test/"]["post"]
content = (
op.get("requestBody", {}).get("content", {}).get("application/json", {})
@@ -227,14 +220,11 @@ class TestProcessedMailBulkDeleteSchema:
self,
api_schema: SchemaGenerator,
field: str,
) -> None:
):
props = self._get_props(api_schema)
assert field in props, f"bulk_delete 200 response must have a '{field}' field"
def test_bulk_delete_response_is_not_processed_mail_serializer(
self,
api_schema,
) -> None:
def test_bulk_delete_response_is_not_processed_mail_serializer(self, api_schema):
op = api_schema["paths"]["/api/processed_mail/bulk_delete/"]["post"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
ref = resp_200.get("$ref", "")
@@ -247,13 +237,13 @@ class TestProcessedMailBulkDeleteSchema:
class TestShareLinkBundleRebuildSchema:
"""share_link_bundles_rebuild_create: 200 returns bundle data; 400 is documented."""
def test_rebuild_has_400_response(self, api_schema: SchemaGenerator) -> None:
def test_rebuild_has_400_response(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/share_link_bundles/{id}/rebuild/"]["post"]
assert "400" in op["responses"], (
"rebuild must document the 400 response for 'Bundle is already being processed.'"
)
def test_rebuild_400_has_detail_field(self, api_schema: SchemaGenerator) -> None:
def test_rebuild_400_has_detail_field(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/share_link_bundles/{id}/rebuild/"]["post"]
resp_400 = op["responses"]["400"]["content"]["application/json"]["schema"]
ref = resp_400.get("$ref", "")
@@ -263,47 +253,3 @@ class TestShareLinkBundleRebuildSchema:
else:
props = resp_400.get("properties", {})
assert "detail" in props, "rebuild 400 response must have a 'detail' field"
class TestBulkDownloadSchema:
"""bulk_download_create: POST accepts BulkDownloadSerializer, returns application/zip, documents 403."""
def test_bulk_download_path_exists(self, api_schema: SchemaGenerator) -> None:
assert "/api/documents/bulk_download/" in api_schema["paths"]
def test_bulk_download_operation_id(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/documents/bulk_download/"]["post"]
assert op["operationId"] == "bulk_download"
def test_bulk_download_request_body_is_json(
self,
api_schema: SchemaGenerator,
) -> None:
op = api_schema["paths"]["/api/documents/bulk_download/"]["post"]
assert "requestBody" in op
assert "application/json" in op["requestBody"]["content"]
def test_bulk_download_request_references_serializer(
self,
api_schema: SchemaGenerator,
) -> None:
op = api_schema["paths"]["/api/documents/bulk_download/"]["post"]
schema_ref = (
op["requestBody"]["content"]["application/json"]
.get("schema", {})
.get("$ref", "")
)
component_name = schema_ref.split("/")[-1]
assert component_name == "BulkDownloadRequest"
def test_bulk_download_response_200_is_zip(
self,
api_schema: SchemaGenerator,
) -> None:
op = api_schema["paths"]["/api/documents/bulk_download/"]["post"]
assert "200" in op["responses"]
assert "application/zip" in op["responses"]["200"]["content"]
def test_bulk_download_response_403(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/documents/bulk_download/"]["post"]
assert "403" in op["responses"]
+1 -1
@@ -943,7 +943,7 @@ class TestDocumentSearchApi(DirectoriesMixin, APITestCase):
response = self.client.get("/api/documents/?query=things")
self.assertIsNone(response.data["corrected_query"])
def test_search_spelling_suggestion_suppressed_for_private_terms(self) -> None:
def test_search_spelling_suggestion_suppressed_for_private_terms(self):
owner = User.objects.create_user("owner")
attacker = User.objects.create_user("attacker")
attacker.user_permissions.add(
+2 -38
@@ -1,14 +1,12 @@
import os
import shutil
import tempfile
from datetime import timedelta
from pathlib import Path
from unittest import mock
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.test import override_settings
from django.utils import timezone
from rest_framework import status
from rest_framework.test import APITestCase
@@ -78,11 +76,6 @@ class TestSystemStatus(APITestCase):
self.assertEqual(response.data["tasks"]["redis_url"], "redis://localhost:6379")
self.assertEqual(response.data["tasks"]["redis_status"], "ERROR")
self.assertIsNotNone(response.data["tasks"]["redis_error"])
self.assertEqual(response.data["tasks"]["summary"]["days"], 30)
self.assertEqual(response.data["tasks"]["summary"]["total_count"], 0)
self.assertEqual(response.data["tasks"]["summary"]["success_count"], 0)
self.assertEqual(response.data["tasks"]["summary"]["failure_count"], 0)
self.assertEqual(response.data["tasks"]["summary"]["pending_count"], 0)
def test_system_status_insufficient_permissions(self) -> None:
"""
@@ -404,7 +397,7 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains the correct AI status
"""
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai-like"):
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai"):
self.client.force_login(self.user)
# No tasks found
@@ -431,7 +424,7 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains the correct AI status
"""
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai-like"):
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai"):
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.LLM_INDEX,
trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
@@ -443,32 +436,3 @@ class TestSystemStatus(APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["tasks"]["llmindex_status"], "ERROR")
self.assertIsNotNone(response.data["tasks"]["llmindex_error"])
def test_system_status_includes_recent_task_summary(self) -> None:
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.CONSUME_FILE,
status=PaperlessTask.Status.SUCCESS,
)
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.CONSUME_FILE,
status=PaperlessTask.Status.FAILURE,
)
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.SANITY_CHECK,
status=PaperlessTask.Status.PENDING,
)
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.MAIL_FETCH,
status=PaperlessTask.Status.SUCCESS,
date_created=timezone.now() - timedelta(days=45),
)
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["tasks"]["summary"]["days"], 30)
self.assertEqual(response.data["tasks"]["summary"]["total_count"], 3)
self.assertEqual(response.data["tasks"]["summary"]["success_count"], 1)
self.assertEqual(response.data["tasks"]["summary"]["failure_count"], 1)
self.assertEqual(response.data["tasks"]["summary"]["pending_count"], 1)
+20 -47
@@ -31,20 +31,6 @@ ACCEPT_V9 = "application/json; version=9"
@pytest.mark.django_db()
class TestGetTasksV10:
def test_list_response_has_paginated_structure(
self,
admin_client: APIClient,
) -> None:
"""GET /api/tasks/ returns a paginated envelope with count and results."""
PaperlessTaskFactory.create_batch(3)
response = admin_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert "count" in response.data
assert "results" in response.data
assert response.data["count"] == 3
def test_list_returns_tasks(self, admin_client: APIClient) -> None:
"""GET /api/tasks/ returns all tasks visible to the admin."""
PaperlessTaskFactory.create_batch(2)
@@ -52,7 +38,7 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 2
assert len(response.data) == 2
def test_related_document_ids_populated_from_result_data(
self,
@@ -67,7 +53,7 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["results"][0]["related_document_ids"] == [7]
assert response.data[0]["related_document_ids"] == [7]
def test_related_document_ids_includes_duplicate_of(
self,
@@ -82,7 +68,7 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["results"][0]["related_document_ids"] == [12]
assert response.data[0]["related_document_ids"] == [12]
def test_filter_by_task_type(self, admin_client: APIClient) -> None:
"""?task_type= filters results to tasks of that type only."""
@@ -95,11 +81,8 @@ class TestGetTasksV10:
)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
assert (
response.data["results"][0]["task_type"]
== PaperlessTask.TaskType.TRAIN_CLASSIFIER
)
assert len(response.data) == 1
assert response.data[0]["task_type"] == PaperlessTask.TaskType.TRAIN_CLASSIFIER
def test_filter_by_status(self, admin_client: APIClient) -> None:
"""?status= filters results to tasks with that status only."""
@@ -112,8 +95,8 @@ class TestGetTasksV10:
)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
assert response.data["results"][0]["status"] == PaperlessTask.Status.SUCCESS
assert len(response.data) == 1
assert response.data[0]["status"] == PaperlessTask.Status.SUCCESS
def test_filter_by_task_id(self, admin_client: APIClient) -> None:
"""?task_id= returns only the task with that UUID."""
@@ -123,8 +106,8 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT, {"task_id": task.task_id})
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
assert response.data["results"][0]["task_id"] == task.task_id
assert len(response.data) == 1
assert response.data[0]["task_id"] == task.task_id
def test_filter_by_acknowledged(self, admin_client: APIClient) -> None:
"""?acknowledged=false returns only tasks that have not been acknowledged."""
@@ -134,8 +117,8 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT, {"acknowledged": "false"})
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
assert response.data["results"][0]["acknowledged"] is False
assert len(response.data) == 1
assert response.data[0]["acknowledged"] is False
def test_filter_is_complete_true(self, admin_client: APIClient) -> None:
"""?is_complete=true returns only SUCCESS and FAILURE tasks."""
@@ -146,8 +129,8 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT, {"is_complete": "true"})
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 2
returned_statuses = {t["status"] for t in response.data["results"]}
assert len(response.data) == 2
returned_statuses = {t["status"] for t in response.data}
assert returned_statuses == {
PaperlessTask.Status.SUCCESS,
PaperlessTask.Status.FAILURE,
@@ -162,8 +145,8 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT, {"is_complete": "false"})
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 2
returned_statuses = {t["status"] for t in response.data["results"]}
assert len(response.data) == 2
returned_statuses = {t["status"] for t in response.data}
assert returned_statuses == {
PaperlessTask.Status.PENDING,
PaperlessTask.Status.STARTED,
@@ -179,7 +162,7 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
ids = [t["task_id"] for t in response.data["results"]]
ids = [t["task_id"] for t in response.data]
assert ids == [t3.task_id, t2.task_id, t1.task_id]
def test_list_scoped_to_own_and_unowned_tasks_for_regular_user(
@@ -203,8 +186,8 @@ class TestGetTasksV10:
response = client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 2
visible_ids = {t["task_id"] for t in response.data["results"]}
assert len(response.data) == 2
visible_ids = {t["task_id"] for t in response.data}
assert visible_ids == {own_task.task_id, unowned_task.task_id}
def test_list_admin_sees_all_tasks(
@@ -221,7 +204,7 @@ class TestGetTasksV10:
response = admin_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 3
assert len(response.data) == 3
@pytest.mark.django_db()
@@ -414,16 +397,6 @@ class TestGetTasksV9:
assert len(response.data) == 1
assert response.data[0]["task_name"] == "check_sanity"
def test_v9_list_returns_plain_list(self, v9_client: APIClient) -> None:
"""v9 task list returns a plain JSON array, not a paginated envelope."""
PaperlessTaskFactory.create_batch(3)
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert isinstance(response.data, list)
assert len(response.data) == 3
def test_v9_non_staff_sees_own_and_unowned_tasks(
self,
admin_user: User,
@@ -537,7 +510,7 @@ class TestAcknowledge:
response = admin_client.get(ENDPOINT, {"acknowledged": "false"})
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 0
assert len(response.data) == 0
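The v10/v9 hunks above differ mainly in response shape: one side expects a paginated envelope with `count` and `results`, the other a plain JSON array of tasks. A small sketch (the helper name is hypothetical, not from the codebase) of reading both shapes:

```python
def extract_tasks(data):
    """Return (count, tasks) for either task-list response shape.

    Handles both the paginated envelope {"count": ..., "results": [...]}
    and a plain list of task objects.
    """
    if isinstance(data, dict) and "results" in data:
        return data["count"], data["results"]
    return len(data), data
```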
def test_requires_change_permission(self, user_client: APIClient) -> None:
"""Regular users without change_paperlesstask permission receive 403."""
+39 -24
@@ -273,7 +273,6 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.assertEqual(Workflow.objects.count(), 2)
workflow = Workflow.objects.get(name="Workflow 2")
trigger = workflow.triggers.first()
assert trigger is not None
self.assertSetEqual(
set(trigger.filter_has_tags.values_list("id", flat=True)),
{self.t1.id},
@@ -494,24 +493,44 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
workflow = Workflow.objects.get(id=response.data["id"])
self.assertEqual(workflow.name, "Workflow Updated")
trigger = workflow.triggers.first()
assert trigger is not None
action = workflow.actions.first()
assert action is not None
self.assertEqual(trigger.filter_has_tags.first(), self.t1)
self.assertEqual(trigger.filter_has_all_tags.first(), self.t2)
self.assertEqual(trigger.filter_has_not_tags.first(), self.t3)
self.assertEqual(trigger.filter_has_any_correspondents.first(), self.c)
self.assertEqual(trigger.filter_has_not_correspondents.first(), self.c2)
self.assertEqual(trigger.filter_has_any_document_types.first(), self.dt)
self.assertEqual(trigger.filter_has_not_document_types.first(), self.dt2)
self.assertEqual(trigger.filter_has_any_storage_paths.first(), self.sp)
self.assertEqual(trigger.filter_has_not_storage_paths.first(), self.sp2)
self.assertEqual(workflow.triggers.first().filter_has_tags.first(), self.t1)
self.assertEqual(
trigger.filter_custom_field_query,
workflow.triggers.first().filter_has_all_tags.first(),
self.t2,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_tags.first(),
self.t3,
)
self.assertEqual(
workflow.triggers.first().filter_has_any_correspondents.first(),
self.c,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_correspondents.first(),
self.c2,
)
self.assertEqual(
workflow.triggers.first().filter_has_any_document_types.first(),
self.dt,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_document_types.first(),
self.dt2,
)
self.assertEqual(
workflow.triggers.first().filter_has_any_storage_paths.first(),
self.sp,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_storage_paths.first(),
self.sp2,
)
self.assertEqual(
workflow.triggers.first().filter_custom_field_query,
json.dumps(["AND", [[self.cf1.id, "exact", "value"]]]),
)
self.assertEqual(action.assign_title, "Action New Title")
self.assertEqual(workflow.actions.first().assign_title, "Action New Title")
def test_api_update_workflow_no_trigger_actions(self) -> None:
"""
@@ -593,13 +612,9 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
workflow = Workflow.objects.get(id=response.data["id"])
self.assertEqual(WorkflowTrigger.objects.all().count(), 1)
new_trigger = workflow.triggers.first()
assert new_trigger is not None
self.assertNotEqual(new_trigger.id, self.trigger.id)
self.assertNotEqual(workflow.triggers.first().id, self.trigger.id)
self.assertEqual(WorkflowAction.objects.all().count(), 1)
new_action = workflow.actions.first()
assert new_action is not None
self.assertNotEqual(new_action.id, self.action.id)
self.assertNotEqual(workflow.actions.first().id, self.action.id)
def test_email_action_validation(self) -> None:
"""
@@ -858,7 +873,7 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.action.refresh_from_db()
self.assertEqual(self.action.assign_title, "Patched Title")
def test_password_action_passwords_field(self) -> None:
def test_password_action_passwords_field(self):
"""
GIVEN:
- Nothing
@@ -881,7 +896,7 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
self.assertEqual(response.data["passwords"], passwords)
def test_password_action_invalid_passwords_field(self) -> None:
def test_password_action_invalid_passwords_field(self):
"""
GIVEN:
- Nothing
+3 -9
@@ -86,7 +86,7 @@ class TestBarcode(
self.assertDictEqual(separator_page_numbers, {1: False})
@override_settings(CONSUMER_ENABLE_ASN_BARCODE=True)
def test_asn_barcode_duplicate_in_trash_fails(self) -> None:
def test_asn_barcode_duplicate_in_trash_fails(self):
"""
GIVEN:
- A document with ASN barcode 123 is in the trash
@@ -585,7 +585,6 @@ class TestBarcode(
- The barcode config is used
"""
app_config = ApplicationConfiguration.objects.first()
assert app_config is not None
app_config.barcodes_enabled = True
app_config.barcode_string = "CUSTOM BARCODE"
app_config.save()
@@ -772,7 +771,6 @@ class TestAsnBarcode(DirectoriesMixin, SampleDirMixin, GetReaderPluginMixin, Tes
)
document = Document.objects.first()
assert document is not None
self.assertEqual(document.archive_serial_number, 123)
@@ -1061,15 +1059,11 @@ class TestTagBarcode(DirectoriesMixin, SampleDirMixin, GetReaderPluginMixin, Tes
doc2 = documents[1]
self.assertEqual(doc2.tags.count(), 1)
_tag_1 = doc2.tags.first()
assert _tag_1 is not None
self.assertEqual(_tag_1.name, "invoice")
self.assertEqual(doc2.tags.first().name, "invoice")
doc3 = documents[2]
self.assertEqual(doc3.tags.count(), 1)
_tag_2 = doc3.tags.first()
assert _tag_2 is not None
self.assertEqual(_tag_2.name, "receipt")
self.assertEqual(doc3.tags.first().name, "receipt")
@override_settings(
CONSUMER_ENABLE_TAG_BARCODE=True,
+50 -33
@@ -319,10 +319,8 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
[self.doc3.id],
)
# assert reflect document link
_cf_1 = self.doc3.custom_fields.first()
assert _cf_1 is not None
self.assertEqual(
_cf_1.value,
self.doc3.custom_fields.first().value,
[self.doc2.id, self.doc1.id],
)
@@ -336,12 +334,14 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
add_custom_fields={},
remove_custom_fields=[cf3.id],
)
_cf_2 = self.doc1.custom_fields.filter(field=cf3).first()
assert _cf_2 is not None
self.assertNotIn(self.doc3.id, _cf_2.value)
_cf_3 = self.doc2.custom_fields.filter(field=cf3).first()
assert _cf_3 is not None
self.assertNotIn(self.doc3.id, _cf_3.value)
self.assertNotIn(
self.doc3.id,
self.doc1.custom_fields.filter(field=cf3).first().value,
)
self.assertNotIn(
self.doc3.id,
self.doc2.custom_fields.filter(field=cf3).first().value,
)
def test_modify_custom_fields_doclink_self_link(self) -> None:
"""
@@ -363,12 +363,14 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
remove_custom_fields=[],
)
_cf_4 = self.doc1.custom_fields.first()
assert _cf_4 is not None
self.assertEqual(_cf_4.value, [self.doc2.id])
_cf_5 = self.doc2.custom_fields.first()
assert _cf_5 is not None
self.assertEqual(_cf_5.value, [self.doc1.id])
self.assertEqual(
self.doc1.custom_fields.first().value,
[self.doc2.id],
)
self.assertEqual(
self.doc2.custom_fields.first().value,
[self.doc1.id],
)
def test_delete(self) -> None:
self.assertEqual(Document.objects.count(), 5)
@@ -691,7 +693,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self,
mock_consume_file,
mock_delete_documents,
) -> None:
):
"""
GIVEN:
- Existing documents
@@ -930,7 +932,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_chord,
mock_consume_file,
mock_delete_documents,
) -> None:
):
"""
GIVEN:
- Existing documents
@@ -1025,7 +1027,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_consume_file.assert_not_called()
@mock.patch("documents.tasks.consume_file.apply_async")
def test_rotate(self, mock_consume_delay) -> None:
def test_rotate(self, mock_consume_delay):
"""
GIVEN:
- Existing documents
@@ -1052,7 +1054,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self,
mock_pdf_save,
mock_consume_delay,
) -> None:
):
"""
GIVEN:
- Existing documents
@@ -1076,7 +1078,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
def test_rotate_non_pdf(
self,
mock_consume_delay,
) -> None:
):
"""
GIVEN:
- Existing documents
@@ -1095,7 +1097,10 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertIsNotNone(task_kwargs["overrides"])
self.assertEqual(result, "OK")
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.open")
def test_rotate_explicit_selection_uses_root_source_when_root_selected(
@@ -1103,7 +1108,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_open,
mock_consume_delay,
mock_magic,
) -> None:
):
Document.objects.create(
checksum="B-v1",
title="B version 1",
@@ -1125,8 +1130,11 @@ class TestPDFActions(DirectoriesMixin, TestCase):
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.Pdf.save")
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
def test_delete_pages(self, mock_magic, mock_pdf_save, mock_consume_delay) -> None:
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
def test_delete_pages(self, mock_magic, mock_pdf_save, mock_consume_delay):
"""
GIVEN:
- Existing documents
@@ -1149,7 +1157,10 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertIsNotNone(task_kwargs["overrides"])
self.assertEqual(result, "OK")
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.open")
def test_delete_pages_explicit_selection_uses_root_source_when_root_selected(
@@ -1157,7 +1168,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_open,
mock_consume_delay,
mock_magic,
) -> None:
):
Document.objects.create(
checksum="B-v1",
title="B version 1",
@@ -1179,7 +1190,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.Pdf.save")
def test_delete_pages_with_error(self, mock_pdf_save, mock_consume_delay) -> None:
def test_delete_pages_with_error(self, mock_pdf_save, mock_consume_delay):
"""
GIVEN:
- Existing documents
@@ -1298,7 +1309,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertEqual(self.doc2.archive_serial_number, 333)
@mock.patch("documents.tasks.consume_file.apply_async")
def test_edit_pdf_with_update_document(self, mock_consume_delay) -> None:
def test_edit_pdf_with_update_document(self, mock_consume_delay):
"""
GIVEN:
- A single existing PDF document
@@ -1326,7 +1337,10 @@ class TestPDFActions(DirectoriesMixin, TestCase):
)
self.assertIsNotNone(task_kwargs["overrides"])
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.new")
@mock.patch("pikepdf.open")
@@ -1336,7 +1350,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_new,
mock_consume_delay,
mock_magic,
) -> None:
):
Document.objects.create(
checksum="B-v1",
title="B version 1",
@@ -1414,7 +1428,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self,
mock_consume_file,
mock_group,
) -> None:
):
"""
GIVEN:
- Existing document
@@ -1444,7 +1458,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_mkdtemp,
mock_consume_delay,
mock_update_document,
) -> None:
):
doc = self.doc1
temp_dir = self.dirs.scratch_dir / "remove-password-update"
temp_dir.mkdir(parents=True, exist_ok=True)
@@ -1480,7 +1494,10 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertEqual(task_kwargs["input_doc"].root_document_id, doc.id)
self.assertIsNotNone(task_kwargs["overrides"])
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.open")
def test_remove_password_explicit_selection_uses_root_source_when_root_selected(
+17 -46
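The hunks above rewrite every `mock.patch` target from `documents.data_models.magic.from_file` to `documents.data_models.mime_detection.from_file` because `mock.patch` must name the attribute at its lookup site, not where the function is defined. A self-contained sketch of that mechanic, with toy modules standing in for the real ones (module names here are stand-ins, not the actual paperless-ngx layout):

```python
import sys
import types
from unittest import mock

# Fake "mime_detection" module with a from_file function.
mime_detection = types.ModuleType("mime_detection")
mime_detection.from_file = lambda path: "application/octet-stream"

# Fake "data_models" module that looks up mime_detection.from_file at call time.
data_models = types.ModuleType("data_models")
data_models.mime_detection = mime_detection
data_models.detect = lambda path: data_models.mime_detection.from_file(path)
sys.modules["data_models"] = data_models

# Patch at the lookup site, as the updated tests do:
with mock.patch("data_models.mime_detection.from_file",
                return_value="application/pdf"):
    assert data_models.detect("x.bin") == "application/pdf"

# Outside the context manager the original function is restored.
assert data_models.detect("x.bin") == "application/octet-stream"
```

This is why swapping the detection backend forces a mechanical edit across every test that patched the old `magic.from_file` path.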
@@ -140,28 +140,25 @@ class FaultyGenericExceptionParser(_BaseNewStyleParser):
raise Exception("Generic exception.")
def fake_magic_from_file(file, *, mime=False): # NOSONAR
if mime:
filepath = Path(file)
if filepath.name.startswith("invalid_pdf"):
return "application/octet-stream"
if filepath.name.startswith("valid_pdf"):
return "application/pdf"
if filepath.suffix == ".pdf":
return "application/pdf"
elif filepath.suffix == ".png":
return "image/png"
elif filepath.suffix == ".webp":
return "image/webp"
elif filepath.suffix == ".eml":
return "message/rfc822"
else:
return "unknown"
def fake_magic_from_file(file): # NOSONAR
filepath = Path(file)
if filepath.name.startswith("invalid_pdf"):
return "application/octet-stream"
if filepath.name.startswith("valid_pdf"):
return "application/pdf"
if filepath.suffix == ".pdf":
return "application/pdf"
elif filepath.suffix == ".png":
return "image/png"
elif filepath.suffix == ".webp":
return "image/webp"
elif filepath.suffix == ".eml":
return "message/rfc822"
else:
return "A verbose string that describes the contents of the file"
return "unknown"
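The reworked fake above takes a single `file` argument with no `mime=` keyword, matching the narrower surface of the new `mime_detection.from_file`. A hypothetical sketch of such a shim, assuming magika's `Magika().identify_path(path).output.mime_type` API (the module name, signature, and `_identify` hook are illustrative, not the actual paperless-ngx implementation):

```python
from pathlib import Path

def from_file(file, *, _identify=None) -> str:
    """Content-based MIME detection; drop-in for magic.from_file(f, mime=True)."""
    path = Path(file)
    if _identify is None:
        # Imported lazily; requires the magika package at runtime.
        from magika import Magika
        _identify = lambda p: Magika().identify_path(p).output.mime_type
    return _identify(path)
```

Tests can then swap the backend by patching this single function, as the fakes in this diff do.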
@mock.patch("documents.consumer.magic.from_file", fake_magic_from_file)
@mock.patch("documents.consumer.mime_detection.from_file", fake_magic_from_file)
class TestConsumer(
DirectoriesMixin,
FileSystemAssertsMixin,
@@ -246,7 +243,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -298,7 +294,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -318,7 +313,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -334,7 +328,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -351,7 +344,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -368,7 +360,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.document_type.id, dt.id)
self._assert_first_last_send_progress()
@@ -383,7 +374,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.storage_path.id, sp.id)
self._assert_first_last_send_progress()
@@ -400,7 +390,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIn(t1, document.tags.all())
self.assertNotIn(t2, document.tags.all())
@@ -427,7 +416,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
fields_used = [
field_instance.field for field_instance in document.custom_fields.all()
@@ -450,7 +438,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.archive_serial_number, 123)
self._assert_first_last_send_progress()
@@ -470,7 +457,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
now = timezone.now()
self.assertEqual(document.title, f"{c.name}{dt.name} {now.strftime('%m-%y')}")
@@ -486,7 +472,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.owner, testuser)
self._assert_first_last_send_progress()
@@ -505,7 +490,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
user_checker = ObjectPermissionChecker(testuser)
self.assertTrue(user_checker.has_perm("view_document", document))
@@ -578,7 +562,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
document.delete()
with self.assertRaisesMessage(ConsumerError, "document is in the trash"):
@@ -659,7 +642,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.title, "new docs")
self.assertEqual(document.filename, "none/new docs.pdf")
@@ -681,7 +663,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
assert document is not None
@@ -720,7 +701,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.title, "new docs")
self.assertIsNotNone(document.title)
@@ -741,7 +721,6 @@ class TestConsumer(
document = Document.objects.first()
assert document is not None
assert document is not None
self.assertEqual(document.version_label, "v1")
@@ -958,7 +937,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.correspondent, correspondent)
self.assertEqual(document.document_type, dtype)
@@ -976,7 +954,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self._assert_first_last_send_progress()
@@ -1007,7 +984,6 @@ class TestConsumer(
# Move the existing document to trash
document = Document.objects.first()
assert document is not None
document.delete()
dst = self.get_test_file()
@@ -1036,7 +1012,6 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self._assert_first_last_send_progress()
@@ -1168,7 +1143,7 @@ class TestConsumer(
)
@mock.patch("documents.consumer.magic.from_file", fake_magic_from_file)
@mock.patch("documents.consumer.mime_detection.from_file", fake_magic_from_file)
class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
def setUp(self) -> None:
super().setUp()
@@ -1195,7 +1170,6 @@ class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(
document.created,
@@ -1226,7 +1200,6 @@ class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(
document.created,
@@ -1257,7 +1230,6 @@ class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(
document.created,
@@ -1290,7 +1262,6 @@ class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(
document.created,
+2 -2
@@ -1514,7 +1514,7 @@ class TestFilenameGeneration(DirectoriesMixin, TestCase):
Path("somepath/asn-201-400/asn-3xx/Does Matter.pdf"),
)
def test_template_related_context_keeps_legacy_string_coercion(self) -> None:
def test_template_related_context_keeps_legacy_string_coercion(self):
"""
GIVEN:
- A storage path template that uses related objects directly as strings
@@ -1908,7 +1908,7 @@ class TestCustomFieldFilenameUpdates(
self.assertLessEqual(m.call_count, 1)
@override_settings(FILENAME_FORMAT=None)
def test_overlong_storage_path_keeps_existing_filename(self) -> None:
def test_overlong_storage_path_keeps_existing_filename(self):
initial_filename = generate_filename(self.doc)
Document.objects.filter(pk=self.doc.pk).update(filename=str(initial_filename))
self.doc.refresh_from_db()
+5 -5
@@ -83,7 +83,7 @@ class TestDateLocalization:
format_style: str,
locale_str: str,
expected_output: str,
) -> None:
):
"""
Tests `localize_date` with `date` objects across different locales and formats.
"""
@@ -144,7 +144,7 @@ class TestDateLocalization:
format_style: str,
locale_str: str,
expected_output: str,
) -> None:
):
# To handle the non-breaking space in French and other locales
result = localize_date(value, format_style, locale_str)
assert result.replace("\u202f", " ") == expected_output.replace("\u202f", " ")
@@ -161,7 +161,7 @@ class TestDateLocalization:
def test_localize_date_raises_type_error_for_invalid_input(
self,
invalid_value: None | list[object] | dict[Any, Any] | Literal[1698330605],
) -> None:
):
with pytest.raises(TypeError) as excinfo:
localize_date(invalid_value, "medium", "en_US")
@@ -228,7 +228,7 @@ class TestDateLocalization:
format_style: str,
locale_str: str,
expected_output: str,
) -> None:
):
"""
Tests `localize_date` with `date` string across different locales and formats.
"""
@@ -289,7 +289,7 @@ class TestDateLocalization:
format_style: str,
locale_str: str,
expected_output: str,
) -> None:
):
"""
Tests `localize_date` with `date` string across different locales and formats.
"""
+13 -18
@@ -10,10 +10,10 @@ from documents.regex import validate_regex_pattern
class TestValidateRegexPattern:
def test_valid_pattern(self) -> None:
def test_valid_pattern(self):
validate_regex_pattern(r"\d+")
def test_invalid_pattern_raises(self) -> None:
def test_invalid_pattern_raises(self):
with pytest.raises(ValueError):
validate_regex_pattern(r"[invalid")
@@ -40,7 +40,7 @@ class TestSafeRegexSearchAndMatch:
),
],
)
def test_match_found(self, func, pattern, text, expected_group) -> None:
def test_match_found(self, func, pattern, text, expected_group):
result = func(pattern, text)
assert result is not None
assert result.group() == expected_group
@@ -52,7 +52,7 @@ class TestSafeRegexSearchAndMatch:
pytest.param(safe_regex_match, r"\d+", "abc123", id="match-no-match"),
],
)
def test_no_match(self, func, pattern, text) -> None:
def test_no_match(self, func, pattern, text):
assert func(pattern, text) is None
@pytest.mark.parametrize(
@@ -62,7 +62,7 @@ class TestSafeRegexSearchAndMatch:
pytest.param(safe_regex_match, id="match"),
],
)
def test_invalid_pattern_returns_none(self, func) -> None:
def test_invalid_pattern_returns_none(self, func):
assert func(r"[invalid", "test") is None
@pytest.mark.parametrize(
@@ -72,7 +72,7 @@ class TestSafeRegexSearchAndMatch:
pytest.param(safe_regex_match, id="match"),
],
)
def test_flags_respected(self, func) -> None:
def test_flags_respected(self, func):
assert func(r"abc", "ABC", flags=regex.IGNORECASE) is not None
@pytest.mark.parametrize(
@@ -82,12 +82,7 @@ class TestSafeRegexSearchAndMatch:
pytest.param(safe_regex_match, "match", id="match"),
],
)
def test_timeout_returns_none(
self,
func,
method_name,
mocker: MockerFixture,
) -> None:
def test_timeout_returns_none(self, func, method_name, mocker: MockerFixture):
mock_compile = mocker.patch("documents.regex.regex.compile")
getattr(mock_compile.return_value, method_name).side_effect = TimeoutError
assert func(r"\d+", "test") is None
@@ -102,31 +97,31 @@ class TestSafeRegexSub:
pytest.param(r"abc", "X", "ABC", "X", id="flags"),
],
)
def test_substitution(self, pattern, repl, text, expected) -> None:
def test_substitution(self, pattern, repl, text, expected):
flags = regex.IGNORECASE if pattern == r"abc" else 0
result = safe_regex_sub(pattern, repl, text, flags=flags)
assert result == expected
def test_invalid_pattern_returns_none(self) -> None:
def test_invalid_pattern_returns_none(self):
assert safe_regex_sub(r"[invalid", "x", "test") is None
def test_timeout_returns_none(self, mocker: MockerFixture) -> None:
def test_timeout_returns_none(self, mocker: MockerFixture):
mock_compile = mocker.patch("documents.regex.regex.compile")
mock_compile.return_value.sub.side_effect = TimeoutError
assert safe_regex_sub(r"\d+", "X", "test") is None
class TestSafeRegexFinditer:
def test_yields_matches(self) -> None:
def test_yields_matches(self):
pattern = regex.compile(r"\d+")
matches = list(safe_regex_finditer(pattern, "a1b22c333"))
assert [m.group() for m in matches] == ["1", "22", "333"]
def test_no_matches(self) -> None:
def test_no_matches(self):
pattern = regex.compile(r"\d+")
assert list(safe_regex_finditer(pattern, "abcdef")) == []
def test_timeout_stops_iteration(self, mocker: MockerFixture) -> None:
def test_timeout_stops_iteration(self, mocker: MockerFixture):
mock_pattern = mocker.MagicMock()
mock_pattern.finditer.side_effect = TimeoutError
mock_pattern.pattern = r"\d+"
@@ -280,7 +280,6 @@ class ShareLinkBundleBuildTaskTests(DirectoriesMixin, APITestCase):
self.document.archive_filename = f"{self.document.pk:07}.pdf"
self.document.save()
path = self.document.archive_path
assert path is not None
else:
path = self.document.source_path
path.parent.mkdir(parents=True, exist_ok=True)
@@ -305,7 +304,6 @@ class ShareLinkBundleBuildTaskTests(DirectoriesMixin, APITestCase):
self.assertGreater(bundle.size_bytes or 0, 0)
final_path = bundle.absolute_file_path
self.assertIsNotNone(final_path)
assert final_path is not None
self.assertTrue(final_path.exists())
with zipfile.ZipFile(final_path) as zipf:
names = zipf.namelist()
@@ -329,7 +327,6 @@ class ShareLinkBundleBuildTaskTests(DirectoriesMixin, APITestCase):
bundle.refresh_from_db()
final_path = bundle.absolute_file_path
self.assertIsNotNone(final_path)
assert final_path is not None
self.assertTrue(final_path.exists())
self.assertNotEqual(final_path.read_bytes(), b"old")
@@ -357,7 +354,6 @@ class ShareLinkBundleBuildTaskTests(DirectoriesMixin, APITestCase):
bundle.refresh_from_db()
self.assertEqual(bundle.status, ShareLinkBundle.Status.FAILED)
self.assertIsInstance(bundle.last_error, dict)
assert isinstance(bundle.last_error, dict)
self.assertEqual(bundle.last_error.get("message"), "zip failure")
self.assertEqual(bundle.last_error.get("exception_type"), "RuntimeError")
scratch_zips = list(Path(settings.SCRATCH_DIR).glob("*.zip"))
+24 -28
@@ -56,11 +56,7 @@ def send_publish(
@pytest.mark.django_db
class TestBeforeTaskPublishHandler:
def test_creates_task_for_consume_file(
self,
consume_input_doc,
consume_overrides,
) -> None:
def test_creates_task_for_consume_file(self, consume_input_doc, consume_overrides):
task_id = send_publish(
"documents.tasks.consume_file",
(),
@@ -74,18 +70,18 @@ class TestBeforeTaskPublishHandler:
assert task.input_data["filename"] == "invoice.pdf"
assert task.owner_id == consume_overrides.owner_id
def test_creates_task_for_train_classifier(self) -> None:
def test_creates_task_for_train_classifier(self):
task_id = send_publish("documents.tasks.train_classifier", (), {})
task = PaperlessTask.objects.get(task_id=task_id)
assert task.task_type == PaperlessTask.TaskType.TRAIN_CLASSIFIER
assert task.trigger_source == PaperlessTask.TriggerSource.MANUAL
def test_creates_task_for_sanity_check(self) -> None:
def test_creates_task_for_sanity_check(self):
task_id = send_publish("documents.tasks.sanity_check", (), {})
task = PaperlessTask.objects.get(task_id=task_id)
assert task.task_type == PaperlessTask.TaskType.SANITY_CHECK
def test_creates_task_for_process_mail_accounts(self) -> None:
def test_creates_task_for_process_mail_accounts(self):
task_id = send_publish(
"paperless_mail.tasks.process_mail_accounts",
(),
@@ -95,13 +91,13 @@ class TestBeforeTaskPublishHandler:
assert task.task_type == PaperlessTask.TaskType.MAIL_FETCH
assert task.input_data["account_ids"] == [1, 2]
def test_mail_fetch_no_account_ids_stores_empty_input(self) -> None:
def test_mail_fetch_no_account_ids_stores_empty_input(self):
"""Beat-scheduled mail checks pass no account_ids; input_data should be {} not {"account_ids": None}."""
task_id = send_publish("paperless_mail.tasks.process_mail_accounts", (), {})
task = PaperlessTask.objects.get(task_id=task_id)
assert task.input_data == {}
def test_overrides_date_serialized_as_iso_string(self, consume_input_doc) -> None:
def test_overrides_date_serialized_as_iso_string(self, consume_input_doc):
"""A datetime.date in overrides is stored as an ISO string so input_data is JSON-safe."""
overrides = DocumentMetadataOverrides(created=datetime.date(2024, 1, 15))
@@ -114,7 +110,7 @@ class TestBeforeTaskPublishHandler:
task = PaperlessTask.objects.get(task_id=task_id)
assert task.input_data["overrides"]["created"] == "2024-01-15"
def test_overrides_path_serialized_as_string(self, consume_input_doc) -> None:
def test_overrides_path_serialized_as_string(self, consume_input_doc):
"""A Path value in overrides is stored as a plain string so input_data is JSON-safe."""
overrides = DocumentMetadataOverrides()
overrides.filename = Path("/uploads/invoice.pdf") # type: ignore[assignment]
@@ -163,11 +159,11 @@ class TestBeforeTaskPublishHandler:
task = PaperlessTask.objects.get(task_id=task_id)
assert task.trigger_source == expected_trigger_source
def test_ignores_untracked_task(self) -> None:
def test_ignores_untracked_task(self):
send_publish("documents.tasks.some_untracked_task", (), {})
assert PaperlessTask.objects.count() == 0
def test_ignores_none_headers(self) -> None:
def test_ignores_none_headers(self):
before_task_publish_handler(sender=None, headers=None, body=None)
assert PaperlessTask.objects.count() == 0
@@ -189,7 +185,7 @@ class TestBeforeTaskPublishHandler:
@pytest.mark.django_db
class TestTaskPrerunHandler:
def test_marks_task_started(self) -> None:
def test_marks_task_started(self):
task = PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
task_prerun_handler(task_id=task.task_id)
@@ -219,7 +215,7 @@ class TestTaskPostrunHandler:
date_started=timezone.now(),
)
def test_records_success_with_dict_result(self) -> None:
def test_records_success_with_dict_result(self):
task = self._started_task()
task_postrun_handler(
@@ -234,7 +230,7 @@ class TestTaskPostrunHandler:
assert task.duration_seconds is not None
assert task.wait_time_seconds is not None
def test_skips_failure_state(self) -> None:
def test_skips_failure_state(self):
"""postrun skips FAILURE; task_failure_handler owns that path."""
task = self._started_task()
@@ -242,7 +238,7 @@ class TestTaskPostrunHandler:
task.refresh_from_db()
assert task.status == PaperlessTask.Status.STARTED
def test_records_success_with_consume_result(self) -> None:
def test_records_success_with_consume_result(self):
"""ConsumeFileSuccessResult dict is stored directly as result_data."""
from documents.data_models import ConsumeFileSuccessResult
@@ -255,7 +251,7 @@ class TestTaskPostrunHandler:
task.refresh_from_db()
assert task.result_data == {"document_id": 42}
def test_records_stopped_with_reason(self) -> None:
def test_records_stopped_with_reason(self):
"""ConsumeFileStoppedResult dict is stored directly as result_data."""
from documents.data_models import ConsumeFileStoppedResult
@@ -268,14 +264,14 @@ class TestTaskPostrunHandler:
task.refresh_from_db()
assert task.result_data == {"reason": "Barcode splitting complete!"}
def test_none_retval_stores_no_result_data(self) -> None:
def test_none_retval_stores_no_result_data(self):
"""None return value (non-consume tasks) leaves result_data untouched."""
task = self._started_task()
task_postrun_handler(task_id=task.task_id, retval=None, state="SUCCESS")
task.refresh_from_db()
assert task.result_data is None
def test_ignores_unknown_task_id(self) -> None:
def test_ignores_unknown_task_id(self):
task_postrun_handler(
task_id="nonexistent",
@@ -283,7 +279,7 @@ class TestTaskPostrunHandler:
state="SUCCESS",
) # must not raise
def test_records_revoked_state(self) -> None:
def test_records_revoked_state(self):
task = self._started_task()
task_postrun_handler(task_id=task.task_id, retval=None, state="REVOKED")
@@ -293,7 +289,7 @@ class TestTaskPostrunHandler:
@pytest.mark.django_db
class TestTaskFailureHandler:
def test_records_failure_with_exception(self) -> None:
def test_records_failure_with_exception(self):
task = PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.CONSUME_FILE,
@@ -312,7 +308,7 @@ class TestTaskFailureHandler:
assert task.result_data["error_message"] == "PDF parse failed"
assert task.date_done is not None
def test_records_traceback_when_provided(self) -> None:
def test_records_traceback_when_provided(self):
task = PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.CONSUME_FILE,
@@ -335,7 +331,7 @@ class TestTaskFailureHandler:
assert "traceback" in task.result_data
assert len(task.result_data["traceback"]) <= 5000
def test_computes_duration_and_wait_time(self) -> None:
def test_computes_duration_and_wait_time(self):
now = timezone.now()
task = PaperlessTaskFactory(
@@ -354,14 +350,14 @@ class TestTaskFailureHandler:
assert task.duration_seconds == pytest.approx(5.0, abs=1.0)
assert task.wait_time_seconds == pytest.approx(5.0, abs=1.0)
def test_ignores_none_task_id(self) -> None:
def test_ignores_none_task_id(self):
task_failure_handler(task_id=None, exception=ValueError("x"), traceback=None)
@pytest.mark.django_db
class TestTaskRevokedHandler:
def test_marks_task_revoked(self, mocker: pytest_mock.MockerFixture) -> None:
def test_marks_task_revoked(self, mocker: pytest_mock.MockerFixture):
"""task_revoked_handler moves a queued task to REVOKED and stamps date_done."""
task = PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
request = mocker.MagicMock()
@@ -372,12 +368,12 @@ class TestTaskRevokedHandler:
assert task.status == PaperlessTask.Status.REVOKED
assert task.date_done is not None
def test_ignores_none_request(self) -> None:
def test_ignores_none_request(self):
"""task_revoked_handler must not raise when request is None."""
task_revoked_handler(request=None) # must not raise
def test_ignores_unknown_task_id(self, mocker: pytest_mock.MockerFixture) -> None:
def test_ignores_unknown_task_id(self, mocker: pytest_mock.MockerFixture):
"""task_revoked_handler must not raise for a task_id not in the database."""
request = mocker.MagicMock()
request.id = "nonexistent-id"
+1 -29
@@ -27,7 +27,6 @@ from documents.models import StoragePath
from documents.models import Tag
from documents.signals.handlers import update_llm_suggestions_cache
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import read_streaming_response
from paperless.models import ApplicationConfiguration
@@ -158,7 +157,7 @@ class TestViews(DirectoriesMixin, TestCase):
# Valid
response = self.client.get(f"/share/{sl1.slug}")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(read_streaming_response(response), content)
self.assertEqual(response.content, content)
# Invalid
response = self.client.get("/share/123notaslug", follow=True)
@@ -356,33 +355,6 @@ class TestAISuggestions(DirectoriesMixin, TestCase):
},
)
@patch("documents.views.get_ai_document_classification")
@override_settings(
AI_ENABLED=True,
LLM_BACKEND="openai-like",
)
def test_suggestions_with_invalid_ai_configuration(
self,
mock_get_ai_classification,
) -> None:
mock_get_ai_classification.side_effect = ValueError(
"Unknown model 'gpt-5.4-mini-2026-03-17'.",
)
self.client.force_login(user=self.user)
response = self.client.get(f"/api/documents/{self.document.pk}/suggestions/")
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertEqual(
response.json(),
{
"ai": ["Invalid AI configuration."],
},
)
self.assertIsNone(
get_llm_suggestion_cache(self.document.pk, backend="openai-like"),
)
def test_invalidate_suggestions_cache(self) -> None:
self.client.force_login(user=self.user)
suggestions = {
+68 -117
@@ -187,7 +187,6 @@ class TestWorkflows(
)
document = Document.objects.first()
assert document is not None
self.assertEqual(document.correspondent, self.c)
self.assertEqual(document.document_type, self.dt)
self.assertEqual(list(document.tags.all()), [self.t1, self.t2, self.t3])
@@ -299,7 +298,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertEqual(document.correspondent, self.c)
self.assertEqual(document.document_type, self.dt)
self.assertEqual(list(document.tags.all()), [self.t1, self.t2, self.t3])
@@ -417,7 +415,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
# workflow 1
self.assertEqual(document.document_type, self.dt)
# workflow 2
@@ -486,7 +483,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertEqual(document.title, "Doc fnmatch title")
expected_str = f"Document matched {trigger} from {w}"
@@ -539,7 +535,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -552,8 +547,7 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(get_groups_with_perms(document).count(), 0)
self.assertEqual(
get_users_with_perms(
document,
@@ -561,8 +555,7 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(get_groups_with_perms(document).count(), 0)
self.assertEqual(document.title, "simple")
expected_str = f"Document did not match {w}"
@@ -616,7 +609,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -629,8 +621,12 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
self.assertEqual(
get_users_with_perms(
document,
@@ -638,8 +634,12 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
self.assertEqual(document.title, "simple")
expected_str = f"Document did not match {w}"
@@ -696,7 +696,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -709,8 +708,12 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
self.assertEqual(
get_users_with_perms(
document,
@@ -718,8 +721,12 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
self.assertEqual(document.title, "simple")
expected_str = f"Document did not match {w}"
@@ -773,7 +780,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -786,8 +792,12 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
self.assertEqual(
get_users_with_perms(
document,
@@ -795,8 +805,12 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
self.assertEqual(document.title, "simple")
expected_str = f"Document did not match {w}"
@@ -884,7 +898,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertEqual(
list(document.custom_fields.all().values_list("field", flat=True)),
[self.cf1.pk],
@@ -1955,7 +1968,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertRegex(
document.title,
r"Doc added in \w{3,}",
@@ -2052,11 +2064,11 @@ class TestWorkflows(
format="json",
)
view_users_perms: QuerySet[Any] = get_users_with_perms(
view_users_perms: QuerySet = get_users_with_perms(
doc,
only_with_perms_in=["view_document"],
)
change_users_perms: QuerySet[Any] = get_users_with_perms(
change_users_perms: QuerySet = get_users_with_perms(
doc,
only_with_perms_in=["change_document"],
)
@@ -2067,7 +2079,7 @@ class TestWorkflows(
self.assertIn(self.user3, view_users_perms)
self.assertIn(self.user3, change_users_perms)
group_perms: QuerySet[Any] = get_groups_with_perms(doc)
group_perms: QuerySet = get_groups_with_perms(doc)
# group1 should still have permissions
self.assertIn(self.group1, group_perms)
# group2 should have been added
@@ -2833,72 +2845,9 @@ class TestWorkflows(
self.assertEqual(doc.custom_fields.all().count(), 0)
self.assertFalse(self.user3.has_perm("documents.view_document", doc))
self.assertFalse(self.user3.has_perm("documents.change_document", doc))
group_perms: QuerySet[Any] = get_groups_with_perms(doc)
group_perms: QuerySet = get_groups_with_perms(doc)
self.assertNotIn(self.group1, group_perms)
def test_document_updated_workflow_assignment_persists_when_removing_trigger_tag(
self,
) -> None:
"""
GIVEN:
- A document updated workflow filtered on a tag
- The workflow assigns a new title and removes that same tag
WHEN:
- The document is updated while carrying the trigger tag
THEN:
- The new title persists and the trigger tag is removed
"""
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_UPDATED,
)
trigger.filter_has_tags.add(self.t1)
assignment = WorkflowAction.objects.create(
type=WorkflowAction.WorkflowActionType.ASSIGNMENT,
assign_title="workflow renamed",
order=0,
)
removal = WorkflowAction.objects.create(
type=WorkflowAction.WorkflowActionType.REMOVAL,
order=1,
)
removal.remove_tags.add(self.t1)
removal.save()
workflow = Workflow.objects.create(
name="Workflow rename and remove trigger tag",
order=0,
)
workflow.triggers.add(trigger)
workflow.actions.add(assignment, removal)
workflow.save()
doc = Document.objects.create(
title="sample test",
mime_type="application/pdf",
checksum="rename-remove-trigger-tag",
original_filename="sample.pdf",
)
generated = generate_unique_filename(doc)
destination = (settings.ORIGINALS_DIR / generated).resolve()
create_source_path_directory(destination)
shutil.copy(self.SAMPLE_DIR / "simple.pdf", destination)
Document.objects.filter(pk=doc.pk).update(filename=generated.as_posix())
doc.refresh_from_db()
doc.tags.set([self.t1, self.t2])
superuser = User.objects.create_superuser("superuser")
self.client.force_authenticate(user=superuser)
self.client.patch(
f"/api/documents/{doc.id}/",
{"title": "user update to trigger workflow"},
format="json",
)
doc.refresh_from_db()
self.assertEqual(doc.title, "workflow renamed")
self.assertFalse(doc.tags.filter(pk=self.t1.pk).exists())
self.assertTrue(doc.tags.filter(pk=self.t2.pk).exists())
def test_removal_action_document_updated_removeall(self) -> None:
"""
GIVEN:
@@ -2967,7 +2916,7 @@ class TestWorkflows(
self.assertEqual(doc.custom_fields.all().count(), 0)
self.assertFalse(self.user3.has_perm("documents.view_document", doc))
self.assertFalse(self.user3.has_perm("documents.change_document", doc))
group_perms: QuerySet[Any] = get_groups_with_perms(doc)
group_perms: QuerySet = get_groups_with_perms(doc)
self.assertNotIn(self.group1, group_perms)
def test_removal_action_document_consumed(self) -> None:
@@ -3045,7 +2994,6 @@ class TestWorkflows(
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
@@ -3168,7 +3116,6 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -3182,8 +3129,12 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
self.assertEqual(
get_users_with_perms(
document,
@@ -3191,8 +3142,12 @@ class TestWorkflows(
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
self.assertEqual(
document.custom_fields.all()
.values_list(
@@ -3261,10 +3216,7 @@ class TestWorkflows(
PAPERLESS_URL="http://localhost:8000",
)
@mock.patch("django.core.mail.message.EmailMessage.send")
def test_workflow_assignment_then_email_includes_attachment(
self,
mock_email_send,
) -> None:
def test_workflow_assignment_then_email_includes_attachment(self, mock_email_send):
"""
GIVEN:
- Workflow with assignment and email actions
@@ -3813,7 +3765,7 @@ class TestWorkflows(
def test_workflow_webhook_action_does_not_overwrite_concurrent_tags(
self,
mock_execute_webhook_action,
) -> None:
):
"""
GIVEN:
- A document updated workflow with only a webhook action
@@ -3867,7 +3819,7 @@ class TestWorkflows(
def test_workflow_tag_actions_do_not_overwrite_concurrent_tags(
self,
mock_execute_webhook_action,
) -> None:
):
"""
GIVEN:
- A document updated workflow that clears tags and assigns an inbox tag
@@ -4145,7 +4097,7 @@ class TestWorkflows(
def test_password_removal_action_attempts_multiple_passwords(
self,
mock_remove_password,
) -> None:
):
"""
GIVEN:
- Workflow password removal action
@@ -4199,7 +4151,7 @@ class TestWorkflows(
def test_password_removal_action_fails_without_correct_password(
self,
mock_remove_password,
) -> None:
):
"""
GIVEN:
- Workflow password removal action
@@ -4232,7 +4184,7 @@ class TestWorkflows(
def test_password_removal_action_skips_without_passwords(
self,
mock_remove_password,
) -> None:
):
"""
GIVEN:
- Workflow password removal action with no passwords
@@ -4264,7 +4216,7 @@ class TestWorkflows(
def test_password_removal_consumable_document_deferred(
self,
mock_remove_password,
) -> None:
):
"""
GIVEN:
- Workflow password removal action
@@ -4331,7 +4283,7 @@ class TestWorkflows(
)
assert mock_remove_password.call_count == 2
def test_workflow_trash_action_soft_delete(self) -> None:
def test_workflow_trash_action_soft_delete(self):
"""
GIVEN:
- Document updated workflow with delete action
@@ -4374,7 +4326,7 @@ class TestWorkflows(
PAPERLESS_URL="http://localhost:8000",
)
@mock.patch("django.core.mail.message.EmailMessage.send")
def test_workflow_trash_with_email_action(self, mock_email_send) -> None:
def test_workflow_trash_with_email_action(self, mock_email_send):
"""
GIVEN:
- Workflow with email action, then move to trash action
@@ -4429,7 +4381,7 @@ class TestWorkflows(
PAPERLESS_URL="http://localhost:8000",
)
@mock.patch("documents.workflows.webhooks.send_webhook.apply_async")
def test_workflow_trash_with_webhook_action(self, mock_webhook_delay) -> None:
def test_workflow_trash_with_webhook_action(self, mock_webhook_delay):
"""
GIVEN:
- Workflow with webhook action (include_document=True), then move to trash action
@@ -4562,7 +4514,7 @@ class TestWorkflows(
self.assertEqual(Document.objects.count(), 0)
self.assertEqual(Document.deleted_objects.count(), 1)
def test_multiple_workflows_trash_then_assignment(self) -> None:
def test_multiple_workflows_trash_then_assignment(self):
"""
GIVEN:
- Workflow 1 (order=0) with move to trash action
@@ -4631,7 +4583,7 @@ class TestWorkflows(
log_output,
)
def test_workflow_delete_action_during_consumption(self) -> None:
def test_workflow_delete_action_during_consumption(self):
"""
GIVEN:
- Workflow with consumption trigger and delete action
@@ -4690,7 +4642,7 @@ class TestWorkflows(
# No document should be created
self.assertEqual(Document.objects.count(), 0)
def test_workflow_delete_action_during_consumption_with_assignment(self) -> None:
def test_workflow_delete_action_during_consumption_with_assignment(self):
"""
GIVEN:
- Workflow with consumption trigger, assignment action, then delete action
@@ -5204,5 +5156,4 @@ class TestDateWorkflowLocalization(
None,
)
document = Document.objects.first()
assert document is not None
assert document.title == expected_title
+7 -15
@@ -17,7 +17,6 @@ import pytest
from django.apps import apps
from django.db import connection
from django.db.migrations.executor import MigrationExecutor
from django.http import StreamingHttpResponse
from django.test import TransactionTestCase
from django.test import override_settings
@@ -151,13 +150,6 @@ def util_call_with_backoff(
return succeeded, result
def read_streaming_response(response: StreamingHttpResponse) -> bytes:
"""Consume a StreamingHttpResponse/FileResponse and close it."""
content = b"".join(response.streaming_content)
response.close()
return content
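The removed `read_streaming_response` helper is small enough to sketch in isolation; the stand-in response class below is hypothetical, standing in for Django's `StreamingHttpResponse` so the behavior can be shown without a Django install.

```python
class FakeStreamingResponse:
    """Stand-in for StreamingHttpResponse: iterable byte chunks plus close()."""

    def __init__(self, chunks):
        self.streaming_content = iter(chunks)
        self.closed = False

    def close(self):
        self.closed = True


def read_streaming_response(response) -> bytes:
    """Consume a streaming response fully, then close it (as the removed helper did)."""
    content = b"".join(response.streaming_content)
    response.close()
    return content
```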
class DirectoriesMixin:
"""
Creates and overrides settings for all folders and paths, then ensures
@@ -184,22 +176,22 @@ class FileSystemAssertsMixin:
Utilities for checking various state information of the file system
"""
def assertIsFile(self, path: PathLike[str] | str) -> None:
def assertIsFile(self, path: PathLike | str) -> None:
self.assertTrue(Path(path).resolve().is_file(), f"File does not exist: {path}")
def assertIsNotFile(self, path: PathLike[str] | str) -> None:
def assertIsNotFile(self, path: PathLike | str) -> None:
self.assertFalse(Path(path).resolve().is_file(), f"File does exist: {path}")
def assertIsDir(self, path: PathLike[str] | str) -> None:
def assertIsDir(self, path: PathLike | str) -> None:
self.assertTrue(Path(path).resolve().is_dir(), f"Dir does not exist: {path}")
def assertIsNotDir(self, path: PathLike[str] | str) -> None:
def assertIsNotDir(self, path: PathLike | str) -> None:
self.assertFalse(Path(path).resolve().is_dir(), f"Dir does exist: {path}")
def assertFilesEqual(
self,
path1: PathLike[str] | str,
path2: PathLike[str] | str,
path1: PathLike | str,
path2: PathLike | str,
) -> None:
path1 = Path(path1)
path2 = Path(path2)
@@ -210,7 +202,7 @@ class FileSystemAssertsMixin:
self.assertEqual(hash1, hash2, "File SHA256 mismatch")
def assertFileCountInDir(self, path: PathLike[str] | str, count: int) -> None:
def assertFileCountInDir(self, path: PathLike | str, count: int) -> None:
path = Path(path).resolve()
self.assertTrue(path.is_dir(), f"Path {path} is not a directory")
files = [x for x in path.iterdir() if x.is_file()]
+37 -113
@@ -20,7 +20,6 @@ from urllib.parse import quote
from urllib.parse import urlparse
import httpx
import magic
import pathvalidate
from django.conf import settings
from django.contrib.auth.models import Group
@@ -69,7 +68,6 @@ from django.views.decorators.http import condition
from django.views.decorators.http import last_modified
from django.views.generic import TemplateView
from django_filters.rest_framework import DjangoFilterBackend
from drf_spectacular.openapi import AutoSchema
from drf_spectacular.types import OpenApiTypes
from drf_spectacular.utils import OpenApiParameter
from drf_spectacular.utils import extend_schema
@@ -227,6 +225,7 @@ from documents.versioning import get_latest_version_for_root
from documents.versioning import get_request_version_param
from documents.versioning import get_root_document
from documents.versioning import resolve_requested_version_for_root
from paperless import mime_detection
from paperless import version
from paperless.celery import app as celery_app
from paperless.config import AIConfig
@@ -281,7 +280,8 @@ class IndexView(TemplateView):
first = lang[: lang.index("-")]
second = lang[lang.index("-") + 1 :]
return f"{first}-{second.upper()}"
return lang
else:
return lang
def get_context_data(self, **kwargs):
context = super().get_context_data(**kwargs)
@@ -426,7 +426,9 @@ class BulkPermissionMixin:
class PermissionsAwareDocumentCountMixin(BulkPermissionMixin, PassUserMixin):
"""Mixin to add document count to queryset, permissions-aware if needed"""
"""
Mixin to add document count to queryset, permissions-aware if needed
"""
# Default is simple relation path, override for through-table/count specialization.
document_count_through: type[Model] | None = None
@@ -1228,7 +1230,8 @@ class DocumentViewSet(
def get_filesize(self, filename):
if Path(filename).is_file():
return Path(filename).stat().st_size
return None
else:
return None
@action(methods=["get"], detail=True, filter_backends=[])
@method_decorator(cache_control(no_cache=True))
@@ -1320,17 +1323,7 @@ class DocumentViewSet(
refresh_suggestions_cache(doc.pk)
return Response(cached_llm_suggestions.suggestions)
try:
llm_suggestions = get_ai_document_classification(doc, request.user)
except ValueError as exc:
logger.exception(
"Invalid AI configuration while generating suggestions for "
"document %s: %s",
doc.pk,
exc,
exc_info=True,
)
raise ValidationError({"ai": [_("Invalid AI configuration.")]}) from exc
llm_suggestions = get_ai_document_classification(doc, request.user)
matched_tags = match_tags_by_name(
llm_suggestions.get("tags", []),
@@ -1455,7 +1448,7 @@ class DocumentViewSet(
file_doc = self._get_effective_file_doc(request_doc, root_doc, request)
handle = file_doc.thumbnail_file
return FileResponse(handle, content_type="image/webp")
return HttpResponse(handle, content_type="image/webp")
except FileNotFoundError:
raise Http404
@@ -2009,7 +2002,7 @@ class DocumentViewSet(
)
class ChatStreamingSerializer(serializers.Serializer[dict[str, Any]]):
class ChatStreamingSerializer(serializers.Serializer):
q = serializers.CharField(required=True)
document_id = serializers.IntegerField(required=False, allow_null=True)
@@ -2113,7 +2106,8 @@ class UnifiedSearchViewSet(DocumentViewSet):
def get_serializer_class(self):
if self._is_search_request():
return SearchResultSerializer
return DocumentSerializer
else:
return DocumentSerializer
def _get_active_search_params(self, request: Request | None = None) -> list[str]:
request = request or self.request
@@ -3231,7 +3225,7 @@ class GlobalSearchView(PassUserMixin):
query = request.query_params.get("query", None)
if query is None:
return HttpResponseBadRequest("Query required")
if len(query) < 3:
elif len(query) < 3:
return HttpResponseBadRequest("Query must be at least 3 characters")
db_only = request.query_params.get("db_only", False)
@@ -3526,7 +3520,7 @@ class StatisticsView(GenericAPIView[Any]):
"inbox_tag": (
inbox_tag_pks[0] if inbox_tag_pks else None
), # backwards compatibility
"inbox_tags": (inbox_tag_pks or None),
"inbox_tags": (inbox_tag_pks if inbox_tag_pks else None),
"document_file_type_counts": document_file_type_counts,
"character_count": character_count,
"tag_count": len(tags),
@@ -3538,16 +3532,6 @@ class StatisticsView(GenericAPIView[Any]):
)
@extend_schema_view(
post=extend_schema(
operation_id="bulk_download",
description="Download multiple documents as a ZIP archive.",
responses={
(HTTPStatus.OK, "application/zip"): OpenApiTypes.BINARY,
HTTPStatus.FORBIDDEN: None,
},
),
)
class BulkDownloadView(DocumentSelectionMixin, GenericAPIView[Any]):
permission_classes = (IsAuthenticated,)
serializer_class = BulkDownloadSerializer
@@ -3570,6 +3554,13 @@ class BulkDownloadView(DocumentSelectionMixin, GenericAPIView[Any]):
if not has_perms_owner_aware(request.user, "change_document", document):
return HttpResponseForbidden("Insufficient permissions")
settings.SCRATCH_DIR.mkdir(parents=True, exist_ok=True)
temp = tempfile.NamedTemporaryFile( # noqa: SIM115
dir=settings.SCRATCH_DIR,
suffix="-compressed-archive",
delete=False,
)
if content == "both":
strategy_class = OriginalAndArchiveStrategy
elif content == "originals":
@@ -3577,35 +3568,20 @@ class BulkDownloadView(DocumentSelectionMixin, GenericAPIView[Any]):
else:
strategy_class = ArchiveOnlyStrategy
settings.SCRATCH_DIR.mkdir(parents=True, exist_ok=True)
fd, temp_name = tempfile.mkstemp(
dir=settings.SCRATCH_DIR,
suffix="-compressed-archive",
)
os.close(fd)
temp_path = Path(temp_name)
with zipfile.ZipFile(temp.name, "w", compression) as zipf:
strategy = strategy_class(zipf, follow_formatting=follow_filename_format)
for document in documents:
strategy.add_document(document)
try:
with zipfile.ZipFile(temp_path, "w", compression) as zipf:
strategy = strategy_class(
zipf,
follow_formatting=follow_filename_format,
)
for document in documents:
strategy.add_document(document)
# TODO(stumpylog): Investigate using FileResponse here
with Path(temp.name).open("rb") as f:
response = HttpResponse(f, content_type="application/zip")
response["Content-Disposition"] = '{}; filename="{}"'.format(
"attachment",
"documents.zip",
)
f = temp_path.open("rb")
temp_path.unlink()
except Exception:
temp_path.unlink(missing_ok=True)
raise
return FileResponse(
f,
as_attachment=True,
filename="documents.zip",
content_type="application/zip",
)
return response
@extend_schema_view(
@@ -3823,15 +3799,6 @@ class RemoteVersionView(GenericAPIView[Any]):
)
class _TasksViewSetSchema(AutoSchema):
_UNPAGINATED_ACTIONS = frozenset({"summary", "active"})
def _get_paginator(self):
if getattr(self.view, "action", None) in self._UNPAGINATED_ACTIONS:
return None
return super()._get_paginator()
@extend_schema_view(
list=extend_schema(
parameters=[
@@ -3890,9 +3857,7 @@ class _TasksViewSetSchema(AutoSchema):
),
)
class TasksViewSet(ReadOnlyModelViewSet[PaperlessTask]):
schema = _TasksViewSetSchema()
permission_classes = (IsAuthenticated, PaperlessObjectPermissions)
pagination_class = StandardPagination
filter_backends = (
DjangoFilterBackend,
OrderingFilter,
@@ -3944,12 +3909,6 @@ class TasksViewSet(ReadOnlyModelViewSet[PaperlessTask]):
return TaskSerializerV9
return TaskSerializerV10
def paginate_queryset(self, queryset):
# v9: tasks endpoint was not paginated; preserve plain-list response
if self.request.version and int(self.request.version) < 10:
return None
return super().paginate_queryset(queryset)
def get_queryset(self):
is_v9 = self.request.version and int(self.request.version) < 10
if self.request.user.is_staff:
@@ -4313,7 +4272,7 @@ def serve_file(
use_archive: bool,
disposition: str,
follow_formatting: bool = False,
) -> FileResponse:
) -> HttpResponse:
if use_archive:
if TYPE_CHECKING:
assert doc.archive_filename
@@ -4336,7 +4295,7 @@ def serve_file(
if mime_type in {"application/csv", "text/csv"} and disposition == "inline":
mime_type = "text/plain"
response = FileResponse(file_handle, content_type=mime_type)
response = HttpResponse(file_handle, content_type=mime_type)
# Firefox is not able to handle unicode characters in filename field
# RFC 5987 addresses this issue
# see https://datatracker.ietf.org/doc/html/rfc5987#section-4.2
@@ -4638,16 +4597,6 @@ class CustomFieldViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet[Custom
"redis_status": serializers.CharField(),
"redis_error": serializers.CharField(),
"celery_status": serializers.CharField(),
"summary": inline_serializer(
name="TasksSummaryOverview",
fields={
"days": serializers.IntegerField(),
"total_count": serializers.IntegerField(),
"pending_count": serializers.IntegerField(),
"success_count": serializers.IntegerField(),
"failure_count": serializers.IntegerField(),
},
),
},
),
"index": inline_serializer(
@@ -4681,7 +4630,6 @@ class CustomFieldViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet[Custom
)
class SystemStatusView(PassUserMixin):
permission_classes = (IsAuthenticated,)
TASK_SUMMARY_DAYS = 30
def get(self, request, format=None):
if not has_system_status_permission(request.user):
@@ -4848,29 +4796,6 @@ class SystemStatusView(PassUserMixin):
last_llmindex_update.date_done if last_llmindex_update else None
)
summary_cutoff = timezone.now() - timedelta(days=self.TASK_SUMMARY_DAYS)
task_summary_agg = PaperlessTask.objects.filter(
date_created__gte=summary_cutoff,
).aggregate(
total_count=Count("id"),
pending_count=Count(
"id",
filter=Q(status=PaperlessTask.Status.PENDING),
),
success_count=Count(
"id",
filter=Q(status=PaperlessTask.Status.SUCCESS),
),
failure_count=Count(
"id",
filter=Q(status=PaperlessTask.Status.FAILURE),
),
)
task_summary = {
"days": self.TASK_SUMMARY_DAYS,
**task_summary_agg,
}
return Response(
{
"pngx_version": current_version,
@@ -4911,7 +4836,6 @@ class SystemStatusView(PassUserMixin):
"llmindex_status": llmindex_status,
"llmindex_last_modified": llmindex_last_modified,
"llmindex_error": llmindex_error,
"summary": task_summary,
},
},
)
@@ -4972,7 +4896,7 @@ def serve_logo(request: HttpRequest, filename: str | None = None) -> FileRespons
raise Http404("No logo configured")
path = app_logo.path
content_type = magic.from_file(path, mime=True) or "application/octet-stream"
content_type = mime_detection.from_file(path) or "application/octet-stream"
return FileResponse(
app_logo.open("rb"),
+2 -2
@@ -202,7 +202,7 @@ def apply_removal_to_document(
"""
if action.remove_all_tags:
Document.objects.get(pk=document.pk).tags.clear()
document.tags.clear()
else:
tag_ids_to_remove: set[int] = set()
for tag in action.remove_tags.all():
@@ -210,7 +210,7 @@ def apply_removal_to_document(
tag_ids_to_remove.update(int(pk) for pk in tag.get_descendants_pks())
if tag_ids_to_remove:
Document.objects.get(pk=document.pk).tags.remove(*tag_ids_to_remove)
document.tags.remove(*tag_ids_to_remove)
if action.remove_all_correspondents or (
document.correspondent
+49 -53
@@ -2,7 +2,7 @@ msgid ""
msgstr ""
"Project-Id-Version: paperless-ngx\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2026-04-28 17:07+0000\n"
"POT-Creation-Date: 2026-04-22 20:49+0000\n"
"PO-Revision-Date: 2022-02-17 04:17\n"
"Last-Translator: \n"
"Language-Team: English\n"
@@ -1352,8 +1352,8 @@ msgid "workflow runs"
msgstr ""
#: documents/serialisers.py:463 documents/serialisers.py:815
#: documents/serialisers.py:2681 documents/views.py:2264
#: documents/views.py:2333 paperless_mail/serialisers.py:155
#: documents/serialisers.py:2681 documents/views.py:2258
#: documents/views.py:2327 paperless_mail/serialisers.py:143
msgid "Insufficient permissions."
msgstr ""
@@ -1393,7 +1393,7 @@ msgstr ""
msgid "Duplicate document identifiers are not allowed."
msgstr ""
#: documents/serialisers.py:2767 documents/views.py:4158
#: documents/serialisers.py:2767 documents/views.py:4117
#, python-format
msgid "Documents not found: %(ids)s"
msgstr ""
@@ -1661,32 +1661,28 @@ msgstr ""
msgid "Unable to parse URI {value}"
msgstr ""
#: documents/views.py:1333
msgid "Invalid AI configuration."
msgstr ""
#: documents/views.py:2144
#: documents/views.py:2138
msgid "Specify only one of text, title_search, query, or more_like_id."
msgstr ""
#: documents/views.py:2257 documents/views.py:2330
#: documents/views.py:2251 documents/views.py:2324
msgid "Invalid more_like_id"
msgstr ""
#: documents/views.py:4170
#: documents/views.py:4129
#, python-format
msgid "Insufficient permissions to share document %(id)s."
msgstr ""
#: documents/views.py:4216
#: documents/views.py:4175
msgid "Bundle is already being processed."
msgstr ""
#: documents/views.py:4276
#: documents/views.py:4235
msgid "The share link bundle is still being prepared. Please try again later."
msgstr ""
#: documents/views.py:4286
#: documents/views.py:4245
msgid "The share link bundle is unavailable."
msgstr ""
@@ -1771,7 +1767,7 @@ msgid "CMYK"
msgstr ""
#: paperless/models.py:78 paperless/models.py:87
msgid "OpenAI-compatible"
msgid "OpenAI"
msgstr ""
#: paperless/models.py:79
@@ -1922,155 +1918,155 @@ msgstr ""
msgid "paperless application settings"
msgstr ""
#: paperless/settings/__init__.py:532
#: paperless/settings/__init__.py:531
msgid "English (US)"
msgstr ""
#: paperless/settings/__init__.py:533
#: paperless/settings/__init__.py:532
msgid "Arabic"
msgstr ""
#: paperless/settings/__init__.py:534
#: paperless/settings/__init__.py:533
msgid "Afrikaans"
msgstr ""
#: paperless/settings/__init__.py:535
#: paperless/settings/__init__.py:534
msgid "Belarusian"
msgstr ""
#: paperless/settings/__init__.py:536
#: paperless/settings/__init__.py:535
msgid "Bulgarian"
msgstr ""
#: paperless/settings/__init__.py:537
#: paperless/settings/__init__.py:536
msgid "Catalan"
msgstr ""
#: paperless/settings/__init__.py:538
#: paperless/settings/__init__.py:537
msgid "Czech"
msgstr ""
#: paperless/settings/__init__.py:539
#: paperless/settings/__init__.py:538
msgid "Danish"
msgstr ""
#: paperless/settings/__init__.py:540
#: paperless/settings/__init__.py:539
msgid "German"
msgstr ""
#: paperless/settings/__init__.py:541
#: paperless/settings/__init__.py:540
msgid "Greek"
msgstr ""
#: paperless/settings/__init__.py:542
#: paperless/settings/__init__.py:541
msgid "English (GB)"
msgstr ""
#: paperless/settings/__init__.py:543
#: paperless/settings/__init__.py:542
msgid "Spanish"
msgstr ""
#: paperless/settings/__init__.py:544
#: paperless/settings/__init__.py:543
msgid "Persian"
msgstr ""
#: paperless/settings/__init__.py:545
#: paperless/settings/__init__.py:544
msgid "Finnish"
msgstr ""
#: paperless/settings/__init__.py:546
#: paperless/settings/__init__.py:545
msgid "French"
msgstr ""
#: paperless/settings/__init__.py:547
#: paperless/settings/__init__.py:546
msgid "Hungarian"
msgstr ""
#: paperless/settings/__init__.py:548
#: paperless/settings/__init__.py:547
msgid "Indonesian"
msgstr ""
#: paperless/settings/__init__.py:549
#: paperless/settings/__init__.py:548
msgid "Italian"
msgstr ""
#: paperless/settings/__init__.py:550
#: paperless/settings/__init__.py:549
msgid "Japanese"
msgstr ""
#: paperless/settings/__init__.py:551
#: paperless/settings/__init__.py:550
msgid "Korean"
msgstr ""
#: paperless/settings/__init__.py:552
#: paperless/settings/__init__.py:551
msgid "Luxembourgish"
msgstr ""
#: paperless/settings/__init__.py:553
#: paperless/settings/__init__.py:552
msgid "Norwegian"
msgstr ""
#: paperless/settings/__init__.py:554
#: paperless/settings/__init__.py:553
msgid "Dutch"
msgstr ""
#: paperless/settings/__init__.py:555
#: paperless/settings/__init__.py:554
msgid "Polish"
msgstr ""
#: paperless/settings/__init__.py:556
#: paperless/settings/__init__.py:555
msgid "Portuguese (Brazil)"
msgstr ""
#: paperless/settings/__init__.py:557
#: paperless/settings/__init__.py:556
msgid "Portuguese"
msgstr ""
#: paperless/settings/__init__.py:558
#: paperless/settings/__init__.py:557
msgid "Romanian"
msgstr ""
#: paperless/settings/__init__.py:559
#: paperless/settings/__init__.py:558
msgid "Russian"
msgstr ""
#: paperless/settings/__init__.py:560
#: paperless/settings/__init__.py:559
msgid "Slovak"
msgstr ""
#: paperless/settings/__init__.py:561
#: paperless/settings/__init__.py:560
msgid "Slovenian"
msgstr ""
#: paperless/settings/__init__.py:562
#: paperless/settings/__init__.py:561
msgid "Serbian"
msgstr ""
#: paperless/settings/__init__.py:563
#: paperless/settings/__init__.py:562
msgid "Swedish"
msgstr ""
#: paperless/settings/__init__.py:564
#: paperless/settings/__init__.py:563
msgid "Turkish"
msgstr ""
#: paperless/settings/__init__.py:565
#: paperless/settings/__init__.py:564
msgid "Ukrainian"
msgstr ""
#: paperless/settings/__init__.py:566
#: paperless/settings/__init__.py:565
msgid "Vietnamese"
msgstr ""
#: paperless/settings/__init__.py:567
#: paperless/settings/__init__.py:566
msgid "Chinese Simplified"
msgstr ""
#: paperless/settings/__init__.py:568
#: paperless/settings/__init__.py:567
msgid "Chinese Traditional"
msgstr ""
#: paperless/urls.py:429
#: paperless/urls.py:415
msgid "Paperless-ngx administration"
msgstr ""
+4 -2
@@ -1,4 +1,6 @@
import grp
import os
import pwd
import shutil
import stat
import subprocess
@@ -36,8 +38,8 @@ def path_check(var: str, directory: Path) -> list[Error]:
except PermissionError:
dir_stat: os.stat_result = Path(directory).stat()
dir_mode: str = stat.filemode(dir_stat.st_mode)
dir_owner: str = ""
dir_group: str = ""
dir_owner: str = pwd.getpwuid(dir_stat.st_uid).pw_name
dir_group: str = grp.getgrgid(dir_stat.st_gid).gr_name
messages.append(
Error(
writeable_message.format(var),
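The added `pwd`/`grp` calls resolve the directory's numeric uid/gid to human-readable names for the error message. A self-contained sketch of the same lookups (the `describe_ownership` helper is hypothetical, and `pwd`/`grp` are POSIX-only):

```python
import grp
import pwd
import stat
from pathlib import Path


def describe_ownership(directory: Path) -> str:
    """Render mode, owner, and group the way the permission-check message does."""
    st = directory.stat()
    mode = stat.filemode(st.st_mode)  # e.g. "drwxr-xr-x"
    owner = pwd.getpwuid(st.st_uid).pw_name
    group = grp.getgrgid(st.st_gid).gr_name
    return f"{mode} {owner}:{group}"
```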
+5 -8
@@ -38,9 +38,7 @@ class OutputTypeConfig(BaseConfig):
def __post_init__(self) -> None:
app_config = self._get_config_instance()
self.output_type = app_config.output_type or OutputTypeChoices(
settings.OCR_OUTPUT_TYPE,
)
self.output_type = app_config.output_type or settings.OCR_OUTPUT_TYPE
@dataclasses.dataclass
@@ -72,13 +70,12 @@ class OcrConfig(OutputTypeConfig):
self.pages = app_config.pages or settings.OCR_PAGES
self.language = app_config.language or settings.OCR_LANGUAGE
self.mode = app_config.mode or ModeChoices(settings.OCR_MODE)
self.mode = app_config.mode or settings.OCR_MODE
self.archive_file_generation = (
app_config.archive_file_generation
or ArchiveFileGenerationChoices(settings.ARCHIVE_FILE_GENERATION)
app_config.archive_file_generation or settings.ARCHIVE_FILE_GENERATION
)
self.image_dpi = app_config.image_dpi or settings.OCR_IMAGE_DPI
self.clean = app_config.unpaper_clean or CleanChoices(settings.OCR_CLEAN)
self.clean = app_config.unpaper_clean or settings.OCR_CLEAN
self.deskew = (
app_config.deskew if app_config.deskew is not None else settings.OCR_DESKEW
)
@@ -95,7 +92,7 @@ class OcrConfig(OutputTypeConfig):
)
self.color_conversion_strategy = (
app_config.color_conversion_strategy
or ColorConvertChoices(settings.OCR_COLOR_CONVERSION_STRATEGY)
or settings.OCR_COLOR_CONVERSION_STRATEGY
)
user_args = None
@@ -34,7 +34,7 @@ class Migration(migrations.Migration):
name="llm_backend",
field=models.CharField(
blank=True,
choices=[("openai-like", "OpenAI-compatible"), ("ollama", "Ollama")],
choices=[("openai", "OpenAI"), ("ollama", "Ollama")],
max_length=128,
null=True,
verbose_name="Sets the LLM backend",
@@ -45,10 +45,7 @@ class Migration(migrations.Migration):
name="llm_embedding_backend",
field=models.CharField(
blank=True,
choices=[
("openai-like", "OpenAI-compatible"),
("huggingface", "Huggingface"),
],
choices=[("openai", "OpenAI"), ("huggingface", "Huggingface")],
max_length=128,
null=True,
verbose_name="Sets the LLM embedding backend",
+13
@@ -0,0 +1,13 @@
from pathlib import Path
from magika import Magika
_magika = Magika()
def from_file(path: str | Path) -> str:
return _magika.identify_path(path).output.mime_type
def from_buffer(data: bytes) -> str:
return _magika.identify_bytes(data).output.mime_type
+2 -2
@@ -75,7 +75,7 @@ class ColorConvertChoices(models.TextChoices):
class LLMEmbeddingBackend(models.TextChoices):
OPENAI_LIKE = ("openai-like", _("OpenAI-compatible"))
OPENAI = ("openai", _("OpenAI"))
HUGGINGFACE = ("huggingface", _("Huggingface"))
@@ -84,7 +84,7 @@ class LLMBackend(models.TextChoices):
Matches to --llm-backend
"""
OPENAI_LIKE = ("openai-like", _("OpenAI-compatible"))
OPENAI = ("openai", _("OpenAI"))
OLLAMA = ("ollama", _("Ollama"))
+2 -2
@@ -1,7 +1,6 @@
import logging
from io import BytesIO
import magic
from allauth.mfa.adapter import get_adapter as get_mfa_adapter
from allauth.mfa.models import Authenticator
from allauth.mfa.totp.internal.auth import TOTP
@@ -18,6 +17,7 @@ from PIL import Image
from rest_framework import serializers
from rest_framework.authtoken.serializers import AuthTokenSerializer
from paperless import mime_detection
from paperless.models import ApplicationConfiguration
from paperless.network import validate_outbound_http_url
from paperless.validators import reject_dangerous_svg
@@ -263,7 +263,7 @@ class ApplicationConfigurationSerializer(
jpg/png/gif/svg.
"""
if file:
mime_type = magic.from_buffer(file.read(2048), mime=True)
mime_type = mime_detection.from_buffer(file.read(2048))
if mime_type == "image/svg+xml":
reject_dangerous_svg(file)
+5 -6
@@ -463,11 +463,10 @@ SECURE_PROXY_SSL_HEADER = (
else None
)
SECRET_KEY = os.getenv("PAPERLESS_SECRET_KEY")
_INSECURE_SECRET_KEYS = {None, "", "change-me"}
if not DEBUG and SECRET_KEY in _INSECURE_SECRET_KEYS: # pragma: no cover
SECRET_KEY = os.getenv("PAPERLESS_SECRET_KEY", "")
if not SECRET_KEY: # pragma: no cover
raise ImproperlyConfigured(
"PAPERLESS_SECRET_KEY is not set or is the default 'change-me' value. "
"PAPERLESS_SECRET_KEY is not set. "
"A unique, secret key is required for secure operation. "
'Generate one with: python3 -c "import secrets; print(secrets.token_urlsafe(64))"',
)
@@ -1174,9 +1173,9 @@ REMOTE_OCR_ENDPOINT = os.getenv("PAPERLESS_REMOTE_OCR_ENDPOINT")
AI_ENABLED = get_bool_from_env("PAPERLESS_AI_ENABLED", "NO")
LLM_EMBEDDING_BACKEND = os.getenv(
"PAPERLESS_AI_LLM_EMBEDDING_BACKEND",
) # "huggingface" or "openai-like"
) # "huggingface" or "openai"
LLM_EMBEDDING_MODEL = os.getenv("PAPERLESS_AI_LLM_EMBEDDING_MODEL")
LLM_BACKEND = os.getenv("PAPERLESS_AI_LLM_BACKEND") # "ollama" or "openai-like"
LLM_BACKEND = os.getenv("PAPERLESS_AI_LLM_BACKEND") # "ollama" or "openai"
LLM_MODEL = os.getenv("PAPERLESS_AI_LLM_MODEL")
LLM_API_KEY = os.getenv("PAPERLESS_AI_LLM_API_KEY")
LLM_ENDPOINT = os.getenv("PAPERLESS_AI_LLM_ENDPOINT")
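With the backend identifiers renamed from "openai-like" to "openai", a deployment would set the corresponding environment variables along these lines (all values below are illustrative, not defaults from this commit):

```shell
# Enable the AI features and select the renamed backends.
export PAPERLESS_AI_ENABLED=1
export PAPERLESS_AI_LLM_BACKEND=openai            # "ollama" or "openai"
export PAPERLESS_AI_LLM_EMBEDDING_BACKEND=openai  # "huggingface" or "openai"

# Model, credentials, and endpoint are read as plain strings.
export PAPERLESS_AI_LLM_MODEL=gpt-4o-mini
export PAPERLESS_AI_LLM_API_KEY=sk-example
export PAPERLESS_AI_LLM_ENDPOINT=https://api.example.com/v1
```

Existing configurations still using the old "openai-like" value would need updating, since the migration above also rewrites the stored model choices.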
@@ -15,11 +15,11 @@ from pathlib import Path
from typing import TYPE_CHECKING
import img2pdf
import magic
import pikepdf
import pytest
from documents.parsers import ParseError
from paperless import mime_detection
if TYPE_CHECKING:
from pytest_mock import MockerFixture
@@ -43,7 +43,7 @@ class TestConvertImageToPdfa:
result = tesseract_parser._convert_image_to_pdfa(simple_png_file)
assert result.exists()
assert magic.from_file(str(result), mime=True) == "application/pdf"
assert mime_detection.from_file(result) == "application/pdf"
def test_output_path_is_archive_pdf_in_tempdir(
self,
@@ -92,7 +92,7 @@ class TestConvertImageToPdfa:
result = tesseract_parser._convert_image_to_pdfa(simple_png_file)
assert result.exists()
assert magic.from_file(str(result), mime=True) == "application/pdf"
assert mime_detection.from_file(result) == "application/pdf"
def test_image_dpi_setting_applies_fixed_dpi_layout(
self,
@@ -116,7 +116,7 @@ class TestConvertImageToPdfa:
result = tesseract_parser._convert_image_to_pdfa(simple_no_dpi_png_file)
spy.assert_called_once_with((150, 150))
assert magic.from_file(str(result), mime=True) == "application/pdf"
assert mime_detection.from_file(result) == "application/pdf"
def test_no_image_dpi_setting_skips_fixed_dpi_layout(
self,
@@ -766,11 +766,7 @@ class TestParser:
content=b"Pretend merged PDF content",
)
def test_layout_option(
layout_option,
expected_calls,
expected_pdf_names,
) -> None:
def test_layout_option(layout_option, expected_calls, expected_pdf_names):
mock_mailrule_get.return_value = mock.Mock(pdf_layout=layout_option)
mail_parser.configure(ParserContext(mailrule_id=1))
mail_parser.parse(
@@ -16,8 +16,6 @@ from typing import TYPE_CHECKING
import pytest
from paperless.models import ModeChoices
if TYPE_CHECKING:
from pytest_mock import MockerFixture
@@ -74,7 +72,7 @@ class TestAutoModeWithText:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -108,7 +106,7 @@ class TestAutoModeWithText:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -160,7 +158,7 @@ class TestAutoModeNoText:
mocker.patch.object(tesseract_parser, "extract_text", side_effect=_extract_side)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
multi_page_images_pdf_file,
"application/pdf",
@@ -202,7 +200,7 @@ class TestAutoModeNoText:
mocker.patch.object(tesseract_parser, "extract_text", side_effect=_extract_side)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
multi_page_images_pdf_file,
"application/pdf",
@@ -245,7 +243,7 @@ class TestOffModePdf:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.settings.mode = "off"
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -285,7 +283,7 @@ class TestOffModePdf:
)
mocker.patch("ocrmypdf.pdfa.generate_pdfa_ps")
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.settings.mode = "off"
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -325,7 +323,7 @@ class TestOffModeImage:
"""
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.settings.mode = "off"
tesseract_parser.parse(simple_png_file, "image/png", produce_archive=False)
mock_ocr.assert_not_called()
@@ -357,7 +355,7 @@ class TestOffModeImage:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.settings.mode = "off"
tesseract_parser.parse(simple_png_file, "image/png", produce_archive=True)
mock_convert.assert_called_once_with(simple_png_file)
@@ -431,7 +429,7 @@ class TestProduceArchiveFalse:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -44,7 +44,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_PAGES=10):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.pages = 5
instance.save()
@@ -63,7 +62,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_LANGUAGE="eng+deu"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.language = "fra+ita"
instance.save()
@@ -82,7 +80,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_OUTPUT_TYPE="pdfa-3"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.output_type = OutputTypeChoices.PDF_A
instance.save()
@@ -103,7 +100,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
# AUTO mode with skip_text=True explicitly passed: skip_text is set
with override_settings(OCR_MODE="redo"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.mode = ModeChoices.AUTO
instance.save()
@@ -122,7 +118,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
# AUTO mode alone (no skip_text): no extra OCR flag is set
with override_settings(OCR_MODE="redo"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.mode = ModeChoices.AUTO
instance.save()
@@ -143,7 +138,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_CLEAN="clean-final"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.unpaper_clean = CleanChoices.CLEAN
instance.save()
@@ -153,7 +147,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
with override_settings(OCR_CLEAN="clean-final"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.unpaper_clean = CleanChoices.FINAL
instance.save()
@@ -173,7 +166,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_DESKEW=False):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.deskew = True
instance.save()
@@ -193,7 +185,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
with override_settings(OCR_ROTATE_PAGES=False, OCR_ROTATE_PAGES_THRESHOLD=30.0):
instance = ApplicationConfiguration.objects.all().first()
            assert instance is not None
instance.rotate_pages = True
instance.rotate_pages_threshold = 15.0
instance.save()
@@ -214,7 +205,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_MAX_IMAGE_PIXELS=2_000_000.0):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.max_image_pixels = 1_000_000.0
instance.save()
@@ -233,7 +223,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_COLOR_CONVERSION_STRATEGY="LeaveColorUnchanged"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.color_conversion_strategy = ColorConvertChoices.INDEPENDENT
instance.save()
@@ -257,7 +246,6 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
OCR_USER_ARGS=json.dumps({"continue_on_soft_render_error": True}),
):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.user_args = {"unpaper_args": "--pre-rotate 90"}
instance.save()
@@ -18,7 +18,6 @@ from ocrmypdf import SubprocessOutputError
from documents.parsers import ParseError
from documents.parsers import run_convert
from paperless.models import ModeChoices
from paperless.parsers import ParserProtocol
from paperless.parsers.tesseract import RasterisedDocumentParser
from paperless.parsers.tesseract import post_process_text
@@ -388,10 +387,8 @@ class TestParsePdf:
)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -416,7 +413,7 @@ class TestParsePdf:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = ModeChoices.REDO
tesseract_parser.settings.mode = "redo"
tesseract_parser.parse(
tesseract_samples_dir / "with-form.pdf",
"application/pdf",
@@ -433,7 +430,7 @@ class TestParsePdf:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = ModeChoices.FORCE
tesseract_parser.settings.mode = "force"
tesseract_parser.parse(
tesseract_samples_dir / "with-form.pdf",
"application/pdf",
@@ -448,7 +445,7 @@ class TestParsePdf:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(tesseract_samples_dir / "signed.pdf", "application/pdf")
assert tesseract_parser.archive_path is None
assert_ordered_substrings(
@@ -464,7 +461,7 @@ class TestParsePdf:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
tesseract_samples_dir / "encrypted.pdf",
"application/pdf",
@@ -533,9 +530,7 @@ class TestParseImages:
tesseract_parser.parse(tesseract_samples_dir / "simple-no-dpi.png", "image/png")
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert "this is a test document." in _text.lower()
assert "this is a test document." in tesseract_parser.get_text().lower()
def test_no_dpi_no_fallback_raises(
self,
@@ -568,10 +563,8 @@ class TestParseMultiPage:
)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -596,10 +589,8 @@ class TestParseMultiPage:
"application/pdf",
)
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -608,16 +599,14 @@ class TestParseMultiPage:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-images.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -637,15 +626,13 @@ class TestParseMultiPage:
- Pages 1 and 2 extracted; page 3 absent
"""
tesseract_parser.settings.pages = 2
tesseract_parser.settings.mode = ModeChoices.REDO
tesseract_parser.settings.mode = "redo"
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-images.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
text = tesseract_parser.get_text()
assert text is not None
text = text.lower()
text = tesseract_parser.get_text().lower()
assert_ordered_substrings(text, ["page 1", "page 2"])
assert "page 3" not in text
@@ -665,15 +652,13 @@ class TestParseMultiPage:
- Only page 1 extracted
"""
tesseract_parser.settings.pages = 1
tesseract_parser.settings.mode = ModeChoices.FORCE
tesseract_parser.settings.mode = "force"
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-images.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
text = tesseract_parser.get_text()
assert text is not None
text = text.lower()
text = tesseract_parser.get_text().lower()
assert "page 1" in text
assert "page 2" not in text
assert "page 3" not in text
@@ -696,10 +681,8 @@ class TestParseMultiPage:
"image/tiff",
)
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -721,10 +704,8 @@ class TestParseMultiPage:
shutil.copy(tesseract_samples_dir / "multi-page-images-alpha.tiff", dest)
tesseract_parser.parse(dest, "image/tiff")
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -746,10 +727,8 @@ class TestParseMultiPage:
shutil.copy(tesseract_samples_dir / "multi-page-images-alpha-rgb.tiff", dest)
tesseract_parser.parse(dest, "image/tiff")
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -775,17 +754,15 @@ class TestSkipArchive:
- Text extracted from original; no archive created (text exists +
produce_archive=False skips OCRmyPDF entirely)
"""
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-digital.pdf",
"application/pdf",
produce_archive=False,
)
assert tesseract_parser.archive_path is None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -803,16 +780,14 @@ class TestSkipArchive:
THEN:
- Text extracted; archive created (OCR needed, no existing text)
"""
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-images.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3"],
)
@@ -863,15 +838,13 @@ class TestSkipArchive:
- archive_path is set if and only if produce_archive=True
- Text is always extracted
"""
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
tesseract_samples_dir / filename,
"application/pdf",
produce_archive=produce_archive,
)
text = tesseract_parser.get_text()
assert text is not None
text = text.lower()
text = tesseract_parser.get_text().lower()
assert_ordered_substrings(text, ["page 1", "page 2", "page 3"])
if expect_archive:
assert tesseract_parser.archive_path is not None
@@ -895,7 +868,7 @@ class TestSkipArchive:
- Text is extracted from the original via pdftotext
- No archive is produced
"""
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.parse(
tesseract_samples_dir / "simple-digital.pdf",
@@ -922,7 +895,7 @@ class TestSkipArchive:
- Archive is produced
- Text is preserved from the original
"""
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
tesseract_samples_dir / "simple-digital.pdf",
"application/pdf",
@@ -952,17 +925,15 @@ class TestParseMixed:
THEN:
- All pages extracted; archive created; sidecar notes skipped pages
"""
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-mixed.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 1", "page 2", "page 3", "page 4", "page 5", "page 6"],
)
sidecar = (tesseract_parser.tempdir / "sidecar.txt").read_text()
@@ -982,17 +953,15 @@ class TestParseMixed:
THEN:
- Both text layer and image text extracted; archive created
"""
tesseract_parser.settings.mode = ModeChoices.REDO
tesseract_parser.settings.mode = "redo"
tesseract_parser.parse(
tesseract_samples_dir / "single-page-mixed.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
[
"this is some normal text, present on page 1 of the document.",
"this is some text, but in an image, also on page 1.",
@@ -1020,17 +989,15 @@ class TestParseMixed:
THEN:
- No archive created (produce_archive=False); text from text layer present
"""
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-mixed.pdf",
"application/pdf",
produce_archive=False,
)
assert tesseract_parser.archive_path is None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
_text.lower(),
tesseract_parser.get_text().lower(),
["page 4", "page 5", "page 6"],
)
@@ -1046,7 +1013,7 @@ class TestParseRotate:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.rotate = True
tesseract_parser.parse(tesseract_samples_dir / "rotated.pdf", "application/pdf")
assert_ordered_substrings(
@@ -1085,16 +1052,14 @@ class TestParseRtl:
force-ocr with English Tesseract (producing garbage). Using mode="off" forces
skip_text=True so the Arabic text layer is preserved through PDF/A conversion.
"""
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.settings.mode = "off"
tesseract_parser.parse(
tesseract_samples_dir / "rtl-test.pdf",
"application/pdf",
)
_text = tesseract_parser.get_text()
assert _text is not None
normalised = "".join(
ch
for ch in unicodedata.normalize("NFKC", _text)
for ch in unicodedata.normalize("NFKC", tesseract_parser.get_text())
if unicodedata.category(ch) != "Cf" and not ch.isspace()
)
assert "ةرازو" in normalised
@@ -1231,9 +1196,7 @@ class TestParserFileTypes:
tesseract_parser.parse(tesseract_samples_dir / filename, mime_type)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert "this is a test document" in _text.lower()
assert "this is a test document" in tesseract_parser.get_text().lower()
def test_heic(
self,
@@ -1242,9 +1205,7 @@ class TestParserFileTypes:
) -> None:
tesseract_parser.parse(tesseract_samples_dir / "simple.heic", "image/heic")
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert "pizza" in _text.lower()
assert "pizza" in tesseract_parser.get_text().lower()
def test_gif_with_explicit_dpi(
self,
@@ -1254,9 +1215,7 @@ class TestParserFileTypes:
tesseract_parser.settings.image_dpi = 200
tesseract_parser.parse(tesseract_samples_dir / "simple.gif", "image/gif")
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert "this is a test document" in _text.lower()
assert "this is a test document" in tesseract_parser.get_text().lower()
def test_webp_with_explicit_dpi(
self,
@@ -1266,11 +1225,9 @@ class TestParserFileTypes:
tesseract_parser.settings.image_dpi = 72
tesseract_parser.parse(tesseract_samples_dir / "document.webp", "image/webp")
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert re.search(
r"this is a ?webp document, created 11/14/2022\.",
_text.lower(),
tesseract_parser.get_text().lower(),
)
@@ -26,7 +26,7 @@ class TestStringToBool:
pytest.param(" True ", id="whitespace_true"),
],
)
def test_true_conversion(self, true_value: str) -> None:
def test_true_conversion(self, true_value: str):
"""Test that various 'true' strings correctly evaluate to True."""
assert str_to_bool(true_value) is True
@@ -41,18 +41,18 @@ class TestStringToBool:
pytest.param(" False ", id="whitespace_false"),
],
)
def test_false_conversion(self, false_value: str) -> None:
def test_false_conversion(self, false_value: str):
"""Test that various 'false' strings correctly evaluate to False."""
assert str_to_bool(false_value) is False
def test_invalid_conversion(self) -> None:
def test_invalid_conversion(self):
"""Test that an invalid string raises a ValueError."""
with pytest.raises(ValueError, match="Cannot convert 'maybe' to a boolean\\."):
str_to_bool("maybe")
class TestParseDictFromString:
def test_empty_and_none_input(self) -> None:
def test_empty_and_none_input(self):
"""Test behavior with None or empty string input."""
assert parse_dict_from_str(None) == {}
assert parse_dict_from_str("") == {}
@@ -62,13 +62,13 @@ class TestParseDictFromString:
# Ensure it returns a copy, not the original object
assert res is not defaults
def test_basic_parsing(self) -> None:
def test_basic_parsing(self):
"""Test simple key-value parsing without defaults or types."""
env_str = "key1=val1, key2=val2"
expected = {"key1": "val1", "key2": "val2"}
assert parse_dict_from_str(env_str) == expected
def test_with_defaults(self) -> None:
def test_with_defaults(self):
"""Test that environment values override defaults correctly."""
defaults = {"host": "localhost", "port": 8000, "user": "default"}
env_str = "port=9090, host=db.example.com"
@@ -76,7 +76,7 @@ class TestParseDictFromString:
result = parse_dict_from_str(env_str, defaults=defaults)
assert result == expected
def test_type_casting(self) -> None:
def test_type_casting(self):
"""Test successful casting of values to specified types."""
env_str = "port=9090, debug=true, timeout=12.5, user=admin"
type_map = {"port": int, "debug": bool, "timeout": float}
@@ -84,7 +84,7 @@ class TestParseDictFromString:
result = parse_dict_from_str(env_str, type_map=type_map)
assert result == expected
def test_type_casting_with_defaults(self) -> None:
def test_type_casting_with_defaults(self):
"""Test casting when values come from both defaults and env string."""
defaults = {"port": 8000, "debug": False, "retries": 3}
env_str = "port=9090, debug=true"
@@ -97,7 +97,7 @@ class TestParseDictFromString:
assert result == expected
assert isinstance(result["retries"], int)
def test_path_casting(self, tmp_path: Path) -> None:
def test_path_casting(self, tmp_path: Path):
"""Test successful casting of a string to a resolved pathlib.Path object."""
# Create a dummy file to resolve against
test_file = tmp_path / "test_file.txt"
@@ -111,14 +111,14 @@ class TestParseDictFromString:
assert isinstance(result["config_path"], Path)
assert result["config_path"] == test_file.resolve()
def test_custom_separator(self) -> None:
def test_custom_separator(self):
"""Test parsing with a custom separator like a semicolon."""
env_str = "host=db; port=5432; user=test"
expected = {"host": "db", "port": "5432", "user": "test"}
result = parse_dict_from_str(env_str, separator=";")
assert result == expected
def test_edge_cases_in_string(self) -> None:
def test_edge_cases_in_string(self):
"""Test malformed strings to ensure robustness."""
# Malformed pair 'debug' is skipped, extra comma is ignored
env_str = "key=val,, debug, foo=bar"
@@ -130,7 +130,7 @@ class TestParseDictFromString:
expected = {"url": "postgres://user:pass@host:5432/db"}
assert parse_dict_from_str(env_str) == expected
def test_casting_error_handling(self) -> None:
def test_casting_error_handling(self):
"""Test that a ValueError is raised for invalid casting."""
env_str = "port=not-a-number"
type_map = {"port": int}
@@ -142,14 +142,14 @@ class TestParseDictFromString:
assert "value 'not-a-number'" in str(excinfo.value)
assert "to type 'int'" in str(excinfo.value)
def test_bool_casting_error(self) -> None:
def test_bool_casting_error(self):
"""Test that an invalid boolean string raises a ValueError."""
env_str = "debug=maybe"
type_map = {"debug": bool}
with pytest.raises(ValueError, match="Error casting key 'debug'"):
parse_dict_from_str(env_str, type_map=type_map)
def test_nested_key_parsing_basic(self) -> None:
def test_nested_key_parsing_basic(self):
"""Basic nested key parsing using dot-notation."""
env_str = "database.host=db.example.com, database.port=5432, logging.level=INFO"
result = parse_dict_from_str(env_str)
@@ -158,7 +158,7 @@ class TestParseDictFromString:
"logging": {"level": "INFO"},
}
def test_nested_overrides_defaults_and_deepcopy(self) -> None:
def test_nested_overrides_defaults_and_deepcopy(self):
"""Nested env keys override defaults and defaults are deep-copied."""
defaults = {"database": {"host": "127.0.0.1", "port": 3306, "user": "default"}}
env_str = "database.host=db.example.com, debug=true"
@@ -176,7 +176,7 @@ class TestParseDictFromString:
assert result is not defaults
assert result["database"] is not defaults["database"]
def test_nested_type_casting(self) -> None:
def test_nested_type_casting(self):
"""Type casting for nested keys (dot-notation) should work."""
env_str = "database.host=db.example.com, database.port=5433, debug=false"
type_map = {"database.port": int, "debug": bool}
@@ -188,7 +188,7 @@ class TestParseDictFromString:
assert result["debug"] is False
assert isinstance(result["debug"], bool)
def test_nested_casting_error_message(self) -> None:
def test_nested_casting_error_message(self):
"""Error messages should include the full dotted key name on failure."""
env_str = "database.port=not-a-number"
type_map = {"database.port": int}
@@ -200,7 +200,7 @@ class TestParseDictFromString:
assert "value 'not-a-number'" in msg
assert "to type 'int'" in msg
def test_type_map_does_not_recast_non_string_defaults(self) -> None:
def test_type_map_does_not_recast_non_string_defaults(self):
"""If a default already provides a non-string value, the caster should skip it."""
defaults = {"database": {"port": 3306}}
type_map = {"database.port": int}
@@ -210,22 +210,22 @@ class TestParseDictFromString:
class TestGetBoolFromEnv:
def test_existing_env_var(self, mocker) -> None:
def test_existing_env_var(self, mocker):
"""Test that an existing environment variable is read and converted."""
mocker.patch.dict(os.environ, {"TEST_VAR": "true"})
assert get_bool_from_env("TEST_VAR") is True
def test_missing_env_var_uses_default_no(self, mocker) -> None:
def test_missing_env_var_uses_default_no(self, mocker):
"""Test that a missing environment variable uses default 'NO' and returns False."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_bool_from_env("MISSING_VAR") is False
def test_missing_env_var_with_explicit_default(self, mocker) -> None:
def test_missing_env_var_with_explicit_default(self, mocker):
"""Test that a missing environment variable uses the provided default."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_bool_from_env("MISSING_VAR", default="yes") is True
def test_invalid_value_raises_error(self, mocker) -> None:
def test_invalid_value_raises_error(self, mocker):
"""Test that an invalid value raises ValueError (delegates to str_to_bool)."""
mocker.patch.dict(os.environ, {"INVALID_VAR": "maybe"})
with pytest.raises(ValueError):
@@ -243,7 +243,7 @@ class TestGetIntFromEnv:
pytest.param("-999", -999, id="large_negative"),
],
)
def test_existing_env_var_valid_ints(self, mocker, env_value, expected) -> None:
def test_existing_env_var_valid_ints(self, mocker, env_value, expected):
"""Test that existing environment variables with valid integers return correct values."""
mocker.patch.dict(os.environ, {"INT_VAR": env_value})
assert get_int_from_env("INT_VAR") == expected
@@ -257,12 +257,12 @@ class TestGetIntFromEnv:
pytest.param(None, None, id="none_default"),
],
)
def test_missing_env_var_with_defaults(self, mocker, default, expected) -> None:
def test_missing_env_var_with_defaults(self, mocker, default, expected):
"""Test that missing environment variables return provided defaults."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_int_from_env("MISSING_VAR", default=default) == expected
def test_missing_env_var_no_default(self, mocker) -> None:
def test_missing_env_var_no_default(self, mocker):
"""Test that missing environment variable with no default returns None."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_int_from_env("MISSING_VAR") is None
@@ -279,7 +279,7 @@ class TestGetIntFromEnv:
pytest.param("1.0", id="decimal"),
],
)
def test_invalid_int_values_raise_error(self, mocker, invalid_value) -> None:
def test_invalid_int_values_raise_error(self, mocker, invalid_value):
"""Test that invalid integer values raise ValueError."""
mocker.patch.dict(os.environ, {"INVALID_INT": invalid_value})
with pytest.raises(ValueError):
@@ -300,7 +300,7 @@ class TestGetFloatFromEnv:
pytest.param("-1.23e4", -12300.0, id="sci_large"),
],
)
def test_existing_env_var_valid_floats(self, mocker, env_value, expected) -> None:
def test_existing_env_var_valid_floats(self, mocker, env_value, expected):
"""Test that existing environment variables with valid floats return correct values."""
mocker.patch.dict(os.environ, {"FLOAT_VAR": env_value})
assert get_float_from_env("FLOAT_VAR") == expected
@@ -314,12 +314,12 @@ class TestGetFloatFromEnv:
pytest.param(None, None, id="none_default"),
],
)
def test_missing_env_var_with_defaults(self, mocker, default, expected) -> None:
def test_missing_env_var_with_defaults(self, mocker, default, expected):
"""Test that missing environment variables return provided defaults."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_float_from_env("MISSING_VAR", default=default) == expected
def test_missing_env_var_no_default(self, mocker) -> None:
def test_missing_env_var_no_default(self, mocker):
"""Test that missing environment variable with no default returns None."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_float_from_env("MISSING_VAR") is None
@@ -336,7 +336,7 @@ class TestGetFloatFromEnv:
pytest.param("1.2.3", id="triple_decimal"),
],
)
def test_invalid_float_values_raise_error(self, mocker, invalid_value) -> None:
def test_invalid_float_values_raise_error(self, mocker, invalid_value):
"""Test that invalid float values raise ValueError."""
mocker.patch.dict(os.environ, {"INVALID_FLOAT": invalid_value})
with pytest.raises(ValueError):
@@ -355,19 +355,19 @@ class TestGetPathFromEnv:
pytest.param("/", id="root"),
],
)
def test_existing_env_var_paths(self, mocker, env_value) -> None:
def test_existing_env_var_paths(self, mocker, env_value):
"""Test that existing environment variables with paths return resolved Path objects."""
mocker.patch.dict(os.environ, {"PATH_VAR": env_value})
result = get_path_from_env("PATH_VAR")
assert isinstance(result, Path)
assert result == Path(env_value).resolve()
def test_missing_env_var_no_default(self, mocker) -> None:
def test_missing_env_var_no_default(self, mocker):
"""Test that missing environment variable with no default returns None."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_path_from_env("MISSING_VAR") is None
def test_missing_env_var_with_none_default(self, mocker) -> None:
def test_missing_env_var_with_none_default(self, mocker):
"""Test that missing environment variable with None default returns None."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_path_from_env("MISSING_VAR", default=None) is None
@@ -380,7 +380,7 @@ class TestGetPathFromEnv:
pytest.param(".", id="current_default"),
],
)
def test_missing_env_var_with_path_defaults(self, mocker, default_path_str) -> None:
def test_missing_env_var_with_path_defaults(self, mocker, default_path_str):
"""Test that missing environment variables return resolved default Path objects."""
mocker.patch.dict(os.environ, {}, clear=True)
default_path = Path(default_path_str)
@@ -388,7 +388,7 @@ class TestGetPathFromEnv:
assert isinstance(result, Path)
assert result == default_path.resolve()
def test_relative_paths_are_resolved(self, mocker) -> None:
def test_relative_paths_are_resolved(self, mocker):
"""Test that relative paths are properly resolved to absolute paths."""
mocker.patch.dict(os.environ, {"REL_PATH": "relative/path"})
result = get_path_from_env("REL_PATH")
@@ -407,7 +407,7 @@ class TestGetListFromEnv:
pytest.param("a,,b,c", ["a", "b", "c"], id="empty_elements_removed"),
],
)
def test_existing_env_var_basic_parsing(self, mocker, env_value, expected) -> None:
def test_existing_env_var_basic_parsing(self, mocker, env_value, expected):
"""Test that existing environment variables are parsed correctly."""
mocker.patch.dict(os.environ, {"LIST_VAR": env_value})
result = get_list_from_env("LIST_VAR")
@@ -421,7 +421,7 @@ class TestGetListFromEnv:
pytest.param(";", "a;b;c", ["a", "b", "c"], id="semicolon_separator"),
],
)
def test_custom_separators(self, mocker, separator, env_value, expected) -> None:
def test_custom_separators(self, mocker, separator, env_value, expected):
"""Test that custom separators work correctly."""
mocker.patch.dict(os.environ, {"LIST_VAR": env_value})
result = get_list_from_env("LIST_VAR", separator=separator)
@@ -439,19 +439,19 @@ class TestGetListFromEnv:
pytest.param(None, [], id="none_default_returns_empty_list"),
],
)
def test_missing_env_var_with_defaults(self, mocker, default, expected) -> None:
def test_missing_env_var_with_defaults(self, mocker, default, expected):
"""Test that missing environment variables return provided defaults."""
mocker.patch.dict(os.environ, {}, clear=True)
result = get_list_from_env("MISSING_VAR", default=default)
assert result == expected
def test_missing_env_var_no_default(self, mocker) -> None:
def test_missing_env_var_no_default(self, mocker):
"""Test that missing environment variable with no default returns empty list."""
mocker.patch.dict(os.environ, {}, clear=True)
result = get_list_from_env("MISSING_VAR")
assert result == []
def test_required_env_var_missing_raises_error(self, mocker) -> None:
def test_required_env_var_missing_raises_error(self, mocker):
"""Test that missing required environment variable raises ValueError."""
mocker.patch.dict(os.environ, {}, clear=True)
with pytest.raises(
@@ -460,19 +460,19 @@ class TestGetListFromEnv:
):
get_list_from_env("REQUIRED_VAR", required=True)
def test_required_env_var_with_default_does_not_raise(self, mocker) -> None:
def test_required_env_var_with_default_does_not_raise(self, mocker):
"""Test that required environment variable with default does not raise error."""
mocker.patch.dict(os.environ, {}, clear=True)
result = get_list_from_env("REQUIRED_VAR", default=["default"], required=True)
assert result == ["default"]
def test_strip_whitespace_false(self, mocker) -> None:
def test_strip_whitespace_false(self, mocker):
"""Test that whitespace is preserved when strip_whitespace=False."""
mocker.patch.dict(os.environ, {"LIST_VAR": " a , b , c "})
result = get_list_from_env("LIST_VAR", strip_whitespace=False)
assert result == [" a ", " b ", " c "]
def test_remove_empty_false(self, mocker) -> None:
def test_remove_empty_false(self, mocker):
"""Test that empty elements are preserved when remove_empty=False."""
mocker.patch.dict(os.environ, {"LIST_VAR": "a,,b,,c"})
result = get_list_from_env("LIST_VAR", remove_empty=False)
-48
@@ -1,48 +0,0 @@
import uuid
from django.contrib.auth.models import User
from django.test import TestCase
from django.test import override_settings
from django.urls import resolve
from django.urls import reverse
from rest_framework import status
class TestApiAuthViews(TestCase):
def test_api_auth_login_uses_allauth_login_view(self):
response = self.client.get(reverse("rest_framework:login"))
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertTemplateUsed(response, "account/login.html")
def test_api_auth_login_uses_same_view_as_account_login(self):
api_match = resolve("/api/auth/login/")
account_match = resolve("/accounts/login/")
self.assertIs(api_match.func.view_class, account_match.func.view_class)
@override_settings(DISABLE_REGULAR_LOGIN=True)
def test_api_auth_login_respects_disable_regular_login(self):
username = f"testuser-{uuid.uuid4().hex}"
User.objects.create_user(
username=username,
password="testpassword",
)
response = self.client.post(
reverse("rest_framework:login"),
data={
"login": username,
"password": "testpassword",
"next": "/api/documents/",
},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertTemplateUsed(response, "account/login.html")
self.assertContains(response, "Regular login is disabled")
self.assertNotIn("_auth_user_id", self.client.session)
def test_api_auth_logout_uses_named_route(self):
self.assertEqual(reverse("rest_framework:login"), "/api/auth/login/")
self.assertEqual(reverse("rest_framework:logout"), "/api/auth/logout/")
+9 -9
@@ -1,7 +1,6 @@
import hmac
import pickle
from hashlib import sha256
from pathlib import Path
import pytest
from django.test import override_settings
@@ -12,20 +11,21 @@ from paperless.celery import signed_pickle_loads
class TestSignedPickleSerializer:
def test_roundtrip_simple_types(self) -> None:
def test_roundtrip_simple_types(self):
"""Signed pickle can round-trip basic JSON-like types."""
for obj in [42, "hello", [1, 2, 3], {"key": "value"}, None, True]:
assert signed_pickle_loads(signed_pickle_dumps(obj)) == obj
def test_roundtrip_complex_types(self) -> None:
def test_roundtrip_complex_types(self):
"""Signed pickle can round-trip types that JSON cannot."""
from pathlib import Path
obj = {"path": Path("/tmp/test"), "data": {1, 2, 3}}
result = signed_pickle_loads(signed_pickle_dumps(obj))
assert result["path"] == Path("/tmp/test")
assert result["data"] == {1, 2, 3}
def test_tampered_data_rejected(self) -> None:
def test_tampered_data_rejected(self):
"""Flipping a byte in the data portion causes HMAC failure."""
payload = signed_pickle_dumps({"task": "test"})
tampered = bytearray(payload)
@@ -33,7 +33,7 @@ class TestSignedPickleSerializer:
with pytest.raises(ValueError, match="HMAC verification failed"):
signed_pickle_loads(bytes(tampered))
def test_tampered_signature_rejected(self) -> None:
def test_tampered_signature_rejected(self):
"""Flipping a byte in the signature portion causes HMAC failure."""
payload = signed_pickle_dumps({"task": "test"})
tampered = bytearray(payload)
@@ -41,17 +41,17 @@ class TestSignedPickleSerializer:
with pytest.raises(ValueError, match="HMAC verification failed"):
signed_pickle_loads(bytes(tampered))
def test_truncated_payload_rejected(self) -> None:
def test_truncated_payload_rejected(self):
"""A payload shorter than HMAC_SIZE is rejected."""
with pytest.raises(ValueError, match="too short"):
signed_pickle_loads(b"\x00" * (HMAC_SIZE - 1))
def test_empty_payload_rejected(self) -> None:
def test_empty_payload_rejected(self):
with pytest.raises(ValueError, match="too short"):
signed_pickle_loads(b"")
@override_settings(SECRET_KEY="different-secret-key")
def test_wrong_secret_key_rejected(self) -> None:
def test_wrong_secret_key_rejected(self):
"""A message signed with one key cannot be loaded with another."""
original_key = b"test-secret-key-do-not-use-in-production"
obj = {"task": "test"}
@@ -61,7 +61,7 @@ class TestSignedPickleSerializer:
with pytest.raises(ValueError, match="HMAC verification failed"):
signed_pickle_loads(payload)
def test_forged_pickle_rejected(self) -> None:
def test_forged_pickle_rejected(self):
"""A raw pickle payload (no signature) is rejected."""
raw_pickle = pickle.dumps({"task": "test"})
# Raw pickle won't have a valid HMAC prefix
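The serializer tests above imply a sign-then-verify layout: an HMAC prefix over the pickled payload, checked with a constant-time compare before unpickling. A minimal sketch of that scheme — the byte layout, hash choice, and key are assumptions read off the test names and error messages, not the actual `paperless.celery` code:

```python
# Hedged sketch of an HMAC-signed pickle serializer matching the tests above.
# Layout assumed: sha256 HMAC (32 bytes) prepended to the pickle bytes.
import hmac
import pickle
from hashlib import sha256

SECRET_KEY = b"test-secret-key-do-not-use-in-production"  # stand-in for settings.SECRET_KEY
HMAC_SIZE = sha256().digest_size  # 32 bytes


def signed_pickle_dumps(obj: object, key: bytes = SECRET_KEY) -> bytes:
    data = pickle.dumps(obj)
    signature = hmac.new(key, data, sha256).digest()
    return signature + data


def signed_pickle_loads(payload: bytes, key: bytes = SECRET_KEY) -> object:
    if len(payload) < HMAC_SIZE:
        raise ValueError("payload too short")
    signature, data = payload[:HMAC_SIZE], payload[HMAC_SIZE:]
    expected = hmac.new(key, data, sha256).digest()
    # Constant-time compare prevents timing side channels on the signature.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("HMAC verification failed")
    return pickle.loads(data)
```

Verifying before unpickling is the point: `pickle.loads` on attacker-controlled bytes is code execution, so a forged or tampered payload must be rejected first.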
@@ -45,45 +45,45 @@ class TestMigrateSkipArchiveFile(TestMigrations):
)
return ApplicationConfiguration.objects.get(pk=pk)
def test_skip_mapped_to_auto(self) -> None:
def test_skip_mapped_to_auto(self):
config = self._get_config(1)
assert config.mode == "auto"
def test_skip_archive_always_mapped_to_never(self) -> None:
def test_skip_archive_always_mapped_to_never(self):
config = self._get_config(1)
assert config.archive_file_generation == "never"
def test_redo_unchanged(self) -> None:
def test_redo_unchanged(self):
config = self._get_config(2)
assert config.mode == "redo"
def test_skip_archive_with_text_mapped_to_auto(self) -> None:
def test_skip_archive_with_text_mapped_to_auto(self):
config = self._get_config(2)
assert config.archive_file_generation == "auto"
def test_force_unchanged(self) -> None:
def test_force_unchanged(self):
config = self._get_config(3)
assert config.mode == "force"
def test_skip_archive_never_mapped_to_always(self) -> None:
def test_skip_archive_never_mapped_to_always(self):
config = self._get_config(3)
assert config.archive_file_generation == "always"
def test_skip_noarchive_mapped_to_auto(self) -> None:
def test_skip_noarchive_mapped_to_auto(self):
config = self._get_config(4)
assert config.mode == "auto"
def test_skip_noarchive_implies_archive_never(self) -> None:
def test_skip_noarchive_implies_archive_never(self):
config = self._get_config(4)
assert config.archive_file_generation == "never"
def test_skip_noarchive_explicit_skip_archive_takes_precedence(self) -> None:
def test_skip_noarchive_explicit_skip_archive_takes_precedence(self):
"""skip_archive_file=never maps to always, not overridden by skip_noarchive."""
config = self._get_config(5)
assert config.mode == "auto"
assert config.archive_file_generation == "always"
def test_null_values_remain_null(self) -> None:
def test_null_values_remain_null(self):
config = self._get_config(6)
assert config.mode is None
assert config.archive_file_generation is None
+1 -15
@@ -98,21 +98,7 @@ urlpatterns = [
re_path(
"^auth/",
include(
(
[
path(
"login/",
allauth_account_views.login,
name="login",
),
path(
"logout/",
allauth_account_views.logout,
name="logout",
),
],
"rest_framework",
),
("rest_framework.urls", "rest_framework"),
namespace="rest_framework",
),
),
+1 -1
@@ -1,6 +1,6 @@
from typing import Final
__version__: Final[tuple[int, int, int]] = (2, 20, 15)
__version__: Final[tuple[int, int, int]] = (2, 20, 14)
# Version string like X.Y.Z
__full_version_str__: Final[str] = ".".join(map(str, __version__))
# Version string like X.Y
-10
@@ -192,16 +192,6 @@ class UserViewSet(ModelViewSet[User]):
)
return super().update(request, *args, **kwargs)
def destroy(self, request, *args, **kwargs):
user_to_delete: User = self.get_object()
if not request.user.is_superuser and user_to_delete.is_superuser:
return HttpResponseForbidden(
"Superusers can only be deleted by other superusers",
)
return super().destroy(request, *args, **kwargs)
@extend_schema(
request=None,
responses={
-54
@@ -1,4 +1,3 @@
import json
import logging
import sys
@@ -10,8 +9,6 @@ logger = logging.getLogger("paperless_ai.chat")
MAX_SINGLE_DOC_CONTEXT_CHARS = 15000
SINGLE_DOC_SNIPPET_CHARS = 800
CHAT_METADATA_DELIMITER = "\n\n__PAPERLESS_CHAT_METADATA__"
MAX_CHAT_REFERENCES = 3
CHAT_PROMPT_TMPL = """Context information is below.
---------------------
@@ -22,52 +19,6 @@ CHAT_PROMPT_TMPL = """Context information is below.
Answer:"""
def _build_document_reference(
document: Document,
title: str | None = None,
) -> dict[str, int | str]:
return {
"id": document.pk,
"title": title or document.title or document.filename,
}
def _get_document_references(
documents: list[Document],
top_nodes: list,
) -> list[dict[str, int | str]]:
allowed_documents = {doc.pk: doc for doc in documents}
references: list[dict[str, int | str]] = []
seen_document_ids: set[int] = set()
for node in top_nodes:
try:
document_id = int(node.metadata["document_id"])
except (KeyError, TypeError, ValueError): # pragma: no cover
continue
if document_id in seen_document_ids or document_id not in allowed_documents:
continue
seen_document_ids.add(document_id)
document = allowed_documents[document_id]
references.append(
_build_document_reference(document, node.metadata.get("title")),
)
if len(references) >= MAX_CHAT_REFERENCES: # pragma: no cover
break
return references
def _format_chat_metadata_trailer(references: list[dict[str, int | str]]) -> str:
return (
f"{CHAT_METADATA_DELIMITER}"
f"{json.dumps({'references': references}, separators=(',', ':'))}"
)
def stream_chat_with_documents(query_str: str, documents: list[Document]):
client = AIClient()
index = load_or_build_index()
@@ -98,7 +49,6 @@ def stream_chat_with_documents(query_str: str, documents: list[Document]):
if len(documents) == 1:
# Just one doc — provide full content
doc = documents[0]
references = [_build_document_reference(doc)]
# TODO: include document metadata in the context
content = doc.content or ""
context_body = content
@@ -128,7 +78,6 @@ def stream_chat_with_documents(query_str: str, documents: list[Document]):
yield "Sorry, I couldn't find any content to answer your question."
return
references = _get_document_references(documents, top_nodes)
context = "\n\n".join(
f"TITLE: {node.metadata.get('title')}\n{node.text[:SINGLE_DOC_SNIPPET_CHARS]}"
for node in top_nodes
@@ -153,6 +102,3 @@ def stream_chat_with_documents(query_str: str, documents: list[Document]):
for chunk in response_stream.response_gen:
yield chunk
sys.stdout.flush()
if references:
yield _format_chat_metadata_trailer(references)
+6 -10
@@ -1,12 +1,10 @@
import logging
from typing import TYPE_CHECKING
from paperless.models import LLMBackend
if TYPE_CHECKING:
from llama_index.core.llms import ChatMessage
from llama_index.llms.ollama import Ollama
from llama_index.llms.openai_like import OpenAILike
from llama_index.llms.openai import OpenAI
from paperless.config import AIConfig
from paperless.network import validate_outbound_http_url
@@ -24,8 +22,8 @@ class AIClient:
self.settings = AIConfig()
self.llm = self.get_llm()
def get_llm(self) -> "Ollama | OpenAILike":
if self.settings.llm_backend == LLMBackend.OLLAMA:
def get_llm(self) -> "Ollama | OpenAI":
if self.settings.llm_backend == "ollama":
from llama_index.llms.ollama import Ollama
endpoint = self.settings.llm_endpoint or "http://localhost:11434"
@@ -38,8 +36,8 @@ class AIClient:
base_url=endpoint,
request_timeout=120,
)
elif self.settings.llm_backend == LLMBackend.OPENAI_LIKE:
from llama_index.llms.openai_like import OpenAILike
elif self.settings.llm_backend == "openai":
from llama_index.llms.openai import OpenAI
endpoint = self.settings.llm_endpoint or None
if endpoint:
@@ -47,12 +45,10 @@ class AIClient:
endpoint,
allow_internal=self.settings.llm_allow_internal_endpoints,
)
return OpenAILike(
return OpenAI(
model=self.settings.llm_model or "gpt-3.5-turbo",
api_base=endpoint,
api_key=self.settings.llm_api_key,
is_chat_model=True,
is_function_calling_model=True,
)
else:
raise ValueError(f"Unsupported LLM backend: {self.settings.llm_backend}")
+5 -5
@@ -19,8 +19,8 @@ def get_embedding_model() -> "BaseEmbedding":
config = AIConfig()
match config.llm_embedding_backend:
case LLMEmbeddingBackend.OPENAI_LIKE:
from llama_index.embeddings.openai_like import OpenAILikeEmbedding
case LLMEmbeddingBackend.OPENAI:
from llama_index.embeddings.openai import OpenAIEmbedding
endpoint = config.llm_endpoint or None
if endpoint:
@@ -28,8 +28,8 @@ def get_embedding_model() -> "BaseEmbedding":
endpoint,
allow_internal=config.llm_allow_internal_endpoints,
)
return OpenAILikeEmbedding(
model_name=config.llm_embedding_model or "text-embedding-3-small",
return OpenAIEmbedding(
model=config.llm_embedding_model or "text-embedding-3-small",
api_key=config.llm_api_key,
api_base=endpoint,
)
@@ -54,7 +54,7 @@ def get_embedding_dim() -> int:
config = AIConfig()
model = config.llm_embedding_model or (
"text-embedding-3-small"
if config.llm_embedding_backend == LLMEmbeddingBackend.OPENAI_LIKE
if config.llm_embedding_backend == "openai"
else "sentence-transformers/all-MiniLM-L6-v2"
)
+1 -1
@@ -98,7 +98,7 @@ def test_update_llm_index_removes_meta(
config = AIConfig()
expected_model = config.llm_embedding_model or (
"text-embedding-3-small"
if config.llm_embedding_backend == "openai-like"
if config.llm_embedding_backend == "openai"
else "sentence-transformers/all-MiniLM-L6-v2"
)
assert meta == {"embedding_model": expected_model, "dim": 384}
+5 -48
@@ -1,4 +1,3 @@
import json
from unittest.mock import MagicMock
from unittest.mock import patch
@@ -6,7 +5,6 @@ import pytest
from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode
from paperless_ai.chat import CHAT_METADATA_DELIMITER
from paperless_ai.chat import stream_chat_with_documents
@@ -42,21 +40,6 @@ def mock_document():
return doc
def assert_chat_output(
output: list[str],
*,
expected_chunks: list[str],
expected_references: list[dict[str, int | str]],
) -> None:
assert output[:-1] == expected_chunks
trailer = output[-1]
assert trailer.startswith(CHAT_METADATA_DELIMITER)
assert json.loads(trailer.removeprefix(CHAT_METADATA_DELIMITER)) == {
"references": expected_references,
}
def test_stream_chat_with_one_document_full_content(mock_document) -> None:
with (
patch("paperless_ai.chat.AIClient") as mock_client_cls,
@@ -85,13 +68,7 @@ def test_stream_chat_with_one_document_full_content(mock_document) -> None:
output = list(stream_chat_with_documents("What is this?", [mock_document]))
assert_chat_output(
output,
expected_chunks=["chunk1", "chunk2"],
expected_references=[
{"id": mock_document.pk, "title": "Test Document"},
],
)
assert output == ["chunk1", "chunk2"]
def test_stream_chat_with_multiple_documents_retrieval(patch_embed_nodes) -> None:
@@ -123,20 +100,7 @@ def test_stream_chat_with_multiple_documents_retrieval(patch_embed_nodes) -> Non
# Patch as_retriever to return a retriever whose retrieve() returns mock_node1 and mock_node2
mock_retriever = MagicMock()
mock_duplicate_node = TextNode(
text="More content for doc 1.",
metadata={"document_id": "1", "title": "Document 1 Duplicate"},
)
mock_foreign_node = TextNode(
text="Content for doc 3.",
metadata={"document_id": "3", "title": "Document 3"},
)
mock_retriever.retrieve.return_value = [
mock_node1,
mock_duplicate_node,
mock_node2,
mock_foreign_node,
]
mock_retriever.retrieve.return_value = [mock_node1, mock_node2]
mock_as_retriever.return_value = mock_retriever
# Mock response stream
@@ -149,19 +113,12 @@ def test_stream_chat_with_multiple_documents_retrieval(patch_embed_nodes) -> Non
mock_query_engine.query.return_value = mock_response_stream
# Fake documents
doc1 = MagicMock(pk=1, title="Document 1", filename="doc1.pdf")
doc2 = MagicMock(pk=2, title="Document 2", filename="doc2.pdf")
doc1 = MagicMock(pk=1)
doc2 = MagicMock(pk=2)
output = list(stream_chat_with_documents("What's up?", [doc1, doc2]))
assert_chat_output(
output,
expected_chunks=["chunk1", "chunk2"],
expected_references=[
{"id": 1, "title": "Document 1"},
{"id": 2, "title": "Document 2"},
],
)
assert output == ["chunk1", "chunk2"]
def test_stream_chat_no_matching_nodes() -> None:
+4 -6
@@ -25,8 +25,8 @@ def mock_ollama_llm():
@pytest.fixture
def mock_openai_llm():
with patch("llama_index.llms.openai_like.OpenAILike") as MockOpenAILike:
yield MockOpenAILike
with patch("llama_index.llms.openai.OpenAI") as MockOpenAI:
yield MockOpenAI
def test_get_llm_ollama(mock_ai_config, mock_ollama_llm):
@@ -45,7 +45,7 @@ def test_get_llm_ollama(mock_ai_config, mock_ollama_llm):
def test_get_llm_openai(mock_ai_config, mock_openai_llm):
mock_ai_config.llm_backend = "openai-like"
mock_ai_config.llm_backend = "openai"
mock_ai_config.llm_model = "test_model"
mock_ai_config.llm_api_key = "test_api_key"
mock_ai_config.llm_endpoint = "http://test-url"
@@ -56,14 +56,12 @@ def test_get_llm_openai(mock_ai_config, mock_openai_llm):
model="test_model",
api_base="http://test-url",
api_key="test_api_key",
is_chat_model=True,
is_function_calling_model=True,
)
assert client.llm == mock_openai_llm.return_value
def test_get_llm_openai_blocks_internal_endpoint_when_disallowed(mock_ai_config):
mock_ai_config.llm_backend = "openai-like"
mock_ai_config.llm_backend = "openai"
mock_ai_config.llm_model = "test_model"
mock_ai_config.llm_api_key = "test_api_key"
mock_ai_config.llm_endpoint = "http://127.0.0.1:1234"
+7 -9
@@ -54,17 +54,15 @@ def mock_document():
def test_get_embedding_model_openai(mock_ai_config):
mock_ai_config.return_value.llm_embedding_backend = LLMEmbeddingBackend.OPENAI_LIKE
mock_ai_config.return_value.llm_embedding_backend = LLMEmbeddingBackend.OPENAI
mock_ai_config.return_value.llm_embedding_model = "text-embedding-3-small"
mock_ai_config.return_value.llm_api_key = "test_api_key"
mock_ai_config.return_value.llm_endpoint = "http://test-url"
with patch(
"llama_index.embeddings.openai_like.OpenAILikeEmbedding",
) as MockOpenAIEmbedding:
with patch("llama_index.embeddings.openai.OpenAIEmbedding") as MockOpenAIEmbedding:
model = get_embedding_model()
MockOpenAIEmbedding.assert_called_once_with(
model_name="text-embedding-3-small",
model="text-embedding-3-small",
api_key="test_api_key",
api_base="http://test-url",
)
@@ -74,7 +72,7 @@ def test_get_embedding_model_openai(mock_ai_config):
def test_get_embedding_model_openai_blocks_internal_endpoint_when_disallowed(
mock_ai_config,
):
mock_ai_config.return_value.llm_embedding_backend = LLMEmbeddingBackend.OPENAI_LIKE
mock_ai_config.return_value.llm_embedding_backend = LLMEmbeddingBackend.OPENAI
mock_ai_config.return_value.llm_embedding_model = "text-embedding-3-small"
mock_ai_config.return_value.llm_api_key = "test_api_key"
mock_ai_config.return_value.llm_endpoint = "http://127.0.0.1:11434"
@@ -111,7 +109,7 @@ def test_get_embedding_model_invalid_backend(mock_ai_config):
def test_get_embedding_dim_infers_and_saves(temp_llm_index_dir, mock_ai_config):
mock_ai_config.return_value.llm_embedding_backend = "openai-like"
mock_ai_config.return_value.llm_embedding_backend = "openai"
mock_ai_config.return_value.llm_embedding_model = None
class DummyEmbedding:
@@ -131,7 +129,7 @@ def test_get_embedding_dim_infers_and_saves(temp_llm_index_dir, mock_ai_config):
def test_get_embedding_dim_reads_existing_meta(temp_llm_index_dir, mock_ai_config):
mock_ai_config.return_value.llm_embedding_backend = "openai-like"
mock_ai_config.return_value.llm_embedding_backend = "openai"
mock_ai_config.return_value.llm_embedding_model = None
(temp_llm_index_dir / "meta.json").write_text(
@@ -144,7 +142,7 @@ def test_get_embedding_dim_reads_existing_meta(temp_llm_index_dir, mock_ai_confi
def test_get_embedding_dim_raises_on_model_change(temp_llm_index_dir, mock_ai_config):
mock_ai_config.return_value.llm_embedding_backend = "openai-like"
mock_ai_config.return_value.llm_embedding_backend = "openai"
mock_ai_config.return_value.llm_embedding_model = None
(temp_llm_index_dir / "meta.json").write_text(
+7 -10
@@ -10,7 +10,6 @@ from fnmatch import fnmatch
from pathlib import Path
from typing import TYPE_CHECKING
import magic
import pathvalidate
from celery import chord
from celery import shared_task
@@ -40,6 +39,7 @@ from documents.models import Correspondent
from documents.models import PaperlessTask
from documents.parsers import is_mime_type_supported
from documents.tasks import consume_file
from paperless import mime_detection
from paperless.network import is_public_ip
from paperless.network import resolve_hostname_ips
from paperless_mail.models import MailAccount
@@ -848,7 +848,7 @@ class MailAccountHandler(LoggingMixin):
# don't trust the content type of the attachment. Could be
# generic application/octet-stream.
mime_type = magic.from_buffer(att.payload, mime=True)
mime_type = mime_detection.from_buffer(att.payload)
if is_mime_type_supported(mime_type):
self.log.info(
@@ -954,14 +954,11 @@ class MailAccountHandler(LoggingMixin):
)
with Path(temp_filename).open("wb") as f:
# Move "From"-header to beginning of file
# TODO: This ugly workaround is needed because the parser is
# chosen only by the mime_type detected via magic
# (see documents/consumer.py "mime_type = magic.from_file")
# Unfortunately magic sometimes fails to detect the mime
# type of .eml files correctly as message/rfc822 and instead
# detects text/plain.
# This also effects direct file consumption of .eml files
# which are not treated with this workaround.
# TODO: This workaround may no longer be needed with Magika,
# which has better text-format detection than libmagic.
# Previously libmagic would misidentify .eml files as text/plain
# instead of message/rfc822. Verify and remove if Magika handles
# it correctly.
from_element = None
for i, header in enumerate(message.obj._headers):
if header[0] == "From":
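The hunk above swaps `magic.from_buffer(payload, mime=True)` for a new `paperless.mime_detection.from_buffer` shim, per the commit's experiment with Magika. A sketch of what such a shim could look like — the Magika call shape (`Magika().identify_bytes(...).output.mime_type`) is assumed from its documentation, and the fallback branch is purely illustrative:

```python
# Hedged sketch of a mime_detection shim: libmagic's from_buffer call
# shape, backed by Magika when available. Not the actual module from
# this branch.
def from_buffer(buffer: bytes) -> str:
    try:
        from magika import Magika
    except ImportError:
        # Illustrative fallback so the shim degrades without magika installed.
        if buffer.startswith(b"%PDF-"):
            return "application/pdf"
        return "application/octet-stream"
    result = Magika().identify_bytes(buffer)
    return result.output.mime_type
```

Keeping the `from_buffer(payload) -> str` signature lets call sites like the attachment handler above switch detectors without further changes.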
+1 -13
@@ -2,7 +2,6 @@ from django.utils.translation import gettext as _
from rest_framework import serializers
from rest_framework.exceptions import PermissionDenied
from documents.permissions import get_objects_for_user_owner_aware
from documents.permissions import has_perms_owner_aware
from documents.serialisers import CorrespondentField
from documents.serialisers import DocumentTypeField
@@ -60,18 +59,7 @@ class MailAccountSerializer(OwnedObjectSerializer):
class AccountField(serializers.PrimaryKeyRelatedField[MailAccount]):
def get_queryset(self):
user = getattr(self.context.get("request"), "user", None)
if user is None:
user = getattr(self.root, "user", None)
if user is None:
return MailAccount.objects.none()
return get_objects_for_user_owner_aware(
user,
"change_mailaccount",
MailAccount,
).order_by("-id")
return MailAccount.objects.all().order_by("-id")
class MailRuleSerializer(OwnedObjectSerializer):
-108
@@ -1,19 +1,11 @@
from collections.abc import Generator
from typing import TYPE_CHECKING
import pytest
import pytest_mock
from django.contrib.auth.models import User
from django.test import Client
from rest_framework.test import APIClient
from paperless_mail.mail import MailAccountHandler
from paperless_mail.models import MailAccount
from paperless_mail.tests.factories import MailAccountFactory
if TYPE_CHECKING:
from paperless_mail.tests.test_mail import BogusMailBox
@pytest.fixture()
def greenmail_mail_account(db: None) -> Generator[MailAccount, None, None]:
@@ -35,103 +27,3 @@ def greenmail_mail_account(db: None) -> Generator[MailAccount, None, None]:
@pytest.fixture()
def mail_account_handler() -> MailAccountHandler:
return MailAccountHandler()
@pytest.fixture()
def mail_user(
db: None,
django_user_model,
client: Client,
):
"""
Create a user with the `add_mailaccount` permission and log them in via
the test client. Returned so tests can mutate permissions if needed.
"""
from django.contrib.auth.models import Permission
user = django_user_model.objects.create_user("testuser")
user.user_permissions.add(
*Permission.objects.filter(codename__in=["add_mailaccount"]),
)
user.save()
client.force_login(user)
return user
@pytest.fixture()
def oauth_settings(settings):
"""
Apply the OAuth callback / client-id settings the OAuth flow needs. Uses
pytest-django's `settings` fixture so values are reverted automatically.
"""
settings.OAUTH_CALLBACK_BASE_URL = "http://localhost:8000"
settings.GMAIL_OAUTH_CLIENT_ID = "test_gmail_client_id"
settings.GMAIL_OAUTH_CLIENT_SECRET = "test_gmail_client_secret"
settings.OUTLOOK_OAUTH_CLIENT_ID = "test_outlook_client_id"
settings.OUTLOOK_OAUTH_CLIENT_SECRET = "test_outlook_client_secret"
return settings
@pytest.fixture()
def mail_mocker(db: None):
"""
Provides a MailMocker instance with its `MailBox` and
`queue_consumption_tasks` patches active. Cleanups registered via
TestCase.addCleanup are run on teardown by calling doCleanups().
"""
from paperless_mail.tests.test_mail import MailMocker
mocker = MailMocker()
mocker.setUp()
try:
yield mocker
finally:
mocker.doCleanups()
@pytest.fixture()
def mail_api_user(
db: None,
django_user_model: type[User],
) -> User:
"""
Fully-permissioned (regular) user used by the mail API tests.
Has every model-level permission but is NOT a Django superuser/staff:
the owner-aware filtering and bulk_delete permission tests rely on
django-guardian's object-level checks, and `is_superuser` short-circuits
those checks. The name avoids `admin` to make this distinction explicit.
"""
from django.contrib.auth.models import Permission
user = django_user_model.objects.create_user(username="mail_api_user")
user.user_permissions.add(*Permission.objects.all())
user.save()
return user
@pytest.fixture()
def mail_api_client(mail_api_user: User) -> APIClient:
"""
DRF APIClient force-authenticated as `mail_api_user` and pinned to API v10
via the Accept header (matches `documents/tests/conftest.py:admin_client`).
"""
client = APIClient()
client.force_authenticate(user=mail_api_user)
client.credentials(HTTP_ACCEPT="application/json; version=10")
return client
@pytest.fixture()
def bogus_mailbox(mocker: pytest_mock.MockerFixture) -> "BogusMailBox":
"""
Patch `paperless_mail.mail.MailBox` with a `BogusMailBox` instance so the
`/api/mail_accounts/test/` endpoint can run without a real IMAP server.
Returns the bogus mailbox so tests can introspect/manipulate it.
"""
from paperless_mail.tests.test_mail import BogusMailBox
mailbox = BogusMailBox()
mock_mailbox_cls = mocker.patch("paperless_mail.mail.MailBox")
mock_mailbox_cls.return_value = mailbox
return mailbox
+3 -3
@@ -9,7 +9,7 @@ from paperless_mail.models import MailRule
from paperless_mail.models import ProcessedMail
class MailAccountFactory(DjangoModelFactory[MailAccount]):
class MailAccountFactory(DjangoModelFactory):
class Meta:
model = MailAccount
@@ -24,7 +24,7 @@ class MailAccountFactory(DjangoModelFactory[MailAccount]):
is_token = False
class MailRuleFactory(DjangoModelFactory[MailRule]):
class MailRuleFactory(DjangoModelFactory):
class Meta:
model = MailRule
@@ -44,7 +44,7 @@ class MailRuleFactory(DjangoModelFactory[MailRule]):
stop_processing = False
class ProcessedMailFactory(DjangoModelFactory[ProcessedMail]):
class ProcessedMailFactory(DjangoModelFactory):
class Meta:
model = ProcessedMail
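The factory hunk above drops the `DjangoModelFactory[MailAccount]`-style subscriptions. As a rough stdlib illustration (a generic stand-in class, not factory_boy itself), the subscript on a generic base only informs static type checkers; at runtime both spellings produce equivalent subclasses:

```python
from typing import Generic, TypeVar

T = TypeVar("T")

class ModelFactory(Generic[T]):
    """Illustrative stand-in for a generic factory base class."""
    model: type

class Widget:
    pass

class TypedWidgetFactory(ModelFactory[Widget]):  # subscripted base
    model = Widget

class PlainWidgetFactory(ModelFactory):          # unsubscripted base
    model = Widget

# The subscript is erased at runtime: both subclasses resolve to the
# same base and carry the same class attributes.
same_model = TypedWidgetFactory.model is PlainWidgetFactory.model
```

So removing the subscription changes nothing about test behavior, only what a type checker can infer about the factory's product type.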
+268 -318
@@ -1,39 +1,44 @@
import json
from typing import TYPE_CHECKING
from unittest import mock
import pytest
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from guardian.shortcuts import assign_perm
from rest_framework import status
from rest_framework.test import APIClient
from rest_framework.test import APITestCase
from documents.tests.factories import CorrespondentFactory
from documents.tests.factories import DocumentTypeFactory
from documents.tests.factories import TagFactory
from documents.tests.utils import DirectoriesMixin
from paperless_mail.models import MailAccount
from paperless_mail.models import MailRule
from paperless_mail.models import ProcessedMail
from paperless_mail.tests.factories import MailAccountFactory
from paperless_mail.tests.factories import MailRuleFactory
from paperless_mail.tests.factories import ProcessedMailFactory
if TYPE_CHECKING:
from paperless_mail.tests.test_mail import BogusMailBox
from paperless_mail.tests.test_mail import BogusMailBox
MAIL_ACCOUNTS_ENDPOINT = "/api/mail_accounts/"
MAIL_ACCOUNTS_TEST_ENDPOINT = f"{MAIL_ACCOUNTS_ENDPOINT}test/"
MAIL_RULES_ENDPOINT = "/api/mail_rules/"
PROCESSED_MAIL_ENDPOINT = "/api/processed_mail/"
PROCESSED_MAIL_BULK_DELETE_ENDPOINT = f"{PROCESSED_MAIL_ENDPOINT}bulk_delete/"
class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
ENDPOINT = "/api/mail_accounts/"
def setUp(self) -> None:
self.bogus_mailbox = BogusMailBox()
@pytest.mark.django_db
class TestAPIMailAccounts:
def test_get_mail_accounts(
self,
mail_api_client: APIClient,
) -> None:
patcher = mock.patch("paperless_mail.mail.MailBox")
m = patcher.start()
m.return_value = self.bogus_mailbox
self.addCleanup(patcher.stop)
super().setUp()
self.user = User.objects.create_user(username="temp_admin")
self.user.user_permissions.add(*Permission.objects.all())
self.user.save()
self.client.force_authenticate(user=self.user)
def test_get_mail_accounts(self) -> None:
"""
GIVEN:
- Configured mail accounts
@@ -42,6 +47,7 @@ class TestAPIMailAccounts:
THEN:
- Configured mail accounts are provided
"""
account1 = MailAccountFactory(
name="Email1",
username="username1",
@@ -50,30 +56,31 @@ class TestAPIMailAccounts:
imap_port=443,
)
response = mail_api_client.get(MAIL_ACCOUNTS_ENDPOINT)
response = self.client.get(self.ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 1)
returned_account1 = response.data["results"][0]
assert returned_account1["name"] == account1.name
assert returned_account1["username"] == account1.username
assert returned_account1["password"] == "**********"
assert returned_account1["imap_server"] == account1.imap_server
assert returned_account1["imap_port"] == account1.imap_port
assert returned_account1["imap_security"] == account1.imap_security
assert returned_account1["character_set"] == account1.character_set
self.assertEqual(returned_account1["name"], account1.name)
self.assertEqual(returned_account1["username"], account1.username)
self.assertEqual(
returned_account1["password"],
"**********",
)
self.assertEqual(returned_account1["imap_server"], account1.imap_server)
self.assertEqual(returned_account1["imap_port"], account1.imap_port)
self.assertEqual(returned_account1["imap_security"], account1.imap_security)
self.assertEqual(returned_account1["character_set"], account1.character_set)
def test_create_mail_account(
self,
mail_api_client: APIClient,
) -> None:
def test_create_mail_account(self) -> None:
"""
WHEN:
- API request is made to add a mail account
THEN:
- A new mail account is created
"""
account1 = {
"name": "Email1",
"username": "username1",
@@ -84,27 +91,24 @@ class TestAPIMailAccounts:
"character_set": "UTF-8",
}
response = mail_api_client.post(
MAIL_ACCOUNTS_ENDPOINT,
response = self.client.post(
self.ENDPOINT,
data=account1,
)
assert response.status_code == status.HTTP_201_CREATED
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
returned_account1 = MailAccount.objects.get(name="Email1")
assert returned_account1.name == account1["name"]
assert returned_account1.username == account1["username"]
assert returned_account1.password == account1["password"]
assert returned_account1.imap_server == account1["imap_server"]
assert returned_account1.imap_port == account1["imap_port"]
assert returned_account1.imap_security == account1["imap_security"]
assert returned_account1.character_set == account1["character_set"]
self.assertEqual(returned_account1.name, account1["name"])
self.assertEqual(returned_account1.username, account1["username"])
self.assertEqual(returned_account1.password, account1["password"])
self.assertEqual(returned_account1.imap_server, account1["imap_server"])
self.assertEqual(returned_account1.imap_port, account1["imap_port"])
self.assertEqual(returned_account1.imap_security, account1["imap_security"])
self.assertEqual(returned_account1.character_set, account1["character_set"])
def test_delete_mail_account(
self,
mail_api_client: APIClient,
) -> None:
def test_delete_mail_account(self) -> None:
"""
GIVEN:
- Existing mail account
@@ -113,20 +117,18 @@ class TestAPIMailAccounts:
THEN:
- Account is deleted
"""
account1 = MailAccountFactory()
response = mail_api_client.delete(
f"{MAIL_ACCOUNTS_ENDPOINT}{account1.pk}/",
response = self.client.delete(
f"{self.ENDPOINT}{account1.pk}/",
)
assert response.status_code == status.HTTP_204_NO_CONTENT
self.assertEqual(response.status_code, status.HTTP_204_NO_CONTENT)
assert MailAccount.objects.count() == 0
self.assertEqual(len(MailAccount.objects.all()), 0)
def test_update_mail_account(
self,
mail_api_client: APIClient,
) -> None:
def test_update_mail_account(self) -> None:
"""
GIVEN:
- Existing mail accounts
@@ -135,51 +137,49 @@ class TestAPIMailAccounts:
THEN:
- The mail account is updated, password only updated if not '****'
"""
account1 = MailAccountFactory()
response = mail_api_client.patch(
f"{MAIL_ACCOUNTS_ENDPOINT}{account1.pk}/",
response = self.client.patch(
f"{self.ENDPOINT}{account1.pk}/",
data={
"name": "Updated Name 1",
"password": "******",
},
)
assert response.status_code == status.HTTP_200_OK
self.assertEqual(response.status_code, status.HTTP_200_OK)
returned_account1 = MailAccount.objects.get(pk=account1.pk)
assert returned_account1.name == "Updated Name 1"
assert returned_account1.password == account1.password
self.assertEqual(returned_account1.name, "Updated Name 1")
self.assertEqual(returned_account1.password, account1.password)
response = mail_api_client.patch(
f"{MAIL_ACCOUNTS_ENDPOINT}{account1.pk}/",
response = self.client.patch(
f"{self.ENDPOINT}{account1.pk}/",
data={
"name": "Updated Name 2",
"password": "123xyz",
},
)
assert response.status_code == status.HTTP_200_OK
self.assertEqual(response.status_code, status.HTTP_200_OK)
returned_account2 = MailAccount.objects.get(pk=account1.pk)
assert returned_account2.name == "Updated Name 2"
assert returned_account2.password == "123xyz"
self.assertEqual(returned_account2.name, "Updated Name 2")
self.assertEqual(returned_account2.password, "123xyz")
def test_mail_account_test_fail(
self,
mail_api_client: APIClient,
bogus_mailbox: "BogusMailBox",
) -> None:
def test_mail_account_test_fail(self) -> None:
"""
GIVEN:
- Erroneous mail account details
- Erroneous mail account details
WHEN:
- API call is made to test account
THEN:
- API returns 400 bad request
"""
response = mail_api_client.post(
MAIL_ACCOUNTS_TEST_ENDPOINT,
response = self.client.post(
f"{self.ENDPOINT}test/",
json.dumps(
{
"imap_server": "server.example.com",
@@ -192,13 +192,9 @@ class TestAPIMailAccounts:
content_type="application/json",
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
def test_mail_account_test_success(
self,
mail_api_client: APIClient,
bogus_mailbox: "BogusMailBox",
) -> None:
def test_mail_account_test_success(self) -> None:
"""
GIVEN:
- Working mail account details
@@ -207,8 +203,9 @@ class TestAPIMailAccounts:
THEN:
- API returns success
"""
response = mail_api_client.post(
MAIL_ACCOUNTS_TEST_ENDPOINT,
response = self.client.post(
f"{self.ENDPOINT}test/",
json.dumps(
{
"imap_server": "server.example.com",
@@ -220,14 +217,10 @@ class TestAPIMailAccounts:
),
content_type="application/json",
)
assert response.status_code == status.HTTP_200_OK
assert response.data["success"] is True
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["success"], True)
def test_mail_account_test_existing(
self,
mail_api_client: APIClient,
bogus_mailbox: "BogusMailBox",
) -> None:
def test_mail_account_test_existing(self) -> None:
"""
GIVEN:
- Testing server details for an existing account with obfuscated password (***)
@@ -243,8 +236,8 @@ class TestAPIMailAccounts:
imap_port=443,
)
response = mail_api_client.post(
MAIL_ACCOUNTS_TEST_ENDPOINT,
response = self.client.post(
f"{self.ENDPOINT}test/",
json.dumps(
{
"id": account.pk,
@@ -257,16 +250,12 @@ class TestAPIMailAccounts:
),
content_type="application/json",
)
assert response.status_code == status.HTTP_200_OK
assert response.data["success"] is True
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["success"], True)
def test_mail_account_test_existing_nonexistent_id_forbidden(
self,
mail_api_client: APIClient,
bogus_mailbox: "BogusMailBox",
) -> None:
response = mail_api_client.post(
MAIL_ACCOUNTS_TEST_ENDPOINT,
def test_mail_account_test_existing_nonexistent_id_forbidden(self) -> None:
response = self.client.post(
f"{self.ENDPOINT}test/",
json.dumps(
{
"id": 999999,
@@ -279,15 +268,10 @@ class TestAPIMailAccounts:
),
content_type="application/json",
)
assert response.status_code == status.HTTP_403_FORBIDDEN
assert response.content.decode() == "Insufficient permissions"
self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
self.assertEqual(response.content.decode(), "Insufficient permissions")
def test_get_mail_accounts_owner_aware(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
def test_get_mail_accounts_owner_aware(self) -> None:
"""
GIVEN:
- Configured accounts with different users
@@ -296,29 +280,36 @@ class TestAPIMailAccounts:
THEN:
- Only unowned, owned by user or granted accounts are provided
"""
user2 = django_user_model.objects.create_user(username="temp_admin2")
user2 = User.objects.create_user(username="temp_admin2")
account1 = MailAccountFactory(name="Email1")
account2 = MailAccountFactory(name="Email2", owner=mail_api_user)
account2 = MailAccountFactory(name="Email2", owner=self.user)
_account3 = MailAccountFactory(name="Email3", owner=user2)
account4 = MailAccountFactory(name="Email4", owner=user2)
assign_perm("view_mailaccount", mail_api_user, account4)
assign_perm("view_mailaccount", self.user, account4)
response = mail_api_client.get(MAIL_ACCOUNTS_ENDPOINT)
response = self.client.get(self.ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 3
assert response.data["results"][0]["name"] == account1.name
assert response.data["results"][1]["name"] == account2.name
assert response.data["results"][2]["name"] == account4.name
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 3)
self.assertEqual(response.data["results"][0]["name"], account1.name)
self.assertEqual(response.data["results"][1]["name"], account2.name)
self.assertEqual(response.data["results"][2]["name"], account4.name)
@pytest.mark.django_db
class TestAPIMailRules:
def test_get_mail_rules(
self,
mail_api_client: APIClient,
) -> None:
class TestAPIMailRules(DirectoriesMixin, APITestCase):
ENDPOINT = "/api/mail_rules/"
def setUp(self) -> None:
super().setUp()
self.user = User.objects.create_user(username="temp_admin")
self.user.user_permissions.add(*Permission.objects.all())
self.user.save()
self.client.force_authenticate(user=self.user)
def test_get_mail_rules(self) -> None:
"""
GIVEN:
- Configured mail accounts and rules
@@ -327,6 +318,7 @@ class TestAPIMailRules:
THEN:
- Configured mail rules are provided
"""
account1 = MailAccountFactory()
rule1 = MailRuleFactory(
name="Rule1",
@@ -338,37 +330,34 @@ class TestAPIMailRules:
filter_attachment_filename_include="file.pdf",
)
response = mail_api_client.get(MAIL_RULES_ENDPOINT)
response = self.client.get(self.ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 1)
returned_rule1 = response.data["results"][0]
assert returned_rule1["name"] == rule1.name
assert returned_rule1["account"] == account1.pk
assert returned_rule1["folder"] == rule1.folder
assert returned_rule1["filter_from"] == rule1.filter_from
assert returned_rule1["filter_to"] == rule1.filter_to
assert returned_rule1["filter_subject"] == rule1.filter_subject
assert returned_rule1["filter_body"] == rule1.filter_body
assert (
returned_rule1["filter_attachment_filename_include"]
== rule1.filter_attachment_filename_include
self.assertEqual(returned_rule1["name"], rule1.name)
self.assertEqual(returned_rule1["account"], account1.pk)
self.assertEqual(returned_rule1["folder"], rule1.folder)
self.assertEqual(returned_rule1["filter_from"], rule1.filter_from)
self.assertEqual(returned_rule1["filter_to"], rule1.filter_to)
self.assertEqual(returned_rule1["filter_subject"], rule1.filter_subject)
self.assertEqual(returned_rule1["filter_body"], rule1.filter_body)
self.assertEqual(
returned_rule1["filter_attachment_filename_include"],
rule1.filter_attachment_filename_include,
)
assert returned_rule1["maximum_age"] == rule1.maximum_age
assert returned_rule1["action"] == rule1.action
assert returned_rule1["assign_title_from"] == rule1.assign_title_from
assert (
returned_rule1["assign_correspondent_from"]
== rule1.assign_correspondent_from
self.assertEqual(returned_rule1["maximum_age"], rule1.maximum_age)
self.assertEqual(returned_rule1["action"], rule1.action)
self.assertEqual(returned_rule1["assign_title_from"], rule1.assign_title_from)
self.assertEqual(
returned_rule1["assign_correspondent_from"],
rule1.assign_correspondent_from,
)
assert returned_rule1["order"] == rule1.order
assert returned_rule1["attachment_type"] == rule1.attachment_type
self.assertEqual(returned_rule1["order"], rule1.order)
self.assertEqual(returned_rule1["attachment_type"], rule1.attachment_type)
def test_create_mail_rule(
self,
mail_api_client: APIClient,
) -> None:
def test_create_mail_rule(self) -> None:
"""
GIVEN:
- Configured mail account exists
@@ -377,6 +366,7 @@ class TestAPIMailRules:
THEN:
- A new mail rule is created
"""
account1 = MailAccountFactory()
tag = TagFactory(name="t")
correspondent = CorrespondentFactory(name="c")
@@ -404,51 +394,58 @@ class TestAPIMailRules:
"assign_owner_from_rule": True,
}
response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
response = self.client.post(
self.ENDPOINT,
data=rule1,
)
assert response.status_code == status.HTTP_201_CREATED
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
response = mail_api_client.get(MAIL_RULES_ENDPOINT)
response = self.client.get(self.ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 1)
returned_rule1 = response.data["results"][0]
assert returned_rule1["name"] == rule1["name"]
assert returned_rule1["account"] == account1.pk
assert returned_rule1["folder"] == rule1["folder"]
assert returned_rule1["filter_from"] == rule1["filter_from"]
assert returned_rule1["filter_to"] == rule1["filter_to"]
assert returned_rule1["filter_subject"] == rule1["filter_subject"]
assert returned_rule1["filter_body"] == rule1["filter_body"]
assert (
returned_rule1["filter_attachment_filename_include"]
== rule1["filter_attachment_filename_include"]
self.assertEqual(returned_rule1["name"], rule1["name"])
self.assertEqual(returned_rule1["account"], account1.pk)
self.assertEqual(returned_rule1["folder"], rule1["folder"])
self.assertEqual(returned_rule1["filter_from"], rule1["filter_from"])
self.assertEqual(returned_rule1["filter_to"], rule1["filter_to"])
self.assertEqual(returned_rule1["filter_subject"], rule1["filter_subject"])
self.assertEqual(returned_rule1["filter_body"], rule1["filter_body"])
self.assertEqual(
returned_rule1["filter_attachment_filename_include"],
rule1["filter_attachment_filename_include"],
)
assert returned_rule1["maximum_age"] == rule1["maximum_age"]
assert returned_rule1["action"] == rule1["action"]
assert returned_rule1["assign_title_from"] == rule1["assign_title_from"]
assert (
returned_rule1["assign_correspondent_from"]
== rule1["assign_correspondent_from"]
self.assertEqual(returned_rule1["maximum_age"], rule1["maximum_age"])
self.assertEqual(returned_rule1["action"], rule1["action"])
self.assertEqual(
returned_rule1["assign_title_from"],
rule1["assign_title_from"],
)
assert returned_rule1["order"] == rule1["order"]
assert returned_rule1["attachment_type"] == rule1["attachment_type"]
assert returned_rule1["action_parameter"] == rule1["action_parameter"]
assert returned_rule1["assign_correspondent"] == rule1["assign_correspondent"]
assert returned_rule1["assign_document_type"] == rule1["assign_document_type"]
assert returned_rule1["assign_tags"] == rule1["assign_tags"]
assert (
returned_rule1["assign_owner_from_rule"] == rule1["assign_owner_from_rule"]
self.assertEqual(
returned_rule1["assign_correspondent_from"],
rule1["assign_correspondent_from"],
)
self.assertEqual(returned_rule1["order"], rule1["order"])
self.assertEqual(returned_rule1["attachment_type"], rule1["attachment_type"])
self.assertEqual(returned_rule1["action_parameter"], rule1["action_parameter"])
self.assertEqual(
returned_rule1["assign_correspondent"],
rule1["assign_correspondent"],
)
self.assertEqual(
returned_rule1["assign_document_type"],
rule1["assign_document_type"],
)
self.assertEqual(returned_rule1["assign_tags"], rule1["assign_tags"])
self.assertEqual(
returned_rule1["assign_owner_from_rule"],
rule1["assign_owner_from_rule"],
)
def test_delete_mail_rule(
self,
mail_api_client: APIClient,
) -> None:
def test_delete_mail_rule(self) -> None:
"""
GIVEN:
- Existing mail rule
@@ -457,21 +454,19 @@ class TestAPIMailRules:
THEN:
- Rule is deleted
"""
account1 = MailAccountFactory()
rule1 = MailRuleFactory(account=account1)
response = mail_api_client.delete(
f"{MAIL_RULES_ENDPOINT}{rule1.pk}/",
response = self.client.delete(
f"{self.ENDPOINT}{rule1.pk}/",
)
assert response.status_code == status.HTTP_204_NO_CONTENT
self.assertEqual(response.status_code, status.HTTP_204_NO_CONTENT)
assert MailRule.objects.count() == 0
self.assertEqual(len(MailRule.objects.all()), 0)
def test_update_mail_rule(
self,
mail_api_client: APIClient,
) -> None:
def test_update_mail_rule(self) -> None:
"""
GIVEN:
- Existing mail rule
@@ -480,33 +475,30 @@ class TestAPIMailRules:
THEN:
- The mail rule is updated
"""
account1 = MailAccountFactory()
rule1 = MailRuleFactory(account=account1)
response = mail_api_client.patch(
f"{MAIL_RULES_ENDPOINT}{rule1.pk}/",
response = self.client.patch(
f"{self.ENDPOINT}{rule1.pk}/",
data={
"name": "Updated Name 1",
"action": MailRule.MailAction.DELETE,
},
)
assert response.status_code == status.HTTP_200_OK
self.assertEqual(response.status_code, status.HTTP_200_OK)
returned_rule1 = MailRule.objects.get(pk=rule1.pk)
assert returned_rule1.name == "Updated Name 1"
assert returned_rule1.action == MailRule.MailAction.DELETE
self.assertEqual(returned_rule1.name, "Updated Name 1")
self.assertEqual(returned_rule1.action, MailRule.MailAction.DELETE)
def test_create_mail_rule_scopes_accounts(
self,
mail_api_client: APIClient,
django_user_model: type[User],
) -> None:
other_user = django_user_model.objects.create_user(username="mail-owner")
def test_create_mail_rule_forbidden_for_unpermitted_account(self) -> None:
other_user = User.objects.create_user(username="mail-owner")
foreign_account = MailAccountFactory(name="ForeignEmail", owner=other_user)
response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
response = self.client.post(
self.ENDPOINT,
data={
"name": "Rule1",
"account": foreign_account.pk,
@@ -520,40 +512,19 @@ class TestAPIMailRules:
"attachment_type": MailRule.AttachmentProcessing.ATTACHMENTS_ONLY,
},
)
missing_response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
data={
"name": "Rule1",
"account": foreign_account.pk + 1000,
"folder": "INBOX",
"filter_from": "from@example.com",
"maximum_age": 30,
"action": MailRule.MailAction.MARK_READ,
"assign_title_from": MailRule.TitleSource.FROM_SUBJECT,
"assign_correspondent_from": MailRule.CorrespondentSource.FROM_NOTHING,
"order": 0,
"attachment_type": MailRule.AttachmentProcessing.ATTACHMENTS_ONLY,
},
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert missing_response.status_code == status.HTTP_400_BAD_REQUEST
assert response.data["account"][0].code == "does_not_exist"
assert missing_response.data["account"][0].code == "does_not_exist"
assert MailRule.objects.count() == 0
self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
self.assertEqual(MailRule.objects.count(), 0)
def test_create_mail_rule_allowed_for_granted_account_change_permission(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
other_user = django_user_model.objects.create_user(username="mail-owner")
other_user = User.objects.create_user(username="mail-owner")
foreign_account = MailAccountFactory(name="ForeignEmail", owner=other_user)
assign_perm("change_mailaccount", mail_api_user, foreign_account)
assign_perm("change_mailaccount", self.user, foreign_account)
response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
response = self.client.post(
self.ENDPOINT,
data={
"name": "Rule1",
"account": foreign_account.pk,
@@ -568,34 +539,25 @@ class TestAPIMailRules:
},
)
assert response.status_code == status.HTTP_201_CREATED
assert MailRule.objects.get().account == foreign_account
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
self.assertEqual(MailRule.objects.get().account, foreign_account)
def test_update_mail_rule_forbidden_for_unpermitted_account(
self,
mail_api_client: APIClient,
django_user_model: type[User],
) -> None:
def test_update_mail_rule_forbidden_for_unpermitted_account(self) -> None:
own_account = MailAccountFactory()
other_user = django_user_model.objects.create_user(username="mail-owner")
other_user = User.objects.create_user(username="mail-owner")
foreign_account = MailAccountFactory(owner=other_user)
rule1 = MailRuleFactory(account=own_account)
response = mail_api_client.patch(
f"{MAIL_RULES_ENDPOINT}{rule1.pk}/",
response = self.client.patch(
f"{self.ENDPOINT}{rule1.pk}/",
data={"account": foreign_account.pk},
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
rule1.refresh_from_db()
assert rule1.account == own_account
self.assertEqual(rule1.account, own_account)
def test_get_mail_rules_owner_aware(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
def test_get_mail_rules_owner_aware(self) -> None:
"""
GIVEN:
- Configured rules with different users
@@ -604,26 +566,24 @@ class TestAPIMailRules:
THEN:
- Only unowned, owned by user or granted mail rules are provided
"""
user2 = django_user_model.objects.create_user(username="temp_admin2")
user2 = User.objects.create_user(username="temp_admin2")
account1 = MailAccountFactory()
rule1 = MailRuleFactory(account=account1, order=0)
rule2 = MailRuleFactory(account=account1, order=1, owner=mail_api_user)
rule2 = MailRuleFactory(account=account1, order=1, owner=self.user)
MailRuleFactory(account=account1, order=2, owner=user2)
rule4 = MailRuleFactory(account=account1, order=3, owner=user2)
assign_perm("view_mailrule", mail_api_user, rule4)
assign_perm("view_mailrule", self.user, rule4)
response = mail_api_client.get(MAIL_RULES_ENDPOINT)
response = self.client.get(self.ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 3
assert response.data["results"][0]["name"] == rule1.name
assert response.data["results"][1]["name"] == rule2.name
assert response.data["results"][2]["name"] == rule4.name
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 3)
self.assertEqual(response.data["results"][0]["name"], rule1.name)
self.assertEqual(response.data["results"][1]["name"], rule2.name)
self.assertEqual(response.data["results"][2]["name"], rule4.name)
def test_mailrule_maxage_validation(
self,
mail_api_client: APIClient,
) -> None:
def test_mailrule_maxage_validation(self) -> None:
"""
GIVEN:
- An existing mail account
@@ -651,24 +611,24 @@ class TestAPIMailRules:
"attachment_type": MailRule.AttachmentProcessing.ATTACHMENTS_ONLY,
}
response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
data=rule_data,
format="json",
)
response = self.client.post(self.ENDPOINT, data=rule_data, format="json")
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert "maximum_age" in response.data
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertIn("maximum_age", response.data)
@pytest.mark.django_db
class TestAPIProcessedMails:
def test_get_processed_mails_owner_aware(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
class TestAPIProcessedMails(DirectoriesMixin, APITestCase):
ENDPOINT = "/api/processed_mail/"
def setUp(self) -> None:
super().setUp()
self.user = User.objects.create_user(username="temp_admin")
self.user.user_permissions.add(*Permission.objects.all())
self.user.save()
self.client.force_authenticate(user=self.user)
def test_get_processed_mails_owner_aware(self) -> None:
"""
GIVEN:
- Configured processed mails with different users
@@ -677,31 +637,27 @@ class TestAPIProcessedMails:
THEN:
- Only unowned, owned by user or granted processed mails are provided
"""
user2 = django_user_model.objects.create_user(username="temp_admin2")
user2 = User.objects.create_user(username="temp_admin2")
rule = MailRuleFactory()
pm1 = ProcessedMailFactory(rule=rule)
pm2 = ProcessedMailFactory(
rule=rule,
status="FAILED",
error="err",
owner=mail_api_user,
owner=self.user,
)
ProcessedMailFactory(rule=rule, owner=user2)
pm4 = ProcessedMailFactory(rule=rule, owner=user2)
assign_perm("view_processedmail", mail_api_user, pm4)
assign_perm("view_processedmail", self.user, pm4)
response = mail_api_client.get(PROCESSED_MAIL_ENDPOINT)
response = self.client.get(self.ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 3
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 3)
returned_ids = {r["id"] for r in response.data["results"]}
assert returned_ids == {pm1.id, pm2.id, pm4.id}
self.assertSetEqual(returned_ids, {pm1.id, pm2.id, pm4.id})
def test_get_processed_mails_filter_by_rule(
self,
mail_api_client: APIClient,
mail_api_user: User,
) -> None:
def test_get_processed_mails_filter_by_rule(self) -> None:
"""
GIVEN:
- Processed mails belonging to two different rules
@@ -713,22 +669,17 @@ class TestAPIProcessedMails:
account = MailAccountFactory()
rule1 = MailRuleFactory(account=account)
rule2 = MailRuleFactory(account=account)
pm1 = ProcessedMailFactory(rule=rule1, owner=mail_api_user)
pm1 = ProcessedMailFactory(rule=rule1, owner=self.user)
pm2 = ProcessedMailFactory(rule=rule1, status="FAILED", error="e")
ProcessedMailFactory(rule=rule2)
response = mail_api_client.get(f"{PROCESSED_MAIL_ENDPOINT}?rule={rule1.pk}")
response = self.client.get(f"{self.ENDPOINT}?rule={rule1.pk}")
assert response.status_code == status.HTTP_200_OK
self.assertEqual(response.status_code, status.HTTP_200_OK)
returned_ids = {r["id"] for r in response.data["results"]}
assert returned_ids == {pm1.id, pm2.id}
self.assertSetEqual(returned_ids, {pm1.id, pm2.id})
def test_bulk_delete_processed_mails(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
def test_bulk_delete_processed_mails(self) -> None:
"""
GIVEN:
- Processed mails belonging to two different rules and different users
@@ -737,7 +688,7 @@ class TestAPIProcessedMails:
THEN:
- Only the specified processed mails are deleted, respecting ownership and permissions
"""
user2 = django_user_model.objects.create_user(username="temp_admin2")
user2 = User.objects.create_user(username="temp_admin2")
rule = MailRuleFactory()
# unowned, owned by self, and one with explicit object perm
pm_unowned = ProcessedMailFactory(rule=rule)
@@ -745,47 +696,46 @@ class TestAPIProcessedMails:
rule=rule,
status="FAILED",
error="e",
owner=mail_api_user,
owner=self.user,
)
pm_granted = ProcessedMailFactory(rule=rule, owner=user2)
assign_perm("delete_processedmail", mail_api_user, pm_granted)
assign_perm("delete_processedmail", self.user, pm_granted)
pm_forbidden = ProcessedMailFactory(rule=rule, owner=user2)
# Success for allowed items
response = mail_api_client.post(
PROCESSED_MAIL_BULK_DELETE_ENDPOINT,
response = self.client.post(
f"{self.ENDPOINT}bulk_delete/",
data={
"mail_ids": [pm_unowned.id, pm_owned.id, pm_granted.id],
},
format="json",
)
assert response.status_code == status.HTTP_200_OK
assert response.data["result"] == "OK"
assert set(response.data["deleted_mail_ids"]) == {
pm_unowned.id,
pm_owned.id,
pm_granted.id,
}
assert not ProcessedMail.objects.filter(id=pm_unowned.id).exists()
assert not ProcessedMail.objects.filter(id=pm_owned.id).exists()
assert not ProcessedMail.objects.filter(id=pm_granted.id).exists()
assert ProcessedMail.objects.filter(id=pm_forbidden.id).exists()
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["result"], "OK")
self.assertSetEqual(
set(response.data["deleted_mail_ids"]),
{pm_unowned.id, pm_owned.id, pm_granted.id},
)
self.assertFalse(ProcessedMail.objects.filter(id=pm_unowned.id).exists())
self.assertFalse(ProcessedMail.objects.filter(id=pm_owned.id).exists())
self.assertFalse(ProcessedMail.objects.filter(id=pm_granted.id).exists())
self.assertTrue(ProcessedMail.objects.filter(id=pm_forbidden.id).exists())
# 403 and not deleted
response = mail_api_client.post(
PROCESSED_MAIL_BULK_DELETE_ENDPOINT,
response = self.client.post(
f"{self.ENDPOINT}bulk_delete/",
data={
"mail_ids": [pm_forbidden.id],
},
format="json",
)
assert response.status_code == status.HTTP_403_FORBIDDEN
assert ProcessedMail.objects.filter(id=pm_forbidden.id).exists()
self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
self.assertTrue(ProcessedMail.objects.filter(id=pm_forbidden.id).exists())
# missing mail_ids
response = mail_api_client.post(
PROCESSED_MAIL_BULK_DELETE_ENDPOINT,
response = self.client.post(
f"{self.ENDPOINT}bulk_delete/",
data={"mail_ids": "not-a-list"},
format="json",
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
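The owner-aware tests above (accounts, rules, processed mail) all assert the same visibility rule that the object-level permission filtering enforces: an item is visible when it is unowned, owned by the requesting user, or explicitly granted. A minimal pure-Python sketch of that rule (data and names hypothetical, not the real queryset logic):

```python
def visible_ids(objects, user, granted_ids):
    """Ids visible to `user`: unowned, owned by the user, or granted."""
    return {
        obj["id"]
        for obj in objects
        if obj["owner"] is None
        or obj["owner"] == user
        or obj["id"] in granted_ids
    }

objects = [
    {"id": 1, "owner": None},     # unowned          -> visible
    {"id": 2, "owner": "me"},     # owned by user    -> visible
    {"id": 3, "owner": "other"},  # foreign, no grant-> hidden
    {"id": 4, "owner": "other"},  # foreign, granted -> visible
]
result = visible_ids(objects, "me", granted_ids={4})
```

This mirrors why each owner-aware test expects `count == 3` out of four fixtures: exactly one foreign, ungranted item is filtered out.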
Some files were not shown because too many files have changed in this diff