Compare commits

..

28 Commits

Author SHA1 Message Date
Trenton H
71c8077ca2 test(tasks): improve test_api_tasks.py structure and add api marker
- Move admin_client, v9_client, user_client fixtures to conftest.py so
  they can be reused by other API tests; all three now build on the
  rest_api_client fixture instead of creating APIClient() directly
- regular_user fixture was already moved to conftest.py; it is now also
  used by the new client fixtures
- Add docstrings to every test method describing the behaviour under test
- Move timedelta/timezone imports to module level
- Register 'api' pytest marker in pyproject.toml and apply pytestmark to
  the entire file so all 40 tests are selectable via -m api

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 11:51:42 -07:00
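A minimal sketch of the fixture layering this commit describes, assuming pytest-django and DRF; only the fixture names come from the commit message, and the Accept-header versioning for the v9 client is an assumption:

```python
# Hypothetical conftest.py; bodies are assumptions, names are from the
# commit message above.
import pytest
from rest_framework.test import APIClient


@pytest.fixture
def rest_api_client() -> APIClient:
    return APIClient()


@pytest.fixture
def regular_user(django_user_model):
    return django_user_model.objects.create_user(username="user")


@pytest.fixture
def user_client(rest_api_client, regular_user):
    rest_api_client.force_authenticate(user=regular_user)
    return rest_api_client


@pytest.fixture
def admin_client(rest_api_client, admin_user):
    rest_api_client.force_authenticate(user=admin_user)
    return rest_api_client


@pytest.fixture
def v9_client(admin_client):
    # Pin the API version so v9 compatibility paths are exercised
    # (assumes Accept-header versioning).
    admin_client.credentials(HTTP_ACCEPT="application/json; version=9")
    return admin_client
```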
Trenton H
fca565a169 test: fix remaining tests broken by task system redesign
Update all tests that created PaperlessTask objects with old field names
to use PaperlessTaskFactory and new field names (task_type, trigger_source,
status, result_message). Use apply_async instead of delay where mocked.
Drop TestCheckSanityTaskRecording — tests PaperlessTask creation that was
intentionally removed from check_sanity().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 11:25:52 -07:00
Trenton H
9cbca02050 chore(tasks): remove django-celery-results
PaperlessTask now tracks all task results via Celery signals. The
django-celery-results DB backend was write-only -- nothing reads
from it. Drop the package and add a migration to clean up the
orphaned tables.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 10:55:04 -07:00
Trenton H
e00b8e7afa feat(tasks): update Angular types and service for task redesign
Replace PaperlessTaskName/PaperlessTaskType/PaperlessTaskStatus enums
with new PaperlessTaskType, PaperlessTaskTriggerSource, PaperlessTaskStatus
enums. Update PaperlessTask interface to new field names (task_type,
trigger_source, input_data, result_message, related_document_ids).
Update TasksService to filter by task_type instead of task_name.
Update tasks component and system-status-dialog to use new field names.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 10:51:44 -07:00
Trenton H
8e4ad33702 feat(tasks): update TaskAdmin for redesigned model
Add date_created, duration_seconds to list_display; add trigger_source
to list_filter; add input_data, duration_seconds, wait_time_seconds to
readonly_fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 10:36:19 -07:00
Trenton H
48db462c2c test(tasks): address code quality review findings
Remove trivial field-existence tests per project conventions. Fix a
potentially flaky ordering test by using explicit date_created values.
Add an is_complete=false filter test and a v9 type-filter input-direction
test, and tighten the second TestActive test to target REVOKED specifically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 10:34:32 -07:00
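For the ordering fix, a hedged sketch of what pinning explicit timestamps looks like; the factory and endpoint names follow this commit series, everything else is an assumption:

```python
# Hypothetical test body; assumes the unpaginated /api/tasks/ list and the
# model's -date_created default ordering shown in the migration further down.
from datetime import timedelta

from django.utils import timezone

# Import path is an assumption.
from documents.tests.factories import PaperlessTaskFactory


def test_default_ordering(admin_client):
    older = PaperlessTaskFactory(
        date_created=timezone.now() - timedelta(hours=2),
    )
    newer = PaperlessTaskFactory(
        date_created=timezone.now() - timedelta(hours=1),
    )
    response = admin_client.get("/api/tasks/")
    # Newest first, regardless of insertion order.
    assert [row["id"] for row in response.data] == [newer.id, older.id]
```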
Trenton H
e4f11d6cfa test(tasks): fix two spec gaps in task API test suite
Move test_list_is_owner_aware to TestGetTasksV10 (it tests GET /api/tasks/,
not acknowledge). Add test_related_document_ids_includes_duplicate_of to
cover the duplicate_of path in the related_document_ids property.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 10:19:11 -07:00
Trenton H
bedb965b84 test(tasks): rewrite API task tests for redesigned model and v9 compat
Replaces the old Django TestCase-based tests with pytest-style classes using
PaperlessTaskFactory. Covers v10 field names, v9 backwards-compat field
mapping, filtering, ordering, acknowledge, acknowledge_all, summary, active,
and run endpoints. Also adds PaperlessTaskFactory to factories.py and fixes
a redundant source= kwarg in TaskSerializerV10.related_document_ids.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 10:07:38 -07:00
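A hypothetical sketch of the PaperlessTaskFactory added here; the defaults are assumptions based on the redesigned model fields visible in the migration further down:

```python
import uuid

import factory

from documents.models import PaperlessTask


class PaperlessTaskFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = PaperlessTask

    # Unique Celery-style task ID per instance.
    task_id = factory.LazyFunction(lambda: str(uuid.uuid4()))
    task_type = PaperlessTask.TaskType.CONSUME_FILE
    trigger_source = PaperlessTask.TriggerSource.FOLDER_CONSUME
    status = PaperlessTask.Status.PENDING
    input_data = factory.LazyFunction(dict)
```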
Trenton H
901237962b fix(tasks): add read_only_fields to TaskSerializerV9, enforce admin via permission_classes on run action 2026-04-15 07:25:58 -07:00
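A hedged sketch of the run-action change, following DRF's documented per-action permission_classes override; everything beyond the names in the commit subject is an assumption:

```python
from rest_framework import viewsets
from rest_framework.decorators import action
from rest_framework.permissions import IsAdminUser


class TasksViewSet(viewsets.ReadOnlyModelViewSet):
    @action(methods=["post"], detail=False, permission_classes=[IsAdminUser])
    def run(self, request):
        # Validate the requested task type and dispatch it. The
        # read_only_fields addition on TaskSerializerV9 is the
        # serializer-side counterpart: signal-managed fields become
        # read-only to API clients.
        ...
```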
Trenton H
c4794c0fe7 feat(tasks): update serializer, filter, and viewset with v9 backwards compat
- Replace TasksViewSerializer/RunTaskViewSerializer with TaskSerializerV10
  (new field names), TaskSerializerV9 (v9 compat), TaskSummarySerializer,
  and RunTaskSerializer
- Keep AcknowledgeTasksViewSerializer unchanged (existing validation retained)
- Expand PaperlessTaskFilterSet with MultipleChoiceFilter for task_type,
  trigger_source, status; add is_complete, date_created_after/before filters
- Replace TasksViewSet.get_serializer_class() to branch on request.version
- Add get_queryset() v9 compat for task_name/type query params
- Add acknowledge_all, summary, active actions to TasksViewSet
- Rewrite run action to use apply_async with trigger_source header
- Add timedelta import to views.py; add MultipleChoiceFilter/DateTimeFilter
  to filters.py imports

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 06:58:42 -07:00
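A hedged sketch of the version branch and the v9 query-param shim; only the serializer class names and field names come from the commit message, the bodies are assumptions:

```python
# Methods on TasksViewSet; bodies are assumptions.
def get_serializer_class(self):
    # request.version is populated by DRF's versioning scheme.
    if self.request.version == "9":
        return TaskSerializerV9
    return TaskSerializerV10


def get_queryset(self):
    queryset = super().get_queryset()
    # v9 compatibility: the old API filtered on task_name; map that
    # query param onto the new task_type field.
    legacy_name = self.request.query_params.get("task_name")
    if legacy_name:
        queryset = queryset.filter(task_type=legacy_name)
    return queryset
```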
Trenton H
425c32122a feat(tasks): auto-inject trigger_source=scheduled header for all beat tasks
Inject `headers: {"trigger_source": "scheduled"}` into every Celery beat
schedule entry so signal handlers can identify scheduler-originated tasks
without per-task instrumentation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 20:56:06 -07:00
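The mechanism, sketched against Celery's documented beat_schedule format, in which each entry's "options" dict is forwarded to apply_async; the example entry is hypothetical:

```python
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    # Hypothetical entry for illustration.
    "train-the-classifier": {
        "task": "documents.tasks.train_classifier",
        "schedule": crontab(minute=5, hour="*/1"),
    },
}

# Inject the header into every entry without clobbering other options.
for entry in CELERY_BEAT_SCHEDULE.values():
    options = entry.setdefault("options", {})
    headers = options.setdefault("headers", {})
    headers["trigger_source"] = "scheduled"
```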
Trenton H
f5c0834ce3 refactor(tasks): remove manual PaperlessTask creation and scheduled/auto params
All task records are now created exclusively via Celery signals (Task 2).
Removed PaperlessTask creation/update from train_classifier, sanity_check,
llmindex_index, and check_sanity. Removed scheduled= and auto= parameters
from all 7 call sites. Updated apply_async callers to use trigger_source
headers instead. Exceptions now propagate naturally from task functions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:55:17 -07:00
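A call site after this refactor might look like the following; the task import path is an assumption, and the header shape matches the beat commit above:

```python
from documents.tasks import train_classifier

# trigger_source rides along as a message header and is read by the
# signal handlers (next commit) instead of a scheduled=/auto= parameter.
train_classifier.apply_async(
    headers={"trigger_source": "manual"},
)
```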
Trenton H
bc561135e5 test(tasks): add traceback and revoked state coverage to signal tests 2026-04-14 16:19:23 -07:00
Trenton H
c306a4ffe7 feat(tasks): rewrite signal handlers to track all task types
Replace the old consume_file-only handler with a full rewrite that tracks
6 task types (consume_file, train_classifier, sanity_check, index_optimize,
llm_index, mail_fetch) with proper trigger source detection, input data
extraction, legacy result string parsing, duration/wait time recording,
and structured error capture on failure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 15:41:45 -07:00
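A condensed, hedged sketch of the handler shape; the field and enum names match the model shown in the migration further down, everything else is an assumption (the real handlers also extract input data, parse legacy result strings, and record duration/wait times and failures):

```python
from celery import signals
from django.utils import timezone

from documents.models import PaperlessTask

# Dotted task paths are hypothetical; the six tracked types are from the
# commit message.
TRACKED = {
    "documents.tasks.consume_file": PaperlessTask.TaskType.CONSUME_FILE,
    "documents.tasks.train_classifier": PaperlessTask.TaskType.TRAIN_CLASSIFIER,
    "documents.tasks.sanity_check": PaperlessTask.TaskType.SANITY_CHECK,
    "documents.tasks.index_optimize": PaperlessTask.TaskType.INDEX_OPTIMIZE,
    "documents.tasks.llm_index": PaperlessTask.TaskType.LLM_INDEX,
    "paperless_mail.tasks.process_mail_accounts": PaperlessTask.TaskType.MAIL_FETCH,
}


@signals.before_task_publish.connect
def task_queued(sender=None, headers=None, **kwargs):
    if sender not in TRACKED:
        return
    PaperlessTask.objects.create(
        task_id=headers["id"],
        task_type=TRACKED[sender],
        # Custom apply_async headers are merged into the message headers;
        # fall back when no trigger_source was supplied.
        trigger_source=headers.get("trigger_source")
        or PaperlessTask.TriggerSource.SYSTEM,
    )


@signals.task_prerun.connect
def task_started(task_id=None, **kwargs):
    PaperlessTask.objects.filter(task_id=task_id).update(
        status=PaperlessTask.Status.STARTED,
        date_started=timezone.now(),
    )
```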
Trenton H
0780b7e506 feat(tasks): replace PaperlessTask model with structured redesign
Drop the old string-based PaperlessTask table and recreate it with
Status/TaskType/TriggerSource enums, JSONField result storage, and
duration tracking fields. Update all call sites to use the new API.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 15:29:38 -07:00
Trenton H
17c13c1a03 Breaking: Remove the positional arguments from the pre/post consume scripts. Environment has been preferred for quite some time now (#12573) 2026-04-14 21:18:20 +00:00
dependabot[bot]
1876e38540 Chore(deps): Bump follow-redirects (#12575)
Bumps the npm_and_yarn group with 1 update in the /src-ui directory: [follow-redirects](https://github.com/follow-redirects/follow-redirects).


Updates `follow-redirects` from 1.15.11 to 1.16.0
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.11...v1.16.0)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-version: 1.16.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-14 20:24:03 +00:00
dependabot[bot]
b51c4572b1 Chore(deps): Bump pillow in the uv group across 1 directory (#12574)
Bumps the uv group with 1 update in the / directory: [pillow](https://github.com/python-pillow/Pillow).


Updates `pillow` from 12.1.1 to 12.2.0
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](https://github.com/python-pillow/Pillow/compare/12.1.1...12.2.0)

---
updated-dependencies:
- dependency-name: pillow
  dependency-version: 12.2.0
  dependency-type: indirect
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-14 10:38:23 -07:00
dependabot[bot]
a5bb3b9f8e Chore(deps-dev): Bump pytest in the uv group across 1 directory (#12568)
Bumps the uv group with 1 update in the / directory: [pytest](https://github.com/pytest-dev/pytest).


Updates `pytest` from 9.0.2 to 9.0.3
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/9.0.2...9.0.3)

---
updated-dependencies:
- dependency-name: pytest
  dependency-version: 9.0.3
  dependency-type: direct:development
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-14 08:42:39 -07:00
GitHub Actions
3ed7297939 Auto translate strings 2026-04-13 21:14:40 +00:00
Trenton H
3b6edcdd8e Chore: Add generic type params and update our baselines (#12566)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 14:12:59 -07:00
GitHub Actions
b27d10646e Auto translate strings 2026-04-13 20:12:04 +00:00
Trenton H
8c1225e120 Fixes an N+1 query in matching with the version content fetching by prefetching versions (#12562) 2026-04-13 13:10:28 -07:00
Trenton H
54d5269145 Fix: Use an iterator in the sanity checking (#12563) 2026-04-13 12:32:22 -07:00
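Presumably something like the following; the chunk size and the per-document helper are assumptions:

```python
from documents.models import Document

# .iterator() streams rows from the database in chunks instead of
# caching the entire queryset in memory while each document is checked.
for document in Document.objects.all().iterator(chunk_size=1000):
    check_document(document)  # hypothetical per-document check
```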
Trenton H
f5729811fe Chore: Upgrades Django manually, since dependabot is failing. Resolves security alerts (#12567) 2026-04-13 10:20:35 -07:00
shamoon
fdd5e3ecb2 Update SECURITY.md 2026-04-10 12:34:47 -07:00
shamoon
df3b656352 Add tests 2026-04-10 12:06:28 -07:00
shamoon
51e721733f Enhancement: validate and sanitize uploaded logos (#12551) 2026-04-10 11:50:58 -07:00
65 changed files with 10207 additions and 10303 deletions

View File

@@ -165,6 +165,7 @@ jobs:
contents: read
env:
DEFAULT_PYTHON: "3.12"
PAPERLESS_SECRET_KEY: "ci-typing-not-a-real-secret"
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

View File

@@ -88,6 +88,7 @@ jobs:
uv export --quiet --no-dev --all-extras --format requirements-txt --output-file requirements.txt
- name: Compile messages
env:
PAPERLESS_SECRET_KEY: "ci-release-not-a-real-secret"
PYTHON_VERSION: ${{ steps.setup-python.outputs.python-version }}
run: |
cd src/
@@ -96,6 +97,7 @@ jobs:
manage.py compilemessages
- name: Collect static files
env:
PAPERLESS_SECRET_KEY: "ci-release-not-a-real-secret"
PYTHON_VERSION: ${{ steps.setup-python.outputs.python-version }}
run: |
cd src/

View File

@@ -36,6 +36,8 @@ jobs:
--group dev \
--frozen
- name: Generate backend translation strings
env:
PAPERLESS_SECRET_KEY: "ci-translate-not-a-real-secret"
run: cd src/ && uv run manage.py makemessages -l en_US -i "samples*"
- name: Install pnpm
uses: pnpm/action-setup@fc06bc1257f339d1d5d8b3a19a8cae5388b55320 # v5.0.0

.gitignore vendored
View File

@@ -79,6 +79,7 @@ virtualenv
/docker-compose.env
/docker-compose.yml
.ruff_cache/
.mypy_cache/
# Used for development
scripts/import-for-development
@@ -111,4 +112,6 @@ celerybeat-schedule*
# ignore pnpm package store folder created when setting up the devcontainer
.pnpm-store/
# Git worktree local folder
.worktrees

File diff suppressed because it is too large

File diff suppressed because one or more lines are too long

View File

@@ -2,8 +2,83 @@
## Reporting a Vulnerability
The Paperless-ngx team and community take security bugs seriously. We appreciate your efforts to responsibly disclose your findings, and will make every effort to acknowledge your contributions.
The Paperless-ngx team and community take security issues seriously. We appreciate good-faith reports and will make every effort to review legitimate findings responsibly.
To report a security issue, please use the GitHub Security Advisory ["Report a Vulnerability"](https://github.com/paperless-ngx/paperless-ngx/security/advisories/new) tab.
The team will send a response indicating the next steps in handling your report. After the initial reply to your report, the security team will keep you informed of the progress towards a fix and full announcement, and may ask for additional information or guidance.
After the initial reply to your report, the team may ask for additional information, reproduction steps, affected versions, configuration details, or proof-of-concept material needed to verify the issue.
## What makes a helpful report
Please include as much of the following as possible:
- A clear description of the issue and why it is a security vulnerability.
- Affected Paperless-ngx version(s).
- Required configuration, permissions, or preconditions.
- Step-by-step reproduction instructions.
- A realistic impact statement.
- Logs, screenshots, requests, or a proof of concept where appropriate.
- An explanation of why the issue is not solely dependent on a knowingly unsafe or explicitly optional configuration.
Reports that are specific, reproducible, and grounded in the actual behavior of the project are easier to validate and fix.
## AI-assisted submissions
AI tools can be useful for security research, but they can also produce plausible-sounding reports that do not reflect actual vulnerabilities in Paperless-ngx.
If you use AI tools to help draft or investigate a report, please disclose this and verify the result yourself before submission. Make sure the report:
- identifies a real code path in this project
- includes a real reproduction, not just a theoretical concern
- demonstrates realistic impact
- accounts for existing permission checks, configuration requirements, and trust boundaries
- is about Paperless-ngx itself, rather than a hypothetical deployment, downstream customization, third-party app, scanner output, reverse proxy behavior, or external service unless you can show a direct product vulnerability
Submitting unverified AI-generated claims creates unnecessary review work and slows down handling of legitimate reports.
<!-- Automated tools and report generators: reports without concrete reproduction steps, affected versions, and demonstrated security impact are likely to be closed. -->
## Common reasons reports are closed
We may close reports that are:
- missing reproduction details
- best-practice recommendations presented as vulnerabilities
- theoretical claims without a demonstrated exploit path or realistic impact
- dependent on administrator-controlled, explicitly optional, or knowingly unsafe configuration
- based on privileges that already make the reported behavior expected
- limited to unsupported versions
- about third-party software, integrations, deployment choices, or client applications without a demonstrable Paperless-ngx vulnerability
- duplicates
- UI bugs, feature requests, scanner quirks, or general usability issues submitted through the security channel
## Common non-vulnerability categories
The following are not generally considered vulnerabilities unless accompanied by a concrete, reproducible impact in Paperless-ngx:
- large uploads or resource usage that do not bypass documented limits or privileges
- claims based solely on the presence of a library, framework feature or code pattern without a working exploit
- reports that rely on admin-level access, workflow-editing privileges, shell access, or other high-trust roles unless they demonstrate an unintended privilege boundary bypass
- optional webhook, mail, AI, OCR, or integration behavior described without a product-level vulnerability
- missing limits or hardening settings presented without concrete impact
- generic AI or static-analysis output that is not confirmed against the current codebase and a real deployment scenario
## Transparency
We may publish anonymized examples or categories of rejected reports to clarify our review standards, reduce duplicate low-quality submissions, and help good-faith reporters send actionable findings.
A mistaken report made in good faith is not misconduct. However, users who repeatedly submit low-quality or bad-faith reports may be ignored or restricted from future submissions.
## Scope and expectations
Please use the security reporting channel only for security vulnerabilities in Paperless-ngx.
Please do not use the security advisory system for:
- support questions
- general bug reports
- feature requests
- browser compatibility issues
- issues in third-party mobile apps, reverse proxies, or deployment tooling unless you can demonstrate a Paperless-ngx vulnerability
The team will review reports as time permits, but submission does not guarantee that a report is valid, in scope, or will result in a fix. Reports that do not describe a reproducible product-level issue may be closed without extended back-and-forth.

View File

@@ -241,3 +241,66 @@ For example:
}
}
```
## Consume Script Positional Arguments Removed
Pre- and post-consumption scripts no longer receive positional arguments. All information is
now passed exclusively via environment variables, which have been available since earlier versions.
### Pre-consumption script
Previously, the original file path was passed as `$1`. It is now only available as
`DOCUMENT_SOURCE_PATH`.
**Before:**
```bash
#!/usr/bin/env bash
# $1 was the original file path
process_document "$1"
```
**After:**
```bash
#!/usr/bin/env bash
process_document "${DOCUMENT_SOURCE_PATH}"
```
### Post-consumption script
Previously, document metadata was passed as positional arguments `$1` through `$8`:
| Argument | Environment Variable Equivalent |
| -------- | ------------------------------- |
| `$1` | `DOCUMENT_ID` |
| `$2` | `DOCUMENT_FILE_NAME` |
| `$3` | `DOCUMENT_SOURCE_PATH` |
| `$4` | `DOCUMENT_THUMBNAIL_PATH` |
| `$5` | `DOCUMENT_DOWNLOAD_URL` |
| `$6` | `DOCUMENT_THUMBNAIL_URL` |
| `$7` | `DOCUMENT_CORRESPONDENT` |
| `$8` | `DOCUMENT_TAGS` |
**Before:**
```bash
#!/usr/bin/env bash
DOCUMENT_ID=$1
CORRESPONDENT=$7
TAGS=$8
```
**After:**
```bash
#!/usr/bin/env bash
# Use environment variables directly
echo "Document ${DOCUMENT_ID} from ${DOCUMENT_CORRESPONDENT} tagged: ${DOCUMENT_TAGS}"
```
### Action Required
Update any pre- or post-consumption scripts that read `$1`, `$2`, etc. to use the
corresponding environment variables instead. Environment variables have been the preferred
option since v1.8.0.

View File

@@ -24,11 +24,10 @@ dependencies = [
"dateparser~=1.2",
# WARNING: django does not use semver.
# Only patch versions are guaranteed to not introduce breaking changes.
"django~=5.2.10",
"django~=5.2.13",
"django-allauth[mfa,socialaccount]~=65.15.0",
"django-auditlog~=3.4.1",
"django-cachalot~=2.9.0",
"django-celery-results~=2.6.0",
"django-compression-middleware~=0.5.0",
"django-cors-headers~=4.9.0",
"django-extensions~=4.1",
@@ -113,7 +112,7 @@ testing = [
"factory-boy~=3.3.1",
"faker~=40.12.0",
"imagehash",
"pytest~=9.0.0",
"pytest~=9.0.3",
"pytest-cov~=7.1.0",
"pytest-django~=4.12.0",
"pytest-env~=1.6.0",
@@ -312,6 +311,7 @@ markers = [
"date_parsing: Tests which cover date parsing from content or filename",
"management: Tests which cover management commands/functionality",
"search: Tests for the Tantivy search backend",
"api: Tests for REST API endpoints",
]
[tool.pytest_env]

src-ui/pnpm-lock.yaml generated
View File

@@ -4363,8 +4363,8 @@ packages:
flatted@3.4.2:
resolution: {integrity: sha512-PjDse7RzhcPkIJwy5t7KPWQSZ9cAbzQXcafsetQoD7sOJRQlGikNbx7yZp2OotDnJyrDcbyRq3Ttb18iYOqkxA==}
follow-redirects@1.15.11:
resolution: {integrity: sha512-deG2P0JfjrTxl50XGCDyfI97ZGVCxIpfKYmfyrQ54n5FO/0gfIES8C/Psl6kWVDolizcaaxZJnTS0QSMxvnsBQ==}
follow-redirects@1.16.0:
resolution: {integrity: sha512-y5rN/uOsadFT/JfYwhxRS5R7Qce+g3zG97+JrtFZlC9klX/W5hD7iiLzScI4nZqUS7DNUdhPgw4xI8W2LuXlUw==}
engines: {node: '>=4.0'}
peerDependencies:
debug: '*'
@@ -11427,7 +11427,7 @@ snapshots:
flatted@3.4.2: {}
follow-redirects@1.15.11(debug@4.4.3):
follow-redirects@1.16.0(debug@4.4.3):
optionalDependencies:
debug: 4.4.3
@@ -11634,7 +11634,7 @@ snapshots:
http-proxy@1.18.1(debug@4.4.3):
dependencies:
eventemitter3: 4.0.7
follow-redirects: 1.15.11(debug@4.4.3)
follow-redirects: 1.16.0(debug@4.4.3)
requires-port: 1.0.0
transitivePeerDependencies:
- debug

View File

@@ -76,33 +76,27 @@
<label class="form-check-label" for="task{{task.id}}"></label>
</div>
</td>
<td class="overflow-auto name-col">{{ task.task_file_name }}</td>
<td class="overflow-auto name-col">{{ task.input_data?.filename }}</td>
<td class="d-none d-lg-table-cell">{{ task.date_created | customDate:'short' }}</td>
@if (activeTab !== 'started' && activeTab !== 'queued') {
<td class="d-none d-lg-table-cell">
@if (task.result?.length > 50) {
@if (task.result_message?.length > 50) {
<div class="result" (click)="expandTask(task); $event.stopPropagation();"
[ngbPopover]="resultPopover" popoverClass="shadow small mobile" triggers="mouseenter:mouseleave" container="body">
<span class="small d-none d-md-inline-block font-monospace text-muted">{{ task.result | slice:0:50 }}&hellip;</span>
<span class="small d-none d-md-inline-block font-monospace text-muted">{{ task.result_message | slice:0:50 }}&hellip;</span>
</div>
}
@if (task.result?.length <= 50) {
<span class="small d-none d-md-inline-block font-monospace text-muted">{{ task.result }}</span>
@if (task.result_message?.length <= 50) {
<span class="small d-none d-md-inline-block font-monospace text-muted">{{ task.result_message }}</span>
}
<ng-template #resultPopover>
<pre class="small mb-0">{{ task.result | slice:0:300 }}@if (task.result.length > 300) {
<pre class="small mb-0">{{ task.result_message | slice:0:300 }}@if (task.result_message.length > 300) {
&hellip;
}</pre>
@if (task.result?.length > 300) {
@if (task.result_message?.length > 300) {
<br/><em>(<ng-container i18n>click for full output</ng-container>)</em>
}
</ng-template>
@if (task.duplicate_documents?.length > 0) {
<div class="small text-warning-emphasis d-flex align-items-center gap-1">
<i-bs class="lh-1" width="1em" height="1em" name="exclamation-triangle"></i-bs>
<span i18n>Duplicate(s) detected</span>
</div>
}
</td>
}
<td class="d-lg-none">
@@ -116,7 +110,7 @@
<i-bs name="check" class="me-1"></i-bs><ng-container i18n>Dismiss</ng-container>
</button>
<ng-container *pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.Document }">
@if (task.related_document) {
@if (task.related_document_ids?.[0]) {
<button class="btn btn-sm btn-outline-primary" (click)="dismissAndGo(task); $event.stopPropagation();">
<i-bs name="file-text" class="me-1"></i-bs><ng-container i18n>Open Document</ng-container>
</button>
@@ -127,7 +121,7 @@
</tr>
<tr>
<td class="p-0" [class.border-0]="expandedTask !== task.id" colspan="5">
<pre #collapse="ngbCollapse" [ngbCollapse]="expandedTask !== task.id" class="small mb-0"><div class="small p-1 p-lg-3 ms-lg-3">{{ task.result }}</div></pre>
<pre #collapse="ngbCollapse" [ngbCollapse]="expandedTask !== task.id" class="small mb-0"><div class="small p-1 p-lg-3 ms-lg-3">{{ task.result_message }}</div></pre>
</td>
</tr>
}

View File

@@ -20,8 +20,8 @@ import { throwError } from 'rxjs'
import { routes } from 'src/app/app-routing.module'
import {
PaperlessTask,
PaperlessTaskName,
PaperlessTaskStatus,
PaperlessTaskTriggerSource,
PaperlessTaskType,
} from 'src/app/data/paperless-task'
import { IfPermissionsDirective } from 'src/app/directives/if-permissions.directive'
@@ -39,81 +39,100 @@ const tasks: PaperlessTask[] = [
{
id: 467,
task_id: '11ca1a5b-9f81-442c-b2c8-7e4ae53657f1',
task_file_name: 'test.pdf',
input_data: { filename: 'test.pdf' },
date_created: new Date('2023-03-01T10:26:03.093116Z'),
date_done: new Date('2023-03-01T10:26:07.223048Z'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Failed,
result: 'test.pd: Not consuming test.pdf: It is a duplicate of test (#100)',
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
trigger_source_display: 'Folder Consume',
status: PaperlessTaskStatus.Failure,
status_display: 'Failure',
result_message:
'test.pd: Not consuming test.pdf: It is a duplicate of test (#100)',
acknowledged: false,
related_document: null,
related_document_ids: [],
},
{
id: 466,
task_id: '10ca1a5b-3c08-442c-b2c8-7e4ae53657f1',
task_file_name: '191092.pdf',
input_data: { filename: '191092.pdf' },
date_created: new Date('2023-03-01T09:26:03.093116Z'),
date_done: new Date('2023-03-01T09:26:07.223048Z'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Failed,
result:
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
trigger_source_display: 'Folder Consume',
status: PaperlessTaskStatus.Failure,
status_display: 'Failure',
result_message:
'191092.pd: Not consuming 191092.pdf: It is a duplicate of 191092 (#311)',
acknowledged: false,
related_document: null,
related_document_ids: [],
},
{
id: 465,
task_id: '3612d477-bb04-44e3-985b-ac580dd496d8',
task_file_name: 'Scan Jun 6, 2023 at 3.19 PM.pdf',
input_data: { filename: 'Scan Jun 6, 2023 at 3.19 PM.pdf' },
date_created: new Date('2023-06-06T15:22:05.722323-07:00'),
date_done: new Date('2023-06-06T15:22:14.564305-07:00'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
trigger_source_display: 'Folder Consume',
status: PaperlessTaskStatus.Pending,
result: null,
status_display: 'Pending',
result_message: null,
acknowledged: false,
related_document: null,
related_document_ids: [],
},
{
id: 464,
task_id: '2eac4716-2aa6-4dcd-9953-264e11656d7e',
task_file_name: 'paperless-mail-l4dkg8ir',
input_data: { filename: 'paperless-mail-l4dkg8ir' },
date_created: new Date('2023-06-04T11:24:32.898089-07:00'),
date_done: new Date('2023-06-04T11:24:44.678605-07:00'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Complete,
result: 'Success. New document id 422 created',
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.EmailConsume,
trigger_source_display: 'Email Consume',
status: PaperlessTaskStatus.Success,
status_display: 'Success',
result_message: 'Success. New document id 422 created',
acknowledged: false,
related_document: 422,
related_document_ids: [422],
},
{
id: 463,
task_id: '28125528-1575-4d6b-99e6-168906e8fa5c',
task_file_name: 'onlinePaymentSummary.pdf',
input_data: { filename: 'onlinePaymentSummary.pdf' },
date_created: new Date('2023-06-01T13:49:51.631305-07:00'),
date_done: new Date('2023-06-01T13:49:54.190220-07:00'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Complete,
result: 'Success. New document id 421 created',
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
trigger_source_display: 'Folder Consume',
status: PaperlessTaskStatus.Success,
status_display: 'Success',
result_message: 'Success. New document id 421 created',
acknowledged: false,
related_document: 421,
related_document_ids: [421],
},
{
id: 462,
task_id: 'a5b9ca47-0c8e-490f-a04c-6db5d5fc09e5',
task_file_name: 'paperless-mail-_rrpmqk6',
input_data: { filename: 'paperless-mail-_rrpmqk6' },
date_created: new Date('2023-06-07T02:54:35.694916Z'),
date_done: null,
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.EmailConsume,
trigger_source_display: 'Email Consume',
status: PaperlessTaskStatus.Started,
result: null,
status_display: 'Started',
result_message: null,
acknowledged: false,
related_document: null,
related_document_ids: [],
},
]
@@ -167,7 +186,7 @@ describe('TasksComponent', () => {
fixture.detectChanges()
httpTestingController
.expectOne(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
`${environment.apiBaseUrl}tasks/?task_type=consume_file&acknowledged=false`
)
.flush(tasks)
})
@@ -176,7 +195,7 @@ describe('TasksComponent', () => {
const tabButtons = fixture.debugElement.queryAll(By.directive(NgbNavItem))
let currentTasksLength = tasks.filter(
(t) => t.status === PaperlessTaskStatus.Failed
(t) => t.status === PaperlessTaskStatus.Failure
).length
component.activeTab = TaskTab.Failed
fixture.detectChanges()
@@ -188,7 +207,7 @@ describe('TasksComponent', () => {
).toHaveLength(currentTasksLength + 1)
currentTasksLength = tasks.filter(
(t) => t.status === PaperlessTaskStatus.Complete
(t) => t.status === PaperlessTaskStatus.Success
).length
component.activeTab = TaskTab.Completed
fixture.detectChanges()
@@ -308,7 +327,7 @@ describe('TasksComponent', () => {
expect(component.selectedTasks).toEqual(
new Set(
tasks
.filter((t) => t.status === PaperlessTaskStatus.Failed)
.filter((t) => t.status === PaperlessTaskStatus.Failure)
.map((t) => t.id)
)
)
@@ -322,7 +341,7 @@ describe('TasksComponent', () => {
component.dismissAndGo(tasks[3])
expect(routerSpy).toHaveBeenCalledWith([
'documents',
tasks[3].related_document,
tasks[3].related_document_ids?.[0],
])
})

View File

@@ -175,7 +175,7 @@ export class TasksComponent
dismissAndGo(task: PaperlessTask) {
this.dismissTask(task)
this.router.navigate(['documents', task.related_document])
this.router.navigate(['documents', task.related_document_ids?.[0]])
}
expandTask(task: PaperlessTask) {
@@ -207,11 +207,13 @@ export class TasksComponent
if (this._filterText.length) {
tasks = tasks.filter((t) => {
if (this.filterTargetID == TaskFilterTargetID.Name) {
return t.task_file_name
.toLowerCase()
return (t.input_data?.filename as string)
?.toLowerCase()
.includes(this._filterText.toLowerCase())
} else if (this.filterTargetID == TaskFilterTargetID.Result) {
return t.result.toLowerCase().includes(this._filterText.toLowerCase())
return t.result_message
?.toLowerCase()
.includes(this._filterText.toLowerCase())
}
})
}

View File

@@ -169,10 +169,10 @@
}
</button>
@if (currentUserIsSuperUser) {
@if (isRunning(PaperlessTaskName.IndexOptimize)) {
@if (isRunning(PaperlessTaskType.IndexOptimize)) {
<div class="spinner-border spinner-border-sm ms-2" role="status"></div>
} @else {
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskName.IndexOptimize)">
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskType.IndexOptimize)">
<i-bs name="play-fill" class="me-1"></i-bs>
<ng-container i18n>Run Task</ng-container>
</button>
@@ -203,10 +203,10 @@
}
</button>
@if (currentUserIsSuperUser) {
@if (isRunning(PaperlessTaskName.TrainClassifier)) {
@if (isRunning(PaperlessTaskType.TrainClassifier)) {
<div class="spinner-border spinner-border-sm ms-2" role="status"></div>
} @else {
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskName.TrainClassifier)">
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskType.TrainClassifier)">
<i-bs name="play-fill" class="me-1"></i-bs>
<ng-container i18n>Run Task</ng-container>
</button>
@@ -237,10 +237,10 @@
}
</button>
@if (currentUserIsSuperUser) {
@if (isRunning(PaperlessTaskName.SanityCheck)) {
@if (isRunning(PaperlessTaskType.SanityCheck)) {
<div class="spinner-border spinner-border-sm ms-2" role="status"></div>
} @else {
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskName.SanityCheck)">
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskType.SanityCheck)">
<i-bs name="play-fill" class="me-1"></i-bs>
<ng-container i18n>Run Task</ng-container>
</button>
@@ -285,10 +285,10 @@
}
</button>
@if (currentUserIsSuperUser) {
@if (isRunning(PaperlessTaskName.LLMIndexUpdate)) {
@if (isRunning(PaperlessTaskType.LlmIndex)) {
<div class="spinner-border spinner-border-sm ms-2" role="status"></div>
} @else {
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskName.LLMIndexUpdate)">
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskType.LlmIndex)">
<i-bs name="play-fill" class="me-1"></i-bs>
<ng-container i18n>Run Task</ng-container>
</button>

View File

@@ -25,7 +25,7 @@ import {
import { NgbActiveModal } from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule, allIcons } from 'ngx-bootstrap-icons'
import { Subject, of, throwError } from 'rxjs'
import { PaperlessTaskName } from 'src/app/data/paperless-task'
import { PaperlessTaskType } from 'src/app/data/paperless-task'
import {
InstallType,
SystemStatus,
@@ -138,9 +138,9 @@ describe('SystemStatusDialogComponent', () => {
})
it('should check if task is running', () => {
component.runTask(PaperlessTaskName.IndexOptimize)
expect(component.isRunning(PaperlessTaskName.IndexOptimize)).toBeTruthy()
expect(component.isRunning(PaperlessTaskName.SanityCheck)).toBeFalsy()
component.runTask(PaperlessTaskType.IndexOptimize)
expect(component.isRunning(PaperlessTaskType.IndexOptimize)).toBeTruthy()
expect(component.isRunning(PaperlessTaskType.SanityCheck)).toBeFalsy()
})
it('should support running tasks, refresh status and show toasts', () => {
@@ -151,22 +151,22 @@ describe('SystemStatusDialogComponent', () => {
// fail first
runSpy.mockReturnValue(throwError(() => new Error('error')))
component.runTask(PaperlessTaskName.IndexOptimize)
expect(runSpy).toHaveBeenCalledWith(PaperlessTaskName.IndexOptimize)
component.runTask(PaperlessTaskType.IndexOptimize)
expect(runSpy).toHaveBeenCalledWith(PaperlessTaskType.IndexOptimize)
expect(toastErrorSpy).toHaveBeenCalledWith(
`Failed to start task ${PaperlessTaskName.IndexOptimize}, see the logs for more details`,
`Failed to start task ${PaperlessTaskType.IndexOptimize}, see the logs for more details`,
expect.any(Error)
)
// succeed
runSpy.mockReturnValue(of({}))
getStatusSpy.mockReturnValue(of(status))
component.runTask(PaperlessTaskName.IndexOptimize)
expect(runSpy).toHaveBeenCalledWith(PaperlessTaskName.IndexOptimize)
component.runTask(PaperlessTaskType.IndexOptimize)
expect(runSpy).toHaveBeenCalledWith(PaperlessTaskType.IndexOptimize)
expect(getStatusSpy).toHaveBeenCalled()
expect(toastSpy).toHaveBeenCalledWith(
`Task ${PaperlessTaskName.IndexOptimize} started`
`Task ${PaperlessTaskType.IndexOptimize} started`
)
})

View File

@@ -8,7 +8,7 @@ import {
} from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { Subject, takeUntil } from 'rxjs'
import { PaperlessTaskName } from 'src/app/data/paperless-task'
import { PaperlessTaskType } from 'src/app/data/paperless-task'
import {
SystemStatus,
SystemStatusItemStatus,
@@ -49,14 +49,14 @@ export class SystemStatusDialogComponent implements OnInit, OnDestroy {
private settingsService = inject(SettingsService)
public SystemStatusItemStatus = SystemStatusItemStatus
public PaperlessTaskName = PaperlessTaskName
public PaperlessTaskType = PaperlessTaskType
public status: SystemStatus
public frontendVersion: string = environment.version
public versionMismatch: boolean = false
public copied: boolean = false
private runningTasks: Set<PaperlessTaskName> = new Set()
private runningTasks: Set<PaperlessTaskType> = new Set()
private unsubscribeNotifier: Subject<any> = new Subject()
get currentUserIsSuperUser(): boolean {
@@ -107,11 +107,11 @@ export class SystemStatusDialogComponent implements OnInit, OnDestroy {
return now.getTime() - date.getTime() > hours * 60 * 60 * 1000
}
public isRunning(taskName: PaperlessTaskName): boolean {
public isRunning(taskName: PaperlessTaskType): boolean {
return this.runningTasks.has(taskName)
}
public runTask(taskName: PaperlessTaskName) {
public runTask(taskName: PaperlessTaskType) {
this.runningTasks.add(taskName)
this.toastService.showInfo(`Task ${taskName} started`)
this.tasksService.run(taskName).subscribe({

View File

@@ -43,7 +43,7 @@
</div>
<p class="card-text">
@if (document) {
@if (hasSearchHighlights) {
@if (document.__search_hit__ && document.__search_hit__.highlights) {
<span [innerHtml]="document.__search_hit__.highlights"></span>
}
@for (highlight of searchNoteHighlights; track highlight) {
@@ -52,7 +52,7 @@
<span [innerHtml]="highlight"></span>
</span>
}
@if (shouldShowContentFallback) {
@if (!document.__search_hit__?.score) {
<span class="result-content">{{contentTrimmed}}</span>
}
} @else {

View File

@@ -127,19 +127,6 @@ describe('DocumentCardLargeComponent', () => {
expect(component.searchNoteHighlights).toContain('<span>bananas</span>')
})
it('should fall back to document content when a search hit has no highlights', () => {
component.document.__search_hit__ = {
score: 0.9,
rank: 1,
highlights: '',
note_highlights: null,
}
fixture.detectChanges()
expect(fixture.nativeElement.textContent).toContain('Cupcake ipsum')
expect(component.shouldShowContentFallback).toBe(true)
})
it('should try to close the preview on mouse leave', () => {
component.popupPreview = {
close: jest.fn(),

View File

@@ -164,17 +164,6 @@ export class DocumentCardLargeComponent
)
}
get hasSearchHighlights() {
return Boolean(this.document?.__search_hit__?.highlights?.trim()?.length)
}
get shouldShowContentFallback() {
return (
this.document?.__search_hit__?.score == null ||
(!this.hasSearchHighlights && this.searchNoteHighlights.length === 0)
)
}
get notesEnabled(): boolean {
return this.settingsService.get(SETTINGS_KEYS.NOTES_ENABLED)
}

View File

@@ -1,49 +1,63 @@
import { Document } from './document'
import { ObjectWithId } from './object-with-id'
export enum PaperlessTaskType {
Auto = 'auto_task',
ScheduledTask = 'scheduled_task',
ManualTask = 'manual_task',
}
export enum PaperlessTaskName {
ConsumeFile = 'consume_file',
TrainClassifier = 'train_classifier',
SanityCheck = 'check_sanity',
SanityCheck = 'sanity_check',
IndexOptimize = 'index_optimize',
LLMIndexUpdate = 'llmindex_update',
IndexRebuild = 'index_rebuild',
MailFetch = 'mail_fetch',
LlmIndex = 'llm_index',
}
export enum PaperlessTaskTriggerSource {
Scheduled = 'scheduled',
WebUI = 'web_ui',
ApiUpload = 'api_upload',
FolderConsume = 'folder_consume',
EmailConsume = 'email_consume',
System = 'system',
Manual = 'manual',
}
export enum PaperlessTaskStatus {
Pending = 'PENDING',
Started = 'STARTED',
Complete = 'SUCCESS',
Failed = 'FAILURE',
Pending = 'pending',
Started = 'started',
Success = 'success',
Failure = 'failure',
Revoked = 'revoked',
}
export interface PaperlessTask extends ObjectWithId {
type: PaperlessTaskType
status: PaperlessTaskStatus
acknowledged: boolean
task_id: string
task_file_name: string
task_name: PaperlessTaskName
task_type: PaperlessTaskType
task_type_display: string
trigger_source: PaperlessTaskTriggerSource
trigger_source_display: string
status: PaperlessTaskStatus
status_display: string
date_created: Date
date_started?: Date
date_done?: Date
result?: string
related_document?: number
duplicate_documents?: Document[]
duration_seconds?: number
wait_time_seconds?: number
input_data: Record<string, unknown>
result_data?: Record<string, unknown>
result_message?: string
related_document_ids: number[]
acknowledged: boolean
owner?: number
}
export interface PaperlessTaskSummary {
task_type: PaperlessTaskType
total_count: number
pending_count: number
success_count: number
failure_count: number
avg_duration_seconds: number | null
avg_wait_time_seconds: number | null
last_run: Date | null
last_success: Date | null
last_failure: Date | null
}

View File

@@ -5,11 +5,7 @@ import {
} from '@angular/common/http/testing'
import { TestBed } from '@angular/core/testing'
import { environment } from 'src/environments/environment'
import {
PaperlessTaskName,
PaperlessTaskStatus,
PaperlessTaskType,
} from '../data/paperless-task'
import { PaperlessTaskStatus, PaperlessTaskType } from '../data/paperless-task'
import { TasksService } from './tasks.service'
describe('TasksService', () => {
@@ -37,7 +33,7 @@ describe('TasksService', () => {
it('calls tasks api endpoint on reload', () => {
tasksService.reload()
const req = httpTestingController.expectOne(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
`${environment.apiBaseUrl}tasks/?task_type=consume_file&acknowledged=false`
)
expect(req.request.method).toEqual('GET')
})
@@ -46,7 +42,7 @@ describe('TasksService', () => {
tasksService.loading = true
tasksService.reload()
httpTestingController.expectNone(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
`${environment.apiBaseUrl}tasks/?task_type=consume_file&acknowledged=false`
)
})
@@ -63,7 +59,7 @@ describe('TasksService', () => {
// reload is then called
httpTestingController
.expectOne(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
`${environment.apiBaseUrl}tasks/?task_type=consume_file&acknowledged=false`
)
.flush([])
})
@@ -72,56 +68,56 @@ describe('TasksService', () => {
expect(tasksService.total).toEqual(0)
const mockTasks = [
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Complete,
task_type: PaperlessTaskType.ConsumeFile,
status: PaperlessTaskStatus.Success,
acknowledged: false,
task_id: '1234',
task_file_name: 'file1.pdf',
input_data: { filename: 'file1.pdf' },
date_created: new Date(),
related_document_ids: [],
},
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Failed,
task_type: PaperlessTaskType.ConsumeFile,
status: PaperlessTaskStatus.Failure,
acknowledged: false,
task_id: '1235',
task_file_name: 'file2.pdf',
input_data: { filename: 'file2.pdf' },
date_created: new Date(),
related_document_ids: [],
},
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
task_type: PaperlessTaskType.ConsumeFile,
status: PaperlessTaskStatus.Pending,
acknowledged: false,
task_id: '1236',
task_file_name: 'file3.pdf',
input_data: { filename: 'file3.pdf' },
date_created: new Date(),
related_document_ids: [],
},
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
task_type: PaperlessTaskType.ConsumeFile,
status: PaperlessTaskStatus.Started,
acknowledged: false,
task_id: '1237',
task_file_name: 'file4.pdf',
input_data: { filename: 'file4.pdf' },
date_created: new Date(),
related_document_ids: [],
},
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Complete,
task_type: PaperlessTaskType.ConsumeFile,
status: PaperlessTaskStatus.Success,
acknowledged: false,
task_id: '1238',
task_file_name: 'file5.pdf',
input_data: { filename: 'file5.pdf' },
date_created: new Date(),
related_document_ids: [],
},
]
tasksService.reload()
const req = httpTestingController.expectOne(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
`${environment.apiBaseUrl}tasks/?task_type=consume_file&acknowledged=false`
)
req.flush(mockTasks)
@@ -134,9 +130,9 @@ describe('TasksService', () => {
})
it('supports running tasks', () => {
tasksService.run(PaperlessTaskName.SanityCheck).subscribe((res) => {
tasksService.run(PaperlessTaskType.SanityCheck).subscribe((res) => {
expect(res).toEqual({
result: 'success',
task_id: 'abc-123',
})
})
const req = httpTestingController.expectOne(
@@ -144,7 +140,7 @@ describe('TasksService', () => {
)
expect(req.request.method).toEqual('POST')
req.flush({
result: 'success',
task_id: 'abc-123',
})
})
})

View File

@@ -4,8 +4,8 @@ import { Observable, Subject } from 'rxjs'
import { first, takeUntil, tap } from 'rxjs/operators'
import {
PaperlessTask,
PaperlessTaskName,
PaperlessTaskStatus,
PaperlessTaskType,
} from 'src/app/data/paperless-task'
import { environment } from 'src/environments/environment'
@@ -18,7 +18,7 @@ export class TasksService {
private baseUrl: string = environment.apiBaseUrl
private endpoint: string = 'tasks'
public loading: boolean
public loading: boolean = false
private fileTasks: PaperlessTask[] = []
@@ -33,21 +33,27 @@ export class TasksService {
}
public get queuedFileTasks(): PaperlessTask[] {
return this.fileTasks.filter((t) => t.status == PaperlessTaskStatus.Pending)
return this.fileTasks.filter(
(t) => t.status === PaperlessTaskStatus.Pending
)
}
public get startedFileTasks(): PaperlessTask[] {
return this.fileTasks.filter((t) => t.status == PaperlessTaskStatus.Started)
return this.fileTasks.filter(
(t) => t.status === PaperlessTaskStatus.Started
)
}
public get completedFileTasks(): PaperlessTask[] {
return this.fileTasks.filter(
(t) => t.status == PaperlessTaskStatus.Complete
(t) => t.status === PaperlessTaskStatus.Success
)
}
public get failedFileTasks(): PaperlessTask[] {
return this.fileTasks.filter((t) => t.status == PaperlessTaskStatus.Failed)
return this.fileTasks.filter(
(t) => t.status === PaperlessTaskStatus.Failure
)
}
public reload() {
@@ -56,18 +62,16 @@ export class TasksService {
this.http
.get<PaperlessTask[]>(
`${this.baseUrl}${this.endpoint}/?task_name=consume_file&acknowledged=false`
`${this.baseUrl}${this.endpoint}/?task_type=${PaperlessTaskType.ConsumeFile}&acknowledged=false`
)
.pipe(takeUntil(this.unsubscribeNotifer), first())
.subscribe((r) => {
this.fileTasks = r.filter(
(t) => t.task_name == PaperlessTaskName.ConsumeFile
)
this.fileTasks = r
this.loading = false
})
}
public dismissTasks(task_ids: Set<number>) {
public dismissTasks(task_ids: Set<number>): Observable<any> {
return this.http
.post(`${this.baseUrl}tasks/acknowledge/`, {
tasks: [...task_ids],
@@ -81,16 +85,24 @@ export class TasksService {
)
}
public dismissAllTasks(): Observable<any> {
return this.http.post(`${this.baseUrl}tasks/acknowledge_all/`, {}).pipe(
first(),
takeUntil(this.unsubscribeNotifer),
tap(() => {
this.reload()
})
)
}
public cancelPending(): void {
this.unsubscribeNotifer.next(true)
}
public run(taskName: PaperlessTaskName): Observable<any> {
return this.http.post<any>(
public run(taskType: PaperlessTaskType): Observable<{ task_id: string }> {
return this.http.post<{ task_id: string }>(
`${environment.apiBaseUrl}${this.endpoint}/run/`,
{
task_name: taskName,
}
{ task_type: taskType }
)
}
}

View File

@@ -144,18 +144,30 @@ class StoragePathAdmin(GuardedModelAdmin):
class TaskAdmin(admin.ModelAdmin):
list_display = ("task_id", "task_file_name", "task_name", "date_done", "status")
list_filter = ("status", "date_done", "task_name")
search_fields = ("task_name", "task_id", "status", "task_file_name")
list_display = (
"task_id",
"task_type",
"trigger_source",
"status",
"date_created",
"date_done",
"duration_seconds",
)
list_filter = ("status", "task_type", "trigger_source", "date_done")
search_fields = ("task_id", "task_type", "status")
readonly_fields = (
"task_id",
"task_file_name",
"task_name",
"task_type",
"trigger_source",
"status",
"date_created",
"date_started",
"date_done",
"result",
"duration_seconds",
"wait_time_seconds",
"input_data",
"result_data",
"result_message",
)

View File

@@ -313,7 +313,6 @@ class ConsumerPlugin(
run_subprocess(
[
settings.PRE_CONSUME_SCRIPT,
original_file_path,
],
script_env,
self.log,
@@ -383,14 +382,6 @@ class ConsumerPlugin(
run_subprocess(
[
settings.POST_CONSUME_SCRIPT,
str(document.pk),
document.get_public_filename(),
os.path.normpath(document.source_path),
os.path.normpath(document.thumbnail_path),
reverse("document-download", kwargs={"pk": document.pk}),
reverse("document-thumb", kwargs={"pk": document.pk}),
str(document.correspondent),
str(",".join(document.tags.all().values_list("name", flat=True))),
],
script_env,
self.log,
@@ -650,6 +641,10 @@ class ConsumerPlugin(
# If we get here, it was successful. Proceed with post-consume
# hooks. If they fail, nothing will get changed.
document = Document.objects.prefetch_related("versions").get(
pk=document.pk,
)
document_consumption_finished.send(
sender=self.__class__,
document=document,

View File

@@ -26,8 +26,10 @@ from django.db.models.functions import Cast
from django.utils.translation import gettext_lazy as _
from django_filters import DateFilter
from django_filters.rest_framework import BooleanFilter
from django_filters.rest_framework import DateTimeFilter
from django_filters.rest_framework import Filter
from django_filters.rest_framework import FilterSet
from django_filters.rest_framework import MultipleChoiceFilter
from drf_spectacular.utils import extend_schema_field
from guardian.utils import get_group_obj_perms_model
from guardian.utils import get_user_obj_perms_model
@@ -862,18 +864,56 @@ class ShareLinkBundleFilterSet(FilterSet):
class PaperlessTaskFilterSet(FilterSet):
task_type = MultipleChoiceFilter(
choices=PaperlessTask.TaskType.choices,
label="Task Type",
)
trigger_source = MultipleChoiceFilter(
choices=PaperlessTask.TriggerSource.choices,
label="Trigger Source",
)
status = MultipleChoiceFilter(
choices=PaperlessTask.Status.choices,
label="Status",
)
is_complete = BooleanFilter(
method="filter_is_complete",
label="Is Complete",
)
acknowledged = BooleanFilter(
label="Acknowledged",
field_name="acknowledged",
)
date_created_after = DateTimeFilter(
field_name="date_created",
lookup_expr="gte",
label="Created After",
)
date_created_before = DateTimeFilter(
field_name="date_created",
lookup_expr="lte",
label="Created Before",
)
class Meta:
model = PaperlessTask
fields = {
"type": ["exact"],
"task_name": ["exact"],
"status": ["exact"],
}
fields = ["task_type", "trigger_source", "status", "acknowledged", "owner"]
def filter_is_complete(self, queryset, name, value):
complete = [
PaperlessTask.Status.SUCCESS,
PaperlessTask.Status.FAILURE,
PaperlessTask.Status.REVOKED,
]
if value:
return queryset.filter(status__in=complete)
return queryset.exclude(status__in=complete)
class ObjectOwnedOrGrantedPermissionsFilter(ObjectPermissionsFilter):

View File

@@ -22,7 +22,6 @@ class Command(PaperlessCommand):
self.buffered_logging("paperless.classifier"),
):
train_classifier(
scheduled=False,
status_callback=lambda msg: self.console.print(f" {msg}"),
)

View File

@@ -17,7 +17,6 @@ class Command(PaperlessCommand):
def handle(self, *args: Any, **options: Any) -> None:
llmindex_index(
rebuild=options["command"] == "rebuild",
scheduled=False,
iter_wrapper=lambda docs: self.track(
docs,
description="Indexing documents...",

View File

@@ -111,7 +111,6 @@ class Command(PaperlessCommand):
def handle(self, *args: Any, **options: Any) -> None:
messages = check_sanity(
scheduled=False,
iter_wrapper=lambda docs: self.track(
docs,
description="Checking documents...",

View File

@@ -0,0 +1,213 @@
"""
Drop and recreate the PaperlessTask table with the new structured schema.
We intentionally drop all existing task data -- the old schema was
string-based and incompatible with the new JSONField result storage.
"""
import django.db.models.deletion
import django.utils.timezone
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0018_saved_view_simple_search_rules"),
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
]
operations = [
migrations.DeleteModel(name="PaperlessTask"),
migrations.CreateModel(
name="PaperlessTask",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"owner",
models.ForeignKey(
blank=True,
default=None,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
(
"task_id",
models.CharField(
help_text="Celery task ID",
max_length=255,
unique=True,
verbose_name="Task ID",
),
),
(
"task_type",
models.CharField(
choices=[
("consume_file", "Consume File"),
("train_classifier", "Train Classifier"),
("sanity_check", "Sanity Check"),
("index_optimize", "Index Optimize"),
("index_rebuild", "Index Rebuild"),
("mail_fetch", "Mail Fetch"),
("llm_index", "LLM Index"),
],
db_index=True,
help_text="The kind of work being performed",
max_length=50,
verbose_name="Task Type",
),
),
(
"trigger_source",
models.CharField(
choices=[
("scheduled", "Scheduled"),
("web_ui", "Web UI"),
("api_upload", "API Upload"),
("folder_consume", "Folder Consume"),
("email_consume", "Email Consume"),
("system", "System"),
("manual", "Manual"),
],
db_index=True,
help_text="What initiated this task",
max_length=50,
verbose_name="Trigger Source",
),
),
(
"status",
models.CharField(
choices=[
("pending", "Pending"),
("started", "Started"),
("success", "Success"),
("failure", "Failure"),
("revoked", "Revoked"),
],
db_index=True,
default="pending",
max_length=30,
verbose_name="Status",
),
),
(
"date_created",
models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
verbose_name="Created",
),
),
(
"date_started",
models.DateTimeField(
blank=True,
null=True,
verbose_name="Started",
),
),
(
"date_done",
models.DateTimeField(
blank=True,
db_index=True,
null=True,
verbose_name="Completed",
),
),
(
"duration_seconds",
models.FloatField(
blank=True,
help_text="Elapsed time from start to completion",
null=True,
verbose_name="Duration (seconds)",
),
),
(
"wait_time_seconds",
models.FloatField(
blank=True,
help_text="Time from task creation to worker pickup",
null=True,
verbose_name="Wait Time (seconds)",
),
),
(
"input_data",
models.JSONField(
blank=True,
default=dict,
help_text="Structured input parameters for the task",
verbose_name="Input Data",
),
),
(
"result_data",
models.JSONField(
blank=True,
help_text="Structured result data from task execution",
null=True,
verbose_name="Result Data",
),
),
(
"result_message",
models.TextField(
blank=True,
help_text="Human-readable result message",
null=True,
verbose_name="Result Message",
),
),
(
"acknowledged",
models.BooleanField(
db_index=True,
default=False,
verbose_name="Acknowledged",
),
),
],
options={
"verbose_name": "Task",
"verbose_name_plural": "Tasks",
"ordering": ["-date_created"],
},
),
migrations.AddIndex(
model_name="paperlesstask",
index=models.Index(
fields=["status", "date_created"],
name="documents_p_status_8aa687_idx",
),
),
migrations.AddIndex(
model_name="paperlesstask",
index=models.Index(
fields=["task_type", "status"],
name="documents_p_task_ty_e4a93f_idx",
),
),
migrations.AddIndex(
model_name="paperlesstask",
index=models.Index(
fields=["owner", "acknowledged", "date_created"],
name="documents_p_owner_i_62c545_idx",
),
),
]

View File

@@ -0,0 +1,22 @@
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "0019_task_system_redesign"),
]
operations = [
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_celery_results_taskresult;",
reverse_sql=migrations.RunSQL.noop,
),
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_celery_results_groupresult;",
reverse_sql=migrations.RunSQL.noop,
),
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_celery_results_chordcounter;",
reverse_sql=migrations.RunSQL.noop,
),
]

View File

@@ -3,7 +3,6 @@ from pathlib import Path
from typing import Final
import pathvalidate
from celery import states
from django.conf import settings
from django.contrib.auth.models import Group
from django.contrib.auth.models import User
@@ -381,7 +380,10 @@ class Document(SoftDeleteModel, ModelWithOwner): # type: ignore[django-manager-
if isinstance(prefetched_cache, dict)
else None
)
if prefetched_versions:
if prefetched_versions is not None:
# Empty list means prefetch ran and found no versions — use own content.
if not prefetched_versions:
return self.content
latest_prefetched = max(prefetched_versions, key=lambda doc: doc.id)
return latest_prefetched.content
@@ -660,97 +662,170 @@ class UiSettings(models.Model):
class PaperlessTask(ModelWithOwner):
ALL_STATES = sorted(states.ALL_STATES)
TASK_STATE_CHOICES = sorted(zip(ALL_STATES, ALL_STATES))
"""
Tracks background task execution for user visibility and debugging.
State transitions:
PENDING -> STARTED -> SUCCESS
PENDING -> STARTED -> FAILURE
PENDING -> REVOKED (if cancelled before starting)
"""
class Status(models.TextChoices):
PENDING = "pending", _("Pending")
STARTED = "started", _("Started")
SUCCESS = "success", _("Success")
FAILURE = "failure", _("Failure")
REVOKED = "revoked", _("Revoked")
class TaskType(models.TextChoices):
AUTO = ("auto_task", _("Auto Task"))
SCHEDULED_TASK = ("scheduled_task", _("Scheduled Task"))
MANUAL_TASK = ("manual_task", _("Manual Task"))
CONSUME_FILE = "consume_file", _("Consume File")
TRAIN_CLASSIFIER = "train_classifier", _("Train Classifier")
SANITY_CHECK = "sanity_check", _("Sanity Check")
INDEX_OPTIMIZE = "index_optimize", _("Index Optimize")
INDEX_REBUILD = "index_rebuild", _("Index Rebuild")
MAIL_FETCH = "mail_fetch", _("Mail Fetch")
LLM_INDEX = "llm_index", _("LLM Index")
class TaskName(models.TextChoices):
CONSUME_FILE = ("consume_file", _("Consume File"))
TRAIN_CLASSIFIER = ("train_classifier", _("Train Classifier"))
CHECK_SANITY = ("check_sanity", _("Check Sanity"))
INDEX_OPTIMIZE = ("index_optimize", _("Index Optimize"))
LLMINDEX_UPDATE = ("llmindex_update", _("LLM Index Update"))
class TriggerSource(models.TextChoices):
SCHEDULED = "scheduled", _("Scheduled") # Celery beat
WEB_UI = "web_ui", _("Web UI") # Document uploaded via web
API_UPLOAD = "api_upload", _("API Upload") # Document uploaded via API
FOLDER_CONSUME = "folder_consume", _("Folder Consume") # Consume folder
EMAIL_CONSUME = "email_consume", _("Email Consume") # Email attachment
SYSTEM = (
"system",
_("System"),
) # Auto-triggered by system (self-heal, config side-effect)
MANUAL = "manual", _("Manual") # User explicitly ran via /api/tasks/run/
# Identification
task_id = models.CharField(
max_length=255,
unique=True,
verbose_name=_("Task ID"),
help_text=_("Celery ID for the Task that was run"),
help_text=_("Celery task ID"),
)
acknowledged = models.BooleanField(
default=False,
verbose_name=_("Acknowledged"),
help_text=_("If the task is acknowledged via the frontend or API"),
task_type = models.CharField(
max_length=50,
choices=TaskType.choices,
verbose_name=_("Task Type"),
help_text=_("The kind of work being performed"),
db_index=True,
)
task_file_name = models.CharField(
null=True,
max_length=255,
verbose_name=_("Task Filename"),
help_text=_("Name of the file which the Task was run for"),
)
task_name = models.CharField(
null=True,
max_length=255,
choices=TaskName.choices,
verbose_name=_("Task Name"),
help_text=_("Name of the task that was run"),
trigger_source = models.CharField(
max_length=50,
choices=TriggerSource.choices,
verbose_name=_("Trigger Source"),
help_text=_("What initiated this task"),
db_index=True,
)
# State tracking
status = models.CharField(
max_length=30,
default=states.PENDING,
choices=TASK_STATE_CHOICES,
verbose_name=_("Task State"),
help_text=_("Current state of the task being run"),
choices=Status.choices,
default=Status.PENDING,
verbose_name=_("Status"),
db_index=True,
)
# Timestamps
date_created = models.DateTimeField(
null=True,
default=timezone.now,
verbose_name=_("Created DateTime"),
help_text=_("Datetime field when the task result was created in UTC"),
verbose_name=_("Created"),
db_index=True,
)
date_started = models.DateTimeField(
null=True,
default=None,
verbose_name=_("Started DateTime"),
help_text=_("Datetime field when the task was started in UTC"),
blank=True,
verbose_name=_("Started"),
)
date_done = models.DateTimeField(
null=True,
default=None,
verbose_name=_("Completed DateTime"),
help_text=_("Datetime field when the task was completed in UTC"),
blank=True,
verbose_name=_("Completed"),
db_index=True,
)
result = models.TextField(
# Duration fields -- populated by task_postrun signal handler
duration_seconds = models.FloatField(
null=True,
default=None,
verbose_name=_("Result Data"),
help_text=_(
"The data returned by the task",
),
blank=True,
verbose_name=_("Duration (seconds)"),
help_text=_("Elapsed time from start to completion"),
)
type = models.CharField(
max_length=30,
choices=TaskType.choices,
default=TaskType.AUTO,
verbose_name=_("Task Type"),
help_text=_("The type of task that was run"),
wait_time_seconds = models.FloatField(
null=True,
blank=True,
verbose_name=_("Wait Time (seconds)"),
help_text=_("Time from task creation to worker pickup"),
)
# Input/Output data
input_data = models.JSONField(
default=dict,
blank=True,
verbose_name=_("Input Data"),
help_text=_("Structured input parameters for the task"),
)
result_data = models.JSONField(
null=True,
blank=True,
verbose_name=_("Result Data"),
help_text=_("Structured result data from task execution"),
)
result_message = models.TextField(
null=True,
blank=True,
verbose_name=_("Result Message"),
help_text=_("Human-readable result message"),
)
# Acknowledgment
acknowledged = models.BooleanField(
default=False,
verbose_name=_("Acknowledged"),
db_index=True,
)
class Meta:
verbose_name = _("Task")
verbose_name_plural = _("Tasks")
ordering = ["-date_created"]
indexes = [
models.Index(fields=["status", "date_created"]),
models.Index(fields=["task_type", "status"]),
models.Index(fields=["owner", "acknowledged", "date_created"]),
]
def __str__(self) -> str:
return f"Task {self.task_id}"
return f"{self.get_task_type_display()} [{self.task_id[:8]}]"
@property
def is_complete(self) -> bool:
return self.status in (
self.Status.SUCCESS,
self.Status.FAILURE,
self.Status.REVOKED,
)
@property
def related_document_ids(self) -> list[int]:
if not self.result_data:
return []
if doc_id := self.result_data.get("document_id"):
return [doc_id]
if dup_id := self.result_data.get("duplicate_of"):
return [dup_id]
return []
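The redesigned model leaves state transitions to the Celery signal handlers; a hedged sketch of how calling code might query it (user is a placeholder User instance):

from documents.models import PaperlessTask

# Unacknowledged consume failures for one user, newest first
# (Meta.ordering = ["-date_created"] applies by default):
failed = PaperlessTask.objects.filter(
    owner=user,
    task_type=PaperlessTask.TaskType.CONSUME_FILE,
    status=PaperlessTask.Status.FAILURE,
    acknowledged=False,
)
for task in failed:
    print(task, task.result_message, task.related_document_ids)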
class Note(SoftDeleteModel):

View File

@@ -10,7 +10,6 @@ is an identity function that adds no overhead.
"""
import logging
import uuid
from collections import defaultdict
from collections.abc import Iterator
from pathlib import Path
@@ -18,12 +17,9 @@ from typing import TYPE_CHECKING
from typing import Final
from typing import TypedDict
from celery import states
from django.conf import settings
from django.utils import timezone
from documents.models import Document
from documents.models import PaperlessTask
from documents.utils import IterWrapper
from documents.utils import compute_checksum
from documents.utils import identity
@@ -182,8 +178,9 @@ def _check_thumbnail(
present_files: set[Path],
) -> None:
"""Verify the thumbnail exists and is readable."""
thumbnail_path: Final[Path] = Path(doc.thumbnail_path).resolve()
if not thumbnail_path.exists() or not thumbnail_path.is_file():
# doc.thumbnail_path already returns a resolved Path; no need to re-resolve.
thumbnail_path: Final[Path] = doc.thumbnail_path
if not thumbnail_path.is_file():
messages.error(doc.pk, "Thumbnail of document does not exist.")
return
@@ -200,8 +197,9 @@ def _check_original(
present_files: set[Path],
) -> None:
"""Verify the original file exists, is readable, and has matching checksum."""
source_path: Final[Path] = Path(doc.source_path).resolve()
if not source_path.exists() or not source_path.is_file():
# doc.source_path already returns a resolved Path; no need to re-resolve.
source_path: Final[Path] = doc.source_path
if not source_path.is_file():
messages.error(doc.pk, "Original of document does not exist.")
return
@@ -237,8 +235,9 @@ def _check_archive(
elif doc.has_archive_version:
if TYPE_CHECKING:
assert isinstance(doc.archive_path, Path)
archive_path: Final[Path] = Path(doc.archive_path).resolve()
if not archive_path.exists() or not archive_path.is_file():
# doc.archive_path already returns a resolved Path; no need to re-resolve.
archive_path: Final[Path] = doc.archive_path # type: ignore[assignment]
if not archive_path.is_file():
messages.error(doc.pk, "Archived version of document does not exist.")
return
@@ -284,59 +283,33 @@ def _check_document(
def check_sanity(
*,
scheduled: bool = True,
iter_wrapper: IterWrapper[Document] = identity,
) -> SanityCheckMessages:
"""Run a full sanity check on the document archive.
Args:
scheduled: Whether this is a scheduled (automatic) or manual check.
Controls the task type recorded in the database.
iter_wrapper: A callable that wraps the document iterable, e.g.,
for progress bar display. Defaults to identity (no wrapping).
Returns:
A SanityCheckMessages instance containing all detected issues.
"""
paperless_task = PaperlessTask.objects.create(
task_id=uuid.uuid4(),
type=(
PaperlessTask.TaskType.SCHEDULED_TASK
if scheduled
else PaperlessTask.TaskType.MANUAL_TASK
),
task_name=PaperlessTask.TaskName.CHECK_SANITY,
status=states.STARTED,
date_created=timezone.now(),
date_started=timezone.now(),
)
messages = SanityCheckMessages()
present_files = _build_present_files()
documents = Document.global_objects.all()
documents = Document.global_objects.only(
"pk",
"filename",
"mime_type",
"checksum",
"archive_checksum",
"archive_filename",
"content",
).iterator(chunk_size=500)
for doc in iter_wrapper(documents):
_check_document(doc, messages, present_files)
for extra_file in present_files:
messages.warning(None, f"Orphaned file in media dir: {extra_file}")
paperless_task.status = states.SUCCESS if not messages.has_error else states.FAILURE
if messages.total_issue_count == 0:
paperless_task.result = "No issues found."
else:
parts: list[str] = []
if messages.document_error_count:
parts.append(f"{messages.document_error_count} document(s) with errors")
if messages.document_warning_count:
parts.append(f"{messages.document_warning_count} document(s) with warnings")
if messages.global_warning_count:
parts.append(f"{messages.global_warning_count} global warning(s)")
paperless_task.result = ", ".join(parts) + " found."
if messages.has_error:
paperless_task.result += " Check logs for details."
paperless_task.date_done = timezone.now()
paperless_task.save(update_fields=["status", "result", "date_done"])
return messages
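With the DB bookkeeping removed and the queryset streamed via .only().iterator(), a manual run stays light on memory; a hedged usage sketch (module path assumed, tqdm as an example iter_wrapper):

from tqdm import tqdm

from documents.sanity_checker import check_sanity  # module path assumed

messages = check_sanity(iter_wrapper=tqdm)  # progress bar over documents
if messages.has_error:
    raise SystemExit("Sanity check found errors; see logs for details.")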

View File

@@ -1,6 +1,6 @@
from documents.search._backend import SearchHit
from documents.search._backend import SearchIndexLockError
from documents.search._backend import SearchMode
from documents.search._backend import SearchResults
from documents.search._backend import TantivyBackend
from documents.search._backend import TantivyRelevanceList
from documents.search._backend import WriteBatch
@@ -10,9 +10,9 @@ from documents.search._schema import needs_rebuild
from documents.search._schema import wipe_index
__all__ = [
"SearchHit",
"SearchIndexLockError",
"SearchMode",
"SearchResults",
"TantivyBackend",
"TantivyRelevanceList",
"WriteBatch",

View File

@@ -1,13 +1,12 @@
from __future__ import annotations
import logging
import re
import threading
from collections import Counter
from dataclasses import dataclass
from datetime import UTC
from datetime import datetime
from enum import StrEnum
from html import escape
from typing import TYPE_CHECKING
from typing import Self
from typing import TypedDict
@@ -55,36 +54,6 @@ class SearchMode(StrEnum):
TITLE = "title"
def _render_snippet_html(snippet: tantivy.Snippet) -> str:
fragment = snippet.fragment()
highlighted = sorted(snippet.highlighted(), key=lambda r: r.start)
if not highlighted:
return escape(fragment)
parts: list[str] = []
cursor = 0
fragment_len = len(fragment)
for highlight in highlighted:
start = max(0, min(fragment_len, highlight.start))
end = max(start, min(fragment_len, highlight.end))
if end <= cursor:
continue
if start > cursor:
parts.append(escape(fragment[cursor:start]))
parts.append(f'<span class="match">{escape(fragment[start:end])}</span>')
cursor = end
if cursor < fragment_len:
parts.append(escape(fragment[cursor:]))
return "".join(parts)
def _extract_autocomplete_words(text_sources: list[str]) -> set[str]:
"""Extract and normalize words for autocomplete.
@@ -119,63 +88,45 @@ class SearchHit(TypedDict):
highlights: dict[str, str]
@dataclass(frozen=True, slots=True)
class SearchResults:
"""
Container for search results with pagination metadata.
Attributes:
hits: List of search results with scores and highlights
total: Total matching documents across all pages (for pagination)
query: Preprocessed query string after date/syntax rewriting
"""
hits: list[SearchHit]
total: int # total matching documents (for pagination)
query: str # preprocessed query string
class TantivyRelevanceList:
"""
DRF-compatible list wrapper for Tantivy search results.
DRF-compatible list wrapper for Tantivy search hits.
Holds a lightweight ordered list of IDs (for pagination count and
``selection_data``) together with a small page of rich ``SearchHit``
dicts (for serialization). DRF's ``PageNumberPagination`` calls
``__len__`` to compute the total page count and ``__getitem__`` to
slice the displayed page.
Provides paginated access to search results while storing all hits in memory
for efficient ID retrieval. Used by Django REST framework for pagination.
Args:
ordered_ids: All matching document IDs in display order.
page_hits: Rich SearchHit dicts for the requested DRF page only.
page_offset: Index into *ordered_ids* where *page_hits* starts.
Methods:
__len__: Returns total hit count for pagination calculations
__getitem__: Slices the hit list for page-specific results
Note: Stores ALL post-filter hits so get_all_result_ids() can return
every matching document ID without requiring a second search query.
"""
def __init__(
self,
ordered_ids: list[int],
page_hits: list[SearchHit],
page_offset: int = 0,
) -> None:
self._ordered_ids = ordered_ids
self._page_hits = page_hits
self._page_offset = page_offset
def __init__(self, hits: list[SearchHit]) -> None:
self._hits = hits
def __len__(self) -> int:
return len(self._ordered_ids)
return len(self._hits)
def __getitem__(self, key: int | slice) -> SearchHit | list[SearchHit]:
if isinstance(key, int):
idx = key if key >= 0 else len(self._ordered_ids) + key
if self._page_offset <= idx < self._page_offset + len(self._page_hits):
return self._page_hits[idx - self._page_offset]
return SearchHit(
id=self._ordered_ids[key],
score=0.0,
rank=idx + 1,
highlights={},
)
start = key.start or 0
stop = key.stop or len(self._ordered_ids)
# DRF slices to extract the current page. If the slice aligns
# with our pre-fetched page_hits, return them directly.
# We only check start — DRF always slices with stop=start+page_size,
# which exceeds page_hits length on the last page.
if start == self._page_offset:
return self._page_hits[: stop - start]
# Fallback: return stub dicts (no highlights).
return [
SearchHit(id=doc_id, score=0.0, rank=start + i + 1, highlights={})
for i, doc_id in enumerate(self._ordered_ids[key])
]
def get_all_ids(self) -> list[int]:
"""Return all matching document IDs in display order."""
return self._ordered_ids
def __getitem__(self, key: slice) -> list[SearchHit]:
return self._hits[key]
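A sketch of the contract DRF relies on, under the new three-argument constructor (IDs and hits invented for illustration):

page_hits = [SearchHit(id=7, score=0.91, rank=3, highlights={})]
results = TantivyRelevanceList(
    ordered_ids=[3, 9, 7, 5],  # every matching ID, in display order
    page_hits=page_hits,       # rich dicts for this DRF page only
    page_offset=2,             # page_hits starts at ordered_ids[2]
)
assert len(results) == 4              # paginator's total count
assert results[2:4] == page_hits      # aligned slice returns rich hits
assert results.get_all_ids() == [3, 9, 7, 5]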
class SearchIndexLockError(Exception):
@@ -255,13 +206,10 @@ class WriteBatch:
"""
Remove a document from the batch by its primary key.
Uses range_query instead of term_query to work around a tantivy-py bug
where Python integers are inferred as i64, producing Terms that never
match u64 fields.
TODO: Replace with term_query("id", doc_id) once
https://github.com/quickwit-oss/tantivy-py/pull/642 lands.
Uses range query instead of term query to work around an unsigned integer
type detection bug in tantivy-py 0.25.
"""
# Use range query to work around u64 deletion bug
self._writer.delete_documents_by_query(
tantivy.Query.range_query(
self._backend._schema,
@@ -286,34 +234,6 @@ class TantivyBackend:
the underlying index directory changes (e.g., during test isolation).
"""
# Maps DRF ordering field names to Tantivy index field names.
SORT_FIELD_MAP: dict[str, str] = {
"title": "title_sort",
"correspondent__name": "correspondent_sort",
"document_type__name": "type_sort",
"created": "created",
"added": "added",
"modified": "modified",
"archive_serial_number": "asn",
"page_count": "page_count",
"num_notes": "num_notes",
}
# Fields where Tantivy's sort order matches the ORM's sort order.
# Text-based fields (title, correspondent__name, document_type__name)
# are excluded because Tantivy's tokenized fast fields produce different
# ordering than the ORM's collation-based ordering.
SORTABLE_FIELDS: frozenset[str] = frozenset(
{
"created",
"added",
"modified",
"archive_serial_number",
"page_count",
"num_notes",
},
)
def __init__(self, path: Path | None = None):
# path=None → in-memory index (for tests)
# path=some_dir → on-disk index (for production)
@@ -352,36 +272,6 @@ class TantivyBackend:
if self._index is None:
self.open() # pragma: no cover
def _parse_query(
self,
query: str,
search_mode: SearchMode,
) -> tantivy.Query:
"""Parse a user query string into a Tantivy Query object."""
tz = get_current_timezone()
if search_mode is SearchMode.TEXT:
return parse_simple_text_query(self._index, query)
elif search_mode is SearchMode.TITLE:
return parse_simple_title_query(self._index, query)
else:
return parse_user_query(self._index, query, tz)
def _apply_permission_filter(
self,
query: tantivy.Query,
user: AbstractBaseUser | None,
) -> tantivy.Query:
"""Wrap a query with a permission filter if the user is not a superuser."""
if user is not None:
permission_filter = build_permission_filter(self._schema, user)
return tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, query),
(tantivy.Occur.Must, permission_filter),
],
)
return query
def _build_tantivy_doc(
self,
document: Document,
@@ -436,17 +326,12 @@ class TantivyBackend:
doc.add_unsigned("tag_id", tag.pk)
tag_names.append(tag.name)
# Notes — JSON for structured queries (notes.user:alice, notes.note:text).
# notes_text is a plain-text companion for snippet/highlight generation;
# tantivy's SnippetGenerator does not support JSON fields.
# Notes — JSON for structured queries (notes.user:alice, notes.note:text),
# companion text field for default full-text search.
num_notes = 0
note_texts: list[str] = []
for note in document.notes.all():
num_notes += 1
doc.add_json("notes", {"note": note.note, "user": note.user.username})
note_texts.append(note.note)
if note_texts:
doc.add_text("notes_text", " ".join(note_texts))
# Custom fields — JSON for structured queries (custom_fields.name:x, custom_fields.value:y),
# companion text field for default full-text search.
@@ -540,127 +425,155 @@ class TantivyBackend:
with self.batch_update(lock_timeout=5.0) as batch:
batch.remove(doc_id)
def highlight_hits(
def search(
self,
query: str,
doc_ids: list[int],
user: AbstractBaseUser | None,
page: int,
page_size: int,
sort_field: str | None,
*,
sort_reverse: bool,
search_mode: SearchMode = SearchMode.QUERY,
rank_start: int = 1,
) -> list[SearchHit]:
) -> SearchResults:
"""
Generate SearchHit dicts with highlights for specific document IDs.
Execute a search query against the document index.
Unlike search(), this does not execute a ranked query — it looks up
each document by ID and generates snippets against the provided query.
Use this when you already know which documents to display (from
search_ids + ORM filtering) and just need highlight data.
Processes the user query through date rewriting, normalization, and
permission filtering before executing against Tantivy. Supports both
relevance-based and field-based sorting.
Note: Each doc_id requires an individual index lookup because tantivy-py
does not yet expose a batch fast-field read API. This is acceptable for
page-sized batches (typically 25 docs) but should not be called with
thousands of IDs.
TODO: When https://github.com/quickwit-oss/tantivy-py/pull/641 lands,
the per-doc range_query lookups here can be replaced with a single
collect_u64_fast_field("id", doc_addresses) call.
QUERY search mode supports natural date keywords, field filters, etc.
TITLE search mode treats the query as plain text matched against the title only.
TEXT search mode treats the query as plain text matched against title and content.
Args:
query: The search query (used for snippet generation)
doc_ids: Ordered list of document IDs to generate hits for
search_mode: Query parsing mode (for building the snippet query)
rank_start: Starting rank value (1-based absolute position in the
full result set; pass ``page_offset + 1`` for paginated calls)
query: User's search query
user: User for permission filtering (None for superuser/no filtering)
page: Page number (1-indexed) for pagination
page_size: Number of results per page
sort_field: Field to sort by (None for relevance ranking)
sort_reverse: Whether to reverse the sort order
search_mode: "query" for advanced Tantivy syntax, "text" for
plain-text search over title and content only, "title" for
plain-text search over title only
Returns:
List of SearchHit dicts in the same order as doc_ids
SearchResults with hits, total count, and processed query
"""
if not doc_ids:
return []
self._ensure_open()
user_query = self._parse_query(query, search_mode)
tz = get_current_timezone()
if search_mode is SearchMode.TEXT:
user_query = parse_simple_text_query(self._index, query)
elif search_mode is SearchMode.TITLE:
user_query = parse_simple_title_query(self._index, query)
else:
user_query = parse_user_query(self._index, query, tz)
# For notes_text snippet generation, we need a query that targets the
# notes_text field directly. user_query may contain JSON-field terms
# (e.g. notes.note:urgent) that the SnippetGenerator cannot resolve
# against a text field. Strip field:value prefixes so bare terms like
# "urgent" are re-parsed against notes_text, producing highlights even
# when the original query used structured syntax.
bare_query = re.sub(r"\w[\w.]*:", "", query).strip()
try:
notes_text_query = (
self._index.parse_query(bare_query, ["notes_text"])
if bare_query
else user_query
)
except Exception:
notes_text_query = user_query
searcher = self._index.searcher()
snippet_generator = None
notes_snippet_generator = None
hits: list[SearchHit] = []
for rank, doc_id in enumerate(doc_ids, start=rank_start):
# Look up document by ID, scoring against the user query so that
# the returned SearchHit carries a real BM25 relevance score.
id_query = tantivy.Query.range_query(
self._schema,
"id",
tantivy.FieldType.Unsigned,
doc_id,
doc_id,
)
scored_query = tantivy.Query.boolean_query(
# Apply permission filter if user is not None (not superuser)
if user is not None:
permission_filter = build_permission_filter(self._schema, user)
final_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, user_query),
(tantivy.Occur.Must, id_query),
(tantivy.Occur.Must, permission_filter),
],
)
results = searcher.search(scored_query, limit=1)
else:
final_query = user_query
if not results.hits:
continue
searcher = self._index.searcher()
offset = (page - 1) * page_size
score, doc_address = results.hits[0]
# Map sort fields
sort_field_map = {
"title": "title_sort",
"correspondent__name": "correspondent_sort",
"document_type__name": "type_sort",
"created": "created",
"added": "added",
"modified": "modified",
"archive_serial_number": "asn",
"page_count": "page_count",
"num_notes": "num_notes",
}
# Perform search
if sort_field and sort_field in sort_field_map:
mapped_field = sort_field_map[sort_field]
results = searcher.search(
final_query,
limit=offset + page_size,
order_by_field=mapped_field,
order=tantivy.Order.Desc if sort_reverse else tantivy.Order.Asc,
)
# Field sorting: hits are still (score, DocAddress) tuples; score unused
all_hits = [(hit[1], 0.0) for hit in results.hits]
else:
# Score-based search: hits are (score, DocAddress) tuples
results = searcher.search(final_query, limit=offset + page_size)
all_hits = [(hit[1], hit[0]) for hit in results.hits]
total = results.count
# Normalize scores for score-based searches
if not sort_field and all_hits:
max_score = max(hit[1] for hit in all_hits) or 1.0
all_hits = [(hit[0], hit[1] / max_score) for hit in all_hits]
# Apply threshold filter if configured (score-based search only)
threshold = settings.ADVANCED_FUZZY_SEARCH_THRESHOLD
if threshold is not None and not sort_field:
all_hits = [hit for hit in all_hits if hit[1] >= threshold]
# Get the page's hits
page_hits = all_hits[offset : offset + page_size]
# Build result hits with highlights
hits: list[SearchHit] = []
snippet_generator = None
notes_snippet_generator = None
for rank, (doc_address, score) in enumerate(page_hits, start=offset + 1):
# Get the actual document from the searcher using the doc address
actual_doc = searcher.doc(doc_address)
doc_dict = actual_doc.to_dict()
doc_id = doc_dict["id"][0]
highlights: dict[str, str] = {}
try:
if snippet_generator is None:
snippet_generator = tantivy.SnippetGenerator.create(
searcher,
user_query,
self._schema,
"content",
)
content_html = _render_snippet_html(
snippet_generator.snippet_from_doc(actual_doc),
)
if content_html:
highlights["content"] = content_html
if "notes_text" in doc_dict:
# Use notes_text (plain text) for snippet generation — tantivy's
# SnippetGenerator does not support JSON fields.
if notes_snippet_generator is None:
notes_snippet_generator = tantivy.SnippetGenerator.create(
# Generate highlights if score > 0
if score > 0:
try:
if snippet_generator is None:
snippet_generator = tantivy.SnippetGenerator.create(
searcher,
notes_text_query,
final_query,
self._schema,
"notes_text",
"content",
)
notes_html = _render_snippet_html(
notes_snippet_generator.snippet_from_doc(actual_doc),
)
if notes_html:
highlights["notes"] = notes_html
except Exception: # pragma: no cover
logger.debug("Failed to generate highlights for doc %s", doc_id)
content_snippet = snippet_generator.snippet_from_doc(actual_doc)
if content_snippet:
highlights["content"] = str(content_snippet)
# Try notes highlights
if "notes" in doc_dict:
if notes_snippet_generator is None:
notes_snippet_generator = tantivy.SnippetGenerator.create(
searcher,
final_query,
self._schema,
"notes",
)
notes_snippet = notes_snippet_generator.snippet_from_doc(
actual_doc,
)
if notes_snippet:
highlights["notes"] = str(notes_snippet)
except Exception: # pragma: no cover
logger.debug("Failed to generate highlights for doc %s", doc_id)
hits.append(
SearchHit(
@@ -671,69 +584,11 @@ class TantivyBackend:
),
)
return hits
def search_ids(
self,
query: str,
user: AbstractBaseUser | None,
*,
sort_field: str | None = None,
sort_reverse: bool = False,
search_mode: SearchMode = SearchMode.QUERY,
limit: int | None = None,
) -> list[int]:
"""
Return document IDs matching a query — no highlights or scores.
This is the lightweight companion to search(). Use it when you need the
full set of matching IDs (e.g. for ``selection_data``) but don't need
scores, ranks, or highlights.
Args:
query: User's search query
user: User for permission filtering (None for superuser/no filtering)
sort_field: Field to sort by (None for relevance ranking)
sort_reverse: Whether to reverse the sort order
search_mode: Query parsing mode (QUERY, TEXT, or TITLE)
limit: Maximum number of IDs to return (None = all matching docs)
Returns:
List of document IDs in the requested order
"""
self._ensure_open()
user_query = self._parse_query(query, search_mode)
final_query = self._apply_permission_filter(user_query, user)
searcher = self._index.searcher()
effective_limit = limit if limit is not None else searcher.num_docs
if sort_field and sort_field in self.SORT_FIELD_MAP:
mapped_field = self.SORT_FIELD_MAP[sort_field]
results = searcher.search(
final_query,
limit=effective_limit,
order_by_field=mapped_field,
order=tantivy.Order.Desc if sort_reverse else tantivy.Order.Asc,
)
all_hits = [(hit[1],) for hit in results.hits]
else:
results = searcher.search(final_query, limit=effective_limit)
all_hits = [(hit[1], hit[0]) for hit in results.hits]
# Normalize scores and apply threshold (relevance search only)
if all_hits:
max_score = max(hit[1] for hit in all_hits) or 1.0
all_hits = [(hit[0], hit[1] / max_score) for hit in all_hits]
threshold = settings.ADVANCED_FUZZY_SEARCH_THRESHOLD
if threshold is not None:
all_hits = [hit for hit in all_hits if hit[1] >= threshold]
# TODO: Replace with searcher.collect_u64_fast_field("id", addrs) once
# https://github.com/quickwit-oss/tantivy-py/pull/641 lands — eliminates
# one stored-doc fetch per result (~80% reduction in search_ids latency).
return [searcher.doc(doc_addr).to_dict()["id"][0] for doc_addr, *_ in all_hits]
return SearchResults(
hits=hits,
total=total,
query=query,
)
def autocomplete(
self,
@@ -748,10 +603,6 @@ class TantivyBackend:
frequency (how many documents contain each word). Optionally filters
results to only words from documents visible to the specified user.
NOTE: This is the hottest search path (called per keystroke).
A future improvement would be to cache results in Redis, keyed by
(prefix, user_id), and invalidate on index writes.
Args:
term: Prefix to match against autocomplete words
limit: Maximum number of suggestions to return
@@ -762,94 +613,64 @@ class TantivyBackend:
"""
self._ensure_open()
normalized_term = ascii_fold(term.lower())
if not normalized_term:
return []
searcher = self._index.searcher()
# Build a prefix query on autocomplete_word so we only scan docs
# containing words that start with the prefix, not the entire index.
# tantivy regex is implicitly anchored; .+ avoids the empty-match
# error that .* triggers. We OR with term_query to also match the
# exact prefix as a complete word.
escaped = re.escape(normalized_term)
prefix_query = tantivy.Query.boolean_query(
[
(
tantivy.Occur.Should,
tantivy.Query.term_query(
self._schema,
"autocomplete_word",
normalized_term,
),
),
(
tantivy.Occur.Should,
tantivy.Query.regex_query(
self._schema,
"autocomplete_word",
f"{escaped}.+",
),
),
],
)
# Intersect with permission filter so autocomplete words from
# invisible documents don't leak to other users.
# Apply permission filter for non-superusers so autocomplete words
# from invisible documents don't leak to other users.
if user is not None and not user.is_superuser:
final_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, prefix_query),
(tantivy.Occur.Must, build_permission_filter(self._schema, user)),
],
)
base_query = build_permission_filter(self._schema, user)
else:
final_query = prefix_query
base_query = tantivy.Query.all_query()
results = searcher.search(final_query, limit=searcher.num_docs)
results = searcher.search(base_query, limit=10000)
# Count how many visible documents each matching word appears in.
# Count how many visible documents each word appears in.
# Using Counter (not set) preserves per-word document frequency so
# we can rank suggestions by how commonly they occur — the same
# signal Whoosh used for Tf/Idf-based autocomplete ordering.
word_counts: Counter[str] = Counter()
for _score, doc_address in results.hits:
stored_doc = searcher.doc(doc_address)
doc_dict = stored_doc.to_dict()
if "autocomplete_word" in doc_dict:
for word in doc_dict["autocomplete_word"]:
if word.startswith(normalized_term):
word_counts[word] += 1
word_counts.update(doc_dict["autocomplete_word"])
# Sort by document frequency descending; break ties alphabetically.
# Filter to prefix matches, sort by document frequency descending;
# break ties alphabetically for stable, deterministic output.
matches = sorted(
word_counts,
(w for w in word_counts if w.startswith(normalized_term)),
key=lambda w: (-word_counts[w], w),
)
return matches[:limit]
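A hedged usage sketch (keyword names taken from the docstring; autocomplete calls _ensure_open itself, so no explicit setup is shown):

backend = TantivyBackend()  # path=None -> in-memory index, per __init__
suggestions = backend.autocomplete(term="inv", limit=10, user=None)
# e.g. ["invoice", "inventory"] -- ranked by document frequency,
# ties broken alphabetically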
def more_like_this_ids(
def more_like_this(
self,
doc_id: int,
user: AbstractBaseUser | None,
*,
limit: int | None = None,
) -> list[int]:
page: int,
page_size: int,
) -> SearchResults:
"""
Return IDs of documents similar to the given document — no highlights.
Find documents similar to the given document using content analysis.
Lightweight companion to more_like_this(). The original document is
excluded from results.
Uses Tantivy's "more like this" query to find documents with similar
content patterns. The original document is excluded from results.
Args:
doc_id: Primary key of the reference document
user: User for permission filtering (None for no filtering)
limit: Maximum number of IDs to return (None = all matching docs)
page: Page number (1-indexed) for pagination
page_size: Number of results per page
Returns:
List of similar document IDs (excluding the original)
SearchResults with similar documents (excluding the original)
"""
self._ensure_open()
searcher = self._index.searcher()
# First find the document address
id_query = tantivy.Query.range_query(
self._schema,
"id",
@@ -860,9 +681,13 @@ class TantivyBackend:
results = searcher.search(id_query, limit=1)
if not results.hits:
return []
# Document not found
return SearchResults(hits=[], total=0, query=f"more_like:{doc_id}")
# Extract doc_address from (score, doc_address) tuple
doc_address = results.hits[0][1]
# Build more like this query
mlt_query = tantivy.Query.more_like_this_query(
doc_address,
min_doc_frequency=1,
@@ -874,21 +699,59 @@ class TantivyBackend:
boost_factor=None,
)
final_query = self._apply_permission_filter(mlt_query, user)
# Apply permission filter
if user is not None:
permission_filter = build_permission_filter(self._schema, user)
final_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, mlt_query),
(tantivy.Occur.Must, permission_filter),
],
)
else:
final_query = mlt_query
effective_limit = limit if limit is not None else searcher.num_docs
# Fetch one extra to account for excluding the original document
results = searcher.search(final_query, limit=effective_limit + 1)
# Search
offset = (page - 1) * page_size
results = searcher.search(final_query, limit=offset + page_size)
# TODO: Replace with collect_u64_fast_field("id", addrs) once
# https://github.com/quickwit-oss/tantivy-py/pull/641 lands.
ids = []
for _score, doc_address in results.hits:
result_doc_id = searcher.doc(doc_address).to_dict()["id"][0]
if result_doc_id != doc_id:
ids.append(result_doc_id)
total = results.count
# Convert from (score, doc_address) to (doc_address, score)
all_hits = [(hit[1], hit[0]) for hit in results.hits]
return ids[:limit] if limit is not None else ids
# Normalize scores
if all_hits:
max_score = max(hit[1] for hit in all_hits) or 1.0
all_hits = [(hit[0], hit[1] / max_score) for hit in all_hits]
# Get page hits
page_hits = all_hits[offset : offset + page_size]
# Build results
hits: list[SearchHit] = []
for rank, (doc_address, score) in enumerate(page_hits, start=offset + 1):
actual_doc = searcher.doc(doc_address)
doc_dict = actual_doc.to_dict()
result_doc_id = doc_dict["id"][0]
# Skip the original document
if result_doc_id == doc_id:
continue
hits.append(
SearchHit(
id=result_doc_id,
score=score,
rank=rank,
highlights={}, # MLT doesn't generate highlights
),
)
return SearchResults(
hits=hits,
total=max(0, total - 1), # Subtract 1 for the original document
query=f"more_like:{doc_id}",
)
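The ID-only variant pairs naturally with ORM hydration; a sketch under the signature shown above:

similar_ids = backend.more_like_this_ids(42, user=None, limit=10)
# The original document is already excluded; hydrate via the ORM:
similar_docs = Document.objects.filter(pk__in=similar_ids)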
def batch_update(self, lock_timeout: float = 30.0) -> WriteBatch:
"""

View File

@@ -396,17 +396,10 @@ def build_permission_filter(
Tantivy query that filters results to visible documents
Implementation Notes:
- Uses range_query instead of term_query for owner_id/viewer_id to work
around a tantivy-py bug where Python ints are inferred as i64, causing
term_query to return no hits on u64 fields.
TODO: Replace with term_query once
https://github.com/quickwit-oss/tantivy-py/pull/642 lands.
- Uses range_query(owner_id, 1, MAX_U64) as an "owner exists" check
because exists_query is not yet available in tantivy-py 0.25.
TODO: Replace with exists_query("owner_id") once that is exposed in
a tantivy-py release.
- Uses range_query instead of term_query to work around an unsigned integer
type detection bug in tantivy-py 0.25
- Uses boolean_query for "no owner" check since exists_query is not
available in tantivy-py 0.25.1 (available in master)
- Uses disjunction_max_query to combine permission clauses with OR logic
"""
owner_any = tantivy.Query.range_query(

View File

@@ -72,9 +72,6 @@ def build_schema() -> tantivy.Schema:
# JSON fields — structured queries: notes.user:alice, custom_fields.name:invoice
sb.add_json_field("notes", stored=True, tokenizer_name="paperless_text")
# Plain-text companion for notes — tantivy's SnippetGenerator does not support
# JSON fields, so highlights require a text field with the same content.
sb.add_text_field("notes_text", stored=True, tokenizer_name="paperless_text")
sb.add_json_field("custom_fields", stored=True, tokenizer_name="paperless_text")
for field in (

View File

@@ -12,7 +12,6 @@ from typing import Literal
from typing import TypedDict
import magic
from celery import states
from django.conf import settings
from django.contrib.auth.models import Group
from django.contrib.auth.models import User
@@ -100,7 +99,7 @@ logger = logging.getLogger("paperless.serializers")
# https://www.django-rest-framework.org/api-guide/serializers/#example
class DynamicFieldsModelSerializer(serializers.ModelSerializer):
class DynamicFieldsModelSerializer(serializers.ModelSerializer[Any]):
"""
A ModelSerializer that takes an additional `fields` argument that
controls which fields should be displayed.
@@ -121,7 +120,7 @@ class DynamicFieldsModelSerializer(serializers.ModelSerializer):
self.fields.pop(field_name)
class MatchingModelSerializer(serializers.ModelSerializer):
class MatchingModelSerializer(serializers.ModelSerializer[Any]):
document_count = serializers.IntegerField(read_only=True)
def get_slug(self, obj) -> str:
@@ -261,7 +260,7 @@ class SetPermissionsSerializer(serializers.DictField):
class OwnedObjectSerializer(
SerializerWithPerms,
serializers.ModelSerializer,
serializers.ModelSerializer[Any],
SetPermissionsMixin,
):
def __init__(self, *args, **kwargs) -> None:
@@ -469,7 +468,7 @@ class OwnedObjectSerializer(
return super().update(instance, validated_data)
class OwnedObjectListSerializer(serializers.ListSerializer):
class OwnedObjectListSerializer(serializers.ListSerializer[Any]):
def to_representation(self, documents):
self.child.context["shared_object_pks"] = self.child.get_shared_object_pks(
documents,
@@ -682,27 +681,27 @@ class TagSerializer(MatchingModelSerializer, OwnedObjectSerializer):
return super().validate(attrs)
class CorrespondentField(serializers.PrimaryKeyRelatedField):
class CorrespondentField(serializers.PrimaryKeyRelatedField[Correspondent]):
def get_queryset(self):
return Correspondent.objects.all()
class TagsField(serializers.PrimaryKeyRelatedField):
class TagsField(serializers.PrimaryKeyRelatedField[Tag]):
def get_queryset(self):
return Tag.objects.all()
class DocumentTypeField(serializers.PrimaryKeyRelatedField):
class DocumentTypeField(serializers.PrimaryKeyRelatedField[DocumentType]):
def get_queryset(self):
return DocumentType.objects.all()
class StoragePathField(serializers.PrimaryKeyRelatedField):
class StoragePathField(serializers.PrimaryKeyRelatedField[StoragePath]):
def get_queryset(self):
return StoragePath.objects.all()
class CustomFieldSerializer(serializers.ModelSerializer):
class CustomFieldSerializer(serializers.ModelSerializer[CustomField]):
data_type = serializers.ChoiceField(
choices=CustomField.FieldDataType,
read_only=False,
@@ -816,7 +815,7 @@ def validate_documentlink_targets(user, doc_ids):
)
class CustomFieldInstanceSerializer(serializers.ModelSerializer):
class CustomFieldInstanceSerializer(serializers.ModelSerializer[CustomFieldInstance]):
field = serializers.PrimaryKeyRelatedField(queryset=CustomField.objects.all())
value = ReadWriteSerializerMethodField(allow_null=True)
@@ -922,14 +921,14 @@ class CustomFieldInstanceSerializer(serializers.ModelSerializer):
]
class BasicUserSerializer(serializers.ModelSerializer):
class BasicUserSerializer(serializers.ModelSerializer[User]):
# Different than paperless.serializers.UserSerializer
class Meta:
model = User
fields = ["id", "username", "first_name", "last_name"]
class NotesSerializer(serializers.ModelSerializer):
class NotesSerializer(serializers.ModelSerializer[Note]):
user = BasicUserSerializer(read_only=True)
class Meta:
@@ -1256,7 +1255,7 @@ class DocumentSerializer(
list_serializer_class = OwnedObjectListSerializer
class SearchResultListSerializer(serializers.ListSerializer):
class SearchResultListSerializer(serializers.ListSerializer[Document]):
def to_representation(self, hits):
document_ids = [hit["id"] for hit in hits]
# Fetch all Document objects in the list in one SQL query.
@@ -1313,7 +1312,7 @@ class SearchResultSerializer(DocumentSerializer):
list_serializer_class = SearchResultListSerializer
class SavedViewFilterRuleSerializer(serializers.ModelSerializer):
class SavedViewFilterRuleSerializer(serializers.ModelSerializer[SavedViewFilterRule]):
class Meta:
model = SavedViewFilterRule
fields = ["rule_type", "value"]
@@ -2401,7 +2400,7 @@ class StoragePathSerializer(MatchingModelSerializer, OwnedObjectSerializer):
return super().update(instance, validated_data)
class UiSettingsViewSerializer(serializers.ModelSerializer):
class UiSettingsViewSerializer(serializers.ModelSerializer[UiSettings]):
settings = serializers.DictField(required=False, allow_null=True)
class Meta:
@@ -2429,7 +2428,81 @@ class UiSettingsViewSerializer(serializers.ModelSerializer):
return ui_settings
class TasksViewSerializer(OwnedObjectSerializer):
class TaskSerializerV10(OwnedObjectSerializer):
"""Task serializer for API v10+ using new field names."""
related_document_ids = serializers.ListField(
child=serializers.IntegerField(),
read_only=True,
)
task_type_display = serializers.CharField(
source="get_task_type_display",
read_only=True,
)
trigger_source_display = serializers.CharField(
source="get_trigger_source_display",
read_only=True,
)
status_display = serializers.CharField(
source="get_status_display",
read_only=True,
)
class Meta:
model = PaperlessTask
fields = (
"id",
"task_id",
"task_type",
"task_type_display",
"trigger_source",
"trigger_source_display",
"status",
"status_display",
"date_created",
"date_started",
"date_done",
"duration_seconds",
"wait_time_seconds",
"input_data",
"result_data",
"result_message",
"related_document_ids",
"acknowledged",
"owner",
)
read_only_fields = fields
class TaskSerializerV9(serializers.ModelSerializer):
"""Task serializer for API v9 backwards compatibility.
Maps old field names to the new model fields so existing clients continue
to work unchanged.
"""
# v9 field: task_name -> task_type
task_name = serializers.CharField(source="task_type", read_only=True)
# v9 field: task_file_name -> input_data.filename
task_file_name = serializers.SerializerMethodField()
# v9 field: type -> trigger_source (mapped to old enum labels)
type = serializers.SerializerMethodField()
# v9 field: result -> result_message (with legacy format fallback)
result = serializers.CharField(
source="result_message",
read_only=True,
allow_null=True,
)
# v9 field: related_document -> first document ID from result_data
related_document = serializers.SerializerMethodField()
# v9 field: duplicate_documents -> list of duplicate IDs from result_data
duplicate_documents = serializers.SerializerMethodField()
class Meta:
model = PaperlessTask
fields = (
@@ -2437,59 +2510,59 @@ class TasksViewSerializer(OwnedObjectSerializer):
"task_id",
"task_name",
"task_file_name",
"date_created",
"date_done",
"type",
"status",
"date_created",
"date_done",
"result",
"acknowledged",
"related_document",
"duplicate_documents",
"owner",
)
read_only_fields = fields
related_document = serializers.SerializerMethodField()
duplicate_documents = serializers.SerializerMethodField()
created_doc_re = re.compile(r"New document id (\d+) created")
duplicate_doc_re = re.compile(r"It is a duplicate of .* \(#(\d+)\)")
def get_task_file_name(self, obj: PaperlessTask) -> str | None:
if not obj.input_data:
return None
return obj.input_data.get("filename")
def get_related_document(self, obj) -> str | None:
result = None
re = None
if obj.result:
match obj.status:
case states.SUCCESS:
re = self.created_doc_re
case states.FAILURE:
re = (
self.duplicate_doc_re
if "existing document is in the trash" not in obj.result
else None
)
if re is not None:
try:
result = re.search(obj.result).group(1)
except Exception:
pass
def get_type(self, obj: PaperlessTask) -> str:
# Old type values: AUTO_TASK, SCHEDULED_TASK, MANUAL_TASK
source_to_old_type = {
PaperlessTask.TriggerSource.SCHEDULED: "SCHEDULED_TASK",
PaperlessTask.TriggerSource.SYSTEM: "AUTO_TASK",
}
return source_to_old_type.get(obj.trigger_source, "MANUAL_TASK")
return result
def get_related_document(self, obj: PaperlessTask) -> int | None:
ids = obj.related_document_ids
return ids[0] if ids else None
@extend_schema_field(DuplicateDocumentSummarySerializer(many=True))
def get_duplicate_documents(self, obj):
related_document = self.get_related_document(obj)
request = self.context.get("request")
user = request.user if request else None
document = Document.global_objects.filter(pk=related_document).first()
if not related_document or not user or not document:
def get_duplicate_documents(self, obj: PaperlessTask) -> list[int]:
if not obj.result_data:
return []
duplicates = _get_viewable_duplicates(document, user)
return list(duplicates.values("id", "title", "deleted_at"))
dup_of = obj.result_data.get("duplicate_of")
return [dup_of] if dup_of is not None else []
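For one successful consume task the two serializers render the same row differently; an illustrative comparison (values invented, mappings per the methods above):

task = PaperlessTask(
    task_type=PaperlessTask.TaskType.CONSUME_FILE,
    trigger_source=PaperlessTask.TriggerSource.API_UPLOAD,
    status=PaperlessTask.Status.SUCCESS,
    input_data={"filename": "scan.pdf"},
    result_data={"document_id": 17},
)
# v10: {"task_type": "consume_file", "trigger_source": "api_upload",
#       "input_data": {"filename": "scan.pdf"}, ...}
# v9:  {"task_name": "consume_file", "task_file_name": "scan.pdf",
#       "type": "MANUAL_TASK", "related_document": 17, ...}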
class RunTaskViewSerializer(serializers.Serializer[dict[str, Any]]):
task_name = serializers.ChoiceField(
choices=PaperlessTask.TaskName.choices,
label="Task Name",
class TaskSummarySerializer(serializers.Serializer):
task_type = serializers.CharField()
total_count = serializers.IntegerField()
pending_count = serializers.IntegerField()
success_count = serializers.IntegerField()
failure_count = serializers.IntegerField()
avg_duration_seconds = serializers.FloatField(allow_null=True)
avg_wait_time_seconds = serializers.FloatField(allow_null=True)
last_run = serializers.DateTimeField(allow_null=True)
last_success = serializers.DateTimeField(allow_null=True)
last_failure = serializers.DateTimeField(allow_null=True)
class RunTaskSerializer(serializers.Serializer):
task_type = serializers.ChoiceField(
choices=PaperlessTask.TaskType.choices,
label="Task Type",
write_only=True,
)
@@ -2760,7 +2833,7 @@ class BulkEditObjectsSerializer(SerializerWithPerms, SetPermissionsMixin):
return attrs
class WorkflowTriggerSerializer(serializers.ModelSerializer):
class WorkflowTriggerSerializer(serializers.ModelSerializer[WorkflowTrigger]):
id = serializers.IntegerField(required=False, allow_null=True)
sources = fields.MultipleChoiceField(
choices=WorkflowTrigger.DocumentSourceChoices.choices,
@@ -2870,7 +2943,7 @@ class WorkflowTriggerSerializer(serializers.ModelSerializer):
return super().update(instance, validated_data)
class WorkflowActionEmailSerializer(serializers.ModelSerializer):
class WorkflowActionEmailSerializer(serializers.ModelSerializer[WorkflowActionEmail]):
id = serializers.IntegerField(allow_null=True, required=False)
class Meta:
@@ -2884,7 +2957,9 @@ class WorkflowActionEmailSerializer(serializers.ModelSerializer):
]
class WorkflowActionWebhookSerializer(serializers.ModelSerializer):
class WorkflowActionWebhookSerializer(
serializers.ModelSerializer[WorkflowActionWebhook],
):
id = serializers.IntegerField(allow_null=True, required=False)
def validate_url(self, url):
@@ -2905,7 +2980,7 @@ class WorkflowActionWebhookSerializer(serializers.ModelSerializer):
]
class WorkflowActionSerializer(serializers.ModelSerializer):
class WorkflowActionSerializer(serializers.ModelSerializer[WorkflowAction]):
id = serializers.IntegerField(required=False, allow_null=True)
assign_correspondent = CorrespondentField(allow_null=True, required=False)
assign_tags = TagsField(many=True, allow_null=True, required=False)
@@ -3027,7 +3102,7 @@ class WorkflowActionSerializer(serializers.ModelSerializer):
return attrs
class WorkflowSerializer(serializers.ModelSerializer):
class WorkflowSerializer(serializers.ModelSerializer[Workflow]):
order = serializers.IntegerField(required=False)
triggers = WorkflowTriggerSerializer(many=True)

View File

@@ -8,7 +8,6 @@ from typing import TYPE_CHECKING
from typing import Any
from celery import shared_task
from celery import states
from celery.signals import before_task_publish
from celery.signals import task_failure
from celery.signals import task_postrun
@@ -31,6 +30,7 @@ from documents import matching
from documents.caching import clear_document_caches
from documents.caching import invalidate_llm_suggestions_cache
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentSource
from documents.file_handling import create_source_path_directory
from documents.file_handling import delete_empty_directories
from documents.file_handling import generate_filename
@@ -999,68 +999,175 @@ def run_workflows(
return overrides, "\n".join(messages)
# ---------------------------------------------------------------------------
# Task tracking -- Celery signal handlers
# ---------------------------------------------------------------------------
TRACKED_TASKS: dict[str, PaperlessTask.TaskType] = {
"documents.tasks.consume_file": PaperlessTask.TaskType.CONSUME_FILE,
"documents.tasks.train_classifier": PaperlessTask.TaskType.TRAIN_CLASSIFIER,
"documents.tasks.sanity_check": PaperlessTask.TaskType.SANITY_CHECK,
"documents.tasks.index_optimize": PaperlessTask.TaskType.INDEX_OPTIMIZE,
"documents.tasks.llmindex_index": PaperlessTask.TaskType.LLM_INDEX,
"paperless_mail.tasks.process_mail_accounts": PaperlessTask.TaskType.MAIL_FETCH,
}
_DOCUMENT_SOURCE_TO_TRIGGER: dict[Any, PaperlessTask.TriggerSource] = {
DocumentSource.ConsumeFolder: PaperlessTask.TriggerSource.FOLDER_CONSUME,
DocumentSource.ApiUpload: PaperlessTask.TriggerSource.API_UPLOAD,
DocumentSource.MailFetch: PaperlessTask.TriggerSource.EMAIL_CONSUME,
DocumentSource.WebUI: PaperlessTask.TriggerSource.WEB_UI,
}
def _extract_input_data(
task_type: PaperlessTask.TaskType,
args: tuple,
task_kwargs: dict,
) -> dict:
if task_type == PaperlessTask.TaskType.CONSUME_FILE:
input_doc = args[0] if args else task_kwargs.get("input_doc")
overrides = args[1] if len(args) >= 2 else task_kwargs.get("overrides")
if input_doc is None:
return {}
data: dict = {
"filename": input_doc.original_file.name,
"mime_type": input_doc.mime_type,
}
if input_doc.original_path:
data["source_path"] = str(input_doc.original_path)
if input_doc.mailrule_id:
data["mailrule_id"] = input_doc.mailrule_id
if overrides:
override_dict = {
k: v
for k, v in vars(overrides).items()
if v is not None and not k.startswith("_")
}
if override_dict:
data["overrides"] = override_dict
return data
if task_type == PaperlessTask.TaskType.MAIL_FETCH:
account_ids = args[0] if args else task_kwargs.get("account_ids")
return {"account_ids": account_ids}
return {}
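An illustrative consume payload (values invented; keys as produced above):

# _extract_input_data(TaskType.CONSUME_FILE, (input_doc, overrides), {})
# -> {"filename": "scan.pdf",
#     "mime_type": "application/pdf",
#     "source_path": "/consume/scan.pdf",  # only if original_path is set
#     "mailrule_id": 3,                    # only for mail-sourced docs
#     "overrides": {"owner_id": 1}}        # non-None override fields only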
def _determine_trigger_source(
task_type: PaperlessTask.TaskType,
args: tuple,
task_kwargs: dict,
headers: dict,
) -> PaperlessTask.TriggerSource:
# Explicit header takes priority -- covers beat ("scheduled") and system auto-runs ("system")
header_source = headers.get("trigger_source")
if header_source == "scheduled":
return PaperlessTask.TriggerSource.SCHEDULED
if header_source == "system":
return PaperlessTask.TriggerSource.SYSTEM
if task_type == PaperlessTask.TaskType.CONSUME_FILE:
input_doc = args[0] if args else task_kwargs.get("input_doc")
if input_doc is not None:
return _DOCUMENT_SOURCE_TO_TRIGGER.get(
input_doc.source,
PaperlessTask.TriggerSource.API_UPLOAD,
)
return PaperlessTask.TriggerSource.MANUAL
def _extract_owner_id(
task_type: PaperlessTask.TaskType,
args: tuple,
task_kwargs: dict,
) -> int | None:
if task_type != PaperlessTask.TaskType.CONSUME_FILE:
return None
overrides = args[1] if len(args) >= 2 else task_kwargs.get("overrides")
if overrides and hasattr(overrides, "owner_id"):
return overrides.owner_id
return None
def _parse_legacy_result(result: str) -> dict | None:
import re as _re
if match := _re.search(r"New document id (\d+) created", result):
return {"document_id": int(match.group(1))}
if match := _re.search(r"It is a duplicate of .* \(#(\d+)\)", result):
return {
"duplicate_of": int(match.group(1)),
"duplicate_in_trash": "existing document is in the trash" in result,
}
return None
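The two legacy consume messages it recognizes, for reference (surrounding text is illustrative; only the regex fragments must match):

assert _parse_legacy_result("Success. New document id 42 created") == {
    "document_id": 42,
}
assert _parse_legacy_result("It is a duplicate of scan.pdf (#7)") == {
    "duplicate_of": 7,
    "duplicate_in_trash": False,
}
assert _parse_legacy_result("OCR failed") is None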
@before_task_publish.connect
def before_task_publish_handler(sender=None, headers=None, body=None, **kwargs) -> None:
def before_task_publish_handler(
sender=None,
headers=None,
body=None,
**kwargs,
) -> None:
"""
Creates the PaperlessTask object in a pending state. This signal fires
before the task reaches the broker, and therefore before it begins
executing on a worker.
Creates the PaperlessTask record when the task is published to broker.
https://docs.celeryq.dev/en/stable/userguide/signals.html#before-task-publish
https://docs.celeryq.dev/en/stable/internals/protocol.html#version-2
"""
if "task" not in headers or headers["task"] != "documents.tasks.consume_file":
# Assumption: this is only ever a v2 message
if headers is None or body is None:
return
task_name = headers.get("task", "")
task_type = TRACKED_TASKS.get(task_name)
if task_type is None:
return
try:
close_old_connections()
args, task_kwargs, _ = body
task_id = headers["id"]
task_args = body[0]
input_doc, overrides = task_args
task_file_name = input_doc.original_file.name
user_id = overrides.owner_id if overrides else None
input_data = _extract_input_data(task_type, args, task_kwargs)
trigger_source = _determine_trigger_source(
task_type,
args,
task_kwargs,
headers,
)
owner_id = _extract_owner_id(task_type, args, task_kwargs)
PaperlessTask.objects.create(
type=PaperlessTask.TaskType.AUTO,
task_id=headers["id"],
status=states.PENDING,
task_file_name=task_file_name,
task_name=PaperlessTask.TaskName.CONSUME_FILE,
result=None,
date_created=timezone.now(),
date_started=None,
date_done=None,
owner_id=user_id,
task_id=task_id,
task_type=task_type,
trigger_source=trigger_source,
status=PaperlessTask.Status.PENDING,
input_data=input_data,
owner_id=owner_id,
)
except Exception: # pragma: no cover
# Don't let an exception in the signal handlers prevent
# a document from being consumed.
except Exception:
logger.exception("Creating PaperlessTask failed")
@task_prerun.connect
def task_prerun_handler(sender=None, task_id=None, task=None, **kwargs) -> None:
"""
Updates the PaperlessTask to be started. Sent before the task begins execution
on a worker.
Marks the task STARTED when execution begins on a worker.
https://docs.celeryq.dev/en/stable/userguide/signals.html#task-prerun
"""
if task_id is None:
return
try:
close_old_connections()
task_instance = PaperlessTask.objects.filter(task_id=task_id).first()
if task_instance is not None:
task_instance.status = states.STARTED
task_instance.date_started = timezone.now()
task_instance.save()
except Exception: # pragma: no cover
# Don't let an exception in the signal handlers prevent
# a document from being consumed.
PaperlessTask.objects.filter(task_id=task_id).update(
status=PaperlessTask.Status.STARTED,
date_started=timezone.now(),
)
except Exception:
logger.exception("Setting PaperlessTask started failed")
@@ -1074,22 +1181,53 @@ def task_postrun_handler(
**kwargs,
) -> None:
"""
Updates the result of the PaperlessTask.
Records task completion and result data.
https://docs.celeryq.dev/en/stable/userguide/signals.html#task-postrun
"""
if task_id is None:
return
try:
close_old_connections()
task_instance = PaperlessTask.objects.filter(task_id=task_id).first()
if task_instance is not None:
task_instance.status = state or states.FAILURE
task_instance.result = retval
task_instance.date_done = timezone.now()
task_instance.save()
except Exception: # pragma: no cover
# Don't let an exception in the signal handlers prevent
# a document from being consumed.
status_map = {
"SUCCESS": PaperlessTask.Status.SUCCESS,
"FAILURE": PaperlessTask.Status.FAILURE,
"REVOKED": PaperlessTask.Status.REVOKED,
}
new_status = status_map.get(state, PaperlessTask.Status.FAILURE)
result_data: dict | None = None
result_message: str | None = None
if isinstance(retval, dict):
result_data = retval
elif isinstance(retval, str):
result_message = retval
result_data = _parse_legacy_result(retval)
now = timezone.now()
task_instance = PaperlessTask.objects.filter(task_id=task_id).first()
if task_instance is None:
return
duration_seconds: float | None = None
wait_time_seconds: float | None = None
if task_instance.date_started:
duration_seconds = (now - task_instance.date_started).total_seconds()
if task_instance.date_started and task_instance.date_created:
wait_time_seconds = (
task_instance.date_started - task_instance.date_created
).total_seconds()
PaperlessTask.objects.filter(task_id=task_id).update(
status=new_status,
result_data=result_data,
result_message=result_message,
date_done=now,
duration_seconds=duration_seconds,
wait_time_seconds=wait_time_seconds,
)
except Exception:
logger.exception("Updating PaperlessTask failed")
@@ -1103,21 +1241,33 @@ def task_failure_handler(
**kwargs,
) -> None:
"""
    Records failure details when a task raises an exception.
    https://docs.celeryq.dev/en/stable/userguide/signals.html#task-failure
    """
    if task_id is None:
        return
    try:
        close_old_connections()
        result_data: dict = {
            "error_type": type(exception).__name__ if exception else "Unknown",
            "error_message": str(exception) if exception else "Unknown error",
        }
        if traceback:
            import traceback as _tb

            tb_str = "".join(_tb.format_tb(traceback))
            result_data["traceback"] = tb_str[:5000]
        PaperlessTask.objects.filter(task_id=task_id).update(
            status=PaperlessTask.Status.FAILURE,
            result_data=result_data,
            result_message=str(exception) if exception else None,
            date_done=timezone.now(),
        )
    except Exception:
        logger.exception("Updating PaperlessTask on failure failed")
@worker_process_init.connect

View File

@@ -10,7 +10,6 @@ from tempfile import mkstemp
from celery import Task
from celery import shared_task
from celery import states
from django.conf import settings
from django.contrib.contenttypes.models import ContentType
from django.db import models
@@ -41,7 +40,6 @@ from documents.models import Correspondent
from documents.models import CustomFieldInstance
from documents.models import Document
from documents.models import DocumentType
from documents.models import PaperlessTask
from documents.models import ShareLink
from documents.models import ShareLinkBundle
from documents.models import StoragePath
@@ -84,19 +82,8 @@ def index_optimize() -> None:
@shared_task
def train_classifier(
*,
    status_callback: Callable[[str], None] | None = None,
) -> str:
if (
not Tag.objects.filter(matching_algorithm=Tag.MATCH_AUTO).exists()
and not DocumentType.objects.filter(matching_algorithm=Tag.MATCH_AUTO).exists()
@@ -110,37 +97,22 @@ def train_classifier(
if settings.MODEL_FILE.exists():
logger.info(f"Removing {settings.MODEL_FILE} so it won't be used")
settings.MODEL_FILE.unlink()
        return result
classifier = load_classifier()
if not classifier:
classifier = DocumentClassifier()
    if classifier.train(status_callback=status_callback):
        logger.info(
            f"Saving updated classifier model to {settings.MODEL_FILE}...",
        )
        classifier.save()
        return "Training completed successfully"
    else:
        logger.debug("Training data unchanged.")
        return "Training data unchanged"
@shared_task(bind=True)
@@ -231,8 +203,8 @@ def consume_file(
@shared_task
def sanity_check(*, raise_on_error: bool = True) -> str:
    messages = sanity_checker.check_sanity()
messages.log_messages()
if not messages.has_error and not messages.has_warning and not messages.has_info:
@@ -635,42 +607,19 @@ def update_document_parent_tags(tag: Tag, new_parent: Tag) -> None:
def llmindex_index(
*,
iter_wrapper: IterWrapper[Document] = identity,
    rebuild: bool = False,
) -> str | None:
    ai_config = AIConfig()
    if not ai_config.llm_index_enabled:
        logger.info("LLM index is disabled, skipping update.")
        return None

    from paperless_ai.indexing import update_llm_index

    return update_llm_index(
        iter_wrapper=iter_wrapper,
        rebuild=rebuild,
    )
@shared_task

View File

@@ -13,6 +13,8 @@ from rest_framework.test import APIClient
from documents.tests.factories import DocumentFactory
UserModelT = get_user_model()
if TYPE_CHECKING:
from documents.models import Document
@@ -126,15 +128,34 @@ def rest_api_client():
yield APIClient()
@pytest.fixture
def authenticated_rest_api_client(rest_api_client: APIClient):
"""
The basic DRF ApiClient which has been authenticated
"""
UserModel = get_user_model()
user = UserModel.objects.create_user(username="testuser", password="password")
rest_api_client.force_authenticate(user=user)
yield rest_api_client
@pytest.fixture()
def regular_user(django_user_model: type[UserModelT]) -> UserModelT:
"""Unprivileged authenticated user for permission boundary tests."""
return django_user_model.objects.create_user(username="regular", password="regular")
@pytest.fixture()
def admin_client(rest_api_client: APIClient, admin_user: UserModelT) -> APIClient:
"""Admin client pre-authenticated and sending the v10 Accept header."""
rest_api_client.force_authenticate(user=admin_user)
rest_api_client.credentials(HTTP_ACCEPT="application/json; version=10")
return rest_api_client
@pytest.fixture()
def v9_client(rest_api_client: APIClient, admin_user: UserModelT) -> APIClient:
"""Admin client pre-authenticated and sending the v9 Accept header."""
rest_api_client.force_authenticate(user=admin_user)
rest_api_client.credentials(HTTP_ACCEPT="application/json; version=9")
return rest_api_client
@pytest.fixture()
def user_client(rest_api_client: APIClient, regular_user: UserModelT) -> APIClient:
"""Regular-user client pre-authenticated and sending the v10 Accept header."""
rest_api_client.force_authenticate(user=regular_user)
rest_api_client.credentials(HTTP_ACCEPT="application/json; version=10")
return rest_api_client
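A minimal sketch of a test consuming these fixtures (the endpoint and expectation are illustrative):

import pytest


@pytest.mark.django_db()
def test_admin_can_list_tasks(admin_client):
    # admin_client is pre-authenticated and already sends the v10 Accept header
    response = admin_client.get("/api/tasks/")
    assert response.status_code == 200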
@pytest.fixture(scope="session", autouse=True)

View File

@@ -11,6 +11,7 @@ from documents.models import Correspondent
from documents.models import Document
from documents.models import DocumentType
from documents.models import MatchingModel
from documents.models import PaperlessTask
from documents.models import StoragePath
from documents.models import Tag
@@ -65,3 +66,17 @@ class DocumentFactory(DjangoModelFactory):
correspondent = None
document_type = None
storage_path = None
class PaperlessTaskFactory(DjangoModelFactory):
class Meta:
model = PaperlessTask
task_id = factory.LazyFunction(lambda: str(__import__("uuid").uuid4()))
task_type = PaperlessTask.TaskType.CONSUME_FILE
trigger_source = PaperlessTask.TriggerSource.WEB_UI
status = PaperlessTask.Status.PENDING
input_data = factory.LazyFunction(dict)
result_data = None
result_message = None
acknowledged = False
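Typical usage in the rewritten tests: override only the fields under test and let the factory defaults fill in the rest, for example:

# Single instance with one overridden field, and a batch sharing an override.
task = PaperlessTaskFactory(status=PaperlessTask.Status.SUCCESS)
batch = PaperlessTaskFactory.create_batch(3, acknowledged=True)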

View File

@@ -33,12 +33,19 @@ class TestWriteBatch:
except RuntimeError:
pass
        r = backend.search(
            "should survive",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
        )
        assert r.total == 1
class TestSearch:
"""Test search query parsing and matching via search_ids."""
"""Test search functionality."""
def test_text_mode_limits_default_search_to_title_and_content(
self,
@@ -53,20 +60,27 @@ class TestSearch:
)
backend.add_or_update(doc)
        metadata_only = backend.search(
            "document_type:invoice",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TEXT,
        )
        assert metadata_only.total == 0

        content_match = backend.search(
            "monthly",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TEXT,
        )
        assert content_match.total == 1
def test_title_mode_limits_default_search_to_title_only(
self,
@@ -81,14 +95,27 @@ class TestSearch:
)
backend.add_or_update(doc)
        content_only = backend.search(
            "monthly",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TITLE,
        )
        assert content_only.total == 0

        title_match = backend.search(
            "invoice",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TITLE,
        )
        assert title_match.total == 1
def test_text_mode_matches_partial_term_substrings(
self,
@@ -103,16 +130,38 @@ class TestSearch:
)
backend.add_or_update(doc)
        prefix_match = backend.search(
            "pass",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TEXT,
        )
        assert prefix_match.total == 1

        infix_match = backend.search(
            "sswo",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TEXT,
        )
        assert infix_match.total == 1

        phrase_match = backend.search(
            "sswo re",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TEXT,
        )
        assert phrase_match.total == 1
def test_text_mode_does_not_match_on_partial_term_overlap(
self,
@@ -127,10 +176,16 @@ class TestSearch:
)
backend.add_or_update(doc)
        non_match = backend.search(
            "raptor",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TEXT,
        )
        assert non_match.total == 0
def test_text_mode_anchors_later_query_tokens_to_token_starts(
self,
@@ -159,9 +214,16 @@ class TestSearch:
backend.add_or_update(prefix_doc)
backend.add_or_update(false_positive)
        results = backend.search(
            "Z-Berichte 6",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TEXT,
        )
        result_ids = {hit["id"] for hit in results.hits}
assert exact_doc.id in result_ids
assert prefix_doc.id in result_ids
@@ -180,9 +242,16 @@ class TestSearch:
)
backend.add_or_update(doc)
        no_tokens = backend.search(
            "!!!",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TEXT,
        )
        assert no_tokens.total == 0
def test_title_mode_matches_partial_term_substrings(
self,
@@ -197,18 +266,59 @@ class TestSearch:
)
backend.add_or_update(doc)
        prefix_match = backend.search(
            "pass",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TITLE,
        )
        assert prefix_match.total == 1

        infix_match = backend.search(
            "sswo",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TITLE,
        )
        assert infix_match.total == 1

        phrase_match = backend.search(
            "sswo gu",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
            search_mode=SearchMode.TITLE,
        )
        assert phrase_match.total == 1
def test_scores_normalised_top_hit_is_one(self, backend: TantivyBackend):
"""Search scores must be normalized so top hit has score 1.0 for UI consistency."""
for i, title in enumerate(["bank invoice", "bank statement", "bank receipt"]):
doc = Document.objects.create(
title=title,
content=title,
checksum=f"SN{i}",
pk=10 + i,
)
backend.add_or_update(doc)
r = backend.search(
"bank",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert r.hits[0]["score"] == pytest.approx(1.0)
assert all(0.0 <= h["score"] <= 1.0 for h in r.hits)
def test_sort_field_ascending(self, backend: TantivyBackend):
"""Searching with sort_reverse=False must return results in ascending ASN order."""
@@ -221,14 +331,16 @@ class TestSearch:
)
backend.add_or_update(doc)
        r = backend.search(
            "sortable",
            user=None,
            page=1,
            page_size=10,
            sort_field="archive_serial_number",
            sort_reverse=False,
        )
        assert r.total == 3
        asns = [Document.objects.get(pk=h["id"]).archive_serial_number for h in r.hits]
assert asns == [10, 20, 30]
def test_sort_field_descending(self, backend: TantivyBackend):
@@ -242,91 +354,79 @@ class TestSearch:
)
backend.add_or_update(doc)
        r = backend.search(
            "sortable",
            user=None,
            page=1,
            page_size=10,
            sort_field="archive_serial_number",
            sort_reverse=True,
        )
        assert r.total == 3
        asns = [Document.objects.get(pk=h["id"]).archive_serial_number for h in r.hits]
assert asns == [30, 20, 10]
    def test_fuzzy_threshold_filters_low_score_hits(
        self,
        backend: TantivyBackend,
        settings,
    ):
        """When ADVANCED_FUZZY_SEARCH_THRESHOLD exceeds all normalized scores, hits must be filtered out."""
        doc = Document.objects.create(
            title="Invoice document",
            content="financial report",
            checksum="FT1",
            pk=120,
        )
        backend.add_or_update(doc)

        # Threshold above 1.0 filters every hit (normalized scores top out at 1.0)
        settings.ADVANCED_FUZZY_SEARCH_THRESHOLD = 1.1
        r = backend.search(
            "invoice",
            user=None,
            page=1,
            page_size=10,
            sort_field=None,
            sort_reverse=False,
        )
        assert r.hits == []

    def test_owner_filter(self, backend: TantivyBackend):
        """Document owners can search their private documents; other users cannot access them."""
        owner = User.objects.create_user("owner")
        other = User.objects.create_user("other")
        doc = Document.objects.create(
            title="Private",
            content="secret",
            checksum="PF1",
            pk=20,
            owner=owner,
        )
        backend.add_or_update(doc)

        assert (
            backend.search(
                "secret",
                user=owner,
                page=1,
                page_size=10,
                sort_field=None,
                sort_reverse=False,
            ).total
            == 1
        )
        assert (
            backend.search(
                "secret",
                user=other,
                page=1,
                page_size=10,
                sort_field=None,
                sort_reverse=False,
            ).total
            == 0
        )
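Against normalized scores, the threshold behaviour exercised above reduces to a plain filter; a sketch (the hit shape follows results.hits, the helper name is illustrative):

def apply_fuzzy_threshold(hits: list[dict], threshold: float) -> list[dict]:
    # Scores top out at 1.0 after normalization, so a threshold above 1.0
    # necessarily removes every hit, as the test asserts.
    return [hit for hit in hits if hit["score"] >= threshold]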
class TestRebuild:
@@ -390,26 +490,57 @@ class TestAutocomplete:
class TestMoreLikeThis:
"""Test more like this functionality."""
    def test_excludes_original(self, backend: TantivyBackend):
        """More like this queries must exclude the reference document from results."""
        doc1 = Document.objects.create(
            title="Important document",
            content="financial information",
            checksum="MLT1",
            pk=50,
        )
        doc2 = Document.objects.create(
            title="Another document",
            content="financial report",
            checksum="MLT2",
            pk=51,
        )
        backend.add_or_update(doc1)
        backend.add_or_update(doc2)

        results = backend.more_like_this(doc_id=50, user=None, page=1, page_size=10)
        returned_ids = [hit["id"] for hit in results.hits]
        assert 50 not in returned_ids  # Original document excluded
def test_with_user_applies_permission_filter(self, backend: TantivyBackend):
"""more_like_this with a user must exclude documents that user cannot see."""
viewer = User.objects.create_user("mlt_viewer")
other = User.objects.create_user("mlt_other")
public_doc = Document.objects.create(
title="Public financial document",
content="quarterly financial analysis report figures",
checksum="MLT3",
pk=52,
)
private_doc = Document.objects.create(
title="Private financial document",
content="quarterly financial analysis report figures",
checksum="MLT4",
pk=53,
owner=other,
)
backend.add_or_update(public_doc)
backend.add_or_update(private_doc)
results = backend.more_like_this(doc_id=52, user=viewer, page=1, page_size=10)
returned_ids = [hit["id"] for hit in results.hits]
# private_doc is owned by other, so viewer cannot see it
assert 53 not in returned_ids
def test_document_not_in_index_returns_empty(self, backend: TantivyBackend):
"""more_like_this for a doc_id absent from the index must return empty results."""
results = backend.more_like_this(doc_id=9999, user=None, page=1, page_size=10)
assert results.hits == []
assert results.total == 0
class TestSingleton:
@@ -462,10 +593,19 @@ class TestFieldHandling:
# Should not raise an exception
backend.add_or_update(doc)
assert len(backend.search_ids("test", user=None)) == 1
results = backend.search(
"test",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 1
def test_custom_fields_include_name_and_value(self, backend: TantivyBackend):
"""Custom fields must be indexed with both field name and value for structured queries."""
# Create a custom field
field = CustomField.objects.create(
name="Invoice Number",
data_type=CustomField.FieldDataType.STRING,
@@ -482,9 +622,18 @@ class TestFieldHandling:
value_text="INV-2024-001",
)
# Should not raise an exception during indexing
backend.add_or_update(doc)
assert len(backend.search_ids("invoice", user=None)) == 1
results = backend.search(
"invoice",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 1
def test_select_custom_field_indexes_label_not_id(self, backend: TantivyBackend):
"""SELECT custom fields must index the human-readable label, not the opaque option ID."""
@@ -511,8 +660,27 @@ class TestFieldHandling:
)
backend.add_or_update(doc)
assert len(backend.search_ids("custom_fields.value:invoice", user=None)) == 1
assert len(backend.search_ids("custom_fields.value:opt_abc", user=None)) == 0
# Label should be findable
results = backend.search(
"custom_fields.value:invoice",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 1
# Opaque ID must not appear in the index
results = backend.search(
"custom_fields.value:opt_abc",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 0
def test_none_custom_field_value_not_indexed(self, backend: TantivyBackend):
"""Custom field instances with no value set must not produce an index entry."""
@@ -534,7 +702,16 @@ class TestFieldHandling:
)
backend.add_or_update(doc)
assert len(backend.search_ids("custom_fields.value:none", user=None)) == 0
# The string "none" must not appear as an indexed value
results = backend.search(
"custom_fields.value:none",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 0
def test_notes_include_user_information(self, backend: TantivyBackend):
"""Notes must be indexed with user information when available for structured queries."""
@@ -547,101 +724,32 @@ class TestFieldHandling:
)
Note.objects.create(document=doc, note="Important note", user=user)
# Should not raise an exception during indexing
backend.add_or_update(doc)
ids = backend.search_ids("test", user=None)
assert len(ids) == 1, (
f"Expected 1, got {len(ids)}. Document content should be searchable."
# Test basic document search first
results = backend.search(
"test",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 1, (
f"Expected 1, got {results.total}. Document content should be searchable."
)
ids = backend.search_ids("notes.note:important", user=None)
assert len(ids) == 1, (
f"Expected 1, got {len(ids)}. Note content should be searchable via notes.note: prefix."
# Test notes search — must use structured JSON syntax now that note
# is no longer in DEFAULT_SEARCH_FIELDS
results = backend.search(
"notes.note:important",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
class TestHighlightHits:
"""Test highlight_hits returns proper HTML strings, not raw Snippet objects."""
def test_highlights_content_returns_match_span_html(
self,
backend: TantivyBackend,
):
"""highlight_hits must return frontend-ready highlight spans."""
doc = Document.objects.create(
title="Highlight Test",
content="The quick brown fox jumps over the lazy dog",
checksum="HH1",
pk=90,
assert results.total == 1, (
f"Expected 1, got {results.total}. Note content should be searchable via notes.note: prefix."
)
backend.add_or_update(doc)
hits = backend.highlight_hits("quick", [doc.pk])
assert len(hits) == 1
highlights = hits[0]["highlights"]
assert "content" in highlights
content_highlight = highlights["content"]
assert isinstance(content_highlight, str), (
f"Expected str, got {type(content_highlight)}: {content_highlight!r}"
)
assert '<span class="match">' in content_highlight, (
f"Expected HTML with match span, got: {content_highlight!r}"
)
def test_highlights_notes_returns_match_span_html(
self,
backend: TantivyBackend,
):
"""Note highlights must be frontend-ready HTML via notes_text companion field.
The notes JSON field does not support tantivy SnippetGenerator; the
notes_text plain-text field is used instead. We use the full-text
query "urgent" (not notes.note:) because notes_text IS in
DEFAULT_SEARCH_FIELDS via the normal search path… actually, we use
notes.note: prefix so the query targets notes content directly, but
the snippet is generated from notes_text which stores the same text.
"""
user = User.objects.create_user("hl_noteuser")
doc = Document.objects.create(
title="Doc with matching note",
content="unrelated content",
checksum="HH2",
pk=91,
)
Note.objects.create(document=doc, note="urgent payment required", user=user)
backend.add_or_update(doc)
# Use notes.note: prefix so the document matches the query and the
# notes_text snippet generator can produce highlights.
hits = backend.highlight_hits("notes.note:urgent", [doc.pk])
assert len(hits) == 1
highlights = hits[0]["highlights"]
assert "notes" in highlights
note_highlight = highlights["notes"]
assert isinstance(note_highlight, str), (
f"Expected str, got {type(note_highlight)}: {note_highlight!r}"
)
assert '<span class="match">' in note_highlight, (
f"Expected HTML with match span, got: {note_highlight!r}"
)
def test_empty_doc_list_returns_empty_hits(self, backend: TantivyBackend):
"""highlight_hits with no doc IDs must return an empty list."""
hits = backend.highlight_hits("anything", [])
assert hits == []
def test_no_highlights_when_no_match(self, backend: TantivyBackend):
"""Documents not matching the query should not appear in results."""
doc = Document.objects.create(
title="Unrelated",
content="completely different text",
checksum="HH3",
pk=92,
)
backend.add_or_update(doc)
hits = backend.highlight_hits("quick", [doc.pk])
assert len(hits) == 0

View File

@@ -6,6 +6,8 @@ from unittest.mock import patch
from django.contrib.auth.models import User
from django.core.files.uploadedfile import SimpleUploadedFile
from django.test import override_settings
from PIL import Image
from PIL.PngImagePlugin import PngInfo
from rest_framework import status
from rest_framework.test import APITestCase
@@ -201,6 +203,156 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
)
self.assertFalse(Path(old_logo.path).exists())
def test_api_strips_exif_data_from_uploaded_logo(self) -> None:
"""
GIVEN:
- A JPEG logo upload containing EXIF metadata
WHEN:
- Uploaded via PATCH to app config
THEN:
- Stored logo image has EXIF metadata removed
"""
image = Image.new("RGB", (12, 12), "blue")
exif = Image.Exif()
exif[315] = "Paperless Test Author"
logo = BytesIO()
image.save(logo, format="JPEG", exif=exif)
logo.seek(0)
response = self.client.patch(
f"{self.ENDPOINT}1/",
{
"app_logo": SimpleUploadedFile(
name="logo-with-exif.jpg",
content=logo.getvalue(),
content_type="image/jpeg",
),
},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
with Image.open(config.app_logo.path) as stored_logo:
stored_exif = stored_logo.getexif()
self.assertEqual(len(stored_exif), 0)
def test_api_strips_png_metadata_from_uploaded_logo(self) -> None:
"""
GIVEN:
- A PNG logo upload containing text metadata
WHEN:
- Uploaded via PATCH to app config
THEN:
- Stored logo image has metadata removed
"""
image = Image.new("RGB", (12, 12), "green")
pnginfo = PngInfo()
pnginfo.add_text("Author", "Paperless Test Author")
logo = BytesIO()
image.save(logo, format="PNG", pnginfo=pnginfo)
logo.seek(0)
response = self.client.patch(
f"{self.ENDPOINT}1/",
{
"app_logo": SimpleUploadedFile(
name="logo-with-metadata.png",
content=logo.getvalue(),
content_type="image/png",
),
},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
with Image.open(config.app_logo.path) as stored_logo:
stored_text = stored_logo.text
self.assertEqual(stored_text, {})
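Server-side, both behaviours are commonly achieved by re-encoding the upload through Pillow without carrying metadata across; a minimal sketch of that approach (not necessarily the exact implementation under test):

from io import BytesIO

from PIL import Image


def strip_image_metadata(data: bytes, fmt: str) -> bytes:
    # Re-save pixel data only; EXIF blocks and PNG text chunks are not
    # copied unless explicitly passed to save(), so they are dropped here.
    with Image.open(BytesIO(data)) as img:
        out = BytesIO()
        img.save(out, format=fmt)
        return out.getvalue()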
def test_api_accepts_valid_gif_logo(self) -> None:
"""
GIVEN:
- A valid GIF logo upload
WHEN:
- Uploaded via PATCH to app config
THEN:
- Upload succeeds
"""
image = Image.new("RGB", (12, 12), "red")
logo = BytesIO()
image.save(logo, format="GIF", comment=b"Paperless Test Comment")
logo.seek(0)
response = self.client.patch(
f"{self.ENDPOINT}1/",
{
"app_logo": SimpleUploadedFile(
name="logo.gif",
content=logo.getvalue(),
content_type="image/gif",
),
},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
def test_api_rejects_invalid_raster_logo(self) -> None:
"""
GIVEN:
- A file named as a JPEG but containing non-image payload data
WHEN:
- Uploaded via PATCH to app config
THEN:
- Upload is rejected with 400
"""
response = self.client.patch(
f"{self.ENDPOINT}1/",
{
"app_logo": SimpleUploadedFile(
name="not-an-image.jpg",
content=b"<script>alert('xss')</script>",
content_type="image/jpeg",
),
},
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertIn("invalid logo image", str(response.data).lower())
@override_settings(MAX_IMAGE_PIXELS=100)
def test_api_rejects_logo_exceeding_max_image_pixels(self) -> None:
"""
GIVEN:
- A raster logo larger than the configured MAX_IMAGE_PIXELS limit
WHEN:
- Uploaded via PATCH to app config
THEN:
- Upload is rejected with 400
"""
image = Image.new("RGB", (12, 12), "purple")
logo = BytesIO()
image.save(logo, format="PNG")
logo.seek(0)
response = self.client.patch(
f"{self.ENDPOINT}1/",
{
"app_logo": SimpleUploadedFile(
name="too-large.png",
content=logo.getvalue(),
content_type="image/png",
),
},
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertIn(
"uploaded logo exceeds the maximum allowed image size",
str(response.data).lower(),
)
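The limit check exercised here plausibly compares the decoded dimensions against the setting before accepting the upload; a sketch (the validator name and wording are assumptions based on the asserted error message):

from django.conf import settings
from django.core.exceptions import ValidationError
from PIL import Image


def validate_logo_pixels(image: Image.Image) -> None:
    # Hypothetical guard matching the 400 response asserted above.
    if image.width * image.height > settings.MAX_IMAGE_PIXELS:
        raise ValidationError(
            "Uploaded logo exceeds the maximum allowed image size.",
        )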
def test_api_rejects_malicious_svg_logo(self) -> None:
"""
GIVEN:
@@ -679,7 +831,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
config.save()
with (
patch("documents.tasks.llmindex_index.delay") as mock_update,
patch("documents.tasks.llmindex_index.apply_async") as mock_update,
patch("paperless_ai.indexing.vector_store_file_exists") as mock_exists,
):
mock_exists.return_value = False

View File

@@ -18,6 +18,7 @@ from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.core import mail
from django.core.cache import cache
from django.core.files.uploadedfile import SimpleUploadedFile
from django.db import DataError
from django.test import override_settings
from django.utils import timezone
@@ -1377,6 +1378,79 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertIsNone(overrides.document_type_id)
self.assertIsNone(overrides.tag_ids)
def test_upload_with_path_traversal_filename_is_reduced_to_basename(self) -> None:
self.consume_file_mock.return_value = celery.result.AsyncResult(
id=str(uuid.uuid4()),
)
payload = SimpleUploadedFile(
"../../outside.pdf",
(Path(__file__).parent / "samples" / "simple.pdf").read_bytes(),
content_type="application/pdf",
)
response = self.client.post(
"/api/documents/post_document/",
{"document": payload},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
self.assertEqual(input_doc.original_file.name, "outside.pdf")
self.assertEqual(overrides.filename, "outside.pdf")
self.assertNotIn("..", input_doc.original_file.name)
self.assertNotIn("..", overrides.filename)
self.assertTrue(
input_doc.original_file.resolve(strict=False).is_relative_to(
Path(settings.SCRATCH_DIR).resolve(strict=False),
),
)
def test_upload_with_path_traversal_content_disposition_filename_is_reduced_to_basename(
self,
) -> None:
self.consume_file_mock.return_value = celery.result.AsyncResult(
id=str(uuid.uuid4()),
)
pdf_bytes = (Path(__file__).parent / "samples" / "simple.pdf").read_bytes()
boundary = "paperless-boundary"
payload = (
(
f"--{boundary}\r\n"
'Content-Disposition: form-data; name="document"; '
'filename="../../outside.pdf"\r\n'
"Content-Type: application/pdf\r\n\r\n"
).encode()
+ pdf_bytes
+ f"\r\n--{boundary}--\r\n".encode()
)
response = self.client.generic(
"POST",
"/api/documents/post_document/",
payload,
content_type=f"multipart/form-data; boundary={boundary}",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
self.assertEqual(input_doc.original_file.name, "outside.pdf")
self.assertEqual(overrides.filename, "outside.pdf")
self.assertNotIn("..", input_doc.original_file.name)
self.assertNotIn("..", overrides.filename)
self.assertTrue(
input_doc.original_file.resolve(strict=False).is_relative_to(
Path(settings.SCRATCH_DIR).resolve(strict=False),
),
)
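Both tests pin the same contract: only the final path component of a client-supplied filename survives. A sketch of that reduction (the helper name is illustrative):

from pathlib import PurePath


def sanitize_upload_filename(name: str) -> str:
    # "../../outside.pdf" -> "outside.pdf"; a plain name passes through.
    return PurePath(name).name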
def test_document_filters_use_latest_version_content(self) -> None:
root = Document.objects.create(
title="versioned root",

View File

@@ -1503,126 +1503,6 @@ class TestDocumentSearchApi(DirectoriesMixin, APITestCase):
[d2.id, d1.id, d3.id],
)
def test_search_ordering_by_score(self) -> None:
"""ordering=-score must return results in descending relevance order (best first)."""
backend = get_backend()
# doc_high has more occurrences of the search term → higher BM25 score
doc_low = Document.objects.create(
title="score sort low",
content="apple",
checksum="SCL1",
)
doc_high = Document.objects.create(
title="score sort high",
content="apple apple apple apple apple",
checksum="SCH1",
)
backend.add_or_update(doc_low)
backend.add_or_update(doc_high)
# -score = descending = best first (highest score)
response = self.client.get("/api/documents/?query=apple&ordering=-score")
self.assertEqual(response.status_code, status.HTTP_200_OK)
ids = [r["id"] for r in response.data["results"]]
self.assertEqual(
ids[0],
doc_high.id,
"Most relevant doc should be first for -score",
)
# score = ascending = worst first (lowest score)
response = self.client.get("/api/documents/?query=apple&ordering=score")
self.assertEqual(response.status_code, status.HTTP_200_OK)
ids = [r["id"] for r in response.data["results"]]
self.assertEqual(
ids[0],
doc_low.id,
"Least relevant doc should be first for +score",
)
def test_search_with_tantivy_native_sort(self) -> None:
"""When ordering by a Tantivy-sortable field, results must be correctly sorted."""
backend = get_backend()
for i, asn in enumerate([30, 10, 20]):
doc = Document.objects.create(
title=f"sortable doc {i}",
content="searchable content",
checksum=f"TNS{i}",
archive_serial_number=asn,
)
backend.add_or_update(doc)
response = self.client.get(
"/api/documents/?query=searchable&ordering=archive_serial_number",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
asns = [doc["archive_serial_number"] for doc in response.data["results"]]
self.assertEqual(asns, [10, 20, 30])
response = self.client.get(
"/api/documents/?query=searchable&ordering=-archive_serial_number",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
asns = [doc["archive_serial_number"] for doc in response.data["results"]]
self.assertEqual(asns, [30, 20, 10])
def test_search_page_2_returns_correct_slice(self) -> None:
"""Page 2 must return the second slice, not overlap with page 1."""
backend = get_backend()
for i in range(10):
doc = Document.objects.create(
title=f"doc {i}",
content="paginated content",
checksum=f"PG2{i}",
archive_serial_number=i + 1,
)
backend.add_or_update(doc)
response = self.client.get(
"/api/documents/?query=paginated&ordering=archive_serial_number&page=1&page_size=3",
)
page1_ids = [r["id"] for r in response.data["results"]]
self.assertEqual(len(page1_ids), 3)
response = self.client.get(
"/api/documents/?query=paginated&ordering=archive_serial_number&page=2&page_size=3",
)
page2_ids = [r["id"] for r in response.data["results"]]
self.assertEqual(len(page2_ids), 3)
# No overlap between pages
self.assertEqual(set(page1_ids) & set(page2_ids), set())
# Page 2 ASNs are higher than page 1
page1_asns = [
Document.objects.get(pk=pk).archive_serial_number for pk in page1_ids
]
page2_asns = [
Document.objects.get(pk=pk).archive_serial_number for pk in page2_ids
]
self.assertTrue(max(page1_asns) < min(page2_asns))
def test_search_all_field_contains_all_ids_when_paginated(self) -> None:
"""The 'all' field must contain every matching ID, even when paginated."""
backend = get_backend()
doc_ids = []
for i in range(10):
doc = Document.objects.create(
title=f"all field doc {i}",
content="allfield content",
checksum=f"AF{i}",
)
backend.add_or_update(doc)
doc_ids.append(doc.pk)
response = self.client.get(
"/api/documents/?query=allfield&page=1&page_size=3",
headers={"Accept": "application/json; version=9"},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(len(response.data["results"]), 3)
# "all" must contain ALL 10 matching IDs
self.assertCountEqual(response.data["all"], doc_ids)
@mock.patch("documents.bulk_edit.bulk_update_documents")
def test_global_search(self, m) -> None:
"""

View File

@@ -4,7 +4,6 @@ import tempfile
from pathlib import Path
from unittest import mock
from celery import states
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.test import override_settings
@@ -13,6 +12,7 @@ from rest_framework.test import APITestCase
from documents.models import PaperlessTask
from documents.permissions import has_system_status_permission
from documents.tests.factories import PaperlessTaskFactory
from paperless import version
@@ -258,10 +258,10 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains an OK classifier status
"""
        PaperlessTaskFactory(
            task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER,
            trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
            status=PaperlessTask.Status.SUCCESS,
        )
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
@@ -295,11 +295,11 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains an ERROR classifier status
"""
        PaperlessTaskFactory(
            task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER,
            trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
            status=PaperlessTask.Status.FAILURE,
            result_message="Classifier training failed",
        )
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
@@ -319,10 +319,10 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains an OK sanity check status
"""
        PaperlessTaskFactory(
            task_type=PaperlessTask.TaskType.SANITY_CHECK,
            trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
            status=PaperlessTask.Status.SUCCESS,
        )
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
@@ -356,11 +356,11 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains an ERROR sanity check status
"""
        PaperlessTaskFactory(
            task_type=PaperlessTask.TaskType.SANITY_CHECK,
            trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
            status=PaperlessTask.Status.FAILURE,
            result_message="5 issues found.",
        )
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
@@ -405,10 +405,10 @@ class TestSystemStatus(APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["tasks"]["llmindex_status"], "WARNING")
        PaperlessTaskFactory(
            task_type=PaperlessTask.TaskType.LLM_INDEX,
            trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
            status=PaperlessTask.Status.SUCCESS,
        )
response = self.client.get(self.ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
@@ -425,11 +425,11 @@ class TestSystemStatus(APITestCase):
- The response contains the correct AI status
"""
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai"):
            PaperlessTaskFactory(
                task_type=PaperlessTask.TaskType.LLM_INDEX,
                trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
                status=PaperlessTask.Status.FAILURE,
                result_message="AI index update failed",
            )
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)

View File

@@ -1,425 +1,586 @@
"""Tests for the /api/tasks/ endpoint.
Covers:
- v10 serializer (new field names)
- v9 serializer (backwards-compatible field names)
- Filtering, ordering, acknowledge, acknowledge_all, summary, active, run
"""
from datetime import timedelta
from unittest import mock

import pytest
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.utils import timezone
from rest_framework import status
from rest_framework.test import APIClient

from documents.models import Document
from documents.models import PaperlessTask
from documents.tests.factories import PaperlessTaskFactory
from documents.views import TasksViewSet
pytestmark = pytest.mark.api
ENDPOINT = "/api/tasks/"
ACCEPT_V10 = "application/json; version=10"
ACCEPT_V9 = "application/json; version=9"
@pytest.mark.django_db()
class TestGetTasksV10:
    def test_list_returns_tasks(self, admin_client: APIClient) -> None:
        """GET /api/tasks/ returns all tasks visible to the admin."""
        PaperlessTaskFactory.create_batch(2)

        response = admin_client.get(ENDPOINT)

        assert response.status_code == status.HTTP_200_OK
        assert len(response.data) == 2

    def test_related_document_ids_populated_from_result_data(
        self,
        admin_client: APIClient,
    ) -> None:
        """related_document_ids includes the consumed document_id from result_data."""
        PaperlessTaskFactory(
            status=PaperlessTask.Status.SUCCESS,
            result_data={"document_id": 7},
        )

        response = admin_client.get(ENDPOINT)

        assert response.status_code == status.HTTP_200_OK
        assert response.data[0]["related_document_ids"] == [7]

    def test_related_document_ids_includes_duplicate_of(
        self,
        admin_client: APIClient,
    ) -> None:
        """related_document_ids includes duplicate_of when the file was already archived."""
        PaperlessTaskFactory(
            status=PaperlessTask.Status.SUCCESS,
            result_data={"duplicate_of": 12},
        )

        response = admin_client.get(ENDPOINT)

        assert response.status_code == status.HTTP_200_OK
        assert response.data[0]["related_document_ids"] == [12]

    def test_filter_by_task_type(self, admin_client: APIClient) -> None:
        """?task_type= filters results to tasks of that type only."""
        PaperlessTaskFactory(task_type=PaperlessTask.TaskType.CONSUME_FILE)
        PaperlessTaskFactory(task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER)

        response = admin_client.get(
            ENDPOINT,
            {"task_type": PaperlessTask.TaskType.TRAIN_CLASSIFIER},
        )

        assert response.status_code == status.HTTP_200_OK
        assert len(response.data) == 1
        assert response.data[0]["task_type"] == PaperlessTask.TaskType.TRAIN_CLASSIFIER

    def test_filter_by_status(self, admin_client: APIClient) -> None:
        """?status= filters results to tasks with that status only."""
        PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
        PaperlessTaskFactory(status=PaperlessTask.Status.SUCCESS)

        response = admin_client.get(
            ENDPOINT,
            {"status": PaperlessTask.Status.SUCCESS},
        )

        assert response.status_code == status.HTTP_200_OK
        assert len(response.data) == 1
        assert response.data[0]["status"] == PaperlessTask.Status.SUCCESS

    def test_filter_by_task_id(self, admin_client: APIClient) -> None:
        """?task_id= returns only the task with that UUID."""
        task = PaperlessTaskFactory()
        PaperlessTaskFactory()  # unrelated task that should not appear

        response = admin_client.get(ENDPOINT, {"task_id": task.task_id})

        assert response.status_code == status.HTTP_200_OK
        assert len(response.data) == 1
        assert response.data[0]["task_id"] == task.task_id

    def test_filter_by_acknowledged(self, admin_client: APIClient) -> None:
        """?acknowledged=false returns only tasks that have not been acknowledged."""
        PaperlessTaskFactory(acknowledged=False)
        PaperlessTaskFactory(acknowledged=True)

        response = admin_client.get(ENDPOINT, {"acknowledged": "false"})

        assert response.status_code == status.HTTP_200_OK
        assert len(response.data) == 1
        assert response.data[0]["acknowledged"] is False

    def test_filter_is_complete_true(self, admin_client: APIClient) -> None:
        """?is_complete=true returns only SUCCESS and FAILURE tasks."""
        PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
        PaperlessTaskFactory(status=PaperlessTask.Status.SUCCESS)
        PaperlessTaskFactory(status=PaperlessTask.Status.FAILURE)

        response = admin_client.get(ENDPOINT, {"is_complete": "true"})

        assert response.status_code == status.HTTP_200_OK
        assert len(response.data) == 2
        returned_statuses = {t["status"] for t in response.data}
        assert returned_statuses == {
            PaperlessTask.Status.SUCCESS,
            PaperlessTask.Status.FAILURE,
        }

    def test_filter_is_complete_false(self, admin_client: APIClient) -> None:
        """?is_complete=false returns only PENDING and STARTED tasks."""
        PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
        PaperlessTaskFactory(status=PaperlessTask.Status.STARTED)
        PaperlessTaskFactory(status=PaperlessTask.Status.SUCCESS)

        response = admin_client.get(ENDPOINT, {"is_complete": "false"})

        assert response.status_code == status.HTTP_200_OK
        assert len(response.data) == 2
        returned_statuses = {t["status"] for t in response.data}
        assert returned_statuses == {
            PaperlessTask.Status.PENDING,
            PaperlessTask.Status.STARTED,
        }

    def test_default_ordering_is_newest_first(self, admin_client: APIClient) -> None:
        """Tasks are returned in descending date_created order (newest first)."""
        base = timezone.now()
        t1 = PaperlessTaskFactory(date_created=base)
        t2 = PaperlessTaskFactory(date_created=base + timedelta(seconds=1))
        t3 = PaperlessTaskFactory(date_created=base + timedelta(seconds=2))

        response = admin_client.get(ENDPOINT)

        assert response.status_code == status.HTTP_200_OK
        ids = [t["task_id"] for t in response.data]
        assert ids == [t3.task_id, t2.task_id, t1.task_id]

    def test_list_is_owner_aware(
        self,
        admin_user: User,
        regular_user: User,
    ) -> None:
        """The task list only shows tasks the user owns or that are unowned."""
        regular_user.user_permissions.add(
            Permission.objects.get(codename="view_paperlesstask"),
        )
        client = APIClient()
        client.force_authenticate(user=regular_user)
        client.credentials(HTTP_ACCEPT=ACCEPT_V10)

        PaperlessTaskFactory(owner=admin_user)
        shared_task = PaperlessTaskFactory()
        own_task = PaperlessTaskFactory(owner=regular_user)

        response = client.get(ENDPOINT)

        assert response.status_code == status.HTTP_200_OK
        assert len(response.data) == 2
        returned_task_ids = {t["task_id"] for t in response.data}
        assert shared_task.task_id in returned_task_ids
        assert own_task.task_id in returned_task_ids
@pytest.mark.django_db()
class TestGetTasksV9:
def test_task_name_equals_task_type_value(self, v9_client: APIClient) -> None:
"""task_name mirrors the task_type value for v9 backwards compatibility."""
PaperlessTaskFactory(task_type=PaperlessTask.TaskType.CONSUME_FILE)
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["task_name"] == "consume_file"
def test_task_file_name_from_input_data(self, v9_client: APIClient) -> None:
"""task_file_name is read from input_data['filename']."""
PaperlessTaskFactory(input_data={"filename": "report.pdf"})
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["task_file_name"] == "report.pdf"
def test_task_file_name_none_when_no_filename_key(
self,
v9_client: APIClient,
) -> None:
"""task_file_name is None when filename is absent from input_data."""
PaperlessTaskFactory(input_data={})
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["task_file_name"] is None
def test_type_scheduled_maps_to_scheduled_task(self, v9_client: APIClient) -> None:
"""trigger_source=scheduled maps to type='SCHEDULED_TASK' in v9."""
PaperlessTaskFactory(trigger_source=PaperlessTask.TriggerSource.SCHEDULED)
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["type"] == "SCHEDULED_TASK"
def test_type_system_maps_to_auto_task(self, v9_client: APIClient) -> None:
"""trigger_source=system maps to type='AUTO_TASK' in v9."""
PaperlessTaskFactory(trigger_source=PaperlessTask.TriggerSource.SYSTEM)
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["type"] == "AUTO_TASK"
def test_type_web_ui_maps_to_manual_task(self, v9_client: APIClient) -> None:
"""trigger_source=web_ui maps to type='MANUAL_TASK' in v9."""
PaperlessTaskFactory(trigger_source=PaperlessTask.TriggerSource.WEB_UI)
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["type"] == "MANUAL_TASK"
def test_type_manual_maps_to_manual_task(self, v9_client: APIClient) -> None:
"""trigger_source=manual maps to type='MANUAL_TASK' in v9."""
PaperlessTaskFactory(trigger_source=PaperlessTask.TriggerSource.MANUAL)
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["type"] == "MANUAL_TASK"
def test_related_document_from_result_data_document_id(
self,
v9_client: APIClient,
) -> None:
"""related_document is taken from result_data['document_id'] in v9."""
PaperlessTaskFactory(
status=PaperlessTask.Status.SUCCESS,
result_data={"document_id": 99},
)
response = self.client.get(self.ENDPOINT + "?task_id=bad-task-id")
response = v9_client.get(ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(len(response.data), 0)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["related_document"] == 99
def test_acknowledge_tasks(self) -> None:
"""
GIVEN:
- Attempted celery tasks
WHEN:
- API call is made to get mark task as acknowledged
THEN:
- Task is marked as acknowledged
"""
task = PaperlessTask.objects.create(
task_id=str(uuid.uuid4()),
task_file_name="task_one.pdf",
def test_related_document_none_when_no_result_data(
self,
v9_client: APIClient,
) -> None:
"""related_document is None when result_data is absent in v9."""
PaperlessTaskFactory(result_data=None)
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["related_document"] is None
def test_duplicate_documents_from_result_data(self, v9_client: APIClient) -> None:
"""duplicate_documents includes duplicate_of from result_data in v9."""
PaperlessTaskFactory(
status=PaperlessTask.Status.SUCCESS,
result_data={"duplicate_of": 55},
)
response = self.client.get(self.ENDPOINT)
self.assertEqual(len(response.data), 1)
response = v9_client.get(ENDPOINT)
response = self.client.post(
self.ENDPOINT + "acknowledge/",
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["duplicate_documents"] == [55]
def test_duplicate_documents_empty_when_no_result_data(
self,
v9_client: APIClient,
) -> None:
"""duplicate_documents is an empty list when result_data is absent in v9."""
PaperlessTaskFactory(result_data=None)
response = v9_client.get(ENDPOINT)
assert response.status_code == status.HTTP_200_OK
assert response.data[0]["duplicate_documents"] == []
def test_filter_by_task_name_maps_to_task_type(self, v9_client: APIClient) -> None:
"""?task_name=consume_file filter maps to the task_type field for v9 compatibility."""
PaperlessTaskFactory(task_type=PaperlessTask.TaskType.CONSUME_FILE)
PaperlessTaskFactory(task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER)
response = v9_client.get(ENDPOINT, {"task_name": "consume_file"})
assert response.status_code == status.HTTP_200_OK
assert len(response.data) == 1
assert response.data[0]["task_name"] == "consume_file"
def test_filter_by_type_maps_to_trigger_source(self, v9_client: APIClient) -> None:
"""?type=SCHEDULED_TASK filter maps to trigger_source=scheduled for v9 compatibility."""
PaperlessTaskFactory(trigger_source=PaperlessTask.TriggerSource.SCHEDULED)
PaperlessTaskFactory(trigger_source=PaperlessTask.TriggerSource.WEB_UI)
response = v9_client.get(ENDPOINT, {"type": "SCHEDULED_TASK"})
assert response.status_code == status.HTTP_200_OK
assert len(response.data) == 1
assert response.data[0]["type"] == "SCHEDULED_TASK"
@pytest.mark.django_db()
class TestAcknowledge:
def test_returns_count(self, admin_client: APIClient) -> None:
"""POST acknowledge/ returns the count of tasks that were acknowledged."""
task1 = PaperlessTaskFactory()
task2 = PaperlessTaskFactory()
response = admin_client.post(
ENDPOINT + "acknowledge/",
{"tasks": [task1.id, task2.id]},
format="json",
)
assert response.status_code == status.HTTP_200_OK
assert response.data == {"result": 2}
def test_acknowledged_tasks_excluded_from_unacked_filter(
self,
admin_client: APIClient,
) -> None:
"""Acknowledged tasks no longer appear when filtering with ?acknowledged=false."""
task = PaperlessTaskFactory()
admin_client.post(
ENDPOINT + "acknowledge/",
{"tasks": [task.id]},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
response = self.client.get(self.ENDPOINT + "?acknowledged=false")
self.assertEqual(len(response.data), 0)
def test_acknowledge_tasks_requires_change_permission(self) -> None:
"""
GIVEN:
- A regular user initially without change permissions
- A regular user with change permissions
WHEN:
- API call is made to acknowledge tasks
THEN:
- The first user is forbidden from acknowledging tasks
- The second user is allowed to acknowledge tasks
"""
regular_user = User.objects.create_user(username="test")
self.client.force_authenticate(user=regular_user)
task = PaperlessTask.objects.create(
task_id=str(uuid.uuid4()),
task_file_name="task_one.pdf",
format="json",
)
response = self.client.post(
self.ENDPOINT + "acknowledge/",
response = admin_client.get(ENDPOINT, {"acknowledged": "false"})
assert response.status_code == status.HTTP_200_OK
assert len(response.data) == 0
def test_requires_change_permission(self, user_client: APIClient) -> None:
"""Regular users without change_paperlesstask permission receive 403."""
task = PaperlessTaskFactory()
response = user_client.post(
ENDPOINT + "acknowledge/",
{"tasks": [task.id]},
format="json",
)
self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
regular_user2 = User.objects.create_user(username="test2")
regular_user2.user_permissions.add(
assert response.status_code == status.HTTP_403_FORBIDDEN
def test_succeeds_with_change_permission(self, regular_user: User) -> None:
"""Users granted change_paperlesstask permission can acknowledge tasks."""
regular_user.user_permissions.add(
Permission.objects.get(codename="change_paperlesstask"),
)
regular_user2.save()
self.client.force_authenticate(user=regular_user2)
regular_user.save()
client = APIClient()
client.force_authenticate(user=regular_user)
client.credentials(HTTP_ACCEPT=ACCEPT_V10)
task = PaperlessTaskFactory()
response = client.post(
ENDPOINT + "acknowledge/",
{"tasks": [task.id]},
)
assert response.status_code == status.HTTP_200_OK
@pytest.mark.django_db()
class TestAcknowledgeAll:
def test_marks_only_completed_tasks(self, admin_client: APIClient) -> None:
"""acknowledge_all/ marks only SUCCESS and FAILURE tasks as acknowledged."""
PaperlessTaskFactory(status=PaperlessTask.Status.SUCCESS, acknowledged=False)
PaperlessTaskFactory(status=PaperlessTask.Status.FAILURE, acknowledged=False)
PaperlessTaskFactory(status=PaperlessTask.Status.PENDING, acknowledged=False)
response = admin_client.post(ENDPOINT + "acknowledge_all/")
assert response.status_code == status.HTTP_200_OK
assert response.data == {"result": 2}
def test_skips_already_acknowledged(self, admin_client: APIClient) -> None:
"""acknowledge_all/ does not re-acknowledge tasks that are already acknowledged."""
PaperlessTaskFactory(status=PaperlessTask.Status.SUCCESS, acknowledged=True)
PaperlessTaskFactory(status=PaperlessTask.Status.SUCCESS, acknowledged=False)
response = admin_client.post(ENDPOINT + "acknowledge_all/")
assert response.status_code == status.HTTP_200_OK
assert response.data == {"result": 1}
def test_skips_pending_and_started(self, admin_client: APIClient) -> None:
"""acknowledge_all/ does not touch PENDING or STARTED tasks."""
PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
PaperlessTaskFactory(status=PaperlessTask.Status.STARTED)
response = admin_client.post(ENDPOINT + "acknowledge_all/")
assert response.status_code == status.HTTP_200_OK
assert response.data == {"result": 0}
def test_includes_revoked(self, admin_client: APIClient) -> None:
"""acknowledge_all/ marks REVOKED tasks as acknowledged."""
PaperlessTaskFactory(status=PaperlessTask.Status.REVOKED, acknowledged=False)
response = admin_client.post(ENDPOINT + "acknowledge_all/")
assert response.status_code == status.HTTP_200_OK
assert response.data == {"result": 1}
@pytest.mark.django_db()
class TestSummary:
def test_returns_per_type_totals(self, admin_client: APIClient) -> None:
"""summary/ returns per-type counts of total, success, and failure tasks."""
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.CONSUME_FILE,
status=PaperlessTask.Status.SUCCESS,
)
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.CONSUME_FILE,
status=PaperlessTask.Status.FAILURE,
)
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER,
status=PaperlessTask.Status.SUCCESS,
)
response = admin_client.get(ENDPOINT + "summary/")
assert response.status_code == status.HTTP_200_OK
by_type = {item["task_type"]: item for item in response.data}
assert by_type["consume_file"]["total_count"] == 2
assert by_type["consume_file"]["success_count"] == 1
assert by_type["consume_file"]["failure_count"] == 1
assert by_type["train_classifier"]["total_count"] == 1
@pytest.mark.django_db()
class TestActive:
def test_returns_pending_and_started_only(self, admin_client: APIClient) -> None:
"""active/ returns only tasks in PENDING or STARTED status."""
PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
PaperlessTaskFactory(status=PaperlessTask.Status.STARTED)
PaperlessTaskFactory(status=PaperlessTask.Status.SUCCESS)
PaperlessTaskFactory(status=PaperlessTask.Status.FAILURE)
response = admin_client.get(ENDPOINT + "active/")
assert response.status_code == status.HTTP_200_OK
assert len(response.data) == 2
active_statuses = {t["status"] for t in response.data}
assert active_statuses == {
PaperlessTask.Status.PENDING,
PaperlessTask.Status.STARTED,
}
def test_excludes_revoked_tasks_from_active(self, admin_client: APIClient) -> None:
"""active/ excludes REVOKED tasks."""
PaperlessTaskFactory(status=PaperlessTask.Status.REVOKED)
response = admin_client.get(ENDPOINT + "active/")
assert response.status_code == status.HTTP_200_OK
assert len(response.data) == 0
@pytest.mark.django_db()
class TestRun:
def test_forbidden_for_regular_user(self, user_client: APIClient) -> None:
"""Regular users without add_paperlesstask permission receive 403 from run/."""
response = user_client.post(
ENDPOINT + "run/",
{"task_type": PaperlessTask.TaskType.TRAIN_CLASSIFIER},
format="json",
)
assert response.status_code == status.HTTP_403_FORBIDDEN
def test_dispatches_via_apply_async_with_manual_trigger_header(
self,
admin_client: APIClient,
) -> None:
"""run/ dispatches the task via apply_async with trigger_source=manual in headers."""
fake_task_id = str(uuid.uuid4())
mock_async_result = mock.Mock()
mock_async_result.id = fake_task_id
mock_apply_async = mock.Mock(return_value=mock_async_result)
with mock.patch(
"documents.views.train_classifier.apply_async",
mock_apply_async,
):
response = admin_client.post(
ENDPOINT + "run/",
{"task_type": PaperlessTask.TaskType.TRAIN_CLASSIFIER},
format="json",
)
assert response.status_code == status.HTTP_200_OK
assert response.data == {"task_id": fake_task_id}
mock_apply_async.assert_called_once_with(
kwargs={},
headers={"trigger_source": "manual"},
)
def test_returns_400_for_consume_file(self, admin_client: APIClient) -> None:
"""consume_file cannot be manually triggered via the run endpoint."""
response = admin_client.post(
ENDPOINT + "run/",
{"task_type": PaperlessTask.TaskType.CONSUME_FILE},
format="json",
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
def test_returns_400_for_invalid_task_type(self, admin_client: APIClient) -> None:
"""run/ returns 400 for an unrecognized task_type value."""
response = admin_client.post(
ENDPOINT + "run/",
{"task_type": "not_a_real_type"},
format="json",
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
def test_sanity_check_dispatched_with_correct_kwargs(
self,
admin_client: APIClient,
) -> None:
"""run/ dispatches sanity_check with raise_on_error=False and manual trigger header."""
fake_task_id = str(uuid.uuid4())
mock_async_result = mock.Mock()
mock_async_result.id = fake_task_id
mock_apply_async = mock.Mock(return_value=mock_async_result)
with mock.patch(
"documents.views.sanity_check.apply_async",
mock_apply_async,
):
response = admin_client.post(
ENDPOINT + "run/",
{"task_type": PaperlessTask.TaskType.SANITY_CHECK},
format="json",
)
assert response.status_code == status.HTTP_200_OK
assert response.data == {"task_id": fake_task_id}
mock_apply_async.assert_called_once_with(
kwargs={"raise_on_error": False},
headers={"trigger_source": "manual"},
)
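# --- Sketch, not part of the diff: PaperlessTaskFactory is referenced
# throughout these tests but defined elsewhere in the suite. A minimal
# factory_boy definition consistent with the fields used above (an
# assumption, not the project's actual factory) could look like:
import uuid
import factory
from documents.models import PaperlessTask

class PaperlessTaskFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = PaperlessTask

    # Defaults mirror the fields the tests override per-case.
    task_id = factory.LazyFunction(lambda: str(uuid.uuid4()))
    task_type = PaperlessTask.TaskType.CONSUME_FILE
    trigger_source = PaperlessTask.TriggerSource.WEB_UI
    status = PaperlessTask.Status.PENDING
    acknowledged = False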

View File

@@ -1328,7 +1328,7 @@ class PreConsumeTestCase(DirectoriesMixin, GetConsumerMixin, TestCase):
environment = args[1]
self.assertEqual(command[0], script.name)
self.assertEqual(command[1], str(self.test_file))
self.assertEqual(len(command), 1)
subset = {
"DOCUMENT_SOURCE_PATH": str(c.input_doc.original_file),
@@ -1478,11 +1478,7 @@ class PostConsumeTestCase(DirectoriesMixin, GetConsumerMixin, TestCase):
environment = args[1]
self.assertEqual(command[0], script.name)
self.assertEqual(command[1], str(doc.pk))
self.assertEqual(command[5], f"/api/documents/{doc.pk}/download/")
self.assertEqual(command[6], f"/api/documents/{doc.pk}/thumb/")
self.assertEqual(command[7], "my_bank")
self.assertCountEqual(command[8].split(","), ["a", "b"])
self.assertEqual(len(command), 1)
subset = {
"DOCUMENT_ID": str(doc.pk),

View File

@@ -211,7 +211,7 @@ class TestCreateClassifier:
call_command("document_create_classifier", skip_checks=True)
m.assert_called_once_with(scheduled=False, status_callback=mocker.ANY)
m.assert_called_once_with(status_callback=mocker.ANY)
assert callable(m.call_args.kwargs["status_callback"])
def test_create_classifier_callback_output(self, mocker: MockerFixture) -> None:

View File

@@ -1,7 +1,7 @@
"""Tests for the sanity checker module.
Tests exercise ``check_sanity`` as a whole, verifying document validation,
orphan detection, task recording, and the iter_wrapper contract.
orphan detection, and the iter_wrapper contract.
"""
from __future__ import annotations
@@ -12,13 +12,12 @@ from typing import TYPE_CHECKING
import pytest
from documents.models import Document
from documents.models import PaperlessTask
from documents.sanity_checker import check_sanity
if TYPE_CHECKING:
from collections.abc import Iterable
from documents.models import Document
from documents.tests.conftest import PaperlessDirs
@@ -229,35 +228,6 @@ class TestCheckSanityIterWrapper:
assert not messages.has_error
@pytest.mark.django_db
class TestCheckSanityTaskRecording:
@pytest.mark.parametrize(
("expected_type", "scheduled"),
[
pytest.param(PaperlessTask.TaskType.SCHEDULED_TASK, True, id="scheduled"),
pytest.param(PaperlessTask.TaskType.MANUAL_TASK, False, id="manual"),
],
)
@pytest.mark.usefixtures("_media_settings")
def test_task_type(self, expected_type: str, *, scheduled: bool) -> None:
check_sanity(scheduled=scheduled)
task = PaperlessTask.objects.latest("date_created")
assert task.task_name == PaperlessTask.TaskName.CHECK_SANITY
assert task.type == expected_type
def test_success_status(self, sample_doc: Document) -> None:
check_sanity()
task = PaperlessTask.objects.latest("date_created")
assert task.status == "SUCCESS"
def test_failure_status(self, sample_doc: Document) -> None:
Path(sample_doc.source_path).unlink()
check_sanity()
task = PaperlessTask.objects.latest("date_created")
assert task.status == "FAILURE"
assert "Check logs for details" in task.result
@pytest.mark.django_db
class TestCheckSanityLogMessages:
def test_logs_doc_issues(

View File

@@ -1,250 +1,302 @@
import uuid
from unittest import mock
import celery
from django.contrib.auth import get_user_model
from django.test import TestCase
import pytest
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentMetadataOverrides
from documents.data_models import DocumentSource
from documents.models import Document
from documents.models import PaperlessTask
from documents.signals.handlers import add_to_index
from documents.signals.handlers import before_task_publish_handler
from documents.signals.handlers import task_failure_handler
from documents.signals.handlers import task_postrun_handler
from documents.signals.handlers import task_prerun_handler
from documents.tests.test_consumer import fake_magic_from_file
from documents.tests.utils import DirectoriesMixin
@mock.patch("documents.consumer.magic.from_file", fake_magic_from_file)
class TestTaskSignalHandler(DirectoriesMixin, TestCase):
@classmethod
def setUpTestData(cls) -> None:
super().setUpTestData()
cls.user = get_user_model().objects.create_user(username="testuser")
@pytest.fixture
def consume_input_doc():
doc = mock.MagicMock(spec=ConsumableDocument)
# original_file is a Path; configure the nested mock so .name works
doc.original_file = mock.MagicMock()
doc.original_file.name = "invoice.pdf"
doc.original_path = None
doc.mime_type = "application/pdf"
doc.mailrule_id = None
doc.source = DocumentSource.WebUI
return doc
@pytest.fixture
def consume_overrides(django_user_model):
user = django_user_model.objects.create_user(username="testuser")
overrides = mock.MagicMock(spec=DocumentMetadataOverrides)
overrides.owner_id = user.id
return overrides
def send_publish(
task_name: str,
args: tuple,
kwargs: dict,
headers: dict | None = None,
) -> str:
from documents.signals.handlers import before_task_publish_handler
task_id = str(uuid.uuid4())
hdrs = {"task": task_name, "id": task_id, **(headers or {})}
before_task_publish_handler(sender=task_name, headers=hdrs, body=(args, kwargs, {}))
return task_id
@pytest.mark.django_db
class TestBeforeTaskPublishHandler:
def test_creates_task_for_consume_file(self, consume_input_doc, consume_overrides):
task_id = send_publish(
"documents.tasks.consume_file",
(consume_input_doc, consume_overrides),
{},
)
task = PaperlessTask.objects.get(task_id=task_id)
assert task.task_type == PaperlessTask.TaskType.CONSUME_FILE
assert task.status == PaperlessTask.Status.PENDING
assert task.trigger_source == PaperlessTask.TriggerSource.WEB_UI
assert task.input_data["filename"] == "invoice.pdf"
assert task.owner_id == consume_overrides.owner_id
def test_creates_task_for_train_classifier(self):
task_id = send_publish("documents.tasks.train_classifier", (), {})
task = PaperlessTask.objects.get(task_id=task_id)
assert task.task_type == PaperlessTask.TaskType.TRAIN_CLASSIFIER
assert task.trigger_source == PaperlessTask.TriggerSource.MANUAL
def test_creates_task_for_sanity_check(self):
task_id = send_publish("documents.tasks.sanity_check", (), {})
task = PaperlessTask.objects.get(task_id=task_id)
assert task.task_type == PaperlessTask.TaskType.SANITY_CHECK
def test_creates_task_for_process_mail_accounts(self):
task_id = send_publish(
"paperless_mail.tasks.process_mail_accounts",
(),
{"account_ids": [1, 2]},
)
task = PaperlessTask.objects.get(task_id=task_id)
assert task.task_type == PaperlessTask.TaskType.MAIL_FETCH
assert task.input_data["account_ids"] == [1, 2]
def test_scheduled_header_sets_trigger_source(self):
task_id = send_publish(
"documents.tasks.train_classifier",
(),
{},
headers={"trigger_source": "scheduled"},
)
task = PaperlessTask.objects.get(task_id=task_id)
assert task.trigger_source == PaperlessTask.TriggerSource.SCHEDULED
def test_system_header_sets_trigger_source(self):
task_id = send_publish(
"documents.tasks.llmindex_index",
(),
{"rebuild": True},
headers={"trigger_source": "system"},
)
task = PaperlessTask.objects.get(task_id=task_id)
assert task.trigger_source == PaperlessTask.TriggerSource.SYSTEM
def test_ignores_untracked_task(self):
send_publish("documents.tasks.bulk_update_documents", ([1, 2],), {})
assert PaperlessTask.objects.count() == 0
def test_ignores_none_headers(self):
from documents.signals.handlers import before_task_publish_handler
before_task_publish_handler(sender=None, headers=None, body=None)
assert PaperlessTask.objects.count() == 0
def test_consume_folder_source_maps_correctly(
self,
consume_input_doc,
consume_overrides,
):
consume_input_doc.source = DocumentSource.ConsumeFolder
task_id = send_publish(
"documents.tasks.consume_file",
(consume_input_doc, consume_overrides),
{},
)
task = PaperlessTask.objects.get(task_id=task_id)
assert task.trigger_source == PaperlessTask.TriggerSource.FOLDER_CONSUME
def test_email_source_maps_correctly(self, consume_input_doc, consume_overrides):
consume_input_doc.source = DocumentSource.MailFetch
task_id = send_publish(
"documents.tasks.consume_file",
(consume_input_doc, consume_overrides),
{},
)
task = PaperlessTask.objects.get(task_id=task_id)
assert task.trigger_source == PaperlessTask.TriggerSource.EMAIL_CONSUME
@pytest.mark.django_db
class TestTaskPrerunHandler:
def test_marks_task_started(self):
task = PaperlessTask.objects.create(
task_id=str(uuid.uuid4()),
task_type=PaperlessTask.TaskType.CONSUME_FILE,
trigger_source=PaperlessTask.TriggerSource.MANUAL,
status=PaperlessTask.Status.PENDING,
)
from documents.signals.handlers import task_prerun_handler
task_prerun_handler(task_id=task.task_id)
task.refresh_from_db()
assert task.status == PaperlessTask.Status.STARTED
assert task.date_started is not None
def test_ignores_unknown_task_id(self):
from documents.signals.handlers import task_prerun_handler
task_prerun_handler(task_id="nonexistent-id") # must not raise
def test_ignores_none_task_id(self):
from documents.signals.handlers import task_prerun_handler
task_prerun_handler(task_id=None) # must not raise
@pytest.mark.django_db
class TestTaskPostrunHandler:
def _started_task(self) -> PaperlessTask:
from django.utils import timezone
return PaperlessTask.objects.create(
task_id=str(uuid.uuid4()),
task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER,
trigger_source=PaperlessTask.TriggerSource.MANUAL,
status=PaperlessTask.Status.STARTED,
date_started=timezone.now(),
)
def test_records_success_with_dict_result(self):
task = self._started_task()
from documents.signals.handlers import task_postrun_handler
task_postrun_handler(
task_id=task.task_id,
retval={"document_id": 42},
state="SUCCESS",
)
task.refresh_from_db()
assert task.status == PaperlessTask.Status.SUCCESS
assert task.result_data == {"document_id": 42}
assert task.date_done is not None
assert task.duration_seconds is not None
assert task.wait_time_seconds is not None
def test_records_failure_state(self):
task = self._started_task()
from documents.signals.handlers import task_postrun_handler
task_postrun_handler(task_id=task.task_id, retval="some error", state="FAILURE")
task.refresh_from_db()
assert task.status == PaperlessTask.Status.FAILURE
def test_parses_legacy_new_document_string(self):
task = self._started_task()
from documents.signals.handlers import task_postrun_handler
task_postrun_handler(
task_id=task.task_id,
retval="New document id 42 created",
state="SUCCESS",
)
task.refresh_from_db()
assert task.result_data["document_id"] == 42
assert task.result_message == "New document id 42 created"
def test_parses_legacy_duplicate_string(self):
task = self._started_task()
from documents.signals.handlers import task_postrun_handler
task_postrun_handler(
task_id=task.task_id,
retval="It is a duplicate of some document (#99).",
state="FAILURE",
)
task.refresh_from_db()
assert task.result_data["duplicate_of"] == 99
assert task.result_data["duplicate_in_trash"] is False
def test_ignores_unknown_task_id(self):
from documents.signals.handlers import task_postrun_handler
task_postrun_handler(
task_id="nonexistent",
retval=None,
state="SUCCESS",
) # must not raise
def test_records_revoked_state(self):
task = self._started_task()
from documents.signals.handlers import task_postrun_handler
task_postrun_handler(task_id=task.task_id, retval=None, state="REVOKED")
task.refresh_from_db()
assert task.status == PaperlessTask.Status.REVOKED
@pytest.mark.django_db
class TestTaskFailureHandler:
def test_records_failure_with_exception(self):
from django.utils import timezone
task = PaperlessTask.objects.create(
task_id=str(uuid.uuid4()),
task_type=PaperlessTask.TaskType.CONSUME_FILE,
trigger_source=PaperlessTask.TriggerSource.WEB_UI,
status=PaperlessTask.Status.STARTED,
date_started=timezone.now(),
)
from documents.signals.handlers import task_failure_handler
task_failure_handler(
task_id=task.task_id,
exception=ValueError("PDF parse failed"),
traceback=None,
)
task.refresh_from_db()
assert task.status == PaperlessTask.Status.FAILURE
assert task.result_data["error_type"] == "ValueError"
assert task.result_data["error_message"] == "PDF parse failed"
assert task.date_done is not None
def test_records_traceback_when_provided(self):
import sys
from django.utils import timezone
task = PaperlessTask.objects.create(
task_id=str(uuid.uuid4()),
task_type=PaperlessTask.TaskType.CONSUME_FILE,
trigger_source=PaperlessTask.TriggerSource.WEB_UI,
status=PaperlessTask.Status.STARTED,
date_started=timezone.now(),
)
try:
raise ValueError("test error")
except ValueError:
tb = sys.exc_info()[2]
from documents.signals.handlers import task_failure_handler
task_failure_handler(
task_id=task.task_id,
exception=ValueError("test error"),
traceback=tb,
)
task.refresh_from_db()
assert "traceback" in task.result_data
assert len(task.result_data["traceback"]) <= 5000
def test_ignores_none_task_id(self):
from documents.signals.handlers import task_failure_handler
task_failure_handler(task_id=None, exception=ValueError("x"), traceback=None)
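# --- Sketch, not part of the diff: the duration_seconds / wait_time_seconds
# assertions above imply the postrun handler derives both from the stored
# timestamps. A plausible illustration (assumed logic, not the real handler):
from django.utils import timezone

def _finish(task):
    # Mark completion and derive timing from date_created/date_started.
    now = timezone.now()
    task.date_done = now
    if task.date_started is not None:
        task.duration_seconds = (now - task.date_started).total_seconds()
        task.wait_time_seconds = (
            task.date_started - task.date_created
        ).total_seconds()
    task.save()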

View File

@@ -4,7 +4,6 @@ from pathlib import Path
from unittest import mock
import pytest
from celery import states
from django.conf import settings
from django.test import TestCase
from django.test import override_settings
@@ -14,7 +13,6 @@ from documents import tasks
from documents.models import Correspondent
from documents.models import Document
from documents.models import DocumentType
from documents.models import PaperlessTask
from documents.models import Tag
from documents.sanity_checker import SanityCheckFailedException
from documents.sanity_checker import SanityCheckMessages
@@ -40,7 +38,8 @@ class TestClassifier(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
def test_train_classifier_with_auto_tag(self, load_classifier) -> None:
load_classifier.return_value = None
Tag.objects.create(matching_algorithm=Tag.MATCH_AUTO, name="test")
tasks.train_classifier()
with self.assertRaises(ValueError):
tasks.train_classifier()
load_classifier.assert_called_once()
self.assertIsNotFile(settings.MODEL_FILE)
@@ -48,7 +47,8 @@ class TestClassifier(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
def test_train_classifier_with_auto_type(self, load_classifier) -> None:
load_classifier.return_value = None
DocumentType.objects.create(matching_algorithm=Tag.MATCH_AUTO, name="test")
tasks.train_classifier()
with self.assertRaises(ValueError):
tasks.train_classifier()
load_classifier.assert_called_once()
self.assertIsNotFile(settings.MODEL_FILE)
@@ -56,7 +56,8 @@ class TestClassifier(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
def test_train_classifier_with_auto_correspondent(self, load_classifier) -> None:
load_classifier.return_value = None
Correspondent.objects.create(matching_algorithm=Tag.MATCH_AUTO, name="test")
tasks.train_classifier()
with self.assertRaises(ValueError):
tasks.train_classifier()
load_classifier.assert_called_once()
self.assertIsNotFile(settings.MODEL_FILE)
@@ -298,7 +299,7 @@ class TestAIIndex(DirectoriesMixin, TestCase):
WHEN:
- llmindex_index task is called
THEN:
- update_llm_index is called, and the task is marked as success
- update_llm_index is called and its result is returned
"""
Document.objects.create(
title="test",
@@ -308,13 +309,9 @@ class TestAIIndex(DirectoriesMixin, TestCase):
# lazy-loaded so mock the actual function
with mock.patch("paperless_ai.indexing.update_llm_index") as update_llm_index:
update_llm_index.return_value = "LLM index updated successfully."
tasks.llmindex_index()
result = tasks.llmindex_index()
update_llm_index.assert_called_once()
task = PaperlessTask.objects.get(
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
)
self.assertEqual(task.status, states.SUCCESS)
self.assertEqual(task.result, "LLM index updated successfully.")
self.assertEqual(result, "LLM index updated successfully.")
@override_settings(
AI_ENABLED=True,
@@ -325,9 +322,9 @@ class TestAIIndex(DirectoriesMixin, TestCase):
GIVEN:
- Document exists, AI is enabled, llm index backend is set
WHEN:
- llmindex_index task is called
- llmindex_index task is called and update_llm_index raises an exception
THEN:
- update_llm_index raises an exception, and the task is marked as failure
- the exception propagates to the caller
"""
Document.objects.create(
title="test",
@@ -337,13 +334,9 @@ class TestAIIndex(DirectoriesMixin, TestCase):
# lazy-loaded so mock the actual function
with mock.patch("paperless_ai.indexing.update_llm_index") as update_llm_index:
update_llm_index.side_effect = Exception("LLM index update failed.")
tasks.llmindex_index()
with self.assertRaises(Exception, msg="LLM index update failed."):
tasks.llmindex_index()
update_llm_index.assert_called_once()
task = PaperlessTask.objects.get(
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
)
self.assertEqual(task.status, states.FAILURE)
self.assertIn("LLM index update failed.", task.result)
def test_update_document_in_llm_index(self) -> None:
"""

View File

@@ -8,6 +8,7 @@ import zipfile
from collections import defaultdict
from collections import deque
from datetime import datetime
from datetime import timedelta
from pathlib import Path
from time import mktime
from typing import TYPE_CHECKING
@@ -20,7 +21,6 @@ from urllib.parse import urlparse
import httpx
import magic
import pathvalidate
from celery import states
from django.conf import settings
from django.contrib.auth.models import Group
from django.contrib.auth.models import User
@@ -38,7 +38,6 @@ from django.db.models import Model
from django.db.models import OuterRef
from django.db.models import Prefetch
from django.db.models import Q
from django.db.models import QuerySet
from django.db.models import Subquery
from django.db.models import Sum
from django.db.models import When
@@ -92,6 +91,7 @@ from rest_framework.mixins import DestroyModelMixin
from rest_framework.mixins import ListModelMixin
from rest_framework.mixins import RetrieveModelMixin
from rest_framework.mixins import UpdateModelMixin
from rest_framework.permissions import IsAdminUser
from rest_framework.permissions import IsAuthenticated
from rest_framework.request import Request
from rest_framework.response import Response
@@ -192,7 +192,7 @@ from documents.serialisers import PostDocumentSerializer
from documents.serialisers import RemovePasswordDocumentsSerializer
from documents.serialisers import ReprocessDocumentsSerializer
from documents.serialisers import RotateDocumentsSerializer
from documents.serialisers import RunTaskViewSerializer
from documents.serialisers import RunTaskSerializer
from documents.serialisers import SavedViewSerializer
from documents.serialisers import SearchResultSerializer
from documents.serialisers import SerializerWithPerms
@@ -201,7 +201,9 @@ from documents.serialisers import ShareLinkSerializer
from documents.serialisers import StoragePathSerializer
from documents.serialisers import StoragePathTestSerializer
from documents.serialisers import TagSerializer
from documents.serialisers import TasksViewSerializer
from documents.serialisers import TaskSerializerV9
from documents.serialisers import TaskSerializerV10
from documents.serialisers import TaskSummarySerializer
from documents.serialisers import TrashSerializer
from documents.serialisers import UiSettingsViewSerializer
from documents.serialisers import WorkflowActionSerializer
@@ -249,13 +251,6 @@ if settings.AUDIT_LOG_ENABLED:
logger = logging.getLogger("paperless.api")
# Crossover point for intersect_and_order: below this count use a targeted
# IN-clause query; at or above this count fall back to a full-table scan +
# Python set intersection. The IN-clause is faster for small result sets but
# degrades on SQLite with thousands of parameters. PostgreSQL handles large IN
# clauses efficiently, so this threshold mainly protects SQLite users.
_TANTIVY_INTERSECT_THRESHOLD = 5_000
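# Illustration only (not project code) of the crossover the removed comment
# describes — pick the intersection strategy by result-set size:
def _visible_ids(filtered_qs, all_ids, threshold=_TANTIVY_INTERSECT_THRESHOLD):
    if len(all_ids) <= threshold:
        # Small result set: targeted IN-clause query.
        return set(filtered_qs.filter(pk__in=all_ids).values_list("pk", flat=True))
    # Large result set: full scan + Python set intersection (kinder to SQLite).
    return set(filtered_qs.values_list("pk", flat=True)) & set(all_ids)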
class IndexView(TemplateView):
template_name = "index.html"
@@ -299,7 +294,7 @@ class IndexView(TemplateView):
return context
class PassUserMixin(GenericAPIView):
class PassUserMixin(GenericAPIView[Any]):
"""
Pass a user object to serializer
"""
@@ -465,7 +460,10 @@ class PermissionsAwareDocumentCountMixin(BulkPermissionMixin, PassUserMixin):
@extend_schema_view(**generate_object_with_permissions_schema(CorrespondentSerializer))
class CorrespondentViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet):
class CorrespondentViewSet(
PermissionsAwareDocumentCountMixin,
ModelViewSet[Correspondent],
):
model = Correspondent
queryset = Correspondent.objects.select_related("owner").order_by(Lower("name"))
@@ -502,7 +500,7 @@ class CorrespondentViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet):
@extend_schema_view(**generate_object_with_permissions_schema(TagSerializer))
class TagViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet):
class TagViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet[Tag]):
model = Tag
serializer_class = TagSerializer
document_count_through = Document.tags.through
@@ -581,7 +579,10 @@ class TagViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet):
@extend_schema_view(**generate_object_with_permissions_schema(DocumentTypeSerializer))
class DocumentTypeViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet):
class DocumentTypeViewSet(
PermissionsAwareDocumentCountMixin,
ModelViewSet[DocumentType],
):
model = DocumentType
queryset = DocumentType.objects.select_related("owner").order_by(Lower("name"))
@@ -816,7 +817,7 @@ class DocumentViewSet(
UpdateModelMixin,
DestroyModelMixin,
ListModelMixin,
GenericViewSet,
GenericViewSet[Document],
):
model = Document
queryset = Document.objects.all()
@@ -1256,7 +1257,10 @@ class DocumentViewSet(
),
)
def suggestions(self, request, pk=None):
doc = get_object_or_404(Document.objects.select_related("owner"), pk=pk)
doc = get_object_or_404(
Document.objects.select_related("owner").prefetch_related("versions"),
pk=pk,
)
if request.user is not None and not has_perms_owner_aware(
request.user,
"view_document",
@@ -1960,7 +1964,7 @@ class ChatStreamingSerializer(serializers.Serializer):
],
name="dispatch",
)
class ChatStreamingView(GenericAPIView):
class ChatStreamingView(GenericAPIView[Any]):
permission_classes = (IsAuthenticated,)
serializer_class = ChatStreamingSerializer
@@ -2068,16 +2072,19 @@ class UnifiedSearchViewSet(DocumentViewSet):
if not self._is_search_request():
return super().list(request)
from documents.search import SearchHit
from documents.search import SearchMode
from documents.search import TantivyBackend
from documents.search import TantivyRelevanceList
from documents.search import get_backend
try:
backend = get_backend()
# ORM-filtered queryset: permissions + field filters + ordering (DRF backends applied)
filtered_qs = self.filter_queryset(self.get_queryset())
user = None if request.user.is_superuser else request.user
active_search_params = self._get_active_search_params(request)
if len(active_search_params) > 1:
raise ValidationError(
{
"detail": _(
@@ -2086,161 +2093,73 @@ class UnifiedSearchViewSet(DocumentViewSet):
},
)
if (
"text" in request.query_params
or "title_search" in request.query_params
or "query" in request.query_params
):
raise PermissionDenied(_("Insufficient permissions."))
all_ids = backend.more_like_this_ids(more_like_doc_id, user=user)
ordered_ids = intersect_and_order(
all_ids,
filtered_qs,
use_tantivy_sort=True,
)
page_offset = (page_num - 1) * page_size
page_ids = ordered_ids[page_offset : page_offset + page_size]
page_hits = [
SearchHit(id=doc_id, score=0.0, rank=rank, highlights={})
for rank, doc_id in enumerate(page_ids, start=page_offset + 1)
]
return ordered_ids, page_hits, page_offset
try:
sort_field_name, sort_reverse, use_tantivy_sort, page_num, page_size = (
parse_search_params()
)
backend = get_backend()
filtered_qs = self.filter_queryset(self.get_queryset())
user = None if request.user.is_superuser else request.user
if "more_like_id" in request.query_params:
ordered_ids, page_hits, page_offset = run_more_like_this(
backend,
user,
filtered_qs,
if "text" in request.query_params:
search_mode = SearchMode.TEXT
query_str = request.query_params["text"]
elif "title_search" in request.query_params:
search_mode = SearchMode.TITLE
query_str = request.query_params["title_search"]
else:
search_mode = SearchMode.QUERY
query_str = request.query_params["query"]
results = backend.search(
query_str,
user=user,
page=1,
page_size=10000,
sort_field=None,
sort_reverse=False,
search_mode=search_mode,
)
else:
# more_like_id — validate permission on the seed document first
try:
more_like_doc_id = int(request.query_params["more_like_id"])
more_like_doc = Document.objects.select_related("owner").get(
pk=more_like_doc_id,
)
except (TypeError, ValueError, Document.DoesNotExist):
raise PermissionDenied(_("Invalid more_like_id"))
if not has_perms_owner_aware(
request.user,
"view_document",
more_like_doc,
):
raise PermissionDenied(_("Insufficient permissions."))
results = backend.more_like_this(
more_like_doc_id,
user=user,
page=1,
page_size=10000,
)
hits_by_id = {h["id"]: h for h in results.hits}
# Determine sort order: no ordering param -> Tantivy relevance; otherwise -> ORM order
ordering_param = request.query_params.get("ordering", "").lstrip("-")
if not ordering_param:
# Preserve Tantivy relevance order; intersect with ORM-visible IDs
orm_ids = set(filtered_qs.values_list("pk", flat=True))
ordered_hits = [h for h in results.hits if h["id"] in orm_ids]
else:
# Use ORM ordering (already applied by DocumentsOrderingFilter)
hit_ids = set(hits_by_id.keys())
orm_ordered_ids = filtered_qs.filter(id__in=hit_ids).values_list(
"pk",
flat=True,
)
ordered_hits = [
hits_by_id[pk] for pk in orm_ordered_ids if pk in hits_by_id
]
rl = TantivyRelevanceList(ordered_hits)
page = self.paginate_queryset(rl)
if page is not None:
@@ -2250,18 +2169,15 @@ class UnifiedSearchViewSet(DocumentViewSet):
if get_boolean(
str(request.query_params.get("include_selection_data", "false")),
):
# NOTE: pk__in=ordered_ids generates a large SQL IN clause
# for big result sets. Acceptable today but may need a temp
# table or chunked approach if selection_data becomes slow
# at scale (tens of thousands of matching documents).
all_ids = [h["id"] for h in ordered_hits]
response.data["selection_data"] = (
self._get_selection_data_for_queryset(
filtered_qs.filter(pk__in=ordered_ids),
filtered_qs.filter(pk__in=all_ids),
)
)
return response
serializer = self.get_serializer(page_hits, many=True)
serializer = self.get_serializer(ordered_hits, many=True)
return Response(serializer.data)
except NotFound:
@@ -2374,7 +2290,7 @@ class LogViewSet(ViewSet):
@extend_schema_view(**generate_object_with_permissions_schema(SavedViewSerializer))
class SavedViewViewSet(BulkPermissionMixin, PassUserMixin, ModelViewSet):
class SavedViewViewSet(BulkPermissionMixin, PassUserMixin, ModelViewSet[SavedView]):
model = SavedView
queryset = SavedView.objects.select_related("owner").prefetch_related(
@@ -2852,7 +2768,7 @@ class RemovePasswordDocumentsView(DocumentOperationPermissionMixin):
},
),
)
class PostDocumentView(GenericAPIView):
class PostDocumentView(GenericAPIView[Any]):
permission_classes = (IsAuthenticated,)
serializer_class = PostDocumentSerializer
parser_classes = (parsers.MultiPartParser,)
@@ -2973,7 +2889,7 @@ class PostDocumentView(GenericAPIView):
},
),
)
class SelectionDataView(GenericAPIView):
class SelectionDataView(GenericAPIView[Any]):
permission_classes = (IsAuthenticated,)
serializer_class = DocumentListSerializer
parser_classes = (parsers.MultiPartParser, parsers.JSONParser)
@@ -3077,7 +2993,7 @@ class SelectionDataView(GenericAPIView):
},
),
)
class SearchAutoCompleteView(GenericAPIView):
class SearchAutoCompleteView(GenericAPIView[Any]):
permission_classes = (IsAuthenticated,)
def get(self, request, format=None):
@@ -3167,17 +3083,20 @@ class GlobalSearchView(PassUserMixin):
docs = all_docs.filter(title__icontains=query)[:OBJECT_LIMIT]
else:
user = None if request.user.is_superuser else request.user
fts_results = get_backend().search(
query,
user=user,
page=1,
page_size=1000,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
)
docs_by_id = all_docs.in_bulk([hit["id"] for hit in fts_results.hits])
docs = [
docs_by_id[hit["id"]]
for hit in fts_results.hits
if hit["id"] in docs_by_id
][:OBJECT_LIMIT]
saved_views = (
get_objects_for_user_owner_aware(
@@ -3355,7 +3274,7 @@ class GlobalSearchView(PassUserMixin):
},
),
)
class StatisticsView(GenericAPIView):
class StatisticsView(GenericAPIView[Any]):
permission_classes = (IsAuthenticated,)
def get(self, request, format=None):
@@ -3457,7 +3376,7 @@ class StatisticsView(GenericAPIView):
)
class BulkDownloadView(DocumentSelectionMixin, GenericAPIView):
class BulkDownloadView(DocumentSelectionMixin, GenericAPIView[Any]):
permission_classes = (IsAuthenticated,)
serializer_class = BulkDownloadSerializer
parser_classes = (parsers.JSONParser,)
@@ -3510,7 +3429,7 @@ class BulkDownloadView(DocumentSelectionMixin, GenericAPIView):
@extend_schema_view(**generate_object_with_permissions_schema(StoragePathSerializer))
class StoragePathViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet):
class StoragePathViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet[StoragePath]):
model = StoragePath
queryset = StoragePath.objects.select_related("owner").order_by(
@@ -3574,7 +3493,7 @@ class StoragePathViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet):
return Response(result)
class UiSettingsView(GenericAPIView):
class UiSettingsView(GenericAPIView[Any]):
queryset = UiSettings.objects.all()
permission_classes = (IsAuthenticated, PaperlessObjectPermissions)
serializer_class = UiSettingsViewSerializer
@@ -3672,7 +3591,7 @@ class UiSettingsView(GenericAPIView):
},
),
)
class RemoteVersionView(GenericAPIView):
class RemoteVersionView(GenericAPIView[Any]):
cache_key = "remote_version_view_latest_release"
def get(self, request, format=None):
@@ -3749,37 +3668,52 @@ class RemoteVersionView(GenericAPIView):
),
],
)
class TasksViewSet(ReadOnlyModelViewSet):
class TasksViewSet(ReadOnlyModelViewSet[PaperlessTask]):
permission_classes = (IsAuthenticated, PaperlessObjectPermissions)
serializer_class = TasksViewSerializer
filter_backends = (
DjangoFilterBackend,
OrderingFilter,
ObjectOwnedOrGrantedPermissionsFilter,
)
filterset_class = PaperlessTaskFilterSet
ordering_fields = [
"date_created",
"date_done",
"status",
"task_type",
"duration_seconds",
"wait_time_seconds",
]
ordering = ["-date_created"]
TASK_AND_ARGS_BY_NAME = {
PaperlessTask.TaskName.INDEX_OPTIMIZE: (index_optimize, {}),
PaperlessTask.TaskName.TRAIN_CLASSIFIER: (
train_classifier,
{"scheduled": False},
),
PaperlessTask.TaskName.CHECK_SANITY: (
sanity_check,
{"scheduled": False, "raise_on_error": False},
),
PaperlessTask.TaskName.LLMINDEX_UPDATE: (
llmindex_index,
{"scheduled": False, "rebuild": False},
),
}
def get_serializer_class(self):
# v9: use backwards-compatible serializer with old field names
if self.request.version and int(self.request.version) < 10:
return TaskSerializerV9
return TaskSerializerV10
def get_queryset(self):
queryset = PaperlessTask.objects.all().order_by("-date_created")
queryset = PaperlessTask.objects.all()
# v9 backwards compat: map old query params to new field names
if self.request.version and int(self.request.version) < 10:
task_name = self.request.query_params.get("task_name")
if task_name is not None:
queryset = queryset.filter(task_type=task_name)
task_type_old = self.request.query_params.get("type")
if task_type_old is not None:
# Old type values: AUTO_TASK -> SYSTEM, SCHEDULED_TASK -> SCHEDULED, MANUAL_TASK -> MANUAL
old_to_new = {
"AUTO_TASK": PaperlessTask.TriggerSource.SYSTEM,
"SCHEDULED_TASK": PaperlessTask.TriggerSource.SCHEDULED,
"MANUAL_TASK": PaperlessTask.TriggerSource.MANUAL,
}
new_source = old_to_new.get(task_type_old)
if new_source:
queryset = queryset.filter(trigger_source=new_source)
# v10+: direct task_id param for backwards compat
task_id = self.request.query_params.get("task_id")
if task_id is not None:
queryset = PaperlessTask.objects.filter(task_id=task_id)
queryset = queryset.filter(task_id=task_id)
return queryset
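# Illustration of the shim above (v10 param names assumed to come from
# PaperlessTaskFilterSet, which is not shown in this diff):
#   v9: GET /api/tasks/?task_name=consume_file -> filter(task_type="consume_file")
#   v9: GET /api/tasks/?type=SCHEDULED_TASK    -> filter(trigger_source="scheduled")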
@action(
@@ -3791,39 +3725,123 @@ class TasksViewSet(ReadOnlyModelViewSet):
serializer = AcknowledgeTasksViewSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
task_ids = serializer.validated_data.get("tasks")
tasks = self.get_queryset().filter(id__in=task_ids)
count = tasks.update(acknowledged=True)
return Response({"result": count})
@action(
methods=["post"],
detail=False,
permission_classes=[IsAuthenticated, AcknowledgeTasksPermissions],
)
def acknowledge_all(self, request):
"""Acknowledge all completed tasks visible to the requesting user."""
count = (
self.get_queryset()
.filter(
acknowledged=False,
status__in=[
PaperlessTask.Status.SUCCESS,
PaperlessTask.Status.FAILURE,
PaperlessTask.Status.REVOKED,
],
)
return Response({"result": result})
except Exception:
return HttpResponseBadRequest()
.update(acknowledged=True)
)
return Response({"result": count})
@action(methods=["post"], detail=False)
@action(methods=["get"], detail=False)
def summary(self, request):
"""Aggregated task statistics per task_type over the last N days (default 30)."""
from django.db.models import Avg
from django.db.models import Count
from django.db.models import Max
from django.db.models import Q
days = int(request.query_params.get("days", 30))
cutoff = timezone.now() - timedelta(days=days)
queryset = self.get_queryset().filter(date_created__gte=cutoff)
data = queryset.values("task_type").annotate(
total_count=Count("id"),
pending_count=Count("id", filter=Q(status=PaperlessTask.Status.PENDING)),
success_count=Count("id", filter=Q(status=PaperlessTask.Status.SUCCESS)),
failure_count=Count("id", filter=Q(status=PaperlessTask.Status.FAILURE)),
avg_duration_seconds=Avg(
"duration_seconds",
filter=Q(duration_seconds__isnull=False),
),
avg_wait_time_seconds=Avg(
"wait_time_seconds",
filter=Q(wait_time_seconds__isnull=False),
),
last_run=Max("date_created"),
last_success=Max(
"date_done",
filter=Q(status=PaperlessTask.Status.SUCCESS),
),
last_failure=Max(
"date_done",
filter=Q(status=PaperlessTask.Status.FAILURE),
),
)
serializer = TaskSummarySerializer(data, many=True)
return Response(serializer.data)
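# Shape sketch for one element of the summary/ response (keys follow the
# annotation above; values illustrative only):
#   {"task_type": "consume_file", "total_count": 2, "pending_count": 0,
#    "success_count": 1, "failure_count": 1, "avg_duration_seconds": 1.8,
#    "avg_wait_time_seconds": 0.4, "last_run": "...", "last_success": "...",
#    "last_failure": "..."}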
@action(methods=["get"], detail=False)
def active(self, request):
"""Currently pending and running tasks (capped at 50)."""
queryset = (
self.get_queryset()
.filter(
status__in=[PaperlessTask.Status.PENDING, PaperlessTask.Status.STARTED],
)
.order_by("-date_created")[:50]
)
serializer = self.get_serializer(queryset, many=True)
return Response(serializer.data)
@action(methods=["post"], detail=False, permission_classes=[IsAdminUser])
def run(self, request):
serializer = RunTaskViewSerializer(data=request.data)
"""Manually dispatch a background task. Superuser (admin) only."""
serializer = RunTaskSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
task_name = serializer.validated_data.get("task_name")
task_type = serializer.validated_data.get("task_type")
if not request.user.is_superuser:
return HttpResponseForbidden("Insufficient permissions")
task_func_map = {
PaperlessTask.TaskType.INDEX_OPTIMIZE: (index_optimize, {}),
PaperlessTask.TaskType.TRAIN_CLASSIFIER: (train_classifier, {}),
PaperlessTask.TaskType.SANITY_CHECK: (
sanity_check,
{"raise_on_error": False},
),
PaperlessTask.TaskType.LLM_INDEX: (
llmindex_index,
{"rebuild": False},
),
}
if task_type not in task_func_map:
return Response(
{"error": f"Task type '{task_type}' cannot be manually triggered"},
status=status.HTTP_400_BAD_REQUEST,
)
try:
task_func, task_args = self.TASK_AND_ARGS_BY_NAME[task_name]
result = task_func(**task_args)
return Response({"result": result})
task_func, task_kwargs = task_func_map[task_type]
async_result = task_func.apply_async(
kwargs=task_kwargs,
headers={"trigger_source": "manual"},
)
return Response({"task_id": async_result.id})
except Exception as e:
logger.warning(f"An error occurred running task: {e!s}")
logger.warning(f"Error running task: {e!s}")
return HttpResponseServerError(
"Error running task, check logs for more detail.",
)
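A minimal sketch of dispatching a manual run as a superuser, assuming the default DRF action route and that TaskType choices serialize to plain strings:

from rest_framework.test import APIClient

def run_sanity_check(admin_user):
    client = APIClient()
    client.force_authenticate(admin_user)
    response = client.post(
        "/api/tasks/run/",
        {"task_type": "sanity_check"},  # hypothetical choice value
        format="json",
    )
    # On success the view returns the Celery task id, so callers can poll
    # /api/tasks/?task_id=... for the PaperlessTask row recorded by signals.
    return response.data["task_id"]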
class ShareLinkViewSet(ModelViewSet, PassUserMixin):
class ShareLinkViewSet(PassUserMixin, ModelViewSet[ShareLink]):
model = ShareLink
queryset = ShareLink.objects.all()
@@ -3840,7 +3858,7 @@ class ShareLinkViewSet(ModelViewSet, PassUserMixin):
ordering_fields = ("created", "expiration", "document")
class ShareLinkBundleViewSet(ModelViewSet, PassUserMixin):
class ShareLinkBundleViewSet(PassUserMixin, ModelViewSet[ShareLinkBundle]):
model = ShareLinkBundle
queryset = ShareLinkBundle.objects.all()
@@ -4197,7 +4215,7 @@ class BulkEditObjectsView(PassUserMixin):
return Response({"result": "OK"})
class WorkflowTriggerViewSet(ModelViewSet):
class WorkflowTriggerViewSet(ModelViewSet[WorkflowTrigger]):
permission_classes = (IsAuthenticated, PaperlessObjectPermissions)
serializer_class = WorkflowTriggerSerializer
@@ -4215,7 +4233,7 @@ class WorkflowTriggerViewSet(ModelViewSet):
return super().partial_update(request, *args, **kwargs)
class WorkflowActionViewSet(ModelViewSet):
class WorkflowActionViewSet(ModelViewSet[WorkflowAction]):
permission_classes = (IsAuthenticated, PaperlessObjectPermissions)
serializer_class = WorkflowActionSerializer
@@ -4240,7 +4258,7 @@ class WorkflowActionViewSet(ModelViewSet):
return super().partial_update(request, *args, **kwargs)
class WorkflowViewSet(ModelViewSet):
class WorkflowViewSet(ModelViewSet[Workflow]):
permission_classes = (IsAuthenticated, PaperlessObjectPermissions)
serializer_class = WorkflowSerializer
@@ -4258,7 +4276,7 @@ class WorkflowViewSet(ModelViewSet):
)
class CustomFieldViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet):
class CustomFieldViewSet(PermissionsAwareDocumentCountMixin, ModelViewSet[CustomField]):
permission_classes = (IsAuthenticated, PaperlessObjectPermissions)
serializer_class = CustomFieldSerializer
@@ -4442,11 +4460,11 @@ class SystemStatusView(PassUserMixin):
last_trained_task = (
PaperlessTask.objects.filter(
task_name=PaperlessTask.TaskName.TRAIN_CLASSIFIER,
task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER,
status__in=[
states.SUCCESS,
states.FAILURE,
states.REVOKED,
PaperlessTask.Status.SUCCESS,
PaperlessTask.Status.FAILURE,
PaperlessTask.Status.REVOKED,
], # ignore running tasks
)
.order_by("-date_done")
@@ -4457,20 +4475,23 @@ class SystemStatusView(PassUserMixin):
if last_trained_task is None:
classifier_status = "WARNING"
classifier_error = "No classifier training tasks found"
elif last_trained_task and last_trained_task.status != states.SUCCESS:
elif (
last_trained_task
and last_trained_task.status != PaperlessTask.Status.SUCCESS
):
classifier_status = "ERROR"
classifier_error = last_trained_task.result
classifier_error = last_trained_task.result_message
classifier_last_trained = (
last_trained_task.date_done if last_trained_task else None
)
last_sanity_check = (
PaperlessTask.objects.filter(
task_name=PaperlessTask.TaskName.CHECK_SANITY,
task_type=PaperlessTask.TaskType.SANITY_CHECK,
status__in=[
states.SUCCESS,
states.FAILURE,
states.REVOKED,
PaperlessTask.Status.SUCCESS,
PaperlessTask.Status.FAILURE,
PaperlessTask.Status.REVOKED,
], # ignore running tasks
)
.order_by("-date_done")
@@ -4481,9 +4502,12 @@ class SystemStatusView(PassUserMixin):
if last_sanity_check is None:
sanity_check_status = "WARNING"
sanity_check_error = "No sanity check tasks found"
elif last_sanity_check and last_sanity_check.status != states.SUCCESS:
elif (
last_sanity_check
and last_sanity_check.status != PaperlessTask.Status.SUCCESS
):
sanity_check_status = "ERROR"
sanity_check_error = last_sanity_check.result
sanity_check_error = last_sanity_check.result_message
sanity_check_last_run = (
last_sanity_check.date_done if last_sanity_check else None
)
@@ -4496,7 +4520,7 @@ class SystemStatusView(PassUserMixin):
else:
last_llmindex_update = (
PaperlessTask.objects.filter(
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
task_type=PaperlessTask.TaskType.LLM_INDEX,
)
.order_by("-date_done")
.first()
@@ -4506,9 +4530,12 @@ class SystemStatusView(PassUserMixin):
if last_llmindex_update is None:
llmindex_status = "WARNING"
llmindex_error = "No LLM index update tasks found"
elif last_llmindex_update and last_llmindex_update.status == states.FAILURE:
elif (
last_llmindex_update
and last_llmindex_update.status == PaperlessTask.Status.FAILURE
):
llmindex_status = "ERROR"
llmindex_error = last_llmindex_update.result
llmindex_error = last_llmindex_update.result_message
llmindex_last_modified = (
last_llmindex_update.date_done if last_llmindex_update else None
)

File diff suppressed because it is too large

View File

@@ -1,4 +1,5 @@
import logging
from io import BytesIO
import magic
from allauth.mfa.adapter import get_adapter as get_mfa_adapter
@@ -11,13 +12,16 @@ from django.contrib.auth.models import Group
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.contrib.auth.password_validation import validate_password
from django.core.files.uploadedfile import InMemoryUploadedFile
from django.core.files.uploadedfile import UploadedFile
from PIL import Image
from rest_framework import serializers
from rest_framework.authtoken.serializers import AuthTokenSerializer
from paperless.models import ApplicationConfiguration
from paperless.network import validate_outbound_http_url
from paperless.validators import reject_dangerous_svg
from paperless.validators import validate_raster_image
from paperless_mail.serialisers import ObfuscatedPasswordField
logger = logging.getLogger("paperless.settings")
@@ -70,7 +74,7 @@ class PaperlessAuthTokenSerializer(AuthTokenSerializer):
return attrs
class UserSerializer(PasswordValidationMixin, serializers.ModelSerializer):
class UserSerializer(PasswordValidationMixin, serializers.ModelSerializer[User]):
password = ObfuscatedPasswordField(required=False)
user_permissions = serializers.SlugRelatedField(
many=True,
@@ -138,7 +142,7 @@ class UserSerializer(PasswordValidationMixin, serializers.ModelSerializer):
return user
class GroupSerializer(serializers.ModelSerializer):
class GroupSerializer(serializers.ModelSerializer[Group]):
permissions = serializers.SlugRelatedField(
many=True,
queryset=Permission.objects.exclude(content_type__app_label="admin"),
@@ -154,7 +158,7 @@ class GroupSerializer(serializers.ModelSerializer):
)
class SocialAccountSerializer(serializers.ModelSerializer):
class SocialAccountSerializer(serializers.ModelSerializer[SocialAccount]):
name = serializers.SerializerMethodField()
class Meta:
@@ -172,7 +176,7 @@ class SocialAccountSerializer(serializers.ModelSerializer):
return "Unknown App"
class ProfileSerializer(PasswordValidationMixin, serializers.ModelSerializer):
class ProfileSerializer(PasswordValidationMixin, serializers.ModelSerializer[User]):
email = serializers.EmailField(allow_blank=True, required=False)
password = ObfuscatedPasswordField(required=False, allow_null=False)
auth_token = serializers.SlugRelatedField(read_only=True, slug_field="key")
@@ -205,7 +209,9 @@ class ProfileSerializer(PasswordValidationMixin, serializers.ModelSerializer):
)
class ApplicationConfigurationSerializer(serializers.ModelSerializer):
class ApplicationConfigurationSerializer(
serializers.ModelSerializer[ApplicationConfiguration],
):
user_args = serializers.JSONField(binary=True, allow_null=True)
barcode_tag_mapping = serializers.JSONField(binary=True, allow_null=True)
llm_api_key = ObfuscatedPasswordField(
@@ -233,9 +239,40 @@ class ApplicationConfigurationSerializer(serializers.ModelSerializer):
instance.app_logo.delete()
return super().update(instance, validated_data)
def _sanitize_raster_image(self, file: UploadedFile) -> UploadedFile:
try:
data = BytesIO()
image = Image.open(file)
image.save(data, format=image.format)
data.seek(0)
return InMemoryUploadedFile(
file=data,
field_name=file.field_name,
name=file.name,
content_type=file.content_type,
size=data.getbuffer().nbytes,
charset=getattr(file, "charset", None),
)
finally:
image.close()
def validate_app_logo(self, file: UploadedFile):
if file and magic.from_buffer(file.read(2048), mime=True) == "image/svg+xml":
reject_dangerous_svg(file)
"""
Validates and sanitizes the uploaded app logo image. Model field already restricts to
jpg/png/gif/svg.
"""
if file:
mime_type = magic.from_buffer(file.read(2048), mime=True)
if mime_type == "image/svg+xml":
reject_dangerous_svg(file)
else:
validate_raster_image(file)
if mime_type in {"image/jpeg", "image/png"}:
file = self._sanitize_raster_image(file)
return file
def validate_llm_endpoint(self, value: str | None) -> str | None:
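The re-encode in _sanitize_raster_image is the whole sanitization step: decoding the upload with Pillow and saving it back keeps only the pixel data, discarding any appended or embedded non-image payload (e.g. polyglot files). A standalone sketch of the same round-trip, outside the serializer:

from io import BytesIO
from PIL import Image

def reencode(raw: bytes) -> bytes:
    # Decode and immediately re-encode in the original format; anything that
    # is not part of the decoded image does not survive the round-trip.
    with Image.open(BytesIO(raw)) as image:
        out = BytesIO()
        image.save(out, format=image.format)
    return out.getvalue()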

View File

@@ -133,7 +133,6 @@ INSTALLED_APPS = [
"rest_framework",
"rest_framework.authtoken",
"django_filters",
"django_celery_results",
"guardian",
"allauth",
"allauth.account",
@@ -669,8 +668,6 @@ CELERY_BROKER_TRANSPORT_OPTIONS = {
CELERY_TASK_TRACK_STARTED = True
CELERY_TASK_TIME_LIMIT: Final[int] = get_int_from_env("PAPERLESS_WORKER_TIMEOUT", 1800)
CELERY_RESULT_EXTENDED = True
CELERY_RESULT_BACKEND = "django-db"
CELERY_CACHE_BACKEND = "default"
# https://docs.celeryq.dev/en/stable/userguide/configuration.html#task-serializer

View File

@@ -181,7 +181,10 @@ def parse_beat_schedule() -> dict:
schedule[task["name"]] = {
"task": task["task"],
"schedule": crontab(minute, hour, day_week, day_month, month),
"options": task["options"],
"options": {
**task["options"],
"headers": {"trigger_source": "scheduled"},
},
}
return schedule
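Headers set in beat "options" (like headers passed to apply_async) ride along on the published Celery message, so a signal receiver can read them. A minimal sketch of that mechanism, not the project's actual handler, which is defined elsewhere in this changeset:

from celery.signals import before_task_publish

@before_task_publish.connect
def record_trigger_source(sender=None, headers=None, **kwargs):
    # Tasks published without an explicit header (e.g. a bare .delay())
    # carry no trigger_source, so fall back to a default.
    source = (headers or {}).get("trigger_source", "unknown")
    print(f"{sender} published with trigger_source={source}")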

View File

@@ -186,42 +186,66 @@ def make_expected_schedule(
"Check all e-mail accounts": {
"task": "paperless_mail.tasks.process_mail_accounts",
"schedule": crontab(minute="*/10"),
"options": {"expires": mail_expire},
"options": {
"expires": mail_expire,
"headers": {"trigger_source": "scheduled"},
},
},
"Train the classifier": {
"task": "documents.tasks.train_classifier",
"schedule": crontab(minute="5", hour="*/1"),
"options": {"expires": classifier_expire},
"options": {
"expires": classifier_expire,
"headers": {"trigger_source": "scheduled"},
},
},
"Optimize the index": {
"task": "documents.tasks.index_optimize",
"schedule": crontab(minute=0, hour=0),
"options": {"expires": index_expire},
"options": {
"expires": index_expire,
"headers": {"trigger_source": "scheduled"},
},
},
"Perform sanity check": {
"task": "documents.tasks.sanity_check",
"schedule": crontab(minute=30, hour=0, day_of_week="sun"),
"options": {"expires": sanity_expire},
"options": {
"expires": sanity_expire,
"headers": {"trigger_source": "scheduled"},
},
},
"Empty trash": {
"task": "documents.tasks.empty_trash",
"schedule": crontab(minute=0, hour="1"),
"options": {"expires": empty_trash_expire},
"options": {
"expires": empty_trash_expire,
"headers": {"trigger_source": "scheduled"},
},
},
"Check and run scheduled workflows": {
"task": "documents.tasks.check_scheduled_workflows",
"schedule": crontab(minute="5", hour="*/1"),
"options": {"expires": workflow_expire},
"options": {
"expires": workflow_expire,
"headers": {"trigger_source": "scheduled"},
},
},
"Rebuild LLM index": {
"task": "documents.tasks.llmindex_index",
"schedule": crontab(minute="10", hour="2"),
"options": {"expires": llm_index_expire},
"options": {
"expires": llm_index_expire,
"headers": {"trigger_source": "scheduled"},
},
},
"Cleanup expired share link bundles": {
"task": "documents.tasks.cleanup_expired_share_link_bundles",
"schedule": crontab(minute=0, hour="2"),
"options": {"expires": share_link_cleanup_expire},
"options": {
"expires": share_link_cleanup_expire,
"headers": {"trigger_source": "scheduled"},
},
},
}
@@ -284,6 +308,16 @@ class TestParseBeatSchedule:
schedule = parse_beat_schedule()
assert schedule == expected
def test_parse_beat_schedule_all_entries_have_trigger_source_header(self) -> None:
"""Every beat entry must carry trigger_source=scheduled so the task signal
handler can identify scheduler-originated tasks."""
schedule = parse_beat_schedule()
for name, entry in schedule.items():
headers = entry.get("options", {}).get("headers", {})
assert headers.get("trigger_source") == "scheduled", (
f"Beat entry '{name}' is missing trigger_source header"
)
class TestParseDbSettings:
"""Test suite for parse_db_settings function."""

View File

@@ -1,6 +1,10 @@
from io import BytesIO
from django.conf import settings
from django.core.exceptions import ValidationError
from django.core.files.uploadedfile import UploadedFile
from lxml import etree
from PIL import Image
ALLOWED_SVG_TAGS: set[str] = {
# Basic shapes
@@ -254,3 +258,30 @@ def reject_dangerous_svg(file: UploadedFile) -> None:
raise ValidationError(
f"URI scheme not allowed in {attr_name}: must be #anchor, relative path, or data:image/*",
)
def validate_raster_image(file: UploadedFile) -> None:
"""
Validates that the uploaded file is a valid raster image (JPEG, PNG, etc.)
and does not exceed maximum pixel limits.
Raises ValidationError if the image is invalid or exceeds the allowed size.
"""
file.seek(0)
image_data = file.read()
try:
with Image.open(BytesIO(image_data)) as image:
image.verify()
if (
settings.MAX_IMAGE_PIXELS is not None
and settings.MAX_IMAGE_PIXELS > 0
and image.width * image.height > settings.MAX_IMAGE_PIXELS
):
raise ValidationError(
"Uploaded logo exceeds the maximum allowed image size.",
)
if image.format is None: # pragma: no cover
raise ValidationError("Invalid logo image.")
except (OSError, Image.DecompressionBombError) as e:
raise ValidationError("Invalid logo image.") from e
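A quick sketch of the validator's behaviour on in-memory uploads, run under the project's Django settings (the validator reads settings.MAX_IMAGE_PIXELS); SimpleUploadedFile stands in for the real request file:

from io import BytesIO
from django.core.exceptions import ValidationError
from django.core.files.uploadedfile import SimpleUploadedFile
from PIL import Image
from paperless.validators import validate_raster_image

buf = BytesIO()
Image.new("RGB", (1, 1)).save(buf, format="PNG")
validate_raster_image(SimpleUploadedFile("logo.png", buf.getvalue()))  # passes

try:
    validate_raster_image(SimpleUploadedFile("logo.png", b"\x89PNG broken"))
except ValidationError:
    pass  # undecodable image data is rejected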

View File

@@ -1,5 +1,6 @@
from collections import OrderedDict
from pathlib import Path
from typing import Any
from allauth.mfa import signals
from allauth.mfa.adapter import get_adapter as get_mfa_adapter
@@ -89,7 +90,7 @@ class StandardPagination(PageNumberPagination):
query = self.page.paginator.object_list
if isinstance(query, TantivyRelevanceList):
return query.get_all_ids()
return [h["id"] for h in query._hits]
return self.page.paginator.object_list.values_list("pk", flat=True)
def get_paginated_response_schema(self, schema):
@@ -114,7 +115,7 @@ class FaviconView(View):
return HttpResponseNotFound("favicon.ico not found")
class UserViewSet(ModelViewSet):
class UserViewSet(ModelViewSet[User]):
_BOOL_NOT_PROVIDED = object()
model = User
@@ -216,7 +217,7 @@ class UserViewSet(ModelViewSet):
return HttpResponseNotFound("TOTP not found")
class GroupViewSet(ModelViewSet):
class GroupViewSet(ModelViewSet[Group]):
model = Group
queryset = Group.objects.order_by(Lower("name"))
@@ -229,7 +230,7 @@ class GroupViewSet(ModelViewSet):
ordering_fields = ("name",)
class ProfileView(GenericAPIView):
class ProfileView(GenericAPIView[Any]):
"""
User profile view, only available when logged in
"""
@@ -288,7 +289,7 @@ class ProfileView(GenericAPIView):
},
),
)
class TOTPView(GenericAPIView):
class TOTPView(GenericAPIView[Any]):
"""
TOTP views
"""
@@ -368,7 +369,7 @@ class TOTPView(GenericAPIView):
},
),
)
class GenerateAuthTokenView(GenericAPIView):
class GenerateAuthTokenView(GenericAPIView[Any]):
"""
Generates (or re-generates) an auth token, requires a logged in user
unlike the default DRF endpoint
@@ -397,7 +398,7 @@ class GenerateAuthTokenView(GenericAPIView):
},
),
)
class ApplicationConfigurationViewSet(ModelViewSet):
class ApplicationConfigurationViewSet(ModelViewSet[ApplicationConfiguration]):
model = ApplicationConfiguration
queryset = ApplicationConfiguration.objects
@@ -426,10 +427,9 @@ class ApplicationConfigurationViewSet(ModelViewSet):
and not vector_store_file_exists()
):
# AI index was just enabled and vector store file does not exist
llmindex_index.delay(
rebuild=True,
scheduled=False,
auto=True,
llmindex_index.apply_async(
kwargs={"rebuild": True},
headers={"trigger_source": "system"},
)
@@ -450,7 +450,7 @@ class ApplicationConfigurationViewSet(ModelViewSet):
},
),
)
class DisconnectSocialAccountView(GenericAPIView):
class DisconnectSocialAccountView(GenericAPIView[Any]):
"""
Disconnects a social account provider from the user account
"""
@@ -476,7 +476,7 @@ class DisconnectSocialAccountView(GenericAPIView):
},
),
)
class SocialAccountProvidersView(GenericAPIView):
class SocialAccountProvidersView(GenericAPIView[Any]):
"""
List of social account providers
"""

View File

@@ -4,7 +4,6 @@ from datetime import timedelta
from pathlib import Path
from typing import TYPE_CHECKING
from celery import states
from django.conf import settings
from django.utils import timezone
@@ -28,17 +27,20 @@ def queue_llm_index_update_if_needed(*, rebuild: bool, reason: str) -> bool:
from documents.tasks import llmindex_index
has_running = PaperlessTask.objects.filter(
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
status__in=[states.PENDING, states.STARTED],
task_type=PaperlessTask.TaskType.LLM_INDEX,
status__in=[PaperlessTask.Status.PENDING, PaperlessTask.Status.STARTED],
).exists()
has_recent = PaperlessTask.objects.filter(
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
task_type=PaperlessTask.TaskType.LLM_INDEX,
date_created__gte=(timezone.now() - timedelta(minutes=5)),
).exists()
if has_running or has_recent:
return False
llmindex_index.delay(rebuild=rebuild, scheduled=False, auto=True)
llmindex_index.apply_async(
kwargs={"rebuild": rebuild},
headers={"trigger_source": "system"},
)
logger.warning(
"Queued LLM index update%s: %s",
" (rebuild)" if rebuild else "",

View File

@@ -3,13 +3,13 @@ from unittest.mock import MagicMock
from unittest.mock import patch
import pytest
from celery import states
from django.test import override_settings
from django.utils import timezone
from llama_index.core.base.embeddings.base import BaseEmbedding
from documents.models import Document
from documents.models import PaperlessTask
from documents.tests.factories import PaperlessTaskFactory
from paperless_ai import indexing
@@ -292,13 +292,15 @@ def test_queue_llm_index_update_if_needed_enqueues_when_idle_or_skips_recent() -
)
assert result is True
mock_task.delay.assert_called_once_with(rebuild=True, scheduled=False, auto=True)
mock_task.apply_async.assert_called_once_with(
kwargs={"rebuild": True},
headers={"trigger_source": "system"},
)
PaperlessTask.objects.create(
task_id="task-1",
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
status=states.STARTED,
date_created=timezone.now(),
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.LLM_INDEX,
trigger_source=PaperlessTask.TriggerSource.SYSTEM,
status=PaperlessTask.Status.STARTED,
)
# Existing running task
@@ -309,7 +311,7 @@ def test_queue_llm_index_update_if_needed_enqueues_when_idle_or_skips_recent() -
)
assert result is False
mock_task.delay.assert_not_called()
mock_task.apply_async.assert_not_called()
@override_settings(

View File

@@ -57,7 +57,7 @@ class MailAccountSerializer(OwnedObjectSerializer):
return instance
class AccountField(serializers.PrimaryKeyRelatedField):
class AccountField(serializers.PrimaryKeyRelatedField[MailAccount]):
def get_queryset(self):
return MailAccount.objects.all().order_by("-id")

View File

@@ -1,6 +1,7 @@
import datetime
import logging
from datetime import timedelta
from typing import Any
from django.http import HttpResponseBadRequest
from django.http import HttpResponseForbidden
@@ -65,7 +66,7 @@ from paperless_mail.tasks import process_mail_accounts
},
),
)
class MailAccountViewSet(ModelViewSet, PassUserMixin):
class MailAccountViewSet(PassUserMixin, ModelViewSet[MailAccount]):
model = MailAccount
queryset = MailAccount.objects.all().order_by("pk")
@@ -159,7 +160,7 @@ class MailAccountViewSet(ModelViewSet, PassUserMixin):
return Response({"result": "OK"})
class ProcessedMailViewSet(ReadOnlyModelViewSet, PassUserMixin):
class ProcessedMailViewSet(PassUserMixin, ReadOnlyModelViewSet[ProcessedMail]):
permission_classes = (IsAuthenticated, PaperlessObjectPermissions)
serializer_class = ProcessedMailSerializer
pagination_class = StandardPagination
@@ -187,7 +188,7 @@ class ProcessedMailViewSet(ReadOnlyModelViewSet, PassUserMixin):
return Response({"result": "OK", "deleted_mail_ids": mail_ids})
class MailRuleViewSet(ModelViewSet, PassUserMixin):
class MailRuleViewSet(PassUserMixin, ModelViewSet[MailRule]):
model = MailRule
queryset = MailRule.objects.all().order_by("order")
@@ -203,7 +204,7 @@ class MailRuleViewSet(ModelViewSet, PassUserMixin):
responses={200: None},
),
)
class OauthCallbackView(GenericAPIView):
class OauthCallbackView(GenericAPIView[Any]):
permission_classes = (IsAuthenticated,)
def get(self, request, format=None):

uv.lock (generated, 157 changed lines)
View File

@@ -875,15 +875,15 @@ wheels = [
[[package]]
name = "django"
version = "5.2.12"
version = "5.2.13"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "asgiref", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "sqlparse", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/bd/55/b9445fc0695b03746f355c05b2eecc54c34e05198c686f4fc4406b722b52/django-5.2.12.tar.gz", hash = "sha256:6b809af7165c73eff5ce1c87fdae75d4da6520d6667f86401ecf55b681eb1eeb", size = 10860574, upload-time = "2026-03-03T13:56:05.509Z" }
sdist = { url = "https://files.pythonhosted.org/packages/1f/c5/c69e338eb2959f641045802e5ea87ca4bf5ac90c5fd08953ca10742fad51/django-5.2.13.tar.gz", hash = "sha256:a31589db5188d074c63f0945c3888fad104627dfcc236fb2b97f71f89da33bc4", size = 10890368, upload-time = "2026-04-07T14:02:15.072Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/4e/32/4b144e125678efccf5d5b61581de1c4088d6b0286e46096e3b8de0d556c8/django-5.2.12-py3-none-any.whl", hash = "sha256:4853482f395c3a151937f6991272540fcbf531464f254a347bf7c89f53c8cff7", size = 8310245, upload-time = "2026-03-03T13:56:01.174Z" },
{ url = "https://files.pythonhosted.org/packages/59/b1/51ab36b2eefcf8cdb9338c7188668a157e29e30306bfc98a379704c9e10d/django-5.2.13-py3-none-any.whl", hash = "sha256:5788fce61da23788a8ce6f02583765ab060d396720924789f97fa42119d37f7a", size = 8310982, upload-time = "2026-04-07T14:02:08.883Z" },
]
[[package]]
@@ -935,19 +935,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/12/bf/af8ad2aa5a402f278b444ca70729fb12ee96ddb89c19c32a2d7c5189358f/django_cachalot-2.9.0-py3-none-any.whl", hash = "sha256:b80ac4930613a7849988ea772a53598d262a15eaf55e5ec8c78accae7fdd99ff", size = 57814, upload-time = "2026-01-28T05:23:28.741Z" },
]
[[package]]
name = "django-celery-results"
version = "2.6.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "celery", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "django", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/a6/b5/9966c28e31014c228305e09d48b19b35522a8f941fe5af5f81f40dc8fa80/django_celery_results-2.6.0.tar.gz", hash = "sha256:9abcd836ae6b61063779244d8887a88fe80bbfaba143df36d3cb07034671277c", size = 83985, upload-time = "2025-04-10T08:23:52.677Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2c/da/70f0f3c5364735344c4bc89e53413bcaae95b4fc1de4e98a7a3b9fb70c88/django_celery_results-2.6.0-py3-none-any.whl", hash = "sha256:b9ccdca2695b98c7cbbb8dea742311ba9a92773d71d7b4944a676e69a7df1c73", size = 38351, upload-time = "2025-04-10T08:23:49.965Z" },
]
[[package]]
name = "django-compression-middleware"
version = "0.5.0"
@@ -2869,7 +2856,6 @@ dependencies = [
{ name = "django-allauth", extra = ["mfa", "socialaccount"], marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "django-auditlog", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "django-cachalot", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "django-celery-results", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "django-compression-middleware", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "django-cors-headers", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "django-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
@@ -3014,11 +3000,10 @@ requires-dist = [
{ name = "channels-redis", specifier = "~=4.2" },
{ name = "concurrent-log-handler", specifier = "~=0.9.25" },
{ name = "dateparser", specifier = "~=1.2" },
{ name = "django", specifier = "~=5.2.10" },
{ name = "django", specifier = "~=5.2.13" },
{ name = "django-allauth", extras = ["mfa", "socialaccount"], specifier = "~=65.15.0" },
{ name = "django-auditlog", specifier = "~=3.4.1" },
{ name = "django-cachalot", specifier = "~=2.9.0" },
{ name = "django-celery-results", specifier = "~=2.6.0" },
{ name = "django-compression-middleware", specifier = "~=0.5.0" },
{ name = "django-cors-headers", specifier = "~=4.9.0" },
{ name = "django-extensions", specifier = "~=4.1" },
@@ -3087,7 +3072,7 @@ dev = [
{ name = "faker", specifier = "~=40.12.0" },
{ name = "imagehash" },
{ name = "prek", specifier = "~=0.3.0" },
{ name = "pytest", specifier = "~=9.0.0" },
{ name = "pytest", specifier = "~=9.0.3" },
{ name = "pytest-cov", specifier = "~=7.1.0" },
{ name = "pytest-django", specifier = "~=4.12.0" },
{ name = "pytest-env", specifier = "~=1.6.0" },
@@ -3110,7 +3095,7 @@ testing = [
{ name = "factory-boy", specifier = "~=3.3.1" },
{ name = "faker", specifier = "~=40.12.0" },
{ name = "imagehash" },
{ name = "pytest", specifier = "~=9.0.0" },
{ name = "pytest", specifier = "~=9.0.3" },
{ name = "pytest-cov", specifier = "~=7.1.0" },
{ name = "pytest-django", specifier = "~=4.12.0" },
{ name = "pytest-env", specifier = "~=1.6.0" },
@@ -3265,70 +3250,70 @@ wheels = [
[[package]]
name = "pillow"
version = "12.1.1"
version = "12.2.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/1f/42/5c74462b4fd957fcd7b13b04fb3205ff8349236ea74c7c375766d6c82288/pillow-12.1.1.tar.gz", hash = "sha256:9ad8fa5937ab05218e2b6a4cff30295ad35afd2f83ac592e68c0d871bb0fdbc4", size = 46980264, upload-time = "2026-02-11T04:23:07.146Z" }
sdist = { url = "https://files.pythonhosted.org/packages/8c/21/c2bcdd5906101a30244eaffc1b6e6ce71a31bd0742a01eb89e660ebfac2d/pillow-12.2.0.tar.gz", hash = "sha256:a830b1a40919539d07806aa58e1b114df53ddd43213d9c8b75847eee6c0182b5", size = 46987819, upload-time = "2026-04-01T14:46:17.687Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2b/46/5da1ec4a5171ee7bf1a0efa064aba70ba3d6e0788ce3f5acd1375d23c8c0/pillow-12.1.1-cp311-cp311-macosx_10_10_x86_64.whl", hash = "sha256:e879bb6cd5c73848ef3b2b48b8af9ff08c5b71ecda8048b7dd22d8a33f60be32", size = 5304084, upload-time = "2026-02-11T04:20:27.501Z" },
{ url = "https://files.pythonhosted.org/packages/78/93/a29e9bc02d1cf557a834da780ceccd54e02421627200696fcf805ebdc3fb/pillow-12.1.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:365b10bb9417dd4498c0e3b128018c4a624dc11c7b97d8cc54effe3b096f4c38", size = 4657866, upload-time = "2026-02-11T04:20:29.827Z" },
{ url = "https://files.pythonhosted.org/packages/13/84/583a4558d492a179d31e4aae32eadce94b9acf49c0337c4ce0b70e0a01f2/pillow-12.1.1-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:d4ce8e329c93845720cd2014659ca67eac35f6433fd3050393d85f3ecef0dad5", size = 6232148, upload-time = "2026-02-11T04:20:31.329Z" },
{ url = "https://files.pythonhosted.org/packages/d5/e2/53c43334bbbb2d3b938978532fbda8e62bb6e0b23a26ce8592f36bcc4987/pillow-12.1.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fc354a04072b765eccf2204f588a7a532c9511e8b9c7f900e1b64e3e33487090", size = 8038007, upload-time = "2026-02-11T04:20:34.225Z" },
{ url = "https://files.pythonhosted.org/packages/b8/a6/3d0e79c8a9d58150dd98e199d7c1c56861027f3829a3a60b3c2784190180/pillow-12.1.1-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7e7976bf1910a8116b523b9f9f58bf410f3e8aa330cd9a2bb2953f9266ab49af", size = 6345418, upload-time = "2026-02-11T04:20:35.858Z" },
{ url = "https://files.pythonhosted.org/packages/a2/c8/46dfeac5825e600579157eea177be43e2f7ff4a99da9d0d0a49533509ac5/pillow-12.1.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:597bd9c8419bc7c6af5604e55847789b69123bbe25d65cc6ad3012b4f3c98d8b", size = 7034590, upload-time = "2026-02-11T04:20:37.91Z" },
{ url = "https://files.pythonhosted.org/packages/af/bf/e6f65d3db8a8bbfeaf9e13cc0417813f6319863a73de934f14b2229ada18/pillow-12.1.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:2c1fc0f2ca5f96a3c8407e41cca26a16e46b21060fe6d5b099d2cb01412222f5", size = 6458655, upload-time = "2026-02-11T04:20:39.496Z" },
{ url = "https://files.pythonhosted.org/packages/f9/c2/66091f3f34a25894ca129362e510b956ef26f8fb67a0e6417bc5744e56f1/pillow-12.1.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:578510d88c6229d735855e1f278aa305270438d36a05031dfaae5067cc8eb04d", size = 7159286, upload-time = "2026-02-11T04:20:41.139Z" },
{ url = "https://files.pythonhosted.org/packages/07/d3/8df65da0d4df36b094351dce696f2989bec731d4f10e743b1c5f4da4d3bf/pillow-12.1.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:ab323b787d6e18b3d91a72fc99b1a2c28651e4358749842b8f8dfacd28ef2052", size = 5262803, upload-time = "2026-02-11T04:20:47.653Z" },
{ url = "https://files.pythonhosted.org/packages/d6/71/5026395b290ff404b836e636f51d7297e6c83beceaa87c592718747e670f/pillow-12.1.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:adebb5bee0f0af4909c30db0d890c773d1a92ffe83da908e2e9e720f8edf3984", size = 4657601, upload-time = "2026-02-11T04:20:49.328Z" },
{ url = "https://files.pythonhosted.org/packages/b1/2e/1001613d941c67442f745aff0f7cc66dd8df9a9c084eb497e6a543ee6f7e/pillow-12.1.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:bb66b7cc26f50977108790e2456b7921e773f23db5630261102233eb355a3b79", size = 6234995, upload-time = "2026-02-11T04:20:51.032Z" },
{ url = "https://files.pythonhosted.org/packages/07/26/246ab11455b2549b9233dbd44d358d033a2f780fa9007b61a913c5b2d24e/pillow-12.1.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:aee2810642b2898bb187ced9b349e95d2a7272930796e022efaf12e99dccd293", size = 8045012, upload-time = "2026-02-11T04:20:52.882Z" },
{ url = "https://files.pythonhosted.org/packages/b2/8b/07587069c27be7535ac1fe33874e32de118fbd34e2a73b7f83436a88368c/pillow-12.1.1-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a0b1cd6232e2b618adcc54d9882e4e662a089d5768cd188f7c245b4c8c44a397", size = 6349638, upload-time = "2026-02-11T04:20:54.444Z" },
{ url = "https://files.pythonhosted.org/packages/ff/79/6df7b2ee763d619cda2fb4fea498e5f79d984dae304d45a8999b80d6cf5c/pillow-12.1.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7aac39bcf8d4770d089588a2e1dd111cbaa42df5a94be3114222057d68336bd0", size = 7041540, upload-time = "2026-02-11T04:20:55.97Z" },
{ url = "https://files.pythonhosted.org/packages/2c/5e/2ba19e7e7236d7529f4d873bdaf317a318896bac289abebd4bb00ef247f0/pillow-12.1.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:ab174cd7d29a62dd139c44bf74b698039328f45cb03b4596c43473a46656b2f3", size = 6462613, upload-time = "2026-02-11T04:20:57.542Z" },
{ url = "https://files.pythonhosted.org/packages/03/03/31216ec124bb5c3dacd74ce8efff4cc7f52643653bad4825f8f08c697743/pillow-12.1.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:339ffdcb7cbeaa08221cd401d517d4b1fe7a9ed5d400e4a8039719238620ca35", size = 7166745, upload-time = "2026-02-11T04:20:59.196Z" },
{ url = "https://files.pythonhosted.org/packages/d5/11/6db24d4bd7685583caeae54b7009584e38da3c3d4488ed4cd25b439de486/pillow-12.1.1-cp313-cp313-ios_13_0_arm64_iphoneos.whl", hash = "sha256:d242e8ac078781f1de88bf823d70c1a9b3c7950a44cdf4b7c012e22ccbcd8e4e", size = 4062689, upload-time = "2026-02-11T04:21:06.804Z" },
{ url = "https://files.pythonhosted.org/packages/33/c0/ce6d3b1fe190f0021203e0d9b5b99e57843e345f15f9ef22fcd43842fd21/pillow-12.1.1-cp313-cp313-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:02f84dfad02693676692746df05b89cf25597560db2857363a208e393429f5e9", size = 4138535, upload-time = "2026-02-11T04:21:08.452Z" },
{ url = "https://files.pythonhosted.org/packages/a0/c6/d5eb6a4fb32a3f9c21a8c7613ec706534ea1cf9f4b3663e99f0d83f6fca8/pillow-12.1.1-cp313-cp313-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:e65498daf4b583091ccbb2556c7000abf0f3349fcd57ef7adc9a84a394ed29f6", size = 3601364, upload-time = "2026-02-11T04:21:10.194Z" },
{ url = "https://files.pythonhosted.org/packages/14/a1/16c4b823838ba4c9c52c0e6bbda903a3fe5a1bdbf1b8eb4fff7156f3e318/pillow-12.1.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:6c6db3b84c87d48d0088943bf33440e0c42370b99b1c2a7989216f7b42eede60", size = 5262561, upload-time = "2026-02-11T04:21:11.742Z" },
{ url = "https://files.pythonhosted.org/packages/bb/ad/ad9dc98ff24f485008aa5cdedaf1a219876f6f6c42a4626c08bc4e80b120/pillow-12.1.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:8b7e5304e34942bf62e15184219a7b5ad4ff7f3bb5cca4d984f37df1a0e1aee2", size = 4657460, upload-time = "2026-02-11T04:21:13.786Z" },
{ url = "https://files.pythonhosted.org/packages/9e/1b/f1a4ea9a895b5732152789326202a82464d5254759fbacae4deea3069334/pillow-12.1.1-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:18e5bddd742a44b7e6b1e773ab5db102bd7a94c32555ba656e76d319d19c3850", size = 6232698, upload-time = "2026-02-11T04:21:15.949Z" },
{ url = "https://files.pythonhosted.org/packages/95/f4/86f51b8745070daf21fd2e5b1fe0eb35d4db9ca26e6d58366562fb56a743/pillow-12.1.1-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fc44ef1f3de4f45b50ccf9136999d71abb99dca7706bc75d222ed350b9fd2289", size = 8041706, upload-time = "2026-02-11T04:21:17.723Z" },
{ url = "https://files.pythonhosted.org/packages/29/9b/d6ecd956bb1266dd1045e995cce9b8d77759e740953a1c9aad9502a0461e/pillow-12.1.1-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5a8eb7ed8d4198bccbd07058416eeec51686b498e784eda166395a23eb99138e", size = 6346621, upload-time = "2026-02-11T04:21:19.547Z" },
{ url = "https://files.pythonhosted.org/packages/71/24/538bff45bde96535d7d998c6fed1a751c75ac7c53c37c90dc2601b243893/pillow-12.1.1-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:47b94983da0c642de92ced1702c5b6c292a84bd3a8e1d1702ff923f183594717", size = 7038069, upload-time = "2026-02-11T04:21:21.378Z" },
{ url = "https://files.pythonhosted.org/packages/94/0e/58cb1a6bc48f746bc4cb3adb8cabff73e2742c92b3bf7a220b7cf69b9177/pillow-12.1.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:518a48c2aab7ce596d3bf79d0e275661b846e86e4d0e7dec34712c30fe07f02a", size = 6460040, upload-time = "2026-02-11T04:21:23.148Z" },
{ url = "https://files.pythonhosted.org/packages/6c/57/9045cb3ff11eeb6c1adce3b2d60d7d299d7b273a2e6c8381a524abfdc474/pillow-12.1.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:a550ae29b95c6dc13cf69e2c9dc5747f814c54eeb2e32d683e5e93af56caa029", size = 7164523, upload-time = "2026-02-11T04:21:25.01Z" },
{ url = "https://files.pythonhosted.org/packages/19/2a/b9d62794fc8a0dd14c1943df68347badbd5511103e0d04c035ffe5cf2255/pillow-12.1.1-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0330d233c1a0ead844fc097a7d16c0abff4c12e856c0b325f231820fee1f39da", size = 5264880, upload-time = "2026-02-11T04:21:32.865Z" },
{ url = "https://files.pythonhosted.org/packages/26/9d/e03d857d1347fa5ed9247e123fcd2a97b6220e15e9cb73ca0a8d91702c6e/pillow-12.1.1-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:5dae5f21afb91322f2ff791895ddd8889e5e947ff59f71b46041c8ce6db790bc", size = 4660616, upload-time = "2026-02-11T04:21:34.97Z" },
{ url = "https://files.pythonhosted.org/packages/f7/ec/8a6d22afd02570d30954e043f09c32772bfe143ba9285e2fdb11284952cd/pillow-12.1.1-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:2e0c664be47252947d870ac0d327fea7e63985a08794758aa8af5b6cb6ec0c9c", size = 6269008, upload-time = "2026-02-11T04:21:36.623Z" },
{ url = "https://files.pythonhosted.org/packages/3d/1d/6d875422c9f28a4a361f495a5f68d9de4a66941dc2c619103ca335fa6446/pillow-12.1.1-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:691ab2ac363b8217f7d31b3497108fb1f50faab2f75dfb03284ec2f217e87bf8", size = 8073226, upload-time = "2026-02-11T04:21:38.585Z" },
{ url = "https://files.pythonhosted.org/packages/a1/cd/134b0b6ee5eda6dc09e25e24b40fdafe11a520bc725c1d0bbaa5e00bf95b/pillow-12.1.1-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e9e8064fb1cc019296958595f6db671fba95209e3ceb0c4734c9baf97de04b20", size = 6380136, upload-time = "2026-02-11T04:21:40.562Z" },
{ url = "https://files.pythonhosted.org/packages/7a/a9/7628f013f18f001c1b98d8fffe3452f306a70dc6aba7d931019e0492f45e/pillow-12.1.1-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:472a8d7ded663e6162dafdf20015c486a7009483ca671cece7a9279b512fcb13", size = 7067129, upload-time = "2026-02-11T04:21:42.521Z" },
{ url = "https://files.pythonhosted.org/packages/1e/f8/66ab30a2193b277785601e82ee2d49f68ea575d9637e5e234faaa98efa4c/pillow-12.1.1-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:89b54027a766529136a06cfebeecb3a04900397a3590fd252160b888479517bf", size = 6491807, upload-time = "2026-02-11T04:21:44.22Z" },
{ url = "https://files.pythonhosted.org/packages/da/0b/a877a6627dc8318fdb84e357c5e1a758c0941ab1ddffdafd231983788579/pillow-12.1.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:86172b0831b82ce4f7877f280055892b31179e1576aa00d0df3bb1bbf8c3e524", size = 7190954, upload-time = "2026-02-11T04:21:46.114Z" },
{ url = "https://files.pythonhosted.org/packages/03/d0/bebb3ffbf31c5a8e97241476c4cf8b9828954693ce6744b4a2326af3e16b/pillow-12.1.1-cp314-cp314-ios_13_0_arm64_iphoneos.whl", hash = "sha256:417423db963cb4be8bac3fc1204fe61610f6abeed1580a7a2cbb2fbda20f12af", size = 4062652, upload-time = "2026-02-11T04:21:53.19Z" },
{ url = "https://files.pythonhosted.org/packages/2d/c0/0e16fb0addda4851445c28f8350d8c512f09de27bbb0d6d0bbf8b6709605/pillow-12.1.1-cp314-cp314-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:b957b71c6b2387610f556a7eb0828afbe40b4a98036fc0d2acfa5a44a0c2036f", size = 4138823, upload-time = "2026-02-11T04:22:03.088Z" },
{ url = "https://files.pythonhosted.org/packages/6b/fb/6170ec655d6f6bb6630a013dd7cf7bc218423d7b5fa9071bf63dc32175ae/pillow-12.1.1-cp314-cp314-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:097690ba1f2efdeb165a20469d59d8bb03c55fb6621eb2041a060ae8ea3e9642", size = 3601143, upload-time = "2026-02-11T04:22:04.909Z" },
{ url = "https://files.pythonhosted.org/packages/59/04/dc5c3f297510ba9a6837cbb318b87dd2b8f73eb41a43cc63767f65cb599c/pillow-12.1.1-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:2815a87ab27848db0321fb78c7f0b2c8649dee134b7f2b80c6a45c6831d75ccd", size = 5266254, upload-time = "2026-02-11T04:22:07.656Z" },
{ url = "https://files.pythonhosted.org/packages/05/30/5db1236b0d6313f03ebf97f5e17cda9ca060f524b2fcc875149a8360b21c/pillow-12.1.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:f7ed2c6543bad5a7d5530eb9e78c53132f93dfa44a28492db88b41cdab885202", size = 4657499, upload-time = "2026-02-11T04:22:09.613Z" },
{ url = "https://files.pythonhosted.org/packages/6f/18/008d2ca0eb612e81968e8be0bbae5051efba24d52debf930126d7eaacbba/pillow-12.1.1-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:652a2c9ccfb556235b2b501a3a7cf3742148cd22e04b5625c5fe057ea3e3191f", size = 6232137, upload-time = "2026-02-11T04:22:11.434Z" },
{ url = "https://files.pythonhosted.org/packages/70/f1/f14d5b8eeb4b2cd62b9f9f847eb6605f103df89ef619ac68f92f748614ea/pillow-12.1.1-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d6e4571eedf43af33d0fc233a382a76e849badbccdf1ac438841308652a08e1f", size = 8042721, upload-time = "2026-02-11T04:22:13.321Z" },
{ url = "https://files.pythonhosted.org/packages/5a/d6/17824509146e4babbdabf04d8171491fa9d776f7061ff6e727522df9bd03/pillow-12.1.1-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b574c51cf7d5d62e9be37ba446224b59a2da26dc4c1bb2ecbe936a4fb1a7cb7f", size = 6347798, upload-time = "2026-02-11T04:22:15.449Z" },
{ url = "https://files.pythonhosted.org/packages/d1/ee/c85a38a9ab92037a75615aba572c85ea51e605265036e00c5b67dfafbfe2/pillow-12.1.1-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a37691702ed687799de29a518d63d4682d9016932db66d4e90c345831b02fb4e", size = 7039315, upload-time = "2026-02-11T04:22:17.24Z" },
{ url = "https://files.pythonhosted.org/packages/ec/f3/bc8ccc6e08a148290d7523bde4d9a0d6c981db34631390dc6e6ec34cacf6/pillow-12.1.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:f95c00d5d6700b2b890479664a06e754974848afaae5e21beb4d83c106923fd0", size = 6462360, upload-time = "2026-02-11T04:22:19.111Z" },
{ url = "https://files.pythonhosted.org/packages/f6/ab/69a42656adb1d0665ab051eec58a41f169ad295cf81ad45406963105408f/pillow-12.1.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:559b38da23606e68681337ad74622c4dbba02254fc9cb4488a305dd5975c7eeb", size = 7165438, upload-time = "2026-02-11T04:22:21.041Z" },
{ url = "https://files.pythonhosted.org/packages/6c/9d/efd18493f9de13b87ede7c47e69184b9e859e4427225ea962e32e56a49bc/pillow-12.1.1-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:1f90cff8aa76835cba5769f0b3121a22bd4eb9e6884cfe338216e557a9a548b8", size = 5268612, upload-time = "2026-02-11T04:22:29.884Z" },
{ url = "https://files.pythonhosted.org/packages/f8/f1/4f42eb2b388eb2ffc660dcb7f7b556c1015c53ebd5f7f754965ef997585b/pillow-12.1.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:1f1be78ce9466a7ee64bfda57bdba0f7cc499d9794d518b854816c41bf0aa4e9", size = 4660567, upload-time = "2026-02-11T04:22:31.799Z" },
{ url = "https://files.pythonhosted.org/packages/01/54/df6ef130fa43e4b82e32624a7b821a2be1c5653a5fdad8469687a7db4e00/pillow-12.1.1-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:42fc1f4677106188ad9a55562bbade416f8b55456f522430fadab3cef7cd4e60", size = 6269951, upload-time = "2026-02-11T04:22:33.921Z" },
{ url = "https://files.pythonhosted.org/packages/a9/48/618752d06cc44bb4aae8ce0cd4e6426871929ed7b46215638088270d9b34/pillow-12.1.1-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:98edb152429ab62a1818039744d8fbb3ccab98a7c29fc3d5fcef158f3f1f68b7", size = 8074769, upload-time = "2026-02-11T04:22:35.877Z" },
{ url = "https://files.pythonhosted.org/packages/c3/bd/f1d71eb39a72fa088d938655afba3e00b38018d052752f435838961127d8/pillow-12.1.1-cp314-cp314t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d470ab1178551dd17fdba0fef463359c41aaa613cdcd7ff8373f54be629f9f8f", size = 6381358, upload-time = "2026-02-11T04:22:37.698Z" },
{ url = "https://files.pythonhosted.org/packages/64/ef/c784e20b96674ed36a5af839305f55616f8b4f8aa8eeccf8531a6e312243/pillow-12.1.1-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6408a7b064595afcab0a49393a413732a35788f2a5092fdc6266952ed67de586", size = 7068558, upload-time = "2026-02-11T04:22:39.597Z" },
{ url = "https://files.pythonhosted.org/packages/73/cb/8059688b74422ae61278202c4e1ad992e8a2e7375227be0a21c6b87ca8d5/pillow-12.1.1-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:5d8c41325b382c07799a3682c1c258469ea2ff97103c53717b7893862d0c98ce", size = 6493028, upload-time = "2026-02-11T04:22:42.73Z" },
{ url = "https://files.pythonhosted.org/packages/c6/da/e3c008ed7d2dd1f905b15949325934510b9d1931e5df999bb15972756818/pillow-12.1.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:c7697918b5be27424e9ce568193efd13d925c4481dd364e43f5dff72d33e10f8", size = 7191940, upload-time = "2026-02-11T04:22:44.543Z" },
{ url = "https://files.pythonhosted.org/packages/56/11/5d43209aa4cb58e0cc80127956ff1796a68b928e6324bbf06ef4db34367b/pillow-12.1.1-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:600fd103672b925fe62ed08e0d874ea34d692474df6f4bf7ebe148b30f89f39f", size = 5228606, upload-time = "2026-02-11T04:22:52.106Z" },
{ url = "https://files.pythonhosted.org/packages/5f/d5/3b005b4e4fda6698b371fa6c21b097d4707585d7db99e98d9b0b87ac612a/pillow-12.1.1-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:665e1b916b043cef294bc54d47bf02d87e13f769bc4bc5fa225a24b3a6c5aca9", size = 4622321, upload-time = "2026-02-11T04:22:53.827Z" },
{ url = "https://files.pythonhosted.org/packages/df/36/ed3ea2d594356fd8037e5a01f6156c74bc8d92dbb0fa60746cc96cabb6e8/pillow-12.1.1-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:495c302af3aad1ca67420ddd5c7bd480c8867ad173528767d906428057a11f0e", size = 5247579, upload-time = "2026-02-11T04:22:56.094Z" },
{ url = "https://files.pythonhosted.org/packages/54/9a/9cc3e029683cf6d20ae5085da0dafc63148e3252c2f13328e553aaa13cfb/pillow-12.1.1-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:8fd420ef0c52c88b5a035a0886f367748c72147b2b8f384c9d12656678dfdfa9", size = 6989094, upload-time = "2026-02-11T04:22:58.288Z" },
{ url = "https://files.pythonhosted.org/packages/00/98/fc53ab36da80b88df0967896b6c4b4cd948a0dc5aa40a754266aa3ae48b3/pillow-12.1.1-pp311-pypy311_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f975aa7ef9684ce7e2c18a3aa8f8e2106ce1e46b94ab713d156b2898811651d3", size = 5313850, upload-time = "2026-02-11T04:23:00.554Z" },
{ url = "https://files.pythonhosted.org/packages/30/02/00fa585abfd9fe9d73e5f6e554dc36cc2b842898cbfc46d70353dae227f8/pillow-12.1.1-pp311-pypy311_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8089c852a56c2966cf18835db62d9b34fef7ba74c726ad943928d494fa7f4735", size = 5963343, upload-time = "2026-02-11T04:23:02.934Z" },
{ url = "https://files.pythonhosted.org/packages/68/e1/748f5663efe6edcfc4e74b2b93edfb9b8b99b67f21a854c3ae416500a2d9/pillow-12.2.0-cp311-cp311-macosx_10_10_x86_64.whl", hash = "sha256:8be29e59487a79f173507c30ddf57e733a357f67881430449bb32614075a40ab", size = 5354347, upload-time = "2026-04-01T14:42:44.255Z" },
{ url = "https://files.pythonhosted.org/packages/47/a1/d5ff69e747374c33a3b53b9f98cca7889fce1fd03d79cdc4e1bccc6c5a87/pillow-12.2.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:71cde9a1e1551df7d34a25462fc60325e8a11a82cc2e2f54578e5e9a1e153d65", size = 4695873, upload-time = "2026-04-01T14:42:46.452Z" },
{ url = "https://files.pythonhosted.org/packages/df/21/e3fbdf54408a973c7f7f89a23b2cb97a7ef30c61ab4142af31eee6aebc88/pillow-12.2.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:f490f9368b6fc026f021db16d7ec2fbf7d89e2edb42e8ec09d2c60505f5729c7", size = 6280168, upload-time = "2026-04-01T14:42:49.228Z" },
{ url = "https://files.pythonhosted.org/packages/d3/f1/00b7278c7dd52b17ad4329153748f87b6756ec195ff786c2bdf12518337d/pillow-12.2.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:8bd7903a5f2a4545f6fd5935c90058b89d30045568985a71c79f5fd6edf9b91e", size = 8088188, upload-time = "2026-04-01T14:42:51.735Z" },
{ url = "https://files.pythonhosted.org/packages/ad/cf/220a5994ef1b10e70e85748b75649d77d506499352be135a4989c957b701/pillow-12.2.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3997232e10d2920a68d25191392e3a4487d8183039e1c74c2297f00ed1c50705", size = 6394401, upload-time = "2026-04-01T14:42:54.343Z" },
{ url = "https://files.pythonhosted.org/packages/e9/bd/e51a61b1054f09437acfbc2ff9106c30d1eb76bc1453d428399946781253/pillow-12.2.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e74473c875d78b8e9d5da2a70f7099549f9eb37ded4e2f6a463e60125bccd176", size = 7079655, upload-time = "2026-04-01T14:42:56.954Z" },
{ url = "https://files.pythonhosted.org/packages/6b/3d/45132c57d5fb4b5744567c3817026480ac7fc3ce5d4c47902bc0e7f6f853/pillow-12.2.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:56a3f9c60a13133a98ecff6197af34d7824de9b7b38c3654861a725c970c197b", size = 6503105, upload-time = "2026-04-01T14:42:59.847Z" },
{ url = "https://files.pythonhosted.org/packages/7d/2e/9df2fc1e82097b1df3dce58dc43286aa01068e918c07574711fcc53e6fb4/pillow-12.2.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:90e6f81de50ad6b534cab6e5aef77ff6e37722b2f5d908686f4a5c9eba17a909", size = 7203402, upload-time = "2026-04-01T14:43:02.664Z" },
{ url = "https://files.pythonhosted.org/packages/58/be/7482c8a5ebebbc6470b3eb791812fff7d5e0216c2be3827b30b8bb6603ed/pillow-12.2.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:2d192a155bbcec180f8564f693e6fd9bccff5a7af9b32e2e4bf8c9c69dbad6b5", size = 5308279, upload-time = "2026-04-01T14:43:13.246Z" },
{ url = "https://files.pythonhosted.org/packages/d8/95/0a351b9289c2b5cbde0bacd4a83ebc44023e835490a727b2a3bd60ddc0f4/pillow-12.2.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:f3f40b3c5a968281fd507d519e444c35f0ff171237f4fdde090dd60699458421", size = 4695490, upload-time = "2026-04-01T14:43:15.584Z" },
{ url = "https://files.pythonhosted.org/packages/de/af/4e8e6869cbed569d43c416fad3dc4ecb944cb5d9492defaed89ddd6fe871/pillow-12.2.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:03e7e372d5240cc23e9f07deca4d775c0817bffc641b01e9c3af208dbd300987", size = 6284462, upload-time = "2026-04-01T14:43:18.268Z" },
{ url = "https://files.pythonhosted.org/packages/e9/9e/c05e19657fd57841e476be1ab46c4d501bffbadbafdc31a6d665f8b737b6/pillow-12.2.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:b86024e52a1b269467a802258c25521e6d742349d760728092e1bc2d135b4d76", size = 8094744, upload-time = "2026-04-01T14:43:20.716Z" },
{ url = "https://files.pythonhosted.org/packages/2b/54/1789c455ed10176066b6e7e6da1b01e50e36f94ba584dc68d9eebfe9156d/pillow-12.2.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7371b48c4fa448d20d2714c9a1f775a81155050d383333e0a6c15b1123dda005", size = 6398371, upload-time = "2026-04-01T14:43:23.443Z" },
{ url = "https://files.pythonhosted.org/packages/43/e3/fdc657359e919462369869f1c9f0e973f353f9a9ee295a39b1fea8ee1a77/pillow-12.2.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:62f5409336adb0663b7caa0da5c7d9e7bdbaae9ce761d34669420c2a801b2780", size = 7087215, upload-time = "2026-04-01T14:43:26.758Z" },
{ url = "https://files.pythonhosted.org/packages/8b/f8/2f6825e441d5b1959d2ca5adec984210f1ec086435b0ed5f52c19b3b8a6e/pillow-12.2.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:01afa7cf67f74f09523699b4e88c73fb55c13346d212a59a2db1f86b0a63e8c5", size = 6509783, upload-time = "2026-04-01T14:43:29.56Z" },
{ url = "https://files.pythonhosted.org/packages/67/f9/029a27095ad20f854f9dba026b3ea6428548316e057e6fc3545409e86651/pillow-12.2.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:fc3d34d4a8fbec3e88a79b92e5465e0f9b842b628675850d860b8bd300b159f5", size = 7212112, upload-time = "2026-04-01T14:43:32.091Z" },
{ url = "https://files.pythonhosted.org/packages/4a/01/53d10cf0dbad820a8db274d259a37ba50b88b24768ddccec07355382d5ad/pillow-12.2.0-cp313-cp313-ios_13_0_arm64_iphoneos.whl", hash = "sha256:8297651f5b5679c19968abefd6bb84d95fe30ef712eb1b2d9b2d31ca61267f4c", size = 4100837, upload-time = "2026-04-01T14:43:41.506Z" },
{ url = "https://files.pythonhosted.org/packages/0f/98/f3a6657ecb698c937f6c76ee564882945f29b79bad496abcba0e84659ec5/pillow-12.2.0-cp313-cp313-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:50d8520da2a6ce0af445fa6d648c4273c3eeefbc32d7ce049f22e8b5c3daecc2", size = 4176528, upload-time = "2026-04-01T14:43:43.773Z" },
{ url = "https://files.pythonhosted.org/packages/69/bc/8986948f05e3ea490b8442ea1c1d4d990b24a7e43d8a51b2c7d8b1dced36/pillow-12.2.0-cp313-cp313-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:766cef22385fa1091258ad7e6216792b156dc16d8d3fa607e7545b2b72061f1c", size = 3640401, upload-time = "2026-04-01T14:43:45.87Z" },
{ url = "https://files.pythonhosted.org/packages/34/46/6c717baadcd62bc8ed51d238d521ab651eaa74838291bda1f86fe1f864c9/pillow-12.2.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:5d2fd0fa6b5d9d1de415060363433f28da8b1526c1c129020435e186794b3795", size = 5308094, upload-time = "2026-04-01T14:43:48.438Z" },
{ url = "https://files.pythonhosted.org/packages/71/43/905a14a8b17fdb1ccb58d282454490662d2cb89a6bfec26af6d3520da5ec/pillow-12.2.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:56b25336f502b6ed02e889f4ece894a72612fe885889a6e8c4c80239ff6e5f5f", size = 4695402, upload-time = "2026-04-01T14:43:51.292Z" },
{ url = "https://files.pythonhosted.org/packages/73/dd/42107efcb777b16fa0393317eac58f5b5cf30e8392e266e76e51cff28c3d/pillow-12.2.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:f1c943e96e85df3d3478f7b691f229887e143f81fedab9b20205349ab04d73ed", size = 6280005, upload-time = "2026-04-01T14:43:54.242Z" },
{ url = "https://files.pythonhosted.org/packages/a8/68/b93e09e5e8549019e61acf49f65b1a8530765a7f812c77a7461bca7e4494/pillow-12.2.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:03f6fab9219220f041c74aeaa2939ff0062bd5c364ba9ce037197f4c6d498cd9", size = 8090669, upload-time = "2026-04-01T14:43:57.335Z" },
{ url = "https://files.pythonhosted.org/packages/4b/6e/3ccb54ce8ec4ddd1accd2d89004308b7b0b21c4ac3d20fa70af4760a4330/pillow-12.2.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5cdfebd752ec52bf5bb4e35d9c64b40826bc5b40a13df7c3cda20a2c03a0f5ed", size = 6395194, upload-time = "2026-04-01T14:43:59.864Z" },
{ url = "https://files.pythonhosted.org/packages/67/ee/21d4e8536afd1a328f01b359b4d3997b291ffd35a237c877b331c1c3b71c/pillow-12.2.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:eedf4b74eda2b5a4b2b2fb4c006d6295df3bf29e459e198c90ea48e130dc75c3", size = 7082423, upload-time = "2026-04-01T14:44:02.74Z" },
{ url = "https://files.pythonhosted.org/packages/78/5f/e9f86ab0146464e8c133fe85df987ed9e77e08b29d8d35f9f9f4d6f917ba/pillow-12.2.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:00a2865911330191c0b818c59103b58a5e697cae67042366970a6b6f1b20b7f9", size = 6505667, upload-time = "2026-04-01T14:44:05.381Z" },
{ url = "https://files.pythonhosted.org/packages/ed/1e/409007f56a2fdce61584fd3acbc2bbc259857d555196cedcadc68c015c82/pillow-12.2.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:1e1757442ed87f4912397c6d35a0db6a7b52592156014706f17658ff58bbf795", size = 7208580, upload-time = "2026-04-01T14:44:08.39Z" },
{ url = "https://files.pythonhosted.org/packages/4d/a4/b342930964e3cb4dce5038ae34b0eab4653334995336cd486c5a8c25a00c/pillow-12.2.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:042db20a421b9bafecc4b84a8b6e444686bd9d836c7fd24542db3e7df7baad9b", size = 5309927, upload-time = "2026-04-01T14:44:18.89Z" },
{ url = "https://files.pythonhosted.org/packages/9f/de/23198e0a65a9cf06123f5435a5d95cea62a635697f8f03d134d3f3a96151/pillow-12.2.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:dd025009355c926a84a612fecf58bb315a3f6814b17ead51a8e48d3823d9087f", size = 4698624, upload-time = "2026-04-01T14:44:21.115Z" },
{ url = "https://files.pythonhosted.org/packages/01/a6/1265e977f17d93ea37aa28aa81bad4fa597933879fac2520d24e021c8da3/pillow-12.2.0-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:88ddbc66737e277852913bd1e07c150cc7bb124539f94c4e2df5344494e0a612", size = 6321252, upload-time = "2026-04-01T14:44:23.663Z" },
{ url = "https://files.pythonhosted.org/packages/3c/83/5982eb4a285967baa70340320be9f88e57665a387e3a53a7f0db8231a0cd/pillow-12.2.0-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d362d1878f00c142b7e1a16e6e5e780f02be8195123f164edf7eddd911eefe7c", size = 8126550, upload-time = "2026-04-01T14:44:26.772Z" },
{ url = "https://files.pythonhosted.org/packages/4e/48/6ffc514adce69f6050d0753b1a18fd920fce8cac87620d5a31231b04bfc5/pillow-12.2.0-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2c727a6d53cb0018aadd8018c2b938376af27914a68a492f59dfcaca650d5eea", size = 6433114, upload-time = "2026-04-01T14:44:29.615Z" },
{ url = "https://files.pythonhosted.org/packages/36/a3/f9a77144231fb8d40ee27107b4463e205fa4677e2ca2548e14da5cf18dce/pillow-12.2.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:efd8c21c98c5cc60653bcb311bef2ce0401642b7ce9d09e03a7da87c878289d4", size = 7115667, upload-time = "2026-04-01T14:44:32.773Z" },
{ url = "https://files.pythonhosted.org/packages/c1/fc/ac4ee3041e7d5a565e1c4fd72a113f03b6394cc72ab7089d27608f8aaccb/pillow-12.2.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:9f08483a632889536b8139663db60f6724bfcb443c96f1b18855860d7d5c0fd4", size = 6538966, upload-time = "2026-04-01T14:44:35.252Z" },
{ url = "https://files.pythonhosted.org/packages/c0/a8/27fb307055087f3668f6d0a8ccb636e7431d56ed0750e07a60547b1e083e/pillow-12.2.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:dac8d77255a37e81a2efcbd1fc05f1c15ee82200e6c240d7e127e25e365c39ea", size = 7238241, upload-time = "2026-04-01T14:44:37.875Z" },
{ url = "https://files.pythonhosted.org/packages/bf/98/4595daa2365416a86cb0d495248a393dfc84e96d62ad080c8546256cb9c0/pillow-12.2.0-cp314-cp314-ios_13_0_arm64_iphoneos.whl", hash = "sha256:3adc9215e8be0448ed6e814966ecf3d9952f0ea40eb14e89a102b87f450660d8", size = 4100848, upload-time = "2026-04-01T14:44:48.48Z" },
{ url = "https://files.pythonhosted.org/packages/0b/79/40184d464cf89f6663e18dfcf7ca21aae2491fff1a16127681bf1fa9b8cf/pillow-12.2.0-cp314-cp314-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:6a9adfc6d24b10f89588096364cc726174118c62130c817c2837c60cf08a392b", size = 4176515, upload-time = "2026-04-01T14:44:51.353Z" },
{ url = "https://files.pythonhosted.org/packages/b0/63/703f86fd4c422a9cf722833670f4f71418fb116b2853ff7da722ea43f184/pillow-12.2.0-cp314-cp314-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:6a6e67ea2e6feda684ed370f9a1c52e7a243631c025ba42149a2cc5934dec295", size = 3640159, upload-time = "2026-04-01T14:44:53.588Z" },
{ url = "https://files.pythonhosted.org/packages/71/e0/fb22f797187d0be2270f83500aab851536101b254bfa1eae10795709d283/pillow-12.2.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:2bb4a8d594eacdfc59d9e5ad972aa8afdd48d584ffd5f13a937a664c3e7db0ed", size = 5312185, upload-time = "2026-04-01T14:44:56.039Z" },
{ url = "https://files.pythonhosted.org/packages/ba/8c/1a9e46228571de18f8e28f16fabdfc20212a5d019f3e3303452b3f0a580d/pillow-12.2.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:80b2da48193b2f33ed0c32c38140f9d3186583ce7d516526d462645fd98660ae", size = 4695386, upload-time = "2026-04-01T14:44:58.663Z" },
{ url = "https://files.pythonhosted.org/packages/70/62/98f6b7f0c88b9addd0e87c217ded307b36be024d4ff8869a812b241d1345/pillow-12.2.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:22db17c68434de69d8ecfc2fe821569195c0c373b25cccb9cbdacf2c6e53c601", size = 6280384, upload-time = "2026-04-01T14:45:01.5Z" },
{ url = "https://files.pythonhosted.org/packages/5e/03/688747d2e91cfbe0e64f316cd2e8005698f76ada3130d0194664174fa5de/pillow-12.2.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:7b14cc0106cd9aecda615dd6903840a058b4700fcb817687d0ee4fc8b6e389be", size = 8091599, upload-time = "2026-04-01T14:45:04.5Z" },
{ url = "https://files.pythonhosted.org/packages/f6/35/577e22b936fcdd66537329b33af0b4ccfefaeabd8aec04b266528cddb33c/pillow-12.2.0-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8cbeb542b2ebc6fcdacabf8aca8c1a97c9b3ad3927d46b8723f9d4f033288a0f", size = 6396021, upload-time = "2026-04-01T14:45:07.117Z" },
{ url = "https://files.pythonhosted.org/packages/11/8d/d2532ad2a603ca2b93ad9f5135732124e57811d0168155852f37fbce2458/pillow-12.2.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4bfd07bc812fbd20395212969e41931001fd59eb55a60658b0e5710872e95286", size = 7083360, upload-time = "2026-04-01T14:45:09.763Z" },
{ url = "https://files.pythonhosted.org/packages/5e/26/d325f9f56c7e039034897e7380e9cc202b1e368bfd04d4cbe6a441f02885/pillow-12.2.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:9aba9a17b623ef750a4d11b742cbafffeb48a869821252b30ee21b5e91392c50", size = 6507628, upload-time = "2026-04-01T14:45:12.378Z" },
{ url = "https://files.pythonhosted.org/packages/5f/f7/769d5632ffb0988f1c5e7660b3e731e30f7f8ec4318e94d0a5d674eb65a4/pillow-12.2.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:deede7c263feb25dba4e82ea23058a235dcc2fe1f6021025dc71f2b618e26104", size = 7209321, upload-time = "2026-04-01T14:45:15.122Z" },
{ url = "https://files.pythonhosted.org/packages/b6/ab/1b426a3974cb0e7da5c29ccff4807871d48110933a57207b5a676cccc155/pillow-12.2.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:57850958fe9c751670e49b2cecf6294acc99e562531f4bd317fa5ddee2068463", size = 5314225, upload-time = "2026-04-01T14:45:25.637Z" },
{ url = "https://files.pythonhosted.org/packages/19/1e/dce46f371be2438eecfee2a1960ee2a243bbe5e961890146d2dee1ff0f12/pillow-12.2.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:d5d38f1411c0ed9f97bcb49b7bd59b6b7c314e0e27420e34d99d844b9ce3b6f3", size = 4698541, upload-time = "2026-04-01T14:45:28.355Z" },
{ url = "https://files.pythonhosted.org/packages/55/c3/7fbecf70adb3a0c33b77a300dc52e424dc22ad8cdc06557a2e49523b703d/pillow-12.2.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5c0a9f29ca8e79f09de89293f82fc9b0270bb4af1d58bc98f540cc4aedf03166", size = 6322251, upload-time = "2026-04-01T14:45:30.924Z" },
{ url = "https://files.pythonhosted.org/packages/1c/3c/7fbc17cfb7e4fe0ef1642e0abc17fc6c94c9f7a16be41498e12e2ba60408/pillow-12.2.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1610dd6c61621ae1cf811bef44d77e149ce3f7b95afe66a4512f8c59f25d9ebe", size = 8127807, upload-time = "2026-04-01T14:45:33.908Z" },
{ url = "https://files.pythonhosted.org/packages/ff/c3/a8ae14d6defd2e448493ff512fae903b1e9bd40b72efb6ec55ce0048c8ce/pillow-12.2.0-cp314-cp314t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0a34329707af4f73cf1782a36cd2289c0368880654a2c11f027bcee9052d35dd", size = 6433935, upload-time = "2026-04-01T14:45:36.623Z" },
{ url = "https://files.pythonhosted.org/packages/6e/32/2880fb3a074847ac159d8f902cb43278a61e85f681661e7419e6596803ed/pillow-12.2.0-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8e9c4f5b3c546fa3458a29ab22646c1c6c787ea8f5ef51300e5a60300736905e", size = 7116720, upload-time = "2026-04-01T14:45:39.258Z" },
{ url = "https://files.pythonhosted.org/packages/46/87/495cc9c30e0129501643f24d320076f4cc54f718341df18cc70ec94c44e1/pillow-12.2.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:fb043ee2f06b41473269765c2feae53fc2e2fbf96e5e22ca94fb5ad677856f06", size = 6540498, upload-time = "2026-04-01T14:45:41.879Z" },
{ url = "https://files.pythonhosted.org/packages/18/53/773f5edca692009d883a72211b60fdaf8871cbef075eaa9d577f0a2f989e/pillow-12.2.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:f278f034eb75b4e8a13a54a876cc4a5ab39173d2cdd93a638e1b467fc545ac43", size = 7239413, upload-time = "2026-04-01T14:45:44.705Z" },
{ url = "https://files.pythonhosted.org/packages/4e/b7/2437044fb910f499610356d1352e3423753c98e34f915252aafecc64889f/pillow-12.2.0-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:0538bd5e05efec03ae613fd89c4ce0368ecd2ba239cc25b9f9be7ed426b0af1f", size = 5273969, upload-time = "2026-04-01T14:45:55.538Z" },
{ url = "https://files.pythonhosted.org/packages/f6/f4/8316e31de11b780f4ac08ef3654a75555e624a98db1056ecb2122d008d5a/pillow-12.2.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:394167b21da716608eac917c60aa9b969421b5dcbbe02ae7f013e7b85811c69d", size = 4659674, upload-time = "2026-04-01T14:45:58.093Z" },
{ url = "https://files.pythonhosted.org/packages/d4/37/664fca7201f8bb2aa1d20e2c3d5564a62e6ae5111741966c8319ca802361/pillow-12.2.0-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5d04bfa02cc2d23b497d1e90a0f927070043f6cbf303e738300532379a4b4e0f", size = 5288479, upload-time = "2026-04-01T14:46:01.141Z" },
{ url = "https://files.pythonhosted.org/packages/49/62/5b0ed78fce87346be7a5cfcfaaad91f6a1f98c26f86bdbafa2066c647ef6/pillow-12.2.0-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:0c838a5125cee37e68edec915651521191cef1e6aa336b855f495766e77a366e", size = 7032230, upload-time = "2026-04-01T14:46:03.874Z" },
{ url = "https://files.pythonhosted.org/packages/c3/28/ec0fc38107fc32536908034e990c47914c57cd7c5a3ece4d8d8f7ffd7e27/pillow-12.2.0-pp311-pypy311_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4a6c9fa44005fa37a91ebfc95d081e8079757d2e904b27103f4f5fa6f0bf78c0", size = 5355404, upload-time = "2026-04-01T14:46:06.33Z" },
{ url = "https://files.pythonhosted.org/packages/5e/8b/51b0eddcfa2180d60e41f06bd6d0a62202b20b59c68f5a132e615b75aecf/pillow-12.2.0-pp311-pypy311_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:25373b66e0dd5905ed63fa3cae13c82fbddf3079f2c8bf15c6fb6a35586324c1", size = 6002215, upload-time = "2026-04-01T14:46:08.83Z" },
]
[[package]]
@@ -3769,7 +3754,7 @@ wheels = [
[[package]]
name = "pytest"
version = "9.0.2"
version = "9.0.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "iniconfig", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
@@ -3777,9 +3762,9 @@ dependencies = [
{ name = "pluggy", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "pygments", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
-sdist = { url = "https://files.pythonhosted.org/packages/d1/db/7ef3487e0fb0049ddb5ce41d3a49c235bf9ad299b6a25d5780a89f19230f/pytest-9.0.2.tar.gz", hash = "sha256:75186651a92bd89611d1d9fc20f0b4345fd827c41ccd5c299a868a05d70edf11", size = 1568901, upload-time = "2025-12-06T21:30:51.014Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/7d/0d/549bd94f1a0a402dc8cf64563a117c0f3765662e2e668477624baeec44d5/pytest-9.0.3.tar.gz", hash = "sha256:b86ada508af81d19edeb213c681b1d48246c1a91d304c6c81a427674c17eb91c", size = 1572165, upload-time = "2026-04-07T17:16:18.027Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/3b/ab/b3226f0bd7cdcf710fbede2b3548584366da3b19b5021e74f5bde2a8fa3f/pytest-9.0.2-py3-none-any.whl", hash = "sha256:711ffd45bf766d5264d487b917733b453d917afd2b0ad65223959f59089f875b", size = 374801, upload-time = "2025-12-06T21:30:49.154Z" },
{ url = "https://files.pythonhosted.org/packages/d4/24/a372aaf5c9b7208e7112038812994107bc65a84cd00e0354a88c2c77a617/pytest-9.0.3-py3-none-any.whl", hash = "sha256:2c5efc453d45394fdd706ade797c0a81091eccd1d6e4bccfcd476e2b8e0ab5d9", size = 375249, upload-time = "2026-04-07T17:16:16.13Z" },
]
[[package]]