Compare commits

..

65 Commits

Author SHA1 Message Date
Trenton H 2fbf975903 Experimetns with using magika instead of magic for mime detection 2026-04-23 09:09:33 -07:00
dependabot[bot] 1a3b56496a Chore(deps): Bump uuid (#12627)
Bumps the npm_and_yarn group with 1 update in the /src-ui directory: [uuid](https://github.com/uuidjs/uuid).


Updates `uuid` from 13.0.0 to 14.0.0
- [Release notes](https://github.com/uuidjs/uuid/releases)
- [Changelog](https://github.com/uuidjs/uuid/blob/main/CHANGELOG.md)
- [Commits](https://github.com/uuidjs/uuid/compare/v13.0.0...v14.0.0)

---
updated-dependencies:
- dependency-name: uuid
  dependency-version: 14.0.0
  dependency-type: direct:production
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-22 23:18:16 -07:00
Trenton H bdbecac7e8 Replaces two sentinel files with .index_settings.json which can properly store multiple values and handle None (#12625) 2026-04-23 02:38:26 +00:00
GitHub Actions 08131b48fa Auto translate strings 2026-04-23 00:42:28 +00:00
shamoon 55393b258c Enhancement: new Tasks UI (#12614) 2026-04-23 00:41:01 +00:00
Trenton H ceb67fef4d Fix: Changes bare metal webserver to use uvloop (#12626) 2026-04-22 17:34:25 -07:00
GitHub Actions 4b6bb23a9a Auto translate strings 2026-04-22 20:50:26 +00:00
Trenton H 0c25c2dac5 Feature: Allow monitoring access to tasks summary (#12624)
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-04-22 13:48:54 -07:00
dependabot[bot] 2a20cc29a6 Chore(deps): Bump lxml in the uv group across 1 directory (#12619)
Bumps the uv group with 1 update in the / directory: [lxml](https://github.com/lxml/lxml).


Updates `lxml` from 6.0.2 to 6.1.0
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](https://github.com/lxml/lxml/compare/lxml-6.0.2...lxml-6.1.0)

---
updated-dependencies:
- dependency-name: lxml
  dependency-version: 6.1.0
  dependency-type: indirect
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-22 09:58:15 -07:00
Trenton H 89a9e7f190 Performance: Increases workflow related M2M prefetching (#12618) 2026-04-21 22:01:51 +00:00
GitHub Actions c669c3416e Auto translate strings 2026-04-21 21:50:36 +00:00
shamoon 88430c8ab7 Tweak: remove 'stale' indicator for index in system status (#12616) 2026-04-21 21:49:04 +00:00
GitHub Actions edfebcbe44 Auto translate strings 2026-04-21 18:02:57 +00:00
Trenton H a89cd2d5d9 Fix: Exact custom field monetary exact searching (#12592) 2026-04-21 18:01:27 +00:00
GitHub Actions 02e913b475 Auto translate strings 2026-04-21 17:26:25 +00:00
Trenton H 6017b11c42 Fix: Prefetches the custom field instance and the custom field all at once (#12617) 2026-04-21 10:24:51 -07:00
shamoon ffaa2bb77a Fix: prevent sidebar animation at startup (#12615) 2026-04-20 23:17:16 -07:00
Trenton H 50ec987a81 Chore: Refactors all of the mail tests to use model factories instead of bare create (#12613) 2026-04-20 15:43:43 -07:00
shamoon f784a74eba Enhancement: add highlighting to title + content searches (#12593) 2026-04-20 21:28:02 +00:00
GitHub Actions 814fdf5892 Auto translate strings 2026-04-20 20:21:27 +00:00
Trenton H 58789e5061 Chore: Structured consume task return values (#12612) 2026-04-20 13:19:54 -07:00
GitHub Actions 7492cda794 Auto translate strings 2026-04-20 18:41:41 +00:00
Trenton H fbf4e32646 Chore: Converts all call sites and test asserts to use apply_async and headers (#12591) 2026-04-20 11:40:04 -07:00
GitHub Actions 733d873e34 Auto translate strings 2026-04-20 18:06:35 +00:00
Trenton H 5e609101d1 Chore: Update API schema fields (#12611)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 11:05:00 -07:00
GitHub Actions dfdf418adc Auto translate strings 2026-04-20 16:30:23 +00:00
Trenton H 8e67828bd7 Feature: Redesign the task system (#12584)
* feat(tasks): replace PaperlessTask model with structured redesign

Drop the old string-based PaperlessTask table and recreate it with
Status/TaskType/TriggerSource enums, JSONField result storage, and
duration tracking fields. Update all call sites to use the new API.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(tasks): rewrite signal handlers to track all task types

Replace the old consume_file-only handler with a full rewrite that tracks
6 task types (consume_file, train_classifier, sanity_check, index_optimize,
llm_index, mail_fetch) with proper trigger source detection, input data
extraction, legacy result string parsing, duration/wait time recording,
and structured error capture on failure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(tasks): add traceback and revoked state coverage to signal tests

* refactor(tasks): remove manual PaperlessTask creation and scheduled/auto params

All task records are now created exclusively via Celery signals (Task 2).
Removed PaperlessTask creation/update from train_classifier, sanity_check,
llmindex_index, and check_sanity. Removed scheduled= and auto= parameters
from all 7 call sites. Updated apply_async callers to use trigger_source
headers instead. Exceptions now propagate naturally from task functions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(tasks): auto-inject trigger_source=scheduled header for all beat tasks

Inject `headers: {"trigger_source": "scheduled"}` into every Celery beat
schedule entry so signal handlers can identify scheduler-originated tasks
without per-task instrumentation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(tasks): update serializer, filter, and viewset with v9 backwards compat

- Replace TasksViewSerializer/RunTaskViewSerializer with TaskSerializerV10
  (new field names), TaskSerializerV9 (v9 compat), TaskSummarySerializer,
  and RunTaskSerializer
- Add AcknowledgeTasksViewSerializer unchanged (kept existing validation)
- Expand PaperlessTaskFilterSet with MultipleChoiceFilter for task_type,
  trigger_source, status; add is_complete, date_created_after/before filters
- Replace TasksViewSet.get_serializer_class() to branch on request.version
- Add get_queryset() v9 compat for task_name/type query params
- Add acknowledge_all, summary, active actions to TasksViewSet
- Rewrite run action to use apply_async with trigger_source header
- Add timedelta import to views.py; add MultipleChoiceFilter/DateTimeFilter
  to filters.py imports

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tasks): add read_only_fields to TaskSerializerV9, enforce admin via permission_classes on run action

* test(tasks): rewrite API task tests for redesigned model and v9 compat

Replaces the old Django TestCase-based tests with pytest-style classes using
PaperlessTaskFactory. Covers v10 field names, v9 backwards-compat field
mapping, filtering, ordering, acknowledge, acknowledge_all, summary, active,
and run endpoints. Also adds PaperlessTaskFactory to factories.py and fixes
a redundant source= kwarg in TaskSerializerV10.related_document_ids.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(tasks): fix two spec gaps in task API test suite

Move test_list_is_owner_aware to TestGetTasksV10 (it tests GET /api/tasks/,
not acknowledge). Add test_related_document_ids_includes_duplicate_of to
cover the duplicate_of path in the related_document_ids property.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(tasks): address code quality review findings

Remove trivial field-existence tests per project conventions. Fix
potentially flaky ordering test to use explicit date_created values.
Add is_complete=false filter test, v9 type filter input direction test,
and tighten TestActive second test to target REVOKED specifically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(tasks): update TaskAdmin for redesigned model

Add date_created, duration_seconds to list_display; add trigger_source
to list_filter; add input_data, duration_seconds, wait_time_seconds to
readonly_fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(tasks): update Angular types and service for task redesign

Replace PaperlessTaskName/PaperlessTaskType/PaperlessTaskStatus enums
with new PaperlessTaskType, PaperlessTaskTriggerSource, PaperlessTaskStatus
enums. Update PaperlessTask interface to new field names (task_type,
trigger_source, input_data, result_message, related_document_ids).
Update TasksService to filter by task_type instead of task_name.
Update tasks component and system-status-dialog to use new field names.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(tasks): remove django-celery-results

PaperlessTask now tracks all task results via Celery signals. The
django-celery-results DB backend was write-only -- nothing reads
from it. Drop the package and add a migration to clean up the
orphaned tables.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: fix remaining tests broken by task system redesign

Update all tests that created PaperlessTask objects with old field names
to use PaperlessTaskFactory and new field names (task_type, trigger_source,
status, result_message). Use apply_async instead of delay where mocked.
Drop TestCheckSanityTaskRecording — tests PaperlessTask creation that was
intentionally removed from check_sanity().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(tasks): improve test_api_tasks.py structure and add api marker

- Move admin_client, v9_client, user_client fixtures to conftest.py so
  they can be reused by other API tests; all three now build on the
  rest_api_client fixture instead of creating APIClient() directly
- Move regular_user fixture to conftest.py (was already done, now also
  used by the new client fixtures)
- Add docstrings to every test method describing the behaviour under test
- Move timedelta/timezone imports to module level
- Register 'api' pytest marker in pyproject.toml and apply pytestmark to
  the entire file so all 40 tests are selectable via -m api

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(tasks): simplify task tracking code after redesign

- Extract COMPLETE_STATUSES as a class constant on PaperlessTask,
  eliminating the repeated status tuple across models.py, views.py (3×),
  and filters.py
- Extract _CELERY_STATE_TO_STATUS as a module-level constant instead of
  rebuilding the dict on every task_postrun
- Extract _V9_TYPE_TO_TRIGGER_SOURCE and _RUNNABLE_TASKS as class
  constants on TasksViewSet instead of rebuilding on every request
- Extract _TRIGGER_SOURCE_TO_V9_TYPE as a class constant on
  TaskSerializerV9 instead of rebuilding per serialized object
- Extract _get_consume_args helper to deduplicate identical arg
  extraction logic in _extract_input_data, _determine_trigger_source,
  and _extract_owner_id
- Move inline imports (re, traceback) and Avg to module level
- Fix _DOCUMENT_SOURCE_TO_TRIGGER type annotation key type to
  DocumentSource instead of Any
- Remove redundant truthiness checks in SystemStatusView branches
  already guarded by an is-None check

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(tasks): add docstrings and rename _parse_legacy_result

- Add docstrings to _extract_input_data, _determine_trigger_source,
  _extract_owner_id explaining what each helper does and why
- Rename _parse_legacy_result -> _parse_consume_result: the function
  parses current consume_file string outputs (consumer.py returns
  "New document id N created" and "It is a duplicate of X (#N)"),
  not legacy data; the old name was misleading

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(tasks): extend and harden the task system redesign

- TaskType: add EMPTY_TRASH, CHECK_WORKFLOWS, CLEANUP_SHARE_LINKS;
  remove INDEX_REBUILD (no backing task — beat schedule uses index_optimize)
- TRACKED_TASKS: wire up all nine task types including the three new ones
  and llmindex_index / process_mail_accounts
- Add task_revoked_handler so cancelled/expired tasks are marked REVOKED
- Fix double-write: task_postrun_handler no longer overwrites result_data
  when status is already FAILURE (task_failure_handler owns that write)
- v9 serialiser: map EMAIL_CONSUME and FOLDER_CONSUME to AUTO_TASK
- views: scope task list to owner for regular users, admins see all;
  validate ?days= query param and return 400 on bad input
- tests: add test_list_admin_sees_all_tasks; rename/fix
  test_parses_duplicate_string (duplicates produce SUCCESS, not FAILURE);
  use PaperlessTaskFactory in modified tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tasks): fix MAIL_FETCH null input_data and postrun double-query

- _extract_input_data: return {} instead of {"account_ids": None} when
  process_mail_accounts is called without an explicit account list (the
  normal beat-scheduled path); add test to cover this path
- task_postrun_handler: replace filter().first() + filter().update() with
  get() + save(update_fields=[...]) — single fetch, single write,
  consistent with task_prerun_handler

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tasks): add queryset stub to satisfy drf-spectacular schema generation

TasksViewSet.get_queryset() accesses request.user, which drf-spectacular
cannot provide during static schema generation.  Adding a class-level
queryset = PaperlessTask.objects.none() gives spectacular a model to
introspect without invoking get_queryset(), eliminating both warnings
and the test_valid_schema failure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(tasks): fill coverage gaps in task system

- test_task_signals: add TestTaskRevokedHandler (marks REVOKED, ignores
  None request, ignores unknown id); switch existing direct
  PaperlessTask.objects.create calls to PaperlessTaskFactory; import
  pytest_mock and use MockerFixture typing on mocker params
- test_api_tasks: add test_rejects_invalid_days_param to TestSummary
- tasks.service.spec: add dismissAllTasks test (POST acknowledge_all +
  reload)
- models: add pragma: no cover to __str__, is_complete, and
  related_document_ids (trivial delegates, covered indirectly)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Well, that was a bad push.

* Fixes v9 API compatability with testing coverage

* fix(tasks): restore INDEX_OPTIMIZE enum and remove no-op run button

INDEX_OPTIMIZE was dropped from the TaskType enum but still referenced
in _RUNNABLE_TASKS (views.py) and the frontend system-status-dialog,
causing an AttributeError at import time. Restore the enum value in the
model and migration so the serializer accepts it, but remove it from
_RUNNABLE_TASKS since index_optimize is a Tantivy no-op. Remove the
frontend "Run Task" button for index optimization accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(tasks): v9 type filter now matches all equivalent trigger sources

The v9 ?type= query param mapped each value to a single TriggerSource,
but the serializer maps multiple sources to the same v9 type value.
A task serialized as "auto_task" would not appear when filtering by
?type=auto_task if its trigger_source was email_consume or
folder_consume. Same issue for "manual_task" missing web_ui and
api_upload sources. Changed to trigger_source__in with the full set
of sources for each v9 type value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(tasks): give task_failure_handler full ownership of FAILURE path

task_postrun_handler now early-returns for FAILURE states instead of
redundantly writing status and date_done. task_failure_handler now
computes duration_seconds and wait_time_seconds so failed tasks get
complete timing data. This eliminates a wasted .get() + .save() round
trip on every failed task and gives each handler a clean, non-overlapping
responsibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(tasks): resolve trigger_source header via TriggerSource enum lookup

Replace two hardcoded string comparisons ("scheduled", "system") with a
single TriggerSource(header_source) lookup so the enum values are the
single source of truth. Any valid TriggerSource DB value passed in the
header is accepted; invalid values fall through to the document-source /
MANUAL logic. Update tests to pass enum values in headers rather than raw
strings, and add a test for the invalid-header fallback path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tasks): use TriggerSource enum values at all apply_async call sites

Replace raw strings ("system", "manual") with PaperlessTask.TriggerSource
enum values in the three callers that can import models. The settings
file remains a raw string (models cannot be imported at settings load
time) with a comment pointing to the enum value it must match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(tasks): parametrize repetitive test cases in task test files

test_api_tasks.py:
- Collapse six trigger_source->v9-type tests into one parametrized test,
  adding the previously untested API_UPLOAD case
- Collapse three task_name mapping tests (two remaps + pass-through)
  into one parametrized test
- Collapse two acknowledge_all status tests into one parametrized test
- Collapse two run-endpoint 400 tests into one parametrized test
- Update run/ assertions to use TriggerSource enum values

test_task_signals.py:
- Collapse three trigger_source header tests into one parametrized test
- Collapse two DocumentSource->TriggerSource mapping tests into one
  parametrized test
- Collapse two prerun ignore-invalid-id tests into one parametrized test

All parametrize cases use pytest.param with descriptive ids.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Handle JSON serialization for datetime and Path.  Further restrist the v9 permissions as Copilot suggests

* That should fix the generated schema/browser

* Use XSerializer for the schema

* A few more basic cases I see no value in covering

* Drops the migration related stuff too.  Just in case we want it again or it confuses people

* fix: annotate tasks_summary_retrieve as array of TaskSummarySerializer

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: annotate tasks_active_retrieve as array of TaskSerializerV10

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Restore task running to superuser only

* Removes the acknowledge/dismiss all stuff

* Aligns v10 and v9 task permissions with each other

* Short blurb just to warn users about the tasks being cleared

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 09:28:41 -07:00
shamoon 20aa0937e8 Fix (dev): retain backwards compatibility with natural-date keywords in tantivy (#12602) 2026-04-20 08:26:33 -07:00
shamoon 750a2723a2 Add anti-slop job PR template requirement 2026-04-20 01:36:00 -07:00
shamoon 8edbc70dbc Oops, it should be dark 2026-04-18 16:02:09 -07:00
shamoon 862e8e2292 Fix: added date check visibility (#12600) 2026-04-18 05:14:31 -07:00
Gaëtan GOUZI ae2b780509 Fix: prevent intermediate change event when CustomFieldQueryAtom operator changes type (#12597)
* fix: prevent intermediate change event when CustomFieldQueryAtom operator changes type

* Add regression test

---------

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-04-17 23:23:48 +00:00
dependabot[bot] bf6915114b Chore(deps): Bump hono (#12585)
Bumps the npm_and_yarn group with 1 update in the /src-ui directory: [hono](https://github.com/honojs/hono).


Updates `hono` from 4.12.12 to 4.12.14
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](https://github.com/honojs/hono/compare/v4.12.12...v4.12.14)

---
updated-dependencies:
- dependency-name: hono
  dependency-version: 4.12.14
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-15 20:12:23 -07:00
GitHub Actions dc06b679d3 Auto translate strings 2026-04-15 23:22:00 +00:00
Trenton H 3ffbb8862c Feature: paginate search highlights and remove 10k document search limit (#12518)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-04-15 23:20:31 +00:00
Trenton H 21db608d57 Feature: Tune SQLite default settings for increased speed (#12580) 2026-04-15 15:26:00 -07:00
Trenton H 2fd1a1cf3a Feature: Document fuzzy match improvements (#12579)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 20:59:43 +00:00
GitHub Actions 2b4c1fe20d Auto translate strings 2026-04-15 18:58:56 +00:00
Gaëtan GOUZI 02008e40d8 fix: Return HTTP 400 instead of HTTP 500 on DELETE /api/documents/{id}/notes/ with missing or invalid note id (#12582) 2026-04-15 11:56:15 -07:00
GitHub Actions d9061d5e55 Auto translate strings 2026-04-14 22:15:38 +00:00
shamoon 8f036c9521 Merge branch 'main' into dev
# Conflicts:
#	docs/usage.md
#	src/documents/signals/handlers.py
#	src/documents/tests/test_api_documents.py
#	src/documents/views.py
2026-04-14 15:11:23 -07:00
Trenton H 17c13c1a03 Breaking: Remove the positional arguments from the pre/post consume scripts. Environment has been preferred for quite some time now (#12573) 2026-04-14 21:18:20 +00:00
github-actions[bot] 1692c916f8 Documentation: Add v2.20.14 changelog (#12577)
* Changelog v2.20.14 - GHA

* Reorganize

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-04-14 14:03:26 -07:00
dependabot[bot] 1876e38540 Chore(deps): Bump follow-redirects (#12575)
Bumps the npm_and_yarn group with 1 update in the /src-ui directory: [follow-redirects](https://github.com/follow-redirects/follow-redirects).


Updates `follow-redirects` from 1.15.11 to 1.16.0
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.11...v1.16.0)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-version: 1.16.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-14 20:24:03 +00:00
shamoon ffd886eae0 Bump version to 2.20.14 2026-04-14 13:11:48 -07:00
shamoon 12c0dc635e Merge branch 'release/v2.20.x' 2026-04-14 13:10:52 -07:00
shamoon 1e01ce42c0 Update usage.md 2026-04-14 13:10:40 -07:00
dependabot[bot] b51c4572b1 Chore(deps): Bump pillow in the uv group across 1 directory (#12574)
Bumps the uv group with 1 update in the / directory: [pillow](https://github.com/python-pillow/Pillow).


Updates `pillow` from 12.1.1 to 12.2.0
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](https://github.com/python-pillow/Pillow/compare/12.1.1...12.2.0)

---
updated-dependencies:
- dependency-name: pillow
  dependency-version: 12.2.0
  dependency-type: indirect
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-14 10:38:23 -07:00
dependabot[bot] a5bb3b9f8e Chore(deps-dev): Bump pytest in the uv group across 1 directory (#12568)
Bumps the uv group with 1 update in the / directory: [pytest](https://github.com/pytest-dev/pytest).


Updates `pytest` from 9.0.2 to 9.0.3
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/9.0.2...9.0.3)

---
updated-dependencies:
- dependency-name: pytest
  dependency-version: 9.0.3
  dependency-type: direct:development
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-14 08:42:39 -07:00
GitHub Actions 3ed7297939 Auto translate strings 2026-04-13 21:14:40 +00:00
Trenton H 3b6edcdd8e Chore: Add generic type params and update our baselines (#12566)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 14:12:59 -07:00
GitHub Actions b27d10646e Auto translate strings 2026-04-13 20:12:04 +00:00
Trenton H 8c1225e120 Fixes an N+1 query in matching with the version content fetching by prefetching versions (#12562) 2026-04-13 13:10:28 -07:00
shamoon e46f4a5aaa Fix: do not submit permissions for non-owners (#12571) 2026-04-13 12:43:06 -07:00
Trenton H 54d5269145 Fix: Use an iterator in the sanity checking (#12563) 2026-04-13 12:32:22 -07:00
Trenton H f5729811fe Chore: Upgrades Django manually, since dependabot is failing. Resolves security alerts (#12567) 2026-04-13 10:20:35 -07:00
shamoon 1ba6c31385 Clarify User permission scope in docs 2026-04-07 13:38:24 -07:00
shamoon f3ee820fa4 Fix: prevent duplicate parent tag IDs (#12522) 2026-04-06 21:59:11 -07:00
shamoon 2f5bcdf66e Fix: dont defer tag change application in workflows (#12478) 2026-04-02 11:54:37 -07:00
shamoon 501cdd92d2 Fix: limit share link viewset actions (#12461) 2026-03-30 09:34:13 -07:00
shamoon 66c5c46913 Fix: add fallback ordering for documents by id after created (#12440) 2026-03-26 06:16:14 -07:00
shamoon 782634d912 Documentation: clarify barcode file type support 2026-03-23 15:45:27 -07:00
shamoon 3cfe9fa2a8 Fixhancement: default mail -created correspondent matching to exact (#12414) 2026-03-21 16:15:18 -07:00
github-actions[bot] 784fed447f Documentation: Add v2.20.13 changelog (#12408)
* Changelog v2.20.13 - GHA

* Update changelog for version 2.20.13

Added bug fixes for version 2.20.13 in changelog.

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-03-21 08:50:27 -07:00
shamoon 3292a0e7cc Fix: validate date CF value in serializer (#12410) 2026-03-21 08:49:52 -07:00
133 changed files with 14663 additions and 12271 deletions
+1
View File
@@ -165,6 +165,7 @@ jobs:
contents: read
env:
DEFAULT_PYTHON: "3.12"
PAPERLESS_SECRET_KEY: "ci-typing-not-a-real-secret"
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+2
View File
@@ -88,6 +88,7 @@ jobs:
uv export --quiet --no-dev --all-extras --format requirements-txt --output-file requirements.txt
- name: Compile messages
env:
PAPERLESS_SECRET_KEY: "ci-release-not-a-real-secret"
PYTHON_VERSION: ${{ steps.setup-python.outputs.python-version }}
run: |
cd src/
@@ -96,6 +97,7 @@ jobs:
manage.py compilemessages
- name: Collect static files
env:
PAPERLESS_SECRET_KEY: "ci-release-not-a-real-secret"
PYTHON_VERSION: ${{ steps.setup-python.outputs.python-version }}
run: |
cd src/
+2 -1
View File
@@ -3,7 +3,7 @@ on:
pull_request_target:
types: [opened]
jobs:
anti-slop:
Anti-slop:
runs-on: ubuntu-latest
permissions:
contents: read
@@ -14,6 +14,7 @@ jobs:
with:
max-failures: 4
failure-add-pr-labels: 'ai'
require-pr-template: true
pr-bot:
name: Automated PR Bot
runs-on: ubuntu-latest
+2
View File
@@ -36,6 +36,8 @@ jobs:
--group dev \
--frozen
- name: Generate backend translation strings
env:
PAPERLESS_SECRET_KEY: "ci-translate-not-a-real-secret"
run: cd src/ && uv run manage.py makemessages -l en_US -i "samples*"
- name: Install pnpm
uses: pnpm/action-setup@fc06bc1257f339d1d5d8b3a19a8cae5388b55320 # v5.0.0
+3
View File
@@ -79,6 +79,7 @@ virtualenv
/docker-compose.env
/docker-compose.yml
.ruff_cache/
.mypy_cache/
# Used for development
scripts/import-for-development
@@ -111,4 +112,6 @@ celerybeat-schedule*
# ignore pnpm package store folder created when setting up the devcontainer
.pnpm-store/
# Git worktree local folder
.worktrees
+1064 -894
View File
File diff suppressed because it is too large Load Diff
+5498 -6818
View File
File diff suppressed because one or more lines are too long
+1 -1
View File
@@ -842,7 +842,7 @@ MariaDB: `mariadb-tzinfo-to-sql /usr/share/zoneinfo | mariadb -u root mysql -p`
## Barcodes {#barcodes}
Paperless is able to utilize barcodes for automatically performing some tasks.
Paperless is able to utilize barcodes for automatically performing some tasks. Barcodes are only supported for PDF documents or TIFF, [if enabled](configuration.md#PAPERLESS_CONSUMER_BARCODE_TIFF_SUPPORT).
At this time, the library utilized for detection of barcodes supports the following types:
+35
View File
@@ -1,5 +1,40 @@
# Changelog
## paperless-ngx 2.20.14
### Bug Fixes
- Fix: do not submit permissions for non-owners [@shamoon](https://github.com/shamoon) ([#12571](https://github.com/paperless-ngx/paperless-ngx/pull/12571))
- Fix: prevent duplicate parent tag IDs [@shamoon](https://github.com/shamoon) ([#12522](https://github.com/paperless-ngx/paperless-ngx/pull/12522))
- Fix: dont defer tag change application in workflows [@shamoon](https://github.com/shamoon) ([#12478](https://github.com/paperless-ngx/paperless-ngx/pull/12478))
- Fix: limit share link viewset actions [@shamoon](https://github.com/shamoon) ([#12461](https://github.com/paperless-ngx/paperless-ngx/pull/12461))
- Fix: add fallback ordering for documents by id after created [@shamoon](https://github.com/shamoon) ([#12440](https://github.com/paperless-ngx/paperless-ngx/pull/12440))
- Fixhancement: default mail-created correspondent matching to exact [@shamoon](https://github.com/shamoon) ([#12414](https://github.com/paperless-ngx/paperless-ngx/pull/12414))
- Fix: validate date CF value in serializer [@shamoon](https://github.com/shamoon) ([#12410](https://github.com/paperless-ngx/paperless-ngx/pull/12410))
### All App Changes
<details>
<summary>7 changes</summary>
- Fix: do not submit permissions for non-owners [@shamoon](https://github.com/shamoon) ([#12571](https://github.com/paperless-ngx/paperless-ngx/pull/12571))
- Fix: prevent duplicate parent tag IDs [@shamoon](https://github.com/shamoon) ([#12522](https://github.com/paperless-ngx/paperless-ngx/pull/12522))
- Fix: dont defer tag change application in workflows [@shamoon](https://github.com/shamoon) ([#12478](https://github.com/paperless-ngx/paperless-ngx/pull/12478))
- Fix: limit share link viewset actions [@shamoon](https://github.com/shamoon) ([#12461](https://github.com/paperless-ngx/paperless-ngx/pull/12461))
- Fix: add fallback ordering for documents by id after created [@shamoon](https://github.com/shamoon) ([#12440](https://github.com/paperless-ngx/paperless-ngx/pull/12440))
- Fixhancement: default mail-created correspondent matching to exact [@shamoon](https://github.com/shamoon) ([#12414](https://github.com/paperless-ngx/paperless-ngx/pull/12414))
- Fix: validate date CF value in serializer [@shamoon](https://github.com/shamoon) ([#12410](https://github.com/paperless-ngx/paperless-ngx/pull/12410))
</details>
## paperless-ngx 2.20.13
### Bug Fixes
- Fix: suggest corrections only if visible results
- Fix: require view permission for more-like search
- Fix: validate document link targets
- Fix: enforce permissions when attaching accounts to mail rules
## paperless-ngx 2.20.12
### Security
+23 -5
View File
@@ -101,7 +101,7 @@ and `mariadb`.
#### [`PAPERLESS_DB_OPTIONS=<options>`](#PAPERLESS_DB_OPTIONS) {#PAPERLESS_DB_OPTIONS}
: Advanced database connection options as a semicolon-delimited key-value string.
: Advanced database connection options as a comma-delimited key-value string.
Keys and values are separated by `=`. Dot-notation produces nested option
dictionaries; for example, `pool.max_size=20` sets
`OPTIONS["pool"]["max_size"] = 20`.
@@ -123,18 +123,36 @@ dictionaries; for example, `pool.max_size=20` sets
to handle all pool connections across all workers:
`(web_workers + celery_workers) * pool.max_size + safety_margin`.
!!! note "SQLite defaults"
SQLite connections are pre-configured with WAL journal mode, optimised
synchronous and cache settings, and a 5-second busy timeout. These defaults
suit most deployments. To override `init_command`, use `;` between PRAGMAs
within the value and `,` between options:
```bash
PAPERLESS_DB_OPTIONS="init_command=PRAGMA journal_mode=DELETE;PRAGMA synchronous=FULL,transaction_mode=DEFERRED"
```
!!! note "MariaDB: READ COMMITTED isolation level"
MariaDB connections default to `READ COMMITTED` isolation level, which
eliminates gap locking and reduces deadlock frequency. If binary logging is
enabled on your MariaDB server, this requires `binlog_format=ROW` (the
default for most managed MariaDB instances). Statement-based replication is
not compatible with `READ COMMITTED`.
**Examples:**
```bash title="PostgreSQL: require SSL, set a custom CA certificate, and limit the pool size"
PAPERLESS_DB_OPTIONS="sslmode=require;sslrootcert=/certs/ca.pem;pool.max_size=5"
PAPERLESS_DB_OPTIONS="sslmode=require,sslrootcert=/certs/ca.pem,pool.max_size=5"
```
```bash title="MariaDB: require SSL with a custom CA certificate"
PAPERLESS_DB_OPTIONS="ssl_mode=REQUIRED;ssl.ca=/certs/ca.pem"
PAPERLESS_DB_OPTIONS="ssl_mode=REQUIRED,ssl.ca=/certs/ca.pem"
```
```bash title="SQLite: set a busy timeout of 30 seconds"
# PostgreSQL: set a connection timeout
```bash title="PostgreSQL or MariaDB: set a connection timeout"
PAPERLESS_DB_OPTIONS="connect_timeout=10"
```
+70 -1
View File
@@ -120,7 +120,7 @@ Users with any of the deprecated variables set should migrate to `PAPERLESS_DB_O
Multiple options are combined in a single value:
```bash
PAPERLESS_DB_OPTIONS="sslmode=require;sslrootcert=/certs/ca.pem;pool.max_size=10"
PAPERLESS_DB_OPTIONS="sslmode=require,sslrootcert=/certs/ca.pem,pool.max_size=10"
```
## OCR and Archive File Generation Settings
@@ -241,3 +241,72 @@ For example:
}
}
```
## Task History Cleared on Upgrade
The task tracking system has been redesigned in this release. All existing task history records are dropped from the database during the upgrade. Previously completed, failed, or acknowledged tasks will no longer appear in the task list after upgrading.
No user action is required.
## Consume Script Positional Arguments Removed
Pre- and post-consumption scripts no longer receive positional arguments. All information is
now passed exclusively via environment variables, which have been available since earlier versions.
### Pre-consumption script
Previously, the original file path was passed as `$1`. It is now only available as
`DOCUMENT_SOURCE_PATH`.
**Before:**
```bash
#!/usr/bin/env bash
# $1 was the original file path
process_document "$1"
```
**After:**
```bash
#!/usr/bin/env bash
process_document "${DOCUMENT_SOURCE_PATH}"
```
### Post-consumption script
Previously, document metadata was passed as positional arguments `$1` through `$8`:
| Argument | Environment Variable Equivalent |
| -------- | ------------------------------- |
| `$1` | `DOCUMENT_ID` |
| `$2` | `DOCUMENT_FILE_NAME` |
| `$3` | `DOCUMENT_SOURCE_PATH` |
| `$4` | `DOCUMENT_THUMBNAIL_PATH` |
| `$5` | `DOCUMENT_DOWNLOAD_URL` |
| `$6` | `DOCUMENT_THUMBNAIL_URL` |
| `$7` | `DOCUMENT_CORRESPONDENT` |
| `$8` | `DOCUMENT_TAGS` |
**Before:**
```bash
#!/usr/bin/env bash
DOCUMENT_ID=$1
CORRESPONDENT=$7
TAGS=$8
```
**After:**
```bash
#!/usr/bin/env bash
# Use environment variables directly
echo "Document ${DOCUMENT_ID} from ${DOCUMENT_CORRESPONDENT} tagged: ${DOCUMENT_TAGS}"
```
### Action Required
Update any pre- or post-consumption scripts that read `$1`, `$2`, etc. to use the
corresponding environment variables instead. Environment variables have been the preferred
option since v1.8.0.
+10 -7
View File
@@ -414,10 +414,10 @@ still have "object-level" permissions.
| SavedView | Add, edit, delete or view Saved Views. |
| ShareLink | Add, delete or view Share Links. |
| StoragePath | Add, edit, delete or view Storage Paths. |
| SystemStatus | View the system status dialog and corresponding API endpoint. Admin users also retain system status access. |
| SystemMonitoring | View the system status dialog, tasks summary and their API endpoints. Admin users also retain system status access. |
| Tag | Add, edit, delete or view Tags. |
| UISettings | Add, edit, delete or view the UI settings that are used by the web app.<br/>:warning: **Users that will access the web UI must be granted at least _View_ permissions.** |
| User | Add, edit, delete or view Users. |
| User | Add, edit, delete or view other user accounts via Settings > Users & Groups and `/api/users/`. These permissions are not needed for users to edit their own profile via "My Profile" or `/api/profile/`. |
| Workflow | Add, edit, delete or view Workflows.<br/>Note that Workflows are global; all users who can access workflows see the same set. Workflows have other permission implications — see [Workflow permissions](#workflow-permissions). |
#### Detailed Explanation of Object Permissions {#object-permissions}
@@ -428,6 +428,8 @@ still have "object-level" permissions.
| View | Confers the ability to view (not edit) a document, tag, etc.<br/>Users without 'view' (or higher) permissions will be shown _'Private'_ in place of the object name for example when viewing a document with a tag for which the user doesn't have permissions. |
| Edit | Confers the ability to edit (and view) a document, tag, etc. |
For related metadata such as tags, correspondents, document types, and storage paths, object visibility and document assignment are intentionally distinct. A user may still retain or submit a known object ID when editing a document even if that related object is displayed as _Private_ or omitted from search and selection results. This allows documents to preserve existing assignments that the current user cannot necessarily inspect in detail.
### Password reset
In order to enable the password reset feature you will need to setup an SMTP backend, see
@@ -853,13 +855,14 @@ Matching natural date keywords:
```
added:today
modified:yesterday
created:this_week
added:last_month
modified:this_year
created:"previous week"
added:"previous month"
modified:"this year"
```
Supported date keywords: `today`, `yesterday`, `this_week`, `last_week`,
`this_month`, `last_month`, `this_year`, `last_year`.
Supported date keywords: `today`, `yesterday`, `previous week`,
`this month`, `previous month`, `this year`, `previous year`,
`previous quarter`.
#### Searching custom fields
+5 -5
View File
@@ -1,6 +1,6 @@
[project]
name = "paperless-ngx"
version = "2.20.13"
version = "2.20.14"
description = "A community-supported supercharged document management system: scan, index and archive all your physical documents"
readme = "README.md"
requires-python = ">=3.11"
@@ -24,11 +24,10 @@ dependencies = [
"dateparser~=1.2",
# WARNING: django does not use semver.
# Only patch versions are guaranteed to not introduce breaking changes.
"django~=5.2.10",
"django~=5.2.13",
"django-allauth[mfa,socialaccount]~=65.15.0",
"django-auditlog~=3.4.1",
"django-cachalot~=2.9.0",
"django-celery-results~=2.6.0",
"django-compression-middleware~=0.5.0",
"django-cors-headers~=4.9.0",
"django-extensions~=4.1",
@@ -58,6 +57,7 @@ dependencies = [
"llama-index-llms-ollama>=0.9.1",
"llama-index-llms-openai>=0.6.13",
"llama-index-vector-stores-faiss>=0.5.2",
"magika>=1.0.2",
"nltk~=3.9.1",
"ocrmypdf~=17.4.0",
"openai>=1.76",
@@ -67,7 +67,6 @@ dependencies = [
"python-dotenv~=1.2.1",
"python-gnupg~=0.5.4",
"python-ipware~=3.0.0",
"python-magic~=0.4.27",
"rapidfuzz~=3.14.0",
"redis[hiredis]~=5.2.1",
"regex>=2025.9.18",
@@ -113,7 +112,7 @@ testing = [
"factory-boy~=3.3.1",
"faker~=40.12.0",
"imagehash",
"pytest~=9.0.0",
"pytest~=9.0.3",
"pytest-cov~=7.1.0",
"pytest-django~=4.12.0",
"pytest-env~=1.6.0",
@@ -312,6 +311,7 @@ markers = [
"date_parsing: Tests which cover date parsing from content or filename",
"management: Tests which cover management commands/functionality",
"search: Tests for the Tantivy search backend",
"api: Tests for REST API endpoints",
]
[tool.pytest_env]
+1 -1
View File
@@ -23,7 +23,7 @@ ExecStart=/bin/sh -c '\
[ -n "$PAPERLESS_WEBSERVER_WORKERS" ] && export GRANIAN_WORKERS=$PAPERLESS_WEBSERVER_WORKERS; \
# URL path prefix: only set if PAPERLESS_FORCE_SCRIPT_NAME exists \
[ -n "$PAPERLESS_FORCE_SCRIPT_NAME" ] && export GRANIAN_URL_PATH_PREFIX=$PAPERLESS_FORCE_SCRIPT_NAME; \
exec granian --interface asginl --ws "paperless.asgi:application"'
exec granian --interface asginl --ws --loop uvloop "paperless.asgi:application"'
[Install]
WantedBy=multi-user.target
+365 -221
View File
@@ -446,16 +446,12 @@
<context context-type="linenumber">214</context>
</context-group>
</trans-unit>
<trans-unit id="4680387114119209483" datatype="html">
<source>File Tasks shows you documents that have been consumed, are waiting to be, or may have failed during the process.</source>
<trans-unit id="2685141468866020893" datatype="html">
<source>Tasks helps you track background work, what needs attention, and what recently completed.</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/app.component.ts</context>
<context context-type="linenumber">222</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">4</context>
</context-group>
</trans-unit>
<trans-unit id="2755033621682206902" datatype="html">
<source>Check out the settings for various tweaks to the web app.</source>
@@ -772,7 +768,7 @@
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">41</context>
<context context-type="linenumber">16</context>
</context-group>
</trans-unit>
<trans-unit id="3894950702316166331" datatype="html">
@@ -787,7 +783,7 @@
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">48</context>
<context context-type="linenumber">23</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/trash/trash.component.html</context>
@@ -1793,8 +1789,8 @@
<context context-type="linenumber">263</context>
</context-group>
</trans-unit>
<trans-unit id="2991443309752293110" datatype="html">
<source>File Tasks</source>
<trans-unit id="6904866445262015585" datatype="html">
<source>Tasks</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">2</context>
@@ -1808,6 +1804,13 @@
<context context-type="linenumber">299</context>
</context-group>
</trans-unit>
<trans-unit id="8492095365580052820" datatype="html">
<source>Tasks shows detailed information about document consumption and system tasks.</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">4</context>
</context-group>
</trans-unit>
<trans-unit id="103921551219467537" datatype="html">
<source>Clear selection</source>
<context-group purpose="location">
@@ -1823,22 +1826,105 @@
<context context-type="linenumber">147</context>
</context-group>
</trans-unit>
<trans-unit id="1616102757855967475" datatype="html">
<source>All</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">36</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">53</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/custom-fields-query-dropdown/custom-fields-query-dropdown.component.html</context>
<context context-type="linenumber">151</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/filterable-dropdown/filterable-dropdown.component.html</context>
<context context-type="linenumber">14</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/permissions-filter-dropdown/permissions-filter-dropdown.component.html</context>
<context context-type="linenumber">15</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/permissions-select/permissions-select.component.html</context>
<context context-type="linenumber">16</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/permissions-select/permissions-select.component.html</context>
<context context-type="linenumber">27</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-list.component.html</context>
<context context-type="linenumber">29</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/manage/document-attributes/document-attributes.component.html</context>
<context context-type="linenumber">37</context>
</context-group>
</trans-unit>
<trans-unit id="424356320420294719" datatype="html">
<source>All types</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">58</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">202</context>
</context-group>
</trans-unit>
<trans-unit id="131016739441837046" datatype="html">
<source>All sources</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">67</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">214</context>
</context-group>
</trans-unit>
<trans-unit id="4880728824338713664" datatype="html">
<source>Filter by</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">16</context>
<context context-type="linenumber">76</context>
</context-group>
</trans-unit>
<trans-unit id="6849725902312323996" datatype="html">
<source>Reset filters</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">98</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-list.component.html</context>
<context context-type="linenumber">142</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/filter-editor/filter-editor.component.html</context>
<context context-type="linenumber">107</context>
</context-group>
</trans-unit>
<trans-unit id="3387705505469073896" datatype="html">
<source>{VAR_PLURAL, plural, =1 {1 task} other {<x id="INTERPOLATION"/> tasks}}</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">108</context>
</context-group>
</trans-unit>
<trans-unit id="8953033926734869941" datatype="html">
<source>Name</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">61</context>
<context context-type="linenumber">130</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">45</context>
<context context-type="linenumber">48</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/trash/trash.component.html</context>
@@ -1945,7 +2031,7 @@
<source>Created</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">62</context>
<context context-type="linenumber">131</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/dates-dropdown/dates-dropdown.component.html</context>
@@ -1976,21 +2062,21 @@
<source>Results</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">64</context>
<context context-type="linenumber">133</context>
</context-group>
</trans-unit>
<trans-unit id="314315645942131479" datatype="html">
<source>Info</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">66</context>
<context context-type="linenumber">135</context>
</context-group>
</trans-unit>
<trans-unit id="3193976279273491157" datatype="html">
<source>Actions</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">67</context>
<context context-type="linenumber">136</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/trash/trash.component.html</context>
@@ -2061,25 +2147,18 @@
<source>click for full output</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">97</context>
</context-group>
</trans-unit>
<trans-unit id="1418249006616536391" datatype="html">
<source>Duplicate(s) detected</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">103</context>
<context context-type="linenumber">187</context>
</context-group>
</trans-unit>
<trans-unit id="1536087519743707362" datatype="html">
<source>Dismiss</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">116</context>
<context context-type="linenumber">200</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">155</context>
<context context-type="linenumber">298</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-detail/document-detail.component.ts</context>
@@ -2094,126 +2173,292 @@
<source>Open Document</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">121</context>
<context context-type="linenumber">205</context>
</context-group>
</trans-unit>
<trans-unit id="428536141871853903" datatype="html">
<source>{VAR_PLURAL, plural, =1 {One <x id="INTERPOLATION"/> task} other {<x id="INTERPOLATION_1"/> total <x id="INTERPOLATION"/> tasks}}</source>
<trans-unit id="5404759957685833020" datatype="html">
<source>Result message</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">140</context>
<context context-type="linenumber">218</context>
</context-group>
</trans-unit>
<trans-unit id="1943508481059904274" datatype="html">
<source> (<x id="INTERPOLATION" equiv-text="{{selectedTasks.size}}"/> selected)</source>
<trans-unit id="6621329748219109148" datatype="html">
<source>Duplicate</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">142</context>
<context context-type="linenumber">225</context>
</context-group>
</trans-unit>
<trans-unit id="5639839509673911668" datatype="html">
<source>Failed<x id="START_BLOCK_IF" equiv-text="@if (tasksService.failedFileTasks.length &gt; 0) {"/><x id="START_TAG_SPAN" ctype="x-span" equiv-text="&lt;span class=&quot;badge bg-danger ms-2&quot;&gt;"/><x id="INTERPOLATION" equiv-text="{{tasksService.failedFileTasks.length}}"/><x id="CLOSE_TAG_SPAN" ctype="x-span" equiv-text="&lt;/span&gt;"/><x id="CLOSE_BLOCK_IF" equiv-text="}"/></source>
<trans-unit id="7593555694782789615" datatype="html">
<source>Open</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">154,156</context>
<context context-type="linenumber">233</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
<context context-type="linenumber">52</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
<context context-type="linenumber">54</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
<context context-type="linenumber">56</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
<context context-type="linenumber">70</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-card-large/document-card-large.component.html</context>
<context context-type="linenumber">72</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-card-small/document-card-small.component.html</context>
<context context-type="linenumber">143</context>
</context-group>
</trans-unit>
<trans-unit id="8210778930307085868" datatype="html">
<source>Complete<x id="START_BLOCK_IF" equiv-text="@if (tasksService.completedFileTasks.length &gt; 0) {"/><x id="START_TAG_SPAN" ctype="x-span" equiv-text="&lt;span class=&quot;badge bg-secondary ms-2&quot;&gt;"/><x id="INTERPOLATION" equiv-text="{{tasksService.completedFileTasks.length}}"/><x id="CLOSE_TAG_SPAN" ctype="x-span" equiv-text="&lt;/span&gt;"/><x id="CLOSE_BLOCK_IF" equiv-text="}"/></source>
<trans-unit id="3278307631146748151" datatype="html">
<source>Input data</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">162,164</context>
<context context-type="linenumber">243</context>
</context-group>
</trans-unit>
<trans-unit id="3522801015717851360" datatype="html">
<source>Started<x id="START_BLOCK_IF" equiv-text="@if (tasksService.startedFileTasks.length &gt; 0) {"/><x id="START_TAG_SPAN" ctype="x-span" equiv-text="&lt;span class=&quot;badge bg-secondary ms-2&quot;&gt;"/><x id="INTERPOLATION" equiv-text="{{tasksService.startedFileTasks.length}}"/><x id="CLOSE_TAG_SPAN" ctype="x-span" equiv-text="&lt;/span&gt;"/><x id="CLOSE_BLOCK_IF" equiv-text="}"/></source>
<trans-unit id="1585185618099050920" datatype="html">
<source>Result data</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">170,172</context>
<context context-type="linenumber">249</context>
</context-group>
</trans-unit>
<trans-unit id="2341807459308874922" datatype="html">
<source>Queued<x id="START_BLOCK_IF" equiv-text="@if (tasksService.queuedFileTasks.length &gt; 0) {"/><x id="START_TAG_SPAN" ctype="x-span" equiv-text="&lt;span class=&quot;badge bg-secondary ms-2&quot;&gt;"/><x id="INTERPOLATION" equiv-text="{{tasksService.queuedFileTasks.length}}"/><x id="CLOSE_TAG_SPAN" ctype="x-span" equiv-text="&lt;/span&gt;"/><x id="CLOSE_BLOCK_IF" equiv-text="}"/></source>
<trans-unit id="7976920528153858271" datatype="html">
<source>No tasks match the current filters.</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.html</context>
<context context-type="linenumber">178,180</context>
<context context-type="linenumber">271</context>
</context-group>
</trans-unit>
<trans-unit id="2525230676386818985" datatype="html">
<source>Result</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">46</context>
<context context-type="linenumber">49</context>
</context-group>
</trans-unit>
<trans-unit id="2952127331314675199" datatype="html">
<source>Needs attention</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">54</context>
</context-group>
</trans-unit>
<trans-unit id="3307214558906712688" datatype="html">
<source>In progress</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">55</context>
</context-group>
</trans-unit>
<trans-unit id="1371178633976505412" datatype="html">
<source>Recently completed</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">56</context>
</context-group>
</trans-unit>
<trans-unit id="2936452645673957909" datatype="html">
<source>Consume File</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">65</context>
</context-group>
</trans-unit>
<trans-unit id="5622386857946132209" datatype="html">
<source>Train Classifier</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">69</context>
</context-group>
</trans-unit>
<trans-unit id="776309955507105337" datatype="html">
<source>Sanity Check</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">73</context>
</context-group>
</trans-unit>
<trans-unit id="7502272564743467653" datatype="html">
<source>Mail Fetch</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">75</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
<context context-type="linenumber">79</context>
</context-group>
</trans-unit>
<trans-unit id="228975554646076615" datatype="html">
<source>LLM Index</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">76</context>
</context-group>
</trans-unit>
<trans-unit id="6402092370576716734" datatype="html">
<source>Empty Trash</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">79</context>
</context-group>
</trans-unit>
<trans-unit id="4173754487295857311" datatype="html">
<source>Check Workflows</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">83</context>
</context-group>
</trans-unit>
<trans-unit id="2065831805515896240" datatype="html">
<source>Bulk Update</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">87</context>
</context-group>
</trans-unit>
<trans-unit id="5062348912423871240" datatype="html">
<source>Reprocess Document</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">91</context>
</context-group>
</trans-unit>
<trans-unit id="6903696804811148799" datatype="html">
<source>Build Share Link</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">95</context>
</context-group>
</trans-unit>
<trans-unit id="6581973658756561124" datatype="html">
<source>Bulk Delete</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">99</context>
</context-group>
</trans-unit>
<trans-unit id="9172233176401579786" datatype="html">
<source>Scheduled</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">109</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
<context context-type="linenumber">121</context>
</context-group>
</trans-unit>
<trans-unit id="235571817610183244" datatype="html">
<source>Web UI</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">111</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
<context context-type="linenumber">83</context>
</context-group>
</trans-unit>
<trans-unit id="526966086395145275" datatype="html">
<source>API Upload</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">114</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
<context context-type="linenumber">75</context>
</context-group>
</trans-unit>
<trans-unit id="6520092939021340973" datatype="html">
<source>Folder Consume</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">118</context>
</context-group>
</trans-unit>
<trans-unit id="1456901843819961615" datatype="html">
<source>Email Consume</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">122</context>
</context-group>
</trans-unit>
<trans-unit id="29832309535656200" datatype="html">
<source>System</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">124</context>
</context-group>
</trans-unit>
<trans-unit id="2233560223291461480" datatype="html">
<source>Manual</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">125</context>
</context-group>
</trans-unit>
<trans-unit id="5404910960991552159" datatype="html">
<source>Dismiss selected</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">110</context>
<context context-type="linenumber">226</context>
</context-group>
</trans-unit>
<trans-unit id="8829078752502782653" datatype="html">
<source>Dismiss all</source>
<trans-unit id="9169677036332103838" datatype="html">
<source>Dismiss visible</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">111</context>
<context context-type="linenumber">227</context>
</context-group>
</trans-unit>
<trans-unit id="1323591410517879795" datatype="html">
<source>Confirm Dismiss All</source>
<trans-unit id="3169751690815214293" datatype="html">
<source>Confirm Dismiss</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">152</context>
<context context-type="linenumber">295</context>
</context-group>
</trans-unit>
<trans-unit id="4157200209636243740" datatype="html">
<source>Dismiss all <x id="PH" equiv-text="tasks.size"/> tasks?</source>
<trans-unit id="5029621907742319073" datatype="html">
<source>Dismiss <x id="PH" equiv-text="tasks.size"/> tasks?</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">153</context>
<context context-type="linenumber">296</context>
</context-group>
</trans-unit>
<trans-unit id="3597309129998924778" datatype="html">
<source>Error dismissing tasks</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">161</context>
<context context-type="linenumber">304</context>
</context-group>
</trans-unit>
<trans-unit id="2132179171926568807" datatype="html">
<source>Error dismissing task</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">170</context>
<context context-type="linenumber">313</context>
</context-group>
</trans-unit>
<trans-unit id="9011556615675272238" datatype="html">
<source>queued</source>
<trans-unit id="8760066891202884337" datatype="html">
<source>Duplicate of document #<x id="PH" equiv-text="this.duplicateDocumentId(task)"/></source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">246</context>
</context-group>
</trans-unit>
<trans-unit id="6415892379431855826" datatype="html">
<source>started</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">248</context>
</context-group>
</trans-unit>
<trans-unit id="7510279840486540181" datatype="html">
<source>completed</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">250</context>
</context-group>
</trans-unit>
<trans-unit id="4083337005045748464" datatype="html">
<source>failed</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/admin/tasks/tasks.component.ts</context>
<context context-type="linenumber">252</context>
<context context-type="linenumber">408</context>
</context-group>
</trans-unit>
<trans-unit id="3418677553313974490" datatype="html">
@@ -3150,33 +3395,6 @@
<context context-type="linenumber">26</context>
</context-group>
</trans-unit>
<trans-unit id="7593555694782789615" datatype="html">
<source>Open</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
<context context-type="linenumber">52</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
<context context-type="linenumber">54</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
<context context-type="linenumber">56</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/app-frame/global-search/global-search.component.html</context>
<context context-type="linenumber">70</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-card-large/document-card-large.component.html</context>
<context context-type="linenumber">72</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-card-small/document-card-small.component.html</context>
<context context-type="linenumber">143</context>
</context-group>
</trans-unit>
<trans-unit id="6329940072345709724" datatype="html">
<source>Filter documents</source>
<context-group purpose="location">
@@ -3717,37 +3935,6 @@
<context context-type="linenumber">16</context>
</context-group>
</trans-unit>
<trans-unit id="1616102757855967475" datatype="html">
<source>All</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/custom-fields-query-dropdown/custom-fields-query-dropdown.component.html</context>
<context context-type="linenumber">151</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/filterable-dropdown/filterable-dropdown.component.html</context>
<context context-type="linenumber">14</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/permissions-filter-dropdown/permissions-filter-dropdown.component.html</context>
<context context-type="linenumber">15</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/permissions-select/permissions-select.component.html</context>
<context context-type="linenumber">16</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/permissions-select/permissions-select.component.html</context>
<context context-type="linenumber">27</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-list.component.html</context>
<context context-type="linenumber">29</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/manage/document-attributes/document-attributes.component.html</context>
<context context-type="linenumber">37</context>
</context-group>
</trans-unit>
<trans-unit id="1496549861742963591" datatype="html">
<source>Not</source>
<context-group purpose="location">
@@ -4062,14 +4249,14 @@
<source>Create new item</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/edit-dialog.component.ts</context>
<context context-type="linenumber">121</context>
<context context-type="linenumber">119</context>
</context-group>
</trans-unit>
<trans-unit id="5324147361912094446" datatype="html">
<source>Edit item</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/edit-dialog.component.ts</context>
<context context-type="linenumber">125</context>
<context context-type="linenumber">123</context>
</context-group>
</trans-unit>
<trans-unit id="7878445132438733225" datatype="html">
@@ -4435,23 +4622,23 @@
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">186</context>
<context context-type="linenumber">172</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">220</context>
<context context-type="linenumber">206</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">254</context>
<context context-type="linenumber">240</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">264</context>
<context context-type="linenumber">250</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">302</context>
<context context-type="linenumber">288</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/toast/toast.component.html</context>
@@ -4859,32 +5046,32 @@
<source>Create new user account</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/user-edit-dialog/user-edit-dialog.component.ts</context>
<context context-type="linenumber">72</context>
<context context-type="linenumber">70</context>
</context-group>
</trans-unit>
<trans-unit id="2887331217965896363" datatype="html">
<source>Edit user account</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/user-edit-dialog/user-edit-dialog.component.ts</context>
<context context-type="linenumber">76</context>
<context context-type="linenumber">74</context>
</context-group>
</trans-unit>
<trans-unit id="5872286584705575476" datatype="html">
<source>Totp deactivated</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/user-edit-dialog/user-edit-dialog.component.ts</context>
<context context-type="linenumber">132</context>
<context context-type="linenumber">130</context>
</context-group>
</trans-unit>
<trans-unit id="6439190193788239059" datatype="html">
<source>Totp deactivation failed</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/user-edit-dialog/user-edit-dialog.component.ts</context>
<context context-type="linenumber">135</context>
<context context-type="linenumber">133</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/user-edit-dialog/user-edit-dialog.component.ts</context>
<context context-type="linenumber">140</context>
<context context-type="linenumber">138</context>
</context-group>
</trans-unit>
<trans-unit id="8419515490539218007" datatype="html">
@@ -5381,27 +5568,6 @@
<context context-type="linenumber">71</context>
</context-group>
</trans-unit>
<trans-unit id="526966086395145275" datatype="html">
<source>API Upload</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
<context context-type="linenumber">75</context>
</context-group>
</trans-unit>
<trans-unit id="7502272564743467653" datatype="html">
<source>Mail Fetch</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
<context context-type="linenumber">79</context>
</context-group>
</trans-unit>
<trans-unit id="235571817610183244" datatype="html">
<source>Web UI</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
<context context-type="linenumber">83</context>
</context-group>
</trans-unit>
<trans-unit id="3553216189604488439" datatype="html">
<source>Modified</source>
<context-group purpose="location">
@@ -5441,13 +5607,6 @@
<context context-type="linenumber">117</context>
</context-group>
</trans-unit>
<trans-unit id="9172233176401579786" datatype="html">
<source>Scheduled</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/edit-dialog/workflow-edit-dialog/workflow-edit-dialog.component.ts</context>
<context context-type="linenumber">121</context>
</context-group>
</trans-unit>
<trans-unit id="5502398334173581061" datatype="html">
<source>Assignment</source>
<context-group purpose="location">
@@ -6263,7 +6422,7 @@
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">321</context>
<context context-type="linenumber">307</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/manage/mail/mail.component.html</context>
@@ -6986,83 +7145,79 @@
<context context-type="linenumber">157</context>
</context-group>
</trans-unit>
<trans-unit id="9127131074422113272" datatype="html">
<source>Run Task</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">177</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">211</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">245</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">293</context>
</context-group>
</trans-unit>
<trans-unit id="4089509911694721896" datatype="html">
<source>Last Updated</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">184</context>
<context context-type="linenumber">170</context>
</context-group>
</trans-unit>
<trans-unit id="46628344485199198" datatype="html">
<source>Classifier</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">189</context>
<context context-type="linenumber">175</context>
</context-group>
</trans-unit>
<trans-unit id="9127131074422113272" datatype="html">
<source>Run Task</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">197</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">231</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">279</context>
</context-group>
</trans-unit>
<trans-unit id="6096684179126491743" datatype="html">
<source>Last Trained</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">218</context>
<context context-type="linenumber">204</context>
</context-group>
</trans-unit>
<trans-unit id="6427836860962380759" datatype="html">
<source>Sanity Checker</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">223</context>
<context context-type="linenumber">209</context>
</context-group>
</trans-unit>
<trans-unit id="6578747070254776938" datatype="html">
<source>Last Run</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">252</context>
<context context-type="linenumber">238</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">300</context>
<context context-type="linenumber">286</context>
</context-group>
</trans-unit>
<trans-unit id="5921685253729220446" datatype="html">
<source>WebSocket Connection</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">257</context>
<context context-type="linenumber">243</context>
</context-group>
</trans-unit>
<trans-unit id="8998179362936748717" datatype="html">
<source>OK</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">261</context>
<context context-type="linenumber">247</context>
</context-group>
</trans-unit>
<trans-unit id="3804349597565969872" datatype="html">
<source>AI Index</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/common/system-status-dialog/system-status-dialog.component.html</context>
<context context-type="linenumber">270</context>
<context context-type="linenumber">256</context>
</context-group>
</trans-unit>
<trans-unit id="6732151329960766506" datatype="html">
@@ -8780,17 +8935,6 @@
<context context-type="linenumber">137</context>
</context-group>
</trans-unit>
<trans-unit id="6849725902312323996" datatype="html">
<source>Reset filters</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/document-list.component.html</context>
<context context-type="linenumber">142</context>
</context-group>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/document-list/filter-editor/filter-editor.component.html</context>
<context context-type="linenumber">107</context>
</context-group>
</trans-unit>
<trans-unit id="1559883523769732271" datatype="html">
<source>Error while loading documents</source>
<context-group purpose="location">
+2 -2
View File
@@ -1,6 +1,6 @@
{
"name": "paperless-ngx-ui",
"version": "2.20.13",
"version": "2.20.14",
"scripts": {
"preinstall": "npx only-allow pnpm",
"ng": "ng",
@@ -36,7 +36,7 @@
"rxjs": "^7.8.2",
"tslib": "^2.8.1",
"utif": "^3.1.0",
"uuid": "^13.0.0",
"uuid": "^14.0.0",
"zone.js": "^0.16.1"
},
"devDependencies": {
+16 -16
View File
@@ -84,8 +84,8 @@ importers:
specifier: ^3.1.0
version: 3.1.0
uuid:
specifier: ^13.0.0
version: 13.0.0
specifier: ^14.0.0
version: 14.0.0
zone.js:
specifier: ^0.16.1
version: 0.16.1
@@ -4363,8 +4363,8 @@ packages:
flatted@3.4.2:
resolution: {integrity: sha512-PjDse7RzhcPkIJwy5t7KPWQSZ9cAbzQXcafsetQoD7sOJRQlGikNbx7yZp2OotDnJyrDcbyRq3Ttb18iYOqkxA==}
follow-redirects@1.15.11:
resolution: {integrity: sha512-deG2P0JfjrTxl50XGCDyfI97ZGVCxIpfKYmfyrQ54n5FO/0gfIES8C/Psl6kWVDolizcaaxZJnTS0QSMxvnsBQ==}
follow-redirects@1.16.0:
resolution: {integrity: sha512-y5rN/uOsadFT/JfYwhxRS5R7Qce+g3zG97+JrtFZlC9klX/W5hD7iiLzScI4nZqUS7DNUdhPgw4xI8W2LuXlUw==}
engines: {node: '>=4.0'}
peerDependencies:
debug: '*'
@@ -4502,8 +4502,8 @@ packages:
resolution: {integrity: sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ==}
engines: {node: '>= 0.4'}
hono@4.12.12:
resolution: {integrity: sha512-p1JfQMKaceuCbpJKAPKVqyqviZdS0eUxH9v82oWo1kb9xjQ5wA6iP3FNVAPDFlz5/p7d45lO+BpSk1tuSZMF4Q==}
hono@4.12.14:
resolution: {integrity: sha512-am5zfg3yu6sqn5yjKBNqhnTX7Cv+m00ox+7jbaKkrLMRJ4rAdldd1xPd/JzbBWspqaQv6RSTrgFN95EsfhC+7w==}
engines: {node: '>=16.9.0'}
hosted-git-info@9.0.2:
@@ -6562,8 +6562,8 @@ packages:
resolution: {integrity: sha512-pMZTvIkT1d+TFGvDOqodOclx0QWkkgi6Tdoa8gC8ffGAAqz9pzPTZWAybbsHHoED/ztMtkv/VoYTYyShUn81hA==}
engines: {node: '>= 0.4.0'}
uuid@13.0.0:
resolution: {integrity: sha512-XQegIaBTVUjSHliKqcnFqYypAd4S+WCYt5NIeRs6w/UAry7z8Y9j5ZwRRL4kzq9U3sD6v+85er9FvkEaBpji2w==}
uuid@14.0.0:
resolution: {integrity: sha512-Qo+uWgilfSmAhXCMav1uYFynlQO7fMFiMVZsQqZRMIXp0O7rR7qjkj+cPvBHLgBqi960QCoo/PH2/6ZtVqKvrg==}
hasBin: true
uuid@8.3.2:
@@ -8685,9 +8685,9 @@ snapshots:
'@harperfast/extended-iterable@1.0.3':
optional: true
'@hono/node-server@1.19.13(hono@4.12.12)':
'@hono/node-server@1.19.13(hono@4.12.14)':
dependencies:
hono: 4.12.12
hono: 4.12.14
'@humanfs/core@0.19.1': {}
@@ -9270,7 +9270,7 @@ snapshots:
'@modelcontextprotocol/sdk@1.26.0(zod@4.3.6)':
dependencies:
'@hono/node-server': 1.19.13(hono@4.12.12)
'@hono/node-server': 1.19.13(hono@4.12.14)
ajv: 8.18.0
ajv-formats: 3.0.1(ajv@8.18.0)
content-type: 1.0.5
@@ -9280,7 +9280,7 @@ snapshots:
eventsource-parser: 3.0.6
express: 5.2.1
express-rate-limit: 8.3.2(express@5.2.1)
hono: 4.12.12
hono: 4.12.14
jose: 6.2.2
json-schema-typed: 8.0.2
pkce-challenge: 5.0.1
@@ -11427,7 +11427,7 @@ snapshots:
flatted@3.4.2: {}
follow-redirects@1.15.11(debug@4.4.3):
follow-redirects@1.16.0(debug@4.4.3):
optionalDependencies:
debug: 4.4.3
@@ -11553,7 +11553,7 @@ snapshots:
dependencies:
function-bind: 1.1.2
hono@4.12.12: {}
hono@4.12.14: {}
hosted-git-info@9.0.2:
dependencies:
@@ -11634,7 +11634,7 @@ snapshots:
http-proxy@1.18.1(debug@4.4.3):
dependencies:
eventemitter3: 4.0.7
follow-redirects: 1.15.11(debug@4.4.3)
follow-redirects: 1.16.0(debug@4.4.3)
requires-port: 1.0.0
transitivePeerDependencies:
- debug
@@ -14029,7 +14029,7 @@ snapshots:
utils-merge@1.0.1: {}
uuid@13.0.0: {}
uuid@14.0.0: {}
uuid@8.3.2: {}
+1 -1
View File
@@ -219,7 +219,7 @@ export class AppComponent implements OnInit, OnDestroy {
},
{
anchorId: 'tour.file-tasks',
content: $localize`File Tasks shows you documents that have been consumed, are waiting to be, or may have failed during the process.`,
content: $localize`Tasks helps you track background work, what needs attention, and what recently completed.`,
route: '/tasks',
backdropConfig: {
offset: 0,
@@ -337,7 +337,7 @@ describe('SettingsComponent', () => {
.mockImplementation(
(action, type) =>
action === PermissionAction.View &&
type === PermissionType.SystemStatus
type === PermissionType.SystemMonitoring
)
completeSetup()
expect(component['systemStatus']).toEqual(status) // private
@@ -359,7 +359,7 @@ describe('SettingsComponent', () => {
.mockImplementation(
(action, type) =>
action === PermissionAction.View &&
type === PermissionType.SystemStatus
type === PermissionType.SystemMonitoring
)
completeSetup()
component.showSystemStatus()
@@ -652,7 +652,7 @@ export class SettingsComponent
this.permissionsService.isAdmin() ||
this.permissionsService.currentUserCan(
PermissionAction.View,
PermissionType.SystemStatus
PermissionType.SystemMonitoring
)
)
}
@@ -1,41 +1,16 @@
<pngx-page-header
title="File Tasks"
title="Tasks"
i18n-title
info="File Tasks shows you documents that have been consumed, are waiting to be, or may have failed during the process."
info="Tasks shows detailed information about document consumption and system tasks."
i18n-info
>
<div class="btn-toolbar col col-md-auto align-items-center gap-2">
<button class="btn btn-sm btn-outline-secondary me-2" (click)="clearSelection()" [hidden]="selectedTasks.size === 0">
<i-bs name="x" class="me-1"></i-bs><ng-container i18n>Clear selection</ng-container>
</button>
<button class="btn btn-sm btn-outline-primary me-2" (click)="dismissTasks()" *pngxIfPermissions="{ action: PermissionAction.Change, type: PermissionType.PaperlessTask }" [disabled]="tasksService.total === 0">
<button class="btn btn-sm btn-outline-primary me-2" (click)="dismissTasks()" *pngxIfPermissions="{ action: PermissionAction.Change, type: PermissionType.PaperlessTask }" [disabled]="visibleTasks.length === 0">
<i-bs name="check2-all" class="me-1"></i-bs>{{dismissButtonText}}
</button>
<div class="form-inline d-flex align-items-center">
<div class="input-group input-group-sm flex-fill w-auto flex-nowrap">
<span class="input-group-text text-muted" i18n>Filter by</span>
@if (filterTargets.length > 1) {
<div ngbDropdown>
<button class="btn btn-sm btn-outline-primary" ngbDropdownToggle>{{filterTargetName}}</button>
<div class="dropdown-menu shadow" ngbDropdownMenu>
@for (t of filterTargets; track t.id) {
<button ngbDropdownItem [class.active]="filterTargetID === t.id" (click)="filterTargetID = t.id">{{t.name}}</button>
}
</div>
</div>
} @else {
<span class="input-group-text">{{filterTargetName}}</span>
}
@if (filterText?.length) {
<button class="btn btn-link btn-sm px-2 position-absolute top-0 end-0 z-10" (click)="resetFilter()">
<i-bs width="1em" height="1em" name="x"></i-bs>
</button>
}
<input #filterInput class="form-control form-control-sm" type="text"
(keyup)="filterInputKeyup($event)"
[(ngModel)]="filterText">
</div>
</div>
<div class="form-check form-switch mb-0 ms-2">
<input class="form-check-input" type="checkbox" role="switch" [(ngModel)]="autoRefreshEnabled">
<label class="form-check-label" for="autoRefreshSwitch" i18n>Auto refresh</label>
@@ -48,139 +23,250 @@
<div class="visually-hidden" i18n>Loading...</div>
}
<ng-template let-tasks="tasks" #tasksTemplate>
<table class="table table-striped align-middle border shadow-sm">
<thead>
<tr>
<th scope="col">
<div class="form-check">
<input type="checkbox" class="form-check-input" id="all-tasks" [disabled]="currentTasks.length === 0" [(ngModel)]="togggleAll" (click)="toggleAll($event); $event.stopPropagation();">
<label class="form-check-label" for="all-tasks"></label>
</div>
</th>
<th scope="col" i18n>Name</th>
<th scope="col" class="d-none d-lg-table-cell" i18n>Created</th>
@if (activeTab !== 'started' && activeTab !== 'queued') {
<th scope="col" class="d-none d-lg-table-cell" i18n>Results</th>
<div class="task-controls mb-3 btn-toolbar">
<div class="task-view-scope btn-group btn-group-sm me-3" role="group">
<input
type="radio"
class="btn-check"
[checked]="selectedSection === TaskSection.All"
id="section-all"
(click)="setSection(TaskSection.All)"
(keydown)="setSection(TaskSection.All)" />
<label class="btn btn-outline-primary" for="section-all">
<ng-container i18n>All</ng-container>
</label>
@for (section of sections; track section) {
<input
type="radio"
class="btn-check"
[checked]="selectedSection === section"
id="section-{{section}}"
(click)="setSection(section)"
(keydown)="setSection(section)" />
<label class="btn btn-outline-primary" for="section-{{section}}">
{{ sectionLabel(section) }}
@if (sectionCount(section) > 0) {
<span class="badge ms-2" [class.bg-danger]="section === TaskSection.NeedsAttention" [class.bg-secondary]="section !== TaskSection.NeedsAttention">{{sectionCount(section)}}</span>
}
<th scope="col" class="d-table-cell d-lg-none" i18n>Info</th>
<th scope="col" i18n>Actions</th>
</tr>
</thead>
<tbody>
@for (task of tasks | slice: (page-1) * pageSize : page * pageSize; track task.id) {
<tr (click)="toggleSelected(task, $event); $event.stopPropagation();">
<td>
<div class="form-check">
<input type="checkbox" class="form-check-input" id="task{{task.id}}" [checked]="selectedTasks.has(task.id)" (click)="toggleSelected(task, $event); $event.stopPropagation();">
<label class="form-check-label" for="task{{task.id}}"></label>
</div>
</td>
<td class="overflow-auto name-col">{{ task.task_file_name }}</td>
<td class="d-none d-lg-table-cell">{{ task.date_created | customDate:'short' }}</td>
@if (activeTab !== 'started' && activeTab !== 'queued') {
<td class="d-none d-lg-table-cell">
@if (task.result?.length > 50) {
<div class="result" (click)="expandTask(task); $event.stopPropagation();"
[ngbPopover]="resultPopover" popoverClass="shadow small mobile" triggers="mouseenter:mouseleave" container="body">
<span class="small d-none d-md-inline-block font-monospace text-muted">{{ task.result | slice:0:50 }}&hellip;</span>
</div>
}
@if (task.result?.length <= 50) {
<span class="small d-none d-md-inline-block font-monospace text-muted">{{ task.result }}</span>
}
<ng-template #resultPopover>
<pre class="small mb-0">{{ task.result | slice:0:300 }}@if (task.result.length > 300) {
&hellip;
}</pre>
@if (task.result?.length > 300) {
<br/><em>(<ng-container i18n>click for full output</ng-container>)</em>
}
</ng-template>
@if (task.duplicate_documents?.length > 0) {
<div class="small text-warning-emphasis d-flex align-items-center gap-1">
<i-bs class="lh-1" width="1em" height="1em" name="exclamation-triangle"></i-bs>
<span i18n>Duplicate(s) detected</span>
</div>
}
</td>
</label>
}
</div>
<div ngbDropdown>
<button class="btn btn-sm btn-outline-primary me-3" ngbDropdownToggle>{{selectedTaskTypeLabel}}</button>
<div class="dropdown-menu shadow" ngbDropdownMenu>
<button ngbDropdownItem [class.active]="selectedTaskType === null" (click)="setTaskType(null)" i18n>All types</button>
@for (option of taskTypeOptions; track option.value) {
<button ngbDropdownItem [class.active]="selectedTaskType === option.value" [disabled]="isTaskTypeOptionDisabled(option.value)" (click)="setTaskType(option.value)">{{option.label}}</button>
}
</div>
</div>
<div ngbDropdown>
<button class="btn btn-sm btn-outline-primary me-3" ngbDropdownToggle>{{selectedTriggerSourceLabel}}</button>
<div class="dropdown-menu shadow" ngbDropdownMenu>
<button ngbDropdownItem [class.active]="selectedTriggerSource === null" (click)="setTriggerSource(null)" i18n>All sources</button>
@for (option of triggerSourceOptions; track option.value) {
<button ngbDropdownItem [class.active]="selectedTriggerSource === option.value" [disabled]="isTriggerSourceOptionDisabled(option.value)" (click)="setTriggerSource(option.value)">{{option.label}}</button>
}
</div>
</div>
<div class="form-inline d-flex align-items-center flex-grow-1 task-search">
<div class="input-group input-group-sm flex-fill w-auto flex-nowrap">
<span class="input-group-text text-muted" i18n>Filter by</span>
<div ngbDropdown>
<button class="btn btn-sm btn-outline-primary" ngbDropdownToggle>{{filterTargetName}}</button>
<div class="dropdown-menu shadow" ngbDropdownMenu>
@for (t of filterTargets; track t.id) {
<button ngbDropdownItem [class.active]="filterTargetID === t.id" (click)="filterTargetID = t.id">{{t.name}}</button>
}
<td class="d-lg-none">
<button class="btn btn-link" (click)="expandTask(task); $event.stopPropagation();">
<i-bs width="1.2em" height="1.2em" name="info-circle"></i-bs>
</button>
</td>
<td scope="row">
<div class="btn-group" role="group">
<button class="btn btn-sm btn-outline-secondary" (click)="dismissTask(task); $event.stopPropagation();" *pngxIfPermissions="{ action: PermissionAction.Change, type: PermissionType.PaperlessTask }">
<i-bs name="check" class="me-1"></i-bs><ng-container i18n>Dismiss</ng-container>
</button>
<ng-container *pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.Document }">
@if (task.related_document) {
<button class="btn btn-sm btn-outline-primary" (click)="dismissAndGo(task); $event.stopPropagation();">
<i-bs name="file-text" class="me-1"></i-bs><ng-container i18n>Open Document</ng-container>
</button>
}
</ng-container>
</div>
</td>
</tr>
</div>
</div>
@if (filterText?.length) {
<button class="btn btn-link btn-sm px-2 position-absolute top-0 end-0 z-10" (click)="resetFilter()">
<i-bs width="1em" height="1em" name="x"></i-bs>
</button>
}
<input #filterInput class="form-control form-control-sm" type="text"
(keyup)="filterInputKeyup($event)"
[(ngModel)]="filterText">
</div>
</div>
@if (isFiltered) {
<button class="btn btn-link py-0 ms-md-auto" (click)="resetFilters()">
<i-bs width="1em" height="1em" name="x"></i-bs><small i18n>Reset filters</small>
</button>
}
</div>
<ng-template let-tasks="tasks" let-section="section" #tasksTemplate>
<div class="section-header d-flex align-items-center justify-content-between mb-2">
<div>
<h5 class="mb-0">{{ sectionLabel(section) }}</h5>
<div class="small text-muted">
<ng-container i18n>{tasks.length, plural, =1 {1 task} other {{{tasks.length}} tasks}}</ng-container>
</div>
</div>
</div>
<div class="card border table-responsive mb-3">
<table class="table table-striped align-middle shadow-sm mb-0">
<thead>
<tr>
<td class="p-0" [class.border-0]="expandedTask !== task.id" colspan="5">
<pre #collapse="ngbCollapse" [ngbCollapse]="expandedTask !== task.id" class="small mb-0"><div class="small p-1 p-lg-3 ms-lg-3">{{ task.result }}</div></pre>
<th scope="col">
<div class="form-check">
<input
type="checkbox"
class="form-check-input"
[id]="'all-tasks-' + section"
[disabled]="tasks.length === 0"
[checked]="areAllSelected(tasks)"
(click)="toggleSection(section, $event); $event.stopPropagation();"
(keydown)="toggleSection(section, $event); $event.stopPropagation();" />
<label class="form-check-label" for="all-tasks-{{section}}"><span class="visually-hidden">Check all</span></label>
</div>
</th>
<th scope="col" i18n>Name</th>
<th scope="col" class="d-none d-lg-table-cell" i18n>Created</th>
@if (sectionShowsResults(section)) {
<th scope="col" class="d-none d-lg-table-cell" i18n>Results</th>
}
<th scope="col" class="d-table-cell d-lg-none" i18n>Info</th>
<th scope="col" i18n>Actions</th>
</tr>
</thead>
<tbody>
@for (task of tasks; track task.id) {
<tr (click)="toggleSelected(task); $event.stopPropagation();" (keydown)="toggleSelected(task); $event.stopPropagation();">
<td>
<div class="form-check">
<input
type="checkbox"
class="form-check-input"
id="task{{task.id}}"
[checked]="selectedTasks.has(task.id)"
(click)="toggleSelected(task); $event.stopPropagation();"
(keydown)="toggleSelected(task); $event.stopPropagation();" />
<label class="form-check-label" for="task{{task.id}}"></label>
</div>
</td>
<td class="overflow-auto name-col">
<div>{{ taskDisplayName(task) }}</div>
<div class="small text-muted">
@if (taskShowsSeparateTypeLabel(task)) {
<span>{{ task.task_type_display }}</span>
<span class="mx-1">&bull;</span>
}
<span>{{ task.trigger_source_display }}</span>
</div>
</td>
<td class="d-none d-lg-table-cell">{{ task.date_created | customDate:'short' }}</td>
@if (sectionShowsResults(section)) {
<td class="d-none d-lg-table-cell">
@if (taskHasLongResultMessage(task)) {
<div class="result" (click)="expandTask(task); $event.stopPropagation();"
[ngbPopover]="resultPopover" popoverClass="shadow small mobile" triggers="mouseenter:mouseleave" container="body">
<span class="small d-none d-md-inline-block font-monospace text-muted">{{ taskResultPreview(task) }}</span>
</div>
}
@if (taskHasResultMessage(task) && !taskHasLongResultMessage(task)) {
<span class="small d-none d-md-inline-block font-monospace text-muted">{{ taskResultMessage(task) }}</span>
}
@if (duplicateDocumentId(task)) {
<div class="small text-warning-emphasis d-flex align-items-center gap-1 mt-1">
<i-bs class="lh-1" width="1em" height="1em" name="exclamation-triangle"></i-bs>
<span>{{ duplicateTaskLabel(task) }}</span>
</div>
}
<ng-template #resultPopover>
<pre class="small mb-0">{{ taskResultPopoverMessage(task) }}@if (taskResultMessageOverflowsPopover(task)) {
&hellip;
}</pre>
@if (taskResultMessageOverflowsPopover(task)) {
<br/><em>(<ng-container i18n>click for full output</ng-container>)</em>
}
</ng-template>
</td>
}
<td class="d-lg-none">
<button class="btn btn-link" (click)="expandTask(task); $event.stopPropagation();">
<i-bs width="1.2em" height="1.2em" name="info-circle"></i-bs>
</button>
</td>
<td scope="row">
<div class="btn-group" role="group">
<button class="btn btn-sm btn-outline-secondary" (click)="dismissTask(task); $event.stopPropagation();" *pngxIfPermissions="{ action: PermissionAction.Change, type: PermissionType.PaperlessTask }">
<i-bs name="check" class="me-1"></i-bs><ng-container i18n>Dismiss</ng-container>
</button>
<ng-container *pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.Document }">
@if (task.related_document_ids?.[0]) {
<button class="btn btn-sm btn-outline-primary" (click)="dismissAndGo(task); $event.stopPropagation();">
<i-bs name="file-text" class="me-1"></i-bs><ng-container i18n>Open Document</ng-container>
</button>
}
</ng-container>
</div>
</td>
</tr>
<tr>
<td class="p-0" [class.border-0]="expandedTask !== task.id" [attr.colspan]="sectionShowsResults(section) ? 5 : 4">
<div #collapse="ngbCollapse" [ngbCollapse]="expandedTask !== task.id" class="task-detail-panel bg-darker small mb-0">
<div class="p-2 p-lg-3 ms-lg-3">
@if (taskHasResultMessage(task)) {
<div class="detail-section mb-3">
<div class="detail-label fs-7 fw-bold text-uppercase text-muted mb-1" i18n>Result message</div>
<pre class="detail-block border border-dark bg-body p-3 rounded-2 mb-0">{{ taskResultMessage(task) }}</pre>
</div>
}
@if (duplicateDocumentId(task); as duplicateDocumentId) {
<div class="detail-section mb-3">
<div class="detail-label fs-7 fw-bold text-uppercase text-muted mb-1" i18n>Duplicate</div>
<div class="detail-block border border-dark bg-body p-3 rounded-2 mb-0">
<div class="d-flex align-items-center justify-content-between gap-3">
<div class="text-break">{{ duplicateTaskLabel(task) }}</div>
<button
class="btn btn-sm btn-outline-primary"
type="button"
(click)="openDuplicateDocument(duplicateDocumentId)">
<ng-container i18n>Open</ng-container>
</button>
</div>
</div>
</div>
}
<div class="row g-3">
<div class="col-12 col-xl-6">
<div class="detail-section h-100">
<div class="detail-label fs-7 fw-bold text-uppercase text-muted mb-1" i18n>Input data</div>
<pre class="detail-block border border-dark bg-body p-3 rounded-2 mb-0">{{ task.input_data | json }}</pre>
</div>
</div>
<div class="col-12 col-xl-6">
<div class="detail-section h-100">
<div class="detail-label fs-7 fw-bold text-uppercase text-muted mb-1" i18n>Result data</div>
<pre class="detail-block border border-dark bg-body p-3 rounded-2 mb-0">{{ (task.result_data ?? {}) | json }}</pre>
</div>
</div>
</div>
</div>
</div>
</td>
</tr>
}
</tbody>
</table>
<div class="pb-3 d-sm-flex justify-content-between align-items-center">
@if (tasks.length > 0) {
<div class="pb-2 pb-sm-0">
<ng-container i18n>{tasks.length, plural, =1 {One {{this.activeTabLocalized}} task} other {{{tasks.length || 0}} total {{this.activeTabLocalized}} tasks}}</ng-container>
@if (selectedTasks.size > 0) {
<ng-container i18n>&nbsp;({{selectedTasks.size}} selected)</ng-container>
}
</div>
}
@if (tasks.length > pageSize) {
<ngb-pagination [(page)]="page" [pageSize]="pageSize" [collectionSize]="tasks.length" maxSize="8" size="sm"></ngb-pagination>
}
</tbody>
</table>
</div>
</ng-template>
<ul ngbNav #nav="ngbNav" [(activeId)]="activeTab" class="nav-tabs" (hidden)="duringTabChange()" (navChange)="beforeTabChange()">
<li ngbNavItem="failed">
<a ngbNavLink i18n>Failed@if (tasksService.failedFileTasks.length > 0) {
<span class="badge bg-danger ms-2">{{tasksService.failedFileTasks.length}}</span>
}</a>
<ng-template ngbNavContent>
<ng-container [ngTemplateOutlet]="tasksTemplate" [ngTemplateOutletContext]="{tasks:currentTasks}"></ng-container>
</ng-template>
</li>
<li ngbNavItem="completed">
<a ngbNavLink i18n>Complete@if (tasksService.completedFileTasks.length > 0) {
<span class="badge bg-secondary ms-2">{{tasksService.completedFileTasks.length}}</span>
}</a>
<ng-template ngbNavContent>
<ng-container [ngTemplateOutlet]="tasksTemplate" [ngTemplateOutletContext]="{tasks:currentTasks}"></ng-container>
</ng-template>
</li>
<li ngbNavItem="started">
<a ngbNavLink i18n>Started@if (tasksService.startedFileTasks.length > 0) {
<span class="badge bg-secondary ms-2">{{tasksService.startedFileTasks.length}}</span>
}</a>
<ng-template ngbNavContent>
<ng-container [ngTemplateOutlet]="tasksTemplate" [ngTemplateOutletContext]="{tasks:currentTasks}"></ng-container>
</ng-template>
</li>
<li ngbNavItem="queued">
<a ngbNavLink i18n>Queued@if (tasksService.queuedFileTasks.length > 0) {
<span class="badge bg-secondary ms-2">{{tasksService.queuedFileTasks.length}}</span>
}</a>
<ng-template ngbNavContent>
<ng-container [ngTemplateOutlet]="tasksTemplate" [ngTemplateOutletContext]="{tasks:currentTasks}"></ng-container>
</ng-template>
</li>
</ul>
<div [ngbNavOutlet]="nav"></div>
@if (visibleSections.length > 0) {
@for (section of visibleSections; track section) {
<div class="mb-4">
<ng-container [ngTemplateOutlet]="tasksTemplate" [ngTemplateOutletContext]="{tasks: tasksForSection(section), section: section}"></ng-container>
</div>
}
} @else {
<div class="alert alert-secondary fst-italic" i18n>No tasks match the current filters.</div>
}
@@ -37,3 +37,7 @@ pre {
.z-10 {
z-index: 10;
}
tbody tr:nth-last-child(2) td {
border-bottom: none !important;
}
@@ -9,19 +9,14 @@ import { FormsModule } from '@angular/forms'
import { By } from '@angular/platform-browser'
import { Router } from '@angular/router'
import { RouterTestingModule } from '@angular/router/testing'
import {
NgbModal,
NgbModalRef,
NgbModule,
NgbNavItem,
} from '@ng-bootstrap/ng-bootstrap'
import { NgbModal, NgbModalRef, NgbModule } from '@ng-bootstrap/ng-bootstrap'
import { allIcons, NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { throwError } from 'rxjs'
import { routes } from 'src/app/app-routing.module'
import {
PaperlessTask,
PaperlessTaskName,
PaperlessTaskStatus,
PaperlessTaskTriggerSource,
PaperlessTaskType,
} from 'src/app/data/paperless-task'
import { IfPermissionsDirective } from 'src/app/directives/if-permissions.directive'
@@ -33,87 +28,123 @@ import { ToastService } from 'src/app/services/toast.service'
import { environment } from 'src/environments/environment'
import { ConfirmDialogComponent } from '../../common/confirm-dialog/confirm-dialog.component'
import { PageHeaderComponent } from '../../common/page-header/page-header.component'
import { TasksComponent, TaskTab } from './tasks.component'
import { TasksComponent, TaskSection } from './tasks.component'
const tasks: PaperlessTask[] = [
{
id: 467,
task_id: '11ca1a5b-9f81-442c-b2c8-7e4ae53657f1',
task_file_name: 'test.pdf',
input_data: { filename: 'test.pdf' },
date_created: new Date('2023-03-01T10:26:03.093116Z'),
date_done: new Date('2023-03-01T10:26:07.223048Z'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Failed,
result: 'test.pd: Not consuming test.pdf: It is a duplicate of test (#100)',
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
trigger_source_display: 'Folder Consume',
status: PaperlessTaskStatus.Failure,
status_display: 'Failure',
result_data: {
error_message:
'test.pd: Not consuming test.pdf: It is a duplicate of test (#100)',
},
acknowledged: false,
related_document: null,
related_document_ids: [],
},
{
id: 466,
task_id: '10ca1a5b-3c08-442c-b2c8-7e4ae53657f1',
task_file_name: '191092.pdf',
input_data: { filename: '191092.pdf' },
date_created: new Date('2023-03-01T09:26:03.093116Z'),
date_done: new Date('2023-03-01T09:26:07.223048Z'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Failed,
result:
'191092.pd: Not consuming 191092.pdf: It is a duplicate of 191092 (#311)',
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
trigger_source_display: 'Folder Consume',
status: PaperlessTaskStatus.Failure,
status_display: 'Failure',
result_data: { duplicate_of: 311 },
acknowledged: false,
related_document: null,
related_document_ids: [],
},
{
id: 465,
task_id: '3612d477-bb04-44e3-985b-ac580dd496d8',
task_file_name: 'Scan Jun 6, 2023 at 3.19 PM.pdf',
input_data: { filename: 'Scan Jun 6, 2023 at 3.19 PM.pdf' },
date_created: new Date('2023-06-06T15:22:05.722323-07:00'),
date_done: new Date('2023-06-06T15:22:14.564305-07:00'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
trigger_source_display: 'Folder Consume',
status: PaperlessTaskStatus.Pending,
result: null,
status_display: 'Pending',
result_data: null,
acknowledged: false,
related_document: null,
related_document_ids: [],
},
{
id: 464,
task_id: '2eac4716-2aa6-4dcd-9953-264e11656d7e',
task_file_name: 'paperless-mail-l4dkg8ir',
input_data: { filename: 'paperless-mail-l4dkg8ir' },
date_created: new Date('2023-06-04T11:24:32.898089-07:00'),
date_done: new Date('2023-06-04T11:24:44.678605-07:00'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Complete,
result: 'Success. New document id 422 created',
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.EmailConsume,
trigger_source_display: 'Email Consume',
status: PaperlessTaskStatus.Success,
status_display: 'Success',
result_data: { document_id: 422, duplicate_of: 99 },
acknowledged: false,
related_document: 422,
related_document_ids: [422],
},
{
id: 463,
task_id: '28125528-1575-4d6b-99e6-168906e8fa5c',
task_file_name: 'onlinePaymentSummary.pdf',
input_data: { filename: 'onlinePaymentSummary.pdf' },
date_created: new Date('2023-06-01T13:49:51.631305-07:00'),
date_done: new Date('2023-06-01T13:49:54.190220-07:00'),
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Complete,
result: 'Success. New document id 421 created',
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
trigger_source_display: 'Folder Consume',
status: PaperlessTaskStatus.Success,
status_display: 'Success',
result_data: { document_id: 421 },
acknowledged: false,
related_document: 421,
related_document_ids: [421],
},
{
id: 462,
task_id: 'a5b9ca47-0c8e-490f-a04c-6db5d5fc09e5',
task_file_name: 'paperless-mail-_rrpmqk6',
input_data: { filename: 'paperless-mail-_rrpmqk6' },
date_created: new Date('2023-06-07T02:54:35.694916Z'),
date_done: null,
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
task_type: PaperlessTaskType.ConsumeFile,
task_type_display: 'Consume File',
trigger_source: PaperlessTaskTriggerSource.EmailConsume,
trigger_source_display: 'Email Consume',
status: PaperlessTaskStatus.Started,
result: null,
status_display: 'Started',
result_data: null,
acknowledged: false,
related_document: null,
related_document_ids: [],
},
{
id: 461,
task_id: 'bb79efb3-1e78-4f31-b4be-0966620b0ce1',
input_data: { dry_run: false, scope: 'global' },
date_created: new Date('2023-06-07T03:54:35.694916Z'),
date_done: null,
task_type: PaperlessTaskType.SanityCheck,
task_type_display: 'Sanity Check',
trigger_source: PaperlessTaskTriggerSource.System,
trigger_source_display: 'System',
status: PaperlessTaskStatus.Started,
status_display: 'Started',
result_data: { issues_found: 0 },
acknowledged: false,
related_document_ids: [],
},
]
@@ -166,59 +197,142 @@ describe('TasksComponent', () => {
jest.useFakeTimers()
fixture.detectChanges()
httpTestingController
.expectOne(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
)
.expectOne(`${environment.apiBaseUrl}tasks/?acknowledged=false`)
.flush(tasks)
})
it('should display file tasks in 4 tabs by status', () => {
const tabButtons = fixture.debugElement.queryAll(By.directive(NgbNavItem))
it('should display task sections with counts', () => {
expect(component.selectedSection).toBe(TaskSection.All)
expect(component.selectedTaskType).toBeNull()
expect(component.selectedTriggerSource).toBeNull()
let currentTasksLength = tasks.filter(
(t) => t.status === PaperlessTaskStatus.Failed
).length
component.activeTab = TaskTab.Failed
fixture.detectChanges()
expect(tabButtons[0].nativeElement.textContent).toEqual(
`Failed${currentTasksLength}`
)
expect(
fixture.debugElement.queryAll(By.css('table input[type="checkbox"]'))
).toHaveLength(currentTasksLength + 1)
currentTasksLength = tasks.filter(
(t) => t.status === PaperlessTaskStatus.Complete
).length
component.activeTab = TaskTab.Completed
fixture.detectChanges()
expect(tabButtons[1].nativeElement.textContent).toEqual(
`Complete${currentTasksLength}`
)
const viewScope = fixture.debugElement.query(By.css('.task-view-scope'))
const text = viewScope.nativeElement.textContent
currentTasksLength = tasks.filter(
(t) => t.status === PaperlessTaskStatus.Started
).length
component.activeTab = TaskTab.Started
fixture.detectChanges()
expect(tabButtons[2].nativeElement.textContent).toEqual(
`Started${currentTasksLength}`
)
expect(text).toContain('All')
expect(text).toContain('Needs attention')
expect(text).toContain('2')
expect(text).toContain('In progress')
expect(text).toContain('3')
expect(text).toContain('Recently completed')
})
currentTasksLength = tasks.filter(
(t) => t.status === PaperlessTaskStatus.Pending
).length
component.activeTab = TaskTab.Queued
it('should filter visible sections by selected status', () => {
component.setSection(TaskSection.InProgress)
fixture.detectChanges()
expect(tabButtons[3].nativeElement.textContent).toEqual(
`Queued${currentTasksLength}`
expect(component.visibleSections).toEqual([TaskSection.InProgress])
expect(fixture.nativeElement.textContent).toContain('In progress')
expect(fixture.nativeElement.textContent).not.toContain('Recent completed')
})
it('should filter tasks by task type', () => {
component.setSection(TaskSection.InProgress)
component.setTaskType(PaperlessTaskType.SanityCheck)
expect(component.tasksForSection(TaskSection.InProgress)).toHaveLength(1)
expect(component.tasksForSection(TaskSection.InProgress)[0].task_type).toBe(
PaperlessTaskType.SanityCheck
)
})
it('should to go page 1 between tab switch', () => {
component.page = 10
component.duringTabChange()
expect(component.page).toEqual(1)
it('should filter tasks by trigger source', () => {
component.setSection(TaskSection.InProgress)
component.setTriggerSource(PaperlessTaskTriggerSource.EmailConsume)
expect(component.tasksForSection(TaskSection.InProgress)).toHaveLength(1)
expect(
component.tasksForSection(TaskSection.InProgress)[0].trigger_source
).toBe(PaperlessTaskTriggerSource.EmailConsume)
})
it('should reset all active filters together', () => {
component.setSection(TaskSection.InProgress)
component.setTaskType(PaperlessTaskType.SanityCheck)
component.setTriggerSource(PaperlessTaskTriggerSource.System)
component.filterText = 'system'
jest.advanceTimersByTime(150)
expect(component.isFiltered).toBe(true)
component.resetFilters()
expect(component.selectedSection).toBe(TaskSection.InProgress)
expect(component.selectedTaskType).toBeNull()
expect(component.selectedTriggerSource).toBeNull()
expect(component.filterText).toBe('')
expect(component.isFiltered).toBe(false)
})
it('should keep header controls focused on actions and auto refresh', () => {
fixture.detectChanges()
const header = fixture.debugElement.query(By.css('pngx-page-header'))
const headerText = header.nativeElement.textContent
expect(headerText).toContain('Dismiss visible')
expect(headerText).toContain('Auto refresh')
expect(headerText).not.toContain('All types')
expect(headerText).not.toContain('All sources')
expect(headerText).not.toContain('Reset filters')
})
it('should render the view scope row above the filter bar', () => {
fixture.detectChanges()
const controls = fixture.debugElement.query(By.css('.task-controls'))
const viewScope = controls.query(By.css('.task-view-scope'))
const search = controls.query(By.css('.task-search'))
expect(viewScope).not.toBeNull()
expect(search).not.toBeNull()
expect(
viewScope.nativeElement.compareDocumentPosition(search.nativeElement) &
Node.DOCUMENT_POSITION_FOLLOWING
).toBeTruthy()
})
it('should expose stable task type options and disable empty ones', () => {
expect(component.taskTypeOptions.map((option) => option.value)).toContain(
PaperlessTaskType.TrainClassifier
)
expect(
component.isTaskTypeOptionDisabled(PaperlessTaskType.TrainClassifier)
).toBe(true)
expect(
component.isTaskTypeOptionDisabled(PaperlessTaskType.ConsumeFile)
).toBe(false)
})
it('should fall back to the raw selected task type label when no option matches', () => {
component.selectedTaskType = 'unknown_task_type' as PaperlessTaskType
expect(component.selectedTaskTypeLabel).toBe('unknown_task_type')
})
it('should expose stable trigger source options and disable empty ones', () => {
expect(
component.triggerSourceOptions.map((option) => option.value)
).toContain(PaperlessTaskTriggerSource.ApiUpload)
expect(
component.isTriggerSourceOptionDisabled(
PaperlessTaskTriggerSource.ApiUpload
)
).toBe(true)
expect(
component.isTriggerSourceOptionDisabled(
PaperlessTaskTriggerSource.EmailConsume
)
).toBe(false)
})
it('should fall back to the raw selected trigger source label when no option matches', () => {
component.selectedTriggerSource =
'unknown_trigger_source' as PaperlessTaskTriggerSource
expect(component.selectedTriggerSourceLabel).toBe('unknown_trigger_source')
})
it('should support expanding / collapsing one task at a time', () => {
@@ -230,6 +344,31 @@ describe('TasksComponent', () => {
expect(component.expandedTask).toBeUndefined()
})
it('should show structured task details when expanded', () => {
component.setSection(TaskSection.InProgress)
component.expandTask(tasks[6])
fixture.detectChanges()
const detailText = fixture.nativeElement.textContent
expect(detailText).toContain('Input data')
expect(detailText).toContain('Result data')
expect(detailText).toContain('"scope": "global"')
expect(detailText).toContain('"issues_found": 0')
})
it('should show duplicate warnings and duplicate details when present', () => {
component.setSection(TaskSection.Completed)
component.expandTask(tasks[3])
fixture.detectChanges()
const content = fixture.nativeElement.textContent
expect(content).toContain('Duplicate of document #99')
expect(content).toContain('Duplicate')
expect(content).toContain('Open')
})
it('should support dismiss single task', () => {
const dismissSpy = jest.spyOn(tasksService, 'dismissTasks')
component.dismissTask(tasks[0])
@@ -240,7 +379,7 @@ describe('TasksComponent', () => {
component.toggleSelected(tasks[0])
component.toggleSelected(tasks[1])
component.toggleSelected(tasks[3])
component.toggleSelected(tasks[3]) // uncheck, for coverage
component.toggleSelected(tasks[3])
const selected = new Set([tasks[0].id, tasks[1].id])
expect(component.selectedTasks).toEqual(selected)
let modal: NgbModalRef
@@ -289,31 +428,50 @@ describe('TasksComponent', () => {
expect(component.selectedTasks.size).toBe(0)
})
it('should support dismiss all tasks', () => {
it('should support dismiss visible tasks', () => {
component.setSection(TaskSection.NeedsAttention)
let modal: NgbModalRef
modalService.activeInstances.subscribe((m) => (modal = m[m.length - 1]))
const dismissSpy = jest.spyOn(tasksService, 'dismissTasks')
component.dismissTasks()
expect(modal).not.toBeUndefined()
modal.componentInstance.confirmClicked.emit()
expect(dismissSpy).toHaveBeenCalledWith(new Set(tasks.map((t) => t.id)))
expect(dismissSpy).toHaveBeenCalledWith(new Set([467, 466]))
})
it('should support toggle all tasks', () => {
it('should dismiss the currently visible scoped and filtered tasks', () => {
component.setSection(TaskSection.InProgress)
component.setTaskType(PaperlessTaskType.SanityCheck)
component.setTriggerSource(PaperlessTaskTriggerSource.System)
const dismissSpy = jest.spyOn(tasksService, 'dismissTasks')
component.dismissTasks()
expect(dismissSpy).toHaveBeenCalledWith(new Set([461]))
})
it('should support toggling a full section', () => {
component.setSection(TaskSection.NeedsAttention)
fixture.detectChanges()
const toggleCheck = fixture.debugElement.query(
By.css('table input[type=checkbox]')
)
toggleCheck.nativeElement.dispatchEvent(new MouseEvent('click'))
fixture.detectChanges()
expect(component.selectedTasks).toEqual(
new Set(
tasks
.filter((t) => t.status === PaperlessTaskStatus.Failed)
.map((t) => t.id)
)
By.css('#all-tasks-needs_attention')
)
expect(toggleCheck).not.toBeNull()
toggleCheck.nativeElement.dispatchEvent(new MouseEvent('click'))
fixture.detectChanges()
expect(component.selectedTasks).toEqual(new Set([467, 466]))
})
it('should remove a full section from selection when toggled off', () => {
component.setSection(TaskSection.NeedsAttention)
component.selectedTasks = new Set([467, 466])
component.toggleSection(TaskSection.NeedsAttention, {
target: { checked: false },
} as PointerEvent)
expect(component.selectedTasks).toEqual(new Set())
})
@@ -322,7 +480,7 @@ describe('TasksComponent', () => {
component.dismissAndGo(tasks[3])
expect(routerSpy).toHaveBeenCalledWith([
'documents',
tasks[3].related_document,
tasks[3].related_document_ids?.[0],
])
})
@@ -336,57 +494,127 @@ describe('TasksComponent', () => {
})
it('should filter tasks by file name', () => {
fixture.detectChanges()
const input = fixture.debugElement.query(
By.css('pngx-page-header input[type=text]')
By.css('.task-search input[type=text]')
)
expect(input).not.toBeNull()
input.nativeElement.value = '191092'
input.nativeElement.dispatchEvent(new Event('input'))
jest.advanceTimersByTime(150) // debounce time
jest.advanceTimersByTime(150)
fixture.detectChanges()
expect(component.filterText).toEqual('191092')
expect(
fixture.debugElement.queryAll(By.css('table tbody tr')).length
).toEqual(2) // 1 task x 2 lines
expect(component.tasksForSection(TaskSection.NeedsAttention)).toHaveLength(
1
)
})
it('should match task type and source in name filtering', () => {
component.setSection(TaskSection.InProgress)
component.filterText = 'system'
jest.advanceTimersByTime(150)
expect(component.tasksForSection(TaskSection.InProgress)).toHaveLength(1)
expect(component.tasksForSection(TaskSection.InProgress)[0].task_type).toBe(
PaperlessTaskType.SanityCheck
)
})
it('should fall back to task type when filename is unavailable', () => {
component.setSection(TaskSection.InProgress)
fixture.detectChanges()
const nameColumn = fixture.debugElement.queryAll(
By.css('tbody td.name-col')
)
const sanityTaskRow = nameColumn.find((cell) =>
cell.nativeElement.textContent.includes('Sanity Check')
)
expect(sanityTaskRow.nativeElement.textContent).toContain('Sanity Check')
expect(sanityTaskRow.nativeElement.textContent).toContain('System')
})
it('should filter tasks by result', () => {
component.activeTab = TaskTab.Failed
fixture.detectChanges()
component.setSection(TaskSection.NeedsAttention)
component.filterTargetID = 1
fixture.detectChanges()
const input = fixture.debugElement.query(
By.css('pngx-page-header input[type=text]')
By.css('.task-search input[type=text]')
)
expect(input).not.toBeNull()
input.nativeElement.value = 'duplicate'
input.nativeElement.dispatchEvent(new Event('input'))
jest.advanceTimersByTime(150) // debounce time
jest.advanceTimersByTime(150)
fixture.detectChanges()
expect(component.filterText).toEqual('duplicate')
expect(component.tasksForSection(TaskSection.NeedsAttention)).toHaveLength(
2
)
})
it('should prefer explicit reason in the result message', () => {
expect(
fixture.debugElement.queryAll(By.css('table tbody tr')).length
).toEqual(4) // 2 tasks x 2 lines
component.taskResultMessage({
...tasks[0],
result_data: { reason: 'Manual review required', duplicate_of: 311 },
})
).toBe('Manual review required')
})
it('should return null preview and popover text when there is no result message', () => {
expect(component.taskResultPreview(tasks[2])).toBeNull()
expect(component.taskResultPopoverMessage(tasks[2])).toBe('')
expect(component.taskResultMessageOverflowsPopover(tasks[2])).toBe(false)
})
it('should navigate to a duplicate document details page', () => {
const routerSpy = jest.spyOn(router, 'navigate')
component.openDuplicateDocument(99)
expect(routerSpy).toHaveBeenCalledWith(['documents', 99, 'details'])
})
it('should report when a result message overflows the popover limit', () => {
const longMessage = 'x'.repeat(350)
const task = {
...tasks[0],
result_data: { error_message: longMessage },
}
expect(component.taskResultPopoverMessage(task)).toBe(
longMessage.slice(0, 300)
)
expect(component.taskResultMessageOverflowsPopover(task)).toBe(true)
})
it('should support keyboard events for filtering', () => {
fixture.detectChanges()
const input = fixture.debugElement.query(
By.css('pngx-page-header input[type=text]')
By.css('.task-search input[type=text]')
)
expect(input).not.toBeNull()
input.nativeElement.value = '191092'
input.nativeElement.dispatchEvent(
new KeyboardEvent('keyup', { key: 'Enter' })
)
expect(component.filterText).toEqual('191092') // no debounce needed
expect(component.filterText).toEqual('191092')
input.nativeElement.dispatchEvent(
new KeyboardEvent('keyup', { key: 'Escape' })
)
expect(component.filterText).toEqual('')
})
it('should reset filter and target on tab switch', () => {
component.filterText = '191092'
component.filterTargetID = 1
component.activeTab = TaskTab.Completed
component.beforeTabChange()
expect(component.filterText).toEqual('')
expect(component.filterTargetID).toEqual(0)
it('should keep clearing selection independent from resetting filters', () => {
component.setTaskType(PaperlessTaskType.ConsumeFile)
component.toggleSelected(tasks[0])
expect(component.selectedTasks.size).toBe(1)
component.clearSelection()
expect(component.selectedTasks.size).toBe(0)
expect(component.selectedTaskType).toBe(PaperlessTaskType.ConsumeFile)
expect(component.isFiltered).toBe(true)
})
})
@@ -1,4 +1,4 @@
import { NgTemplateOutlet, SlicePipe } from '@angular/common'
import { JsonPipe, NgTemplateOutlet } from '@angular/common'
import { Component, inject, OnDestroy, OnInit } from '@angular/core'
import { FormsModule, ReactiveFormsModule } from '@angular/forms'
import { Router } from '@angular/router'
@@ -6,8 +6,6 @@ import {
NgbCollapseModule,
NgbDropdownModule,
NgbModal,
NgbNavModule,
NgbPaginationModule,
NgbPopoverModule,
} from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
@@ -20,7 +18,12 @@ import {
takeUntil,
timer,
} from 'rxjs'
import { PaperlessTask } from 'src/app/data/paperless-task'
import {
PaperlessTask,
PaperlessTaskStatus,
PaperlessTaskTriggerSource,
PaperlessTaskType,
} from 'src/app/data/paperless-task'
import { IfPermissionsDirective } from 'src/app/directives/if-permissions.directive'
import { CustomDatePipe } from 'src/app/pipes/custom-date.pipe'
import { TasksService } from 'src/app/services/tasks.service'
@@ -29,11 +32,11 @@ import { ConfirmDialogComponent } from '../../common/confirm-dialog/confirm-dial
import { PageHeaderComponent } from '../../common/page-header/page-header.component'
import { LoadingComponentWithPermissions } from '../../loading-component/loading.component'
export enum TaskTab {
Queued = 'queued',
Started = 'started',
export enum TaskSection {
All = 'all',
NeedsAttention = 'needs_attention',
InProgress = 'in_progress',
Completed = 'completed',
Failed = 'failed',
}
enum TaskFilterTargetID {
@@ -46,6 +49,82 @@ const FILTER_TARGETS = [
{ id: TaskFilterTargetID.Result, name: $localize`Result` },
]
const SECTION_LABELS = {
[TaskSection.All]: $localize`All`,
[TaskSection.NeedsAttention]: $localize`Needs attention`,
[TaskSection.InProgress]: $localize`In progress`,
[TaskSection.Completed]: $localize`Recently completed`,
}
const TASK_TYPE_OPTIONS: Array<{
value: PaperlessTaskType
label: string
}> = [
{
value: PaperlessTaskType.ConsumeFile,
label: $localize`Consume File`,
},
{
value: PaperlessTaskType.TrainClassifier,
label: $localize`Train Classifier`,
},
{
value: PaperlessTaskType.SanityCheck,
label: $localize`Sanity Check`,
},
{ value: PaperlessTaskType.MailFetch, label: $localize`Mail Fetch` },
{ value: PaperlessTaskType.LlmIndex, label: $localize`LLM Index` },
{
value: PaperlessTaskType.EmptyTrash,
label: $localize`Empty Trash`,
},
{
value: PaperlessTaskType.CheckWorkflows,
label: $localize`Check Workflows`,
},
{
value: PaperlessTaskType.BulkUpdate,
label: $localize`Bulk Update`,
},
{
value: PaperlessTaskType.ReprocessDocument,
label: $localize`Reprocess Document`,
},
{
value: PaperlessTaskType.BuildShareLink,
label: $localize`Build Share Link`,
},
{
value: PaperlessTaskType.BulkDelete,
label: $localize`Bulk Delete`,
},
]
const TRIGGER_SOURCE_OPTIONS: Array<{
value: PaperlessTaskTriggerSource
label: string
}> = [
{
value: PaperlessTaskTriggerSource.Scheduled,
label: $localize`Scheduled`,
},
{ value: PaperlessTaskTriggerSource.WebUI, label: $localize`Web UI` },
{
value: PaperlessTaskTriggerSource.ApiUpload,
label: $localize`API Upload`,
},
{
value: PaperlessTaskTriggerSource.FolderConsume,
label: $localize`Folder Consume`,
},
{
value: PaperlessTaskTriggerSource.EmailConsume,
label: $localize`Email Consume`,
},
{ value: PaperlessTaskTriggerSource.System, label: $localize`System` },
{ value: PaperlessTaskTriggerSource.Manual, label: $localize`Manual` },
]
@Component({
selector: 'pngx-tasks',
templateUrl: './tasks.component.html',
@@ -54,14 +133,12 @@ const FILTER_TARGETS = [
PageHeaderComponent,
IfPermissionsDirective,
CustomDatePipe,
SlicePipe,
JsonPipe,
FormsModule,
ReactiveFormsModule,
NgTemplateOutlet,
NgbCollapseModule,
NgbDropdownModule,
NgbNavModule,
NgbPaginationModule,
NgbPopoverModule,
NgxBootstrapIconsModule,
],
@@ -75,15 +152,18 @@ export class TasksComponent
private readonly router = inject(Router)
private readonly toastService = inject(ToastService)
public activeTab: TaskTab
readonly TaskSection = TaskSection
readonly sections = [
TaskSection.NeedsAttention,
TaskSection.InProgress,
TaskSection.Completed,
]
public selectedTasks: Set<number> = new Set()
public togggleAll: boolean = false
public expandedTask: number
public pageSize: number = 25
public page: number = 1
public autoRefreshEnabled: boolean = true
public selectedSection: TaskSection = TaskSection.All
public selectedTaskType: PaperlessTaskType | null = null
public selectedTriggerSource: PaperlessTaskTriggerSource | null = null
private _filterText: string = ''
get filterText() {
@@ -95,20 +175,81 @@ export class TasksComponent
public filterTargetID: TaskFilterTargetID = TaskFilterTargetID.Name
public get filterTargetName(): string {
return this.filterTargets.find((t) => t.id == this.filterTargetID).name
return FILTER_TARGETS.find((t) => t.id == this.filterTargetID).name
}
private filterDebounce: Subject<string> = new Subject<string>()
public get filterTargets(): Array<{ id: number; name: string }> {
return [TaskTab.Failed, TaskTab.Completed].includes(this.activeTab)
? FILTER_TARGETS
: FILTER_TARGETS.slice(0, 1)
return FILTER_TARGETS
}
public get taskTypeOptions(): Array<{
value: PaperlessTaskType
label: string
}> {
return TASK_TYPE_OPTIONS
}
public get triggerSourceOptions(): Array<{
value: PaperlessTaskTriggerSource
label: string
}> {
return TRIGGER_SOURCE_OPTIONS
}
public get selectedTaskTypeLabel(): string {
if (this.selectedTaskType === null) {
return $localize`All types`
}
return (
this.taskTypeOptions.find(
(option) => option.value === this.selectedTaskType
)?.label ?? this.selectedTaskType
)
}
public get selectedTriggerSourceLabel(): string {
if (this.selectedTriggerSource === null) {
return $localize`All sources`
}
return (
this.triggerSourceOptions.find(
(option) => option.value === this.selectedTriggerSource
)?.label ?? this.selectedTriggerSource
)
}
get dismissButtonText(): string {
return this.selectedTasks.size > 0
? $localize`Dismiss selected`
: $localize`Dismiss all`
: $localize`Dismiss visible`
}
get visibleSections(): TaskSection[] {
const sections =
this.selectedSection === TaskSection.All
? this.sections
: [this.selectedSection]
return sections.filter(
(section) => this.tasksForSection(section).length > 0
)
}
get visibleTasks(): PaperlessTask[] {
return this.visibleSections.flatMap((section) =>
this.tasksForSection(section)
)
}
get isFiltered(): boolean {
return (
this.selectedTaskType !== null ||
this.selectedTriggerSource !== null ||
this._filterText.length > 0
)
}
ngOnInit() {
@@ -143,14 +284,16 @@ export class TasksComponent
dismissTasks(task: PaperlessTask = undefined) {
let tasks = task ? new Set([task.id]) : new Set(this.selectedTasks.values())
if (!task && tasks.size == 0)
tasks = new Set(this.tasksService.allFileTasks.map((t) => t.id))
if (!task && tasks.size == 0) {
tasks = new Set(this.visibleTasks.map((t) => t.id))
}
if (tasks.size > 1) {
let modal = this.modalService.open(ConfirmDialogComponent, {
backdrop: 'static',
})
modal.componentInstance.title = $localize`Confirm Dismiss All`
modal.componentInstance.messageBold = $localize`Dismiss all ${tasks.size} tasks?`
modal.componentInstance.title = $localize`Confirm Dismiss`
modal.componentInstance.messageBold = $localize`Dismiss ${tasks.size} tasks?`
modal.componentInstance.btnClass = 'btn-warning'
modal.componentInstance.btnCaption = $localize`Dismiss`
modal.componentInstance.confirmClicked.pipe(first()).subscribe(() => {
@@ -164,7 +307,7 @@ export class TasksComponent
})
this.clearSelection()
})
} else {
} else if (tasks.size === 1) {
this.tasksService.dismissTasks(tasks).subscribe({
error: (e) =>
this.toastService.showError($localize`Error dismissing task`, e),
@@ -175,7 +318,7 @@ export class TasksComponent
dismissAndGo(task: PaperlessTask) {
this.dismissTask(task)
this.router.navigate(['documents', task.related_document])
this.router.navigate(['documents', task.related_document_ids?.[0]])
}
expandTask(task: PaperlessTask) {
@@ -188,75 +331,167 @@ export class TasksComponent
: this.selectedTasks.add(task.id)
}
get currentTasks(): PaperlessTask[] {
let tasks: PaperlessTask[] = []
switch (this.activeTab) {
case TaskTab.Queued:
tasks = this.tasksService.queuedFileTasks
break
case TaskTab.Started:
tasks = this.tasksService.startedFileTasks
break
case TaskTab.Completed:
tasks = this.tasksService.completedFileTasks
break
case TaskTab.Failed:
tasks = this.tasksService.failedFileTasks
break
toggleSection(section: TaskSection, event: PointerEvent) {
const sectionTasks = this.tasksForSection(section)
if ((event.target as HTMLInputElement).checked) {
sectionTasks.forEach((task) => this.selectedTasks.add(task.id))
} else {
sectionTasks.forEach((task) => this.selectedTasks.delete(task.id))
}
if (this._filterText.length) {
tasks = tasks.filter((t) => {
if (this.filterTargetID == TaskFilterTargetID.Name) {
return t.task_file_name
.toLowerCase()
.includes(this._filterText.toLowerCase())
} else if (this.filterTargetID == TaskFilterTargetID.Result) {
return t.result.toLowerCase().includes(this._filterText.toLowerCase())
}
})
}
return tasks
}
toggleAll(event: PointerEvent) {
if ((event.target as HTMLInputElement).checked) {
this.selectedTasks = new Set(this.currentTasks.map((t) => t.id))
} else {
this.clearSelection()
areAllSelected(tasks: PaperlessTask[]): boolean {
return (
tasks.length > 0 && tasks.every((task) => this.selectedTasks.has(task.id))
)
}
taskDisplayName(task: PaperlessTask): string {
return task.input_data?.filename?.toString() || task.task_type_display
}
taskShowsSeparateTypeLabel(task: PaperlessTask): boolean {
return this.taskDisplayName(task) !== task.task_type_display
}
taskResultMessage(task: PaperlessTask): string | null {
if (!task.result_data) {
return null
}
const documentId = task.result_data?.['document_id']
if (typeof documentId === 'number') {
return `Success. New document id ${documentId} created`
}
const reason = task.result_data?.['reason']
if (typeof reason === 'string') {
return reason
}
const duplicateOf = task.result_data?.['duplicate_of']
if (typeof duplicateOf === 'number') {
return `Duplicate of document #${duplicateOf}`
}
const errorMessage = task.result_data?.['error_message']
if (typeof errorMessage === 'string') {
return errorMessage
}
return null
}
taskResultPreview(task: PaperlessTask): string | null {
const message = this.taskResultMessage(task)
if (!message) {
return null
}
return message.length > 50 ? `${message.slice(0, 50)}...` : message
}
taskHasLongResultMessage(task: PaperlessTask): boolean {
return (this.taskResultMessage(task)?.length ?? 0) > 50
}
taskHasResultMessage(task: PaperlessTask): boolean {
return !!this.taskResultMessage(task)
}
duplicateDocumentId(task: PaperlessTask): number | null {
const duplicateOf = task.result_data?.['duplicate_of']
return typeof duplicateOf === 'number' ? duplicateOf : null
}
duplicateTaskLabel(task: PaperlessTask): string {
return $localize`Duplicate of document #${this.duplicateDocumentId(task)}`
}
openDuplicateDocument(documentId: number) {
this.router.navigate(['documents', documentId, 'details'])
}
taskResultPopoverMessage(task: PaperlessTask): string {
return this.taskResultMessage(task)?.slice(0, 300) ?? ''
}
taskResultMessageOverflowsPopover(task: PaperlessTask): boolean {
return (this.taskResultMessage(task)?.length ?? 0) > 300
}
tasksForSection(section: TaskSection): PaperlessTask[] {
let tasks = this.tasksService.allFileTasks.filter((task) =>
this.taskBelongsToSection(task, section)
)
return tasks.filter((task) => this.taskMatchesCurrentFilters(task))
}
sectionLabel(section: TaskSection): string {
return SECTION_LABELS[section]
}
sectionCount(section: TaskSection): number {
return this.tasksService.allFileTasks.filter((task) =>
this.taskBelongsToSection(task, section)
).length
}
sectionShowsResults(section: TaskSection): boolean {
return section !== TaskSection.InProgress
}
setSection(section: TaskSection) {
this.selectedSection = section
this.clearSelection()
}
setTaskType(taskType: PaperlessTaskType | null) {
this.selectedTaskType = taskType
this.clearSelection()
}
setTriggerSource(triggerSource: PaperlessTaskTriggerSource | null) {
this.selectedTriggerSource = triggerSource
this.clearSelection()
}
taskTypeOptionCount(taskType: PaperlessTaskType | null): number {
return this.tasksForOptionCounts({ taskType }).length
}
triggerSourceOptionCount(
triggerSource: PaperlessTaskTriggerSource | null
): number {
return this.tasksForOptionCounts({ triggerSource }).length
}
isTaskTypeOptionDisabled(taskType: PaperlessTaskType | null): boolean {
return this.taskTypeOptionCount(taskType) === 0
}
isTriggerSourceOptionDisabled(
triggerSource: PaperlessTaskTriggerSource | null
): boolean {
return this.triggerSourceOptionCount(triggerSource) === 0
}
clearSelection() {
this.togggleAll = false
this.selectedTasks.clear()
}
duringTabChange() {
this.page = 1
}
beforeTabChange() {
this.resetFilter()
this.filterTargetID = TaskFilterTargetID.Name
}
get activeTabLocalized(): string {
switch (this.activeTab) {
case TaskTab.Queued:
return $localize`queued`
case TaskTab.Started:
return $localize`started`
case TaskTab.Completed:
return $localize`completed`
case TaskTab.Failed:
return $localize`failed`
}
}
public resetFilter() {
this._filterText = ''
}
public resetFilters() {
this.selectedTaskType = null
this.selectedTriggerSource = null
this.resetFilter()
this.clearSelection()
}
filterInputKeyup(event: KeyboardEvent) {
if (event.key == 'Enter') {
this._filterText = (event.target as HTMLInputElement).value
@@ -264,4 +499,87 @@ export class TasksComponent
this.resetFilter()
}
}
private taskBelongsToSection(
task: PaperlessTask,
section: TaskSection
): boolean {
switch (section) {
case TaskSection.NeedsAttention:
return [
PaperlessTaskStatus.Failure,
PaperlessTaskStatus.Revoked,
].includes(task.status)
case TaskSection.InProgress:
return [
PaperlessTaskStatus.Pending,
PaperlessTaskStatus.Started,
].includes(task.status)
case TaskSection.Completed:
return task.status === PaperlessTaskStatus.Success
}
}
private taskMatchesCurrentFilters(task: PaperlessTask): boolean {
return this.taskMatchesFilters(task, {
taskType: this.selectedTaskType,
triggerSource: this.selectedTriggerSource,
})
}
private taskMatchesFilters(
task: PaperlessTask,
{
taskType,
triggerSource,
}: {
taskType: PaperlessTaskType | null
triggerSource: PaperlessTaskTriggerSource | null
}
): boolean {
if (taskType !== null && task.task_type !== taskType) {
return false
}
if (triggerSource !== null && task.trigger_source !== triggerSource) {
return false
}
if (!this._filterText.length) {
return true
}
const query = this._filterText.toLowerCase()
if (this.filterTargetID == TaskFilterTargetID.Name) {
return [
this.taskDisplayName(task),
task.task_type_display,
task.trigger_source_display,
]
.filter(Boolean)
.some((value) => value.toLowerCase().includes(query))
}
return this.taskResultMessage(task)?.toLowerCase().includes(query) ?? false
}
private tasksForOptionCounts({
taskType = this.selectedTaskType,
triggerSource = this.selectedTriggerSource,
}: {
taskType?: PaperlessTaskType | null
triggerSource?: PaperlessTaskTriggerSource | null
}): PaperlessTask[] {
const sections =
this.selectedSection === TaskSection.All
? this.sections
: [this.selectedSection]
return this.tasksService.allFileTasks.filter(
(task) =>
sections.some((section) => this.taskBelongsToSection(task, section)) &&
this.taskMatchesFilters(task, { taskType, triggerSource })
)
}
}
@@ -294,13 +294,13 @@
*pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.PaperlessTask }"
tourAnchor="tour.file-tasks">
<a class="nav-link" routerLink="tasks" routerLinkActive="active" (click)="closeMenu()"
ngbPopover="File Tasks" i18n-ngbPopover [disablePopover]="!slimSidebarEnabled" placement="end"
ngbPopover="Tasks" i18n-ngbPopover [disablePopover]="!slimSidebarEnabled" placement="end"
container="body" triggers="mouseenter:mouseleave" popoverClass="popover-slim">
<i-bs class="me-2" name="list-task"></i-bs><span><ng-container i18n>File Tasks</ng-container>@if (tasksService.failedFileTasks.length > 0) {
<span><span class="badge bg-danger ms-2 d-inline">{{tasksService.failedFileTasks.length}}</span></span>
<i-bs class="me-2" name="list-task"></i-bs><span><ng-container i18n>Tasks</ng-container>@if (tasksService.needsAttentionTasks.length > 0) {
<span><span class="badge bg-danger ms-2 d-inline">{{tasksService.needsAttentionTasks.length}}</span></span>
}</span>
@if (tasksService.failedFileTasks.length > 0 && slimSidebarEnabled) {
<span class="badge bg-danger position-absolute top-0 end-0 d-none d-md-block">{{tasksService.failedFileTasks.length}}</span>
@if (tasksService.needsAttentionTasks.length > 0 && slimSidebarEnabled) {
<span class="badge bg-danger position-absolute top-0 end-0 d-none d-md-block">{{tasksService.needsAttentionTasks.length}}</span>
}
</a>
</li>
@@ -94,12 +94,18 @@ main {
}
.sidebar.slim:not(.animating) {
transition: none;
li.nav-item span,
.sidebar-heading span {
display: none;
}
}
.sidebar.slim:not(.animating) ~ main.col-slim {
transition: none;
}
.sidebar.animating {
li.nav-item span,
.sidebar-heading span {
@@ -36,6 +36,7 @@ import { RemoteVersionService } from 'src/app/services/rest/remote-version.servi
import { SavedViewService } from 'src/app/services/rest/saved-view.service'
import { SearchService } from 'src/app/services/rest/search.service'
import { SettingsService } from 'src/app/services/settings.service'
import { TasksService } from 'src/app/services/tasks.service'
import { ToastService } from 'src/app/services/toast.service'
import { environment } from 'src/environments/environment'
import { ProfileEditDialogComponent } from '../common/profile-edit-dialog/profile-edit-dialog.component'
@@ -97,6 +98,7 @@ describe('AppFrameComponent', () => {
let savedViewSpy
let modalService: NgbModal
let maybeRefreshSpy
let tasksService: TasksService
beforeEach(async () => {
TestBed.configureTestingModule({
@@ -174,6 +176,7 @@ describe('AppFrameComponent', () => {
openDocumentsService = TestBed.inject(OpenDocumentsService)
modalService = TestBed.inject(NgbModal)
router = TestBed.inject(Router)
tasksService = TestBed.inject(TasksService)
jest
.spyOn(settingsService, 'displayName', 'get')
@@ -444,6 +447,16 @@ describe('AppFrameComponent', () => {
expect(maybeRefreshSpy).toHaveBeenCalled()
})
it('should show tasks badge for needs-attention tasks', () => {
jest
.spyOn(tasksService, 'needsAttentionTasks', 'get')
.mockReturnValue([{} as any, {} as any])
fixture.detectChanges()
expect(fixture.nativeElement.textContent).toContain('Tasks2')
})
it('should indicate attributes management availability when any permission is granted', () => {
jest
.spyOn(permissionsService, 'currentUserCan')
@@ -86,7 +86,7 @@
<div class="selected-icon">
@if (addedRelativeDate) {
<a class="text-light focus-variants" href="javascript:void(0)" (click)="clearAddedRelativeDate()">
<i-bs width="1em" height="1em" name="check" class="variant-unfocused"></i-bs>
<i-bs width="1em" height="1em" name="check" class="variant-unfocused text-dark"></i-bs>
<i-bs width="1em" height="1em" name="x" class="variant-focused text-primary"></i-bs>
</a>
}
@@ -26,6 +26,7 @@ import {
} from 'src/app/data/matching-model'
import { Tag } from 'src/app/data/tag'
import { SETTINGS_KEYS } from 'src/app/data/ui-settings'
import { PermissionsService } from 'src/app/services/permissions.service'
import { TagService } from 'src/app/services/rest/tag.service'
import { UserService } from 'src/app/services/rest/user.service'
import { SettingsService } from 'src/app/services/settings.service'
@@ -87,6 +88,7 @@ describe('EditDialogComponent', () => {
let component: TestComponent
let fixture: ComponentFixture<TestComponent>
let tagService: TagService
let permissionsService: PermissionsService
let settingsService: SettingsService
let activeModal: NgbActiveModal
let httpTestingController: HttpTestingController
@@ -118,8 +120,10 @@ describe('EditDialogComponent', () => {
}).compileComponents()
tagService = TestBed.inject(TagService)
permissionsService = TestBed.inject(PermissionsService)
settingsService = TestBed.inject(SettingsService)
settingsService.currentUser = currentUser
permissionsService.initialize([], currentUser as any)
activeModal = TestBed.inject(NgbActiveModal)
httpTestingController = TestBed.inject(HttpTestingController)
@@ -226,6 +230,25 @@ describe('EditDialogComponent', () => {
expect(updateSpy).toHaveBeenCalled()
})
it('should not submit owner or permissions for non-owner edits', () => {
component.object = tag
component.dialogMode = EditDialogMode.EDIT
component.ngOnInit()
component.objectForm.get('name').setValue('Updated tag')
component.save()
const req = httpTestingController.expectOne(
`${environment.apiBaseUrl}tags/${tag.id}/`
)
expect(req.request.method).toEqual('PUT')
expect(req.request.body.name).toEqual('Updated tag')
expect(req.request.body.owner).toEqual(tag.owner)
expect(req.request.body.set_permissions).toBeUndefined()
req.flush({})
})
it('should create an object on save in edit mode', () => {
const createSpy = jest.spyOn(tagService, 'create')
component.dialogMode = EditDialogMode.CREATE
@@ -18,6 +18,7 @@ import { ObjectWithId } from 'src/app/data/object-with-id'
import { ObjectWithPermissions } from 'src/app/data/object-with-permissions'
import { SETTINGS_KEYS } from 'src/app/data/ui-settings'
import { User } from 'src/app/data/user'
import { PermissionsService } from 'src/app/services/permissions.service'
import { AbstractPaperlessService } from 'src/app/services/rest/abstract-paperless-service'
import { UserService } from 'src/app/services/rest/user.service'
import { SettingsService } from 'src/app/services/settings.service'
@@ -42,6 +43,7 @@ export abstract class EditDialogComponent<
protected activeModal = inject(NgbActiveModal)
protected userService = inject(UserService)
protected settingsService = inject(SettingsService)
protected permissionsService = inject(PermissionsService)
users: User[]
@@ -69,10 +71,6 @@ export abstract class EditDialogComponent<
ngOnInit(): void {
if (this.object != null && this.dialogMode !== EditDialogMode.CREATE) {
if ((this.object as ObjectWithPermissions).permissions) {
this.object['set_permissions'] = this.object['permissions']
}
this.object['permissions_form'] = {
owner: (this.object as ObjectWithPermissions).owner,
set_permissions: (this.object as ObjectWithPermissions).permissions,
@@ -151,18 +149,28 @@ export abstract class EditDialogComponent<
return Object.assign({}, this.objectForm.value)
}
protected shouldSubmitPermissions(): boolean {
return (
this.dialogMode === EditDialogMode.CREATE ||
this.permissionsService.currentUserOwnsObject(this.object)
)
}
save() {
this.error = null
const formValues = this.getFormValues()
const permissionsObject: PermissionsFormObject =
this.objectForm.get('permissions_form')?.value
if (permissionsObject) {
if (permissionsObject && this.shouldSubmitPermissions()) {
formValues.owner = permissionsObject.owner
formValues.set_permissions = permissionsObject.set_permissions
delete formValues.permissions_form
}
delete formValues.permissions_form
var newObject = Object.assign(Object.assign({}, this.object), formValues)
if (!this.shouldSubmitPermissions()) {
delete newObject['set_permissions']
}
var serverResponse: Observable<T>
switch (this.dialogMode) {
case EditDialogMode.CREATE:
@@ -9,7 +9,6 @@ import { first } from 'rxjs'
import { EditDialogComponent } from 'src/app/components/common/edit-dialog/edit-dialog.component'
import { Group } from 'src/app/data/group'
import { User } from 'src/app/data/user'
import { PermissionsService } from 'src/app/services/permissions.service'
import { GroupService } from 'src/app/services/rest/group.service'
import { UserService } from 'src/app/services/rest/user.service'
import { SettingsService } from 'src/app/services/settings.service'
@@ -39,7 +38,6 @@ export class UserEditDialogComponent
implements OnInit
{
private toastService = inject(ToastService)
private permissionsService = inject(PermissionsService)
private groupsService: GroupService
groups: Group[]
@@ -205,6 +205,20 @@ describe('TagsComponent', () => {
expect(component.value).toEqual([2, 1])
})
it('should not duplicate parents when adding sibling nested tags', () => {
const root: Tag = { id: 1, name: 'root' }
const parent: Tag = { id: 2, name: 'parent', parent: 1 }
const leafA: Tag = { id: 3, name: 'leaf-a', parent: 2 }
const leafB: Tag = { id: 4, name: 'leaf-b', parent: 2 }
component.tags = [root, parent, leafA, leafB]
component.value = []
component.addTag(3)
component.addTag(4)
expect(component.value).toEqual([3, 2, 1, 4])
})
it('should return ancestors from root to parent using getParentChain', () => {
const root: Tag = { id: 1, name: 'root' }
const mid: Tag = { id: 2, name: 'mid', parent: 1 }
@@ -153,11 +153,13 @@ export class TagsComponent implements OnInit, ControlValueAccessor {
}
public onAdd(tag: Tag) {
if (tag.parent) {
if (tag?.parent) {
// add all parents recursively
const parent = this.getTag(tag.parent)
this.value = [...this.value, parent.id]
this.onAdd(parent)
if (parent && !this.value.includes(parent.id)) {
this.value = [...this.value, parent.id]
this.onAdd(parent)
}
}
}
@@ -133,28 +133,28 @@ describe('PermissionsSelectComponent', () => {
expect(viewInput.nativeElement.disabled).toBeFalsy()
})
it('should treat system status as view-only', () => {
it('should treat system monitoring as view-only', () => {
component.ngOnInit()
fixture.detectChanges()
expect(
component.isActionSupported(
PermissionType.SystemStatus,
PermissionType.SystemMonitoring,
PermissionAction.View
)
).toBeTruthy()
expect(
component.isActionSupported(
PermissionType.SystemStatus,
PermissionType.SystemMonitoring,
PermissionAction.Change
)
).toBeFalsy()
const changeInput = fixture.debugElement.query(
By.css('input#SystemStatus_Change')
By.css('input#SystemMonitoring_Change')
)
const viewInput = fixture.debugElement.query(
By.css('input#SystemStatus_View')
By.css('input#SystemMonitoring_View')
)
expect(changeInput.nativeElement.disabled).toBeTruthy()
@@ -261,7 +261,7 @@ export class PermissionsSelectComponent
// Global statistics and system status only support view
if (
type === PermissionType.GlobalStatistics ||
type === PermissionType.SystemStatus
type === PermissionType.SystemMonitoring
) {
return action === PermissionAction.View
}
@@ -159,25 +159,11 @@
<button class="btn btn-sm d-flex align-items-center btn-dark text-uppercase small" [ngbPopover]="indexStatus" triggers="click mouseenter:mouseleave">
{{status.tasks.index_status}}
@if (status.tasks.index_status === 'OK') {
@if (isStale(status.tasks.index_last_modified)) {
<i-bs name="exclamation-triangle-fill" class="text-warning ms-2 lh-1"></i-bs>
} @else {
<i-bs name="check-circle-fill" class="text-primary ms-2 lh-1"></i-bs>
}
<i-bs name="check-circle-fill" class="text-primary ms-2 lh-1"></i-bs>
} @else {
<i-bs name="exclamation-triangle-fill" class="text-danger ms-2 lh-1"></i-bs>
}
</button>
@if (currentUserIsSuperUser) {
@if (isRunning(PaperlessTaskName.IndexOptimize)) {
<div class="spinner-border spinner-border-sm ms-2" role="status"></div>
} @else {
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskName.IndexOptimize)">
<i-bs name="play-fill" class="me-1"></i-bs>
<ng-container i18n>Run Task</ng-container>
</button>
}
}
</dd>
<ng-template #indexStatus>
@if (status.tasks.index_status === 'OK') {
@@ -203,10 +189,10 @@
}
</button>
@if (currentUserIsSuperUser) {
@if (isRunning(PaperlessTaskName.TrainClassifier)) {
@if (isRunning(PaperlessTaskType.TrainClassifier)) {
<div class="spinner-border spinner-border-sm ms-2" role="status"></div>
} @else {
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskName.TrainClassifier)">
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskType.TrainClassifier)">
<i-bs name="play-fill" class="me-1"></i-bs>
<ng-container i18n>Run Task</ng-container>
</button>
@@ -237,10 +223,10 @@
}
</button>
@if (currentUserIsSuperUser) {
@if (isRunning(PaperlessTaskName.SanityCheck)) {
@if (isRunning(PaperlessTaskType.SanityCheck)) {
<div class="spinner-border spinner-border-sm ms-2" role="status"></div>
} @else {
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskName.SanityCheck)">
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskType.SanityCheck)">
<i-bs name="play-fill" class="me-1"></i-bs>
<ng-container i18n>Run Task</ng-container>
</button>
@@ -285,10 +271,10 @@
}
</button>
@if (currentUserIsSuperUser) {
@if (isRunning(PaperlessTaskName.LLMIndexUpdate)) {
@if (isRunning(PaperlessTaskType.LlmIndex)) {
<div class="spinner-border spinner-border-sm ms-2" role="status"></div>
} @else {
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskName.LLMIndexUpdate)">
<button class="btn btn-sm d-flex align-items-center btn-dark small ms-2" (click)="runTask(PaperlessTaskType.LlmIndex)">
<i-bs name="play-fill" class="me-1"></i-bs>
<ng-container i18n>Run Task</ng-container>
</button>
@@ -25,7 +25,7 @@ import {
import { NgbActiveModal } from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule, allIcons } from 'ngx-bootstrap-icons'
import { Subject, of, throwError } from 'rxjs'
import { PaperlessTaskName } from 'src/app/data/paperless-task'
import { PaperlessTaskType } from 'src/app/data/paperless-task'
import {
InstallType,
SystemStatus,
@@ -138,9 +138,9 @@ describe('SystemStatusDialogComponent', () => {
})
it('should check if task is running', () => {
component.runTask(PaperlessTaskName.IndexOptimize)
expect(component.isRunning(PaperlessTaskName.IndexOptimize)).toBeTruthy()
expect(component.isRunning(PaperlessTaskName.SanityCheck)).toBeFalsy()
component.runTask(PaperlessTaskType.SanityCheck)
expect(component.isRunning(PaperlessTaskType.SanityCheck)).toBeTruthy()
expect(component.isRunning(PaperlessTaskType.TrainClassifier)).toBeFalsy()
})
it('should support running tasks, refresh status and show toasts', () => {
@@ -151,22 +151,22 @@ describe('SystemStatusDialogComponent', () => {
// fail first
runSpy.mockReturnValue(throwError(() => new Error('error')))
component.runTask(PaperlessTaskName.IndexOptimize)
expect(runSpy).toHaveBeenCalledWith(PaperlessTaskName.IndexOptimize)
component.runTask(PaperlessTaskType.SanityCheck)
expect(runSpy).toHaveBeenCalledWith(PaperlessTaskType.SanityCheck)
expect(toastErrorSpy).toHaveBeenCalledWith(
`Failed to start task ${PaperlessTaskName.IndexOptimize}, see the logs for more details`,
`Failed to start task ${PaperlessTaskType.SanityCheck}, see the logs for more details`,
expect.any(Error)
)
// succeed
runSpy.mockReturnValue(of({}))
getStatusSpy.mockReturnValue(of(status))
component.runTask(PaperlessTaskName.IndexOptimize)
expect(runSpy).toHaveBeenCalledWith(PaperlessTaskName.IndexOptimize)
component.runTask(PaperlessTaskType.SanityCheck)
expect(runSpy).toHaveBeenCalledWith(PaperlessTaskType.SanityCheck)
expect(getStatusSpy).toHaveBeenCalled()
expect(toastSpy).toHaveBeenCalledWith(
`Task ${PaperlessTaskName.IndexOptimize} started`
`Task ${PaperlessTaskType.SanityCheck} started`
)
})
@@ -8,7 +8,7 @@ import {
} from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { Subject, takeUntil } from 'rxjs'
import { PaperlessTaskName } from 'src/app/data/paperless-task'
import { PaperlessTaskType } from 'src/app/data/paperless-task'
import {
SystemStatus,
SystemStatusItemStatus,
@@ -49,14 +49,14 @@ export class SystemStatusDialogComponent implements OnInit, OnDestroy {
private settingsService = inject(SettingsService)
public SystemStatusItemStatus = SystemStatusItemStatus
public PaperlessTaskName = PaperlessTaskName
public PaperlessTaskType = PaperlessTaskType
public status: SystemStatus
public frontendVersion: string = environment.version
public versionMismatch: boolean = false
public copied: boolean = false
private runningTasks: Set<PaperlessTaskName> = new Set()
private runningTasks: Set<PaperlessTaskType> = new Set()
private unsubscribeNotifier: Subject<any> = new Subject()
get currentUserIsSuperUser(): boolean {
@@ -107,11 +107,11 @@ export class SystemStatusDialogComponent implements OnInit, OnDestroy {
return now.getTime() - date.getTime() > hours * 60 * 60 * 1000
}
public isRunning(taskName: PaperlessTaskName): boolean {
public isRunning(taskName: PaperlessTaskType): boolean {
return this.runningTasks.has(taskName)
}
public runTask(taskName: PaperlessTaskName) {
public runTask(taskName: PaperlessTaskType) {
this.runningTasks.add(taskName)
this.toastService.showInfo(`Task ${taskName} started`)
this.tasksService.run(taskName).subscribe({
@@ -43,7 +43,7 @@
</div>
<p class="card-text">
@if (document) {
@if (document.__search_hit__ && document.__search_hit__.highlights) {
@if (hasSearchHighlights) {
<span [innerHtml]="document.__search_hit__.highlights"></span>
}
@for (highlight of searchNoteHighlights; track highlight) {
@@ -52,7 +52,7 @@
<span [innerHtml]="highlight"></span>
</span>
}
@if (!document.__search_hit__?.score) {
@if (shouldShowContentFallback) {
<span class="result-content">{{contentTrimmed}}</span>
}
} @else {
@@ -65,7 +65,9 @@
}
}
span ::ng-deep .match {
.card-text ::ng-deep .match,
.card-text ::ng-deep b {
font-weight: normal;
color: black;
background-color: rgb(255, 211, 66);
}
@@ -127,6 +127,19 @@ describe('DocumentCardLargeComponent', () => {
expect(component.searchNoteHighlights).toContain('<span>bananas</span>')
})
it('should fall back to document content when a search hit has no highlights', () => {
component.document.__search_hit__ = {
score: 0.9,
rank: 1,
highlights: '',
note_highlights: null,
}
fixture.detectChanges()
expect(fixture.nativeElement.textContent).toContain('Cupcake ipsum')
expect(component.shouldShowContentFallback).toBe(true)
})
it('should try to close the preview on mouse leave', () => {
component.popupPreview = {
close: jest.fn(),
@@ -164,6 +164,17 @@ export class DocumentCardLargeComponent
)
}
get hasSearchHighlights() {
return Boolean(this.document?.__search_hit__?.highlights?.trim()?.length)
}
get shouldShowContentFallback() {
return (
this.document?.__search_hit__?.score == null ||
(!this.hasSearchHighlights && this.searchNoteHighlights.length === 0)
)
}
get notesEnabled(): boolean {
return this.settingsService.get(SETTINGS_KEYS.NOTES_ENABLED)
}
+50 -33
View File
@@ -1,49 +1,66 @@
import { Document } from './document'
import { ObjectWithId } from './object-with-id'
export enum PaperlessTaskType {
Auto = 'auto_task',
ScheduledTask = 'scheduled_task',
ManualTask = 'manual_task',
}
export enum PaperlessTaskName {
ConsumeFile = 'consume_file',
TrainClassifier = 'train_classifier',
SanityCheck = 'check_sanity',
IndexOptimize = 'index_optimize',
LLMIndexUpdate = 'llmindex_update',
SanityCheck = 'sanity_check',
MailFetch = 'mail_fetch',
LlmIndex = 'llm_index',
EmptyTrash = 'empty_trash',
CheckWorkflows = 'check_workflows',
BulkUpdate = 'bulk_update',
ReprocessDocument = 'reprocess_document',
BuildShareLink = 'build_share_link',
BulkDelete = 'bulk_delete',
}
export enum PaperlessTaskTriggerSource {
Scheduled = 'scheduled',
WebUI = 'web_ui',
ApiUpload = 'api_upload',
FolderConsume = 'folder_consume',
EmailConsume = 'email_consume',
System = 'system',
Manual = 'manual',
}
export enum PaperlessTaskStatus {
Pending = 'PENDING',
Started = 'STARTED',
Complete = 'SUCCESS',
Failed = 'FAILURE',
Pending = 'pending',
Started = 'started',
Success = 'success',
Failure = 'failure',
Revoked = 'revoked',
}
export interface PaperlessTask extends ObjectWithId {
type: PaperlessTaskType
status: PaperlessTaskStatus
acknowledged: boolean
task_id: string
task_file_name: string
task_name: PaperlessTaskName
task_type: PaperlessTaskType
task_type_display: string
trigger_source: PaperlessTaskTriggerSource
trigger_source_display: string
status: PaperlessTaskStatus
status_display: string
date_created: Date
date_started?: Date
date_done?: Date
result?: string
related_document?: number
duplicate_documents?: Document[]
duration_seconds?: number
wait_time_seconds?: number
input_data: Record<string, any>
result_data?: Record<string, any>
related_document_ids: number[]
acknowledged: boolean
owner?: number
}
export interface PaperlessTaskSummary {
task_type: PaperlessTaskType
total_count: number
pending_count: number
success_count: number
failure_count: number
avg_duration_seconds: number | null
avg_wait_time_seconds: number | null
last_run: Date | null
last_success: Date | null
last_failure: Date | null
}
@@ -8,7 +8,7 @@ import {
const VIEW_ONLY_PERMISSION_TYPES = new Set<PermissionType>([
PermissionType.GlobalStatistics,
PermissionType.SystemStatus,
PermissionType.SystemMonitoring,
])
describe('PermissionsService', () => {
@@ -270,7 +270,7 @@ describe('PermissionsService', () => {
'delete_applicationconfiguration',
'view_applicationconfiguration',
'view_global_statistics',
'view_system_status',
'view_system_monitoring',
],
{
username: 'testuser',
@@ -30,7 +30,7 @@ export enum PermissionType {
Workflow = '%s_workflow',
ProcessedMail = '%s_processedmail',
GlobalStatistics = '%s_global_statistics',
SystemStatus = '%s_system_status',
SystemMonitoring = '%s_system_monitoring',
}
@Injectable({
+83 -29
View File
@@ -6,8 +6,8 @@ import {
import { TestBed } from '@angular/core/testing'
import { environment } from 'src/environments/environment'
import {
PaperlessTaskName,
PaperlessTaskStatus,
PaperlessTaskTriggerSource,
PaperlessTaskType,
} from '../data/paperless-task'
import { TasksService } from './tasks.service'
@@ -37,7 +37,7 @@ describe('TasksService', () => {
it('calls tasks api endpoint on reload', () => {
tasksService.reload()
const req = httpTestingController.expectOne(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
`${environment.apiBaseUrl}tasks/?acknowledged=false`
)
expect(req.request.method).toEqual('GET')
})
@@ -46,7 +46,7 @@ describe('TasksService', () => {
tasksService.loading = true
tasksService.reload()
httpTestingController.expectNone(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
`${environment.apiBaseUrl}tasks/?acknowledged=false`
)
})
@@ -62,66 +62,69 @@ describe('TasksService', () => {
req.flush([])
// reload is then called
httpTestingController
.expectOne(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
)
.expectOne(`${environment.apiBaseUrl}tasks/?acknowledged=false`)
.flush([])
})
it('sorts tasks returned from api', () => {
it('groups mixed task types by status when reloading', () => {
expect(tasksService.total).toEqual(0)
const mockTasks = [
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Complete,
task_type: PaperlessTaskType.ConsumeFile,
trigger_source: PaperlessTaskTriggerSource.FolderConsume,
status: PaperlessTaskStatus.Success,
acknowledged: false,
task_id: '1234',
task_file_name: 'file1.pdf',
input_data: { filename: 'file1.pdf' },
date_created: new Date(),
related_document_ids: [],
},
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Failed,
task_type: PaperlessTaskType.SanityCheck,
trigger_source: PaperlessTaskTriggerSource.System,
status: PaperlessTaskStatus.Failure,
acknowledged: false,
task_id: '1235',
task_file_name: 'file2.pdf',
input_data: {},
date_created: new Date(),
related_document_ids: [],
},
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
task_type: PaperlessTaskType.MailFetch,
trigger_source: PaperlessTaskTriggerSource.Scheduled,
status: PaperlessTaskStatus.Pending,
acknowledged: false,
task_id: '1236',
task_file_name: 'file3.pdf',
input_data: {},
date_created: new Date(),
related_document_ids: [],
},
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
task_type: PaperlessTaskType.LlmIndex,
trigger_source: PaperlessTaskTriggerSource.WebUI,
status: PaperlessTaskStatus.Started,
acknowledged: false,
task_id: '1237',
task_file_name: 'file4.pdf',
input_data: {},
date_created: new Date(),
related_document_ids: [],
},
{
type: PaperlessTaskType.Auto,
task_name: PaperlessTaskName.ConsumeFile,
status: PaperlessTaskStatus.Complete,
task_type: PaperlessTaskType.EmptyTrash,
trigger_source: PaperlessTaskTriggerSource.Manual,
status: PaperlessTaskStatus.Success,
acknowledged: false,
task_id: '1238',
task_file_name: 'file5.pdf',
input_data: {},
date_created: new Date(),
related_document_ids: [],
},
]
tasksService.reload()
const req = httpTestingController.expectOne(
`${environment.apiBaseUrl}tasks/?task_name=consume_file&acknowledged=false`
`${environment.apiBaseUrl}tasks/?acknowledged=false`
)
req.flush(mockTasks)
@@ -133,10 +136,61 @@ describe('TasksService', () => {
expect(tasksService.startedFileTasks).toHaveLength(1)
})
it('includes revoked tasks in needs attention', () => {
const mockTasks = [
{
task_type: PaperlessTaskType.SanityCheck,
trigger_source: PaperlessTaskTriggerSource.System,
status: PaperlessTaskStatus.Failure,
acknowledged: false,
task_id: '1235',
input_data: {},
date_created: new Date(),
related_document_ids: [],
},
{
task_type: PaperlessTaskType.MailFetch,
trigger_source: PaperlessTaskTriggerSource.Scheduled,
status: PaperlessTaskStatus.Revoked,
acknowledged: false,
task_id: '1236',
input_data: {},
date_created: new Date(),
related_document_ids: [],
},
{
task_type: PaperlessTaskType.EmptyTrash,
trigger_source: PaperlessTaskTriggerSource.Manual,
status: PaperlessTaskStatus.Success,
acknowledged: false,
task_id: '1238',
input_data: {},
date_created: new Date(),
related_document_ids: [],
},
]
tasksService.reload()
const req = httpTestingController.expectOne(
`${environment.apiBaseUrl}tasks/?acknowledged=false`
)
req.flush(mockTasks)
expect(tasksService.needsAttentionTasks).toHaveLength(2)
expect(tasksService.needsAttentionTasks.map((task) => task.status)).toEqual(
expect.arrayContaining([
PaperlessTaskStatus.Failure,
PaperlessTaskStatus.Revoked,
])
)
})
it('supports running tasks', () => {
tasksService.run(PaperlessTaskName.SanityCheck).subscribe((res) => {
tasksService.run(PaperlessTaskType.SanityCheck).subscribe((res) => {
expect(res).toEqual({
result: 'success',
task_id: 'abc-123',
})
})
const req = httpTestingController.expectOne(
@@ -144,7 +198,7 @@ describe('TasksService', () => {
)
expect(req.request.method).toEqual('POST')
req.flush({
result: 'success',
task_id: 'abc-123',
})
})
})
+26 -16
View File
@@ -4,8 +4,8 @@ import { Observable, Subject } from 'rxjs'
import { first, takeUntil, tap } from 'rxjs/operators'
import {
PaperlessTask,
PaperlessTaskName,
PaperlessTaskStatus,
PaperlessTaskType,
} from 'src/app/data/paperless-task'
import { environment } from 'src/environments/environment'
@@ -18,7 +18,7 @@ export class TasksService {
private baseUrl: string = environment.apiBaseUrl
private endpoint: string = 'tasks'
public loading: boolean
public loading: boolean = false
private fileTasks: PaperlessTask[] = []
@@ -33,21 +33,35 @@ export class TasksService {
}
public get queuedFileTasks(): PaperlessTask[] {
return this.fileTasks.filter((t) => t.status == PaperlessTaskStatus.Pending)
return this.fileTasks.filter(
(t) => t.status === PaperlessTaskStatus.Pending
)
}
public get startedFileTasks(): PaperlessTask[] {
return this.fileTasks.filter((t) => t.status == PaperlessTaskStatus.Started)
return this.fileTasks.filter(
(t) => t.status === PaperlessTaskStatus.Started
)
}
public get completedFileTasks(): PaperlessTask[] {
return this.fileTasks.filter(
(t) => t.status == PaperlessTaskStatus.Complete
(t) => t.status === PaperlessTaskStatus.Success
)
}
public get failedFileTasks(): PaperlessTask[] {
return this.fileTasks.filter((t) => t.status == PaperlessTaskStatus.Failed)
return this.fileTasks.filter(
(t) => t.status === PaperlessTaskStatus.Failure
)
}
public get needsAttentionTasks(): PaperlessTask[] {
return this.fileTasks.filter((t) =>
[PaperlessTaskStatus.Failure, PaperlessTaskStatus.Revoked].includes(
t.status
)
)
}
public reload() {
@@ -56,18 +70,16 @@ export class TasksService {
this.http
.get<PaperlessTask[]>(
`${this.baseUrl}${this.endpoint}/?task_name=consume_file&acknowledged=false`
`${this.baseUrl}${this.endpoint}/?acknowledged=false`
)
.pipe(takeUntil(this.unsubscribeNotifer), first())
.subscribe((r) => {
this.fileTasks = r.filter(
(t) => t.task_name == PaperlessTaskName.ConsumeFile
)
this.fileTasks = r
this.loading = false
})
}
public dismissTasks(task_ids: Set<number>) {
public dismissTasks(task_ids: Set<number>): Observable<any> {
return this.http
.post(`${this.baseUrl}tasks/acknowledge/`, {
tasks: [...task_ids],
@@ -85,12 +97,10 @@ export class TasksService {
this.unsubscribeNotifer.next(true)
}
public run(taskName: PaperlessTaskName): Observable<any> {
return this.http.post<any>(
public run(taskType: PaperlessTaskType): Observable<{ task_id: string }> {
return this.http.post<{ task_id: string }>(
`${environment.apiBaseUrl}${this.endpoint}/run/`,
{
task_name: taskName,
}
{ task_type: taskType }
)
}
}
@@ -142,6 +142,21 @@ describe('CustomFieldQueryAtom', () => {
atom.value = [1, 3]
expect(changeSpy).toHaveBeenCalledTimes(1)
})
it('should emit one changed event when operator change coerces value', () => {
const atom = new CustomFieldQueryAtom([
1,
CustomFieldQueryOperator.In,
[1, 2],
])
const changeSpy = jest.fn()
atom.changed.subscribe(changeSpy)
atom.operator = CustomFieldQueryOperator.Exact
expect(changeSpy).toHaveBeenCalledTimes(1)
expect(atom.serialize()).toEqual([1, CustomFieldQueryOperator.Exact, ''])
})
})
describe('CustomFieldQueryExpression', () => {
@@ -70,29 +70,29 @@ export class CustomFieldQueryAtom extends CustomFieldQueryElement {
const newTypes: string[] =
CUSTOM_FIELD_QUERY_VALUE_TYPES_BY_OPERATOR[operator]?.split('|')
if (!newTypes) {
this.value = null
this._value = null
} else {
if (!newTypes.includes(typeof this.value)) {
switch (newTypes[0]) {
case 'string':
this.value = ''
this._value = ''
break
case 'boolean':
this.value = 'true'
this._value = 'true'
break
case 'array':
this.value = []
this._value = []
break
case 'number':
const num = parseFloat(this.value as string)
this.value = isNaN(num) ? null : num.toString()
this._value = isNaN(num) ? null : num.toString()
break
}
} else if (
['true', 'false'].includes(this.value as string) &&
newTypes.includes('string')
) {
this.value = ''
this._value = ''
}
}
super.operator = operator
+1 -1
View File
@@ -6,7 +6,7 @@ export const environment = {
apiVersion: '10', // match src/paperless/settings.py
appTitle: 'Paperless-ngx',
tag: 'prod',
version: '2.20.13',
version: '2.20.14',
webSocketHost: window.location.host,
webSocketProtocol: window.location.protocol == 'https:' ? 'wss:' : 'ws:',
webSocketBaseUrl: base_url.pathname + 'ws/',
+8
View File
@@ -119,6 +119,10 @@ table .btn-link {
font-size: 1em;
}
.fs-7 {
font-size: 0.75rem !important;
}
.bg-body {
background-color: var(--bs-body-bg);
}
@@ -128,6 +132,10 @@ table .btn-link {
color: var(--pngx-primary-text-contrast);
}
.bg-darker {
background-color: var(--pngx-bg-darker) !important;
}
.navbar-brand {
color: var(--pngx-primary-text-contrast) !important;
}
+17 -6
View File
@@ -144,18 +144,29 @@ class StoragePathAdmin(GuardedModelAdmin):
class TaskAdmin(admin.ModelAdmin):
list_display = ("task_id", "task_file_name", "task_name", "date_done", "status")
list_filter = ("status", "date_done", "task_name")
search_fields = ("task_name", "task_id", "status", "task_file_name")
list_display = (
"task_id",
"task_type",
"trigger_source",
"status",
"date_created",
"date_done",
"duration_seconds",
)
list_filter = ("status", "task_type", "trigger_source", "date_done")
search_fields = ("task_id", "task_type", "status")
readonly_fields = (
"task_id",
"task_file_name",
"task_name",
"task_type",
"trigger_source",
"status",
"date_created",
"date_started",
"date_done",
"result",
"duration_seconds",
"wait_time_seconds",
"input_data",
"result_data",
)
+28 -13
View File
@@ -17,7 +17,9 @@ from pikepdf import Pdf
from documents.converters import convert_from_tiff_to_pdf
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentMetadataOverrides
from documents.data_models import DocumentSource
from documents.models import Document
from documents.models import PaperlessTask
from documents.models import Tag
from documents.plugins.base import ConsumeTaskPlugin
from documents.plugins.base import StopConsumeTaskError
@@ -193,23 +195,36 @@ class BarcodePlugin(ConsumeTaskPlugin):
from documents import tasks
_SOURCE_TO_TRIGGER: dict[DocumentSource, PaperlessTask.TriggerSource] = {
DocumentSource.ConsumeFolder: PaperlessTask.TriggerSource.FOLDER_CONSUME,
DocumentSource.ApiUpload: PaperlessTask.TriggerSource.API_UPLOAD,
DocumentSource.MailFetch: PaperlessTask.TriggerSource.EMAIL_CONSUME,
DocumentSource.WebUI: PaperlessTask.TriggerSource.WEB_UI,
}
trigger_source = _SOURCE_TO_TRIGGER.get(
self.input_doc.source,
PaperlessTask.TriggerSource.MANUAL,
)
# Create the split document tasks
for new_document in self.separate_pages(separator_pages):
copy_file_with_basic_stats(new_document, tmp_dir / new_document.name)
task = tasks.consume_file.delay(
ConsumableDocument(
# Same source, for templates
source=self.input_doc.source,
mailrule_id=self.input_doc.mailrule_id,
# Can't use same folder or the consume might grab it again
original_file=(tmp_dir / new_document.name).resolve(),
# Adding optional original_path for later uses in
# workflow matching
original_path=self.input_doc.original_file,
),
# All the same metadata
self.metadata,
task = tasks.consume_file.apply_async(
kwargs={
"input_doc": ConsumableDocument(
# Same source, for templates
source=self.input_doc.source,
mailrule_id=self.input_doc.mailrule_id,
# Can't use same folder or the consume might grab it again
original_file=(tmp_dir / new_document.name).resolve(),
# Adding optional original_path for later uses in
# workflow matching
original_path=self.input_doc.original_file,
),
"overrides": self.metadata,
},
headers={"trigger_source": trigger_source},
)
logger.info(f"Created new task {task.id} for {new_document.name}")
+94 -52
View File
@@ -22,6 +22,7 @@ from documents.models import CustomField
from documents.models import CustomFieldInstance
from documents.models import Document
from documents.models import DocumentType
from documents.models import PaperlessTask
from documents.models import StoragePath
from documents.models import Tag
from documents.permissions import set_permissions_for_object
@@ -113,7 +114,10 @@ def set_correspondent(
affected_docs = list(qs.values_list("pk", flat=True))
qs.update(correspondent=correspondent)
bulk_update_documents.delay(document_ids=affected_docs)
bulk_update_documents.apply_async(
kwargs={"document_ids": affected_docs},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
return "OK"
@@ -132,8 +136,9 @@ def set_storage_path(doc_ids: list[int], storage_path: StoragePath) -> Literal["
affected_docs = list(qs.values_list("pk", flat=True))
qs.update(storage_path=storage_path)
bulk_update_documents.delay(
document_ids=affected_docs,
bulk_update_documents.apply_async(
kwargs={"document_ids": affected_docs},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
return "OK"
@@ -151,7 +156,10 @@ def set_document_type(doc_ids: list[int], document_type: DocumentType) -> Litera
affected_docs = list(qs.values_list("pk", flat=True))
qs.update(document_type=document_type)
bulk_update_documents.delay(document_ids=affected_docs)
bulk_update_documents.apply_async(
kwargs={"document_ids": affected_docs},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
return "OK"
@@ -177,7 +185,10 @@ def add_tag(doc_ids: list[int], tag: int) -> Literal["OK"]:
DocumentTagRelationship.objects.bulk_create(to_create)
if affected_docs:
bulk_update_documents.delay(document_ids=list(affected_docs))
bulk_update_documents.apply_async(
kwargs={"document_ids": list(affected_docs)},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
return "OK"
@@ -195,7 +206,10 @@ def remove_tag(doc_ids: list[int], tag: int) -> Literal["OK"]:
qs.delete()
if affected_docs:
bulk_update_documents.delay(document_ids=affected_docs)
bulk_update_documents.apply_async(
kwargs={"document_ids": affected_docs},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
return "OK"
@@ -254,7 +268,10 @@ def modify_tags(
)
if affected_docs:
bulk_update_documents.delay(document_ids=affected_docs)
bulk_update_documents.apply_async(
kwargs={"document_ids": affected_docs},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
except Exception as e:
logger.error(f"Error modifying tags: {e}")
return "ERROR"
@@ -326,7 +343,10 @@ def modify_custom_fields(
field_id__in=remove_custom_fields,
).hard_delete()
bulk_update_documents.delay(document_ids=affected_docs)
bulk_update_documents.apply_async(
kwargs={"document_ids": affected_docs},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
return "OK"
@@ -369,8 +389,9 @@ def delete(doc_ids: list[int]) -> Literal["OK"]:
def reprocess(doc_ids: list[int]) -> Literal["OK"]:
for document_id in doc_ids:
update_document_content_maybe_archive_file.delay(
document_id=document_id,
update_document_content_maybe_archive_file.apply_async(
kwargs={"document_id": document_id},
headers={"trigger_source": PaperlessTask.TriggerSource.MANUAL},
)
return "OK"
@@ -396,7 +417,10 @@ def set_permissions(
affected_docs = list(qs.values_list("pk", flat=True))
bulk_update_documents.delay(document_ids=affected_docs)
bulk_update_documents.apply_async(
kwargs={"document_ids": affected_docs},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
return "OK"
@@ -407,6 +431,7 @@ def rotate(
*,
source_mode: SourceMode = SourceModeChoices.LATEST_VERSION,
user: User | None = None,
trigger_source: PaperlessTask.TriggerSource = PaperlessTask.TriggerSource.WEB_UI,
) -> Literal["OK"]:
logger.info(
f"Attempting to rotate {len(doc_ids)} documents by {degrees} degrees.",
@@ -453,13 +478,16 @@ def rotate(
if user is not None:
overrides.actor_id = user.id
consume_file.delay(
ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
root_document_id=root_doc.id,
),
overrides,
consume_file.apply_async(
kwargs={
"input_doc": ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
root_document_id=root_doc.id,
),
"overrides": overrides,
},
headers={"trigger_source": trigger_source},
)
logger.info(
f"Queued new rotated version for document {root_doc.id} by {degrees} degrees",
@@ -478,6 +506,7 @@ def merge(
archive_fallback: bool = False,
source_mode: SourceMode = SourceModeChoices.LATEST_VERSION,
user: User | None = None,
trigger_source: PaperlessTask.TriggerSource = PaperlessTask.TriggerSource.WEB_UI,
) -> Literal["OK"]:
logger.info(
f"Attempting to merge {len(doc_ids)} documents into a single document.",
@@ -556,12 +585,12 @@ def merge(
logger.info("Adding merged document to the task queue.")
consume_task = consume_file.s(
ConsumableDocument(
input_doc=ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
),
overrides,
)
overrides=overrides,
).set(headers={"trigger_source": trigger_source})
if delete_originals:
backup = release_archive_serial_numbers(affected_docs)
@@ -577,7 +606,7 @@ def merge(
restore_archive_serial_numbers(backup)
raise
else:
consume_task.delay()
consume_task.apply_async()
return "OK"
@@ -589,6 +618,7 @@ def split(
delete_originals: bool = False,
source_mode: SourceMode = SourceModeChoices.LATEST_VERSION,
user: User | None = None,
trigger_source: PaperlessTask.TriggerSource = PaperlessTask.TriggerSource.WEB_UI,
) -> Literal["OK"]:
logger.info(
f"Attempting to split document {doc_ids[0]} into {len(pages)} documents",
@@ -631,12 +661,12 @@ def split(
)
consume_tasks.append(
consume_file.s(
ConsumableDocument(
input_doc=ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
),
overrides,
),
overrides=overrides,
).set(headers={"trigger_source": trigger_source}),
)
if delete_originals:
@@ -669,6 +699,7 @@ def delete_pages(
*,
source_mode: SourceMode = SourceModeChoices.LATEST_VERSION,
user: User | None = None,
trigger_source: PaperlessTask.TriggerSource = PaperlessTask.TriggerSource.WEB_UI,
) -> Literal["OK"]:
logger.info(
f"Attempting to delete pages {pages} from {len(doc_ids)} documents",
@@ -698,13 +729,16 @@ def delete_pages(
overrides = DocumentMetadataOverrides().from_document(root_doc)
if user is not None:
overrides.actor_id = user.id
consume_file.delay(
ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
root_document_id=root_doc.id,
),
overrides,
consume_file.apply_async(
kwargs={
"input_doc": ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
root_document_id=root_doc.id,
),
"overrides": overrides,
},
headers={"trigger_source": trigger_source},
)
logger.info(
f"Queued new version for document {root_doc.id} after deleting pages {pages}",
@@ -724,6 +758,7 @@ def edit_pdf(
include_metadata: bool = True,
source_mode: SourceMode = SourceModeChoices.LATEST_VERSION,
user: User | None = None,
trigger_source: PaperlessTask.TriggerSource = PaperlessTask.TriggerSource.WEB_UI,
) -> Literal["OK"]:
"""
Operations is a list of dictionaries describing the final PDF pages.
@@ -781,13 +816,16 @@ def edit_pdf(
if user is not None:
overrides.owner_id = user.id
overrides.actor_id = user.id
consume_file.delay(
ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
root_document_id=root_doc.id,
),
overrides,
consume_file.apply_async(
kwargs={
"input_doc": ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
root_document_id=root_doc.id,
),
"overrides": overrides,
},
headers={"trigger_source": trigger_source},
)
else:
consume_tasks = []
@@ -812,12 +850,12 @@ def edit_pdf(
pdf.save(version_filepath)
consume_tasks.append(
consume_file.s(
ConsumableDocument(
input_doc=ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=version_filepath,
),
overrides,
),
overrides=overrides,
).set(headers={"trigger_source": trigger_source}),
)
if delete_original:
@@ -853,6 +891,7 @@ def remove_password(
include_metadata: bool = True,
source_mode: SourceMode = SourceModeChoices.LATEST_VERSION,
user: User | None = None,
trigger_source: PaperlessTask.TriggerSource = PaperlessTask.TriggerSource.WEB_UI,
) -> Literal["OK"]:
"""
Remove password protection from PDF documents.
@@ -887,13 +926,16 @@ def remove_password(
if user is not None:
overrides.owner_id = user.id
overrides.actor_id = user.id
consume_file.delay(
ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
root_document_id=root_doc.id,
),
overrides,
consume_file.apply_async(
kwargs={
"input_doc": ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
root_document_id=root_doc.id,
),
"overrides": overrides,
},
headers={"trigger_source": trigger_source},
)
else:
consume_tasks = []
@@ -908,12 +950,12 @@ def remove_password(
consume_tasks.append(
consume_file.s(
ConsumableDocument(
input_doc=ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
),
overrides,
),
overrides=overrides,
).set(headers={"trigger_source": trigger_source}),
)
if delete_original:
+26 -17
View File
@@ -8,7 +8,6 @@ from pathlib import Path
from typing import TYPE_CHECKING
from typing import Final
import magic
from django.conf import settings
from django.contrib.auth.models import User
from django.db import transaction
@@ -20,6 +19,7 @@ from rest_framework.reverse import reverse
from documents.classifier import load_classifier
from documents.data_models import ConsumableDocument
from documents.data_models import ConsumeFileSuccessResult
from documents.data_models import DocumentMetadataOverrides
from documents.file_handling import create_source_path_directory
from documents.file_handling import generate_filename
@@ -51,6 +51,7 @@ from documents.utils import compute_checksum
from documents.utils import copy_basic_file_stats
from documents.utils import copy_file_with_basic_stats
from documents.utils import run_subprocess
from paperless import mime_detection
from paperless.config import OcrConfig
from paperless.models import ArchiveFileGenerationChoices
from paperless.parsers import ParserContext
@@ -90,6 +91,15 @@ class ConsumerError(Exception):
pass
class ConsumeFileDuplicateError(ConsumerError):
"""Raised when a file is rejected because it duplicates an existing document."""
def __init__(self, message: str, duplicate_id: int, *, in_trash: bool) -> None:
super().__init__(message)
self.duplicate_id = duplicate_id
self.in_trash = in_trash
class ConsumerStatusShortMessage(StrEnum):
DOCUMENT_ALREADY_EXISTS = "document_already_exists"
DOCUMENT_ALREADY_EXISTS_IN_TRASH = "document_already_exists_in_trash"
@@ -313,7 +323,6 @@ class ConsumerPlugin(
run_subprocess(
[
settings.PRE_CONSUME_SCRIPT,
original_file_path,
],
script_env,
self.log,
@@ -383,14 +392,6 @@ class ConsumerPlugin(
run_subprocess(
[
settings.POST_CONSUME_SCRIPT,
str(document.pk),
document.get_public_filename(),
os.path.normpath(document.source_path),
os.path.normpath(document.thumbnail_path),
reverse("document-download", kwargs={"pk": document.pk}),
reverse("document-thumb", kwargs={"pk": document.pk}),
str(document.correspondent),
str(",".join(document.tags.all().values_list("name", flat=True))),
],
script_env,
self.log,
@@ -404,7 +405,7 @@ class ConsumerPlugin(
exception=e,
)
def run(self) -> str:
def run(self) -> "ConsumeFileSuccessResult":
"""
Return the document object if it was successfully created.
"""
@@ -423,7 +424,7 @@ class ConsumerPlugin(
# Determine the parser class.
mime_type = magic.from_file(self.working_copy, mime=True)
mime_type = mime_detection.from_file(self.working_copy)
self.log.debug(f"Detected mime type: {mime_type}")
@@ -445,7 +446,7 @@ class ConsumerPlugin(
],
logger=self.log,
)
mime_type = magic.from_file(self.working_copy, mime=True)
mime_type = mime_detection.from_file(self.working_copy)
self.log.debug(f"Detected mime type after qpdf: {mime_type}")
# Save the original file for later
self.unmodified_original = (
@@ -650,6 +651,10 @@ class ConsumerPlugin(
# If we get here, it was successful. Proceed with post-consume
# hooks. If they fail, nothing will get changed.
document = Document.objects.prefetch_related("versions").get(
pk=document.pk,
)
document_consumption_finished.send(
sender=self.__class__,
document=document,
@@ -776,7 +781,7 @@ class ConsumerPlugin(
# Return the most up to date fields
document.refresh_from_db()
return f"Success. New document id {document.pk} created"
return ConsumeFileSuccessResult(document_id=document.pk)
def _parse_title_placeholders(self, title: str) -> str:
local_added = timezone.localtime(timezone.now())
@@ -1015,9 +1020,13 @@ class ConsumerPreflightPlugin(
)
failure_msg += " Note: existing document is in the trash."
self._fail(
status_msg,
failure_msg,
self._send_progress(100, 100, ProgressStatusOptions.FAILED, status_msg)
self.log.error(failure_msg)
in_trash = duplicates_in_trash.exists()
raise ConsumeFileDuplicateError(
f"{self.filename}: {failure_msg}",
duplicate.pk,
in_trash=in_trash,
)
def pre_check_directories(self) -> None:
+27 -2
View File
@@ -2,11 +2,13 @@ import dataclasses
import datetime
from enum import IntEnum
from pathlib import Path
from typing import TypedDict
import magic
from guardian.shortcuts import get_groups_with_perms
from guardian.shortcuts import get_users_with_perms
from paperless import mime_detection
@dataclasses.dataclass
class DocumentMetadataOverrides:
@@ -183,4 +185,27 @@ class ConsumableDocument:
# Get the file type once at init
# Note this function isn't called when the object is unpickled
self.mime_type = magic.from_file(self.original_file, mime=True)
self.mime_type = mime_detection.from_file(self.original_file)
class ConsumeFileDuplicateResult(TypedDict):
"""Returned by consume_file when the file is rejected as a duplicate."""
duplicate_of: int
duplicate_in_trash: bool
class ConsumeFileSuccessResult(TypedDict):
"""Returned by consume_file when the document is created successfully."""
document_id: int
class ConsumeFileStoppedResult(TypedDict):
"""Returned by consume_file when a plugin raises StopConsumeTaskError.
Examples: barcode split dispatched child tasks, double-sided scan waiting
for the second half, workflow deleted the document during consumption.
"""
reason: str
+79 -8
View File
@@ -6,6 +6,8 @@ import json
import logging
import operator
from contextlib import contextmanager
from decimal import Decimal
from decimal import InvalidOperation
from typing import TYPE_CHECKING
from typing import Any
@@ -26,8 +28,10 @@ from django.db.models.functions import Cast
from django.utils.translation import gettext_lazy as _
from django_filters import DateFilter
from django_filters.rest_framework import BooleanFilter
from django_filters.rest_framework import DateTimeFilter
from django_filters.rest_framework import Filter
from django_filters.rest_framework import FilterSet
from django_filters.rest_framework import MultipleChoiceFilter
from drf_spectacular.utils import extend_schema_field
from guardian.utils import get_group_obj_perms_model
from guardian.utils import get_user_obj_perms_model
@@ -289,6 +293,34 @@ class MimeTypeFilter(Filter):
return qs
class MonetaryAmountField(serializers.Field):
"""
Accepts either a plain decimal string ("100", "100.00") or a currency-prefixed
string ("USD100.00") and returns the numeric amount as a Decimal.
Mirrors the logic of the value_monetary_amount generated field: if the value
starts with a non-digit, the first 3 characters are treated as a currency code
(ISO 4217) and stripped before parsing. This preserves backwards compatibility
with saved views that stored a currency-prefixed string as the filter value.
"""
default_error_messages = {"invalid": "A valid number is required."}
def to_internal_value(self, data):
if not isinstance(data, str | int | float):
self.fail("invalid")
value = str(data).strip()
if value and not value[0].isdigit() and value[0] != "-":
value = value[3:] # strip 3-char ISO 4217 currency code
try:
return Decimal(value)
except InvalidOperation:
self.fail("invalid")
def to_representation(self, value):
return str(value)
class SelectField(serializers.CharField):
def __init__(self, custom_field: CustomField) -> None:
self._options = custom_field.extra_data["select_options"]
@@ -514,9 +546,8 @@ class CustomFieldQueryParser:
value_field_name = CustomFieldInstance.get_value_field_name(
custom_field.data_type,
)
if (
custom_field.data_type == CustomField.FieldDataType.MONETARY
and op in self.EXPR_BY_CATEGORY["arithmetic"]
if custom_field.data_type == CustomField.FieldDataType.MONETARY and (
op in self.EXPR_BY_CATEGORY["arithmetic"] or op in {"exact", "in"}
):
value_field_name = "value_monetary_amount"
has_field = Q(custom_fields__field=custom_field)
@@ -626,6 +657,13 @@ class CustomFieldQueryParser:
elif custom_field.data_type == CustomField.FieldDataType.URL:
# For URL fields we don't need to be strict about validation (e.g., for istartswith).
field = serializers.CharField()
elif custom_field.data_type == CustomField.FieldDataType.MONETARY and (
op in self.EXPR_BY_CATEGORY["arithmetic"] or op in {"exact", "in"}
):
# These ops compare against value_monetary_amount (a DecimalField).
# MonetaryAmountField accepts both "100" and "USD100.00" for backwards
# compatibility with saved views that stored currency-prefixed values.
field = MonetaryAmountField()
else:
# The general case: inferred from the corresponding field in CustomFieldInstance.
value_field_name = CustomFieldInstance.get_value_field_name(
@@ -862,18 +900,51 @@ class ShareLinkBundleFilterSet(FilterSet):
class PaperlessTaskFilterSet(FilterSet):
task_type = MultipleChoiceFilter(
choices=PaperlessTask.TaskType.choices,
label="Task Type",
)
trigger_source = MultipleChoiceFilter(
choices=PaperlessTask.TriggerSource.choices,
label="Trigger Source",
)
status = MultipleChoiceFilter(
choices=PaperlessTask.Status.choices,
label="Status",
)
is_complete = BooleanFilter(
method="filter_is_complete",
label="Is Complete",
)
acknowledged = BooleanFilter(
label="Acknowledged",
field_name="acknowledged",
)
date_created_after = DateTimeFilter(
field_name="date_created",
lookup_expr="gte",
label="Created After",
)
date_created_before = DateTimeFilter(
field_name="date_created",
lookup_expr="lte",
label="Created Before",
)
class Meta:
model = PaperlessTask
fields = {
"type": ["exact"],
"task_name": ["exact"],
"status": ["exact"],
}
fields = ["task_type", "trigger_source", "status", "acknowledged", "owner"]
def filter_is_complete(self, queryset, name, value):
if value:
return queryset.filter(status__in=PaperlessTask.COMPLETE_STATUSES)
return queryset.exclude(status__in=PaperlessTask.COMPLETE_STATUSES)
class ObjectOwnedOrGrantedPermissionsFilter(ObjectPermissionsFilter):
@@ -27,6 +27,7 @@ from watchfiles import watch
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentMetadataOverrides
from documents.data_models import DocumentSource
from documents.models import PaperlessTask
from documents.models import Tag
from documents.parsers import get_supported_file_extensions
from documents.tasks import consume_file
@@ -338,12 +339,15 @@ def _consume_file(
# Queue for consumption
try:
logger.info(f"Adding {filepath} to the task queue")
consume_file.delay(
ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
),
DocumentMetadataOverrides(tag_ids=tag_ids),
consume_file.apply_async(
kwargs={
"input_doc": ConsumableDocument(
source=DocumentSource.ConsumeFolder,
original_file=filepath,
),
"overrides": DocumentMetadataOverrides(tag_ids=tag_ids),
},
headers={"trigger_source": PaperlessTask.TriggerSource.FOLDER_CONSUME},
)
except Exception:
logger.exception(f"Error while queuing document {filepath}")
@@ -22,7 +22,6 @@ class Command(PaperlessCommand):
self.buffered_logging("paperless.classifier"),
):
train_classifier(
scheduled=False,
status_callback=lambda msg: self.console.print(f" {msg}"),
)
@@ -1,9 +1,8 @@
import hashlib
import io
import json
import os
import shutil
import zipfile
import tempfile
from itertools import islice
from pathlib import Path
from typing import TYPE_CHECKING
@@ -98,8 +97,6 @@ class StreamingManifestWriter:
*,
compare_json: bool = False,
files_in_export_dir: "set[Path] | None" = None,
zip_file: "zipfile.ZipFile | None" = None,
zip_arcname: str | None = None,
) -> None:
self._path = path.resolve()
self._tmp_path = self._path.with_suffix(self._path.suffix + ".tmp")
@@ -107,20 +104,12 @@ class StreamingManifestWriter:
self._files_in_export_dir: set[Path] = (
files_in_export_dir if files_in_export_dir is not None else set()
)
self._zip_file = zip_file
self._zip_arcname = zip_arcname
self._zip_mode = zip_file is not None
self._file = None
self._first = True
def open(self) -> None:
if self._zip_mode:
# zipfile only allows one open write handle at a time, so buffer
# the manifest in memory and write it atomically on close()
self._file = io.StringIO()
else:
self._path.parent.mkdir(parents=True, exist_ok=True)
self._file = self._tmp_path.open("w", encoding="utf-8")
self._path.parent.mkdir(parents=True, exist_ok=True)
self._file = self._tmp_path.open("w", encoding="utf-8")
self._file.write("[")
self._first = True
@@ -141,18 +130,15 @@ class StreamingManifestWriter:
if self._file is None:
return
self._file.write("\n]")
if self._zip_mode:
self._zip_file.writestr(self._zip_arcname, self._file.getvalue())
self._file.close()
self._file = None
if not self._zip_mode:
self._finalize()
self._finalize()
def discard(self) -> None:
if self._file is not None:
self._file.close()
self._file = None
if not self._zip_mode and self._tmp_path.exists():
if self._tmp_path.exists():
self._tmp_path.unlink()
def _finalize(self) -> None:
@@ -329,13 +315,18 @@ class Command(CryptMixin, PaperlessCommand):
self.files_in_export_dir: set[Path] = set()
self.exported_files: set[str] = set()
self.zip_file: zipfile.ZipFile | None = None
self._zip_dirs: set[str] = set()
# If zipping, save the original target for later and
# get a temporary directory for the target instead
temp_dir = None
self.original_target = self.target
if self.zip_export:
zip_name = options["zip_name"]
self.zip_path = (self.target / zip_name).with_suffix(".zip")
self.zip_tmp_path = self.zip_path.parent / (self.zip_path.name + ".tmp")
settings.SCRATCH_DIR.mkdir(parents=True, exist_ok=True)
temp_dir = tempfile.TemporaryDirectory(
dir=settings.SCRATCH_DIR,
prefix="paperless-export",
)
self.target = Path(temp_dir.name).resolve()
if not self.target.exists():
raise CommandError("That path doesn't exist")
@@ -346,53 +337,30 @@ class Command(CryptMixin, PaperlessCommand):
if not os.access(self.target, os.W_OK):
raise CommandError("That path doesn't appear to be writable")
if self.zip_export:
if self.compare_checksums:
self.stdout.write(
self.style.WARNING(
"--compare-checksums is ignored when --zip is used",
),
)
if self.compare_json:
self.stdout.write(
self.style.WARNING(
"--compare-json is ignored when --zip is used",
),
)
try:
# Prevent any ongoing changes in the documents
with FileLock(settings.MEDIA_LOCK):
if self.zip_export:
self.zip_file = zipfile.ZipFile(
self.zip_tmp_path,
"w",
compression=zipfile.ZIP_DEFLATED,
allowZip64=True,
)
self.dump()
if self.zip_file is not None:
self.zip_file.close()
self.zip_file = None
self.zip_tmp_path.rename(self.zip_path)
# We've written everything to the temporary directory in this case,
# now make an archive in the original target, with all files stored
if self.zip_export and temp_dir is not None:
shutil.make_archive(
self.original_target / options["zip_name"],
format="zip",
root_dir=temp_dir.name,
)
finally:
# Ensure zip_file is closed and the incomplete .tmp is removed on failure
if self.zip_file is not None:
self.zip_file.close()
self.zip_file = None
if self.zip_export and self.zip_tmp_path.exists():
self.zip_tmp_path.unlink()
# Always cleanup the temporary directory, if one was created
if self.zip_export and temp_dir is not None:
temp_dir.cleanup()
def dump(self) -> None:
# 1. Take a snapshot of what files exist in the current export folder
# (skipped in zip mode — always write fresh, no skip/compare logic applies)
if not self.zip_export:
for x in self.target.glob("**/*"):
if x.is_file():
self.files_in_export_dir.add(x.resolve())
for x in self.target.glob("**/*"):
if x.is_file():
self.files_in_export_dir.add(x.resolve())
# 2. Create manifest, containing all correspondents, types, tags, storage paths
# note, documents and ui_settings
@@ -453,8 +421,6 @@ class Command(CryptMixin, PaperlessCommand):
manifest_path,
compare_json=self.compare_json,
files_in_export_dir=self.files_in_export_dir,
zip_file=self.zip_file,
zip_arcname="manifest.json",
) as writer:
with transaction.atomic():
for key, qs in manifest_key_to_object_query.items():
@@ -573,12 +539,8 @@ class Command(CryptMixin, PaperlessCommand):
self.target,
)
else:
# 5. Remove pre-existing files/dirs from target, keeping the
# in-progress zip (.tmp) and any prior zip at the final path
skip = {self.zip_path.resolve(), self.zip_tmp_path.resolve()}
for item in self.target.glob("*"):
if item.resolve() in skip:
continue
# 5. Remove anything in the original location (before moving the zip)
for item in self.original_target.glob("*"):
if item.is_dir():
shutil.rmtree(item)
else:
@@ -748,23 +710,9 @@ class Command(CryptMixin, PaperlessCommand):
if self.use_folder_prefix:
manifest_name = Path("json") / manifest_name
manifest_name = (self.target / manifest_name).resolve()
if not self.zip_export:
manifest_name.parent.mkdir(parents=True, exist_ok=True)
manifest_name.parent.mkdir(parents=True, exist_ok=True)
self.check_and_write_json(content, manifest_name)
def _ensure_zip_dirs(self, arcname: str) -> None:
"""Write directory marker entries for all parent directories of arcname.
Some zip viewers only show folder structure when explicit directory
entries exist, so we add them to avoid confusing users.
"""
parts = Path(arcname).parts[:-1]
for i in range(len(parts)):
dir_arc = "/".join(parts[: i + 1]) + "/"
if dir_arc not in self._zip_dirs:
self._zip_dirs.add(dir_arc)
self.zip_file.mkdir(dir_arc)
def check_and_write_json(
self,
content: list[dict] | dict,
@@ -777,38 +725,32 @@ class Command(CryptMixin, PaperlessCommand):
This preserves the file timestamps when no changes are made.
"""
if self.zip_export:
arcname = str(target.resolve().relative_to(self.target))
self._ensure_zip_dirs(arcname)
self.zip_file.writestr(
arcname,
target = target.resolve()
perform_write = True
if target in self.files_in_export_dir:
self.files_in_export_dir.remove(target)
if self.compare_json:
target_checksum = hashlib.blake2b(target.read_bytes()).hexdigest()
src_str = json.dumps(
content,
cls=DjangoJSONEncoder,
indent=2,
ensure_ascii=False,
)
src_checksum = hashlib.blake2b(src_str.encode("utf-8")).hexdigest()
if src_checksum == target_checksum:
perform_write = False
if perform_write:
target.write_text(
json.dumps(
content,
cls=DjangoJSONEncoder,
indent=2,
ensure_ascii=False,
),
encoding="utf-8",
)
return
target = target.resolve()
json_str = json.dumps(
content,
cls=DjangoJSONEncoder,
indent=2,
ensure_ascii=False,
)
perform_write = True
if target in self.files_in_export_dir:
self.files_in_export_dir.remove(target)
if self.compare_json:
target_checksum = hashlib.blake2b(target.read_bytes()).hexdigest()
src_checksum = hashlib.blake2b(json_str.encode("utf-8")).hexdigest()
if src_checksum == target_checksum:
perform_write = False
if perform_write:
target.write_text(json_str, encoding="utf-8")
def check_and_copy(
self,
@@ -821,12 +763,6 @@ class Command(CryptMixin, PaperlessCommand):
the source attributes
"""
if self.zip_export:
arcname = str(target.resolve().relative_to(self.target))
self._ensure_zip_dirs(arcname)
self.zip_file.write(source, arcname=arcname)
return
target = target.resolve()
if target in self.files_in_export_dir:
self.files_in_export_dir.remove(target)
@@ -1,8 +1,12 @@
import dataclasses
from itertools import combinations
from typing import Final
import rapidfuzz
from django.core.management import CommandError
from rich.panel import Panel
from rich.table import Table
from rich.text import Text
from documents.management.commands.base import PaperlessCommand
from documents.models import Document
@@ -10,8 +14,11 @@ from documents.models import Document
@dataclasses.dataclass(frozen=True, slots=True)
class _WorkPackage:
first_doc: Document
second_doc: Document
pk_a: int
content_a: str
pk_b: int
content_b: str
score_cutoff: float
@dataclasses.dataclass(frozen=True, slots=True)
@@ -20,21 +27,20 @@ class _WorkResult:
doc_two_pk: int
ratio: float
def __lt__(self, other: "_WorkResult") -> bool:
return self.doc_one_pk < other.doc_one_pk
def _process_and_match(work: _WorkPackage) -> _WorkResult:
"""
Does basic processing of document content, gets the basic ratio
and returns the result package.
Process document content and compute the fuzzy ratio.
score_cutoff lets rapidfuzz short-circuit when the score cannot reach the threshold.
"""
first_string = rapidfuzz.utils.default_process(work.first_doc.content)
second_string = rapidfuzz.utils.default_process(work.second_doc.content)
match = rapidfuzz.fuzz.ratio(first_string, second_string)
return _WorkResult(work.first_doc.pk, work.second_doc.pk, match)
first_string = rapidfuzz.utils.default_process(work.content_a)
second_string = rapidfuzz.utils.default_process(work.content_b)
ratio = rapidfuzz.fuzz.ratio(
first_string,
second_string,
score_cutoff=work.score_cutoff,
)
return _WorkResult(work.pk_a, work.pk_b, ratio)
class Command(PaperlessCommand):
@@ -57,78 +63,169 @@ class Command(PaperlessCommand):
action="store_true",
help="If set, one document of matches above the ratio WILL BE DELETED",
)
parser.add_argument(
"--yes",
default=False,
action="store_true",
help="Skip the confirmation prompt when used with --delete",
)
def _render_results(
self,
matches: list[_WorkResult],
*,
opt_ratio: float,
do_delete: bool,
) -> list[int]:
"""Render match results as a Rich table. Returns list of PKs to delete."""
if not matches:
self.console.print(
Panel(
"[green]No duplicate documents found.[/green]",
title="Fuzzy Match",
border_style="green",
),
)
return []
# Fetch titles for matched documents in a single query.
all_pks = {pk for m in matches for pk in (m.doc_one_pk, m.doc_two_pk)}
titles: dict[int, str] = dict(
Document.objects.filter(pk__in=all_pks)
.only("pk", "title")
.values_list("pk", "title"),
)
table = Table(
title=f"Fuzzy Matches (threshold: {opt_ratio:.1f}%)",
show_lines=True,
title_style="bold",
)
table.add_column("#", style="dim", width=4, no_wrap=True)
table.add_column("Document A", min_width=24)
table.add_column("Document B", min_width=24)
table.add_column("Similarity", width=11, justify="right")
maybe_delete_ids: list[int] = []
for i, match_result in enumerate(matches, 1):
pk_a = match_result.doc_one_pk
pk_b = match_result.doc_two_pk
ratio = match_result.ratio
if ratio >= 97.0:
ratio_style = "bold red"
elif ratio >= 92.0:
ratio_style = "red"
elif ratio >= 88.0:
ratio_style = "yellow"
else:
ratio_style = "dim"
table.add_row(
str(i),
f"[dim]#{pk_a}[/dim] {titles.get(pk_a, 'Unknown')}",
f"[dim]#{pk_b}[/dim] {titles.get(pk_b, 'Unknown')}",
Text(f"{ratio:.1f}%", style=ratio_style),
)
maybe_delete_ids.append(pk_b)
self.console.print(table)
summary = f"Found [bold]{len(matches)}[/bold] matching pair(s)."
if do_delete:
summary += f" [yellow]{len(maybe_delete_ids)}[/yellow] document(s) will be deleted."
self.console.print(summary)
return maybe_delete_ids
def handle(self, *args, **options):
RATIO_MIN: Final[float] = 0.0
RATIO_MAX: Final[float] = 100.0
if options["delete"]:
self.stdout.write(
self.style.WARNING(
"The command is configured to delete documents. Use with caution",
),
)
opt_ratio = options["ratio"]
checked_pairs: set[tuple[int, int]] = set()
work_pkgs: list[_WorkPackage] = []
if opt_ratio < RATIO_MIN or opt_ratio > RATIO_MAX:
raise CommandError("The ratio must be between 0 and 100")
all_docs = Document.objects.all().order_by("id")
for first_doc in all_docs:
for second_doc in all_docs:
if first_doc.pk == second_doc.pk:
continue
if first_doc.content.strip() == "" or second_doc.content.strip() == "":
continue
doc_1_to_doc_2 = (first_doc.pk, second_doc.pk)
doc_2_to_doc_1 = doc_1_to_doc_2[::-1]
if doc_1_to_doc_2 in checked_pairs or doc_2_to_doc_1 in checked_pairs:
continue
checked_pairs.update([doc_1_to_doc_2, doc_2_to_doc_1])
work_pkgs.append(_WorkPackage(first_doc, second_doc))
results: list[_WorkResult] = []
if self.process_count == 1:
for work in self.track(work_pkgs, description="Matching..."):
results.append(_process_and_match(work))
else: # pragma: no cover
for proc_result in self.process_parallel(
_process_and_match,
work_pkgs,
description="Matching...",
):
if proc_result.error:
self.console.print(
f"[red]Failed: {proc_result.error}[/red]",
)
elif proc_result.result is not None:
results.append(proc_result.result)
messages: list[str] = []
maybe_delete_ids: list[int] = []
for match_result in sorted(results):
if match_result.ratio >= opt_ratio:
messages.append(
self.style.NOTICE(
f"Document {match_result.doc_one_pk} fuzzy match"
f" to {match_result.doc_two_pk}"
f" (confidence {match_result.ratio:.3f})\n",
),
)
maybe_delete_ids.append(match_result.doc_two_pk)
if len(messages) == 0:
messages.append(self.style.SUCCESS("No matches found\n"))
self.stdout.writelines(messages)
if options["delete"]:
self.stdout.write(
self.style.NOTICE(
f"Deleting {len(maybe_delete_ids)} documents based on ratio matches",
self.console.print(
Panel(
"[bold yellow]WARNING:[/bold yellow] This run is configured to delete"
" documents. One document from each matched pair WILL BE PERMANENTLY DELETED.",
title="Delete Mode",
border_style="red",
),
)
Document.objects.filter(pk__in=maybe_delete_ids).delete()
# Load only the fields we need -- avoids fetching title, archive_checksum, etc.
slim_docs: list[tuple[int, str]] = list(
Document.objects.only("id", "content")
.order_by("id")
.values_list("id", "content"),
)
# combinations() generates each unique pair exactly once -- no checked_pairs set needed.
# The total is computed cheaply so the progress bar can start immediately without
# materialising all pairs up front (n*(n-1)/2 can be hundreds of thousands).
n = len(slim_docs)
total_pairs = n * (n - 1) // 2
def _work_gen():
for (pk_a, ca), (pk_b, cb) in combinations(slim_docs, 2):
if ca.strip() and cb.strip():
yield _WorkPackage(pk_a, ca, pk_b, cb, opt_ratio)
def _iter_matches():
if self.process_count == 1:
for work in self.track(
_work_gen(),
description="Matching...",
total=total_pairs,
):
result = _process_and_match(work)
if result.ratio >= opt_ratio:
yield result
else: # pragma: no cover
work_pkgs = list(_work_gen())
for proc_result in self.process_parallel(
_process_and_match,
work_pkgs,
description="Matching...",
):
if proc_result.error:
self.console.print(
f"[red]Failed: {proc_result.error}[/red]",
)
elif (
proc_result.result is not None
and proc_result.result.ratio >= opt_ratio
):
yield proc_result.result
matches = sorted(_iter_matches(), key=lambda m: m.ratio, reverse=True)
maybe_delete_ids = self._render_results(
matches,
opt_ratio=opt_ratio,
do_delete=options["delete"],
)
if options["delete"] and maybe_delete_ids:
confirmed = options["yes"]
if not confirmed:
self.console.print(
f"\nDelete [bold]{len(maybe_delete_ids)}[/bold] document(s)? "
"[bold]\\[y/N][/bold] ",
end="",
)
answer = input().strip().lower()
confirmed = answer in {"y", "yes"}
if confirmed:
self.console.print(
f"[red]Deleting {len(maybe_delete_ids)} document(s)...[/red]",
)
Document.objects.filter(pk__in=maybe_delete_ids).delete()
self.console.print("[green]Done.[/green]")
else:
self.console.print("[yellow]Deletion cancelled.[/yellow]")
@@ -17,7 +17,6 @@ class Command(PaperlessCommand):
def handle(self, *args: Any, **options: Any) -> None:
llmindex_index(
rebuild=options["command"] == "rebuild",
scheduled=False,
iter_wrapper=lambda docs: self.track(
docs,
description="Indexing documents...",
@@ -111,7 +111,6 @@ class Command(PaperlessCommand):
def handle(self, *args: Any, **options: Any) -> None:
messages = check_sanity(
scheduled=False,
iter_wrapper=lambda docs: self.track(
docs,
description="Checking documents...",
@@ -0,0 +1,209 @@
"""
Drop and recreate the PaperlessTask table with the new structured schema.
We intentionally drop all existing task data -- the old schema was
string-based and incompatible with the new JSONField result storage.
"""
import django.db.models.deletion
import django.utils.timezone
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0018_saved_view_simple_search_rules"),
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
]
operations = [
migrations.DeleteModel(name="PaperlessTask"),
migrations.CreateModel(
name="PaperlessTask",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"owner",
models.ForeignKey(
blank=True,
default=None,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
(
"task_id",
models.CharField(
help_text="Celery task ID",
max_length=72,
unique=True,
verbose_name="Task ID",
),
),
(
"task_type",
models.CharField(
choices=[
("consume_file", "Consume File"),
("train_classifier", "Train Classifier"),
("sanity_check", "Sanity Check"),
("index_optimize", "Index Optimize"),
("mail_fetch", "Mail Fetch"),
("llm_index", "LLM Index"),
("empty_trash", "Empty Trash"),
("check_workflows", "Check Workflows"),
("bulk_update", "Bulk Update"),
("reprocess_document", "Reprocess Document"),
("build_share_link", "Build Share Link"),
("bulk_delete", "Bulk Delete"),
],
db_index=True,
help_text="The kind of work being performed",
max_length=50,
verbose_name="Task Type",
),
),
(
"trigger_source",
models.CharField(
choices=[
("scheduled", "Scheduled"),
("web_ui", "Web UI"),
("api_upload", "API Upload"),
("folder_consume", "Folder Consume"),
("email_consume", "Email Consume"),
("system", "System"),
("manual", "Manual"),
],
db_index=True,
help_text="What initiated this task",
max_length=50,
verbose_name="Trigger Source",
),
),
(
"status",
models.CharField(
choices=[
("pending", "Pending"),
("started", "Started"),
("success", "Success"),
("failure", "Failure"),
("revoked", "Revoked"),
],
db_index=True,
default="pending",
max_length=30,
verbose_name="Status",
),
),
(
"date_created",
models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
verbose_name="Created",
),
),
(
"date_started",
models.DateTimeField(
blank=True,
null=True,
verbose_name="Started",
),
),
(
"date_done",
models.DateTimeField(
blank=True,
db_index=True,
null=True,
verbose_name="Completed",
),
),
(
"duration_seconds",
models.FloatField(
blank=True,
help_text="Elapsed time from start to completion",
null=True,
verbose_name="Duration (seconds)",
),
),
(
"wait_time_seconds",
models.FloatField(
blank=True,
help_text="Time from task creation to worker pickup",
null=True,
verbose_name="Wait Time (seconds)",
),
),
(
"input_data",
models.JSONField(
blank=True,
default=dict,
help_text="Structured input parameters for the task",
verbose_name="Input Data",
),
),
(
"result_data",
models.JSONField(
blank=True,
help_text="Structured result data from task execution",
null=True,
verbose_name="Result Data",
),
),
(
"acknowledged",
models.BooleanField(
db_index=True,
default=False,
verbose_name="Acknowledged",
),
),
],
options={
"verbose_name": "Task",
"verbose_name_plural": "Tasks",
"ordering": ["-date_created"],
},
),
migrations.AddIndex(
model_name="paperlesstask",
index=models.Index(
fields=["status", "date_created"],
name="documents_p_status_8aa687_idx",
),
),
migrations.AddIndex(
model_name="paperlesstask",
index=models.Index(
fields=["task_type", "status"],
name="documents_p_task_ty_e4a93f_idx",
),
),
migrations.AddIndex(
model_name="paperlesstask",
index=models.Index(
fields=["owner", "acknowledged", "date_created"],
name="documents_p_owner_i_62c545_idx",
),
),
]
@@ -0,0 +1,26 @@
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "0019_task_system_redesign"),
]
operations = [
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_celery_results_taskresult;",
reverse_sql=migrations.RunSQL.noop,
),
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_celery_results_groupresult;",
reverse_sql=migrations.RunSQL.noop,
),
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_celery_results_chordcounter;",
reverse_sql=migrations.RunSQL.noop,
),
migrations.RunSQL(
sql="DELETE FROM django_migrations WHERE app = 'django_celery_results';",
reverse_sql=migrations.RunSQL.noop,
),
]
+130 -58
View File
@@ -3,7 +3,6 @@ from pathlib import Path
from typing import Final
import pathvalidate
from celery import states
from django.conf import settings
from django.contrib.auth.models import Group
from django.contrib.auth.models import User
@@ -381,7 +380,10 @@ class Document(SoftDeleteModel, ModelWithOwner): # type: ignore[django-manager-
if isinstance(prefetched_cache, dict)
else None
)
if prefetched_versions:
if prefetched_versions is not None:
# Empty list means prefetch ran and found no versions — use own content.
if not prefetched_versions:
return self.content
latest_prefetched = max(prefetched_versions, key=lambda doc: doc.id)
return latest_prefetched.content
@@ -660,97 +662,167 @@ class UiSettings(models.Model):
class PaperlessTask(ModelWithOwner):
ALL_STATES = sorted(states.ALL_STATES)
TASK_STATE_CHOICES = sorted(zip(ALL_STATES, ALL_STATES))
"""
Tracks background task execution for user visibility and debugging.
State transitions:
PENDING -> STARTED -> SUCCESS
PENDING -> STARTED -> FAILURE
PENDING -> REVOKED (if cancelled before starting)
"""
class Status(models.TextChoices):
PENDING = "pending", _("Pending")
STARTED = "started", _("Started")
SUCCESS = "success", _("Success")
FAILURE = "failure", _("Failure")
REVOKED = "revoked", _("Revoked")
class TaskType(models.TextChoices):
AUTO = ("auto_task", _("Auto Task"))
SCHEDULED_TASK = ("scheduled_task", _("Scheduled Task"))
MANUAL_TASK = ("manual_task", _("Manual Task"))
CONSUME_FILE = "consume_file", _("Consume File")
TRAIN_CLASSIFIER = "train_classifier", _("Train Classifier")
SANITY_CHECK = "sanity_check", _("Sanity Check")
INDEX_OPTIMIZE = "index_optimize", _("Index Optimize")
MAIL_FETCH = "mail_fetch", _("Mail Fetch")
LLM_INDEX = "llm_index", _("LLM Index")
EMPTY_TRASH = "empty_trash", _("Empty Trash")
CHECK_WORKFLOWS = "check_workflows", _("Check Workflows")
BULK_UPDATE = "bulk_update", _("Bulk Update")
REPROCESS_DOCUMENT = "reprocess_document", _("Reprocess Document")
BUILD_SHARE_LINK = "build_share_link", _("Build Share Link")
BULK_DELETE = "bulk_delete", _("Bulk Delete")
class TaskName(models.TextChoices):
CONSUME_FILE = ("consume_file", _("Consume File"))
TRAIN_CLASSIFIER = ("train_classifier", _("Train Classifier"))
CHECK_SANITY = ("check_sanity", _("Check Sanity"))
INDEX_OPTIMIZE = ("index_optimize", _("Index Optimize"))
LLMINDEX_UPDATE = ("llmindex_update", _("LLM Index Update"))
COMPLETE_STATUSES = (
Status.SUCCESS,
Status.FAILURE,
Status.REVOKED,
)
class TriggerSource(models.TextChoices):
SCHEDULED = "scheduled", _("Scheduled") # Celery beat
WEB_UI = "web_ui", _("Web UI") # Document uploaded via web
API_UPLOAD = "api_upload", _("API Upload") # Document uploaded via API
FOLDER_CONSUME = "folder_consume", _("Folder Consume") # Consume folder
EMAIL_CONSUME = "email_consume", _("Email Consume") # Email attachment
SYSTEM = "system", _("System") # Auto-triggered (self-heal, config side-effect)
MANUAL = "manual", _("Manual") # User explicitly ran via /api/tasks/run/
# Identification
task_id = models.CharField(
max_length=255,
max_length=72,
unique=True,
verbose_name=_("Task ID"),
help_text=_("Celery ID for the Task that was run"),
help_text=_("Celery task ID"),
)
acknowledged = models.BooleanField(
default=False,
verbose_name=_("Acknowledged"),
help_text=_("If the task is acknowledged via the frontend or API"),
task_type = models.CharField(
max_length=50,
choices=TaskType.choices,
verbose_name=_("Task Type"),
help_text=_("The kind of work being performed"),
db_index=True,
)
task_file_name = models.CharField(
null=True,
max_length=255,
verbose_name=_("Task Filename"),
help_text=_("Name of the file which the Task was run for"),
)
task_name = models.CharField(
null=True,
max_length=255,
choices=TaskName.choices,
verbose_name=_("Task Name"),
help_text=_("Name of the task that was run"),
trigger_source = models.CharField(
max_length=50,
choices=TriggerSource.choices,
verbose_name=_("Trigger Source"),
help_text=_("What initiated this task"),
db_index=True,
)
# State tracking
status = models.CharField(
max_length=30,
default=states.PENDING,
choices=TASK_STATE_CHOICES,
verbose_name=_("Task State"),
help_text=_("Current state of the task being run"),
choices=Status.choices,
default=Status.PENDING,
verbose_name=_("Status"),
db_index=True,
)
# Timestamps
date_created = models.DateTimeField(
null=True,
default=timezone.now,
verbose_name=_("Created DateTime"),
help_text=_("Datetime field when the task result was created in UTC"),
verbose_name=_("Created"),
db_index=True,
)
date_started = models.DateTimeField(
null=True,
default=None,
verbose_name=_("Started DateTime"),
help_text=_("Datetime field when the task was started in UTC"),
blank=True,
verbose_name=_("Started"),
)
date_done = models.DateTimeField(
null=True,
default=None,
verbose_name=_("Completed DateTime"),
help_text=_("Datetime field when the task was completed in UTC"),
blank=True,
verbose_name=_("Completed"),
db_index=True,
)
result = models.TextField(
# Duration fields -- populated by task_postrun signal handler
duration_seconds = models.FloatField(
null=True,
default=None,
blank=True,
verbose_name=_("Duration (seconds)"),
help_text=_("Elapsed time from start to completion"),
)
wait_time_seconds = models.FloatField(
null=True,
blank=True,
verbose_name=_("Wait Time (seconds)"),
help_text=_("Time from task creation to worker pickup"),
)
# Input/Output data
input_data = models.JSONField(
default=dict,
blank=True,
verbose_name=_("Input Data"),
help_text=_("Structured input parameters for the task"),
)
result_data = models.JSONField(
null=True,
blank=True,
verbose_name=_("Result Data"),
help_text=_(
"The data returned by the task",
),
help_text=_("Structured result data from task execution"),
)
type = models.CharField(
max_length=30,
choices=TaskType.choices,
default=TaskType.AUTO,
verbose_name=_("Task Type"),
help_text=_("The type of task that was run"),
# Acknowledgment
acknowledged = models.BooleanField(
default=False,
verbose_name=_("Acknowledged"),
db_index=True,
)
def __str__(self) -> str:
return f"Task {self.task_id}"
class Meta:
verbose_name = _("Task")
verbose_name_plural = _("Tasks")
ordering = ["-date_created"]
indexes = [
models.Index(fields=["status", "date_created"]),
models.Index(fields=["task_type", "status"]),
models.Index(fields=["owner", "acknowledged", "date_created"]),
]
def __str__(self) -> str: # pragma: no cover
return f"{self.get_task_type_display()} [{self.task_id[:8]}]"
@property
def is_complete(self) -> bool: # pragma: no cover
return self.status in self.COMPLETE_STATUSES
@property
def related_document_ids(self) -> list[int]: # pragma: no cover
if not self.result_data:
return []
if doc_id := self.result_data.get("document_id"):
return [doc_id]
if dup_id := self.result_data.get("duplicate_of"):
return [dup_id]
return []
class Note(SoftDeleteModel):
+1 -1
View File
@@ -72,7 +72,7 @@ def has_system_status_permission(user: User | None) -> bool:
return (
getattr(user, "is_superuser", False)
or getattr(user, "is_staff", False)
or user.has_perm("paperless.view_system_status")
or user.has_perm("paperless.view_system_monitoring")
)
+18 -45
View File
@@ -10,7 +10,6 @@ is an identity function that adds no overhead.
"""
import logging
import uuid
from collections import defaultdict
from collections.abc import Iterator
from pathlib import Path
@@ -18,12 +17,9 @@ from typing import TYPE_CHECKING
from typing import Final
from typing import TypedDict
from celery import states
from django.conf import settings
from django.utils import timezone
from documents.models import Document
from documents.models import PaperlessTask
from documents.utils import IterWrapper
from documents.utils import compute_checksum
from documents.utils import identity
@@ -182,8 +178,9 @@ def _check_thumbnail(
present_files: set[Path],
) -> None:
"""Verify the thumbnail exists and is readable."""
thumbnail_path: Final[Path] = Path(doc.thumbnail_path).resolve()
if not thumbnail_path.exists() or not thumbnail_path.is_file():
# doc.thumbnail_path already returns a resolved Path; no need to re-resolve.
thumbnail_path: Final[Path] = doc.thumbnail_path
if not thumbnail_path.is_file():
messages.error(doc.pk, "Thumbnail of document does not exist.")
return
@@ -200,8 +197,9 @@ def _check_original(
present_files: set[Path],
) -> None:
"""Verify the original file exists, is readable, and has matching checksum."""
source_path: Final[Path] = Path(doc.source_path).resolve()
if not source_path.exists() or not source_path.is_file():
# doc.source_path already returns a resolved Path; no need to re-resolve.
source_path: Final[Path] = doc.source_path
if not source_path.is_file():
messages.error(doc.pk, "Original of document does not exist.")
return
@@ -237,8 +235,9 @@ def _check_archive(
elif doc.has_archive_version:
if TYPE_CHECKING:
assert isinstance(doc.archive_path, Path)
archive_path: Final[Path] = Path(doc.archive_path).resolve()
if not archive_path.exists() or not archive_path.is_file():
# doc.archive_path already returns a resolved Path; no need to re-resolve.
archive_path: Final[Path] = doc.archive_path # type: ignore[assignment]
if not archive_path.is_file():
messages.error(doc.pk, "Archived version of document does not exist.")
return
@@ -284,59 +283,33 @@ def _check_document(
def check_sanity(
*,
scheduled: bool = True,
iter_wrapper: IterWrapper[Document] = identity,
) -> SanityCheckMessages:
"""Run a full sanity check on the document archive.
Args:
scheduled: Whether this is a scheduled (automatic) or manual check.
Controls the task type recorded in the database.
iter_wrapper: A callable that wraps the document iterable, e.g.,
for progress bar display. Defaults to identity (no wrapping).
Returns:
A SanityCheckMessages instance containing all detected issues.
"""
paperless_task = PaperlessTask.objects.create(
task_id=uuid.uuid4(),
type=(
PaperlessTask.TaskType.SCHEDULED_TASK
if scheduled
else PaperlessTask.TaskType.MANUAL_TASK
),
task_name=PaperlessTask.TaskName.CHECK_SANITY,
status=states.STARTED,
date_created=timezone.now(),
date_started=timezone.now(),
)
messages = SanityCheckMessages()
present_files = _build_present_files()
documents = Document.global_objects.all()
documents = Document.global_objects.only(
"pk",
"filename",
"mime_type",
"checksum",
"archive_checksum",
"archive_filename",
"content",
).iterator(chunk_size=500)
for doc in iter_wrapper(documents):
_check_document(doc, messages, present_files)
for extra_file in present_files:
messages.warning(None, f"Orphaned file in media dir: {extra_file}")
paperless_task.status = states.SUCCESS if not messages.has_error else states.FAILURE
if messages.total_issue_count == 0:
paperless_task.result = "No issues found."
else:
parts: list[str] = []
if messages.document_error_count:
parts.append(f"{messages.document_error_count} document(s) with errors")
if messages.document_warning_count:
parts.append(f"{messages.document_warning_count} document(s) with warnings")
if messages.global_warning_count:
parts.append(f"{messages.global_warning_count} global warning(s)")
paperless_task.result = ", ".join(parts) + " found."
if messages.has_error:
paperless_task.result += " Check logs for details."
paperless_task.date_done = timezone.now()
paperless_task.save(update_fields=["status", "result", "date_done"])
return messages
+2 -2
View File
@@ -1,6 +1,6 @@
from documents.search._backend import SearchHit
from documents.search._backend import SearchIndexLockError
from documents.search._backend import SearchMode
from documents.search._backend import SearchResults
from documents.search._backend import TantivyBackend
from documents.search._backend import TantivyRelevanceList
from documents.search._backend import WriteBatch
@@ -10,9 +10,9 @@ from documents.search._schema import needs_rebuild
from documents.search._schema import wipe_index
__all__ = [
"SearchHit",
"SearchIndexLockError",
"SearchMode",
"SearchResults",
"TantivyBackend",
"TantivyRelevanceList",
"WriteBatch",
+349 -241
View File
@@ -1,9 +1,9 @@
from __future__ import annotations
import logging
import re
import threading
from collections import Counter
from dataclasses import dataclass
from datetime import UTC
from datetime import datetime
from enum import StrEnum
@@ -21,6 +21,7 @@ from guardian.shortcuts import get_users_with_perms
from documents.search._normalize import ascii_fold
from documents.search._query import build_permission_filter
from documents.search._query import parse_simple_text_highlight_query
from documents.search._query import parse_simple_text_query
from documents.search._query import parse_simple_title_query
from documents.search._query import parse_user_query
@@ -88,45 +89,63 @@ class SearchHit(TypedDict):
highlights: dict[str, str]
@dataclass(frozen=True, slots=True)
class SearchResults:
"""
Container for search results with pagination metadata.
Attributes:
hits: List of search results with scores and highlights
total: Total matching documents across all pages (for pagination)
query: Preprocessed query string after date/syntax rewriting
"""
hits: list[SearchHit]
total: int # total matching documents (for pagination)
query: str # preprocessed query string
class TantivyRelevanceList:
"""
DRF-compatible list wrapper for Tantivy search hits.
DRF-compatible list wrapper for Tantivy search results.
Provides paginated access to search results while storing all hits in memory
for efficient ID retrieval. Used by Django REST framework for pagination.
Holds a lightweight ordered list of IDs (for pagination count and
``selection_data``) together with a small page of rich ``SearchHit``
dicts (for serialization). DRF's ``PageNumberPagination`` calls
``__len__`` to compute the total page count and ``__getitem__`` to
slice the displayed page.
Methods:
__len__: Returns total hit count for pagination calculations
__getitem__: Slices the hit list for page-specific results
Note: Stores ALL post-filter hits so get_all_result_ids() can return
every matching document ID without requiring a second search query.
Args:
ordered_ids: All matching document IDs in display order.
page_hits: Rich SearchHit dicts for the requested DRF page only.
page_offset: Index into *ordered_ids* where *page_hits* starts.
"""
def __init__(self, hits: list[SearchHit]) -> None:
self._hits = hits
def __init__(
self,
ordered_ids: list[int],
page_hits: list[SearchHit],
page_offset: int = 0,
) -> None:
self._ordered_ids = ordered_ids
self._page_hits = page_hits
self._page_offset = page_offset
def __len__(self) -> int:
return len(self._hits)
return len(self._ordered_ids)
def __getitem__(self, key: slice) -> list[SearchHit]:
return self._hits[key]
def __getitem__(self, key: int | slice) -> SearchHit | list[SearchHit]:
if isinstance(key, int):
idx = key if key >= 0 else len(self._ordered_ids) + key
if self._page_offset <= idx < self._page_offset + len(self._page_hits):
return self._page_hits[idx - self._page_offset]
return SearchHit(
id=self._ordered_ids[key],
score=0.0,
rank=idx + 1,
highlights={},
)
start = key.start or 0
stop = key.stop or len(self._ordered_ids)
# DRF slices to extract the current page. If the slice aligns
# with our pre-fetched page_hits, return them directly.
# We only check start — DRF always slices with stop=start+page_size,
# which exceeds page_hits length on the last page.
if start == self._page_offset:
return self._page_hits[: stop - start]
# Fallback: return stub dicts (no highlights).
return [
SearchHit(id=doc_id, score=0.0, rank=start + i + 1, highlights={})
for i, doc_id in enumerate(self._ordered_ids[key])
]
def get_all_ids(self) -> list[int]:
"""Return all matching document IDs in display order."""
return self._ordered_ids
class SearchIndexLockError(Exception):
@@ -206,10 +225,13 @@ class WriteBatch:
"""
Remove a document from the batch by its primary key.
Uses range query instead of term query to work around unsigned integer
type detection bug in tantivy-py 0.25.
Uses range_query instead of term_query to work around a tantivy-py bug
where Python integers are inferred as i64, producing Terms that never
match u64 fields.
TODO: Replace with term_query("id", doc_id) once
https://github.com/quickwit-oss/tantivy-py/pull/642 lands.
"""
# Use range query to work around u64 deletion bug
self._writer.delete_documents_by_query(
tantivy.Query.range_query(
self._backend._schema,
@@ -234,6 +256,34 @@ class TantivyBackend:
the underlying index directory changes (e.g., during test isolation).
"""
# Maps DRF ordering field names to Tantivy index field names.
SORT_FIELD_MAP: dict[str, str] = {
"title": "title_sort",
"correspondent__name": "correspondent_sort",
"document_type__name": "type_sort",
"created": "created",
"added": "added",
"modified": "modified",
"archive_serial_number": "asn",
"page_count": "page_count",
"num_notes": "num_notes",
}
# Fields where Tantivy's sort order matches the ORM's sort order.
# Text-based fields (title, correspondent__name, document_type__name)
# are excluded because Tantivy's tokenized fast fields produce different
# ordering than the ORM's collation-based ordering.
SORTABLE_FIELDS: frozenset[str] = frozenset(
{
"created",
"added",
"modified",
"archive_serial_number",
"page_count",
"num_notes",
},
)
def __init__(self, path: Path | None = None):
# path=None → in-memory index (for tests)
# path=some_dir → on-disk index (for production)
@@ -272,6 +322,36 @@ class TantivyBackend:
if self._index is None:
self.open() # pragma: no cover
def _parse_query(
self,
query: str,
search_mode: SearchMode,
) -> tantivy.Query:
"""Parse a user query string into a Tantivy Query object."""
tz = get_current_timezone()
if search_mode is SearchMode.TEXT:
return parse_simple_text_query(self._index, query)
elif search_mode is SearchMode.TITLE:
return parse_simple_title_query(self._index, query)
else:
return parse_user_query(self._index, query, tz)
def _apply_permission_filter(
self,
query: tantivy.Query,
user: AbstractBaseUser | None,
) -> tantivy.Query:
"""Wrap a query with a permission filter if the user is not a superuser."""
if user is not None:
permission_filter = build_permission_filter(self._schema, user)
return tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, query),
(tantivy.Occur.Must, permission_filter),
],
)
return query
def _build_tantivy_doc(
self,
document: Document,
@@ -326,12 +406,17 @@ class TantivyBackend:
doc.add_unsigned("tag_id", tag.pk)
tag_names.append(tag.name)
# Notes — JSON for structured queries (notes.user:alice, notes.note:text),
# companion text field for default full-text search.
# Notes — JSON for structured queries (notes.user:alice, notes.note:text).
# notes_text is a plain-text companion for snippet/highlight generation;
# tantivy's SnippetGenerator does not support JSON fields.
num_notes = 0
note_texts: list[str] = []
for note in document.notes.all():
num_notes += 1
doc.add_json("notes", {"note": note.note, "user": note.user.username})
note_texts.append(note.note)
if note_texts:
doc.add_text("notes_text", " ".join(note_texts))
# Custom fields — JSON for structured queries (custom_fields.name:x, custom_fields.value:y),
# companion text field for default full-text search.
@@ -425,155 +510,128 @@ class TantivyBackend:
with self.batch_update(lock_timeout=5.0) as batch:
batch.remove(doc_id)
def search(
def highlight_hits(
self,
query: str,
user: AbstractBaseUser | None,
page: int,
page_size: int,
sort_field: str | None,
doc_ids: list[int],
*,
sort_reverse: bool,
search_mode: SearchMode = SearchMode.QUERY,
) -> SearchResults:
rank_start: int = 1,
) -> list[SearchHit]:
"""
Execute a search query against the document index.
Generate SearchHit dicts with highlights for specific document IDs.
Processes the user query through date rewriting, normalization, and
permission filtering before executing against Tantivy. Supports both
relevance-based and field-based sorting.
Unlike search(), this does not execute a ranked query it looks up
each document by ID and generates snippets against the provided query.
Use this when you already know which documents to display (from
search_ids + ORM filtering) and just need highlight data.
QUERY search mode supports natural date keywords, field filters, etc.
TITLE search mode treats the query as plain text to search for in title only
TEXT search mode treats the query as plain text to search for in title and content
Note: Each doc_id requires an individual index lookup because tantivy-py
does not yet expose a batch fast-field read API. This is acceptable for
page-sized batches (typically 25 docs) but should not be called with
thousands of IDs.
TODO: When https://github.com/quickwit-oss/tantivy-py/pull/641 lands,
the per-doc range_query lookups here can be replaced with a single
collect_u64_fast_field("id", doc_addresses) call.
Args:
query: User's search query
user: User for permission filtering (None for superuser/no filtering)
page: Page number (1-indexed) for pagination
page_size: Number of results per page
sort_field: Field to sort by (None for relevance ranking)
sort_reverse: Whether to reverse the sort order
search_mode: "query" for advanced Tantivy syntax, "text" for
plain-text search over title and content only, "title" for
plain-text search over title only
query: The search query (used for snippet generation)
doc_ids: Ordered list of document IDs to generate hits for
search_mode: Query parsing mode (for building the snippet query)
rank_start: Starting rank value (1-based absolute position in the
full result set; pass ``page_offset + 1`` for paginated calls)
Returns:
SearchResults with hits, total count, and processed query
List of SearchHit dicts in the same order as doc_ids
"""
self._ensure_open()
tz = get_current_timezone()
if search_mode is SearchMode.TEXT:
user_query = parse_simple_text_query(self._index, query)
elif search_mode is SearchMode.TITLE:
user_query = parse_simple_title_query(self._index, query)
else:
user_query = parse_user_query(self._index, query, tz)
if not doc_ids:
return []
# Apply permission filter if user is not None (not superuser)
if user is not None:
permission_filter = build_permission_filter(self._schema, user)
final_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, user_query),
(tantivy.Occur.Must, permission_filter),
],
self._ensure_open()
user_query = self._parse_query(query, search_mode)
highlight_query = user_query
if search_mode is SearchMode.TEXT:
highlight_query = parse_simple_text_highlight_query(self._index, query)
# For notes_text snippet generation, we need a query that targets the
# notes_text field directly. user_query may contain JSON-field terms
# (e.g. notes.note:urgent) that the SnippetGenerator cannot resolve
# against a text field. Strip field:value prefixes so bare terms like
# "urgent" are re-parsed against notes_text, producing highlights even
# when the original query used structured syntax.
bare_query = re.sub(r"\w[\w.]*:", "", query).strip()
try:
notes_text_query = (
self._index.parse_query(bare_query, ["notes_text"])
if bare_query
else user_query
)
else:
final_query = user_query
except Exception:
notes_text_query = user_query
searcher = self._index.searcher()
offset = (page - 1) * page_size
# Map sort fields
sort_field_map = {
"title": "title_sort",
"correspondent__name": "correspondent_sort",
"document_type__name": "type_sort",
"created": "created",
"added": "added",
"modified": "modified",
"archive_serial_number": "asn",
"page_count": "page_count",
"num_notes": "num_notes",
}
# Perform search
if sort_field and sort_field in sort_field_map:
mapped_field = sort_field_map[sort_field]
results = searcher.search(
final_query,
limit=offset + page_size,
order_by_field=mapped_field,
order=tantivy.Order.Desc if sort_reverse else tantivy.Order.Asc,
)
# Field sorting: hits are still (score, DocAddress) tuples; score unused
all_hits = [(hit[1], 0.0) for hit in results.hits]
else:
# Score-based search: hits are (score, DocAddress) tuples
results = searcher.search(final_query, limit=offset + page_size)
all_hits = [(hit[1], hit[0]) for hit in results.hits]
total = results.count
# Normalize scores for score-based searches
if not sort_field and all_hits:
max_score = max(hit[1] for hit in all_hits) or 1.0
all_hits = [(hit[0], hit[1] / max_score) for hit in all_hits]
# Apply threshold filter if configured (score-based search only)
threshold = settings.ADVANCED_FUZZY_SEARCH_THRESHOLD
if threshold is not None and not sort_field:
all_hits = [hit for hit in all_hits if hit[1] >= threshold]
# Get the page's hits
page_hits = all_hits[offset : offset + page_size]
# Build result hits with highlights
hits: list[SearchHit] = []
snippet_generator = None
notes_snippet_generator = None
hits: list[SearchHit] = []
for rank, (doc_address, score) in enumerate(page_hits, start=offset + 1):
# Get the actual document from the searcher using the doc address
for rank, doc_id in enumerate(doc_ids, start=rank_start):
# Look up document by ID, scoring against the user query so that
# the returned SearchHit carries a real BM25 relevance score.
id_query = tantivy.Query.range_query(
self._schema,
"id",
tantivy.FieldType.Unsigned,
doc_id,
doc_id,
)
scored_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, user_query),
(tantivy.Occur.Must, id_query),
],
)
results = searcher.search(scored_query, limit=1)
if not results.hits:
continue
score, doc_address = results.hits[0]
actual_doc = searcher.doc(doc_address)
doc_dict = actual_doc.to_dict()
doc_id = doc_dict["id"][0]
highlights: dict[str, str] = {}
try:
if snippet_generator is None:
snippet_generator = tantivy.SnippetGenerator.create(
searcher,
highlight_query,
self._schema,
"content",
)
# Generate highlights if score > 0
if score > 0:
try:
if snippet_generator is None:
snippet_generator = tantivy.SnippetGenerator.create(
content_html = snippet_generator.snippet_from_doc(actual_doc).to_html()
if content_html:
highlights["content"] = content_html
if search_mode is SearchMode.QUERY and "notes_text" in doc_dict:
# Use notes_text (plain text) for snippet generation — tantivy's
# SnippetGenerator does not support JSON fields.
if notes_snippet_generator is None:
notes_snippet_generator = tantivy.SnippetGenerator.create(
searcher,
final_query,
notes_text_query,
self._schema,
"content",
"notes_text",
)
notes_html = notes_snippet_generator.snippet_from_doc(
actual_doc,
).to_html()
if notes_html:
highlights["notes"] = notes_html
content_snippet = snippet_generator.snippet_from_doc(actual_doc)
if content_snippet:
highlights["content"] = str(content_snippet)
# Try notes highlights
if "notes" in doc_dict:
if notes_snippet_generator is None:
notes_snippet_generator = tantivy.SnippetGenerator.create(
searcher,
final_query,
self._schema,
"notes",
)
notes_snippet = notes_snippet_generator.snippet_from_doc(
actual_doc,
)
if notes_snippet:
highlights["notes"] = str(notes_snippet)
except Exception: # pragma: no cover
logger.debug("Failed to generate highlights for doc %s", doc_id)
except Exception: # pragma: no cover
logger.debug("Failed to generate highlights for doc %s", doc_id)
hits.append(
SearchHit(
@@ -584,11 +642,69 @@ class TantivyBackend:
),
)
return SearchResults(
hits=hits,
total=total,
query=query,
)
return hits
def search_ids(
self,
query: str,
user: AbstractBaseUser | None,
*,
sort_field: str | None = None,
sort_reverse: bool = False,
search_mode: SearchMode = SearchMode.QUERY,
limit: int | None = None,
) -> list[int]:
"""
Return document IDs matching a query no highlights or scores.
This is the lightweight companion to search(). Use it when you need the
full set of matching IDs (e.g. for ``selection_data``) but don't need
scores, ranks, or highlights.
Args:
query: User's search query
user: User for permission filtering (None for superuser/no filtering)
sort_field: Field to sort by (None for relevance ranking)
sort_reverse: Whether to reverse the sort order
search_mode: Query parsing mode (QUERY, TEXT, or TITLE)
limit: Maximum number of IDs to return (None = all matching docs)
Returns:
List of document IDs in the requested order
"""
self._ensure_open()
user_query = self._parse_query(query, search_mode)
final_query = self._apply_permission_filter(user_query, user)
searcher = self._index.searcher()
effective_limit = limit if limit is not None else searcher.num_docs
if sort_field and sort_field in self.SORT_FIELD_MAP:
mapped_field = self.SORT_FIELD_MAP[sort_field]
results = searcher.search(
final_query,
limit=effective_limit,
order_by_field=mapped_field,
order=tantivy.Order.Desc if sort_reverse else tantivy.Order.Asc,
)
all_hits = [(hit[1],) for hit in results.hits]
else:
results = searcher.search(final_query, limit=effective_limit)
all_hits = [(hit[1], hit[0]) for hit in results.hits]
# Normalize scores and apply threshold (relevance search only)
if all_hits:
max_score = max(hit[1] for hit in all_hits) or 1.0
all_hits = [(hit[0], hit[1] / max_score) for hit in all_hits]
threshold = settings.ADVANCED_FUZZY_SEARCH_THRESHOLD
if threshold is not None:
all_hits = [hit for hit in all_hits if hit[1] >= threshold]
# TODO: Replace with searcher.collect_u64_fast_field("id", addrs) once
# https://github.com/quickwit-oss/tantivy-py/pull/641 lands — eliminates
# one stored-doc fetch per result (~80% reduction in search_ids latency).
return [searcher.doc(doc_addr).to_dict()["id"][0] for doc_addr, *_ in all_hits]
def autocomplete(
self,
@@ -603,6 +719,10 @@ class TantivyBackend:
frequency (how many documents contain each word). Optionally filters
results to only words from documents visible to the specified user.
NOTE: This is the hottest search path (called per keystroke).
A future improvement would be to cache results in Redis, keyed by
(prefix, user_id), and invalidate on index writes.
Args:
term: Prefix to match against autocomplete words
limit: Maximum number of suggestions to return
@@ -613,64 +733,94 @@ class TantivyBackend:
"""
self._ensure_open()
normalized_term = ascii_fold(term.lower())
if not normalized_term:
return []
searcher = self._index.searcher()
# Apply permission filter for non-superusers so autocomplete words
# from invisible documents don't leak to other users.
# Build a prefix query on autocomplete_word so we only scan docs
# containing words that start with the prefix, not the entire index.
# tantivy regex is implicitly anchored; .+ avoids the empty-match
# error that .* triggers. We OR with term_query to also match the
# exact prefix as a complete word.
escaped = re.escape(normalized_term)
prefix_query = tantivy.Query.boolean_query(
[
(
tantivy.Occur.Should,
tantivy.Query.term_query(
self._schema,
"autocomplete_word",
normalized_term,
),
),
(
tantivy.Occur.Should,
tantivy.Query.regex_query(
self._schema,
"autocomplete_word",
f"{escaped}.+",
),
),
],
)
# Intersect with permission filter so autocomplete words from
# invisible documents don't leak to other users.
if user is not None and not user.is_superuser:
base_query = build_permission_filter(self._schema, user)
final_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, prefix_query),
(tantivy.Occur.Must, build_permission_filter(self._schema, user)),
],
)
else:
base_query = tantivy.Query.all_query()
final_query = prefix_query
results = searcher.search(base_query, limit=10000)
results = searcher.search(final_query, limit=searcher.num_docs)
# Count how many visible documents each word appears in.
# Using Counter (not set) preserves per-word document frequency so
# we can rank suggestions by how commonly they occur — the same
# signal Whoosh used for Tf/Idf-based autocomplete ordering.
# Count how many visible documents each matching word appears in.
word_counts: Counter[str] = Counter()
for _score, doc_address in results.hits:
stored_doc = searcher.doc(doc_address)
doc_dict = stored_doc.to_dict()
if "autocomplete_word" in doc_dict:
word_counts.update(doc_dict["autocomplete_word"])
for word in doc_dict["autocomplete_word"]:
if word.startswith(normalized_term):
word_counts[word] += 1
# Filter to prefix matches, sort by document frequency descending;
# break ties alphabetically for stable, deterministic output.
# Sort by document frequency descending; break ties alphabetically.
matches = sorted(
(w for w in word_counts if w.startswith(normalized_term)),
word_counts,
key=lambda w: (-word_counts[w], w),
)
return matches[:limit]
def more_like_this(
def more_like_this_ids(
self,
doc_id: int,
user: AbstractBaseUser | None,
page: int,
page_size: int,
) -> SearchResults:
*,
limit: int | None = None,
) -> list[int]:
"""
Find documents similar to the given document using content analysis.
Return IDs of documents similar to the given document no highlights.
Uses Tantivy's "more like this" query to find documents with similar
content patterns. The original document is excluded from results.
Lightweight companion to more_like_this(). The original document is
excluded from results.
Args:
doc_id: Primary key of the reference document
user: User for permission filtering (None for no filtering)
page: Page number (1-indexed) for pagination
page_size: Number of results per page
limit: Maximum number of IDs to return (None = all matching docs)
Returns:
SearchResults with similar documents (excluding the original)
List of similar document IDs (excluding the original)
"""
self._ensure_open()
searcher = self._index.searcher()
# First find the document address
id_query = tantivy.Query.range_query(
self._schema,
"id",
@@ -681,13 +831,9 @@ class TantivyBackend:
results = searcher.search(id_query, limit=1)
if not results.hits:
# Document not found
return SearchResults(hits=[], total=0, query=f"more_like:{doc_id}")
return []
# Extract doc_address from (score, doc_address) tuple
doc_address = results.hits[0][1]
# Build more like this query
mlt_query = tantivy.Query.more_like_this_query(
doc_address,
min_doc_frequency=1,
@@ -699,59 +845,21 @@ class TantivyBackend:
boost_factor=None,
)
# Apply permission filter
if user is not None:
permission_filter = build_permission_filter(self._schema, user)
final_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, mlt_query),
(tantivy.Occur.Must, permission_filter),
],
)
else:
final_query = mlt_query
final_query = self._apply_permission_filter(mlt_query, user)
# Search
offset = (page - 1) * page_size
results = searcher.search(final_query, limit=offset + page_size)
effective_limit = limit if limit is not None else searcher.num_docs
# Fetch one extra to account for excluding the original document
results = searcher.search(final_query, limit=effective_limit + 1)
total = results.count
# Convert from (score, doc_address) to (doc_address, score)
all_hits = [(hit[1], hit[0]) for hit in results.hits]
# TODO: Replace with collect_u64_fast_field("id", addrs) once
# https://github.com/quickwit-oss/tantivy-py/pull/641 lands.
ids = []
for _score, doc_address in results.hits:
result_doc_id = searcher.doc(doc_address).to_dict()["id"][0]
if result_doc_id != doc_id:
ids.append(result_doc_id)
# Normalize scores
if all_hits:
max_score = max(hit[1] for hit in all_hits) or 1.0
all_hits = [(hit[0], hit[1] / max_score) for hit in all_hits]
# Get page hits
page_hits = all_hits[offset : offset + page_size]
# Build results
hits: list[SearchHit] = []
for rank, (doc_address, score) in enumerate(page_hits, start=offset + 1):
actual_doc = searcher.doc(doc_address)
doc_dict = actual_doc.to_dict()
result_doc_id = doc_dict["id"][0]
# Skip the original document
if result_doc_id == doc_id:
continue
hits.append(
SearchHit(
id=result_doc_id,
score=score,
rank=rank,
highlights={}, # MLT doesn't generate highlights
),
)
return SearchResults(
hits=hits,
total=max(0, total - 1), # Subtract 1 for the original document
query=f"more_like:{doc_id}",
)
return ids[:limit] if limit is not None else ids
def batch_update(self, lock_timeout: float = 30.0) -> WriteBatch:
"""
+107 -41
View File
@@ -25,21 +25,39 @@ _REGEX_TIMEOUT: Final[float] = 1.0
_DATE_ONLY_FIELDS = frozenset({"created"})
_TODAY: Final[str] = "today"
_YESTERDAY: Final[str] = "yesterday"
_PREVIOUS_WEEK: Final[str] = "previous week"
_THIS_MONTH: Final[str] = "this month"
_PREVIOUS_MONTH: Final[str] = "previous month"
_THIS_YEAR: Final[str] = "this year"
_PREVIOUS_YEAR: Final[str] = "previous year"
_PREVIOUS_QUARTER: Final[str] = "previous quarter"
_DATE_KEYWORDS = frozenset(
{
"today",
"yesterday",
"this_week",
"last_week",
"this_month",
"last_month",
"this_year",
"last_year",
_TODAY,
_YESTERDAY,
_PREVIOUS_WEEK,
_THIS_MONTH,
_PREVIOUS_MONTH,
_THIS_YEAR,
_PREVIOUS_YEAR,
_PREVIOUS_QUARTER,
},
)
_DATE_KEYWORD_PATTERN = "|".join(
sorted((regex.escape(k) for k in _DATE_KEYWORDS), key=len, reverse=True),
)
_FIELD_DATE_RE = regex.compile(
r"(\w+):(" + "|".join(_DATE_KEYWORDS) + r")\b",
rf"""(?P<field>\w+)\s*:\s*(?:
(?P<quote>["'])(?P<quoted>{_DATE_KEYWORD_PATTERN})(?P=quote)
|
(?P<bare>{_DATE_KEYWORD_PATTERN})(?![\w-])
)""",
regex.IGNORECASE | regex.VERBOSE,
)
_COMPACT_DATE_RE = regex.compile(r"\b(\d{14})\b")
_RELATIVE_RANGE_RE = regex.compile(
@@ -74,44 +92,59 @@ def _date_only_range(keyword: str, tz: tzinfo) -> str:
today = datetime.now(tz).date()
if keyword == "today":
def _quarter_start(d: date) -> date:
return date(d.year, ((d.month - 1) // 3) * 3 + 1, 1)
if keyword == _TODAY:
lo = datetime(today.year, today.month, today.day, tzinfo=UTC)
return _iso_range(lo, lo + timedelta(days=1))
if keyword == "yesterday":
if keyword == _YESTERDAY:
y = today - timedelta(days=1)
lo = datetime(y.year, y.month, y.day, tzinfo=UTC)
hi = datetime(today.year, today.month, today.day, tzinfo=UTC)
return _iso_range(lo, hi)
if keyword == "this_week":
mon = today - timedelta(days=today.weekday())
lo = datetime(mon.year, mon.month, mon.day, tzinfo=UTC)
return _iso_range(lo, lo + timedelta(weeks=1))
if keyword == "last_week":
if keyword == _PREVIOUS_WEEK:
this_mon = today - timedelta(days=today.weekday())
last_mon = this_mon - timedelta(weeks=1)
lo = datetime(last_mon.year, last_mon.month, last_mon.day, tzinfo=UTC)
hi = datetime(this_mon.year, this_mon.month, this_mon.day, tzinfo=UTC)
return _iso_range(lo, hi)
if keyword == "this_month":
if keyword == _THIS_MONTH:
lo = datetime(today.year, today.month, 1, tzinfo=UTC)
if today.month == 12:
hi = datetime(today.year + 1, 1, 1, tzinfo=UTC)
else:
hi = datetime(today.year, today.month + 1, 1, tzinfo=UTC)
return _iso_range(lo, hi)
if keyword == "last_month":
if keyword == _PREVIOUS_MONTH:
if today.month == 1:
lo = datetime(today.year - 1, 12, 1, tzinfo=UTC)
else:
lo = datetime(today.year, today.month - 1, 1, tzinfo=UTC)
hi = datetime(today.year, today.month, 1, tzinfo=UTC)
return _iso_range(lo, hi)
if keyword == "this_year":
if keyword == _THIS_YEAR:
lo = datetime(today.year, 1, 1, tzinfo=UTC)
return _iso_range(lo, datetime(today.year + 1, 1, 1, tzinfo=UTC))
if keyword == "last_year":
if keyword == _PREVIOUS_YEAR:
lo = datetime(today.year - 1, 1, 1, tzinfo=UTC)
return _iso_range(lo, datetime(today.year, 1, 1, tzinfo=UTC))
if keyword == _PREVIOUS_QUARTER:
this_quarter = _quarter_start(today)
last_quarter = this_quarter - relativedelta(months=3)
lo = datetime(
last_quarter.year,
last_quarter.month,
last_quarter.day,
tzinfo=UTC,
)
hi = datetime(
this_quarter.year,
this_quarter.month,
this_quarter.day,
tzinfo=UTC,
)
return _iso_range(lo, hi)
raise ValueError(f"Unknown keyword: {keyword}")
@@ -127,42 +160,46 @@ def _datetime_range(keyword: str, tz: tzinfo) -> str:
def _midnight(d: date) -> datetime:
return datetime(d.year, d.month, d.day, tzinfo=tz).astimezone(UTC)
if keyword == "today":
def _quarter_start(d: date) -> date:
return date(d.year, ((d.month - 1) // 3) * 3 + 1, 1)
if keyword == _TODAY:
return _iso_range(_midnight(today), _midnight(today + timedelta(days=1)))
if keyword == "yesterday":
if keyword == _YESTERDAY:
y = today - timedelta(days=1)
return _iso_range(_midnight(y), _midnight(today))
if keyword == "this_week":
mon = today - timedelta(days=today.weekday())
return _iso_range(_midnight(mon), _midnight(mon + timedelta(weeks=1)))
if keyword == "last_week":
if keyword == _PREVIOUS_WEEK:
this_mon = today - timedelta(days=today.weekday())
last_mon = this_mon - timedelta(weeks=1)
return _iso_range(_midnight(last_mon), _midnight(this_mon))
if keyword == "this_month":
if keyword == _THIS_MONTH:
first = today.replace(day=1)
if today.month == 12:
next_first = date(today.year + 1, 1, 1)
else:
next_first = date(today.year, today.month + 1, 1)
return _iso_range(_midnight(first), _midnight(next_first))
if keyword == "last_month":
if keyword == _PREVIOUS_MONTH:
this_first = today.replace(day=1)
if today.month == 1:
last_first = date(today.year - 1, 12, 1)
else:
last_first = date(today.year, today.month - 1, 1)
return _iso_range(_midnight(last_first), _midnight(this_first))
if keyword == "this_year":
if keyword == _THIS_YEAR:
return _iso_range(
_midnight(date(today.year, 1, 1)),
_midnight(date(today.year + 1, 1, 1)),
)
if keyword == "last_year":
if keyword == _PREVIOUS_YEAR:
return _iso_range(
_midnight(date(today.year - 1, 1, 1)),
_midnight(date(today.year, 1, 1)),
)
if keyword == _PREVIOUS_QUARTER:
this_quarter = _quarter_start(today)
last_quarter = this_quarter - relativedelta(months=3)
return _iso_range(_midnight(last_quarter), _midnight(this_quarter))
raise ValueError(f"Unknown keyword: {keyword}")
@@ -308,7 +345,7 @@ def rewrite_natural_date_keywords(query: str, tz: tzinfo) -> str:
- Compact 14-digit dates (YYYYMMDDHHmmss)
- Whoosh relative ranges ([-7 days to now], [now-1h TO now+2h])
- 8-digit dates with field awareness (created:20240115)
- Natural keywords (field:today, field:last_week, etc.)
- Natural keywords (field:today, field:"previous quarter", etc.)
Args:
query: Raw user query string
@@ -326,7 +363,8 @@ def rewrite_natural_date_keywords(query: str, tz: tzinfo) -> str:
query = _rewrite_relative_range(query)
def _replace(m: regex.Match[str]) -> str:
field, keyword = m.group(1), m.group(2)
field = m.group("field")
keyword = (m.group("quoted") or m.group("bare")).lower()
if field in _DATE_ONLY_FIELDS:
return f"{field}:{_date_only_range(keyword, tz)}"
return f"{field}:{_datetime_range(keyword, tz)}"
@@ -396,10 +434,17 @@ def build_permission_filter(
Tantivy query that filters results to visible documents
Implementation Notes:
- Uses range_query instead of term_query to work around unsigned integer
type detection bug in tantivy-py 0.25
- Uses boolean_query for "no owner" check since exists_query is not
available in tantivy-py 0.25.1 (available in master)
- Uses range_query instead of term_query for owner_id/viewer_id to work
around a tantivy-py bug where Python ints are inferred as i64, causing
term_query to return no hits on u64 fields.
TODO: Replace with term_query once
https://github.com/quickwit-oss/tantivy-py/pull/642 lands.
- Uses range_query(owner_id, 1, MAX_U64) as an "owner exists" check
because exists_query is not yet available in tantivy-py 0.25.
TODO: Replace with exists_query("owner_id") once that is exposed in
a tantivy-py release.
- Uses disjunction_max_query to combine permission clauses with OR logic
"""
owner_any = tantivy.Query.range_query(
@@ -445,6 +490,14 @@ _FIELD_BOOSTS = {"title": 2.0}
_SIMPLE_FIELD_BOOSTS = {"simple_title": 2.0}
def _simple_query_tokens(raw_query: str) -> list[str]:
tokens = [
ascii_fold(token.lower())
for token in _SIMPLE_QUERY_TOKEN_RE.findall(raw_query, timeout=_REGEX_TIMEOUT)
]
return [token for token in tokens if token]
def _build_simple_field_query(
index: tantivy.Index,
field: str,
@@ -540,11 +593,7 @@ def parse_simple_query(
Query string is escaped and normalized to be treated as "simple" text query.
"""
tokens = [
ascii_fold(token.lower())
for token in _SIMPLE_QUERY_TOKEN_RE.findall(raw_query, timeout=_REGEX_TIMEOUT)
]
tokens = [token for token in tokens if token]
tokens = _simple_query_tokens(raw_query)
if not tokens:
return tantivy.Query.empty_query()
@@ -557,6 +606,23 @@ def parse_simple_query(
return tantivy.Query.boolean_query(field_queries)
def parse_simple_text_highlight_query(
index: tantivy.Index,
raw_query: str,
) -> tantivy.Query:
"""Build a snippet-friendly query for simple text searches.
Simple search matching uses regex queries but for compatibility with Tantivy
SnippetGenerator we build a plain term query over the content field instead.
"""
tokens = _simple_query_tokens(raw_query)
if not tokens:
return tantivy.Query.empty_query()
return index.parse_query(" ".join(tokens), ["content"])
def parse_simple_text_query(
index: tantivy.Index,
raw_query: str,
+24 -18
View File
@@ -1,5 +1,6 @@
from __future__ import annotations
import json
import logging
import shutil
from typing import TYPE_CHECKING
@@ -72,6 +73,9 @@ def build_schema() -> tantivy.Schema:
# JSON fields — structured queries: notes.user:alice, custom_fields.name:invoice
sb.add_json_field("notes", stored=True, tokenizer_name="paperless_text")
# Plain-text companion for notes — tantivy's SnippetGenerator does not support
# JSON fields, so highlights require a text field with the same content.
sb.add_text_field("notes_text", stored=True, tokenizer_name="paperless_text")
sb.add_json_field("custom_fields", stored=True, tokenizer_name="paperless_text")
for field in (
@@ -97,9 +101,9 @@ def needs_rebuild(index_dir: Path) -> bool:
"""
Check if the search index needs rebuilding.
Compares the current schema version and search language configuration
against sentinel files to determine if the index is compatible with
the current paperless-ngx version and settings.
Reads .index_settings.json to compare the stored schema version and
search language against the current configuration. Returns True if the
file is missing, unparsable, or either value mismatches.
Args:
index_dir: Path to the search index directory
@@ -107,24 +111,19 @@ def needs_rebuild(index_dir: Path) -> bool:
Returns:
True if the index needs rebuilding, False if it's up to date
"""
version_file = index_dir / ".schema_version"
if not version_file.exists():
settings_file = index_dir / ".index_settings.json"
if not settings_file.exists():
return True
try:
if int(version_file.read_text().strip()) != SCHEMA_VERSION:
data = json.loads(settings_file.read_text())
if data.get("schema_version") != SCHEMA_VERSION:
logger.info("Search index schema version mismatch - rebuilding.")
return True
if "language" not in data or data["language"] != settings.SEARCH_LANGUAGE:
logger.info("Search index language changed - rebuilding.")
return True
except ValueError:
return True
language_file = index_dir / ".schema_language"
if not language_file.exists():
logger.info("Search index language sentinel missing - rebuilding.")
return True
if language_file.read_text().strip() != (settings.SEARCH_LANGUAGE or ""):
logger.info("Search index language changed - rebuilding.")
return True
return False
@@ -146,9 +145,16 @@ def wipe_index(index_dir: Path) -> None:
def _write_sentinels(index_dir: Path) -> None:
"""Write schema version and language sentinel files so the next index open can skip rebuilding."""
(index_dir / ".schema_version").write_text(str(SCHEMA_VERSION))
(index_dir / ".schema_language").write_text(settings.SEARCH_LANGUAGE or "")
"""Write .index_settings.json so the next index open can skip rebuilding."""
settings_file = index_dir / ".index_settings.json"
settings_file.write_text(
json.dumps(
{
"schema_version": SCHEMA_VERSION,
"language": settings.SEARCH_LANGUAGE,
},
),
)
def open_or_rebuild_index(index_dir: Path | None = None) -> tantivy.Index:
+208 -79
View File
@@ -11,8 +11,6 @@ from typing import Any
from typing import Literal
from typing import TypedDict
import magic
from celery import states
from django.conf import settings
from django.contrib.auth.models import Group
from django.contrib.auth.models import User
@@ -40,6 +38,7 @@ from drf_spectacular.utils import extend_schema_field
from drf_spectacular.utils import extend_schema_serializer
from drf_writable_nested.serializers import NestedUpdateMixin
from guardian.core import ObjectPermissionChecker
from guardian.shortcuts import get_objects_for_user
from guardian.shortcuts import get_users_with_perms
from guardian.utils import get_group_obj_perms_model
from guardian.utils import get_user_obj_perms_model
@@ -49,6 +48,8 @@ from rest_framework.exceptions import PermissionDenied
from rest_framework.fields import SerializerMethodField
from rest_framework.filters import OrderingFilter
from paperless import mime_detection
if settings.AUDIT_LOG_ENABLED:
from auditlog.context import set_actor
@@ -100,7 +101,7 @@ logger = logging.getLogger("paperless.serializers")
# https://www.django-rest-framework.org/api-guide/serializers/#example
class DynamicFieldsModelSerializer(serializers.ModelSerializer):
class DynamicFieldsModelSerializer(serializers.ModelSerializer[Any]):
"""
A ModelSerializer that takes an additional `fields` argument that
controls which fields should be displayed.
@@ -121,7 +122,7 @@ class DynamicFieldsModelSerializer(serializers.ModelSerializer):
self.fields.pop(field_name)
class MatchingModelSerializer(serializers.ModelSerializer):
class MatchingModelSerializer(serializers.ModelSerializer[Any]):
document_count = serializers.IntegerField(read_only=True)
def get_slug(self, obj) -> str:
@@ -261,7 +262,7 @@ class SetPermissionsSerializer(serializers.DictField):
class OwnedObjectSerializer(
SerializerWithPerms,
serializers.ModelSerializer,
serializers.ModelSerializer[Any],
SetPermissionsMixin,
):
def __init__(self, *args, **kwargs) -> None:
@@ -469,7 +470,7 @@ class OwnedObjectSerializer(
return super().update(instance, validated_data)
class OwnedObjectListSerializer(serializers.ListSerializer):
class OwnedObjectListSerializer(serializers.ListSerializer[Any]):
def to_representation(self, documents):
self.child.context["shared_object_pks"] = self.child.get_shared_object_pks(
documents,
@@ -682,27 +683,27 @@ class TagSerializer(MatchingModelSerializer, OwnedObjectSerializer):
return super().validate(attrs)
class CorrespondentField(serializers.PrimaryKeyRelatedField):
class CorrespondentField(serializers.PrimaryKeyRelatedField[Correspondent]):
def get_queryset(self):
return Correspondent.objects.all()
class TagsField(serializers.PrimaryKeyRelatedField):
class TagsField(serializers.PrimaryKeyRelatedField[Tag]):
def get_queryset(self):
return Tag.objects.all()
class DocumentTypeField(serializers.PrimaryKeyRelatedField):
class DocumentTypeField(serializers.PrimaryKeyRelatedField[DocumentType]):
def get_queryset(self):
return DocumentType.objects.all()
class StoragePathField(serializers.PrimaryKeyRelatedField):
class StoragePathField(serializers.PrimaryKeyRelatedField[StoragePath]):
def get_queryset(self):
return StoragePath.objects.all()
class CustomFieldSerializer(serializers.ModelSerializer):
class CustomFieldSerializer(serializers.ModelSerializer[CustomField]):
data_type = serializers.ChoiceField(
choices=CustomField.FieldDataType,
read_only=False,
@@ -816,7 +817,7 @@ def validate_documentlink_targets(user, doc_ids):
)
class CustomFieldInstanceSerializer(serializers.ModelSerializer):
class CustomFieldInstanceSerializer(serializers.ModelSerializer[CustomFieldInstance]):
field = serializers.PrimaryKeyRelatedField(queryset=CustomField.objects.all())
value = ReadWriteSerializerMethodField(allow_null=True)
@@ -911,6 +912,8 @@ class CustomFieldInstanceSerializer(serializers.ModelSerializer):
getattr(request, "user", None) if request is not None else None,
doc_ids,
)
elif field.data_type == CustomField.FieldDataType.DATE:
data["value"] = serializers.DateField().to_internal_value(data["value"])
return data
@@ -922,14 +925,14 @@ class CustomFieldInstanceSerializer(serializers.ModelSerializer):
]
class BasicUserSerializer(serializers.ModelSerializer):
class BasicUserSerializer(serializers.ModelSerializer[User]):
# Different than paperless.serializers.UserSerializer
class Meta:
model = User
fields = ["id", "username", "first_name", "last_name"]
class NotesSerializer(serializers.ModelSerializer):
class NotesSerializer(serializers.ModelSerializer[Note]):
user = BasicUserSerializer(read_only=True)
class Meta:
@@ -1256,7 +1259,7 @@ class DocumentSerializer(
list_serializer_class = OwnedObjectListSerializer
class SearchResultListSerializer(serializers.ListSerializer):
class SearchResultListSerializer(serializers.ListSerializer[Document]):
def to_representation(self, hits):
document_ids = [hit["id"] for hit in hits]
# Fetch all Document objects in the list in one SQL query.
@@ -1313,7 +1316,7 @@ class SearchResultSerializer(DocumentSerializer):
list_serializer_class = SearchResultListSerializer
class SavedViewFilterRuleSerializer(serializers.ModelSerializer):
class SavedViewFilterRuleSerializer(serializers.ModelSerializer[SavedViewFilterRule]):
class Meta:
model = SavedViewFilterRule
fields = ["rule_type", "value"]
@@ -1602,6 +1605,7 @@ class RotateDocumentsSerializer(DocumentSelectionSerializer, SourceModeValidatio
required=False,
default=bulk_edit.SourceModeChoices.LATEST_VERSION,
)
from_webui = serializers.BooleanField(required=False, default=False)
class MergeDocumentsSerializer(DocumentListSerializer, SourceModeValidationMixin):
@@ -1615,6 +1619,7 @@ class MergeDocumentsSerializer(DocumentListSerializer, SourceModeValidationMixin
required=False,
default=bulk_edit.SourceModeChoices.LATEST_VERSION,
)
from_webui = serializers.BooleanField(required=False, default=False)
class EditPdfDocumentsSerializer(DocumentListSerializer, SourceModeValidationMixin):
@@ -1626,6 +1631,7 @@ class EditPdfDocumentsSerializer(DocumentListSerializer, SourceModeValidationMix
required=False,
default=bulk_edit.SourceModeChoices.LATEST_VERSION,
)
from_webui = serializers.BooleanField(required=False, default=False)
def validate(self, attrs):
documents = attrs["documents"]
@@ -1677,6 +1683,7 @@ class RemovePasswordDocumentsSerializer(
required=False,
default=bulk_edit.SourceModeChoices.LATEST_VERSION,
)
from_webui = serializers.BooleanField(required=False, default=False)
class DeleteDocumentsSerializer(DocumentSelectionSerializer):
@@ -1724,6 +1731,7 @@ class BulkEditSerializer(
)
parameters = serializers.DictField(allow_empty=True, default={}, write_only=True)
from_webui = serializers.BooleanField(required=False, default=False)
def _validate_tag_id_list(self, tags, name="tags") -> None:
if not isinstance(tags, list):
@@ -2152,7 +2160,7 @@ class PostDocumentSerializer(serializers.Serializer):
def validate_document(self, document):
document_data = document.file.read()
mime_type = magic.from_buffer(document_data, mime=True)
mime_type = mime_detection.from_buffer(document_data)
if not is_mime_type_supported(mime_type):
if (
@@ -2396,12 +2404,15 @@ class StoragePathSerializer(MatchingModelSerializer, OwnedObjectSerializer):
"""
doc_ids = [doc.id for doc in instance.documents.all()]
if doc_ids:
bulk_edit.bulk_update_documents.delay(doc_ids)
bulk_edit.bulk_update_documents.apply_async(
kwargs={"document_ids": doc_ids},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
return super().update(instance, validated_data)
class UiSettingsViewSerializer(serializers.ModelSerializer):
class UiSettingsViewSerializer(serializers.ModelSerializer[UiSettings]):
settings = serializers.DictField(required=False, allow_null=True)
class Meta:
@@ -2429,7 +2440,79 @@ class UiSettingsViewSerializer(serializers.ModelSerializer):
return ui_settings
class TasksViewSerializer(OwnedObjectSerializer):
class TaskSerializerV10(OwnedObjectSerializer):
"""Task serializer for API v10+ using new field names."""
related_document_ids = serializers.ListField(
child=serializers.IntegerField(),
read_only=True,
)
task_type_display = serializers.CharField(
source="get_task_type_display",
read_only=True,
)
trigger_source_display = serializers.CharField(
source="get_trigger_source_display",
read_only=True,
)
status_display = serializers.CharField(
source="get_status_display",
read_only=True,
)
class Meta:
model = PaperlessTask
fields = (
"id",
"task_id",
"task_type",
"task_type_display",
"trigger_source",
"trigger_source_display",
"status",
"status_display",
"date_created",
"date_started",
"date_done",
"duration_seconds",
"wait_time_seconds",
"input_data",
"result_data",
"related_document_ids",
"acknowledged",
"owner",
)
read_only_fields = fields
class TaskSerializerV9(serializers.ModelSerializer):
"""Task serializer for API v9 backwards compatibility.
Maps old field names to the new model fields so existing clients continue
to work unchanged.
"""
# v9 field: task_name -> task_type (with value remapping for renamed tasks)
task_name = serializers.SerializerMethodField()
# v9 field: task_file_name -> input_data.filename
task_file_name = serializers.SerializerMethodField()
# v9 field: type -> trigger_source (mapped to old enum labels)
type = serializers.SerializerMethodField()
# v9 field: status -> uppercase Celery state strings
status = serializers.SerializerMethodField()
# v9 field: result -> derived from result_data
result = serializers.SerializerMethodField()
# v9 field: related_document -> first document ID from result_data
related_document = serializers.SerializerMethodField()
# v9 field: duplicate_documents -> list of duplicate IDs from result_data
duplicate_documents = serializers.SerializerMethodField()
class Meta:
model = PaperlessTask
fields = (
@@ -2437,59 +2520,113 @@ class TasksViewSerializer(OwnedObjectSerializer):
"task_id",
"task_name",
"task_file_name",
"date_created",
"date_done",
"type",
"status",
"date_created",
"date_done",
"result",
"acknowledged",
"related_document",
"duplicate_documents",
"owner",
)
read_only_fields = fields
related_document = serializers.SerializerMethodField()
duplicate_documents = serializers.SerializerMethodField()
created_doc_re = re.compile(r"New document id (\d+) created")
duplicate_doc_re = re.compile(r"It is a duplicate of .* \(#(\d+)\)")
_TASK_TYPE_TO_V9_NAME = {
PaperlessTask.TaskType.SANITY_CHECK: "check_sanity",
PaperlessTask.TaskType.LLM_INDEX: "llmindex_update",
}
def get_related_document(self, obj) -> str | None:
result = None
re = None
if obj.result:
match obj.status:
case states.SUCCESS:
re = self.created_doc_re
case states.FAILURE:
re = (
self.duplicate_doc_re
if "existing document is in the trash" not in obj.result
else None
)
if re is not None:
try:
result = re.search(obj.result).group(1)
except Exception:
pass
def get_result(self, obj: PaperlessTask) -> str | None:
"""Reconstruct a human-readable result string from result_data for v9 clients."""
if not obj.result_data:
return None
if doc_id := obj.result_data.get("document_id"):
return f"Success. New document id {doc_id} created"
if reason := obj.result_data.get("reason"):
return reason
if dup_id := obj.result_data.get("duplicate_of"):
return f"Not consuming: It is a duplicate of document #{dup_id}"
if error := obj.result_data.get("error_message"):
return error
return None
return result
def get_task_name(self, obj: PaperlessTask) -> str:
return self._TASK_TYPE_TO_V9_NAME.get(obj.task_type, obj.task_type)
@extend_schema_field(DuplicateDocumentSummarySerializer(many=True))
def get_duplicate_documents(self, obj):
related_document = self.get_related_document(obj)
request = self.context.get("request")
user = request.user if request else None
document = Document.global_objects.filter(pk=related_document).first()
if not related_document or not user or not document:
def get_task_file_name(self, obj: PaperlessTask) -> str | None:
if not obj.input_data:
return None
return obj.input_data.get("filename")
_STATUS_TO_V9 = {
PaperlessTask.Status.PENDING: "PENDING",
PaperlessTask.Status.STARTED: "STARTED",
PaperlessTask.Status.SUCCESS: "SUCCESS",
PaperlessTask.Status.FAILURE: "FAILURE",
PaperlessTask.Status.REVOKED: "REVOKED",
}
def get_status(self, obj: PaperlessTask) -> str:
return self._STATUS_TO_V9.get(obj.status, obj.status.upper())
_TRIGGER_SOURCE_TO_V9_TYPE = {
PaperlessTask.TriggerSource.SCHEDULED: "scheduled_task",
PaperlessTask.TriggerSource.SYSTEM: "auto_task",
# Email and folder-consumer documents are system-initiated, not manually triggered
PaperlessTask.TriggerSource.EMAIL_CONSUME: "auto_task",
PaperlessTask.TriggerSource.FOLDER_CONSUME: "auto_task",
}
def get_type(self, obj: PaperlessTask) -> str:
return self._TRIGGER_SOURCE_TO_V9_TYPE.get(obj.trigger_source, "manual_task")
def get_related_document(self, obj: PaperlessTask) -> int | None:
ids = obj.related_document_ids
return ids[0] if ids else None
def get_duplicate_documents(
self,
obj: PaperlessTask,
) -> list[dict[str, Any]]:
if not obj.result_data:
return []
duplicates = _get_viewable_duplicates(document, user)
return list(duplicates.values("id", "title", "deleted_at"))
dup_of = obj.result_data.get("duplicate_of")
if dup_of is None:
return []
request = self.context.get("request")
if request is None:
return []
user = request.user
qs = Document.global_objects.filter(pk=dup_of)
if not user.is_staff:
with_perms = get_objects_for_user(
user,
"documents.view_document",
qs,
accept_global_perms=False,
)
qs = with_perms | qs.filter(owner=user) | qs.filter(owner__isnull=True)
return list(qs.values("id", "title", "deleted_at"))
class RunTaskViewSerializer(serializers.Serializer[dict[str, Any]]):
task_name = serializers.ChoiceField(
choices=PaperlessTask.TaskName.choices,
label="Task Name",
class TaskSummarySerializer(serializers.Serializer):
task_type = serializers.CharField()
total_count = serializers.IntegerField()
pending_count = serializers.IntegerField()
success_count = serializers.IntegerField()
failure_count = serializers.IntegerField()
avg_duration_seconds = serializers.FloatField(allow_null=True)
avg_wait_time_seconds = serializers.FloatField(allow_null=True)
last_run = serializers.DateTimeField(allow_null=True)
last_success = serializers.DateTimeField(allow_null=True)
last_failure = serializers.DateTimeField(allow_null=True)
class RunTaskSerializer(serializers.Serializer):
task_type = serializers.ChoiceField(
choices=PaperlessTask.TaskType.choices,
label="Task Type",
write_only=True,
)
@@ -2760,7 +2897,7 @@ class BulkEditObjectsSerializer(SerializerWithPerms, SetPermissionsMixin):
return attrs
class WorkflowTriggerSerializer(serializers.ModelSerializer):
class WorkflowTriggerSerializer(serializers.ModelSerializer[WorkflowTrigger]):
id = serializers.IntegerField(required=False, allow_null=True)
sources = fields.MultipleChoiceField(
choices=WorkflowTrigger.DocumentSourceChoices.choices,
@@ -2870,7 +3007,7 @@ class WorkflowTriggerSerializer(serializers.ModelSerializer):
return super().update(instance, validated_data)
class WorkflowActionEmailSerializer(serializers.ModelSerializer):
class WorkflowActionEmailSerializer(serializers.ModelSerializer[WorkflowActionEmail]):
id = serializers.IntegerField(allow_null=True, required=False)
class Meta:
@@ -2884,7 +3021,9 @@ class WorkflowActionEmailSerializer(serializers.ModelSerializer):
]
class WorkflowActionWebhookSerializer(serializers.ModelSerializer):
class WorkflowActionWebhookSerializer(
serializers.ModelSerializer[WorkflowActionWebhook],
):
id = serializers.IntegerField(allow_null=True, required=False)
def validate_url(self, url):
@@ -2905,7 +3044,7 @@ class WorkflowActionWebhookSerializer(serializers.ModelSerializer):
]
class WorkflowActionSerializer(serializers.ModelSerializer):
class WorkflowActionSerializer(serializers.ModelSerializer[WorkflowAction]):
id = serializers.IntegerField(required=False, allow_null=True)
assign_correspondent = CorrespondentField(allow_null=True, required=False)
assign_tags = TagsField(many=True, allow_null=True, required=False)
@@ -3027,7 +3166,7 @@ class WorkflowActionSerializer(serializers.ModelSerializer):
return attrs
class WorkflowSerializer(serializers.ModelSerializer):
class WorkflowSerializer(serializers.ModelSerializer[Workflow]):
order = serializers.IntegerField(required=False)
triggers = WorkflowTriggerSerializer(many=True)
@@ -3214,13 +3353,13 @@ class WorkflowSerializer(serializers.ModelSerializer):
ManyToMany fields dont support e.g. on_delete so we need to discard unattached
triggers and actions manually
"""
for trigger in WorkflowTrigger.objects.all():
if trigger.workflows.all().count() == 0:
trigger.delete()
WorkflowTrigger.objects.annotate(
workflow_count=Count("workflows"),
).filter(workflow_count=0).delete()
for action in WorkflowAction.objects.all():
if action.workflows.all().count() == 0:
action.delete()
WorkflowAction.objects.annotate(
workflow_count=Count("workflows"),
).filter(workflow_count=0).delete()
WorkflowActionEmail.objects.filter(action=None).delete()
WorkflowActionWebhook.objects.filter(action=None).delete()
@@ -3249,16 +3388,6 @@ class WorkflowSerializer(serializers.ModelSerializer):
return instance
def to_representation(self, instance: Workflow) -> dict[str, Any]:
data = super().to_representation(instance)
actions = instance.actions.order_by("order", "pk")
data["actions"] = WorkflowActionSerializer(
actions,
many=True,
context=self.context,
).data
return data
class TrashSerializer(SerializerWithPerms):
documents = serializers.ListField(
+255 -61
View File
@@ -1,18 +1,20 @@
from __future__ import annotations
import datetime
import hashlib
import logging
import shutil
import traceback as _tb
from pathlib import Path
from typing import TYPE_CHECKING
from typing import Any
from celery import shared_task
from celery import states
from celery.signals import before_task_publish
from celery.signals import task_failure
from celery.signals import task_postrun
from celery.signals import task_prerun
from celery.signals import task_revoked
from celery.signals import worker_process_init
from django.conf import settings
from django.contrib.auth.models import Group
@@ -705,7 +707,7 @@ def check_paths_and_prune_custom_fields(
and instance.fields.count() > 0
and instance.extra_data
): # Only select fields, for now
process_cf_select_update.delay(instance)
process_cf_select_update.apply_async(kwargs={"custom_field": instance})
@receiver(models.signals.post_delete, sender=CustomField)
@@ -894,7 +896,6 @@ def run_workflows(
# Refresh this so the matching data is fresh and instance fields are re-freshed
# Otherwise, this instance might be behind and overwrite the work another process did
document.refresh_from_db()
doc_tag_ids = list(document.tags.values_list("pk", flat=True))
except Document.DoesNotExist:
# Document was hard deleted by a previous workflow or another process
logger.info(
@@ -928,14 +929,13 @@ def run_workflows(
apply_assignment_to_document(
action,
document,
doc_tag_ids,
logging_group,
)
elif action.type == WorkflowAction.WorkflowActionType.REMOVAL:
if use_overrides and overrides:
apply_removal_to_overrides(action, overrides)
else:
apply_removal_to_document(action, document, doc_tag_ids)
apply_removal_to_document(action, document)
elif action.type == WorkflowAction.WorkflowActionType.EMAIL:
context = build_workflow_action_context(document, overrides)
execute_email_action(
@@ -982,7 +982,6 @@ def run_workflows(
"modified",
],
)
document.tags.set(doc_tag_ids)
WorkflowRun.objects.create(
workflow=workflow,
@@ -999,68 +998,168 @@ def run_workflows(
return overrides, "\n".join(messages)
@before_task_publish.connect
def before_task_publish_handler(sender=None, headers=None, body=None, **kwargs) -> None:
# ---------------------------------------------------------------------------
# Task tracking -- Celery signal handlers
# ---------------------------------------------------------------------------
TRACKED_TASKS: dict[str, PaperlessTask.TaskType] = {
"documents.tasks.consume_file": PaperlessTask.TaskType.CONSUME_FILE,
"documents.tasks.train_classifier": PaperlessTask.TaskType.TRAIN_CLASSIFIER,
"documents.tasks.sanity_check": PaperlessTask.TaskType.SANITY_CHECK,
"documents.tasks.llmindex_index": PaperlessTask.TaskType.LLM_INDEX,
"documents.tasks.empty_trash": PaperlessTask.TaskType.EMPTY_TRASH,
"documents.tasks.check_scheduled_workflows": PaperlessTask.TaskType.CHECK_WORKFLOWS,
"paperless_mail.tasks.process_mail_accounts": PaperlessTask.TaskType.MAIL_FETCH,
"documents.tasks.bulk_update_documents": PaperlessTask.TaskType.BULK_UPDATE,
"documents.tasks.update_document_content_maybe_archive_file": PaperlessTask.TaskType.REPROCESS_DOCUMENT,
"documents.tasks.build_share_link_bundle": PaperlessTask.TaskType.BUILD_SHARE_LINK,
"documents.bulk_edit.delete": PaperlessTask.TaskType.BULK_DELETE,
}
_CELERY_STATE_TO_STATUS: dict[str, PaperlessTask.Status] = {
"SUCCESS": PaperlessTask.Status.SUCCESS,
"FAILURE": PaperlessTask.Status.FAILURE,
"REVOKED": PaperlessTask.Status.REVOKED,
}
def _extract_input_data(
task_type: PaperlessTask.TaskType,
task_kwargs: dict,
) -> dict:
"""Build the input_data dict stored on the PaperlessTask record.
For consume_file tasks this includes the filename, MIME type, and any
non-null overrides from the DocumentMetadataOverrides object. For
mail_fetch tasks it captures the account_ids list. All other task
types store no input data and return {}.
"""
Creates the PaperlessTask object in a pending state. This is sent before
the task reaches the broker, but before it begins executing on a worker.
if task_type == PaperlessTask.TaskType.CONSUME_FILE:
input_doc = task_kwargs.get("input_doc")
overrides = task_kwargs.get("overrides")
if input_doc is None:
return {}
data: dict = {
"filename": input_doc.original_file.name,
"mime_type": input_doc.mime_type,
}
if input_doc.original_path: # pragma: no cover
data["source_path"] = str(input_doc.original_path)
if input_doc.mailrule_id: # pragma: no cover
data["mailrule_id"] = input_doc.mailrule_id
if overrides:
override_dict = {}
for k, v in vars(overrides).items():
if v is None or k.startswith("_"):
continue
if isinstance(v, datetime.date):
v = v.isoformat()
elif isinstance(v, Path):
v = str(v)
override_dict[k] = v
if override_dict:
data["overrides"] = override_dict
return data
if task_type == PaperlessTask.TaskType.MAIL_FETCH:
account_ids = task_kwargs.get("account_ids")
if account_ids is not None:
return {"account_ids": account_ids}
return {}
return {}
def _determine_trigger_source(
headers: dict,
) -> PaperlessTask.TriggerSource:
"""Resolve the TriggerSource for a task being published to the broker.
Reads the trigger_source header set by the caller; falls back to MANUAL
when the header is absent or contains an unrecognised value.
"""
header_source = headers.get("trigger_source")
if header_source is not None:
try:
return PaperlessTask.TriggerSource(header_source)
except ValueError:
pass
return PaperlessTask.TriggerSource.MANUAL
def _extract_owner_id(
task_type: PaperlessTask.TaskType,
task_kwargs: dict,
) -> int | None:
"""Return the owner_id from consume_file overrides, or None for all other task types."""
if task_type != PaperlessTask.TaskType.CONSUME_FILE:
return None
overrides = task_kwargs.get("overrides")
if overrides and hasattr(overrides, "owner_id"):
return overrides.owner_id
return None # pragma: no cover
@before_task_publish.connect
def before_task_publish_handler(
sender=None,
headers=None,
body=None,
**kwargs,
) -> None:
"""
Creates the PaperlessTask record when the task is published to broker.
https://docs.celeryq.dev/en/stable/userguide/signals.html#before-task-publish
https://docs.celeryq.dev/en/stable/internals/protocol.html#version-2
"""
if "task" not in headers or headers["task"] != "documents.tasks.consume_file":
# Assumption: this is only ever a v2 message
if headers is None or body is None:
return
task_name = headers.get("task", "")
task_type = TRACKED_TASKS.get(task_name)
if task_type is None:
return
try:
close_old_connections()
_, task_kwargs, _ = body
task_id = headers["id"]
task_args = body[0]
input_doc, overrides = task_args
task_file_name = input_doc.original_file.name
user_id = overrides.owner_id if overrides else None
input_data = _extract_input_data(task_type, task_kwargs)
trigger_source = _determine_trigger_source(headers)
owner_id = _extract_owner_id(task_type, task_kwargs)
PaperlessTask.objects.create(
type=PaperlessTask.TaskType.AUTO,
task_id=headers["id"],
status=states.PENDING,
task_file_name=task_file_name,
task_name=PaperlessTask.TaskName.CONSUME_FILE,
result=None,
date_created=timezone.now(),
date_started=None,
date_done=None,
owner_id=user_id,
task_id=task_id,
task_type=task_type,
trigger_source=trigger_source,
status=PaperlessTask.Status.PENDING,
input_data=input_data,
owner_id=owner_id,
)
except Exception: # pragma: no cover
# Don't let an exception in the signal handlers prevent
# a document from being consumed.
logger.exception("Creating PaperlessTask failed")
@task_prerun.connect
def task_prerun_handler(sender=None, task_id=None, task=None, **kwargs) -> None:
"""
Updates the PaperlessTask to be started. Sent before the task begins execution
on a worker.
Marks the task STARTED when execution begins on a worker.
https://docs.celeryq.dev/en/stable/userguide/signals.html#task-prerun
"""
if task_id is None: # pragma: no cover
return
if task and task.name not in TRACKED_TASKS:
return
try:
close_old_connections()
task_instance = PaperlessTask.objects.filter(task_id=task_id).first()
if task_instance is not None:
task_instance.status = states.STARTED
task_instance.date_started = timezone.now()
task_instance.save()
PaperlessTask.objects.filter(task_id=task_id).update(
status=PaperlessTask.Status.STARTED,
date_started=timezone.now(),
)
except Exception: # pragma: no cover
# Don't let an exception in the signal handlers prevent
# a document from being consumed.
logger.exception("Setting PaperlessTask started failed")
@@ -1074,22 +1173,54 @@ def task_postrun_handler(
**kwargs,
) -> None:
"""
Updates the result of the PaperlessTask.
Records task completion and result data for non-failure outcomes.
Skips FAILURE states entirely, since task_failure_handler fires first
and fully owns the failure path (status, date_done, duration, result_data).
https://docs.celeryq.dev/en/stable/userguide/signals.html#task-postrun
"""
if task_id is None: # pragma: no cover
return
if task and task.name not in TRACKED_TASKS:
return
try:
close_old_connections()
task_instance = PaperlessTask.objects.filter(task_id=task_id).first()
if task_instance is not None:
task_instance.status = state or states.FAILURE
task_instance.result = retval
task_instance.date_done = timezone.now()
task_instance.save()
new_status = _CELERY_STATE_TO_STATUS.get(state, PaperlessTask.Status.FAILURE)
if new_status == PaperlessTask.Status.FAILURE:
return
now = timezone.now()
try:
task_instance = PaperlessTask.objects.get(task_id=task_id)
except PaperlessTask.DoesNotExist:
return
task_instance.status = new_status
task_instance.date_done = now
changed_fields = ["status", "date_done"]
if task_instance.date_started:
task_instance.duration_seconds = (
now - task_instance.date_started
).total_seconds()
changed_fields.append("duration_seconds")
if task_instance.date_started and task_instance.date_created:
task_instance.wait_time_seconds = (
task_instance.date_started - task_instance.date_created
).total_seconds()
changed_fields.append("wait_time_seconds")
if isinstance(retval, dict):
task_instance.result_data = retval
changed_fields.append("result_data")
if "duplicate_of" in retval:
task_instance.status = PaperlessTask.Status.FAILURE
changed_fields.append("status")
task_instance.save(update_fields=changed_fields)
except Exception: # pragma: no cover
# Don't let an exception in the signal handlers prevent
# a document from being consumed.
logger.exception("Updating PaperlessTask failed")
@@ -1103,21 +1234,84 @@ def task_failure_handler(
**kwargs,
) -> None:
"""
Updates the result of a failed PaperlessTask.
Records failure details when a task raises an exception.
Fully owns the FAILURE path. task_postrun_handler skips FAILURE
states so there is no overlap.
https://docs.celeryq.dev/en/stable/userguide/signals.html#task-failure
"""
if task_id is None: # pragma: no cover
return
if sender and sender.name not in TRACKED_TASKS: # pragma: no cover
return
try:
close_old_connections()
task_instance = PaperlessTask.objects.filter(task_id=task_id).first()
if task_instance is not None and task_instance.result is None:
task_instance.status = states.FAILURE
task_instance.result = traceback
task_instance.date_done = timezone.now()
task_instance.save()
result_data: dict = {
"error_type": type(exception).__name__ if exception else "Unknown",
"error_message": str(exception) if exception else "Unknown error",
}
if traceback:
tb_str = "".join(_tb.format_tb(traceback))
result_data["traceback"] = tb_str[:5000]
now = timezone.now()
update_fields: dict = {
"status": PaperlessTask.Status.FAILURE,
"result_data": result_data,
"date_done": now,
}
task_qs = PaperlessTask.objects.filter(task_id=task_id)
task_instance = task_qs.values("date_started", "date_created").first()
if task_instance:
date_started = task_instance["date_started"]
if date_started:
update_fields["duration_seconds"] = (now - date_started).total_seconds()
date_created = task_instance["date_created"]
if date_started and date_created:
update_fields["wait_time_seconds"] = (
date_started - date_created
).total_seconds()
task_qs.update(**update_fields)
except Exception: # pragma: no cover
logger.exception("Updating PaperlessTask failed")
logger.exception("Updating PaperlessTask on failure failed")
@task_revoked.connect
def task_revoked_handler(
sender=None,
request=None,
*,
terminated: bool = False,
signum=None,
expired: bool = False,
**kwargs,
) -> None:
"""
Marks the task REVOKED when it is cancelled before or during execution.
This fires for tasks revoked while still queued (before task_prerun) as
well as for tasks terminated mid-run. task_postrun does NOT fire for
pre-start revocations, so this handler is the only way to move those
records out of PENDING.
https://docs.celeryq.dev/en/stable/userguide/signals.html#task-revoked
"""
task_id = request.id if request else None
if task_id is None: # pragma: no cover
return
if sender and sender.name not in TRACKED_TASKS: # pragma: no cover
return
try:
close_old_connections()
PaperlessTask.objects.filter(task_id=task_id).update(
status=PaperlessTask.Status.REVOKED,
date_done=timezone.now(),
)
except Exception: # pragma: no cover
logger.exception("Updating PaperlessTask on revocation failed")
@worker_process_init.connect
@@ -1143,7 +1337,7 @@ def add_or_update_document_in_llm_index(sender, document, **kwargs):
if ai_config.llm_index_enabled:
from documents.tasks import update_document_in_llm_index
update_document_in_llm_index.delay(document)
update_document_in_llm_index.apply_async(kwargs={"document": document})
@receiver(models.signals.post_delete, sender=Document)
@@ -1159,4 +1353,4 @@ def delete_document_from_llm_index(
if ai_config.llm_index_enabled:
from documents.tasks import remove_document_from_llm_index
remove_document_from_llm_index.delay(instance)
remove_document_from_llm_index.apply_async(kwargs={"document": instance})
+47 -77
View File
@@ -10,7 +10,6 @@ from tempfile import mkstemp
from celery import Task
from celery import shared_task
from celery import states
from django.conf import settings
from django.contrib.contenttypes.models import ContentType
from django.db import models
@@ -27,11 +26,15 @@ from documents.caching import clear_document_caches
from documents.classifier import DocumentClassifier
from documents.classifier import load_classifier
from documents.consumer import AsnCheckPlugin
from documents.consumer import ConsumeFileDuplicateError
from documents.consumer import ConsumerPlugin
from documents.consumer import ConsumerPreflightPlugin
from documents.consumer import WorkflowTriggerPlugin
from documents.consumer import should_produce_archive
from documents.data_models import ConsumableDocument
from documents.data_models import ConsumeFileDuplicateResult
from documents.data_models import ConsumeFileStoppedResult
from documents.data_models import ConsumeFileSuccessResult
from documents.data_models import DocumentMetadataOverrides
from documents.double_sided import CollatePlugin
from documents.file_handling import create_source_path_directory
@@ -84,19 +87,8 @@ def index_optimize() -> None:
@shared_task
def train_classifier(
*,
scheduled=True,
status_callback: Callable[[str], None] | None = None,
) -> None:
task = PaperlessTask.objects.create(
type=PaperlessTask.TaskType.SCHEDULED_TASK
if scheduled
else PaperlessTask.TaskType.MANUAL_TASK,
task_id=uuid.uuid4(),
task_name=PaperlessTask.TaskName.TRAIN_CLASSIFIER,
status=states.STARTED,
date_created=timezone.now(),
date_started=timezone.now(),
)
) -> str:
if (
not Tag.objects.filter(matching_algorithm=Tag.MATCH_AUTO).exists()
and not DocumentType.objects.filter(matching_algorithm=Tag.MATCH_AUTO).exists()
@@ -107,40 +99,25 @@ def train_classifier(
logger.info(result)
# Special case, items were once auto and trained, so remove the model
# and prevent its use again
if settings.MODEL_FILE.exists():
if settings.MODEL_FILE.exists(): # pragma: no cover
logger.info(f"Removing {settings.MODEL_FILE} so it won't be used")
settings.MODEL_FILE.unlink()
task.status = states.SUCCESS
task.result = result
task.date_done = timezone.now()
task.save()
return
return result
classifier = load_classifier()
if not classifier:
classifier = DocumentClassifier()
try:
if classifier.train(status_callback=status_callback):
logger.info(
f"Saving updated classifier model to {settings.MODEL_FILE}...",
)
classifier.save()
task.result = "Training completed successfully"
else:
logger.debug("Training data unchanged.")
task.result = "Training data unchanged"
task.status = states.SUCCESS
except Exception as e:
logger.warning("Classifier error: " + str(e))
task.status = states.FAILURE
task.result = str(e)
task.date_done = timezone.now()
task.save(update_fields=["status", "result", "date_done"])
if classifier.train(status_callback=status_callback):
logger.info(
f"Saving updated classifier model to {settings.MODEL_FILE}...",
)
classifier.save()
return "Training completed successfully"
else:
logger.debug("Training data unchanged.")
return "Training data unchanged"
@shared_task(bind=True)
@@ -148,6 +125,11 @@ def consume_file(
self: Task,
input_doc: ConsumableDocument,
overrides: DocumentMetadataOverrides | None = None,
) -> (
ConsumeFileSuccessResult
| ConsumeFileStoppedResult
| ConsumeFileDuplicateResult
| None
):
token = consume_task_id.set((self.request.id or "")[:8])
try:
@@ -180,6 +162,7 @@ def consume_file(
TemporaryDirectory(dir=settings.SCRATCH_DIR) as tmp_dir,
):
tmp_dir = Path(tmp_dir)
msg = None
for plugin_class in plugins:
plugin_name = plugin_class.NAME
@@ -210,7 +193,14 @@ def consume_file(
except StopConsumeTaskError as e:
logger.info(f"{plugin_name} requested task exit: {e.message}")
return e.message
return ConsumeFileStoppedResult(reason=e.message)
except ConsumeFileDuplicateError as e:
logger.info(f"{plugin_name} rejected duplicate: {e}")
return ConsumeFileDuplicateResult(
duplicate_of=e.duplicate_id,
duplicate_in_trash=e.in_trash,
)
except Exception as e:
logger.exception(f"{plugin_name} failed: {e}")
@@ -231,8 +221,8 @@ def consume_file(
@shared_task
def sanity_check(*, scheduled=True, raise_on_error=True):
messages = sanity_checker.check_sanity(scheduled=scheduled)
def sanity_check(*, raise_on_error: bool = True) -> str:
messages = sanity_checker.check_sanity()
messages.log_messages()
if not messages.has_error and not messages.has_warning and not messages.has_info:
@@ -628,49 +618,29 @@ def update_document_parent_tags(tag: Tag, new_parent: Tag) -> None:
)
if affected:
bulk_update_documents.delay(document_ids=list(affected))
bulk_update_documents.apply_async(
kwargs={"document_ids": list(affected)},
headers={"trigger_source": PaperlessTask.TriggerSource.SYSTEM},
)
@shared_task
def llmindex_index(
*,
iter_wrapper: IterWrapper[Document] = identity,
rebuild=False,
scheduled=True,
auto=False,
) -> None:
rebuild: bool = False,
) -> str | None:
ai_config = AIConfig()
if ai_config.llm_index_enabled:
task = PaperlessTask.objects.create(
type=PaperlessTask.TaskType.SCHEDULED_TASK
if scheduled
else PaperlessTask.TaskType.AUTO
if auto
else PaperlessTask.TaskType.MANUAL_TASK,
task_id=uuid.uuid4(),
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
status=states.STARTED,
date_created=timezone.now(),
date_started=timezone.now(),
)
from paperless_ai.indexing import update_llm_index
try:
result = update_llm_index(
iter_wrapper=iter_wrapper,
rebuild=rebuild,
)
task.status = states.SUCCESS
task.result = result
except Exception as e:
logger.error("LLM index error: " + str(e))
task.status = states.FAILURE
task.result = str(e)
task.date_done = timezone.now()
task.save(update_fields=["status", "result", "date_done"])
else:
if not ai_config.llm_index_enabled: # pragma: no cover
logger.info("LLM index is disabled, skipping update.")
return None
from paperless_ai.indexing import update_llm_index
return update_llm_index(
iter_wrapper=iter_wrapper,
rebuild=rebuild,
)
@shared_task
+30 -9
View File
@@ -13,6 +13,8 @@ from rest_framework.test import APIClient
from documents.tests.factories import DocumentFactory
UserModelT = get_user_model()
if TYPE_CHECKING:
from documents.models import Document
@@ -126,15 +128,34 @@ def rest_api_client():
yield APIClient()
@pytest.fixture
def authenticated_rest_api_client(rest_api_client: APIClient):
"""
The basic DRF ApiClient which has been authenticated
"""
UserModel = get_user_model()
user = UserModel.objects.create_user(username="testuser", password="password")
rest_api_client.force_authenticate(user=user)
yield rest_api_client
@pytest.fixture()
def regular_user(django_user_model: type[UserModelT]) -> UserModelT:
"""Unprivileged authenticated user for permission boundary tests."""
return django_user_model.objects.create_user(username="regular", password="regular")
@pytest.fixture()
def admin_client(rest_api_client: APIClient, admin_user: UserModelT) -> APIClient:
"""Admin client pre-authenticated and sending the v10 Accept header."""
rest_api_client.force_authenticate(user=admin_user)
rest_api_client.credentials(HTTP_ACCEPT="application/json; version=10")
return rest_api_client
@pytest.fixture()
def v9_client(rest_api_client: APIClient, admin_user: UserModelT) -> APIClient:
"""Admin client pre-authenticated and sending the v9 Accept header."""
rest_api_client.force_authenticate(user=admin_user)
rest_api_client.credentials(HTTP_ACCEPT="application/json; version=9")
return rest_api_client
@pytest.fixture()
def user_client(rest_api_client: APIClient, regular_user: UserModelT) -> APIClient:
"""Regular-user client pre-authenticated and sending the v10 Accept header."""
rest_api_client.force_authenticate(user=regular_user)
rest_api_client.credentials(HTTP_ACCEPT="application/json; version=10")
return rest_api_client
@pytest.fixture(scope="session", autouse=True)
+14
View File
@@ -11,6 +11,7 @@ from documents.models import Correspondent
from documents.models import Document
from documents.models import DocumentType
from documents.models import MatchingModel
from documents.models import PaperlessTask
from documents.models import StoragePath
from documents.models import Tag
@@ -65,3 +66,16 @@ class DocumentFactory(DjangoModelFactory):
correspondent = None
document_type = None
storage_path = None
class PaperlessTaskFactory(DjangoModelFactory):
class Meta:
model = PaperlessTask
task_id = factory.Faker("uuid4")
task_type = PaperlessTask.TaskType.CONSUME_FILE
trigger_source = PaperlessTask.TriggerSource.WEB_UI
status = PaperlessTask.Status.PENDING
input_data = factory.LazyFunction(dict)
result_data = None
acknowledged = False
+236 -329
View File
@@ -33,19 +33,12 @@ class TestWriteBatch:
except RuntimeError:
pass
r = backend.search(
"should survive",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert r.total == 1
ids = backend.search_ids("should survive", user=None)
assert len(ids) == 1
class TestSearch:
"""Test search functionality."""
"""Test search query parsing and matching via search_ids."""
def test_text_mode_limits_default_search_to_title_and_content(
self,
@@ -60,27 +53,20 @@ class TestSearch:
)
backend.add_or_update(doc)
metadata_only = backend.search(
"document_type:invoice",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
assert (
len(
backend.search_ids(
"document_type:invoice",
user=None,
search_mode=SearchMode.TEXT,
),
)
== 0
)
assert metadata_only.total == 0
content_match = backend.search(
"monthly",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
assert (
len(backend.search_ids("monthly", user=None, search_mode=SearchMode.TEXT))
== 1
)
assert content_match.total == 1
def test_title_mode_limits_default_search_to_title_only(
self,
@@ -95,27 +81,14 @@ class TestSearch:
)
backend.add_or_update(doc)
content_only = backend.search(
"monthly",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TITLE,
assert (
len(backend.search_ids("monthly", user=None, search_mode=SearchMode.TITLE))
== 0
)
assert content_only.total == 0
title_match = backend.search(
"invoice",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TITLE,
assert (
len(backend.search_ids("invoice", user=None, search_mode=SearchMode.TITLE))
== 1
)
assert title_match.total == 1
def test_text_mode_matches_partial_term_substrings(
self,
@@ -130,38 +103,16 @@ class TestSearch:
)
backend.add_or_update(doc)
prefix_match = backend.search(
"pass",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
assert (
len(backend.search_ids("pass", user=None, search_mode=SearchMode.TEXT)) == 1
)
assert prefix_match.total == 1
infix_match = backend.search(
"sswo",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
assert (
len(backend.search_ids("sswo", user=None, search_mode=SearchMode.TEXT)) == 1
)
assert infix_match.total == 1
phrase_match = backend.search(
"sswo re",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
assert (
len(backend.search_ids("sswo re", user=None, search_mode=SearchMode.TEXT))
== 1
)
assert phrase_match.total == 1
def test_text_mode_does_not_match_on_partial_term_overlap(
self,
@@ -176,16 +127,10 @@ class TestSearch:
)
backend.add_or_update(doc)
non_match = backend.search(
"raptor",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
assert (
len(backend.search_ids("raptor", user=None, search_mode=SearchMode.TEXT))
== 0
)
assert non_match.total == 0
def test_text_mode_anchors_later_query_tokens_to_token_starts(
self,
@@ -214,16 +159,9 @@ class TestSearch:
backend.add_or_update(prefix_doc)
backend.add_or_update(false_positive)
results = backend.search(
"Z-Berichte 6",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
result_ids = set(
backend.search_ids("Z-Berichte 6", user=None, search_mode=SearchMode.TEXT),
)
result_ids = {hit["id"] for hit in results.hits}
assert exact_doc.id in result_ids
assert prefix_doc.id in result_ids
@@ -242,16 +180,9 @@ class TestSearch:
)
backend.add_or_update(doc)
no_tokens = backend.search(
"!!!",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TEXT,
assert (
len(backend.search_ids("!!!", user=None, search_mode=SearchMode.TEXT)) == 0
)
assert no_tokens.total == 0
def test_title_mode_matches_partial_term_substrings(
self,
@@ -266,59 +197,18 @@ class TestSearch:
)
backend.add_or_update(doc)
prefix_match = backend.search(
"pass",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TITLE,
assert (
len(backend.search_ids("pass", user=None, search_mode=SearchMode.TITLE))
== 1
)
assert prefix_match.total == 1
infix_match = backend.search(
"sswo",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TITLE,
assert (
len(backend.search_ids("sswo", user=None, search_mode=SearchMode.TITLE))
== 1
)
assert infix_match.total == 1
phrase_match = backend.search(
"sswo gu",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.TITLE,
assert (
len(backend.search_ids("sswo gu", user=None, search_mode=SearchMode.TITLE))
== 1
)
assert phrase_match.total == 1
def test_scores_normalised_top_hit_is_one(self, backend: TantivyBackend):
"""Search scores must be normalized so top hit has score 1.0 for UI consistency."""
for i, title in enumerate(["bank invoice", "bank statement", "bank receipt"]):
doc = Document.objects.create(
title=title,
content=title,
checksum=f"SN{i}",
pk=10 + i,
)
backend.add_or_update(doc)
r = backend.search(
"bank",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert r.hits[0]["score"] == pytest.approx(1.0)
assert all(0.0 <= h["score"] <= 1.0 for h in r.hits)
def test_sort_field_ascending(self, backend: TantivyBackend):
"""Searching with sort_reverse=False must return results in ascending ASN order."""
@@ -331,16 +221,14 @@ class TestSearch:
)
backend.add_or_update(doc)
r = backend.search(
ids = backend.search_ids(
"sortable",
user=None,
page=1,
page_size=10,
sort_field="archive_serial_number",
sort_reverse=False,
)
assert r.total == 3
asns = [Document.objects.get(pk=h["id"]).archive_serial_number for h in r.hits]
assert len(ids) == 3
asns = [Document.objects.get(pk=doc_id).archive_serial_number for doc_id in ids]
assert asns == [10, 20, 30]
def test_sort_field_descending(self, backend: TantivyBackend):
@@ -354,79 +242,91 @@ class TestSearch:
)
backend.add_or_update(doc)
r = backend.search(
ids = backend.search_ids(
"sortable",
user=None,
page=1,
page_size=10,
sort_field="archive_serial_number",
sort_reverse=True,
)
assert r.total == 3
asns = [Document.objects.get(pk=h["id"]).archive_serial_number for h in r.hits]
assert len(ids) == 3
asns = [Document.objects.get(pk=doc_id).archive_serial_number for doc_id in ids]
assert asns == [30, 20, 10]
def test_fuzzy_threshold_filters_low_score_hits(
self,
backend: TantivyBackend,
settings,
):
"""When ADVANCED_FUZZY_SEARCH_THRESHOLD exceeds all normalized scores, hits must be filtered out."""
doc = Document.objects.create(
title="Invoice document",
content="financial report",
checksum="FT1",
pk=120,
)
backend.add_or_update(doc)
# Threshold above 1.0 filters every hit (normalized scores top out at 1.0)
settings.ADVANCED_FUZZY_SEARCH_THRESHOLD = 1.1
r = backend.search(
"invoice",
class TestSearchIds:
"""Test lightweight ID-only search."""
def test_returns_matching_ids(self, backend: TantivyBackend):
"""search_ids must return IDs of all matching documents."""
docs = []
for i in range(5):
doc = Document.objects.create(
title=f"findable doc {i}",
content="common keyword",
checksum=f"SI{i}",
)
backend.add_or_update(doc)
docs.append(doc)
other = Document.objects.create(
title="unrelated",
content="nothing here",
checksum="SI_other",
)
backend.add_or_update(other)
ids = backend.search_ids(
"common keyword",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
search_mode=SearchMode.QUERY,
)
assert r.hits == []
assert set(ids) == {d.pk for d in docs}
assert other.pk not in ids
def test_owner_filter(self, backend: TantivyBackend):
"""Document owners can search their private documents; other users cannot access them."""
owner = User.objects.create_user("owner")
other = User.objects.create_user("other")
def test_respects_permission_filter(self, backend: TantivyBackend):
"""search_ids must respect user permission filtering."""
owner = User.objects.create_user("ids_owner")
other = User.objects.create_user("ids_other")
doc = Document.objects.create(
title="Private",
content="secret",
checksum="PF1",
pk=20,
title="private doc",
content="secret keyword",
checksum="SIP1",
owner=owner,
)
backend.add_or_update(doc)
assert backend.search_ids(
"secret",
user=owner,
search_mode=SearchMode.QUERY,
) == [doc.pk]
assert (
backend.search(
"secret",
user=owner,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
).total
== 1
backend.search_ids("secret", user=other, search_mode=SearchMode.QUERY) == []
)
assert (
backend.search(
"secret",
user=other,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
).total
== 0
def test_respects_fuzzy_threshold(self, backend: TantivyBackend, settings):
"""search_ids must apply the same fuzzy threshold as search()."""
doc = Document.objects.create(
title="threshold test",
content="unique term",
checksum="SIT1",
)
backend.add_or_update(doc)
settings.ADVANCED_FUZZY_SEARCH_THRESHOLD = 1.1
ids = backend.search_ids("unique", user=None, search_mode=SearchMode.QUERY)
assert ids == []
def test_returns_ids_for_text_mode(self, backend: TantivyBackend):
"""search_ids must work with TEXT search mode."""
doc = Document.objects.create(
title="text mode doc",
content="findable phrase",
checksum="SIM1",
)
backend.add_or_update(doc)
ids = backend.search_ids("findable", user=None, search_mode=SearchMode.TEXT)
assert ids == [doc.pk]
class TestRebuild:
@@ -490,57 +390,26 @@ class TestAutocomplete:
class TestMoreLikeThis:
"""Test more like this functionality."""
def test_excludes_original(self, backend: TantivyBackend):
"""More like this queries must exclude the reference document from results."""
def test_more_like_this_ids_excludes_original(self, backend: TantivyBackend):
"""more_like_this_ids must return IDs of similar documents, excluding the original."""
doc1 = Document.objects.create(
title="Important document",
content="financial information",
checksum="MLT1",
pk=50,
content="financial information report",
checksum="MLTI1",
pk=150,
)
doc2 = Document.objects.create(
title="Another document",
content="financial report",
checksum="MLT2",
pk=51,
content="financial information report",
checksum="MLTI2",
pk=151,
)
backend.add_or_update(doc1)
backend.add_or_update(doc2)
results = backend.more_like_this(doc_id=50, user=None, page=1, page_size=10)
returned_ids = [hit["id"] for hit in results.hits]
assert 50 not in returned_ids # Original document excluded
def test_with_user_applies_permission_filter(self, backend: TantivyBackend):
"""more_like_this with a user must exclude documents that user cannot see."""
viewer = User.objects.create_user("mlt_viewer")
other = User.objects.create_user("mlt_other")
public_doc = Document.objects.create(
title="Public financial document",
content="quarterly financial analysis report figures",
checksum="MLT3",
pk=52,
)
private_doc = Document.objects.create(
title="Private financial document",
content="quarterly financial analysis report figures",
checksum="MLT4",
pk=53,
owner=other,
)
backend.add_or_update(public_doc)
backend.add_or_update(private_doc)
results = backend.more_like_this(doc_id=52, user=viewer, page=1, page_size=10)
returned_ids = [hit["id"] for hit in results.hits]
# private_doc is owned by other, so viewer cannot see it
assert 53 not in returned_ids
def test_document_not_in_index_returns_empty(self, backend: TantivyBackend):
"""more_like_this for a doc_id absent from the index must return empty results."""
results = backend.more_like_this(doc_id=9999, user=None, page=1, page_size=10)
assert results.hits == []
assert results.total == 0
ids = backend.more_like_this_ids(doc_id=150, user=None)
assert 150 not in ids
assert 151 in ids
class TestSingleton:
@@ -593,19 +462,10 @@ class TestFieldHandling:
# Should not raise an exception
backend.add_or_update(doc)
results = backend.search(
"test",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 1
assert len(backend.search_ids("test", user=None)) == 1
def test_custom_fields_include_name_and_value(self, backend: TantivyBackend):
"""Custom fields must be indexed with both field name and value for structured queries."""
# Create a custom field
field = CustomField.objects.create(
name="Invoice Number",
data_type=CustomField.FieldDataType.STRING,
@@ -622,18 +482,9 @@ class TestFieldHandling:
value_text="INV-2024-001",
)
# Should not raise an exception during indexing
backend.add_or_update(doc)
results = backend.search(
"invoice",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 1
assert len(backend.search_ids("invoice", user=None)) == 1
def test_select_custom_field_indexes_label_not_id(self, backend: TantivyBackend):
"""SELECT custom fields must index the human-readable label, not the opaque option ID."""
@@ -660,27 +511,8 @@ class TestFieldHandling:
)
backend.add_or_update(doc)
# Label should be findable
results = backend.search(
"custom_fields.value:invoice",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 1
# Opaque ID must not appear in the index
results = backend.search(
"custom_fields.value:opt_abc",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 0
assert len(backend.search_ids("custom_fields.value:invoice", user=None)) == 1
assert len(backend.search_ids("custom_fields.value:opt_abc", user=None)) == 0
def test_none_custom_field_value_not_indexed(self, backend: TantivyBackend):
"""Custom field instances with no value set must not produce an index entry."""
@@ -702,16 +534,7 @@ class TestFieldHandling:
)
backend.add_or_update(doc)
# The string "none" must not appear as an indexed value
results = backend.search(
"custom_fields.value:none",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 0
assert len(backend.search_ids("custom_fields.value:none", user=None)) == 0
def test_notes_include_user_information(self, backend: TantivyBackend):
"""Notes must be indexed with user information when available for structured queries."""
@@ -724,32 +547,116 @@ class TestFieldHandling:
)
Note.objects.create(document=doc, note="Important note", user=user)
# Should not raise an exception during indexing
backend.add_or_update(doc)
# Test basic document search first
results = backend.search(
"test",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
)
assert results.total == 1, (
f"Expected 1, got {results.total}. Document content should be searchable."
ids = backend.search_ids("test", user=None)
assert len(ids) == 1, (
f"Expected 1, got {len(ids)}. Document content should be searchable."
)
# Test notes search — must use structured JSON syntax now that note
# is no longer in DEFAULT_SEARCH_FIELDS
results = backend.search(
"notes.note:important",
user=None,
page=1,
page_size=10,
sort_field=None,
sort_reverse=False,
ids = backend.search_ids("notes.note:important", user=None)
assert len(ids) == 1, (
f"Expected 1, got {len(ids)}. Note content should be searchable via notes.note: prefix."
)
assert results.total == 1, (
f"Expected 1, got {results.total}. Note content should be searchable via notes.note: prefix."
class TestHighlightHits:
"""Test highlight_hits returns proper HTML strings, not raw Snippet objects."""
def test_highlights_simple_text_mode_returns_html_string(
self,
backend: TantivyBackend,
):
"""Simple text search should still produce content highlights for exact-token hits."""
doc = Document.objects.create(
title="Highlight Test",
content="The quick brown fox jumps over the lazy dog",
checksum="HH0",
pk=89,
)
backend.add_or_update(doc)
hits = backend.highlight_hits("quick", [doc.pk], search_mode=SearchMode.TEXT)
assert len(hits) == 1
highlights = hits[0]["highlights"]
assert "content" in highlights
assert "<b>" in highlights["content"]
def test_highlights_content_returns_html_string(self, backend: TantivyBackend):
"""highlight_hits must return HTML strings (from Snippet.to_html()), not Snippet objects."""
doc = Document.objects.create(
title="Highlight Test",
content="The quick brown fox jumps over the lazy dog",
checksum="HH1",
pk=90,
)
backend.add_or_update(doc)
hits = backend.highlight_hits("quick", [doc.pk])
assert len(hits) == 1
highlights = hits[0]["highlights"]
assert "content" in highlights
content_highlight = highlights["content"]
assert isinstance(content_highlight, str), (
f"Expected str, got {type(content_highlight)}: {content_highlight!r}"
)
# Tantivy wraps matched terms in <b> tags
assert "<b>" in content_highlight, (
f"Expected HTML with <b> tags, got: {content_highlight!r}"
)
def test_highlights_notes_returns_html_string(self, backend: TantivyBackend):
"""Note highlights must be HTML strings via notes_text companion field.
The notes JSON field does not support tantivy SnippetGenerator; the
notes_text plain-text field is used instead. We use the full-text
query "urgent" (not notes.note:) because notes_text IS in
DEFAULT_SEARCH_FIELDS via the normal search path actually, we use
notes.note: prefix so the query targets notes content directly, but
the snippet is generated from notes_text which stores the same text.
"""
user = User.objects.create_user("hl_noteuser")
doc = Document.objects.create(
title="Doc with matching note",
content="unrelated content",
checksum="HH2",
pk=91,
)
Note.objects.create(document=doc, note="urgent payment required", user=user)
backend.add_or_update(doc)
# Use notes.note: prefix so the document matches the query and the
# notes_text snippet generator can produce highlights.
hits = backend.highlight_hits("notes.note:urgent", [doc.pk])
assert len(hits) == 1
highlights = hits[0]["highlights"]
assert "notes" in highlights
note_highlight = highlights["notes"]
assert isinstance(note_highlight, str), (
f"Expected str, got {type(note_highlight)}: {note_highlight!r}"
)
assert "<b>" in note_highlight, (
f"Expected HTML with <b> tags, got: {note_highlight!r}"
)
def test_empty_doc_list_returns_empty_hits(self, backend: TantivyBackend):
"""highlight_hits with no doc IDs must return an empty list."""
hits = backend.highlight_hits("anything", [])
assert hits == []
def test_no_highlights_when_no_match(self, backend: TantivyBackend):
"""Documents not matching the query should not appear in results."""
doc = Document.objects.create(
title="Unrelated",
content="completely different text",
checksum="HH3",
pk=92,
)
backend.add_or_update(doc)
hits = backend.highlight_hits("quick", [doc.pk])
assert len(hits) == 0
+66 -33
View File
@@ -81,45 +81,38 @@ class TestCreatedDateField:
),
pytest.param(
"created",
"this_week",
"2026-03-23T00:00:00Z",
"2026-03-30T00:00:00Z",
id="this_week_mon_sun",
),
pytest.param(
"created",
"last_week",
"previous week",
"2026-03-16T00:00:00Z",
"2026-03-23T00:00:00Z",
id="last_week",
id="previous_week",
),
pytest.param(
"created",
"this_month",
"this month",
"2026-03-01T00:00:00Z",
"2026-04-01T00:00:00Z",
id="this_month",
),
pytest.param(
"created",
"last_month",
"previous month",
"2026-02-01T00:00:00Z",
"2026-03-01T00:00:00Z",
id="last_month",
id="previous_month",
),
pytest.param(
"created",
"this_year",
"this year",
"2026-01-01T00:00:00Z",
"2027-01-01T00:00:00Z",
id="this_year",
),
pytest.param(
"created",
"last_year",
"previous year",
"2025-01-01T00:00:00Z",
"2026-01-01T00:00:00Z",
id="last_year",
id="previous_year",
),
],
)
@@ -141,7 +134,7 @@ class TestCreatedDateField:
def test_this_month_december_wraps_to_next_year(self) -> None:
# December: next month must roll over to January 1 of next year
lo, hi = _range(
rewrite_natural_date_keywords("created:this_month", UTC),
rewrite_natural_date_keywords("created:this month", UTC),
"created",
)
assert lo == "2026-12-01T00:00:00Z"
@@ -151,12 +144,21 @@ class TestCreatedDateField:
def test_last_month_january_wraps_to_previous_year(self) -> None:
# January: last month must roll back to December 1 of previous year
lo, hi = _range(
rewrite_natural_date_keywords("created:last_month", UTC),
rewrite_natural_date_keywords("created:previous month", UTC),
"created",
)
assert lo == "2025-12-01T00:00:00Z"
assert hi == "2026-01-01T00:00:00Z"
@time_machine.travel(datetime(2026, 7, 15, 12, 0, tzinfo=UTC), tick=False)
def test_previous_quarter(self) -> None:
lo, hi = _range(
rewrite_natural_date_keywords('created:"previous quarter"', UTC),
"created",
)
assert lo == "2026-04-01T00:00:00Z"
assert hi == "2026-07-01T00:00:00Z"
def test_unknown_keyword_raises(self) -> None:
with pytest.raises(ValueError, match="Unknown keyword"):
_date_only_range("bogus_keyword", UTC)
@@ -202,40 +204,34 @@ class TestDateTimeFields:
id="yesterday",
),
pytest.param(
"this_week",
"2026-03-23T00:00:00Z",
"2026-03-30T00:00:00Z",
id="this_week",
),
pytest.param(
"last_week",
"previous week",
"2026-03-16T00:00:00Z",
"2026-03-23T00:00:00Z",
id="last_week",
id="previous_week",
),
pytest.param(
"this_month",
"this month",
"2026-03-01T00:00:00Z",
"2026-04-01T00:00:00Z",
id="this_month",
),
pytest.param(
"last_month",
"previous month",
"2026-02-01T00:00:00Z",
"2026-03-01T00:00:00Z",
id="last_month",
id="previous_month",
),
pytest.param(
"this_year",
"this year",
"2026-01-01T00:00:00Z",
"2027-01-01T00:00:00Z",
id="this_year",
),
pytest.param(
"last_year",
"previous year",
"2025-01-01T00:00:00Z",
"2026-01-01T00:00:00Z",
id="last_year",
id="previous_year",
),
],
)
@@ -254,17 +250,54 @@ class TestDateTimeFields:
@time_machine.travel(datetime(2026, 12, 15, 12, 0, tzinfo=UTC), tick=False)
def test_this_month_december_wraps_to_next_year(self) -> None:
# December: next month wraps to January of next year
lo, hi = _range(rewrite_natural_date_keywords("added:this_month", UTC), "added")
lo, hi = _range(rewrite_natural_date_keywords("added:this month", UTC), "added")
assert lo == "2026-12-01T00:00:00Z"
assert hi == "2027-01-01T00:00:00Z"
@time_machine.travel(datetime(2026, 1, 15, 12, 0, tzinfo=UTC), tick=False)
def test_last_month_january_wraps_to_previous_year(self) -> None:
# January: last month wraps back to December of previous year
lo, hi = _range(rewrite_natural_date_keywords("added:last_month", UTC), "added")
lo, hi = _range(
rewrite_natural_date_keywords("added:previous month", UTC),
"added",
)
assert lo == "2025-12-01T00:00:00Z"
assert hi == "2026-01-01T00:00:00Z"
@pytest.mark.parametrize(
("query", "expected_lo", "expected_hi"),
[
pytest.param(
'added:"previous quarter"',
"2026-04-01T00:00:00Z",
"2026-07-01T00:00:00Z",
id="quoted_previous_quarter",
),
pytest.param(
"added:previous month",
"2026-06-01T00:00:00Z",
"2026-07-01T00:00:00Z",
id="bare_previous_month",
),
pytest.param(
"added:this month",
"2026-07-01T00:00:00Z",
"2026-08-01T00:00:00Z",
id="bare_this_month",
),
],
)
@time_machine.travel(datetime(2026, 7, 15, 12, 0, tzinfo=UTC), tick=False)
def test_legacy_natural_language_aliases(
self,
query: str,
expected_lo: str,
expected_hi: str,
) -> None:
lo, hi = _range(rewrite_natural_date_keywords(query, UTC), "added")
assert lo == expected_lo
assert hi == expected_hi
def test_unknown_keyword_raises(self) -> None:
with pytest.raises(ValueError, match="Unknown keyword"):
_datetime_range("bogus_keyword", UTC)
+30 -15
View File
@@ -1,5 +1,6 @@
from __future__ import annotations
import json
from typing import TYPE_CHECKING
import pytest
@@ -18,7 +19,7 @@ pytestmark = pytest.mark.search
class TestNeedsRebuild:
"""needs_rebuild covers all sentinel-file states that require a full reindex."""
def test_returns_true_when_version_file_missing(self, index_dir: Path) -> None:
def test_returns_true_when_settings_file_missing(self, index_dir: Path) -> None:
assert needs_rebuild(index_dir) is True
def test_returns_false_when_version_and_language_match(
@@ -27,37 +28,51 @@ class TestNeedsRebuild:
settings: SettingsWrapper,
) -> None:
settings.SEARCH_LANGUAGE = "en"
(index_dir / ".schema_version").write_text(str(SCHEMA_VERSION))
(index_dir / ".schema_language").write_text("en")
(index_dir / ".index_settings.json").write_text(
json.dumps({"schema_version": SCHEMA_VERSION, "language": "en"}),
)
assert needs_rebuild(index_dir) is False
def test_returns_true_on_schema_version_mismatch(self, index_dir: Path) -> None:
(index_dir / ".schema_version").write_text(str(SCHEMA_VERSION - 1))
assert needs_rebuild(index_dir) is True
def test_returns_true_when_version_file_not_an_integer(
def test_returns_true_on_schema_version_mismatch(
self,
index_dir: Path,
settings: SettingsWrapper,
) -> None:
(index_dir / ".schema_version").write_text("not-a-number")
settings.SEARCH_LANGUAGE = None
(index_dir / ".index_settings.json").write_text(
json.dumps({"schema_version": SCHEMA_VERSION - 1, "language": None}),
)
assert needs_rebuild(index_dir) is True
def test_returns_true_when_language_sentinel_missing(
def test_returns_true_when_version_is_not_an_integer(
self,
index_dir: Path,
settings: SettingsWrapper,
) -> None:
settings.SEARCH_LANGUAGE = None
(index_dir / ".index_settings.json").write_text(
json.dumps({"schema_version": "not-a-number", "language": None}),
)
assert needs_rebuild(index_dir) is True
def test_returns_true_when_language_key_missing(
self,
index_dir: Path,
settings: SettingsWrapper,
) -> None:
settings.SEARCH_LANGUAGE = "en"
(index_dir / ".schema_version").write_text(str(SCHEMA_VERSION))
# .schema_language intentionally absent
(index_dir / ".index_settings.json").write_text(
json.dumps({"schema_version": SCHEMA_VERSION}),
)
assert needs_rebuild(index_dir) is True
def test_returns_true_when_language_sentinel_content_differs(
def test_returns_true_when_language_differs(
self,
index_dir: Path,
settings: SettingsWrapper,
) -> None:
settings.SEARCH_LANGUAGE = "de"
(index_dir / ".schema_version").write_text(str(SCHEMA_VERSION))
(index_dir / ".schema_language").write_text("en")
(index_dir / ".index_settings.json").write_text(
json.dumps({"schema_version": SCHEMA_VERSION, "language": "en"}),
)
assert needs_rebuild(index_dir) is True
+1 -1
View File
@@ -831,7 +831,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
config.save()
with (
patch("documents.tasks.llmindex_index.delay") as mock_update,
patch("documents.tasks.llmindex_index.apply_async") as mock_update,
patch("paperless_ai.indexing.vector_store_file_exists") as mock_exists,
):
mock_exists.return_value = False
+27 -11
View File
@@ -26,7 +26,7 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
self.user = user
self.client.force_authenticate(user=user)
patcher = mock.patch("documents.bulk_edit.bulk_update_documents.delay")
patcher = mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
self.async_task = patcher.start()
self.addCleanup(patcher.stop)
self.c1 = Correspondent.objects.create(name="c1")
@@ -62,7 +62,7 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
m.return_value = return_value
m.__name__ = method_name
@mock.patch("documents.bulk_edit.bulk_update_documents.delay")
@mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
def test_api_set_correspondent(self, bulk_update_task_mock) -> None:
self.assertNotEqual(self.doc1.correspondent, self.c1)
response = self.client.post(
@@ -79,9 +79,13 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.doc1.refresh_from_db()
self.assertEqual(self.doc1.correspondent, self.c1)
bulk_update_task_mock.assert_called_once_with(document_ids=[self.doc1.pk])
bulk_update_task_mock.assert_called_once()
self.assertCountEqual(
bulk_update_task_mock.call_args.kwargs["kwargs"]["document_ids"],
[self.doc1.pk],
)
@mock.patch("documents.bulk_edit.bulk_update_documents.delay")
@mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
def test_api_unset_correspondent(self, bulk_update_task_mock) -> None:
self.doc1.correspondent = self.c1
self.doc1.save()
@@ -103,7 +107,7 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
self.doc1.refresh_from_db()
self.assertIsNone(self.doc1.correspondent)
@mock.patch("documents.bulk_edit.bulk_update_documents.delay")
@mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
def test_api_set_type(self, bulk_update_task_mock) -> None:
self.assertNotEqual(self.doc1.document_type, self.dt1)
response = self.client.post(
@@ -120,9 +124,13 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.doc1.refresh_from_db()
self.assertEqual(self.doc1.document_type, self.dt1)
bulk_update_task_mock.assert_called_once_with(document_ids=[self.doc1.pk])
bulk_update_task_mock.assert_called_once()
self.assertCountEqual(
bulk_update_task_mock.call_args.kwargs["kwargs"]["document_ids"],
[self.doc1.pk],
)
@mock.patch("documents.bulk_edit.bulk_update_documents.delay")
@mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
def test_api_unset_type(self, bulk_update_task_mock) -> None:
self.doc1.document_type = self.dt1
self.doc1.save()
@@ -141,9 +149,13 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.doc1.refresh_from_db()
self.assertIsNone(self.doc1.document_type)
bulk_update_task_mock.assert_called_once_with(document_ids=[self.doc1.pk])
bulk_update_task_mock.assert_called_once()
self.assertCountEqual(
bulk_update_task_mock.call_args.kwargs["kwargs"]["document_ids"],
[self.doc1.pk],
)
@mock.patch("documents.bulk_edit.bulk_update_documents.delay")
@mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
def test_api_add_tag(self, bulk_update_task_mock) -> None:
self.assertFalse(self.doc1.tags.filter(pk=self.t1.pk).exists())
@@ -163,9 +175,13 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
self.assertTrue(self.doc1.tags.filter(pk=self.t1.pk).exists())
bulk_update_task_mock.assert_called_once_with(document_ids=[self.doc1.pk])
bulk_update_task_mock.assert_called_once()
self.assertCountEqual(
bulk_update_task_mock.call_args.kwargs["kwargs"]["document_ids"],
[self.doc1.pk],
)
@mock.patch("documents.bulk_edit.bulk_update_documents.delay")
@mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
def test_api_remove_tag(self, bulk_update_task_mock) -> None:
self.doc1.tags.add(self.t1)
+40 -2
View File
@@ -278,7 +278,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
doc.refresh_from_db()
self.assertEqual(doc.custom_fields.first().value, None)
@mock.patch("documents.signals.handlers.process_cf_select_update.delay")
@mock.patch("documents.signals.handlers.process_cf_select_update.apply_async")
def test_custom_field_update_offloaded_once(self, mock_delay) -> None:
"""
GIVEN:
@@ -322,7 +322,7 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
}
cf_select.save()
mock_delay.assert_called_once_with(cf_select)
mock_delay.assert_called_once_with(kwargs={"custom_field": cf_select})
def test_create_custom_field_monetary_validation(self) -> None:
"""
@@ -1320,3 +1320,41 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
results = response.data["results"]
self.assertEqual(results[0]["document_count"], 0)
def test_patch_document_invalid_date_custom_field_returns_validation_error(self):
"""
GIVEN:
- A date custom field
- A document
WHEN:
- Patching the document with a date string in the wrong format
THEN:
- HTTP 400 is returned instead of an internal server error
- No custom field instance is created
"""
cf_date = CustomField.objects.create(
name="datefield",
data_type=CustomField.FieldDataType.DATE,
)
doc = Document.objects.create(
title="Doc",
checksum="123",
mime_type="application/pdf",
)
response = self.client.patch(
f"/api/documents/{doc.pk}/",
{
"custom_fields": [
{
"field": cf_date.pk,
"value": "10.03.2026",
},
],
},
format="json",
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertIn("custom_fields", response.data)
self.assertEqual(CustomFieldInstance.objects.count(), 0)
@@ -537,7 +537,7 @@ class TestDocumentVersioningApi(DirectoriesMixin, APITestCase):
async_task.id = "task-123"
with mock.patch("documents.views.consume_file") as consume_mock:
consume_mock.delay.return_value = async_task
consume_mock.apply_async.return_value = async_task
resp = self.client.post(
f"/api/documents/{root.id}/update_version/",
{"document": upload, "version_label": " New Version "},
@@ -546,8 +546,9 @@ class TestDocumentVersioningApi(DirectoriesMixin, APITestCase):
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(resp.data, "task-123")
consume_mock.delay.assert_called_once()
input_doc, overrides = consume_mock.delay.call_args[0]
consume_mock.apply_async.assert_called_once()
task_kwargs = consume_mock.apply_async.call_args.kwargs["kwargs"]
input_doc, overrides = task_kwargs["input_doc"], task_kwargs["overrides"]
self.assertEqual(input_doc.root_document_id, root.id)
self.assertEqual(input_doc.source, DocumentSource.ApiUpload)
self.assertEqual(overrides.version_label, "New Version")
@@ -571,7 +572,7 @@ class TestDocumentVersioningApi(DirectoriesMixin, APITestCase):
async_task.id = "task-123"
with mock.patch("documents.views.consume_file") as consume_mock:
consume_mock.delay.return_value = async_task
consume_mock.apply_async.return_value = async_task
resp = self.client.post(
f"/api/documents/{version.id}/update_version/",
{"document": upload, "version_label": " New Version "},
@@ -580,8 +581,9 @@ class TestDocumentVersioningApi(DirectoriesMixin, APITestCase):
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(resp.data, "task-123")
consume_mock.delay.assert_called_once()
input_doc, overrides = consume_mock.delay.call_args[0]
consume_mock.apply_async.assert_called_once()
task_kwargs = consume_mock.apply_async.call_args.kwargs["kwargs"]
input_doc, overrides = task_kwargs["input_doc"], task_kwargs["overrides"]
self.assertEqual(input_doc.root_document_id, root.id)
self.assertEqual(overrides.version_label, "New Version")
self.assertEqual(overrides.actor_id, self.user.id)
@@ -595,7 +597,7 @@ class TestDocumentVersioningApi(DirectoriesMixin, APITestCase):
upload = self._make_pdf_upload()
with mock.patch("documents.views.consume_file") as consume_mock:
consume_mock.delay.side_effect = Exception("boom")
consume_mock.apply_async.side_effect = Exception("boom")
resp = self.client.post(
f"/api/documents/{root.id}/update_version/",
{"document": upload},
+126 -48
View File
@@ -47,11 +47,11 @@ from documents.models import Workflow
from documents.models import WorkflowAction
from documents.models import WorkflowTrigger
from documents.signals.handlers import run_workflows
from documents.tests.utils import ConsumeTaskMixin
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import DocumentConsumeDelayMixin
class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
def setUp(self) -> None:
super().setUp()
@@ -1168,6 +1168,43 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertIn("all", response.data)
self.assertCountEqual(response.data["all"], [d.id for d in docs])
def test_default_ordering_uses_id_as_tiebreaker(self):
"""
GIVEN:
- Documents sharing the same created date
WHEN:
- API request for documents without an explicit ordering
THEN:
- Results are correctly ordered by created > id
"""
older_doc = Document.objects.create(
checksum="older",
content="older",
created=date(2024, 1, 1),
)
first_same_date_doc = Document.objects.create(
checksum="same-date-1",
content="same-date-1",
created=date(2024, 1, 2),
)
second_same_date_doc = Document.objects.create(
checksum="same-date-2",
content="same-date-2",
created=date(2024, 1, 2),
)
response = self.client.get("/api/documents/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(
[result["id"] for result in response.data["results"]],
[
second_same_date_doc.id,
first_same_date_doc.id,
older_doc.id,
],
)
def test_list_with_include_selection_data(self) -> None:
correspondent = Correspondent.objects.create(name="c1")
doc_type = DocumentType.objects.create(name="dt1")
@@ -1363,9 +1400,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
input_doc, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(input_doc.original_file.name, "simple.pdf")
self.assertTrue(
@@ -1395,9 +1430,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
input_doc, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(input_doc.original_file.name, "outside.pdf")
self.assertEqual(overrides.filename, "outside.pdf")
@@ -1437,9 +1470,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
input_doc, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(input_doc.original_file.name, "outside.pdf")
self.assertEqual(overrides.filename, "outside.pdf")
@@ -1521,9 +1552,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
input_doc, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(input_doc.original_file.name, "simple.pdf")
self.assertTrue(
@@ -1575,9 +1604,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
_, overrides = self.get_last_consume_delay_call_args()
_, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(overrides.title, "my custom title")
self.assertIsNone(overrides.correspondent_id)
@@ -1597,9 +1624,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
_, overrides = self.get_last_consume_delay_call_args()
_, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(overrides.correspondent_id, c.id)
self.assertIsNone(overrides.title)
@@ -1633,9 +1658,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
_, overrides = self.get_last_consume_delay_call_args()
_, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(overrides.document_type_id, dt.id)
self.assertIsNone(overrides.correspondent_id)
@@ -1669,9 +1692,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
_, overrides = self.get_last_consume_delay_call_args()
_, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(overrides.storage_path_id, sp.id)
self.assertIsNone(overrides.correspondent_id)
@@ -1706,9 +1727,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
_, overrides = self.get_last_consume_delay_call_args()
_, overrides = self.assert_queue_consumption_task_call_args()
self.assertCountEqual(overrides.tag_ids, [t1.id, t2.id])
self.assertIsNone(overrides.document_type_id)
@@ -1753,9 +1772,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
_, overrides = self.get_last_consume_delay_call_args()
_, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(overrides.created, created.date())
@@ -1772,9 +1789,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
input_doc, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(input_doc.original_file.name, "simple.pdf")
self.assertEqual(overrides.filename, "simple.pdf")
@@ -1804,9 +1819,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
input_doc, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(input_doc.original_file.name, "simple.pdf")
self.assertEqual(overrides.filename, "simple.pdf")
@@ -1861,9 +1874,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
input_doc, overrides = self.assert_queue_consumption_task_call_args()
new_overrides, _ = run_workflows(
trigger_type=WorkflowTrigger.WorkflowTriggerType.CONSUMPTION,
@@ -1909,9 +1920,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, overrides = self.get_last_consume_delay_call_args()
input_doc, overrides = self.assert_queue_consumption_task_call_args()
self.assertEqual(input_doc.original_file.name, "simple.pdf")
self.assertEqual(overrides.filename, "simple.pdf")
@@ -2010,9 +2019,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.consume_file_mock.assert_called_once()
input_doc, _ = self.get_last_consume_delay_call_args()
input_doc, _ = self.assert_queue_consumption_task_call_args()
self.assertEqual(input_doc.source, WorkflowTrigger.DocumentSourceChoices.WEB_UI)
@@ -3109,6 +3116,77 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
# modified was updated to today
self.assertEqual(doc.modified.day, timezone.now().day)
def test_delete_note_missing_id(self) -> None:
"""
GIVEN:
- Existing document
WHEN:
- API DELETE request to notes endpoint without an id query param
- API DELETE request to notes endpoint with an empty id query param
THEN:
- HTTP 400 is returned
"""
doc = Document.objects.create(
title="test",
mime_type="application/pdf",
content="this is a document",
)
response = self.client.delete(
f"/api/documents/{doc.pk}/notes/",
format="json",
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
response = self.client.delete(
f"/api/documents/{doc.pk}/notes/?id=",
format="json",
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
def test_delete_note_invalid_id(self) -> None:
"""
GIVEN:
- Existing document
WHEN:
- API DELETE request to notes endpoint with a non-integer note id
THEN:
- HTTP 400 is returned
"""
doc = Document.objects.create(
title="test",
mime_type="application/pdf",
content="this is a document",
)
response = self.client.delete(
f"/api/documents/{doc.pk}/notes/?id=notaninteger",
format="json",
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
def test_delete_note_nonexistent_id(self) -> None:
"""
GIVEN:
- Existing document, no notes
WHEN:
- API DELETE request to notes endpoint with a non-existent note id
THEN:
- HTTP 404 is returned
"""
doc = Document.objects.create(
title="test",
mime_type="application/pdf",
content="this is a document",
)
response = self.client.delete(
f"/api/documents/{doc.pk}/notes/?id=99999",
format="json",
)
self.assertEqual(response.status_code, status.HTTP_404_NOT_FOUND)
def test_get_notes_no_doc(self) -> None:
"""
GIVEN:
@@ -3379,7 +3457,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertEqual(create_resp.status_code, status.HTTP_201_CREATED)
self.assertEqual(create_resp.data["document"], doc.pk)
def test_next_asn(self) -> None:
def test_next_asn(self):
"""
GIVEN:
- Existing documents with ASNs, highest owned by user2
@@ -9,6 +9,8 @@ from rest_framework.test import APITestCase
from documents.models import CustomField
from documents.models import Document
from documents.models import SavedView
from documents.models import SavedViewFilterRule
from documents.serialisers import DocumentSerializer
from documents.tests.utils import DirectoriesMixin
@@ -453,6 +455,111 @@ class TestCustomFieldsSearch(DirectoriesMixin, APITestCase):
),
)
def test_exact_monetary(self) -> None:
# "exact" should match by numeric amount, ignoring currency code prefix.
self._assert_query_match_predicate(
["monetary_field", "exact", "100"],
lambda document: (
"monetary_field" in document
and document["monetary_field"] == "USD100.00"
),
)
self._assert_query_match_predicate(
["monetary_field", "exact", "101"],
lambda document: (
"monetary_field" in document and document["monetary_field"] == "101.00"
),
)
def test_in_monetary(self) -> None:
# "in" should match by numeric amount, ignoring currency code prefix.
self._assert_query_match_predicate(
["monetary_field", "in", ["100", "50"]],
lambda document: (
"monetary_field" in document
and document["monetary_field"] in {"USD100.00", "EUR50.00"}
),
)
def test_exact_monetary_with_currency_prefix(self) -> None:
# Providing a currency-prefixed string like "USD100.00" for an exact monetary
# filter should work for backwards compatibility with saved views. The currency
# code is stripped and the numeric amount is used for comparison.
self._assert_query_match_predicate(
["monetary_field", "exact", "USD100.00"],
lambda document: (
"monetary_field" in document
and document["monetary_field"] == "USD100.00"
),
)
self._assert_query_match_predicate(
["monetary_field", "in", ["USD100.00", "EUR50.00"]],
lambda document: (
"monetary_field" in document
and document["monetary_field"] in {"USD100.00", "EUR50.00"}
),
)
self._assert_query_match_predicate(
["monetary_field", "gt", "USD99.00"],
lambda document: (
"monetary_field" in document
and document["monetary_field"] is not None
and (
document["monetary_field"] == "USD100.00"
or document["monetary_field"] == "101.00"
)
),
)
def test_saved_view_with_currency_prefixed_monetary_filter(self) -> None:
"""
A saved view created before the exact-monetary fix stored currency-prefixed
values like '["monetary_field", "exact", "USD100.00"]' as the filter rule value
(rule_type=42). Those saved views must continue to return correct results.
"""
saved_view = SavedView.objects.create(name="test view", owner=self.user)
SavedViewFilterRule.objects.create(
saved_view=saved_view,
rule_type=42, # FILTER_CUSTOM_FIELDS_QUERY
value=json.dumps(["monetary_field", "exact", "USD100.00"]),
)
# The frontend translates rule_type=42 to the custom_field_query URL param;
# simulate that here using the stored filter rule value directly.
rule = saved_view.filter_rules.get(rule_type=42)
query_string = quote(rule.value, safe="")
response = self.client.get(
"/api/documents/?"
+ "&".join(
(
f"custom_field_query={query_string}",
"ordering=archive_serial_number",
"page=1",
f"page_size={len(self.documents)}",
"truncate_content=true",
),
),
)
self.assertEqual(response.status_code, 200, msg=str(response.json()))
result_ids = {doc["id"] for doc in response.json()["results"]}
# Should match the single document with monetary_field = "USD100.00"
expected_ids = {
doc.id
for doc in self.documents
if doc.custom_fields.filter(
field__name="monetary_field",
value_monetary="USD100.00",
).exists()
}
self.assertEqual(result_ids, expected_ids)
def test_monetary_amount_with_invalid_value(self) -> None:
# A value that has a currency prefix but no valid number after it should fail.
self._assert_validation_error(
json.dumps(["monetary_field", "exact", "USDnotanumber"]),
["custom_field_query", "2"],
"valid number",
)
# ==========================================================#
# Subset check (document link field only) #
# ==========================================================#
+11 -6
View File
@@ -291,7 +291,7 @@ class TestApiStoragePaths(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
self.assertEqual(StoragePath.objects.count(), 2)
@mock.patch("documents.bulk_edit.bulk_update_documents.delay")
@mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
def test_api_update_storage_path(self, bulk_update_mock) -> None:
"""
GIVEN:
@@ -316,11 +316,12 @@ class TestApiStoragePaths(DirectoriesMixin, APITestCase):
bulk_update_mock.assert_called_once()
args, _ = bulk_update_mock.call_args
self.assertCountEqual(
[document.pk],
bulk_update_mock.call_args.kwargs["kwargs"]["document_ids"],
)
self.assertCountEqual([document.pk], args[0])
@mock.patch("documents.bulk_edit.bulk_update_documents.delay")
@mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
def test_api_delete_storage_path(self, bulk_update_mock) -> None:
"""
GIVEN:
@@ -347,7 +348,11 @@ class TestApiStoragePaths(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_204_NO_CONTENT)
# only called once
bulk_update_mock.assert_called_once_with([document.pk])
bulk_update_mock.assert_called_once()
self.assertCountEqual(
[document.pk],
bulk_update_mock.call_args.kwargs["kwargs"]["document_ids"],
)
def test_test_storage_path(self) -> None:
"""
+187
View File
@@ -1,9 +1,17 @@
import pytest
from django.core.management import call_command
from django.core.management.base import CommandError
from drf_spectacular.generators import SchemaGenerator
from rest_framework import status
from rest_framework.test import APITestCase
@pytest.fixture(scope="session")
def api_schema():
generator = SchemaGenerator()
return generator.get_schema(request=None, public=True)
class TestApiSchema(APITestCase):
ENDPOINT = "/api/schema/"
@@ -66,3 +74,182 @@ class TestApiSchema(APITestCase):
"delete_pages",
]:
self.assertIn(action_method, advertised_methods)
class TestTasksSummarySchema:
"""tasks_summary_retrieve: response must be an array of TaskSummarySerializer."""
def test_summary_response_is_array(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200["type"] == "array", (
"tasks_summary_retrieve response must be type:array"
)
def test_summary_items_have_total_count(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
items = resp_200.get("items", {})
ref = items.get("$ref", "")
component_name = ref.split("/")[-1] if ref else ""
if component_name:
props = api_schema["components"]["schemas"][component_name]["properties"]
else:
props = items.get("properties", {})
assert "total_count" in props, (
"summary items must have 'total_count' (TaskSummarySerializer)"
)
def test_summary_days_parameter_constraints(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
params = {p["name"]: p for p in op.get("parameters", [])}
assert "days" in params, "days query parameter must be declared"
schema = params["days"]["schema"]
assert schema.get("minimum") == 1, "days must have minimum: 1"
assert schema.get("maximum") == 365, "days must have maximum: 365"
class TestTasksActiveSchema:
"""tasks_active_retrieve: response must be an array of TaskSerializerV10."""
def test_active_response_is_array(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/active/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200["type"] == "array", (
"tasks_active_retrieve response must be type:array"
)
def test_active_items_ref_named_schema(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/tasks/active/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
items = resp_200.get("items", {})
ref = items.get("$ref", "")
component_name = ref.split("/")[-1] if ref else ""
assert component_name, "items should be a $ref to a named schema"
assert component_name in api_schema["components"]["schemas"]
class TestMetadataSchema:
"""Metadata component: array fields and optional archive fields."""
@pytest.mark.parametrize("field", ["original_metadata", "archive_metadata"])
def test_metadata_field_is_array(self, api_schema: SchemaGenerator, field: str):
props = api_schema["components"]["schemas"]["Metadata"]["properties"]
assert props[field]["type"] == "array", (
f"{field} should be type:array, not type:object"
)
@pytest.mark.parametrize("field", ["original_metadata", "archive_metadata"])
def test_metadata_items_have_key_field(
self,
api_schema: SchemaGenerator,
field: str,
):
props = api_schema["components"]["schemas"]["Metadata"]["properties"]
items = props[field]["items"]
ref = items.get("$ref", "")
component_name = ref.split("/")[-1] if ref else ""
if component_name:
item_props = api_schema["components"]["schemas"][component_name][
"properties"
]
else:
item_props = items.get("properties", {})
assert "key" in item_props
@pytest.mark.parametrize(
"field",
[
"archive_checksum",
"archive_media_filename",
"archive_size",
"archive_metadata",
],
)
def test_archive_field_not_required(self, api_schema, field):
schema = api_schema["components"]["schemas"]["Metadata"]
required = schema.get("required", [])
assert field not in required
props = schema["properties"]
assert props[field].get("nullable") is True, (
f"{field} should be nullable (allow_null=True)"
)
class TestStoragePathTestSchema:
"""storage_paths_test_create: response must be a string, not a StoragePath object."""
def test_test_action_response_is_string(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/storage_paths/test/"]["post"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200.get("type") == "string", (
"storage_paths_test_create 200 response must be type:string"
)
def test_test_action_request_uses_storage_path_test_serializer(
self,
api_schema: SchemaGenerator,
):
op = api_schema["paths"]["/api/storage_paths/test/"]["post"]
content = (
op.get("requestBody", {}).get("content", {}).get("application/json", {})
)
schema_ref = content.get("schema", {}).get("$ref", "")
component_name = schema_ref.split("/")[-1]
# COMPONENT_SPLIT_REQUEST=True causes drf-spectacular to append "Request"
# to request body component names, so StoragePathTestSerializer -> StoragePathTestRequest
assert component_name == "StoragePathTestRequest", (
f"Request body should reference StoragePathTestRequest, got {component_name!r}"
)
class TestProcessedMailBulkDeleteSchema:
"""processed_mail_bulk_delete_create: response must be {result, deleted_mail_ids}."""
def _get_props(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/processed_mail/bulk_delete/"]["post"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
ref = resp_200.get("$ref", "")
component_name = ref.split("/")[-1] if ref else ""
if component_name:
return api_schema["components"]["schemas"][component_name]["properties"]
return resp_200.get("properties", {})
@pytest.mark.parametrize("field", ["result", "deleted_mail_ids"])
def test_bulk_delete_response_has_field(
self,
api_schema: SchemaGenerator,
field: str,
):
props = self._get_props(api_schema)
assert field in props, f"bulk_delete 200 response must have a '{field}' field"
def test_bulk_delete_response_is_not_processed_mail_serializer(self, api_schema):
op = api_schema["paths"]["/api/processed_mail/bulk_delete/"]["post"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
ref = resp_200.get("$ref", "")
component_name = ref.split("/")[-1] if ref else ""
assert component_name != "ProcessedMail", (
"bulk_delete 200 response must not be the full ProcessedMail serializer"
)
class TestShareLinkBundleRebuildSchema:
"""share_link_bundles_rebuild_create: 200 returns bundle data; 400 is documented."""
def test_rebuild_has_400_response(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/share_link_bundles/{id}/rebuild/"]["post"]
assert "400" in op["responses"], (
"rebuild must document the 400 response for 'Bundle is already being processed.'"
)
def test_rebuild_400_has_detail_field(self, api_schema: SchemaGenerator):
op = api_schema["paths"]["/api/share_link_bundles/{id}/rebuild/"]["post"]
resp_400 = op["responses"]["400"]["content"]["application/json"]["schema"]
ref = resp_400.get("$ref", "")
component_name = ref.split("/")[-1] if ref else ""
if component_name:
props = api_schema["components"]["schemas"][component_name]["properties"]
else:
props = resp_400.get("properties", {})
assert "detail" in props, "rebuild 400 response must have a 'detail' field"
+165
View File
@@ -3,6 +3,7 @@ from datetime import timedelta
from unittest import mock
import pytest
import time_machine
from dateutil.relativedelta import relativedelta
from django.contrib.auth.models import Group
from django.contrib.auth.models import Permission
@@ -26,6 +27,7 @@ from documents.models import Tag
from documents.models import Workflow
from documents.search import get_backend
from documents.search import reset_backend
from documents.tests.factories import DocumentFactory
from documents.tests.utils import DirectoriesMixin
from paperless_mail.models import MailAccount
from paperless_mail.models import MailRule
@@ -741,6 +743,49 @@ class TestDocumentSearchApi(DirectoriesMixin, APITestCase):
# Tantivy rejects unparsable field queries with a 400
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
@override_settings(
TIME_ZONE="UTC",
)
@time_machine.travel(
datetime.datetime(2026, 7, 15, 12, 0, tzinfo=datetime.UTC),
tick=False,
)
def test_search_added_previous_quarter(self) -> None:
"""
GIVEN:
- Documents inside and outside the previous quarter
WHEN:
- Query with the legacy natural-language phrase used by the UI
THEN:
- Previous-quarter documents are returned
"""
d1 = DocumentFactory.create(
title="quarterly statement april",
content="bank statement",
added=datetime.datetime(2026, 4, 10, 12, 0, tzinfo=datetime.UTC),
)
d2 = DocumentFactory.create(
title="quarterly statement june",
content="bank statement",
added=datetime.datetime(2026, 6, 20, 12, 0, tzinfo=datetime.UTC),
)
d3 = DocumentFactory.create(
title="quarterly statement july",
content="bank statement",
added=datetime.datetime(2026, 7, 10, 12, 0, tzinfo=datetime.UTC),
)
backend = get_backend()
backend.add_or_update(d1)
backend.add_or_update(d2)
backend.add_or_update(d3)
response = self.client.get('/api/documents/?query=added:"previous quarter"')
self.assertEqual(response.status_code, status.HTTP_200_OK)
results = response.data["results"]
self.assertEqual({r["id"] for r in results}, {1, 2})
@mock.patch("documents.search._backend.TantivyBackend.autocomplete")
def test_search_autocomplete_limits(self, m) -> None:
"""
@@ -1503,6 +1548,126 @@ class TestDocumentSearchApi(DirectoriesMixin, APITestCase):
[d2.id, d1.id, d3.id],
)
def test_search_ordering_by_score(self) -> None:
"""ordering=-score must return results in descending relevance order (best first)."""
backend = get_backend()
# doc_high has more occurrences of the search term → higher BM25 score
doc_low = Document.objects.create(
title="score sort low",
content="apple",
checksum="SCL1",
)
doc_high = Document.objects.create(
title="score sort high",
content="apple apple apple apple apple",
checksum="SCH1",
)
backend.add_or_update(doc_low)
backend.add_or_update(doc_high)
# -score = descending = best first (highest score)
response = self.client.get("/api/documents/?query=apple&ordering=-score")
self.assertEqual(response.status_code, status.HTTP_200_OK)
ids = [r["id"] for r in response.data["results"]]
self.assertEqual(
ids[0],
doc_high.id,
"Most relevant doc should be first for -score",
)
# score = ascending = worst first (lowest score)
response = self.client.get("/api/documents/?query=apple&ordering=score")
self.assertEqual(response.status_code, status.HTTP_200_OK)
ids = [r["id"] for r in response.data["results"]]
self.assertEqual(
ids[0],
doc_low.id,
"Least relevant doc should be first for +score",
)
def test_search_with_tantivy_native_sort(self) -> None:
"""When ordering by a Tantivy-sortable field, results must be correctly sorted."""
backend = get_backend()
for i, asn in enumerate([30, 10, 20]):
doc = Document.objects.create(
title=f"sortable doc {i}",
content="searchable content",
checksum=f"TNS{i}",
archive_serial_number=asn,
)
backend.add_or_update(doc)
response = self.client.get(
"/api/documents/?query=searchable&ordering=archive_serial_number",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
asns = [doc["archive_serial_number"] for doc in response.data["results"]]
self.assertEqual(asns, [10, 20, 30])
response = self.client.get(
"/api/documents/?query=searchable&ordering=-archive_serial_number",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
asns = [doc["archive_serial_number"] for doc in response.data["results"]]
self.assertEqual(asns, [30, 20, 10])
def test_search_page_2_returns_correct_slice(self) -> None:
"""Page 2 must return the second slice, not overlap with page 1."""
backend = get_backend()
for i in range(10):
doc = Document.objects.create(
title=f"doc {i}",
content="paginated content",
checksum=f"PG2{i}",
archive_serial_number=i + 1,
)
backend.add_or_update(doc)
response = self.client.get(
"/api/documents/?query=paginated&ordering=archive_serial_number&page=1&page_size=3",
)
page1_ids = [r["id"] for r in response.data["results"]]
self.assertEqual(len(page1_ids), 3)
response = self.client.get(
"/api/documents/?query=paginated&ordering=archive_serial_number&page=2&page_size=3",
)
page2_ids = [r["id"] for r in response.data["results"]]
self.assertEqual(len(page2_ids), 3)
# No overlap between pages
self.assertEqual(set(page1_ids) & set(page2_ids), set())
# Page 2 ASNs are higher than page 1
page1_asns = [
Document.objects.get(pk=pk).archive_serial_number for pk in page1_ids
]
page2_asns = [
Document.objects.get(pk=pk).archive_serial_number for pk in page2_ids
]
self.assertTrue(max(page1_asns) < min(page2_asns))
def test_search_all_field_contains_all_ids_when_paginated(self) -> None:
"""The 'all' field must contain every matching ID, even when paginated."""
backend = get_backend()
doc_ids = []
for i in range(10):
doc = Document.objects.create(
title=f"all field doc {i}",
content="allfield content",
checksum=f"AF{i}",
)
backend.add_or_update(doc)
doc_ids.append(doc.pk)
response = self.client.get(
"/api/documents/?query=allfield&page=1&page_size=3",
headers={"Accept": "application/json; version=9"},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(len(response.data["results"]), 3)
# "all" must contain ALL 10 matching IDs
self.assertCountEqual(response.data["all"], doc_ids)
@mock.patch("documents.bulk_edit.bulk_update_documents")
def test_global_search(self, m) -> None:
"""
+29 -29
View File
@@ -4,7 +4,6 @@ import tempfile
from pathlib import Path
from unittest import mock
from celery import states
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.test import override_settings
@@ -13,6 +12,7 @@ from rest_framework.test import APITestCase
from documents.models import PaperlessTask
from documents.permissions import has_system_status_permission
from documents.tests.factories import PaperlessTaskFactory
from paperless import version
@@ -102,7 +102,7 @@ class TestSystemStatus(APITestCase):
user = User.objects.create_user(username="status_user")
user.user_permissions.add(
Permission.objects.get(codename="view_system_status"),
Permission.objects.get(codename="view_system_monitoring"),
)
self.client.force_login(user)
@@ -258,10 +258,10 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains an OK classifier status
"""
PaperlessTask.objects.create(
type=PaperlessTask.TaskType.SCHEDULED_TASK,
status=states.SUCCESS,
task_name=PaperlessTask.TaskName.TRAIN_CLASSIFIER,
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER,
trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
status=PaperlessTask.Status.SUCCESS,
)
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
@@ -295,11 +295,11 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains an ERROR classifier status
"""
PaperlessTask.objects.create(
type=PaperlessTask.TaskType.SCHEDULED_TASK,
status=states.FAILURE,
task_name=PaperlessTask.TaskName.TRAIN_CLASSIFIER,
result="Classifier training failed",
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.TRAIN_CLASSIFIER,
trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
status=PaperlessTask.Status.FAILURE,
result_data={"error_message": "Classifier training failed"},
)
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
@@ -319,10 +319,10 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains an OK sanity check status
"""
PaperlessTask.objects.create(
type=PaperlessTask.TaskType.SCHEDULED_TASK,
status=states.SUCCESS,
task_name=PaperlessTask.TaskName.CHECK_SANITY,
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.SANITY_CHECK,
trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
status=PaperlessTask.Status.SUCCESS,
)
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
@@ -356,11 +356,11 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains an ERROR sanity check status
"""
PaperlessTask.objects.create(
type=PaperlessTask.TaskType.SCHEDULED_TASK,
status=states.FAILURE,
task_name=PaperlessTask.TaskName.CHECK_SANITY,
result="5 issues found.",
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.SANITY_CHECK,
trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
status=PaperlessTask.Status.FAILURE,
result_data={"error_message": "5 issues found."},
)
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
@@ -405,10 +405,10 @@ class TestSystemStatus(APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["tasks"]["llmindex_status"], "WARNING")
PaperlessTask.objects.create(
type=PaperlessTask.TaskType.SCHEDULED_TASK,
status=states.SUCCESS,
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.LLM_INDEX,
trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
status=PaperlessTask.Status.SUCCESS,
)
response = self.client.get(self.ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
@@ -425,11 +425,11 @@ class TestSystemStatus(APITestCase):
- The response contains the correct AI status
"""
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai"):
PaperlessTask.objects.create(
type=PaperlessTask.TaskType.SCHEDULED_TASK,
status=states.FAILURE,
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
result="AI index update failed",
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.LLM_INDEX,
trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
status=PaperlessTask.Status.FAILURE,
result_data={"error_message": "AI index update failed"},
)
self.client.force_login(self.user)
response = self.client.get(self.ENDPOINT)
File diff suppressed because it is too large Load Diff
+34
View File
@@ -99,6 +99,40 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.action.assign_correspondent.pk,
)
def test_api_get_workflow_actions_ordered(self) -> None:
"""
GIVEN:
- A workflow with two actions added in reverse order (order=1 before order=0)
WHEN:
- API is called to get workflows
THEN:
- Actions are returned sorted by order ascending
"""
# Created before action_first so its pk is lower — ensures pk order
# disagrees with the order field, catching regressions if order_by is removed.
action_second = WorkflowAction.objects.create(
assign_title="Second action",
order=1,
)
action_first = WorkflowAction.objects.create(
assign_title="First action",
order=0,
)
self.workflow.actions.add(action_second)
self.workflow.actions.add(action_first)
response = self.client.get(self.ENDPOINT, format="json")
self.assertEqual(response.status_code, status.HTTP_200_OK)
resp_actions = response.data["results"][0]["actions"]
action_ids = [a["id"] for a in resp_actions]
self.assertIn(action_first.id, action_ids)
self.assertIn(action_second.id, action_ids)
self.assertLess(
action_ids.index(action_first.id),
action_ids.index(action_second.id),
)
def test_api_create_workflow(self) -> None:
"""
GIVEN:
+5 -5
View File
@@ -17,8 +17,8 @@ from documents.data_models import DocumentSource
from documents.models import Document
from documents.models import Tag
from documents.plugins.base import StopConsumeTaskError
from documents.tests.utils import ConsumeTaskMixin
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import DocumentConsumeDelayMixin
from documents.tests.utils import DummyProgressManager
from documents.tests.utils import FileSystemAssertsMixin
from documents.tests.utils import SampleDirMixin
@@ -601,7 +601,7 @@ class TestBarcodeNewConsume(
DirectoriesMixin,
FileSystemAssertsMixin,
SampleDirMixin,
DocumentConsumeDelayMixin,
ConsumeTaskMixin,
TestCase,
):
@override_settings(CONSUMER_ENABLE_BARCODES=True)
@@ -632,7 +632,7 @@ class TestBarcodeNewConsume(
),
overrides,
),
"Barcode splitting complete!",
{"reason": "Barcode splitting complete!"},
)
# 2 new document consume tasks created
self.assertEqual(self.consume_file_mock.call_count, 2)
@@ -646,7 +646,7 @@ class TestBarcodeNewConsume(
for (
new_input_doc,
new_doc_overrides,
) in self.get_all_consume_delay_call_args():
) in self.get_all_consume_task_call_args():
self.assertIsFile(new_input_doc.original_file)
self.assertEqual(new_input_doc.original_path, temp_copy)
self.assertEqual(new_input_doc.source, DocumentSource.ConsumeFolder)
@@ -1049,7 +1049,7 @@ class TestTagBarcode(DirectoriesMixin, SampleDirMixin, GetReaderPluginMixin, Tes
None,
)
self.assertEqual(result, "Barcode splitting complete!")
self.assertEqual(result, {"reason": "Barcode splitting complete!"})
documents = Document.objects.all().order_by("id")
self.assertEqual(documents.count(), 3)
+99 -78
View File
@@ -31,7 +31,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
self.group1 = Group.objects.create(name="group1")
self.group2 = Group.objects.create(name="group2")
patcher = mock.patch("documents.bulk_edit.bulk_update_documents.delay")
patcher = mock.patch("documents.bulk_edit.bulk_update_documents.apply_async")
self.async_task = patcher.start()
self.addCleanup(patcher.stop)
self.c1 = Correspondent.objects.create(name="c1")
@@ -74,7 +74,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
self.assertEqual(Document.objects.filter(correspondent=self.c2).count(), 3)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc1.id, self.doc2.id])
def test_unset_correspondent(self) -> None:
@@ -82,7 +82,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
bulk_edit.set_correspondent([self.doc1.id, self.doc2.id, self.doc3.id], None)
self.assertEqual(Document.objects.filter(correspondent=self.c2).count(), 0)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc2.id, self.doc3.id])
def test_set_document_type(self) -> None:
@@ -93,7 +93,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
self.assertEqual(Document.objects.filter(document_type=self.dt2).count(), 3)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc1.id, self.doc2.id])
def test_unset_document_type(self) -> None:
@@ -101,7 +101,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
bulk_edit.set_document_type([self.doc1.id, self.doc2.id, self.doc3.id], None)
self.assertEqual(Document.objects.filter(document_type=self.dt2).count(), 0)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc2.id, self.doc3.id])
def test_set_document_storage_path(self) -> None:
@@ -123,7 +123,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
self.assertEqual(Document.objects.filter(storage_path=None).count(), 4)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc1.id])
@@ -154,7 +154,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
self.assertEqual(Document.objects.filter(storage_path=None).count(), 5)
self.async_task.assert_called()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc1.id])
@@ -166,7 +166,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
self.assertEqual(Document.objects.filter(tags__id=self.t1.id).count(), 4)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc1.id, self.doc3.id])
def test_remove_tag(self) -> None:
@@ -174,7 +174,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
bulk_edit.remove_tag([self.doc1.id, self.doc3.id, self.doc4.id], self.t1.id)
self.assertEqual(Document.objects.filter(tags__id=self.t1.id).count(), 1)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc4.id])
def test_modify_tags(self) -> None:
@@ -191,7 +191,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
self.assertCountEqual(list(self.doc3.tags.all()), [self.t2, tag_unrelated])
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
# TODO: doc3 should not be affected, but the query for that is rather complicated
self.assertCountEqual(kwargs["document_ids"], [self.doc2.id, self.doc3.id])
@@ -248,7 +248,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc1.id, self.doc2.id])
def test_modify_custom_fields_with_values(self) -> None:
@@ -325,7 +325,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
self.async_task.assert_called_once()
_, kwargs = self.async_task.call_args
kwargs = self.async_task.call_args.kwargs["kwargs"]
self.assertCountEqual(kwargs["document_ids"], [self.doc1.id, self.doc2.id])
# removal of document link cf, should also remove symmetric link
@@ -428,7 +428,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
self.assertEqual(source_doc.id, version2.id)
self.assertNotEqual(source_doc.id, version1.id)
@mock.patch("documents.tasks.bulk_update_documents.delay")
@mock.patch("documents.tasks.bulk_update_documents.apply_async")
def test_set_permissions(self, m) -> None:
doc_ids = [self.doc1.id, self.doc2.id, self.doc3.id]
@@ -467,7 +467,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
self.assertEqual(groups_with_perms.count(), 1)
@mock.patch("documents.tasks.bulk_update_documents.delay")
@mock.patch("documents.tasks.bulk_update_documents.apply_async")
def test_set_permissions_merge(self, m) -> None:
doc_ids = [self.doc1.id, self.doc2.id, self.doc3.id]
@@ -643,20 +643,20 @@ class TestPDFActions(DirectoriesMixin, TestCase):
)
mock_consume_file.assert_called()
consume_file_args, _ = mock_consume_file.call_args
call_kwargs = mock_consume_file.call_args.kwargs
self.assertEqual(
Path(consume_file_args[0].original_file).name,
Path(call_kwargs["input_doc"].original_file).name,
expected_filename,
)
self.assertEqual(consume_file_args[1].title, None)
self.assertEqual(call_kwargs["overrides"].title, None)
# No metadata_document_id, delete_originals False, so ASN should be None
self.assertIsNone(consume_file_args[1].asn)
self.assertIsNone(call_kwargs["overrides"].asn)
# With metadata_document_id overrides
result = bulk_edit.merge(doc_ids, metadata_document_id=metadata_document_id)
consume_file_args, _ = mock_consume_file.call_args
self.assertEqual(consume_file_args[1].title, "B (merged)")
self.assertEqual(consume_file_args[1].created, self.doc2.created)
call_kwargs = mock_consume_file.call_args.kwargs
self.assertEqual(call_kwargs["overrides"].title, "B (merged)")
self.assertEqual(call_kwargs["overrides"].created, self.doc2.created)
self.assertEqual(result, "OK")
@@ -720,16 +720,15 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_consume_file.assert_called()
mock_delete_documents.assert_called()
consume_sig = mock_consume_file.return_value
consume_sig.apply_async.assert_called_once()
mock_consume_file.return_value.set.return_value.apply_async.assert_called_once()
consume_file_args, _ = mock_consume_file.call_args
call_kwargs = mock_consume_file.call_args.kwargs
self.assertEqual(
Path(consume_file_args[0].original_file).name,
Path(call_kwargs["input_doc"].original_file).name,
expected_filename,
)
self.assertEqual(consume_file_args[1].title, None)
self.assertEqual(consume_file_args[1].asn, 101)
self.assertEqual(call_kwargs["overrides"].title, None)
self.assertEqual(call_kwargs["overrides"].asn, 101)
delete_documents_args, _ = mock_delete_documents.call_args
self.assertEqual(
@@ -764,7 +763,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.doc1.archive_serial_number = 111
self.doc1.save()
sig = mock.Mock()
sig.apply_async.side_effect = Exception("boom")
sig.set.return_value.apply_async.side_effect = Exception("boom")
mock_consume_file.return_value = sig
with self.assertRaises(Exception):
@@ -801,8 +800,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
)
self.assertEqual(result, "OK")
consume_file_args, _ = mock_consume_file.call_args
self.assertEqual(consume_file_args[1].asn, 202)
self.assertEqual(mock_consume_file.call_args.kwargs["overrides"].asn, 202)
def test_restore_archive_serial_numbers_task(self) -> None:
"""
@@ -843,9 +841,8 @@ class TestPDFActions(DirectoriesMixin, TestCase):
)
mock_consume_file.assert_called()
consume_file_args, _ = mock_consume_file.call_args
self.assertEqual(
Path(consume_file_args[0].original_file).name,
Path(mock_consume_file.call_args.kwargs["input_doc"].original_file).name,
expected_filename,
)
@@ -889,9 +886,11 @@ class TestPDFActions(DirectoriesMixin, TestCase):
user = User.objects.create(username="test_user")
result = bulk_edit.split(doc_ids, pages, delete_originals=False, user=user)
self.assertEqual(mock_consume_file.call_count, 2)
consume_file_args, _ = mock_consume_file.call_args
self.assertEqual(consume_file_args[1].title, "B (split 2)")
self.assertIsNone(consume_file_args[1].asn)
self.assertEqual(
mock_consume_file.call_args.kwargs["overrides"].title,
"B (split 2)",
)
self.assertIsNone(mock_consume_file.call_args.kwargs["overrides"].asn)
self.assertEqual(result, "OK")
@@ -953,8 +952,10 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertEqual(result, "OK")
self.assertEqual(mock_consume_file.call_count, 2)
consume_file_args, _ = mock_consume_file.call_args
self.assertEqual(consume_file_args[1].title, "B (split 2)")
self.assertEqual(
mock_consume_file.call_args.kwargs["overrides"].title,
"B (split 2)",
)
mock_delete_documents.assert_called()
mock_chord.assert_called_once()
@@ -1001,7 +1002,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.doc2.refresh_from_db()
self.assertEqual(self.doc2.archive_serial_number, 222)
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.Pdf.save")
def test_split_with_errors(self, mock_save_pdf, mock_consume_file) -> None:
"""
@@ -1025,7 +1026,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_consume_file.assert_not_called()
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch("documents.tasks.consume_file.apply_async")
def test_rotate(self, mock_consume_delay):
"""
GIVEN:
@@ -1042,12 +1043,12 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_consume_delay.call_args_list,
doc_ids,
):
consumable, overrides = call.args
self.assertEqual(consumable.root_document_id, expected_id)
self.assertIsNotNone(overrides)
task_kwargs = call.kwargs["kwargs"]
self.assertEqual(task_kwargs["input_doc"].root_document_id, expected_id)
self.assertIsNotNone(task_kwargs["overrides"])
self.assertEqual(result, "OK")
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.Pdf.save")
def test_rotate_with_error(
self,
@@ -1073,7 +1074,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertIn(expected_str, error_str)
mock_consume_delay.assert_not_called()
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch("documents.tasks.consume_file.apply_async")
def test_rotate_non_pdf(
self,
mock_consume_delay,
@@ -1091,13 +1092,16 @@ class TestPDFActions(DirectoriesMixin, TestCase):
expected_str = f"Document {self.img_doc.id} is not a PDF, skipping rotation"
self.assertTrue(any(expected_str in line for line in cm.output))
self.assertEqual(mock_consume_delay.call_count, 1)
consumable, overrides = mock_consume_delay.call_args[0]
self.assertEqual(consumable.root_document_id, self.doc2.id)
self.assertIsNotNone(overrides)
task_kwargs = mock_consume_delay.call_args.kwargs["kwargs"]
self.assertEqual(task_kwargs["input_doc"].root_document_id, self.doc2.id)
self.assertIsNotNone(task_kwargs["overrides"])
self.assertEqual(result, "OK")
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.open")
def test_rotate_explicit_selection_uses_root_source_when_root_selected(
self,
@@ -1124,9 +1128,12 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_open.assert_called_once_with(self.doc2.source_path)
mock_consume_delay.assert_called_once()
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.Pdf.save")
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
def test_delete_pages(self, mock_magic, mock_pdf_save, mock_consume_delay):
"""
GIVEN:
@@ -1142,14 +1149,19 @@ class TestPDFActions(DirectoriesMixin, TestCase):
result = bulk_edit.delete_pages(doc_ids, pages)
mock_pdf_save.assert_called_once()
mock_consume_delay.assert_called_once()
consumable, overrides = mock_consume_delay.call_args[0]
self.assertEqual(consumable.root_document_id, self.doc2.id)
self.assertTrue(str(consumable.original_file).endswith("_pages_deleted.pdf"))
self.assertIsNotNone(overrides)
task_kwargs = mock_consume_delay.call_args.kwargs["kwargs"]
self.assertEqual(task_kwargs["input_doc"].root_document_id, self.doc2.id)
self.assertTrue(
str(task_kwargs["input_doc"].original_file).endswith("_pages_deleted.pdf"),
)
self.assertIsNotNone(task_kwargs["overrides"])
self.assertEqual(result, "OK")
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.open")
def test_delete_pages_explicit_selection_uses_root_source_when_root_selected(
self,
@@ -1176,7 +1188,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_open.assert_called_once_with(self.doc2.source_path)
mock_consume_delay.assert_called_once()
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.Pdf.save")
def test_delete_pages_with_error(self, mock_pdf_save, mock_consume_delay):
"""
@@ -1259,8 +1271,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
result = bulk_edit.edit_pdf(doc_ids, operations, delete_original=True)
self.assertEqual(result, "OK")
mock_chord.assert_called_once()
consume_file_args, _ = mock_consume_file.call_args
self.assertEqual(consume_file_args[1].asn, 250)
self.assertEqual(mock_consume_file.call_args.kwargs["overrides"].asn, 250)
self.doc2.refresh_from_db()
self.assertIsNone(self.doc2.archive_serial_number)
@@ -1297,7 +1308,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.doc2.refresh_from_db()
self.assertEqual(self.doc2.archive_serial_number, 333)
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch("documents.tasks.consume_file.apply_async")
def test_edit_pdf_with_update_document(self, mock_consume_delay):
"""
GIVEN:
@@ -1319,13 +1330,18 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertEqual(result, "OK")
mock_consume_delay.assert_called_once()
consumable, overrides = mock_consume_delay.call_args[0]
self.assertEqual(consumable.root_document_id, self.doc2.id)
self.assertTrue(str(consumable.original_file).endswith("_edited.pdf"))
self.assertIsNotNone(overrides)
task_kwargs = mock_consume_delay.call_args.kwargs["kwargs"]
self.assertEqual(task_kwargs["input_doc"].root_document_id, self.doc2.id)
self.assertTrue(
str(task_kwargs["input_doc"].original_file).endswith("_edited.pdf"),
)
self.assertIsNotNone(task_kwargs["overrides"])
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.new")
@mock.patch("pikepdf.open")
def test_edit_pdf_explicit_selection_uses_root_source_when_root_selected(
@@ -1433,7 +1449,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_consume_file.assert_not_called()
@mock.patch("documents.bulk_edit.update_document_content_maybe_archive_file.delay")
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("documents.bulk_edit.tempfile.mkdtemp")
@mock.patch("pikepdf.open")
def test_remove_password_update_document(
@@ -1468,18 +1484,21 @@ class TestPDFActions(DirectoriesMixin, TestCase):
fake_pdf.remove_unreferenced_resources.assert_called_once()
mock_update_document.assert_not_called()
mock_consume_delay.assert_called_once()
consumable, overrides = mock_consume_delay.call_args[0]
task_kwargs = mock_consume_delay.call_args.kwargs["kwargs"]
expected_path = temp_dir / f"{doc.id}_unprotected.pdf"
self.assertTrue(expected_path.exists())
self.assertEqual(
Path(consumable.original_file).resolve(),
Path(task_kwargs["input_doc"].original_file).resolve(),
expected_path.resolve(),
)
self.assertEqual(consumable.root_document_id, doc.id)
self.assertIsNotNone(overrides)
self.assertEqual(task_kwargs["input_doc"].root_document_id, doc.id)
self.assertIsNotNone(task_kwargs["overrides"])
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
@mock.patch("documents.tasks.consume_file.delay")
@mock.patch(
"documents.data_models.mime_detection.from_file",
return_value="application/pdf",
)
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.open")
def test_remove_password_explicit_selection_uses_root_source_when_root_selected(
self,
@@ -1548,9 +1567,9 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertEqual(result, "OK")
mock_open.assert_called_once_with(doc.source_path, password="secret")
mock_consume_file.assert_called_once()
consume_args, _ = mock_consume_file.call_args
consumable_document = consume_args[0]
overrides = consume_args[1]
call_kwargs = mock_consume_file.call_args.kwargs
consumable_document = call_kwargs["input_doc"]
overrides = call_kwargs["overrides"]
expected_path = temp_dir / f"{doc.id}_unprotected.pdf"
self.assertTrue(expected_path.exists())
self.assertEqual(
@@ -1558,7 +1577,9 @@ class TestPDFActions(DirectoriesMixin, TestCase):
expected_path.resolve(),
)
self.assertEqual(overrides.owner_id, user.id)
mock_group.assert_called_once_with([mock_consume_file.return_value])
mock_group.assert_called_once_with(
[mock_consume_file.return_value.set.return_value],
)
mock_group.return_value.delay.assert_called_once()
mock_chord.assert_not_called()
+19 -26
View File
@@ -140,28 +140,25 @@ class FaultyGenericExceptionParser(_BaseNewStyleParser):
raise Exception("Generic exception.")
def fake_magic_from_file(file, *, mime=False): # NOSONAR
if mime:
filepath = Path(file)
if filepath.name.startswith("invalid_pdf"):
return "application/octet-stream"
if filepath.name.startswith("valid_pdf"):
return "application/pdf"
if filepath.suffix == ".pdf":
return "application/pdf"
elif filepath.suffix == ".png":
return "image/png"
elif filepath.suffix == ".webp":
return "image/webp"
elif filepath.suffix == ".eml":
return "message/rfc822"
else:
return "unknown"
def fake_magic_from_file(file): # NOSONAR
filepath = Path(file)
if filepath.name.startswith("invalid_pdf"):
return "application/octet-stream"
if filepath.name.startswith("valid_pdf"):
return "application/pdf"
if filepath.suffix == ".pdf":
return "application/pdf"
elif filepath.suffix == ".png":
return "image/png"
elif filepath.suffix == ".webp":
return "image/webp"
elif filepath.suffix == ".eml":
return "message/rfc822"
else:
return "A verbose string that describes the contents of the file"
return "unknown"
@mock.patch("documents.consumer.magic.from_file", fake_magic_from_file)
@mock.patch("documents.consumer.mime_detection.from_file", fake_magic_from_file)
class TestConsumer(
DirectoriesMixin,
FileSystemAssertsMixin,
@@ -1146,7 +1143,7 @@ class TestConsumer(
)
@mock.patch("documents.consumer.magic.from_file", fake_magic_from_file)
@mock.patch("documents.consumer.mime_detection.from_file", fake_magic_from_file)
class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
def setUp(self) -> None:
super().setUp()
@@ -1328,7 +1325,7 @@ class PreConsumeTestCase(DirectoriesMixin, GetConsumerMixin, TestCase):
environment = args[1]
self.assertEqual(command[0], script.name)
self.assertEqual(command[1], str(self.test_file))
self.assertEqual(len(command), 1)
subset = {
"DOCUMENT_SOURCE_PATH": str(c.input_doc.original_file),
@@ -1478,11 +1475,7 @@ class PostConsumeTestCase(DirectoriesMixin, GetConsumerMixin, TestCase):
environment = args[1]
self.assertEqual(command[0], script.name)
self.assertEqual(command[1], str(doc.pk))
self.assertEqual(command[5], f"/api/documents/{doc.pk}/download/")
self.assertEqual(command[6], f"/api/documents/{doc.pk}/thumb/")
self.assertEqual(command[7], "my_bank")
self.assertCountEqual(command[8].split(","), ["a", "b"])
self.assertEqual(len(command), 1)
subset = {
"DOCUMENT_ID": str(doc.pk),
+4 -4
View File
@@ -83,7 +83,7 @@ class TestDoubleSided(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
dt.datetime.now(),
delta=dt.timedelta(seconds=5),
)
self.assertIn("Received odd numbered pages", msg)
self.assertIn("Received odd numbered pages", msg["reason"])
def test_collation(self) -> None:
"""
@@ -129,7 +129,7 @@ class TestDoubleSided(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
)
msg = self.consume_file("double-sided-odd.pdf")
self.assertIsFile(self.staging_file)
self.assertIn("Received odd numbered pages", msg)
self.assertIn("Received odd numbered pages", msg["reason"])
def test_less_odd_pages_then_even_fails(self) -> None:
"""
@@ -212,7 +212,7 @@ class TestDoubleSided(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
"""
msg = self.consume_file("simple.pdf", Path("..") / "simple.pdf")
self.assertIsNotFile(self.staging_file)
self.assertRegex(msg, r"Success. New document id \d+ created")
self.assertIsInstance(msg.get("document_id"), int)
def test_subdirectory_upload(self) -> None:
"""
@@ -252,4 +252,4 @@ class TestDoubleSided(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
"""
msg = self.consume_file("simple.pdf")
self.assertIsNotFile(self.staging_file)
self.assertRegex(msg, r"Success. New document id \d+ created")
self.assertIsInstance(msg.get("document_id"), int)
+1 -1
View File
@@ -211,7 +211,7 @@ class TestCreateClassifier:
call_command("document_create_classifier", skip_checks=True)
m.assert_called_once_with(scheduled=False, status_callback=mocker.ANY)
m.assert_called_once_with(status_callback=mocker.ANY)
assert callable(m.call_args.kwargs["status_callback"])
def test_create_classifier_callback_output(self, mocker: MockerFixture) -> None:
+35 -35
View File
@@ -97,12 +97,10 @@ def consumer_filter() -> ConsumerFilter:
@pytest.fixture
def mock_consume_file_delay(mocker: MockerFixture) -> MagicMock:
"""Mock the consume_file.delay celery task."""
mock_task = mocker.patch(
"""Mock the consume_file task."""
return mocker.patch(
"documents.management.commands.document_consumer.consume_file",
)
mock_task.delay = mocker.MagicMock()
return mock_task
@pytest.fixture
@@ -453,9 +451,9 @@ class TestConsumeFile:
subdirs_as_tags=False,
)
mock_consume_file_delay.delay.assert_called_once()
call_args = mock_consume_file_delay.delay.call_args
consumable_doc = call_args[0][0]
mock_consume_file_delay.apply_async.assert_called_once()
call_args = mock_consume_file_delay.apply_async.call_args
consumable_doc = call_args.kwargs["kwargs"]["input_doc"]
assert isinstance(consumable_doc, ConsumableDocument)
assert consumable_doc.original_file == target
assert consumable_doc.source == DocumentSource.ConsumeFolder
@@ -471,7 +469,7 @@ class TestConsumeFile:
consumption_dir=consumption_dir,
subdirs_as_tags=False,
)
mock_consume_file_delay.delay.assert_not_called()
mock_consume_file_delay.apply_async.assert_not_called()
def test_consume_directory(
self,
@@ -487,7 +485,7 @@ class TestConsumeFile:
consumption_dir=consumption_dir,
subdirs_as_tags=False,
)
mock_consume_file_delay.delay.assert_not_called()
mock_consume_file_delay.apply_async.assert_not_called()
def test_consume_with_permission_error(
self,
@@ -506,7 +504,7 @@ class TestConsumeFile:
consumption_dir=consumption_dir,
subdirs_as_tags=False,
)
mock_consume_file_delay.delay.assert_not_called()
mock_consume_file_delay.apply_async.assert_not_called()
def test_consume_with_tags_error(
self,
@@ -529,9 +527,9 @@ class TestConsumeFile:
consumption_dir=consumption_dir,
subdirs_as_tags=True,
)
mock_consume_file_delay.delay.assert_called_once()
call_args = mock_consume_file_delay.delay.call_args
overrides = call_args[0][1]
mock_consume_file_delay.apply_async.assert_called_once()
call_args = mock_consume_file_delay.apply_async.call_args
overrides = call_args.kwargs["kwargs"]["overrides"]
assert overrides.tag_ids is None
@@ -629,7 +627,7 @@ class TestCommandOneshot:
cmd = Command()
cmd.handle(directory=str(consumption_dir), oneshot=True, testing=False)
mock_consume_file_delay.delay.assert_called_once()
mock_consume_file_delay.apply_async.assert_called_once()
def test_processes_recursive(
self,
@@ -652,7 +650,7 @@ class TestCommandOneshot:
cmd = Command()
cmd.handle(directory=str(consumption_dir), oneshot=True, testing=False)
mock_consume_file_delay.delay.assert_called_once()
mock_consume_file_delay.apply_async.assert_called_once()
def test_ignores_unsupported_extensions(
self,
@@ -671,7 +669,7 @@ class TestCommandOneshot:
cmd = Command()
cmd.handle(directory=str(consumption_dir), oneshot=True, testing=False)
mock_consume_file_delay.delay.assert_not_called()
mock_consume_file_delay.apply_async.assert_not_called()
class ConsumerThread(Thread):
@@ -795,12 +793,12 @@ class TestCommandWatch:
target = consumption_dir / "document.pdf"
shutil.copy(sample_pdf, target)
wait_for_mock_call(mock_consume_file_delay.delay, timeout_s=2.0)
wait_for_mock_call(mock_consume_file_delay.apply_async, timeout_s=2.0)
if thread.exception:
raise thread.exception
mock_consume_file_delay.delay.assert_called()
mock_consume_file_delay.apply_async.assert_called()
def test_detects_moved_file(
self,
@@ -821,12 +819,12 @@ class TestCommandWatch:
target = consumption_dir / "document.pdf"
shutil.move(temp_location, target)
wait_for_mock_call(mock_consume_file_delay.delay, timeout_s=2.0)
wait_for_mock_call(mock_consume_file_delay.apply_async, timeout_s=2.0)
if thread.exception:
raise thread.exception
mock_consume_file_delay.delay.assert_called()
mock_consume_file_delay.apply_async.assert_called()
def test_handles_slow_write(
self,
@@ -847,12 +845,12 @@ class TestCommandWatch:
f.flush()
sleep(0.05)
wait_for_mock_call(mock_consume_file_delay.delay, timeout_s=2.0)
wait_for_mock_call(mock_consume_file_delay.apply_async, timeout_s=2.0)
if thread.exception:
raise thread.exception
mock_consume_file_delay.delay.assert_called()
mock_consume_file_delay.apply_async.assert_called()
def test_ignores_macos_files(
self,
@@ -868,13 +866,15 @@ class TestCommandWatch:
(consumption_dir / "._document.pdf").write_bytes(b"test")
shutil.copy(sample_pdf, consumption_dir / "valid.pdf")
wait_for_mock_call(mock_consume_file_delay.delay, timeout_s=2.0)
wait_for_mock_call(mock_consume_file_delay.apply_async, timeout_s=2.0)
if thread.exception:
raise thread.exception
assert mock_consume_file_delay.delay.call_count == 1
call_args = mock_consume_file_delay.delay.call_args[0][0]
assert mock_consume_file_delay.apply_async.call_count == 1
call_args = mock_consume_file_delay.apply_async.call_args.kwargs["kwargs"][
"input_doc"
]
assert call_args.original_file.name == "valid.pdf"
@pytest.mark.django_db
@@ -924,12 +924,12 @@ class TestCommandWatchPolling:
# Actively wait for consumption
# Polling needs: interval (0.5s) + stability (0.1s) + next poll (0.5s) + margin
wait_for_mock_call(mock_consume_file_delay.delay, timeout_s=5.0)
wait_for_mock_call(mock_consume_file_delay.apply_async, timeout_s=5.0)
if thread.exception:
raise thread.exception
mock_consume_file_delay.delay.assert_called()
mock_consume_file_delay.apply_async.assert_called()
@pytest.mark.management
@@ -953,12 +953,12 @@ class TestCommandWatchRecursive:
target = subdir / "document.pdf"
shutil.copy(sample_pdf, target)
wait_for_mock_call(mock_consume_file_delay.delay, timeout_s=2.0)
wait_for_mock_call(mock_consume_file_delay.apply_async, timeout_s=2.0)
if thread.exception:
raise thread.exception
mock_consume_file_delay.delay.assert_called()
mock_consume_file_delay.apply_async.assert_called()
def test_subdirs_as_tags(
self,
@@ -983,15 +983,15 @@ class TestCommandWatchRecursive:
target = subdir / "document.pdf"
shutil.copy(sample_pdf, target)
wait_for_mock_call(mock_consume_file_delay.delay, timeout_s=2.0)
wait_for_mock_call(mock_consume_file_delay.apply_async, timeout_s=2.0)
if thread.exception:
raise thread.exception
mock_consume_file_delay.delay.assert_called()
mock_consume_file_delay.apply_async.assert_called()
mock_tags.assert_called()
call_args = mock_consume_file_delay.delay.call_args
overrides = call_args[0][1]
call_args = mock_consume_file_delay.apply_async.call_args
overrides = call_args.kwargs["kwargs"]["overrides"]
assert overrides.tag_ids is not None
assert len(overrides.tag_ids) == 2
@@ -1021,7 +1021,7 @@ class TestCommandWatchEdgeCases:
if thread.exception:
raise thread.exception
mock_consume_file_delay.delay.assert_not_called()
mock_consume_file_delay.apply_async.assert_not_called()
@pytest.mark.usefixtures("mock_supported_extensions")
def test_handles_task_exception(
@@ -1035,7 +1035,7 @@ class TestCommandWatchEdgeCases:
mock_task = mocker.patch(
"documents.management.commands.document_consumer.consume_file",
)
mock_task.delay.side_effect = Exception("Task error")
mock_task.apply_async.side_effect = Exception("Task error")
thread = ConsumerThread(consumption_dir, scratch_dir)
try:
@@ -615,7 +615,7 @@ class TestExportImport(
self.assertIsFile(expected_file)
with ZipFile(expected_file) as zip:
# 11 files + 3 directory marker entries for the subdirectory structure
# Extras are from the directories, which also appear in the listing
self.assertEqual(len(zip.namelist()), 14)
self.assertIn("manifest.json", zip.namelist())
self.assertIn("metadata.json", zip.namelist())
@@ -666,57 +666,6 @@ class TestExportImport(
self.assertIn("manifest.json", zip.namelist())
self.assertIn("metadata.json", zip.namelist())
def test_export_zip_atomic_on_failure(self) -> None:
"""
GIVEN:
- Request to export documents to zipfile
WHEN:
- Export raises an exception mid-way
THEN:
- No .zip file is written at the final path
- The .tmp file is cleaned up
"""
args = ["document_exporter", self.target, "--zip"]
with mock.patch.object(
document_exporter.Command,
"dump",
side_effect=RuntimeError("simulated failure"),
):
with self.assertRaises(RuntimeError):
call_command(*args)
expected_zip = self.target / f"export-{timezone.localdate().isoformat()}.zip"
expected_tmp = (
self.target / f"export-{timezone.localdate().isoformat()}.zip.tmp"
)
self.assertIsNotFile(expected_zip)
self.assertIsNotFile(expected_tmp)
def test_export_zip_no_scratch_dir(self) -> None:
"""
GIVEN:
- Request to export documents to zipfile
WHEN:
- Documents are exported
THEN:
- No files are written under SCRATCH_DIR during the export
(the old workaround used a temp dir there)
"""
shutil.rmtree(Path(self.dirs.media_dir) / "documents")
shutil.copytree(
Path(__file__).parent / "samples" / "documents",
Path(self.dirs.media_dir) / "documents",
)
scratch_before = set(settings.SCRATCH_DIR.glob("paperless-export*"))
args = ["document_exporter", self.target, "--zip"]
call_command(*args)
scratch_after = set(settings.SCRATCH_DIR.glob("paperless-export*"))
self.assertEqual(scratch_before, scratch_after)
def test_export_target_not_exists(self) -> None:
"""
GIVEN:
+64 -15
View File
@@ -1,4 +1,5 @@
from io import StringIO
from unittest.mock import patch
import pytest
from django.core.management import CommandError
@@ -6,12 +7,11 @@ from django.core.management import call_command
from django.test import TestCase
from documents.models import Document
from documents.tests.factories import DocumentFactory
@pytest.mark.management
class TestFuzzyMatchCommand(TestCase):
MSG_REGEX = r"Document \d fuzzy match to \d \(confidence \d\d\.\d\d\d\)"
def call_command(self, *args, **kwargs):
stdout = StringIO()
stderr = StringIO()
@@ -77,7 +77,7 @@ class TestFuzzyMatchCommand(TestCase):
filename="other_test.pdf",
)
stdout, _ = self.call_command()
self.assertIn("No matches found", stdout)
self.assertIn("No duplicate documents found", stdout)
def test_with_matches(self) -> None:
"""
@@ -106,7 +106,7 @@ class TestFuzzyMatchCommand(TestCase):
filename="other_test.pdf",
)
stdout, _ = self.call_command("--processes", "1")
self.assertRegex(stdout, self.MSG_REGEX)
self.assertIn("Found 1 matching pair(s)", stdout)
def test_with_3_matches(self) -> None:
"""
@@ -142,10 +142,8 @@ class TestFuzzyMatchCommand(TestCase):
filename="final_test.pdf",
)
stdout, _ = self.call_command("--no-progress-bar", "--processes", "1")
lines = [x.strip() for x in stdout.splitlines() if x.strip()]
self.assertEqual(len(lines), 3)
for line in lines:
self.assertRegex(line, self.MSG_REGEX)
# 3 docs -> 3 unique pairs; summary confirms count and no duplication
self.assertIn("Found 3 matching pair(s)", stdout)
def test_document_deletion(self) -> None:
"""
@@ -186,22 +184,47 @@ class TestFuzzyMatchCommand(TestCase):
stdout, _ = self.call_command(
"--delete",
"--yes",
"--no-progress-bar",
"--processes",
"1",
)
self.assertIn(
"The command is configured to delete documents. Use with caution",
stdout,
)
self.assertRegex(stdout, self.MSG_REGEX)
self.assertIn("Deleting 1 documents based on ratio matches", stdout)
self.assertIn("Delete Mode", stdout)
self.assertIn("Found 1 matching pair(s)", stdout)
self.assertIn("Deleting 1 document(s)", stdout)
self.assertEqual(Document.objects.count(), 2)
self.assertIsNotNone(Document.objects.get(pk=1))
self.assertIsNotNone(Document.objects.get(pk=2))
def test_document_deletion_cancelled(self) -> None:
"""
GIVEN:
- 3 documents exist
- Document 1 to document 3 has a similarity over 85.0
WHEN:
- Command is called with --delete but user answers "n" at the prompt
THEN:
- No documents are deleted
"""
DocumentFactory(content="first document scanned by bob")
DocumentFactory(content="second document scanned by alice")
DocumentFactory(content="first document scanned by pete")
self.assertEqual(Document.objects.count(), 3)
with patch("builtins.input", return_value="n"):
stdout, _ = self.call_command(
"--delete",
"--no-progress-bar",
"--processes",
"1",
)
self.assertIn("Deletion cancelled", stdout)
self.assertEqual(Document.objects.count(), 3)
def test_empty_content(self) -> None:
"""
GIVEN:
@@ -226,4 +249,30 @@ class TestFuzzyMatchCommand(TestCase):
filename="other_test.pdf",
)
stdout, _ = self.call_command()
self.assertIn("No matches found", stdout)
self.assertIn("No duplicate documents found", stdout)
@pytest.mark.management
@pytest.mark.django_db
@pytest.mark.parametrize(
("content_a", "content_b"),
[
pytest.param("x" * 90 + "y" * 10, "x" * 100, id="yellow-90pct"), # 88-92%
pytest.param("x" * 94 + "y" * 6, "x" * 100, id="red-94pct"), # 92-97%
pytest.param("x" * 99 + "y", "x" * 100, id="bold-red-99pct"), # ≥97%
],
)
def test_similarity_color_band(content_a: str, content_b: str) -> None:
"""Each parametrized case exercises one color branch in _render_results."""
DocumentFactory(content=content_a)
DocumentFactory(content=content_b)
stdout = StringIO()
call_command(
"document_fuzzy_match",
"--no-progress-bar",
"--processes",
"1",
stdout=stdout,
skip_checks=True,
)
assert "Found 1 matching pair(s)" in stdout.getvalue()

Some files were not shown because too many files have changed in this diff Show More