Commit Graph

2299 Commits

Author SHA1 Message Date
Trenton H 25a7b2038a Fix: Always release search index writer, even on failure, so the write lock doesn't persist for later (#12865) 2026-05-29 19:38:58 +00:00
Trenton H 97e3c75720 Fix: Handle CJK title, content and metadata searching (#12862) 2026-05-29 19:11:55 +00:00
Trenton H 11c62757ef Fix: Restrict date query rewrites to date or datetime fields only (#12864) 2026-05-29 11:59:30 -07:00
Trenton H 4a8d79be6f Fix: Missing call to tanvity wait_merging_threads (#12863) 2026-05-29 10:32:15 -07:00
Trenton H 525b986e23 Fix: Handle tanvity index lock contention (#12856)
Implements and tests a retry with backoff + jitter for aquring the index update lock.  If we still can't get it, dispatch a celery task to handle it later instead (also with retry)

Signed-off-by: stumpylog <797416+stumpylog@users.noreply.github.com>
2026-05-27 09:47:13 -07:00
shamoon 01d8fad622 Security: fixes for v3 beta (#12838) 2026-05-26 16:46:23 +00:00
shamoon 0a6e0db186 Fix: use chord.on_error before apply_async (#12842) 2026-05-24 14:42:11 -07:00
shamoon 15682231b2 Chore: fix sonarcube logger warnings 2026-05-20 08:54:00 -07:00
Trenton H df861189fa Fix: Don't use smaller integer fields for some workflow fields (#12834) 2026-05-20 14:39:01 +00:00
Trenton H bd86dca57e Fix: Password removal source file location (#12830)
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-05-19 13:52:04 -07:00
Trenton H ff3360310b Fix: Defer password removal workflow action until the file is in place (#12814) 2026-05-16 17:14:37 -07:00
Trenton H 7e381f204e Fix: Sanitize dash or plus from the text search path (#12789) 2026-05-12 12:41:38 -07:00
shamoon 7471fedb43 Fix: Update parser contract to require empty strings, not None (#12775)
Co-authored-by: stumpylog <797416+stumpylog@users.noreply.github.com>
2026-05-11 09:16:21 -07:00
shamoon 79d0a04df0 Enhancement: support ollama embeddings (#12753) 2026-05-09 00:06:14 +00:00
Trenton H 5202dc0748 Fix: Clear ContentType/guardian caches at import and test cases (#12758) 2026-05-08 20:48:47 +00:00
shamoon 8769dc894e Fix: only update modified field in notes actions (#12750) 2026-05-08 15:36:07 +00:00
shamoon 978e54ab52 Fixhancement: version-aware thumbnail etag (#12754) 2026-05-08 08:26:37 -07:00
Trenton H 9a1e2aea50 Fix: Handle dash or plus operators in search queries (#12734) 2026-05-07 17:26:11 +00:00
Trenton H 2354f87a40 Fixes trash preview when a document has deleted versions (#12742) 2026-05-07 17:07:35 +00:00
shamoon 3097f06189 Fix: exclude versions from stats count (#12738) 2026-05-07 16:34:26 +00:00
Trenton H 2296d7fa0e Fix: Rewrite Whoosh year only queries to be to Tantivy date syntax (#12725) 2026-05-06 09:26:46 -07:00
Trenton H e2ad14f9ca Fix: workflow password removal didn't handle lists from the DB (#12716) 2026-05-05 12:52:34 -07:00
Trenton H c3b353873b use a few named tuples and data classes instead of so much unpacking (#12709) 2026-05-04 18:00:48 -07:00
Trenton H e822e72964 Feature: Further reduce document importer memory usage (#12707)
* Replaces loaddata with streaming bulk_create

Replaces call_command('loaddata') with a streaming implementation that
reads manifest records one at a time via ijson, accumulates per-model
batches up to --batch-size, and flushes via bulk_create.  This reduces
peak memory and no longer scales directly with the size of the import.

* fix(importer): avoid guardian lru_cache poisoning; include M2M through tables in check_constraints

clear_cache() inside the import transaction emptied Django's ContentType
manager cache while fixture PKs were live, causing downstream ContentType
lookups to repopulate guardian's separate @lru_cache(None) with
fixture-PK objects. After the TestCase transaction rolled back to
original PKs, guardian's lru_cache held stale fixture ContentType
objects, causing MixedContentTypeError in unrelated subsequent tests.

Remove clear_cache() since it was defending against a theoretical
stale-cache scenario that doesn't occur in a proper same-install restore.

Fix check_constraints() to explicitly include auto-created M2M through
tables (populated by .set() after bulk_create) alongside the model tables,
addressing the gap where join-table FK violations would have gone
undetected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Excludes the consumer and AnonymousUser from any models which might have a FK relation to it.  This prevents orphan things like UI setting, which have a relation to no existing user

* Splits into more sub functions for Sonar

* Improvements to the typing of the new functions

* Coverage for some error cases, and removes handling for pk only models.  No need to support these

* Final coverage gaps

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 16:36:05 -07:00
shamoon a76b6b826c Fix (dev): resolve tantivy search-filtered documents in bulk edit (#12705) 2026-05-04 14:34:08 -07:00
shamoon ca4444c9a3 Fix (dev): return empty list for non-positive search limit (#12703) 2026-05-04 08:50:15 -07:00
Trenton H 5010f37174 Fix: avoid unnecessary close_old_connections in Celery task dispatch (#12701) 2026-05-04 06:02:28 -07:00
Trenton H b38d112691 Chore: Convert more testing to pytest style (#12678) 2026-04-30 11:40:11 -07:00
shamoon d4a1de18f7 Change (dev): separate llm suggestions endpoint (#12675) 2026-04-30 05:52:45 -07:00
shamoon fbadc2e1b8 Enhancement: Paperless-ngx v3 Logo (#12673) 2026-04-29 16:38:25 -07:00
Trenton H 493d282059 Chore: Upgrades tantivy-py to the latest release (#12605) 2026-04-29 10:09:50 -07:00
shamoon 69cb4d06c6 Enhancement (dev): Use OpenAI-like backend (#12668) 2026-04-28 10:06:59 -07:00
Trenton H 14fe520319 Chore: Update typing and baselines again (#12641)
a
2026-04-28 09:28:05 -07:00
shamoon ff95512b9a Fix: apply tag changes directly to document in db (#12664) 2026-04-28 08:18:40 -07:00
shamoon 21ff254ffc Merge branch 'main' into dev 2026-04-26 19:12:40 -07:00
Gaëtan GOUZI be8658d61a fix: Return HTTP 400 instead of HTTP 500 on DELETE /api/documents/{id}/notes/ with missing or invalid note id (#12582) 2026-04-26 19:05:36 -07:00
shamoon 552e5cf422 Merge commit from fork 2026-04-26 17:46:11 -07:00
shamoon 778d6b9fe3 Fix(dev): catch llm configuration error on get suggestions (#12647) 2026-04-25 22:43:04 -07:00
Trenton H a2dbe17a78 Fix: Use FileResponse for file API responses (#12638)
* Updates code to use a FileResponse for streaming and unlink the file, but keep a handle to it

* Transitions the rest of the code to use FileResponse instead of a basic response, fixes up tests which assumed .content exists

* While here, let's add schema for it
2026-04-25 06:40:09 -07:00
Trenton H 610702d757 Fix: v9 API task response, removing pagination (#12637) 2026-04-24 13:34:53 -07:00
Trenton H d6e45093e8 Chore: Paginate the task listing (#12633)
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-04-24 10:31:37 -07:00
shamoon 5bb9241e9a Enhancement: show small task summary in system status (#12634) 2026-04-24 09:11:42 -07:00
Trenton H bdbecac7e8 Replaces two sentinel files with .index_settings.json which can properly store multiple values and handle None (#12625) 2026-04-23 02:38:26 +00:00
Trenton H 0c25c2dac5 Feature: Allow monitoring access to tasks summary (#12624)
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-04-22 13:48:54 -07:00
Trenton H 89a9e7f190 Performance: Increases workflow related M2M prefetching (#12618) 2026-04-21 22:01:51 +00:00
Trenton H a89cd2d5d9 Fix: Exact custom field monetary exact searching (#12592) 2026-04-21 18:01:27 +00:00
Trenton H 6017b11c42 Fix: Prefetches the custom field instance and the custom field all at once (#12617) 2026-04-21 10:24:51 -07:00
shamoon f784a74eba Enhancement: add highlighting to title + content searches (#12593) 2026-04-20 21:28:02 +00:00
Trenton H 58789e5061 Chore: Structured consume task return values (#12612) 2026-04-20 13:19:54 -07:00
Trenton H fbf4e32646 Chore: Converts all call sites and test asserts to use apply_async and headers (#12591) 2026-04-20 11:40:04 -07:00