paperless-ngx

mirror of https://github.com/paperless-ngx/paperless-ngx.git synced 2026-07-19 10:24:57 +00:00

Author	SHA1	Message	Date
shamoon	ab8fe0521b	Merge branch 'beta' into dev	2026-06-02 08:32:54 -07:00
shamoon	2638554969	Merge branch 'main' into dev	2026-06-02 08:32:43 -07:00
Trenton H, GitHub, Claude Sonnet 4.6	2c58d86380	Fix: Minor fixes for the AI indexing (#12893) * Fix: Remove all nodes for multi-chunk documents in update_llm_index incremental path The existing_nodes dict comprehension keyed on document_id silently dropped all but the last node per document, so only that one node was deleted when a modified document was re-indexed, leaving all other chunks as ghost vectors in the FAISS index. Switch to a defaultdict(list) that collects every node per document_id, then iterate and delete all of them before inserting fresh nodes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix: Wire document_updated signal to LLM index update handler Connect document_updated to add_or_update_document_in_llm_index in DocumentsConfig.ready() so REST API edits (PATCH /api/documents/{id}/) enqueue an LLM vector store update, matching the existing document_consumption_finished behavior. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix: Add file lock around FAISS index mutations to prevent concurrent write corruption Two concurrent Celery workers calling llm_index_add_or_update_document or llm_index_remove_document each loaded the same on-disk index independently, made their own change, and the last writer silently overwrote the first's update. Wrap both functions and the rebuild/persist body of update_llm_index in a filelock.FileLock keyed on LLM_INDEX_DIR/index.lock. Add a TOCTOU comment on queue_llm_index_update_if_needed explaining the residual risk (duplicate rebuild tasks are wasteful but not corrupting because the lock serialises the actual write). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix: Apply _normalize() in extract_unmatched_names to prevent duplicate suggestions extract_unmatched_names was using .lower() while _match_names_to_queryset uses _normalize() (which also strips punctuation). A name like "J. Smith" matched to existing correspondent "J Smith" would still appear in the unmatched list, causing duplicate object creation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix: Skip LLM index update gracefully when document has no indexable content Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix: Persist empty index when all documents are deleted to clear stale FAISS vectors The early-return guard in update_llm_index fired before persist() when no documents existed, leaving a stale on-disk FAISS index that returned phantom hits for deleted document IDs. Now the guard only returns early for the incremental (rebuild=False) path when no index exists on disk; the rebuild path always continues through to persist(), producing an empty clean index. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Chore: Simplify incremental index update - use docs.values() and deduplicate node extend --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-01 13:40:49 -07:00
shamoon and GitHub	52222d23d3	Fix (beta): dont use tool calling with ollama (#12896)	2026-06-01 12:12:23 -07:00
shamoon and GitHub	27426c04b0	Enhancement: try to respect language for AI suggestions (#12894)	2026-06-01 12:11:46 -07:00
shamoon and GitHub	f6c865bf47	Enhancement: AI LLM chunk size and context window config (#12891)	2026-06-01 17:56:21 +00:00
Trenton H, GitHub, Claude Sonnet 4.6	bb860a5834	Fix: Improvements for security around the AI (#12895) * Fix: Validate and limit chat question input in ChatStreamingView Add max_length=4000 to ChatStreamingSerializer.q and replace the bare request.data["q"] read with proper serializer.is_valid(raise_exception=True) so oversized or missing questions are rejected with HTTP 400 before reaching the LLM. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix: Add defensive prompt framing to mark document content as untrusted * Also adds a system prompt which is treated higher that this is untrusted stuff --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-01 10:03:27 -07:00
Nathanaël Houn and GitHub	432fa45e0c	Fix: correctly show timestamp tooltip on history item (#12879)	2026-05-31 05:18:58 -07:00
shamoon	20d73d26b6	Bump dessant/lock-threads action to v6.0.2	2026-05-29 21:51:16 -07:00
Trenton H and GitHub	889ccfd67a	Fix: Fold query and autocomplete terms with Tantivy's ascii_fold so special letters match (#12868)	2026-05-29 16:42:07 -07:00
Trenton H and GitHub	bbceb5dac6	Fix: Don't store autocomplete_word, only index it (#12867)	2026-05-29 14:09:04 -07:00
Trenton H and GitHub	98a7ed32e3	Fix: Preserve Whoosh date range swapping in Tantviy (#12866)	2026-05-29 20:21:59 +00:00
Trenton H and GitHub	25a7b2038a	Fix: Always release search index writer, even on failure, so the write lock doesn't persist for later (#12865)	2026-05-29 19:38:58 +00:00
Trenton H and GitHub	97e3c75720	Fix: Handle CJK title, content and metadata searching (#12862)	2026-05-29 19:11:55 +00:00
Trenton H and GitHub	11c62757ef	Fix: Restrict date query rewrites to date or datetime fields only (#12864)	2026-05-29 11:59:30 -07:00
Trenton H and GitHub	4a8d79be6f	Fix: Missing call to tanvity wait_merging_threads (#12863)	2026-05-29 10:32:15 -07:00
Trenton H and GitHub	525b986e23	Fix: Handle tanvity index lock contention (#12856) Implements and tests a retry with backoff + jitter for aquring the index update lock. If we still can't get it, dispatch a celery task to handle it later instead (also with retry) Signed-off-by: stumpylog <797416+stumpylog@users.noreply.github.com>	2026-05-27 09:47:13 -07:00
shamoon and GitHub	4ce5f2022c	Fix (beta): better catch chat errors (#12854)	2026-05-26 19:05:47 +00:00
shamoon and GitHub	ab47185712	Performance (beta): dont re-build vector index with each chat (#12847)	2026-05-26 11:36:05 -07:00
shamoon and GitHub	01d8fad622	Security: fixes for v3 beta (#12838)	2026-05-26 16:46:23 +00:00
shamoon and GitHub	da3e845b8b	Fix (beta): normalize long punctuation chunks to improve embedding (#12848)	2026-05-26 09:32:38 -07:00
Matt Van Horn and GitHub	45ba35dd3a	docs: remove duplicate words in three files (#12852)	2026-05-26 06:40:30 -07:00
shamoon and GitHub	6d57ba4481	Chore: tweak anti-slop workflow (#12851)	2026-05-26 06:34:00 -07:00
shamoon and GitHub	0a6e0db186	Fix: use chord.on_error before apply_async (#12842)	2026-05-24 14:42:11 -07:00
shamoon	15682231b2	Chore: fix sonarcube logger warnings	2026-05-20 08:54:00 -07:00
Trenton H and GitHub	df861189fa	Fix: Don't use smaller integer fields for some workflow fields (#12834)	2026-05-20 14:39:01 +00:00
Trenton H, GitHub, shamoon	bd86dca57e	Fix: Password removal source file location (#12830) Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>	2026-05-19 13:52:04 -07:00
Trenton H and GitHub	9f45737b94	Upgrades this dep so it handles newer models, like gpt-5-5 which require a locked 1.0 temperature value (#12824)	2026-05-18 12:30:03 -07:00
shamoon and GitHub	83d59ad3bf	Fix (beta): use correct html button type for custom field buttons (#12819)	2026-05-17 19:15:03 -07:00
Trenton H and GitHub	ff3360310b	Fix: Defer password removal workflow action until the file is in place (#12814)	2026-05-16 17:14:37 -07:00
Trenton H and GitHub	9a68dcdddf	Fix: Allow setting allauth rate limit configuration settings (#12798)	2026-05-14 07:29:49 -07:00
Trenton H and GitHub	9a78882b5a	Fix: Don't embed the metadata which is already embedded into the context (#12795)	2026-05-13 09:01:34 -07:00
Trenton H and GitHub	7e381f204e	Fix: Sanitize dash or plus from the text search path (#12789)	2026-05-12 12:41:38 -07:00
shamoon	5f42854d99	Fix: two more css tweaks to tasks page	2026-05-11 13:50:02 -07:00
shamoon and GitHub	bc1d2fbccb	Fix: improve new tasks ui layout across screen sizes (#12784)	2026-05-11 13:46:17 -07:00
shamoon, GitHub, stumpylog	7471fedb43	Fix: Update parser contract to require empty strings, not None (#12775) Co-authored-by: stumpylog <797416+stumpylog@users.noreply.github.com>	2026-05-11 09:16:21 -07:00
Trenton H and GitHub	1527c347e3	Chore: Further dependency minor security updates (#12780)	2026-05-11 08:59:19 -07:00
Trenton H and GitHub	da0f25b546	Fix: Use a persistent, writeable location for hugging face models (#12771)	2026-05-09 18:23:11 -07:00
dependabot[bot], GitHub, dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	6cd5784bd7	Chore(deps): Bump hono (#12767) Bumps the npm_and_yarn group with 1 update in the /src-ui directory: [hono](https://github.com/honojs/hono). Updates `hono` from 4.12.16 to 4.12.18 - [Release notes](https://github.com/honojs/hono/releases) - [Commits](https://github.com/honojs/hono/compare/v4.12.16...v4.12.18) --- updated-dependencies: - dependency-name: hono dependency-version: 4.12.18 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-08 22:42:12 -07:00
shamoon and GitHub	79d0a04df0	Enhancement: support ollama embeddings (#12753)	2026-05-09 00:06:14 +00:00
Moritz Stückler, GitHub, Claude Sonnet 4.6	177d81c8d4	Fix: create LLM_INDEX_DIR before writing meta.json on first run (#12759) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 23:38:41 +00:00
dependabot[bot], GitHub, dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	4207999b63	Chore(deps): Bump @babel/plugin-transform-modules-systemjs (#12764) Bumps the npm_and_yarn group with 1 update in the /src-ui directory: [@babel/plugin-transform-modules-systemjs](https://github.com/babel/babel/tree/HEAD/packages/babel-plugin-transform-modules-systemjs). Updates `@babel/plugin-transform-modules-systemjs` from 7.29.0 to 7.29.4 - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.29.4/packages/babel-plugin-transform-modules-systemjs) --- updated-dependencies: - dependency-name: "@babel/plugin-transform-modules-systemjs" dependency-version: 7.29.4 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-08 15:45:59 -07:00
dependabot[bot], GitHub, dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	934a1ed8d7	Chore(deps): Bump fast-uri from 3.1.1 to 3.1.2 in /src-ui in the npm_and_yarn group across 1 directory (#12763) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-08 15:05:57 -07:00
Trenton H and GitHub	5202dc0748	Fix: Clear ContentType/guardian caches at import and test cases (#12758)	2026-05-08 20:48:47 +00:00
Trenton H and GitHub	b1e44f5d6b	Tweakhancment: Include the last applied 'documents' migration in the log (#12757)	2026-05-08 20:37:10 +00:00
shamoon and GitHub	57b91ad2cf	Fix: use response synthesizer for RAG doc chat (#12751)	2026-05-08 20:01:44 +00:00
shamoon and GitHub	8769dc894e	Fix: only update modified field in notes actions (#12750)	2026-05-08 15:36:07 +00:00
shamoon and GitHub	978e54ab52	Fixhancement: version-aware thumbnail etag (#12754)	2026-05-08 08:26:37 -07:00
shamoon and GitHub	268ded92bc	Documentation: Update v3 migration docs (#12752)	2026-05-08 08:19:15 -07:00
Trenton H and GitHub	9a1e2aea50	Fix: Handle dash or plus operators in search queries (#12734)	2026-05-07 17:26:11 +00:00
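
Note on commit 2c58d86380: the first fix in that message describes a dict comprehension keyed on document_id that kept only the last node per document, replaced by a defaultdict(list) that collects every node. The sketch below illustrates that pattern only. It is not the project's code: the `Node` class and both helper functions are hypothetical names invented for this example, and the real index nodes likely carry more fields.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for one indexed chunk of a document."""
    node_id: str
    document_id: int


def last_node_per_document(nodes):
    # Problem pattern: nodes sharing a document_id overwrite each other,
    # so only the last chunk of each document survives in the mapping.
    return {n.document_id: n for n in nodes}


def group_nodes_by_document(nodes):
    # Fix pattern: collect every chunk under its document_id so that all
    # of them can be deleted before fresh nodes are inserted.
    grouped = defaultdict(list)
    for n in nodes:
        grouped[n.document_id].append(n)
    return grouped


# Document 1 was split into two chunks, document 2 into one.
nodes = [Node("a", 1), Node("b", 1), Node("c", 2)]
stale = group_nodes_by_document(nodes)[1]  # both chunks of document 1
```

With the first helper, chunk "a" would never be scheduled for deletion; with the second, both chunks of document 1 are found.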
