paperless-ngx

mirror of https://github.com/paperless-ngx/paperless-ngx.git synced 2026-05-31 02:45:27 +00:00

Author	SHA1	Message	Date
Trenton Holmes	0b5b6fdad5	refactor: extract _parse_query and _apply_permission_filter helpers Deduplicates query parsing (3 call sites) and permission filter wrapping (4 call sites) into private helper methods on TantivyBackend. Also documents the N-lookup limitation of highlight_hits(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 13:31:37 -07:00
Trenton Holmes	d98dbd50f4	fix: address code review findings (int keys, docstring, empty ordering) - TantivyRelevanceList.__getitem__ now handles int keys, not just slices - search_ids() docstring corrected ("no highlights or scores") - Empty ordering param now correctly becomes None instead of "" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 13:26:10 -07:00
Trenton Holmes	7649e4a6b1	Merge remote-tracking branch 'origin/dev' into feature-search-pagination-improvements	2026-04-05 13:18:43 -07:00
Trenton Holmes	610ba27891	feat: replace 10000 overfetch with search_ids + page-only highlights Use search_ids() for the full set of matching IDs (lightweight ints, no arbitrary cap) and highlight_hits() for just the displayed page. TantivyRelevanceList now holds ordered IDs for count/selection_data and a small page of rich SearchHit dicts for serialization. Removes the hardcoded 10000 limit that silently truncated results for large collections. Memory usage down ~10% on sorted/paginated search paths at 200 docs, with larger gains expected at scale. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 12:54:47 -07:00
Trenton H	5f5fb263c9	Fix: Don't create a new note highlight generator per note in the loop (#12512)	2026-04-03 17:34:15 -07:00
Trenton Holmes	7c50e0077c	chore: remove temporary profiling infrastructure Profiling tests and helper served their purpose during the search performance optimization work. Baseline and post-implementation data captured in docs/superpowers/plans/. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 15:53:55 -07:00
Trenton Holmes	288740ea62	refactor: promote sort_field_map to class-level constant on TantivyBackend Single source of truth for sort field mapping. The viewset now references TantivyBackend.SORTABLE_FIELDS instead of maintaining a duplicate set. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 15:53:49 -07:00
Trenton Holmes	d998d3fbaf	feat: delegate sorting to Tantivy and use page-only highlights in viewset Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 15:35:14 -07:00
Trenton Holmes	6cf01dd383	feat: add search_ids() and more_like_this_ids() lightweight methods search_ids() returns only document IDs matching a query — no highlights, no SearchHit objects. more_like_this_ids() does the same for MLT queries. These provide lightweight paths when only IDs are needed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 15:21:52 -07:00
Trenton Holmes	0d915c58a4	feat: add highlight_page/highlight_page_size params to search() Gate expensive snippet/highlight generation to only the requested slice of hits, allowing the viewset to avoid generating highlights for all 10k results when only 25 are displayed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 15:10:00 -07:00
Trenton Holmes	46008d2da7	test: add baseline profiling tests for search performance Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 14:58:11 -07:00
shamoon	b807b107ad	Enhancement: include sharelinks + bundles in export/import (#12479)	2026-04-03 21:51:57 +00:00
Trenton H	c2f02851da	Chore: Better typed status manager messages (#12509)	2026-04-03 21:18:01 +00:00
GitHub Actions	d0f8a98a9a	Auto translate strings	2026-04-03 20:55:14 +00:00
shamoon	566afdffca	Enhancement: unify text search to use tantivy (#12485)	2026-04-03 13:53:45 -07:00
Trenton H	f32ad98d8e	Feature: Update consumer logging to include task ID for log correlation (#12510)	2026-04-03 13:31:40 -07:00
Trenton H	d365f19962	Security: Registers a custom serializer which signs the task payload (#12504)	2026-04-03 03:49:54 +00:00
GitHub Actions	2703c12f1a	Auto translate strings	2026-04-03 03:25:57 +00:00
shamoon	e7c7978d67	Enhancement: allow opt-in blocking internal mail hosts (#12502)	2026-04-03 03:24:28 +00:00
GitHub Actions	83501757df	Auto translate strings	2026-04-02 22:36:32 +00:00
Trenton H	dda05a7c00	Security: Improve overall security in a few ways (#12501) - Make sure we're always using regex with timeouts for user controlled data - Adds rate limiting to the token endpoint (configurable) - Signs the classifier pickle file with the SECRET_KEY and refuse to load one which doesn't verify. - Require the user to set a secret key, instead of falling back to our old hard coded one	2026-04-02 15:30:26 -07:00
Trenton H	376af81b9c	Fix: Resolve another TC assuming an object has been created somewhere (#12503)	2026-04-02 14:58:28 -07:00
GitHub Actions	05c9e21fac	Auto translate strings	2026-04-02 19:40:05 +00:00
Trenton H	aed9abe48c	Feature: Replace Whoosh with tantivy search backend (#12471) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Antoine Mérino <3023499+Merinorus@users.noreply.github.com>	2026-04-02 12:38:22 -07:00
GitHub Actions	2aa0c9f0b4	Auto translate strings	2026-03-31 18:25:03 +00:00
shamoon	d2328b776a	Performance: support bulk edit without id lists (#12355)	2026-03-31 18:23:28 +00:00
GitHub Actions	e1da2a1efe	Auto translate strings	2026-03-31 14:57:34 +00:00
shamoon	245514ad10	Performance: deprecate and remove usage of `all` in API results (#12309)	2026-03-31 07:55:59 -07:00
GitHub Actions	020057e1a4	Auto translate strings	2026-03-30 16:40:47 +00:00
shamoon	f715533770	Performance: support passing selection data with filtered document requests (#12300)	2026-03-30 16:38:52 +00:00
Jan Kleine	0292edbee7	Fixhancement: include trashed documents in document exporter/importer (#12425)	2026-03-30 16:30:22 +00:00
Andreas Schneider	85e0d1842a	Tests: add regression test for redis URL with empty username (#12460 ) * Tests: add regression test for redis URL with empty username and password Covers the unix://:SECRET@/path.sock format (empty username, password only), which was missing from the existing test cases for PR #12239. * Update src/paperless/tests/settings/test_custom_parsers.py --------- Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>	2026-03-29 06:31:18 -07:00
GitHub Actions	62f79c088e	Auto translate strings	2026-03-28 21:00:05 +00:00
shamoon	129da3ade7	Tweakhancement: show file extension in StoragePath test (#12452)	2026-03-28 13:58:33 -07:00
Trenton H	9383471fa0	Feature: Transition all checksums to use SHA256 (#12432)	2026-03-26 11:28:02 -07:00
GitHub Actions	b153ec803b	Auto translate strings	2026-03-26 14:38:10 +00:00
shamoon	ae0474450f	Chore: logger, response and template sanitization cleanup (#12439)	2026-03-26 07:36:02 -07:00
Trenton H	8efb01010c	fix: Don't silently drop the change_groups and switch to a couple slightly more efficient implementations (#12431)	2026-03-26 14:15:42 +00:00
Trenton H	d18bbfa9c3	Chore: Instead of manual temporary directory management, use a context manager (#12430)	2026-03-26 14:05:58 +00:00
GitHub Actions	79def8a200	Auto translate strings	2026-03-22 13:55:02 +00:00
Trenton H	701735f6e5	Chore: Drop old signal and unneeded apps, transition to parser registry instead (#12405 ) * refactor: switch consumer and callers to ParserRegistry (Phase 4) Replace all Django signal-based parser discovery with direct registry calls. Removes `_parser_cleanup`, `parser_is_new_style` shims, and all old-style isinstance checks. All parser instantiation now uses the `with parser_class() as parser:` context manager pattern. - documents/parsers.py: delegate to get_parser_registry(); drop lru_cache - documents/consumer.py: use registry + context manager; remove shims - documents/tasks.py: same pattern - documents/management/commands/document_thumbnails.py: same pattern - documents/views.py: get_metadata uses context manager - documents/checks.py: use get_parser_registry().all_parsers() - paperless/parsers/registry.py: add all_parsers() public method - tests: update mocks to target documents.consumer.get_parser_class_for_mime_type Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: drop get_parser_class_for_mime_type; callers use registry directly All callers now call get_parser_registry().get_parser_for_file() with the actual filename and path, enabling score() to use file extension hints. The MIME-only helper is removed. - consumer.py: passes self.filename + self.working_copy - tasks.py: passes document.original_filename + document.source_path - document_thumbnails.py: same pattern - views.py: passes Path(file).name + Path(file) - parsers.py: internal helpers inline the registry call with filename="" - test_parsers.py: drop TestParserDiscovery (was testing mock behavior); TestParserAvailability uses registry directly - test_consumer.py: mocks switch to documents.consumer.get_parser_registry Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: remove document_consumer_declaration signal infrastructure Remove the document_consumer_declaration signal that was previously used for parser registration. 
Each parser app no longer connects to this signal, and the signal declaration itself has been removed from documents/signals. Changes: - Remove document_consumer_declaration from documents/signals/__init__.py - Remove ready() methods and signal imports from all parser app configs - Delete signal shim files (signals.py) from all parser apps: - paperless_tesseract/signals.py - paperless_text/signals.py - paperless_tika/signals.py - paperless_mail/signals.py - paperless_remote/signals.py Parser discovery now happens exclusively through the ParserRegistry system introduced in the previous refactor phases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: remove empty paperless_text and paperless_tika Django apps After parser classes were moved to paperless/parsers/ in the plugin refactor, these Django apps contained only empty AppConfig classes with no models, views, tasks, migrations, or other functionality. - Remove paperless_text and paperless_tika from INSTALLED_APPS - Delete empty app directories entirely - Update pyproject.toml test exclusions - Clean stale mypy baseline entries for moved parser files paperless_remote app is retained as it contains meaningful system checks for Azure AI configuration. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Moves the checks and tests to the main application and removes the old applications * Adds a comment to satisy Sonar * refactor: remove automatic log_summary() call from get_parser_registry() The summary was logged once per process, causing it to appear repeatedly during Docker startup (management commands, web server, each Celery worker subprocess). External parsers are already announced individually at INFO when discovered; the full summary is redundant noise. log_summary() is retained on ParserRegistry for manual/debug use. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Cleans up the duplicate test file/fixture * Fixes a race condition where webserver threads could race to populate the registry --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-22 06:53:32 -07:00
GitHub Actions	07f54bfdab	Auto translate strings	2026-03-21 09:26:23 +00:00
shamoon	0f84af27d0	Merge branch 'main' into dev # Conflicts: # docs/setup.md # src-ui/src/main.ts # src/documents/tests/test_api_bulk_edit.py # src/documents/tests/test_api_custom_fields.py # src/documents/tests/test_api_search.py # src/documents/tests/test_api_status.py # src/documents/tests/test_workflows.py # src/paperless_mail/tests/test_api.py	2026-03-21 02:12:19 -07:00
shamoon	9646b8c67d	Bump version to 2.20.13	2026-03-21 01:50:04 -07:00
shamoon	e590d7df69	Merge branch 'release/v2.20.x'	2026-03-21 01:49:32 -07:00
shamoon	cc71aad058	Fix: suggest corrections only if visible results	2026-03-21 01:24:23 -07:00
shamoon	3cbdf5d0b7	Fix: require view permission for more-like search	2026-03-21 01:20:59 -07:00
shamoon	f84e0097e5	Fix validate document link targets	2026-03-21 00:55:36 -07:00
shamoon	7dbf8bdd4a	Fix: enforce permissions when attaching accounts to mail rules	2026-03-21 00:44:28 -07:00
shamoon	2cb155e717	Bump version to 2.20.12	2026-03-20 15:47:37 -07:00
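
Commits 610ba27891 and d98dbd50f4 describe a list that keeps every matching document ID but rich hits only for the displayed page, and that accepts both int and slice keys. The sketch below illustrates that idea in plain Python. It is not the project's code: `RelevanceList`, `highlight`, and `build_page` are hypothetical names, and the highlight step is a placeholder for whatever snippet generation the real backend performs.

```python
from dataclasses import dataclass, field


@dataclass
class RelevanceList:
    """Hypothetical stand-in: all matching IDs plus rich hits for one page."""

    ordered_ids: list[int]
    page_hits: dict[int, dict] = field(default_factory=dict)

    def __len__(self) -> int:
        # The count comes from the cheap ID list, not from the rich hits.
        return len(self.ordered_ids)

    def __getitem__(self, key):
        # Handle both slices and plain int keys.
        if isinstance(key, slice):
            return [self._hit(doc_id) for doc_id in self.ordered_ids[key]]
        return self._hit(self.ordered_ids[key])

    def _hit(self, doc_id: int) -> dict:
        # Off-page documents fall back to a bare ID record.
        return self.page_hits.get(doc_id, {"id": doc_id})


def highlight(doc_id: int, query: str) -> dict:
    """Placeholder for the expensive per-document snippet generation."""
    return {"id": doc_id, "highlight": f"...{query}..."}


def build_page(all_ids: list[int], query: str, page: int, page_size: int) -> RelevanceList:
    """Generate highlights only for the IDs on the requested (1-based) page."""
    start = (page - 1) * page_size
    page_ids = all_ids[start:start + page_size]
    hits = {doc_id: highlight(doc_id, query) for doc_id in page_ids}
    return RelevanceList(ordered_ids=all_ids, page_hits=hits)
```

With 60 matching IDs and a page size of 25, only 25 highlight calls are made per request, while `len()` still reports all 60 matches.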

3929 Commits
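
Commit dda05a7c00 mentions signing the classifier pickle with the SECRET_KEY and refusing to load a file that does not verify. A minimal sketch of that general technique, using HMAC-SHA256 from the Python standard library, might look like the following. The function names and the signature-prefix layout are assumptions for illustration, not the format paperless-ngx actually uses.

```python
import hashlib
import hmac
import pickle

SIG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def dump_signed(obj, key: bytes) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 signature (assumed layout)."""
    payload = pickle.dumps(obj)
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return signature + payload


def load_signed(blob: bytes, key: bytes):
    """Verify the signature before unpickling; raise if it does not match."""
    signature, payload = blob[:SIG_LEN], blob[SIG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking timing information during comparison.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch, refusing to unpickle")
    return pickle.loads(payload)
```

The key point is the ordering: the signature is checked before `pickle.loads` runs, since unpickling untrusted bytes can execute arbitrary code.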