Commit Graph

355 Commits

Author SHA1 Message Date
Trenton H 0d3f2ffb75 test(mail): lift composite test API endpoints to module constants 2026-04-30 10:02:08 -07:00
Trenton H 4942ab1000 test(mail): convert test_api.py to pytest style 2026-04-30 09:58:37 -07:00
Trenton H 9e17c55a9a test(mail): add mail_api_user, mail_api_client, bogus_mailbox fixtures 2026-04-30 09:45:11 -07:00
Trenton H 42da2e2fd6 test(mail): use pytest tmp_path for MessageEncryptor GPG homes 2026-04-30 09:07:20 -07:00
Trenton H 4872bfbeae test(mail): convert test_preprocessor.py to pytest style 2026-04-30 08:56:26 -07:00
Trenton H e40cd6048d test(mail): add mail_mocker fixture wrapping MailMocker helper 2026-04-30 08:48:23 -07:00
Trenton H 56ebc08a65 test(mail): convert test_mail_oauth.py to pytest style 2026-04-30 08:44:37 -07:00
Trenton H b7d466c242 test(mail): add mail_user and oauth_settings pytest fixtures 2026-04-30 08:32:52 -07:00
Trenton H 14fe520319 Chore: Update typing and baselines again (#12641)
a
2026-04-28 09:28:05 -07:00
shamoon 21ff254ffc Merge branch 'main' into dev 2026-04-26 19:12:40 -07:00
shamoon e5561ba06f Fix: correctly scope mail account enumeration (#12636) 2026-04-24 10:15:59 -07:00
Trenton H 50ec987a81 Chore: Refactors all of the mail tests to use model factories instead of bare create (#12613) 2026-04-20 15:43:43 -07:00
Trenton H 58789e5061 Chore: Structured consume task return values (#12612) 2026-04-20 13:19:54 -07:00
Trenton H fbf4e32646 Chore: Converts all call sites and test asserts to use apply_async and headers (#12591) 2026-04-20 11:40:04 -07:00
Trenton H 5e609101d1 Chore: Update API schema fields (#12611)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 11:05:00 -07:00
shamoon 8f036c9521 Merge branch 'main' into dev
# Conflicts:
#	docs/usage.md
#	src/documents/signals/handlers.py
#	src/documents/tests/test_api_documents.py
#	src/documents/views.py
2026-04-14 15:11:23 -07:00
Trenton H 3b6edcdd8e Chore: Add generic type params and update our baselines (#12566)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 14:12:59 -07:00
shamoon e7c7978d67 Enhancement: allow opt-in blocking internal mail hosts (#12502) 2026-04-03 03:24:28 +00:00
shamoon ae0474450f Chore: logger, response and template sanitization cleanup (#12439) 2026-03-26 07:36:02 -07:00
Trenton H 701735f6e5 Chore: Drop old signal and unneeded apps, transition to parser registry instead (#12405)
* refactor: switch consumer and callers to ParserRegistry (Phase 4)

Replace all Django signal-based parser discovery with direct registry
calls. Removes `_parser_cleanup`, `parser_is_new_style` shims, and all
old-style isinstance checks. All parser instantiation now uses the
`with parser_class() as parser:` context manager pattern.

- documents/parsers.py: delegate to get_parser_registry(); drop lru_cache
- documents/consumer.py: use registry + context manager; remove shims
- documents/tasks.py: same pattern
- documents/management/commands/document_thumbnails.py: same pattern
- documents/views.py: get_metadata uses context manager
- documents/checks.py: use get_parser_registry().all_parsers()
- paperless/parsers/registry.py: add all_parsers() public method
- tests: update mocks to target documents.consumer.get_parser_class_for_mime_type

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: drop get_parser_class_for_mime_type; callers use registry directly

All callers now call get_parser_registry().get_parser_for_file() with
the actual filename and path, enabling score() to use file extension
hints. The MIME-only helper is removed.

- consumer.py: passes self.filename + self.working_copy
- tasks.py: passes document.original_filename + document.source_path
- document_thumbnails.py: same pattern
- views.py: passes Path(file).name + Path(file)
- parsers.py: internal helpers inline the registry call with filename=""
- test_parsers.py: drop TestParserDiscovery (was testing mock behavior);
  TestParserAvailability uses registry directly
- test_consumer.py: mocks switch to documents.consumer.get_parser_registry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: remove document_consumer_declaration signal infrastructure

Remove the document_consumer_declaration signal that was previously used
for parser registration. Each parser app no longer connects to this signal,
and the signal declaration itself has been removed from documents/signals.

Changes:
- Remove document_consumer_declaration from documents/signals/__init__.py
- Remove ready() methods and signal imports from all parser app configs
- Delete signal shim files (signals.py) from all parser apps:
  - paperless_tesseract/signals.py
  - paperless_text/signals.py
  - paperless_tika/signals.py
  - paperless_mail/signals.py
  - paperless_remote/signals.py

Parser discovery now happens exclusively through the ParserRegistry
system introduced in the previous refactor phases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: remove empty paperless_text and paperless_tika Django apps

After parser classes were moved to paperless/parsers/ in the plugin
refactor, these Django apps contained only empty AppConfig classes
with no models, views, tasks, migrations, or other functionality.

- Remove paperless_text and paperless_tika from INSTALLED_APPS
- Delete empty app directories entirely
- Update pyproject.toml test exclusions
- Clean stale mypy baseline entries for moved parser files

paperless_remote app is retained as it contains meaningful system
checks for Azure AI configuration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Moves the checks and tests to the main application and removes the old applications

* Adds a comment to satisy Sonar

* refactor: remove automatic log_summary() call from get_parser_registry()

The summary was logged once per process, causing it to appear repeatedly
during Docker startup (management commands, web server, each Celery
worker subprocess). External parsers are already announced individually
at INFO when discovered; the full summary is redundant noise.
log_summary() is retained on ParserRegistry for manual/debug use.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Cleans up the duplicate test file/fixture

* Fixes a race condition where webserver threads could race to populate the registry

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 06:53:32 -07:00
shamoon 3cfe9fa2a8 Fixhancement: default mail -created correspondent matching to exact (#12414) 2026-03-21 16:15:18 -07:00
shamoon 0f84af27d0 Merge branch 'main' into dev
# Conflicts:
#	docs/setup.md
#	src-ui/src/main.ts
#	src/documents/tests/test_api_bulk_edit.py
#	src/documents/tests/test_api_custom_fields.py
#	src/documents/tests/test_api_search.py
#	src/documents/tests/test_api_status.py
#	src/documents/tests/test_workflows.py
#	src/paperless_mail/tests/test_api.py
2026-03-21 02:12:19 -07:00
shamoon 7dbf8bdd4a Fix: enforce permissions when attaching accounts to mail rules 2026-03-21 00:44:28 -07:00
Trenton H c2b8b22fb4 Chore: Convert mail parser to plugin style (#12397)
* Refactor(mail): rename paperless_mail/parsers.py → paperless/parsers/mail.py

Preserve git history for MailDocumentParser by committing the rename
separately before editing, following the project convention.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Refactor(mail): move mail parser tests to paperless/tests/parsers/

Move test_parsers.py → test_mail_parser.py and test_parsers_live.py →
test_mail_parser_live.py alongside the other built-in parser tests,
preserving git history before editing. Update MailDocumentParser import
to the new canonical location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Chore: move mail parser sample files to paperless/tests/samples/mail/

Relocate all mail test fixtures from src/paperless_mail/tests/samples/ to
src/paperless/tests/samples/mail/ ahead of the parser plugin refactor.
Add the new path to the codespell skip list to prevent false-positive
spell corrections in binary/fixture email files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Feat(tests): add mail parser fixtures to paperless/tests/parsers/conftest.py

Add mail_samples_dir, per-file sample fixtures, and mail_parser
(context-manager style) to mirror the old paperless_mail conftest
but rooted at the new samples/mail/ location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Feat(parsers): migrate MailDocumentParser to ParserProtocol

Move the mail parser from paperless_mail/parsers.py to
paperless/parsers/mail.py and refactor it to implement ParserProtocol:

- Class-level name/version/author/url attributes
- supported_mime_types() and score() classmethods (score=20)
- can_produce_archive=False, requires_pdf_rendition=True
- Context manager lifecycle (__enter__/__exit__)
- New parse() signature without mailrule_id kwarg; consumer sets
  parser.mailrule_id before calling parse() instead
- get_text()/get_date()/get_archive_path() accessor methods
- extract_metadata() returning email headers and attachment info

Register MailDocumentParser in the ParserRegistry alongside Text and
Tika parsers. Update consumer, signals, and all import sites to use
the new location. Update tests to use the new accessor API, patch
paths, and context-manager fixture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix(parsers): pop legacy constructor args in mail signal wrapper

MailDocumentParser.__init__ takes no constructor args in the new
protocol. Update the get_parser() signal wrapper to pop logging_group
and progress_callback (passed by the legacy consumer dispatch path)
before instantiating — the same pattern used by TextDocumentParser.

Also update test_mail_parser_receives_mailrule to use the real signal
wrapper (mail_get_parser) instead of MailDocumentParser directly, so
the test exercises the actual dispatch path and matches the new
parse() call signature (no mailrule kwarg).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Bumps this so we can run

* Fixes location of the fixture

* Removes fixtures which were duplicated

* Feat(parsers): add ParserContext and configure() to ParserProtocol

Replace the ad-hoc mailrule_id attribute assignment with a typed,
immutable ParserContext dataclass and a configure() method on the
Protocol:

- ParserContext(frozen=True, slots=True) lives in paperless/parsers/
  alongside ParserProtocol and MetadataEntry; currently carries only
  mailrule_id but is designed to grow with output_type, ocr_mode, and
  ocr_language in a future phase (decoupling parsers from settings.*)
- ParserProtocol.configure(context: ParserContext) -> None is the
  extension point; no-op by default
- MailDocumentParser.configure() reads mailrule_id into _mailrule_id
- TextDocumentParser and TikaDocumentParser implement a no-op configure()
- Consumer calls document_parser.configure(ParserContext(...)) before
  parse(), replacing the isinstance(parser, MailDocumentParser) guard
  and the direct attribute mutation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Feat(parsers): call configure(ParserContext()) in update_document task

Apply the same new-style parser shim pattern as the consumer to
update_document_content_maybe_archive_file:

- Call __enter__ for Text/Tika parsers after instantiation
- Call configure(ParserContext()) before parse() for all new-style parsers
  (mailrule_id is not available here — this is a re-process of an
  existing document, so the default empty context is correct)
- Call parse(path, mime_type) with 2 args for new-style parsers
- Call get_thumbnail(path, mime_type) with 2 args for new-style parsers
- Call __exit__ instead of cleanup() in the finally block

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix(tests): add configure() to DummyParser and missing-method parametrize

ParserProtocol now requires configure(context: ParserContext) -> None.
Update DummyParser in test_registry.py to implement it, and add
'missing-configure' to the test_partial_compliant_fails_isinstance
parametrize list so the new method is covered by the negative test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Cleans up the reprocess task and generally reduces duplicate of classes

* Corrects the score return

* Updates so we can report a page count for these parsers, assuming we do have an archive produced when called

* Increases test coverage

* One more coverage

* Updates typing

* Updates typing

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 09:22:18 -07:00
Trenton H 68fc898042 Fix: Resolve more instances of tests which mutated global states (#12395) 2026-03-19 10:05:07 -07:00
Trenton H 1caa3eb8aa Chore: Disables the system checks for management commands in tests and when unnecessary (#12332) 2026-03-16 15:10:35 +00:00
Trenton H 5498503d60 Chore: Improve user migration path (#12232)
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-03-03 15:51:48 -08:00
shamoon 95484db71b Update paperless_mail npm packges 2026-03-02 00:12:52 -08:00
shamoon 8e7084eba7 Bump tailwindcss to v3.4.19 2026-03-02 00:12:52 -08:00
Jan Kleine 0bc032a67d Development: improve test portability (#12187)
* Fix: improve test portability

* Make settings always consistent

* Make a few more tests deterministic wrt settings

* Dont pollute settings for this one

* Fix timezone issue with mail parser

* Update test_parser.py

* Uh, I guess OCR gives variants for this

---------

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-02-27 23:24:11 +00:00
shamoon 90ae55252f Enhancement: prevent duplicate mail processing across rules (#12159) 2026-02-26 00:58:45 +00:00
shamoon 5b667621cd Try not to piss off mypy 2026-02-21 17:48:11 -08:00
shamoon 1b912c137a Merge branch 'main' into dev 2026-02-21 17:45:28 -08:00
shamoon 1bb4b9b473 More permissions on mail account test endpoint 2026-02-21 16:47:55 -08:00
shamoon f85094dc2b Set owner on OAuth mail credentials 2026-02-21 16:37:32 -08:00
shamoon 65ca78e9e7 Security: fix/GHSA-7qqc-wrcw-2fj9 2026-02-21 16:34:33 -08:00
shamoon 83f68d6063 Fix mailrule_stop_processing migration 2026-02-20 10:19:16 -08:00
shamoon 56d1b5677a Fix migration conflict 2026-02-13 09:47:15 -08:00
shamoon a467df0755 Enhancement: option to stop processing further mail rules (#12053) 2026-02-13 17:33:29 +00:00
Trenton H e8e027abc0 Chore: Optimizes the integer fields for choice types mostly, while leaving plenty of room to grow (#12057) 2026-02-10 15:11:44 -08:00
Trenton H 276dc13e3f Chore: Fixes the TO filter chaining so it doesn't reset the messages list + deterministic UIDs (#11987) 2026-02-03 11:31:19 -08:00
Sebastian Steinbeißer 3b5ffbf9fa Chore(mypy): Annotate None returns for typing improvements (#11213) 2026-02-02 08:44:12 -08:00
Trenton H 01b21377af Chore: Use a local http server instead of external to reduce flakiness (#11916) 2026-01-28 03:57:12 +00:00
Trenton H c84f2f04b3 Chore: Switch to a local IMAP server instead of a real email service (#11913) 2026-01-27 11:35:12 -08:00
Trenton H 4271812c2d Chore: Remove old migrations and re-create them fresh for v3 (#11854) 2026-01-24 15:02:56 -08:00
Trenton H c06e1e7cba Resolves gpg-agent hanging around and using inotify handles too (#11848) 2026-01-21 15:53:54 -08:00
shamoon 4e789acf2d Chore: mark test_error_skip_rule as flaky 2025-12-18 10:15:04 -08:00
shamoon 598540fda0 Chore: mark another test flaky 2025-12-18 07:17:50 -08:00
dependabot[bot] d3bfb186e0 Chore(deps-dev): Bump glob in /src/paperless_mail/templates (#11413)
Bumps [glob](https://github.com/isaacs/node-glob) from 10.4.1 to 10.5.0.
- [Changelog](https://github.com/isaacs/node-glob/blob/main/changelog.md)
- [Commits](https://github.com/isaacs/node-glob/compare/v10.4.1...v10.5.0)

---
updated-dependencies:
- dependency-name: glob
  dependency-version: 10.5.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-19 15:49:19 +00:00
shamoon 533b64cb70 Chore: replace test image URLs to our own files (#11390) 2025-11-17 10:22:54 -08:00