Files
paperless-ngx/src
Trenton H 0f6bdaf5de Feature: Add parser plugin registry and ParserProtocol (Phase 1 & 2)
Introduces the foundation of the entrypoint-based parser discovery
system to replace the signal-based document_consumer_declaration approach.

- Add ParserProtocol: runtime_checkable Protocol defining the full
  contract for document parsers (supported_mime_types, score, parse,
  context manager, result accessors)
- Add ParserRegistry: lazy singleton with entrypoint discovery via
  importlib.metadata group 'paperless_ngx.parsers', uniform score-based
  selection across external and built-in parsers
- Add get_parser_registry(), init_builtin_parsers(), reset_parser_registry()
  module-level helpers
- Wire Celery worker_process_init to call init_builtin_parsers() eagerly
  in each worker, deferring third-party discovery to first task use
- Add 28 pytest tests covering Protocol compliance, singleton lifecycle,
  scoring logic, entrypoint discovery, and log output

Built-in parsers and consumer migration follow in Phases 3-6.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 13:54:52 -07:00
..
2026-03-09 01:52:08 +00:00
2023-04-26 09:35:27 -07:00