Trenton H 7eb417e796 Feat: refactor TextDocumentParser to ParserProtocol
Starting from the moved paperless_text/parsers.py, rewrite the class to
satisfy ParserProtocol without inheriting from the old DocumentParser base:

- Add class-level identity attributes (name, version, author, url)
- Add supported_mime_types() and score() classmethods
- Add can_produce_archive and requires_pdf_rendition properties (both False)
- Replace tempdir / read_file_handle_unicode_errors from old base class with
  a self-contained __init__, __enter__, __exit__, and _read_text helper
- Drop file_name parameter from parse() and get_thumbnail(); add produce_archive kwarg
- Add extract_metadata() returning [] (plain text has no structured metadata)
- Remove get_settings() (not part of ParserProtocol)
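
The list above can be sketched as a standalone class. This is a minimal illustration, not the actual paperless-ngx code: the ParserProtocol stand-in, all method signatures, the return type of supported_mime_types(), the score value, the produce_archive default, and the identity strings are assumptions made for the example. get_thumbnail() is left out for brevity, and get_text() is a hypothetical accessor.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class ParserProtocol(Protocol):
    """Hypothetical stand-in; the real protocol may differ."""

    name: str
    version: str
    author: str
    url: str

    @classmethod
    def supported_mime_types(cls) -> dict[str, str]: ...
    @classmethod
    def score(cls, mime_type: str) -> int | None: ...
    @property
    def can_produce_archive(self) -> bool: ...
    @property
    def requires_pdf_rendition(self) -> bool: ...
    def parse(self, document_path: Path, mime_type: str, *,
              produce_archive: bool = True) -> None: ...
    def extract_metadata(self, document_path: Path, mime_type: str) -> list: ...


class TextDocumentParser:
    """Satisfies the protocol structurally; no base class is inherited."""

    # Class-level identity attributes (placeholder values).
    name = "Example Text Parser"
    version = "0.0.0"
    author = "example"
    url = "https://example.invalid"

    def __init__(self) -> None:
        # The parser owns its scratch directory instead of a base class.
        self._tempdir = Path(tempfile.mkdtemp(prefix="text-parser-"))
        self._text = None

    def __enter__(self) -> TextDocumentParser:
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        shutil.rmtree(self._tempdir, ignore_errors=True)

    @classmethod
    def supported_mime_types(cls) -> dict[str, str]:
        # Assumed shape: MIME type mapped to a default file extension.
        return {"text/plain": ".txt", "text/csv": ".csv"}

    @classmethod
    def score(cls, mime_type: str) -> int | None:
        # Assumed convention: an integer priority, or None if unsupported.
        return 10 if mime_type in cls.supported_mime_types() else None

    @property
    def can_produce_archive(self) -> bool:
        return False

    @property
    def requires_pdf_rendition(self) -> bool:
        return False

    def _read_text(self, path: Path) -> str:
        # Try strict UTF-8 first; on failure, replace undecodable bytes.
        try:
            return path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            return path.read_bytes().decode("utf-8", errors="replace")

    def parse(self, document_path: Path, mime_type: str, *,
              produce_archive: bool = True) -> None:
        # produce_archive is accepted but unused: plain text has no archive.
        self._text = self._read_text(document_path)

    def get_text(self) -> str | None:
        return self._text

    def extract_metadata(self, document_path: Path, mime_type: str) -> list:
        # Plain text carries no structured metadata.
        return []
```

Used as a context manager, the sketch removes its temporary directory on exit, and isinstance(parser, ParserProtocol) succeeds through structural typing alone.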

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 16:54:52 -07:00