122 Commits

Author SHA1 Message Date
jonaswinkler 7f9a0204b5 removed most of the logic that extracts data from filename patterns #156 2020-12-20 00:08:05 +01:00
jonaswinkler 6003122b06 fixes #112 2020-12-09 22:16:57 +01:00
jonaswinkler 74a99cf330 removed slugs entirely, since their only purpose was purely cosmetic anyway. 2020-12-09 00:04:37 +01:00
jonaswinkler 550a74347c a test that "verifies" that the file renaming lock works and no inconsistencies are created. 2020-12-08 21:08:44 +01:00
jonaswinkler 9da11f29c7 fixes #90 2020-12-08 13:54:49 +01:00
jonaswinkler 834352130c checking file types against parsers in the consumer. 2020-12-01 15:26:05 +01:00
jonaswinkler 8a5c782425 filename handling for archive files. 2020-11-30 21:38:42 +01:00
jonaswinkler aaa6599283 Merge branch 'dev' into feature-ocrmypdf 2020-11-30 16:48:09 +01:00
jonaswinkler f51207fc32 added file type checks to the parsers to prevent temporary files from being consumed. Also: parsers announce file types they wish to use as default for each mime type. 2020-11-30 00:40:04 +01:00
jonaswinkler 39c682dc07 Merge branch 'dev' into feature-ocrmypdf 2020-11-29 18:37:38 +01:00
jonaswinkler 023aeea7ea test cases for #67 2020-11-29 15:47:56 +01:00
jonaswinkler a27daaebe9 fixes an issue with paperless not assigning metadata when FILENAME_FORMAT is specified and resolves an invalid warning about missing files fixes #67 2020-11-29 14:45:43 +01:00
jonaswinkler 9677631bb2 error logging. 2020-11-29 12:37:11 +01:00
jonaswinkler 24767f62c7 added checksums for archived documents. 2020-11-29 12:31:26 +01:00
jonaswinkler ea9de1bcf1 Merge branch 'dev' into feature-ocrmypdf 2020-11-27 14:03:19 +01:00
jonaswinkler d04b54140c moved consumption dir check into the correct spot 2020-11-27 13:12:13 +01:00
Jonas Winkler 7e84863beb Merge branch 'dev' into feature-ocrmypdf 2020-11-25 21:13:02 +01:00
Jonas Winkler ef15de18a9 Paperless will continue to operate with encrypted files, however, all new files will be stored unencrypted. 2020-11-25 21:03:06 +01:00
Jonas Winkler 6f30ceea38 GnuPG for archive file. 2020-11-25 20:16:27 +01:00
Jonas Winkler 9bd0bee2f6 codestyle 2020-11-25 19:51:02 +01:00
Jonas Winkler df801d17e1 reworked the interface of the parsers. 2020-11-25 19:36:39 +01:00
Jonas Winkler 8069c2eb6a add support for archive files. 2020-11-25 14:47:17 +01:00
Jonas Winkler 9a33f191a7 added archive directory. 2020-11-25 14:45:21 +01:00
Jonas Winkler b44f8383e4 code cleanup 2020-11-21 14:03:45 +01:00
Jonas Winkler 3d5b66c2b7 FileType does not care about the extension anymore. 2020-11-20 16:18:59 +01:00
Jonas Winkler 41650f20f4 mime type handling 2020-11-20 13:31:03 +01:00
Jonas Winkler 727f86c369 codestyle 2020-11-18 22:41:14 +01:00
Jonas Winkler 8908bc259e updated logging, logging for the mail consumer to see whats happening 2020-11-18 13:23:30 +01:00
Jonas Winkler c7c6be42be refactor 2020-11-17 11:49:44 +01:00
Jonas Winkler 70d8e8bc56 added more testing 2020-11-16 23:16:37 +01:00
Jonas Winkler 8dca459573 first version of the new consumer. 2020-11-16 18:26:54 +01:00
Jonas Winkler 2e04ba1c04 code style fixes 2020-11-12 21:09:45 +01:00
Jonas Winkler 734da28b69 fixed the file handling implementation. The feature is cool, but the original implementation had so many small flaws it wasn't even funny. 2020-11-11 14:21:33 +01:00
Jonas Winkler 02ef7cb038 small consumer fixes 2020-11-11 14:14:21 +01:00
Jonas Winkler 83f82f3caf added a setting: delete duplicate documents 2020-11-10 01:47:58 +01:00
Jonas Winkler 296c113b16 updated the classifier. Its now much faster and does not retrain when data hasnt changed. 2020-11-06 14:46:06 +01:00
Jonas Winkler f4cebda085 A handy script to redo ocr on all documents, 2020-11-03 14:04:11 +01:00
Jonas Winkler 7d282a4e4e removed unused code, small fixes 2020-11-02 18:20:04 +01:00
Jonas Winkler d15405ef56 reworked most of the tesseract parser, better logging 2020-11-02 15:40:44 +01:00
Jonas Winkler 9f29dc2863 updated consumer: now using watchdog 2020-11-01 23:07:54 +01:00
Jonas Winkler 05f20c19c3 the document classifier is now stateless 2020-10-29 14:33:42 +01:00
Jonas Winkler 11af74ba36 unified document matching, legacy and automatching work alongside now 2020-10-28 11:45:11 +01:00
Jonas Winkler 052c1680f3 added: document index, api access for thumbnails/downloads, more api filters; updated: pipfile; removed: filename handling, legacy thumb/download access, obsolete admin gui settings (per page items, FY, inline view) 2020-10-25 23:03:02 +01:00
Jonas Winkler 421dab786d Merge branch 'master' into dev 2020-10-16 15:02:57 +02:00
JOKer 8698f92ac9 Merge pull request #593 from BastianPoe/feature-293: Give stored documents a structured and configurable filename 2020-05-02 08:33:49 +02:00
Johann Bauer 22c7f309a7 Warn if consume directory contains subdirectories. 2020-01-04 01:09:54 +01:00
Wolf-Bastian Poettner 6813805712 Allows to configure directory and filename formats for documents stored in paperless. Default configuration is as before (incrementing numbers), but additional fields can be added at will 2019-12-27 14:25:38 +00:00
Jonas Winkler ea58c66fd4 Merge branch 'master' into dev 2018-12-11 12:38:15 +01:00
Jonas Winkler 766109ae4e Merge remote-tracking branch 'upstream/master' 2018-12-11 12:06:15 +01:00
Daniel Quinn 750ab5bf85 Use optipng to optimise document thumbnails 2018-10-07 14:56:38 +01:00