9940 Commits

Author SHA1 Message Date
Daniel Quinn 5342db6ada Fix pycodestyle complaints
Apparently, pycodestyle updated itself to now check for invalid escape
sequences, which it only complains about if the regex in use isn't a raw
string (r"").
2018-09-09 20:00:12 +01:00
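The complaint described in this commit can be illustrated with a minimal sketch. The pattern and function names below are made up for illustration and are not taken from the repository. The point is only that a raw string hands backslash sequences such as `\d` to the regex engine unchanged, whereas the same pattern in a plain string literal is what pycodestyle reports as W605 (invalid escape sequence).

```python
import re

# Hypothetical date pattern, for illustration only.
# The r"" prefix keeps the backslashes literal; without it, the "\d"
# sequences in a plain string literal would trigger the W605 warning.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def find_date(text):
    """Return (year, month, day) strings for the first ISO-style date, or None."""
    match = DATE_PATTERN.search(text)
    return match.groups() if match else None
```

Calling `find_date` on a string that contains an ISO-style date returns the three captured groups as strings; a string without one returns `None`.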
Daniel Quinn 5c39fff51b Add tox to dev dependencies 2018-09-09 19:59:47 +01:00
ahyear ed0e40d3e6 add migrate command to docker update process 2018-09-06 15:32:41 +02:00
Jonas Winkler 11adc94e5e mode change 2018-09-06 12:00:01 +02:00
Jonas Winkler 04bf5fc094 fixed merge error 2018-09-06 10:15:15 +02:00
Joshua Taillon 652ead2f5c remove debugging print statement 2018-09-05 23:05:37 -04:00
Joshua Taillon be9757894a add INLINE_DOC to settings.py 2018-09-05 23:03:30 -04:00
Joshua Taillon 22378789e2 add option for inline vs. attachment for document rendering 2018-09-05 22:58:38 -04:00
Joshua Taillon 72c828170e move date-matching regex pattern to base parser module for use by all subclasses 2018-09-05 21:13:36 -04:00
Jonas Winkler d26f940a91 Merge branch 'dev' into machine-learning 2018-09-06 00:29:41 +02:00
Jonas Winkler 13725ef8ee Merge branch 'master' into dev 2018-09-06 00:28:58 +02:00
Jonas Winkler 6f0ca432c4 Added scikit-learn to requirements 2018-09-06 00:20:44 +02:00
Joshua Taillon cac63494f0 change tesseract parser to only convert first page to save (potentially) massive amounts of work 2018-09-05 15:18:35 -04:00
Jonas Winkler dd8746bac7 fixed the api 2018-09-05 15:29:05 +02:00
Jonas Winkler 8eeded95c4 Merge branch 'dev' into machine-learning 2018-09-05 15:26:39 +02:00
Jonas Winkler 131e1c9dd8 fixed the api 2018-09-05 15:25:14 +02:00
Jonas Winkler a6b4fc7e81 fixed api 2018-09-05 14:57:37 +02:00
Jonas Winkler cea880f245 implemented automatic classification field functionality 2018-09-05 14:31:02 +02:00
Jonas Winkler 82bc0e3368 Fixed a few things 2018-09-05 12:43:11 +02:00
Daniel Quinn 939a67bd4b Add empty requirements for rtd to reference 2018-09-05 11:16:42 +01:00
Daniel Quinn fbc6a58f5a Add credits for 2.2.0 that I forgot 2018-09-05 10:59:06 +01:00
Daniel Quinn 01a358d2b0 Re-flow text to keep it <80c wide 2018-09-05 10:58:41 +01:00
David Martin 6b447628ed Bump required version for Pyocr to support the latest tesseract 4.
This recently changed in the official tesseract engine [0]. -psm is
not allowed as an option anymore and --psm has to be used instead. The
latest pyocr enables support for this [1].

[0] tesseract-ocr/tesseract@ee201e1
[1] https://gitlab.gnome.org/World/OpenPaperwork/pyocr/commit/5abd0a566a0518bea00cb4247c16e67d0d3c2d65
2018-09-05 13:03:42 +10:00
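The flag change described in this commit could be handled along the following lines. This is a rough sketch, not how pyocr does it: `psm_option` and `build_command` are hypothetical helpers, and the version cut-off of 4.0 is an assumption based only on the commit message above.

```python
def psm_option(tesseract_version):
    """Pick the page-segmentation flag spelling for a version tuple such as (4, 0, 0)."""
    # Assumption: from 4.0 onward only the double-dash spelling is accepted,
    # while older releases are given the single-dash spelling.
    return "--psm" if tuple(tesseract_version) >= (4, 0) else "-psm"


def build_command(image_path, output_base, psm, tesseract_version):
    """Assemble an argument list of the form: tesseract IMAGE OUTPUTBASE <flag> <mode>."""
    return [
        "tesseract",
        image_path,
        output_base,
        psm_option(tesseract_version),
        str(psm),
    ]
```

Building the argument list as a Python list (rather than a single shell string) keeps the flag and its value as separate arguments, which is the form `subprocess.run` expects.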
Thomas Niederprüm 2308d5a613 Catch ProgrammingError in Document checks.
When running PostgreSQL or MariaDB/MySQL backends, a query to a non-existent
table will raise a "ProgrammingError". This patch properly catches this error.
Without this patch all management calls to manage.py will lead to an error when
running PostgreSQL or MariaDB as a backend.
2018-09-04 20:11:48 +02:00
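The idea behind this patch can be sketched without Django by using the standard-library `sqlite3` module as a stand-in. Everything here is an assumption made for illustration: the table name `documents_document`, the function name, and the warning text are hypothetical, and SQLite reports a missing table as `OperationalError`, whereas the commit concerns `ProgrammingError` on PostgreSQL and MariaDB/MySQL (in Django these are `django.db.utils.OperationalError` and `django.db.utils.ProgrammingError`).

```python
import sqlite3


def check_documents_table(connection):
    """Return a list of warning strings instead of raising when the table is missing."""
    try:
        connection.execute("SELECT COUNT(*) FROM documents_document")
    except (sqlite3.OperationalError, sqlite3.ProgrammingError):
        # The table does not exist yet (e.g. migrations have not run),
        # so report that rather than letting the check crash.
        return ["documents_document table not found; run migrations first"]
    return []


conn = sqlite3.connect(":memory:")
warnings_before = check_documents_table(conn)  # table missing: one warning
conn.execute("CREATE TABLE documents_document (id INTEGER PRIMARY KEY)")
warnings_after = check_documents_table(conn)   # table present: no warnings
```

Catching the error inside the check is what allows other management commands, including the one that creates the table, to run at all.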
Jonas Winkler 70bd05450a removed matching model fields, automatic classifier reloading, added autmatic_classification field to matching model 2018-09-04 18:40:26 +02:00
Jonas Winkler c765ef5eeb Merge remote-tracking branch 'upstream/master' 2018-09-04 16:02:48 +02:00
Jonas Winkler 30134034e2 Fixed documents not being saved after modification 2018-09-04 15:33:51 +02:00
Jonas Winkler 8a1a736340 Merge branch 'document-type' into dev 2018-09-04 14:55:59 +02:00
Jonas Winkler 68652c8c37 Document Type exporting 2018-09-04 14:55:29 +02:00
Jonas Winkler c091eba26e Implemented the classifier model, including automatic tagging of new documents 2018-09-04 14:39:55 +02:00
Joshua Taillon 23bf79274c Merge branch 'master' into ENH_text_consumer 2018-09-03 23:47:30 -04:00
Joshua Taillon 4849249d86 explicitly add txt, md, and csv types for consumer and viewer; fix thumbnail generation 2018-09-03 23:46:13 -04:00
Jonas Winkler ca315ba76c Added code that trains models based on data from the database 2018-09-03 15:55:41 +02:00
Daniel Quinn ee20af71e8 Bump for 2.2.1 2.2.1 2018-09-03 00:27:40 +01:00
Daniel Quinn 3c8aa3ba42 Don't try to remove SessionAuthenticationMiddleware
It was removed entirely in Django 2.0
2018-09-03 00:25:10 +01:00
Daniel Quinn 778ffa488d Add Tim to the credits for 2.2.0 2018-09-02 21:53:52 +01:00
Daniel Quinn 0868390d63 Merge branch 'dadosch-django-v2' 2.2.0 2018-09-02 21:48:59 +01:00
Daniel Quinn d5180fe5e1 Updates for 2.2.0 2018-09-02 21:48:09 +01:00
Daniel Quinn 08174a6b52 Add note about the removal of puritanical language 2018-09-02 21:46:52 +01:00
Daniel Quinn f5e725c691 Switch out field_name= for name=
This appears to be a django-filter version change thing.
2018-09-02 21:26:30 +01:00
Daniel Quinn 2400245b96 pep8 2018-09-02 21:26:20 +01:00
Daniel Quinn 729f005600 Remove old Python 2.x style code 2018-09-02 21:26:06 +01:00
Daniel Quinn 39afe41f08 Drop django-flat-responsive
It's not necessary for Django 2.0+ as the new system is responsive by
default.
2018-09-02 21:25:30 +01:00
Daniel Quinn 2d4008371b Merge @dadosch's changes & fix dependency conflicts 2018-09-02 21:06:40 +01:00
Daniel Quinn 218809ce15 Merge pull request #391 from sbrunner/tag-list
Better interface when we have many tags
2018-09-02 20:57:32 +01:00
Daniel Quinn 7db4410c1b Default sort order for tags to use 'name' 2018-09-02 20:56:45 +01:00
Daniel Quinn f1e1bb4deb Fix #384: duplicate tags due to case insensitivity 2018-09-02 20:48:51 +01:00
Daniel Quinn cccc9e1a24 Clean up some linter complaints 2018-09-02 20:33:49 +01:00
Daniel Quinn 39ef81d398 Update dependencies 2018-09-02 20:33:28 +01:00
Jonas Winkler 350da81081 Added command to create datasets 2018-09-02 12:47:19 +02:00