Compare commits


16 Commits

Author SHA1 Message Date
Trenton H 0d3f2ffb75 test(mail): lift composite test API endpoints to module constants 2026-04-30 10:02:08 -07:00
Trenton H 4942ab1000 test(mail): convert test_api.py to pytest style 2026-04-30 09:58:37 -07:00
Trenton H 9e17c55a9a test(mail): add mail_api_user, mail_api_client, bogus_mailbox fixtures 2026-04-30 09:45:11 -07:00
Trenton H 42da2e2fd6 test(mail): use pytest tmp_path for MessageEncryptor GPG homes 2026-04-30 09:07:20 -07:00
Trenton H 4872bfbeae test(mail): convert test_preprocessor.py to pytest style 2026-04-30 08:56:26 -07:00
Trenton H e40cd6048d test(mail): add mail_mocker fixture wrapping MailMocker helper 2026-04-30 08:48:23 -07:00
Trenton H 56ebc08a65 test(mail): convert test_mail_oauth.py to pytest style 2026-04-30 08:44:37 -07:00
Trenton H b7d466c242 test(mail): add mail_user and oauth_settings pytest fixtures 2026-04-30 08:32:52 -07:00
Trenton H 493d282059 Chore: Upgrades tantivy-py to the latest release (#12605) 2026-04-29 10:09:50 -07:00
GitHub Actions 88824f94e5 Auto translate strings 2026-04-28 20:02:11 +00:00
shamoon 354df34e47 Enhancement: chat message document links (#12670) 2026-04-28 13:00:20 -07:00
GitHub Actions 8e6fd010a0 Auto translate strings 2026-04-28 17:08:38 +00:00
shamoon 69cb4d06c6 Enhancement (dev): Use OpenAI-like backend (#12668) 2026-04-28 10:06:59 -07:00
shamoon 2f8f126223 Fix: fix a minor pdf viewer uncaught error (#12669) 2026-04-28 09:47:20 -07:00
Trenton H 14fe520319 Chore: Update typing and baselines again (#12641) 2026-04-28 09:28:05 -07:00
shamoon ff95512b9a Fix: apply tag changes directly to document in db (#12664) 2026-04-28 08:18:40 -07:00
76 changed files with 6224 additions and 7351 deletions
+301 -606
File diff suppressed because it is too large.
+3779 -5267
File diff suppressed because it is too large.
+22 -14
@@ -2014,49 +2014,57 @@ suggestions. This setting is required to be set to true in order to use the AI f
#### [`PAPERLESS_AI_LLM_EMBEDDING_BACKEND=<str>`](#PAPERLESS_AI_LLM_EMBEDDING_BACKEND) {#PAPERLESS_AI_LLM_EMBEDDING_BACKEND}
: The embedding backend to use for RAG. This can be either "openai" or "huggingface".
: The embedding backend to use for RAG. This can be either "openai-like" or "huggingface". The
"openai-like" backend uses an OpenAI-compatible embeddings API.
Defaults to None.
#### [`PAPERLESS_AI_LLM_EMBEDDING_MODEL=<str>`](#PAPERLESS_AI_LLM_EMBEDDING_MODEL) {#PAPERLESS_AI_LLM_EMBEDDING_MODEL}
: The model to use for the embedding backend for RAG. This can be set to any of the embedding models supported by the current embedding backend. If not supplied, defaults to "text-embedding-3-small" for OpenAI and "sentence-transformers/all-MiniLM-L6-v2" for Huggingface.
: The model to use for the embedding backend for RAG. This can be set to any of the embedding
models supported by the current embedding backend. If not supplied, defaults to
"text-embedding-3-small" for the OpenAI-compatible backend and
"sentence-transformers/all-MiniLM-L6-v2" for Huggingface.
Defaults to None.
#### [`PAPERLESS_AI_LLM_BACKEND=<str>`](#PAPERLESS_AI_LLM_BACKEND) {#PAPERLESS_AI_LLM_BACKEND}
: The AI backend to use. This can be either "openai" or "ollama". If set to "ollama", the AI
features will be run locally on your machine. If set to "openai", the AI features will be run
using the OpenAI API. This setting is required to be set to use the AI features.
: The AI backend to use. This can be either "openai-like" or "ollama". If set to "ollama", the AI
features will be run locally on your machine. If set to "openai-like", the AI features will use
an OpenAI-compatible API endpoint, including OpenAI itself and compatible providers. This
setting must be set to use the AI features.
Defaults to None.
!!! note
The OpenAI API is a paid service. You will need to set up an OpenAI account and
will be charged for usage incurred by Paperless-ngx features and your document data
will (of course) be sent to the OpenAI API. Paperless-ngx does not endorse the use of the
OpenAI API in any way.
Remote AI providers may be paid services. If you use a hosted OpenAI-compatible API, you
are responsible for any usage charges incurred by Paperless-ngx features, and your
document data will be sent to the provider you configure.
Refer to the OpenAI terms of service, and use at your own risk.
Paperless-ngx does not endorse any specific provider. Refer to your provider's terms of
service and privacy policy, and use at your own risk.
#### [`PAPERLESS_AI_LLM_MODEL=<str>`](#PAPERLESS_AI_LLM_MODEL) {#PAPERLESS_AI_LLM_MODEL}
: The model to use for the AI backend, i.e. "gpt-3.5-turbo", "gpt-4" or any of the models supported by the
current backend. If not supplied, defaults to "gpt-3.5-turbo" for OpenAI and "llama3.1" for Ollama.
: The model to use for the AI backend, e.g. "gpt-3.5-turbo", "gpt-4", or any of the models supported
by the current backend. If not supplied, defaults to "gpt-3.5-turbo" for the OpenAI-compatible
backend and "llama3.1" for Ollama.
Defaults to None.
#### [`PAPERLESS_AI_LLM_API_KEY=<str>`](#PAPERLESS_AI_LLM_API_KEY) {#PAPERLESS_AI_LLM_API_KEY}
: The API key to use for the AI backend. This is required for the OpenAI backend (optional for others).
: The API key to use for the AI backend. This is typically required for the OpenAI-compatible
backend (optional for others).
Defaults to None.
#### [`PAPERLESS_AI_LLM_ENDPOINT=<str>`](#PAPERLESS_AI_LLM_ENDPOINT) {#PAPERLESS_AI_LLM_ENDPOINT}
: The endpoint / url to use for the AI backend. This is required for the Ollama backend (optional for others).
: The endpoint / URL to use for the AI backend. This is required for the Ollama backend and may be
used with the OpenAI-compatible backend to target a custom provider or local gateway.
Defaults to None.
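Taken together, the settings above can be combined like this — a minimal sketch of the two backend choices. This is illustrative only: the model names and endpoint port are examples, not shipped defaults, and any other settings your deployment needs are omitted.

```env
# Local backend via Ollama (endpoint required, API key not needed)
PAPERLESS_AI_LLM_BACKEND=ollama
PAPERLESS_AI_LLM_MODEL=llama3.1
PAPERLESS_AI_LLM_ENDPOINT=http://localhost:11434

# Alternatively: any OpenAI-compatible provider (API key typically required;
# endpoint optional, used to target a custom provider or local gateway)
# PAPERLESS_AI_LLM_BACKEND=openai-like
# PAPERLESS_AI_LLM_MODEL=gpt-4
# PAPERLESS_AI_LLM_API_KEY=sk-...
# PAPERLESS_AI_LLM_EMBEDDING_BACKEND=openai-like
# PAPERLESS_AI_LLM_EMBEDDING_MODEL=text-embedding-3-small
```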
+8 -2
@@ -302,13 +302,19 @@ Paperless-ngx includes several features that use AI to enhance the document mana
!!! warning
Remember that Paperless-ngx will send document content to the AI provider you have configured, so consider the privacy implications of using these features, especially if using a remote model (e.g. OpenAI), instead of the default local model.
Remember that Paperless-ngx will send document content to the AI provider you have configured,
so consider the privacy implications of using these features, especially if using a remote
model or API provider instead of the default local model.
The AI features work by creating an embedding of the text content and metadata of documents, which is then used for various tasks such as similarity search and question answering. This uses the FAISS vector store.
### AI-Enhanced Suggestions
If enabled, Paperless-ngx can use an AI LLM model to suggest document titles, dates, tags, correspondents and document types for documents. This feature will always be "opt-in" and does not disable the existing classifier-based suggestion system. Currently, both remote (via the OpenAI API) and local (via Ollama) models are supported, see [configuration](configuration.md#ai) for details.
If enabled, Paperless-ngx can use an LLM to suggest document titles, dates, tags,
correspondents and document types for documents. This feature will always be "opt-in" and does not
disable the existing classifier-based suggestion system. Currently, both remote
(via OpenAI-compatible APIs) and local (via Ollama) models are supported; see
[configuration](configuration.md#ai) for details.
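The RAG mechanism described above — embed each document's text, then rank documents by vector similarity to a query embedding — can be sketched independently of Paperless-ngx. This toy uses plain numpy cosine similarity instead of the actual FAISS vector store and llama-index embedding backends; the document names and vectors are invented for illustration.

```python
import numpy as np

# Toy "embeddings": in Paperless-ngx these come from the configured
# embedding backend (openai-like or huggingface); here they are hand-made.
docs = {
    "invoice_2024.pdf": np.array([0.9, 0.1, 0.0]),
    "bread_recipe.pdf": np.array([0.1, 0.9, 0.2]),
    "tax_letter.pdf": np.array([0.7, 0.3, 0.1]),
}


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the vectors over the product
    # of their magnitudes; 1.0 means identical direction.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def most_similar(query_vec: np.ndarray, k: int = 2) -> list[str]:
    # Rank stored documents by similarity to the query embedding —
    # conceptually what the FAISS vector store does at scale.
    ranked = sorted(docs, key=lambda name: cosine(query_vec, docs[name]), reverse=True)
    return ranked[:k]


query = np.array([0.85, 0.15, 0.05])  # stand-in embedding of e.g. "show me invoices"
print(most_similar(query))  # → ['invoice_2024.pdf', 'tax_letter.pdf']
```

The real implementation additionally stores metadata alongside each vector and feeds the top-ranked documents to the LLM as context for suggestions and chat answers.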
### Document Chat
+16 -2
@@ -30,11 +30,25 @@
"**/.idea": true,
"**/.venv": true,
"**/.coverage": true,
"**/coverage.json": true
"**/coverage.json": true,
"htmlcov/": true,
"coverage.xml": true,
"junit.xml": true
},
"python.defaultInterpreterPath": ".venv/bin/python3",
"python.languageServer": "Pylance",
"python.defaultInterpreterPath": "${workspaceFolder:paperless-ngx}/.venv/bin/python3",
"python.analysis.extraPaths": ["${workspaceFolder:paperless-ngx}/src"],
"python.analysis.inlayHints.pytestParameters": true,
"python.testing.pytestEnabled": true,
"python.testing.unittestEnabled": false,
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.fixAll.ruff": "explicit",
"source.organizeImports.ruff": "explicit"
}
}
},
"extensions": {
"recommendations": ["ms-python.python", "charliermarsh.ruff", "editorconfig.editorconfig"],
+7 -4
@@ -53,9 +53,9 @@ dependencies = [
"langdetect~=1.0.9",
"llama-index-core>=0.14.12",
"llama-index-embeddings-huggingface>=0.6.1",
"llama-index-embeddings-openai>=0.5.1",
"llama-index-embeddings-openai-like>=0.2.2",
"llama-index-llms-ollama>=0.9.1",
"llama-index-llms-openai>=0.6.13",
"llama-index-llms-openai-like>=0.7.1",
"llama-index-vector-stores-faiss>=0.5.2",
"nltk~=3.9.1",
"ocrmypdf~=17.4.0",
@@ -73,7 +73,7 @@ dependencies = [
"scikit-learn~=1.8.0",
"sentence-transformers>=4.1",
"setproctitle~=1.3.4",
"tantivy>=0.25.1",
"tantivy~=0.26.0",
"tika-client~=0.11.0",
"torch~=2.11.0",
"watchfiles>=1.1.1",
@@ -143,7 +143,8 @@ typing = [
"types-python-dateutil",
"types-pytz",
"types-redis",
"types-setuptools",
"types-regex",
"types-setuptools"
]
[tool.uv]
@@ -178,6 +179,8 @@ respect-gitignore = true
fix = true
show-fixes = true
output-format = "grouped"
[tool.ruff.format]
line-ending = "lf"
[tool.ruff.lint]
# https://docs.astral.sh/ruff/rules/
extend-select = [
+2 -2
@@ -3560,14 +3560,14 @@
<source>Ask a question about this document...</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/chat/chat/chat.component.ts</context>
<context context-type="linenumber">37</context>
<context context-type="linenumber">42</context>
</context-group>
</trans-unit>
<trans-unit id="224049052163028286" datatype="html">
<source>Ask a question about a document...</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/components/chat/chat/chat.component.ts</context>
<context context-type="linenumber">38</context>
<context context-type="linenumber">43</context>
</context-group>
</trans-unit>
<trans-unit id="8700121026680200191" datatype="html">
@@ -8,10 +8,21 @@
<div class="chat-messages font-monospace small">
@for (message of messages; track message) {
<div class="message d-flex flex-row small" [class.justify-content-end]="message.role === 'user'">
<span class="p-2 m-2" [class.bg-dark]="message.role === 'user'">
{{ message.content }}
@if (message.isStreaming) { <span class="blinking-cursor">|</span> }
</span>
<div class="p-2 m-2" [class.bg-dark]="message.role === 'user'">
<span>
{{ message.content }}
@if (message.isStreaming) { <span class="blinking-cursor">|</span> }
</span>
@if (message.role === 'assistant' && message.references?.length) {
<div class="chat-references list-group mt-3">
@for (reference of message.references; track reference.id) {
<a class="list-group-item list-group-item-action text-primary" [routerLink]="['/documents', reference.id]">
<i-bs width="0.9em" height="0.9em" name="file-text" class="me-1"></i-bs><span>{{ reference.title }}</span>
</a>
}
</div>
}
</div>
</div>
}
<div #scrollAnchor></div>
@@ -7,6 +7,10 @@
overflow-y: auto;
}
.chat-references {
font-family: var(--bs-font-sans-serif);
}
.dropdown-toggle::after {
display: none;
}
@@ -3,9 +3,13 @@ import { provideHttpClientTesting } from '@angular/common/http/testing'
import { ElementRef } from '@angular/core'
import { ComponentFixture, TestBed } from '@angular/core/testing'
import { NavigationEnd, Router } from '@angular/router'
import { RouterTestingModule } from '@angular/router/testing'
import { allIcons, NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { Subject } from 'rxjs'
import { ChatService } from 'src/app/services/chat.service'
import {
CHAT_METADATA_DELIMITER,
ChatService,
} from 'src/app/services/chat.service'
import { ChatComponent } from './chat.component'
describe('ChatComponent', () => {
@@ -18,7 +22,11 @@ describe('ChatComponent', () => {
beforeEach(async () => {
TestBed.configureTestingModule({
imports: [NgxBootstrapIconsModule.pick(allIcons), ChatComponent],
imports: [
NgxBootstrapIconsModule.pick(allIcons),
RouterTestingModule,
ChatComponent,
],
providers: [
provideHttpClient(withInterceptorsFromDi()),
provideHttpClientTesting(),
@@ -84,6 +92,57 @@ describe('ChatComponent', () => {
expect(component.messages[1].isStreaming).toBe(false)
})
it('should parse references from the metadata trailer without showing it', () => {
component.input = 'Hello'
component.sendMessage()
mockStream$.next(
`Hi there${CHAT_METADATA_DELIMITER}{"references":[{"id":42,"title":"Bread Recipe"}]}`
)
jest.advanceTimersByTime(1000)
expect(component.messages[1].content).toBe('Hi there')
expect(component.messages[1].references).toEqual([
{ id: 42, title: 'Bread Recipe' },
])
})
it('should render document reference links under assistant messages', () => {
component.input = 'Hello'
component.sendMessage()
mockStream$.next(
`Hi there${CHAT_METADATA_DELIMITER}{"references":[{"id":42,"title":"Bread Recipe"}]}`
)
jest.advanceTimersByTime(1000)
fixture.detectChanges()
const link = fixture.nativeElement.querySelector('.chat-references a')
expect(link.textContent).toContain('Bread Recipe')
expect(link.getAttribute('href')).toContain('/documents/42')
})
it('should remove delimiter fragments that were already streamed', () => {
component.input = 'Hello'
component.sendMessage()
mockStream$.next(`Hi there${CHAT_METADATA_DELIMITER.slice(0, 8)}`)
jest.advanceTimersByTime(1000)
expect(component.messages[1].content).toBe(
`Hi there${CHAT_METADATA_DELIMITER.slice(0, 8)}`
)
mockStream$.next(
`Hi there${CHAT_METADATA_DELIMITER}{"references":[{"id":42,"title":"Bread Recipe"}]}`
)
jest.advanceTimersByTime(1000)
expect(component.messages[1].content).toBe('Hi there')
expect(component.messages[1].references).toEqual([
{ id: 42, title: 'Bread Recipe' },
])
})
it('should handle errors during streaming', () => {
component.input = 'Hello'
component.sendMessage()
@@ -1,16 +1,21 @@
import { Component, ElementRef, inject, OnInit, ViewChild } from '@angular/core'
import { FormsModule, ReactiveFormsModule } from '@angular/forms'
import { NavigationEnd, Router } from '@angular/router'
import { NavigationEnd, Router, RouterModule } from '@angular/router'
import { NgbDropdownModule } from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { filter, map } from 'rxjs'
import { ChatMessage, ChatService } from 'src/app/services/chat.service'
import {
ChatMessage,
ChatService,
parseChatResponse,
} from 'src/app/services/chat.service'
@Component({
selector: 'pngx-chat',
imports: [
FormsModule,
ReactiveFormsModule,
RouterModule,
NgxBootstrapIconsModule,
NgbDropdownModule,
],
@@ -70,13 +75,24 @@ export class ChatComponent implements OnInit {
this.messages.push(assistantMessage)
this.loading = true
let lastPartialLength = 0
let lastVisibleContent = ''
this.chatService.streamChat(this.documentId, this.input).subscribe({
next: (chunk) => {
const delta = chunk.substring(lastPartialLength)
lastPartialLength = chunk.length
this.enqueueTypewriter(delta, assistantMessage)
const nextResponse = parseChatResponse(chunk)
if (nextResponse.content.length < lastVisibleContent.length) {
this.resetTypewriter(assistantMessage, nextResponse.content)
lastVisibleContent = nextResponse.content
} else {
const visibleDelta = nextResponse.content.substring(
lastVisibleContent.length
)
lastVisibleContent = nextResponse.content
this.enqueueTypewriter(visibleDelta, assistantMessage)
}
assistantMessage.references = nextResponse.references
},
error: () => {
assistantMessage.content += '\n\n⚠️ Error receiving response.'
@@ -93,6 +109,13 @@ export class ChatComponent implements OnInit {
this.input = ''
}
private resetTypewriter(message: ChatMessage, content: string): void {
this.typewriterBuffer = []
this.typewriterActive = false
message.content = content
this.scrollToBottom()
}
enqueueTypewriter(chunk: string, message: ChatMessage): void {
if (!chunk) return
@@ -272,7 +272,7 @@ export class PngxPdfViewerComponent
if (!this.hasRenderedPage) {
return
}
const query = this.searchQuery.trim()
const query = this.searchQuery?.trim()
if (query === this.lastFindQuery) {
return
}
+2 -2
@@ -55,12 +55,12 @@ export const ConfigCategory = {
}
export const LLMEmbeddingBackendConfig = {
OPENAI: 'openai',
OPENAI_LIKE: 'openai-like',
HUGGINGFACE: 'huggingface',
}
export const LLMBackendConfig = {
OPENAI: 'openai',
OPENAI_LIKE: 'openai-like',
OLLAMA: 'ollama',
}
+23 -1
@@ -9,7 +9,11 @@ import {
} from '@angular/common/http/testing'
import { TestBed } from '@angular/core/testing'
import { environment } from 'src/environments/environment'
import { ChatService } from './chat.service'
import {
CHAT_METADATA_DELIMITER,
ChatService,
parseChatResponse,
} from './chat.service'
describe('ChatService', () => {
let service: ChatService
@@ -55,4 +59,22 @@ describe('ChatService', () => {
partialText: mockResponse,
} as any)
})
it('should parse chat references from the metadata trailer', () => {
const parsed = parseChatResponse(
`Answer text${CHAT_METADATA_DELIMITER}{"references":[{"id":1,"title":"Document 1"}]}`
)
expect(parsed.content).toBe('Answer text')
expect(parsed.references).toEqual([{ id: 1, title: 'Document 1' }])
})
it('should hide incomplete metadata trailer from the visible content', () => {
const parsed = parseChatResponse(
`Answer text${CHAT_METADATA_DELIMITER}{"references"`
)
expect(parsed.content).toBe('Answer text')
expect(parsed.references).toBeUndefined()
})
})
+40
@@ -11,6 +11,46 @@ export interface ChatMessage {
role: 'user' | 'assistant'
content: string
isStreaming?: boolean
references?: ChatReference[]
}
export interface ChatReference {
id: number
title: string
}
export interface ParsedChatResponse {
content: string
references?: ChatReference[]
}
export const CHAT_METADATA_DELIMITER = '\n\n__PAPERLESS_CHAT_METADATA__'
export function parseChatResponse(response: string): ParsedChatResponse {
const delimiterIndex = response.indexOf(CHAT_METADATA_DELIMITER)
if (delimiterIndex === -1) {
return { content: response }
}
const metadataString = response.slice(
delimiterIndex + CHAT_METADATA_DELIMITER.length
)
try {
const metadata = JSON.parse(metadataString) as {
references?: ChatReference[]
}
return {
content: response.slice(0, delimiterIndex),
references: metadata.references ?? [],
}
} catch {
return {
content: response.slice(0, delimiterIndex),
}
}
}
@Injectable({
+2 -1
@@ -218,7 +218,8 @@ def set_llm_suggestions_cache(
timeout: int = CACHE_50_MINUTES,
) -> None:
"""
Cache LLM-generated suggestions using a backend-specific identifier (e.g. 'openai:gpt-4').
Cache LLM-generated suggestions using a backend-specific identifier
(e.g. 'openai-like:gpt-4').
"""
doc_key = get_suggestion_cache_key(document_id)
cache.set(
@@ -6,6 +6,7 @@ import tempfile
from itertools import islice
from pathlib import Path
from typing import TYPE_CHECKING
from typing import Any
from allauth.mfa.models import Authenticator
from allauth.socialaccount.models import SocialAccount
@@ -68,7 +69,7 @@ from paperless_mail.models import MailRule
def serialize_queryset_batched(
queryset: "QuerySet",
queryset: "QuerySet[Any]",
*,
batch_size: int = 500,
) -> "Generator[list[dict], None, None]":
@@ -364,7 +365,7 @@ class Command(CryptMixin, PaperlessCommand):
# 2. Create manifest, containing all correspondents, types, tags, storage paths
# note, documents and ui_settings
manifest_key_to_object_query: dict[str, QuerySet] = {
manifest_key_to_object_query: dict[str, QuerySet[Any]] = {
"correspondents": Correspondent.objects.all(),
"tags": Tag.objects.all(),
"document_types": DocumentType.objects.all(),
+1 -1
@@ -261,7 +261,7 @@ def get_objects_for_user_owner_aware(
Model: Any,
*,
include_deleted: bool = False,
) -> QuerySet:
) -> QuerySet[Any]:
"""
Returns objects the user owns, are unowned, or has explicit perms.
When include_deleted is True, soft-deleted items are also included.
+90 -142
@@ -3,7 +3,6 @@ from __future__ import annotations
import logging
import re
import threading
from collections import Counter
from datetime import UTC
from datetime import datetime
from enum import StrEnum
@@ -11,6 +10,7 @@ from typing import TYPE_CHECKING
from typing import Self
from typing import TypedDict
from typing import TypeVar
from typing import cast
import filelock
import regex
@@ -36,7 +36,7 @@ from documents.utils import identity
if TYPE_CHECKING:
from pathlib import Path
from django.contrib.auth.base_user import AbstractBaseUser
from django.contrib.auth.models import AbstractUser
from django.db.models import QuerySet
from documents.models import Document
@@ -169,9 +169,16 @@ class WriteBatch:
def __init__(self, backend: TantivyBackend, lock_timeout: float):
self._backend = backend
self._lock_timeout = lock_timeout
self._writer = None
self._raw_writer: tantivy.IndexWriter | None = None
self._lock = None
@property
def _writer(self) -> tantivy.IndexWriter:
assert self._raw_writer is not None, (
"WriteBatch not entered; use as context manager"
)
return self._raw_writer
def __enter__(self) -> Self:
if self._backend._path is not None:
lock_path = self._backend._path / ".tantivy.lock"
@@ -183,7 +190,7 @@ class WriteBatch:
f"Could not acquire index lock within {self._lock_timeout}s",
) from e
self._writer = self._backend._index.writer()
self._raw_writer = self._backend._index.writer()
return self
def __exit__(self, exc_type, exc_val, exc_tb):
@@ -193,9 +200,9 @@ class WriteBatch:
self._backend._index.reload()
# Explicitly delete writer to release tantivy's internal lock.
# On exception the uncommitted writer is simply discarded.
if self._writer is not None:
del self._writer
self._writer = None
if self._raw_writer is not None:
del self._raw_writer
self._raw_writer = None
finally:
if self._lock is not None:
self._lock.release()
@@ -222,24 +229,9 @@ class WriteBatch:
self._writer.add_document(doc)
def remove(self, doc_id: int) -> None:
"""
Remove a document from the batch by its primary key.
Uses range_query instead of term_query to work around a tantivy-py bug
where Python integers are inferred as i64, producing Terms that never
match u64 fields.
TODO: Replace with term_query("id", doc_id) once
https://github.com/quickwit-oss/tantivy-py/pull/642 lands.
"""
"""Remove a document from the batch by its primary key."""
self._writer.delete_documents_by_query(
tantivy.Query.range_query(
self._backend._schema,
"id",
tantivy.FieldType.Unsigned,
doc_id,
doc_id,
),
tantivy.Query.term_query(self._backend._schema, "id", doc_id),
)
@@ -288,8 +280,18 @@ class TantivyBackend:
# path=None → in-memory index (for tests)
# path=some_dir → on-disk index (for production)
self._path = path
self._index = None
self._schema = None
self._raw_index: tantivy.Index | None = None
self._raw_schema: tantivy.Schema | None = None
@property
def _index(self) -> tantivy.Index:
assert self._raw_index is not None, "Index not open; call open() first"
return self._raw_index
@property
def _schema(self) -> tantivy.Schema:
assert self._raw_schema is not None, "Schema not open; call open() first"
return self._raw_schema
def open(self) -> None:
"""
@@ -299,14 +301,14 @@ class TantivyBackend:
version or language changes. Registers custom tokenizers after opening.
Safe to call multiple times - subsequent calls are no-ops.
"""
if self._index is not None:
if self._raw_index is not None:
return # pragma: no cover
if self._path is not None:
self._index = open_or_rebuild_index(self._path)
self._raw_index = open_or_rebuild_index(self._path)
else:
self._index = tantivy.Index(build_schema())
register_tokenizers(self._index, settings.SEARCH_LANGUAGE)
self._schema = self._index.schema
self._raw_index = tantivy.Index(build_schema())
register_tokenizers(self._raw_index, settings.SEARCH_LANGUAGE)
self._raw_schema = self._raw_index.schema
def close(self) -> None:
"""
@@ -314,12 +316,12 @@ class TantivyBackend:
Safe to call multiple times - subsequent calls are no-ops.
"""
self._index = None
self._schema = None
self._raw_index = None
self._raw_schema = None
def _ensure_open(self) -> None:
"""Ensure the index is open before operations."""
if self._index is None:
if self._raw_index is None:
self.open() # pragma: no cover
def _parse_query(
@@ -339,7 +341,7 @@ class TantivyBackend:
def _apply_permission_filter(
self,
query: tantivy.Query,
user: AbstractBaseUser | None,
user: AbstractUser | None,
) -> tantivy.Query:
"""Wrap a query with a permission filter if the user is not a superuser."""
if user is not None:
@@ -526,15 +528,6 @@ class TantivyBackend:
Use this when you already know which documents to display (from
search_ids + ORM filtering) and just need highlight data.
Note: Each doc_id requires an individual index lookup because tantivy-py
does not yet expose a batch fast-field read API. This is acceptable for
page-sized batches (typically 25 docs) but should not be called with
thousands of IDs.
TODO: When https://github.com/quickwit-oss/tantivy-py/pull/641 lands,
the per-doc range_query lookups here can be replaced with a single
collect_u64_fast_field("id", doc_addresses) call.
Args:
query: The search query (used for snippet generation)
doc_ids: Ordered list of document IDs to generate hits for
@@ -571,32 +564,42 @@ class TantivyBackend:
notes_text_query = user_query
searcher = self._index.searcher()
# Fetch all requested docs in a single search: user_query MUST match
# and exactly the requested IDs MUST match (OR of term_queries).
id_filter = tantivy.Query.boolean_query(
[
(
tantivy.Occur.Should,
tantivy.Query.term_query(self._schema, "id", did),
)
for did in doc_ids
],
)
batch_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, user_query),
(tantivy.Occur.Must, id_filter),
],
)
batch_results = searcher.search(batch_query, limit=len(doc_ids))
result_addrs = [addr for _score, addr in batch_results.hits]
result_ids = cast("list[int]", searcher.fast_field_values("id", result_addrs))
addr_by_id: dict[int, tuple[float, tantivy.DocAddress]] = {
doc_id: (score, addr)
for (score, addr), doc_id in zip(batch_results.hits, result_ids)
}
snippet_generator = None
notes_snippet_generator = None
hits: list[SearchHit] = []
for rank, doc_id in enumerate(doc_ids, start=rank_start):
# Look up document by ID, scoring against the user query so that
# the returned SearchHit carries a real BM25 relevance score.
id_query = tantivy.Query.range_query(
self._schema,
"id",
tantivy.FieldType.Unsigned,
doc_id,
doc_id,
)
scored_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, user_query),
(tantivy.Occur.Must, id_query),
],
)
results = searcher.search(scored_query, limit=1)
if not results.hits:
if doc_id not in addr_by_id:
continue
score, doc_address = results.hits[0]
score, doc_address = addr_by_id[doc_id]
actual_doc = searcher.doc(doc_address)
doc_dict = actual_doc.to_dict()
@@ -647,7 +650,7 @@ class TantivyBackend:
def search_ids(
self,
query: str,
user: AbstractBaseUser | None,
user: AbstractUser | None,
*,
sort_field: str | None = None,
sort_reverse: bool = False,
@@ -701,16 +704,16 @@ class TantivyBackend:
if threshold is not None:
all_hits = [hit for hit in all_hits if hit[1] >= threshold]
# TODO: Replace with searcher.collect_u64_fast_field("id", addrs) once
# https://github.com/quickwit-oss/tantivy-py/pull/641 lands — eliminates
# one stored-doc fetch per result (~80% reduction in search_ids latency).
return [searcher.doc(doc_addr).to_dict()["id"][0] for doc_addr, *_ in all_hits]
return cast(
"list[int]",
searcher.fast_field_values("id", [doc_addr for doc_addr, *_ in all_hits]),
)
def autocomplete(
self,
term: str,
limit: int,
user: AbstractBaseUser | None = None,
user: AbstractUser | None = None,
) -> list[str]:
"""
Get autocomplete suggestions for search queries.
@@ -738,69 +741,25 @@ class TantivyBackend:
searcher = self._index.searcher()
# Build a prefix query on autocomplete_word so we only scan docs
# containing words that start with the prefix, not the entire index.
# tantivy regex is implicitly anchored; .+ avoids the empty-match
# error that .* triggers. We OR with term_query to also match the
# exact prefix as a complete word.
escaped = re.escape(normalized_term)
prefix_query = tantivy.Query.boolean_query(
[
(
tantivy.Occur.Should,
tantivy.Query.term_query(
self._schema,
"autocomplete_word",
normalized_term,
),
),
(
tantivy.Occur.Should,
tantivy.Query.regex_query(
self._schema,
"autocomplete_word",
f"{escaped}.+",
),
),
],
)
permission_query = None
# Intersect with permission filter so autocomplete words from
# invisible documents don't leak to other users.
if user is not None and not user.is_superuser:
final_query = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, prefix_query),
(tantivy.Occur.Must, build_permission_filter(self._schema, user)),
],
)
else:
final_query = prefix_query
permission_query = build_permission_filter(self._schema, user)
results = searcher.search(final_query, limit=searcher.num_docs)
# Count how many visible documents each matching word appears in.
word_counts: Counter[str] = Counter()
for _score, doc_address in results.hits:
stored_doc = searcher.doc(doc_address)
doc_dict = stored_doc.to_dict()
if "autocomplete_word" in doc_dict:
for word in doc_dict["autocomplete_word"]:
if word.startswith(normalized_term):
word_counts[word] += 1
# Sort by document frequency descending; break ties alphabetically.
matches = sorted(
word_counts,
key=lambda w: (-word_counts[w], w),
matches = searcher.terms_with_prefix(
"autocomplete_word",
normalized_term,
permission_query,
limit,
)
return matches[:limit]
return [x[0] for x in matches]
def more_like_this_ids(
self,
doc_id: int,
user: AbstractBaseUser | None,
user: AbstractUser | None,
*,
limit: int | None = None,
) -> list[int]:
@@ -821,13 +780,7 @@ class TantivyBackend:
self._ensure_open()
searcher = self._index.searcher()
id_query = tantivy.Query.range_query(
self._schema,
"id",
tantivy.FieldType.Unsigned,
doc_id,
doc_id,
)
id_query = tantivy.Query.term_query(self._schema, "id", doc_id)
results = searcher.search(id_query, limit=1)
if not results.hits:
@@ -851,14 +804,9 @@ class TantivyBackend:
# Fetch one extra to account for excluding the original document
results = searcher.search(final_query, limit=effective_limit + 1)
# TODO: Replace with collect_u64_fast_field("id", addrs) once
# https://github.com/quickwit-oss/tantivy-py/pull/641 lands.
ids = []
for _score, doc_address in results.hits:
result_doc_id = searcher.doc(doc_address).to_dict()["id"][0]
if result_doc_id != doc_id:
ids.append(result_doc_id)
addrs = [addr for _score, addr in results.hits]
all_ids = cast("list[int]", searcher.fast_field_values("id", addrs))
ids = [rid for rid in all_ids if rid != doc_id]
return ids[:limit] if limit is not None else ids
def batch_update(self, lock_timeout: float = 30.0) -> WriteBatch:
@@ -906,9 +854,9 @@ class TantivyBackend:
register_tokenizers(new_index, settings.SEARCH_LANGUAGE)
# Point instance at the new index so _build_tantivy_doc uses it
old_index, old_schema = self._index, self._schema
self._index = new_index
self._schema = new_index.schema
old_index, old_schema = self._raw_index, self._raw_schema
self._raw_index = new_index
self._raw_schema = new_index.schema
try:
writer = new_index.writer()
@@ -922,8 +870,8 @@ class TantivyBackend:
new_index.reload()
except BaseException: # pragma: no cover
# Restore old index on failure so the backend remains usable
self._index = old_index
self._schema = old_schema
self._raw_index = old_index
self._raw_schema = old_schema
raise
+3 -38
@@ -410,9 +410,6 @@ def normalize_query(query: str) -> str:
raise ValueError("Query too complex to process (normalization timed out)")
_MAX_U64 = 2**64 - 1 # u64 max — used as inclusive upper bound for "any owner" range
def build_permission_filter(
schema: tantivy.Schema,
user: AbstractBaseUser,
@@ -432,48 +429,16 @@ def build_permission_filter(
Returns:
Tantivy query that filters results to visible documents
Implementation Notes:
- Uses range_query instead of term_query for owner_id/viewer_id to work
around a tantivy-py bug where Python ints are inferred as i64, causing
term_query to return no hits on u64 fields.
TODO: Replace with term_query once
https://github.com/quickwit-oss/tantivy-py/pull/642 lands.
- Uses range_query(owner_id, 1, MAX_U64) as an "owner exists" check
because exists_query is not yet available in tantivy-py 0.25.
TODO: Replace with exists_query("owner_id") once that is exposed in
a tantivy-py release.
- Uses disjunction_max_query to combine permission clauses with OR logic
"""
owner_any = tantivy.Query.range_query(
schema,
"owner_id",
tantivy.FieldType.Unsigned,
1,
_MAX_U64,
)
owner_any = tantivy.Query.exists_query("owner_id")
no_owner = tantivy.Query.boolean_query(
[
(tantivy.Occur.Must, tantivy.Query.all_query()),
(tantivy.Occur.MustNot, owner_any),
],
)
owned = tantivy.Query.range_query(
schema,
"owner_id",
tantivy.FieldType.Unsigned,
user.pk,
user.pk,
)
shared = tantivy.Query.range_query(
schema,
"viewer_id",
tantivy.FieldType.Unsigned,
user.pk,
user.pk,
)
owned = tantivy.Query.term_query(schema, "owner_id", user.pk)
shared = tantivy.Query.term_query(schema, "viewer_id", user.pk)
return tantivy.Query.disjunction_max_query([no_owner, owned, shared])
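Stripped of the tantivy query objects, the visibility rule this filter encodes is a three-way OR; a plain-Python sketch under that assumption (names are illustrative only):

```python
def is_visible(doc_owner_id, viewer_ids, user_id):
    """A document is visible when it has no owner, when the user owns it,
    or when the user appears in its viewer list."""
    no_owner = doc_owner_id is None
    owned = doc_owner_id == user_id
    shared = user_id in viewer_ids
    return no_owner or owned or shared
```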
+5 -2
@@ -4,6 +4,8 @@ import json
import logging
import shutil
from typing import TYPE_CHECKING
from typing import Final
from typing import cast
import tantivy
from django.conf import settings
@@ -13,7 +15,8 @@ if TYPE_CHECKING:
logger = logging.getLogger("paperless.search")
SCHEMA_VERSION = 1
# v1 - Initial tantivy schema format
SCHEMA_VERSION: Final[int] = 1
def build_schema() -> tantivy.Schema:
@@ -172,7 +175,7 @@ def open_or_rebuild_index(index_dir: Path | None = None) -> tantivy.Index:
Opened Tantivy index (caller must register custom tokenizers)
"""
if index_dir is None:
index_dir = settings.INDEX_DIR
index_dir = cast("Path", settings.INDEX_DIR)
if not index_dir.exists():
return tantivy.Index(build_schema())
if needs_rebuild(index_dir):
+17 -17
@@ -214,7 +214,7 @@ class SetPermissionsMixin:
set_permissions_for_object(permissions, object)
class SerializerWithPerms(serializers.Serializer):
class SerializerWithPerms(serializers.Serializer[dict[str, Any]]):
def __init__(self, *args, **kwargs) -> None:
self.user = kwargs.pop("user", None)
self.full_perms = kwargs.pop("full_perms", False)
@@ -961,20 +961,12 @@ def _get_viewable_duplicates(
return duplicates.filter(id__in=allowed)
class DuplicateDocumentSummarySerializer(serializers.Serializer):
class DuplicateDocumentSummarySerializer(serializers.Serializer[dict[str, Any]]):
id = serializers.IntegerField()
title = serializers.CharField()
deleted_at = serializers.DateTimeField(allow_null=True)
class DocumentVersionInfoSerializer(serializers.Serializer):
id = serializers.IntegerField()
added = serializers.DateTimeField()
version_label = serializers.CharField(required=False, allow_null=True)
checksum = serializers.CharField(required=False, allow_null=True)
is_root = serializers.BooleanField()
class _DocumentVersionInfo(TypedDict):
id: int
added: datetime
@@ -983,6 +975,14 @@ class _DocumentVersionInfo(TypedDict):
is_root: bool
class DocumentVersionInfoSerializer(serializers.Serializer[_DocumentVersionInfo]):
id = serializers.IntegerField()
added = serializers.DateTimeField()
version_label = serializers.CharField(required=False, allow_null=True)
checksum = serializers.CharField(required=False, allow_null=True)
is_root = serializers.BooleanField()
@extend_schema_serializer(
deprecate_fields=["created_date"],
)
@@ -1532,7 +1532,7 @@ class SavedViewSerializer(OwnedObjectSerializer):
return saved_view
class DocumentListSerializer(serializers.Serializer):
class DocumentListSerializer(serializers.Serializer[dict[str, list[int]]]):
documents = serializers.ListField(
required=True,
label="Documents",
@@ -2085,7 +2085,7 @@ class BulkEditSerializer(
return attrs
class PostDocumentSerializer(serializers.Serializer):
class PostDocumentSerializer(serializers.Serializer[dict[str, Any]]):
created = serializers.DateTimeField(
label="Created",
allow_null=True,
@@ -2262,7 +2262,7 @@ class PostDocumentSerializer(serializers.Serializer):
return created.date()
class DocumentVersionSerializer(serializers.Serializer):
class DocumentVersionSerializer(serializers.Serializer[dict[str, Any]]):
document = serializers.FileField(
label="Document",
write_only=True,
@@ -2278,7 +2278,7 @@ class DocumentVersionSerializer(serializers.Serializer):
validate_document = PostDocumentSerializer().validate_document
class DocumentVersionLabelSerializer(serializers.Serializer):
class DocumentVersionLabelSerializer(serializers.Serializer[dict[str, str | None]]):
version_label = serializers.CharField(
label="Version label",
required=True,
@@ -2484,7 +2484,7 @@ class TaskSerializerV10(OwnedObjectSerializer):
read_only_fields = fields
class TaskSerializerV9(serializers.ModelSerializer):
class TaskSerializerV9(serializers.ModelSerializer[PaperlessTask]):
"""Task serializer for API v9 backwards compatibility.
Maps old field names to the new model fields so existing clients continue
@@ -2609,7 +2609,7 @@ class TaskSerializerV9(serializers.ModelSerializer):
return list(qs.values("id", "title", "deleted_at"))
class TaskSummarySerializer(serializers.Serializer):
class TaskSummarySerializer(serializers.Serializer[dict[str, Any]]):
task_type = serializers.CharField()
total_count = serializers.IntegerField()
pending_count = serializers.IntegerField()
@@ -2622,7 +2622,7 @@ class TaskSummarySerializer(serializers.Serializer):
last_failure = serializers.DateTimeField(allow_null=True)
class RunTaskSerializer(serializers.Serializer):
class RunTaskSerializer(serializers.Serializer[dict[str, str]]):
task_type = serializers.ChoiceField(
choices=PaperlessTask.TaskType.choices,
label="Task Type",
+6 -6
@@ -16,7 +16,7 @@ from documents.models import StoragePath
from documents.models import Tag
class CorrespondentFactory(DjangoModelFactory):
class CorrespondentFactory(DjangoModelFactory[Correspondent]):
class Meta:
model = Correspondent
@@ -25,7 +25,7 @@ class CorrespondentFactory(DjangoModelFactory):
matching_algorithm = MatchingModel.MATCH_NONE
class DocumentTypeFactory(DjangoModelFactory):
class DocumentTypeFactory(DjangoModelFactory[DocumentType]):
class Meta:
model = DocumentType
@@ -34,7 +34,7 @@ class DocumentTypeFactory(DjangoModelFactory):
matching_algorithm = MatchingModel.MATCH_NONE
class TagFactory(DjangoModelFactory):
class TagFactory(DjangoModelFactory[Tag]):
class Meta:
model = Tag
@@ -44,7 +44,7 @@ class TagFactory(DjangoModelFactory):
is_inbox_tag = False
class StoragePathFactory(DjangoModelFactory):
class StoragePathFactory(DjangoModelFactory[StoragePath]):
class Meta:
model = StoragePath
@@ -56,7 +56,7 @@ class StoragePathFactory(DjangoModelFactory):
matching_algorithm = MatchingModel.MATCH_NONE
class DocumentFactory(DjangoModelFactory):
class DocumentFactory(DjangoModelFactory[Document]):
class Meta:
model = Document
@@ -68,7 +68,7 @@ class DocumentFactory(DjangoModelFactory):
storage_path = None
class PaperlessTaskFactory(DjangoModelFactory):
class PaperlessTaskFactory(DjangoModelFactory[PaperlessTask]):
class Meta:
model = PaperlessTask
@@ -4,6 +4,7 @@ from __future__ import annotations
import io
from typing import TYPE_CHECKING
from typing import Any
import pytest
from django.core.management import CommandError
@@ -122,7 +123,7 @@ def mock_queryset():
This verifies we use .count() instead of len() for querysets.
"""
class MockQuerySet(QuerySet):
class MockQuerySet(QuerySet[Any]):
def __init__(self, items: list):
self._items = items
self.count_called = False
@@ -147,7 +148,7 @@ def mock_queryset():
class TestProcessResult:
"""Tests for the ProcessResult dataclass."""
def test_success_result(self):
def test_success_result(self) -> None:
result = ProcessResult(item=1, result=2, error=None)
assert result.item == 1
@@ -155,7 +156,7 @@ class TestProcessResult:
assert result.error is None
assert result.success is True
def test_error_result(self):
def test_error_result(self) -> None:
error = ValueError("test error")
result = ProcessResult(item=1, result=None, error=error)
@@ -169,7 +170,7 @@ class TestProcessResult:
class TestPaperlessCommandArguments:
"""Tests for argument parsing behavior."""
def test_progress_bar_argument_added_by_default(self):
def test_progress_bar_argument_added_by_default(self) -> None:
command = SimpleCommand()
parser = command.create_parser("manage.py", "simple")
@@ -179,14 +180,14 @@ class TestPaperlessCommandArguments:
options = parser.parse_args([])
assert options.no_progress_bar is False
def test_progress_bar_argument_not_added_when_disabled(self):
def test_progress_bar_argument_not_added_when_disabled(self) -> None:
command = NoProgressBarCommand()
parser = command.create_parser("manage.py", "noprogress")
options = parser.parse_args([])
assert not hasattr(options, "no_progress_bar")
def test_processes_argument_added_when_multiprocessing_enabled(self):
def test_processes_argument_added_when_multiprocessing_enabled(self) -> None:
command = MultiprocessCommand()
parser = command.create_parser("manage.py", "multiprocess")
@@ -196,7 +197,7 @@ class TestPaperlessCommandArguments:
options = parser.parse_args([])
assert options.processes >= 1
def test_processes_argument_not_added_when_multiprocessing_disabled(self):
def test_processes_argument_not_added_when_multiprocessing_disabled(self) -> None:
command = SimpleCommand()
parser = command.create_parser("manage.py", "simple")
@@ -231,7 +232,7 @@ class TestPaperlessCommandExecute:
*,
no_progress_bar_flag: bool,
expected: bool,
):
) -> None:
command = SimpleCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -241,7 +242,10 @@ class TestPaperlessCommandExecute:
assert command.no_progress_bar is expected
def test_no_progress_bar_always_true_when_not_supported(self, base_options: dict):
def test_no_progress_bar_always_true_when_not_supported(
self,
base_options: dict,
) -> None:
command = NoProgressBarCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -262,7 +266,7 @@ class TestPaperlessCommandExecute:
base_options: dict,
processes: int,
expected: int,
):
) -> None:
command = MultiprocessCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -283,7 +287,7 @@ class TestPaperlessCommandExecute:
self,
base_options: dict,
invalid_count: int,
):
) -> None:
command = MultiprocessCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -293,7 +297,10 @@ class TestPaperlessCommandExecute:
with pytest.raises(CommandError, match="--processes must be at least 1"):
command.execute(**options)
def test_process_count_defaults_to_one_when_not_supported(self, base_options: dict):
def test_process_count_defaults_to_one_when_not_supported(
self,
base_options: dict,
) -> None:
command = SimpleCommand()
command.stdout = io.StringIO()
command.stderr = io.StringIO()
@@ -312,7 +319,7 @@ class TestGetIterableLength:
self,
simple_command: SimpleCommand,
mock_queryset,
):
) -> None:
"""Should call .count() on Django querysets rather than len()."""
queryset = mock_queryset([1, 2, 3, 4, 5])
@@ -321,13 +328,16 @@ class TestGetIterableLength:
assert result == 5
assert queryset.count_called is True
def test_uses_len_for_sized(self, simple_command: SimpleCommand):
def test_uses_len_for_sized(self, simple_command: SimpleCommand) -> None:
"""Should use len() for sequences and other Sized types."""
result = simple_command._get_iterable_length([1, 2, 3, 4])
assert result == 4
def test_returns_none_for_unsized_iterables(self, simple_command: SimpleCommand):
def test_returns_none_for_unsized_iterables(
self,
simple_command: SimpleCommand,
) -> None:
"""Should return None for generators and other iterables without len()."""
result = simple_command._get_iterable_length(x for x in [1, 2, 3])
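The three-way dispatch these tests exercise can be sketched as follows, assuming the helper's observable behavior rather than its actual source (a stand-in class replaces Django's QuerySet check):

```python
from collections.abc import Sized


class FakeQuerySet:
    """Stand-in for a Django queryset: counting must not load all rows."""

    def __init__(self, items):
        self._items = items

    def count(self):
        return len(self._items)


def get_iterable_length(iterable):
    # Querysets: prefer .count(), which issues COUNT(*) instead of fetching.
    if isinstance(iterable, FakeQuerySet):
        return iterable.count()
    # Lists, tuples, and other Sized containers support len() directly.
    if isinstance(iterable, Sized):
        return len(iterable)
    # Generators and other plain iterables have no cheap length.
    return None
```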
@@ -338,7 +348,7 @@ class TestGetIterableLength:
class TestTrack:
"""Tests for the track() method."""
def test_with_progress_bar_disabled(self, simple_command: SimpleCommand):
def test_with_progress_bar_disabled(self, simple_command: SimpleCommand) -> None:
simple_command.no_progress_bar = True
items = ["a", "b", "c"]
@@ -346,7 +356,7 @@ class TestTrack:
assert result == items
def test_with_progress_bar_enabled(self, simple_command: SimpleCommand):
def test_with_progress_bar_enabled(self, simple_command: SimpleCommand) -> None:
simple_command.no_progress_bar = False
items = [1, 2, 3]
@@ -354,7 +364,7 @@ class TestTrack:
assert result == items
def test_with_explicit_total(self, simple_command: SimpleCommand):
def test_with_explicit_total(self, simple_command: SimpleCommand) -> None:
simple_command.no_progress_bar = False
def gen():
@@ -364,7 +374,7 @@ class TestTrack:
assert result == [1, 2, 3]
def test_with_generator_no_total(self, simple_command: SimpleCommand):
def test_with_generator_no_total(self, simple_command: SimpleCommand) -> None:
def gen():
yield from [1, 2, 3]
@@ -372,7 +382,7 @@ class TestTrack:
assert result == [1, 2, 3]
def test_empty_iterable(self, simple_command: SimpleCommand):
def test_empty_iterable(self, simple_command: SimpleCommand) -> None:
result = list(simple_command.track([]))
assert result == []
@@ -382,7 +392,7 @@ class TestTrack:
simple_command: SimpleCommand,
mock_queryset,
mocker: MockerFixture,
):
) -> None:
"""Verify track() uses .count() for querysets."""
simple_command.no_progress_bar = False
queryset = mock_queryset([1, 2, 3])
@@ -403,7 +413,7 @@ class TestProcessParallel:
def test_sequential_processing_single_process(
self,
multiprocess_command: MultiprocessCommand,
):
) -> None:
multiprocess_command.process_count = 1
items = [1, 2, 3, 4, 5]
@@ -418,7 +428,7 @@ class TestProcessParallel:
def test_sequential_processing_handles_errors(
self,
multiprocess_command: MultiprocessCommand,
):
) -> None:
multiprocess_command.process_count = 1
items = [1, 2, 0, 4] # 0 causes ZeroDivisionError
@@ -438,7 +448,7 @@ class TestProcessParallel:
self,
multiprocess_command: MultiprocessCommand,
mocker: MockerFixture,
):
) -> None:
multiprocess_command.process_count = 2
items = [1, 2, 3]
@@ -455,7 +465,7 @@ class TestProcessParallel:
self,
multiprocess_command: MultiprocessCommand,
mocker: MockerFixture,
):
) -> None:
multiprocess_command.process_count = 2
items = [1, 2, 0, 4]
@@ -467,7 +477,7 @@ class TestProcessParallel:
assert len(failures) == 1
assert failures[0].item == 0
def test_empty_items(self, multiprocess_command: MultiprocessCommand):
def test_empty_items(self, multiprocess_command: MultiprocessCommand) -> None:
results = list(multiprocess_command.process_parallel(_double_value, []))
assert results == []
@@ -475,7 +485,7 @@ class TestProcessParallel:
def test_result_contains_original_item(
self,
multiprocess_command: MultiprocessCommand,
):
) -> None:
items = [10, 20, 30]
results = list(multiprocess_command.process_parallel(_double_value, items))
@@ -488,7 +498,7 @@ class TestProcessParallel:
self,
multiprocess_command: MultiprocessCommand,
mocker: MockerFixture,
):
) -> None:
"""Verify single process uses sequential path (important for testing)."""
multiprocess_command.process_count = 1
@@ -504,7 +514,7 @@ class TestProcessParallel:
self,
multiprocess_command: MultiprocessCommand,
mocker: MockerFixture,
):
) -> None:
"""Verify multiple processes use the parallel path."""
multiprocess_command.process_count = 2
+49 -31
@@ -16,7 +16,7 @@ pytestmark = [pytest.mark.search, pytest.mark.django_db]
class TestWriteBatch:
"""Test WriteBatch context manager functionality."""
def test_rolls_back_on_exception(self, backend: TantivyBackend):
def test_rolls_back_on_exception(self, backend: TantivyBackend) -> None:
"""Batch operations must roll back on exception to preserve index integrity."""
doc = Document.objects.create(
title="Rollback Target",
@@ -43,7 +43,7 @@ class TestSearch:
def test_text_mode_limits_default_search_to_title_and_content(
self,
backend: TantivyBackend,
):
) -> None:
"""Simple text mode must not match metadata-only fields."""
doc = Document.objects.create(
title="Invoice document",
@@ -71,7 +71,7 @@ class TestSearch:
def test_title_mode_limits_default_search_to_title_only(
self,
backend: TantivyBackend,
):
) -> None:
"""Title mode must not match content-only terms."""
doc = Document.objects.create(
title="Invoice document",
@@ -93,7 +93,7 @@ class TestSearch:
def test_text_mode_matches_partial_term_substrings(
self,
backend: TantivyBackend,
):
) -> None:
"""Simple text mode should support substring matching within tokens."""
doc = Document.objects.create(
title="Account access",
@@ -117,7 +117,7 @@ class TestSearch:
def test_text_mode_does_not_match_on_partial_term_overlap(
self,
backend: TantivyBackend,
):
) -> None:
"""Simple text mode should not match documents that merely share partial fragments."""
doc = Document.objects.create(
title="Adobe Acrobat PDF Files",
@@ -135,7 +135,7 @@ class TestSearch:
def test_text_mode_anchors_later_query_tokens_to_token_starts(
self,
backend: TantivyBackend,
):
) -> None:
"""Multi-token simple search should not match later tokens in the middle of a word."""
exact_doc = Document.objects.create(
title="Z-Berichte 6",
@@ -170,7 +170,7 @@ class TestSearch:
def test_text_mode_ignores_queries_without_searchable_tokens(
self,
backend: TantivyBackend,
):
) -> None:
"""Simple text mode should safely return no hits for symbol-only strings."""
doc = Document.objects.create(
title="Guide",
@@ -187,7 +187,7 @@ class TestSearch:
def test_title_mode_matches_partial_term_substrings(
self,
backend: TantivyBackend,
):
) -> None:
"""Title mode should support substring matching within title tokens."""
doc = Document.objects.create(
title="Password guide",
@@ -210,7 +210,7 @@ class TestSearch:
== 1
)
def test_sort_field_ascending(self, backend: TantivyBackend):
def test_sort_field_ascending(self, backend: TantivyBackend) -> None:
"""Searching with sort_reverse=False must return results in ascending ASN order."""
for asn in [30, 10, 20]:
doc = Document.objects.create(
@@ -231,7 +231,7 @@ class TestSearch:
asns = [Document.objects.get(pk=doc_id).archive_serial_number for doc_id in ids]
assert asns == [10, 20, 30]
def test_sort_field_descending(self, backend: TantivyBackend):
def test_sort_field_descending(self, backend: TantivyBackend) -> None:
"""Searching with sort_reverse=True must return results in descending ASN order."""
for asn in [30, 10, 20]:
doc = Document.objects.create(
@@ -256,7 +256,7 @@ class TestSearch:
class TestSearchIds:
"""Test lightweight ID-only search."""
def test_returns_matching_ids(self, backend: TantivyBackend):
def test_returns_matching_ids(self, backend: TantivyBackend) -> None:
"""search_ids must return IDs of all matching documents."""
docs = []
for i in range(5):
@@ -282,7 +282,7 @@ class TestSearchIds:
assert set(ids) == {d.pk for d in docs}
assert other.pk not in ids
def test_respects_permission_filter(self, backend: TantivyBackend):
def test_respects_permission_filter(self, backend: TantivyBackend) -> None:
"""search_ids must respect user permission filtering."""
owner = User.objects.create_user("ids_owner")
other = User.objects.create_user("ids_other")
@@ -303,7 +303,7 @@ class TestSearchIds:
backend.search_ids("secret", user=other, search_mode=SearchMode.QUERY) == []
)
def test_respects_fuzzy_threshold(self, backend: TantivyBackend, settings):
def test_respects_fuzzy_threshold(self, backend: TantivyBackend, settings) -> None:
"""search_ids must apply the same fuzzy threshold as search()."""
doc = Document.objects.create(
title="threshold test",
@@ -316,7 +316,7 @@ class TestSearchIds:
ids = backend.search_ids("unique", user=None, search_mode=SearchMode.QUERY)
assert ids == []
def test_returns_ids_for_text_mode(self, backend: TantivyBackend):
def test_returns_ids_for_text_mode(self, backend: TantivyBackend) -> None:
"""search_ids must work with TEXT search mode."""
doc = Document.objects.create(
title="text mode doc",
@@ -332,7 +332,7 @@ class TestSearchIds:
class TestRebuild:
"""Test index rebuilding functionality."""
def test_with_iter_wrapper_called(self, backend: TantivyBackend):
def test_with_iter_wrapper_called(self, backend: TantivyBackend) -> None:
"""Index rebuild must pass documents through iter_wrapper for progress tracking."""
seen = []
@@ -349,7 +349,7 @@ class TestRebuild:
class TestAutocomplete:
"""Test autocomplete functionality."""
def test_basic_functionality(self, backend: TantivyBackend):
def test_basic_functionality(self, backend: TantivyBackend) -> None:
"""Autocomplete must return words matching the given prefix."""
doc = Document.objects.create(
title="Invoice from Microsoft Corporation",
@@ -362,7 +362,10 @@ class TestAutocomplete:
results = backend.autocomplete("micro", limit=10)
assert "microsoft" in results
def test_results_ordered_by_document_frequency(self, backend: TantivyBackend):
def test_results_ordered_by_document_frequency(
self,
backend: TantivyBackend,
) -> None:
"""Autocomplete results must be ordered by document frequency to prioritize common terms."""
# "payment" appears in 3 docs; "payslip" in 1 — "pay" prefix should
# return "payment" before "payslip".
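Frequency-ordered prefix completion, as this test expects it, can be sketched with a plain Counter (illustrative only; the backend does this against tantivy's term dictionary):

```python
from collections import Counter


def autocomplete(docs, prefix, limit=10):
    """Return terms starting with `prefix`, most frequent by document first."""
    freq = Counter()
    for text in docs:
        # A set counts each term once per document, mirroring document
        # frequency rather than raw term frequency.
        freq.update({t for t in text.lower().split() if t.startswith(prefix)})
    return [term for term, _ in freq.most_common(limit)]
```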
@@ -390,7 +393,10 @@ class TestAutocomplete:
class TestMoreLikeThis:
"""Test more like this functionality."""
def test_more_like_this_ids_excludes_original(self, backend: TantivyBackend):
def test_more_like_this_ids_excludes_original(
self,
backend: TantivyBackend,
) -> None:
"""more_like_this_ids must return IDs of similar documents, excluding the original."""
doc1 = Document.objects.create(
title="Important document",
@@ -421,11 +427,11 @@ class TestSingleton:
yield
reset_backend()
def test_returns_same_instance_on_repeated_calls(self, index_dir):
def test_returns_same_instance_on_repeated_calls(self, index_dir) -> None:
"""Singleton pattern: repeated calls to get_backend() must return the same instance."""
assert get_backend() is get_backend()
def test_reinitializes_when_index_dir_changes(self, tmp_path, settings):
def test_reinitializes_when_index_dir_changes(self, tmp_path, settings) -> None:
"""Backend singleton must reinitialize when INDEX_DIR setting changes for test isolation."""
settings.INDEX_DIR = tmp_path / "a"
(tmp_path / "a").mkdir()
@@ -438,7 +444,7 @@ class TestSingleton:
assert b1 is not b2
assert b2._path == tmp_path / "b"
def test_reset_forces_new_instance(self, index_dir):
def test_reset_forces_new_instance(self, index_dir) -> None:
"""reset_backend() must force creation of a new backend instance on next get_backend() call."""
b1 = get_backend()
reset_backend()
@@ -449,7 +455,7 @@ class TestSingleton:
class TestFieldHandling:
"""Test handling of various document fields."""
def test_none_values_handled_correctly(self, backend: TantivyBackend):
def test_none_values_handled_correctly(self, backend: TantivyBackend) -> None:
"""Document fields with None values must not cause indexing errors."""
doc = Document.objects.create(
title="Test Doc",
@@ -464,7 +470,10 @@ class TestFieldHandling:
assert len(backend.search_ids("test", user=None)) == 1
def test_custom_fields_include_name_and_value(self, backend: TantivyBackend):
def test_custom_fields_include_name_and_value(
self,
backend: TantivyBackend,
) -> None:
"""Custom fields must be indexed with both field name and value for structured queries."""
field = CustomField.objects.create(
name="Invoice Number",
@@ -486,7 +495,10 @@ class TestFieldHandling:
assert len(backend.search_ids("invoice", user=None)) == 1
def test_select_custom_field_indexes_label_not_id(self, backend: TantivyBackend):
def test_select_custom_field_indexes_label_not_id(
self,
backend: TantivyBackend,
) -> None:
"""SELECT custom fields must index the human-readable label, not the opaque option ID."""
field = CustomField.objects.create(
name="Category",
@@ -514,7 +526,7 @@ class TestFieldHandling:
assert len(backend.search_ids("custom_fields.value:invoice", user=None)) == 1
assert len(backend.search_ids("custom_fields.value:opt_abc", user=None)) == 0
def test_none_custom_field_value_not_indexed(self, backend: TantivyBackend):
def test_none_custom_field_value_not_indexed(self, backend: TantivyBackend) -> None:
"""Custom field instances with no value set must not produce an index entry."""
field = CustomField.objects.create(
name="Optional",
@@ -536,7 +548,7 @@ class TestFieldHandling:
assert len(backend.search_ids("custom_fields.value:none", user=None)) == 0
def test_notes_include_user_information(self, backend: TantivyBackend):
def test_notes_include_user_information(self, backend: TantivyBackend) -> None:
"""Notes must be indexed with user information when available for structured queries."""
user = User.objects.create_user("notewriter")
doc = Document.objects.create(
@@ -566,7 +578,7 @@ class TestHighlightHits:
def test_highlights_simple_text_mode_returns_html_string(
self,
backend: TantivyBackend,
):
) -> None:
"""Simple text search should still produce content highlights for exact-token hits."""
doc = Document.objects.create(
title="Highlight Test",
@@ -583,7 +595,10 @@ class TestHighlightHits:
assert "content" in highlights
assert "<b>" in highlights["content"]
def test_highlights_content_returns_html_string(self, backend: TantivyBackend):
def test_highlights_content_returns_html_string(
self,
backend: TantivyBackend,
) -> None:
"""highlight_hits must return HTML strings (from Snippet.to_html()), not Snippet objects."""
doc = Document.objects.create(
title="Highlight Test",
@@ -607,7 +622,10 @@ class TestHighlightHits:
f"Expected HTML with <b> tags, got: {content_highlight!r}"
)
def test_highlights_notes_returns_html_string(self, backend: TantivyBackend):
def test_highlights_notes_returns_html_string(
self,
backend: TantivyBackend,
) -> None:
"""Note highlights must be HTML strings via notes_text companion field.
The notes JSON field does not support tantivy SnippetGenerator; the
@@ -642,12 +660,12 @@ class TestHighlightHits:
f"Expected HTML with <b> tags, got: {note_highlight!r}"
)
def test_empty_doc_list_returns_empty_hits(self, backend: TantivyBackend):
def test_empty_doc_list_returns_empty_hits(self, backend: TantivyBackend) -> None:
"""highlight_hits with no doc IDs must return an empty list."""
hits = backend.highlight_hits("anything", [])
assert hits == []
def test_no_highlights_when_no_match(self, backend: TantivyBackend):
def test_no_highlights_when_no_match(self, backend: TantivyBackend) -> None:
"""Documents not matching the query should not appear in results."""
doc = Document.objects.create(
title="Unrelated",
@@ -79,60 +79,60 @@ class TestMigrateFulltextQueryFieldPrefixes(TestMigrations):
value="note:something",
)
def test_note_prefix_rewritten(self):
def test_note_prefix_rewritten(self) -> None:
self.rule_note.refresh_from_db()
self.assertEqual(self.rule_note.value, "notes.note:invoice")
def test_custom_field_prefix_rewritten(self):
def test_custom_field_prefix_rewritten(self) -> None:
self.rule_cf.refresh_from_db()
self.assertEqual(self.rule_cf.value, "custom_fields.value:amount")
def test_combined_query_rewritten(self):
def test_combined_query_rewritten(self) -> None:
self.rule_combined.refresh_from_db()
self.assertEqual(
self.rule_combined.value,
"notes.note:invoice AND custom_fields.value:total",
)
def test_parenthesized_groups(self):
def test_parenthesized_groups(self) -> None:
self.rule_parens.refresh_from_db()
self.assertEqual(
self.rule_parens.value,
"(notes.note:invoice OR notes.note:receipt)",
)
def test_plus_prefix(self):
def test_plus_prefix(self) -> None:
self.rule_plus.refresh_from_db()
self.assertEqual(self.rule_plus.value, "+notes.note:foo")
def test_minus_prefix(self):
def test_minus_prefix(self) -> None:
self.rule_minus.refresh_from_db()
self.assertEqual(self.rule_minus.value, "-notes.note:bar")
def test_boosted(self):
def test_boosted(self) -> None:
self.rule_boost.refresh_from_db()
self.assertEqual(self.rule_boost.value, "notes.note:test^2")
def test_no_match_unchanged(self):
def test_no_match_unchanged(self) -> None:
self.rule_no_match.refresh_from_db()
self.assertEqual(self.rule_no_match.value, "title:hello content:world")
def test_word_boundary_no_false_positive(self):
def test_word_boundary_no_false_positive(self) -> None:
self.rule_denote.refresh_from_db()
self.assertEqual(self.rule_denote.value, "denote:foo")
def test_already_migrated_idempotent(self):
def test_already_migrated_idempotent(self) -> None:
self.rule_already_migrated.refresh_from_db()
self.assertEqual(self.rule_already_migrated.value, "notes.note:foo")
def test_already_migrated_cf_idempotent(self):
def test_already_migrated_cf_idempotent(self) -> None:
self.rule_already_migrated_cf.refresh_from_db()
self.assertEqual(self.rule_already_migrated_cf.value, "custom_fields.value:bar")
def test_null_value_no_crash(self):
def test_null_value_no_crash(self) -> None:
self.rule_null.refresh_from_db()
self.assertIsNone(self.rule_null.value)
def test_non_fulltext_rule_untouched(self):
def test_non_fulltext_rule_untouched(self) -> None:
self.rule_other_type.refresh_from_db()
self.assertEqual(self.rule_other_type.value, "note:something")
+2 -8
@@ -24,13 +24,7 @@ class TestDocumentAdmin(DirectoriesMixin, TestCase):
backend = get_backend()
searcher = backend._index.searcher()
results = searcher.search(
tantivy.Query.range_query(
backend._schema,
"id",
tantivy.FieldType.Unsigned,
doc.pk,
doc.pk,
),
tantivy.Query.term_query(backend._schema, "id", doc.pk),
limit=1,
)
if results.hits:
@@ -100,7 +94,7 @@ class TestTagAdmin(DirectoriesMixin, TestCase):
self.tag_admin = TagAdmin(model=Tag, admin_site=AdminSite())
@patch("documents.tasks.bulk_update_documents")
def test_parent_tags_get_added(self, mock_bulk_update):
def test_parent_tags_get_added(self, mock_bulk_update) -> None:
document = Document.objects.create(title="test")
parent = Tag.objects.create(name="parent")
child = Tag.objects.create(name="child")
+9 -1
@@ -91,6 +91,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
- app_title and app_logo are included
"""
config = ApplicationConfiguration.objects.first()
assert config is not None
config.app_title = "Fancy New Title"
config.app_logo = "/logo/example.jpg"
config.save()
@@ -125,6 +126,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
assert config is not None
self.assertEqual(config.color_conversion_strategy, ColorConvertChoices.RGB)
def test_api_update_config_empty_fields(self) -> None:
@@ -150,6 +152,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
assert config is not None
self.assertEqual(config.user_args, None)
self.assertEqual(config.language, None)
self.assertEqual(config.barcode_tag_mapping, None)
@@ -187,6 +190,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
self.assertIn("image/jpeg", response["Content-Type"])
config = ApplicationConfiguration.objects.first()
assert config is not None
old_logo = config.app_logo
self.assertTrue(Path(old_logo.path).exists())
self.client.patch(
@@ -233,6 +237,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
assert config is not None
with Image.open(config.app_logo.path) as stored_logo:
stored_exif = stored_logo.getexif()
@@ -268,6 +273,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
config = ApplicationConfiguration.objects.first()
assert config is not None
with Image.open(config.app_logo.path) as stored_logo:
stored_text = stored_logo.text
@@ -786,6 +792,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
- llm_api_key is set to None
"""
config = ApplicationConfiguration.objects.first()
assert config is not None
config.llm_api_key = "1234567890"
config.save()
@@ -826,6 +833,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
- LLM index is triggered to update
"""
config = ApplicationConfiguration.objects.first()
assert config is not None
config.ai_enabled = False
config.llm_embedding_backend = None
config.save()
@@ -840,7 +848,7 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
json.dumps(
{
"ai_enabled": True,
"llm_embedding_backend": "openai",
"llm_embedding_backend": "openai-like",
},
),
content_type="application/json",
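The recurring change in the hunks above is narrowing the result of `Model.objects.first()` (typed as `Optional[...]`) with a bare `assert config is not None` before touching its attributes, so a static type checker accepts the assertions that follow. A minimal stand-alone sketch of that pattern, with a hypothetical `Config` class and `first()` helper in place of a Django queryset:

```python
from typing import Optional

class Config:
    """Stand-in for a Django model row (hypothetical)."""

    def __init__(self, color_conversion_strategy: str) -> None:
        self.color_conversion_strategy = color_conversion_strategy

def first(rows: list[Config]) -> Optional[Config]:
    # Mirrors Django's QuerySet.first(): None for an empty set, which is
    # why a type checker flags unguarded attribute access on the result.
    return rows[0] if rows else None

def strategy(rows: list[Config]) -> str:
    config = first(rows)
    assert config is not None  # narrows Optional[Config] to Config
    return config.color_conversion_strategy

print(strategy([Config("RGB")]))  # RGB
```

The `assert` both narrows the type and fails the test loudly if the row is unexpectedly missing, rather than raising an `AttributeError` further down.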
+1 -1
@@ -918,7 +918,7 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
],
)
def test_api_selection_data_requires_view_permission(self):
def test_api_selection_data_requires_view_permission(self) -> None:
self.doc2.owner = self.user
self.doc2.save()
+48 -16
@@ -276,7 +276,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
doc.refresh_from_db()
self.assertEqual(doc.custom_fields.first().value, None)
_cf_1 = doc.custom_fields.first()
assert _cf_1 is not None
self.assertEqual(_cf_1.value, None)
@mock.patch("documents.signals.handlers.process_cf_select_update.apply_async")
def test_custom_field_update_offloaded_once(self, mock_delay) -> None:
@@ -567,7 +569,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(CustomFieldInstance.objects.count(), 1)
self.assertEqual(doc.custom_fields.first().value, "test value")
_cf_2 = doc.custom_fields.first()
assert _cf_2 is not None
self.assertEqual(_cf_2.value, "test value")
# Update
resp = self.client.patch(
@@ -584,7 +588,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(CustomFieldInstance.objects.count(), 1)
self.assertEqual(doc.custom_fields.first().value, "a new test value")
_cf_3 = doc.custom_fields.first()
assert _cf_3 is not None
self.assertEqual(_cf_3.value, "a new test value")
def test_delete_custom_field_instance(self) -> None:
"""
@@ -650,7 +656,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
self.assertEqual(CustomFieldInstance.objects.count(), 1)
self.assertEqual(Document.objects.count(), 1)
self.assertEqual(len(doc.custom_fields.all()), 1)
self.assertEqual(doc.custom_fields.first().value, date_value)
_cf_4 = doc.custom_fields.first()
assert _cf_4 is not None
self.assertEqual(_cf_4.value, date_value)
def test_custom_field_validation(self) -> None:
"""
@@ -1062,9 +1070,15 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(CustomFieldInstance.objects.count(), 4)
self.assertEqual(doc2.custom_fields.first().value, [1])
self.assertEqual(doc3.custom_fields.first().value, [1])
self.assertEqual(doc4.custom_fields.first().value, [1])
_cf_5 = doc2.custom_fields.first()
assert _cf_5 is not None
self.assertEqual(_cf_5.value, [1])
_cf_6 = doc3.custom_fields.first()
assert _cf_6 is not None
self.assertEqual(_cf_6.value, [1])
_cf_7 = doc4.custom_fields.first()
assert _cf_7 is not None
self.assertEqual(_cf_7.value, [1])
# Add links appends if necessary
resp = self.client.patch(
@@ -1081,7 +1095,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(doc4.custom_fields.first().value, [1, 3])
_cf_8 = doc4.custom_fields.first()
assert _cf_8 is not None
self.assertEqual(_cf_8.value, [1, 3])
# Remove one of the links, removed on other doc
resp = self.client.patch(
@@ -1098,9 +1114,15 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(doc2.custom_fields.first().value, [1])
self.assertEqual(doc3.custom_fields.first().value, [1, 4])
self.assertEqual(doc4.custom_fields.first().value, [3])
_cf_9 = doc2.custom_fields.first()
assert _cf_9 is not None
self.assertEqual(_cf_9.value, [1])
_cf_10 = doc3.custom_fields.first()
assert _cf_10 is not None
self.assertEqual(_cf_10.value, [1, 4])
_cf_11 = doc4.custom_fields.first()
assert _cf_11 is not None
self.assertEqual(_cf_11.value, [3])
# Removes the field entirely
resp = self.client.patch(
@@ -1112,9 +1134,15 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(doc2.custom_fields.first().value, [])
self.assertEqual(doc3.custom_fields.first().value, [4])
self.assertEqual(doc4.custom_fields.first().value, [3])
_cf_12 = doc2.custom_fields.first()
assert _cf_12 is not None
self.assertEqual(_cf_12.value, [])
_cf_13 = doc3.custom_fields.first()
assert _cf_13 is not None
self.assertEqual(_cf_13.value, [4])
_cf_14 = doc4.custom_fields.first()
assert _cf_14 is not None
self.assertEqual(_cf_14.value, [3])
# If field exists on target doc but value is None
doc5 = Document.objects.create(
@@ -1139,7 +1167,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
self.assertEqual(doc5.custom_fields.first().value, [1])
_cf_15 = doc5.custom_fields.first()
assert _cf_15 is not None
self.assertEqual(_cf_15.value, [1])
def test_documentlink_patch_requires_change_permission_on_target_documents(
self,
@@ -1321,7 +1351,9 @@ class TestCustomFieldsAPI(DirectoriesMixin, APITestCase):
results = response.data["results"]
self.assertEqual(results[0]["document_count"], 0)
def test_patch_document_invalid_date_custom_field_returns_validation_error(self):
def test_patch_document_invalid_date_custom_field_returns_validation_error(
self,
) -> None:
"""
GIVEN:
- A date custom field
+6 -6
@@ -1168,7 +1168,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
self.assertIn("all", response.data)
self.assertCountEqual(response.data["all"], [d.id for d in docs])
def test_default_ordering_uses_id_as_tiebreaker(self):
def test_default_ordering_uses_id_as_tiebreaker(self) -> None:
"""
GIVEN:
- Documents sharing the same created date
@@ -2156,7 +2156,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
match_tags,
match_document_types,
match_storage_paths,
):
) -> None:
doc = Document.objects.create(
title="test",
mime_type="application/pdf",
@@ -2193,7 +2193,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
match_document_types,
match_storage_paths,
mocked_load,
):
) -> None:
"""
GIVEN:
- Request for suggestions for a document
@@ -2276,7 +2276,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
def test_get_suggestions_dates_disabled(
self,
mock_get_date_parser: mock.MagicMock,
):
) -> None:
"""
GIVEN:
- NUMBER_OF_SUGGESTED_DATES = 0 (disables feature)
@@ -3409,7 +3409,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
)
self.assertEqual(resp.status_code, status.HTTP_200_OK)
def test_create_share_link_requires_view_permission_for_document(self):
def test_create_share_link_requires_view_permission_for_document(self) -> None:
"""
GIVEN:
- A user with add_sharelink but without view permission on a document
@@ -3457,7 +3457,7 @@ class TestDocumentApi(DirectoriesMixin, ConsumeTaskMixin, APITestCase):
self.assertEqual(create_resp.status_code, status.HTTP_201_CREATED)
self.assertEqual(create_resp.data["document"], doc.pk)
def test_next_asn(self):
def test_next_asn(self) -> None:
"""
GIVEN:
- Existing documents with ASNs, highest owned by user2
+1 -1
@@ -933,7 +933,7 @@ class TestApiUser(DirectoriesMixin, APITestCase):
returned_user1 = User.objects.get(pk=user1.pk)
self.assertEqual(returned_user1.is_superuser, False)
def test_only_superusers_can_create_or_alter_staff_status(self):
def test_only_superusers_can_create_or_alter_staff_status(self) -> None:
"""
GIVEN:
- Existing user account
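This one-line hunk, like many above, only adds a `-> None` return annotation to a test method. Under mypy's `disallow_untyped_defs` (enabled by `--strict`), an unannotated `def` counts as untyped and its body is not fully checked; annotating the implicit `None` return opts the body in. A small sketch with a hypothetical test class:

```python
class TestOnlySuperusers:
    # Unannotated: mypy --strict reports a missing return type
    # annotation and treats the function as untyped.
    def untyped(self):
        return None

    # Annotated: the def is now "typed", so the body is checked too.
    def typed(self) -> None:
        assert isinstance("user", str)

t = TestOnlySuperusers()
print(t.typed())  # a test method returns None
```

At runtime nothing changes; the annotation only affects static analysis, which is why these hunks are safe mechanical edits.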
+24 -14
@@ -79,14 +79,14 @@ class TestApiSchema(APITestCase):
class TestTasksSummarySchema:
"""tasks_summary_retrieve: response must be an array of TaskSummarySerializer."""
def test_summary_response_is_array(self, api_schema: SchemaGenerator):
def test_summary_response_is_array(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200["type"] == "array", (
"tasks_summary_retrieve response must be type:array"
)
def test_summary_items_have_total_count(self, api_schema: SchemaGenerator):
def test_summary_items_have_total_count(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
items = resp_200.get("items", {})
@@ -100,7 +100,10 @@ class TestTasksSummarySchema:
"summary items must have 'total_count' (TaskSummarySerializer)"
)
def test_summary_days_parameter_constraints(self, api_schema: SchemaGenerator):
def test_summary_days_parameter_constraints(
self,
api_schema: SchemaGenerator,
) -> None:
op = api_schema["paths"]["/api/tasks/summary/"]["get"]
params = {p["name"]: p for p in op.get("parameters", [])}
assert "days" in params, "days query parameter must be declared"
@@ -112,14 +115,14 @@ class TestTasksSummarySchema:
class TestTasksActiveSchema:
"""tasks_active_retrieve: response must be an array of TaskSerializerV10."""
def test_active_response_is_array(self, api_schema: SchemaGenerator):
def test_active_response_is_array(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/tasks/active/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200["type"] == "array", (
"tasks_active_retrieve response must be type:array"
)
def test_active_items_ref_named_schema(self, api_schema: SchemaGenerator):
def test_active_items_ref_named_schema(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/tasks/active/"]["get"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
items = resp_200.get("items", {})
@@ -133,7 +136,11 @@ class TestMetadataSchema:
"""Metadata component: array fields and optional archive fields."""
@pytest.mark.parametrize("field", ["original_metadata", "archive_metadata"])
def test_metadata_field_is_array(self, api_schema: SchemaGenerator, field: str):
def test_metadata_field_is_array(
self,
api_schema: SchemaGenerator,
field: str,
) -> None:
props = api_schema["components"]["schemas"]["Metadata"]["properties"]
assert props[field]["type"] == "array", (
f"{field} should be type:array, not type:object"
@@ -144,7 +151,7 @@ class TestMetadataSchema:
self,
api_schema: SchemaGenerator,
field: str,
):
) -> None:
props = api_schema["components"]["schemas"]["Metadata"]["properties"]
items = props[field]["items"]
ref = items.get("$ref", "")
@@ -166,7 +173,7 @@ class TestMetadataSchema:
"archive_metadata",
],
)
def test_archive_field_not_required(self, api_schema, field):
def test_archive_field_not_required(self, api_schema, field) -> None:
schema = api_schema["components"]["schemas"]["Metadata"]
required = schema.get("required", [])
assert field not in required
@@ -179,7 +186,7 @@ class TestMetadataSchema:
class TestStoragePathTestSchema:
"""storage_paths_test_create: response must be a string, not a StoragePath object."""
def test_test_action_response_is_string(self, api_schema: SchemaGenerator):
def test_test_action_response_is_string(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/storage_paths/test/"]["post"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
assert resp_200.get("type") == "string", (
@@ -189,7 +196,7 @@ class TestStoragePathTestSchema:
def test_test_action_request_uses_storage_path_test_serializer(
self,
api_schema: SchemaGenerator,
):
) -> None:
op = api_schema["paths"]["/api/storage_paths/test/"]["post"]
content = (
op.get("requestBody", {}).get("content", {}).get("application/json", {})
@@ -220,11 +227,14 @@ class TestProcessedMailBulkDeleteSchema:
self,
api_schema: SchemaGenerator,
field: str,
):
) -> None:
props = self._get_props(api_schema)
assert field in props, f"bulk_delete 200 response must have a '{field}' field"
def test_bulk_delete_response_is_not_processed_mail_serializer(self, api_schema):
def test_bulk_delete_response_is_not_processed_mail_serializer(
self,
api_schema,
) -> None:
op = api_schema["paths"]["/api/processed_mail/bulk_delete/"]["post"]
resp_200 = op["responses"]["200"]["content"]["application/json"]["schema"]
ref = resp_200.get("$ref", "")
@@ -237,13 +247,13 @@ class TestProcessedMailBulkDeleteSchema:
class TestShareLinkBundleRebuildSchema:
"""share_link_bundles_rebuild_create: 200 returns bundle data; 400 is documented."""
def test_rebuild_has_400_response(self, api_schema: SchemaGenerator):
def test_rebuild_has_400_response(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/share_link_bundles/{id}/rebuild/"]["post"]
assert "400" in op["responses"], (
"rebuild must document the 400 response for 'Bundle is already being processed.'"
)
def test_rebuild_400_has_detail_field(self, api_schema: SchemaGenerator):
def test_rebuild_400_has_detail_field(self, api_schema: SchemaGenerator) -> None:
op = api_schema["paths"]["/api/share_link_bundles/{id}/rebuild/"]["post"]
resp_400 = op["responses"]["400"]["content"]["application/json"]["schema"]
ref = resp_400.get("$ref", "")
+1 -1
@@ -943,7 +943,7 @@ class TestDocumentSearchApi(DirectoriesMixin, APITestCase):
response = self.client.get("/api/documents/?query=things")
self.assertIsNone(response.data["corrected_query"])
def test_search_spelling_suggestion_suppressed_for_private_terms(self):
def test_search_spelling_suggestion_suppressed_for_private_terms(self) -> None:
owner = User.objects.create_user("owner")
attacker = User.objects.create_user("attacker")
attacker.user_permissions.add(
+2 -2
@@ -404,7 +404,7 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains the correct AI status
"""
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai"):
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai-like"):
self.client.force_login(self.user)
# No tasks found
@@ -431,7 +431,7 @@ class TestSystemStatus(APITestCase):
THEN:
- The response contains the correct AI status
"""
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai"):
with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai-like"):
PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.LLM_INDEX,
trigger_source=PaperlessTask.TriggerSource.SCHEDULED,
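These two hunks only change the test value of `LLM_EMBEDDING_BACKEND` from `"openai"` to `"openai-like"`, tracking the "Use OpenAI-like backend" change in the commit history above. The `override_settings` helper they rely on saves a setting, applies the override, and restores it on exit; a rough stand-alone imitation of that shape (hypothetical `Settings` holder, not Django's real implementation):

```python
from contextlib import contextmanager
from typing import Optional

class Settings:
    AI_ENABLED: bool = False
    LLM_EMBEDDING_BACKEND: Optional[str] = None

settings = Settings()

@contextmanager
def override_settings(**overrides):
    # Save current values, apply the overrides, restore on exit --
    # the same save/apply/restore shape as django.test.override_settings.
    saved = {name: getattr(settings, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(settings, name, value)
        yield settings
    finally:
        for name, value in saved.items():
            setattr(settings, name, value)

with override_settings(AI_ENABLED=True, LLM_EMBEDDING_BACKEND="openai-like"):
    print(settings.LLM_EMBEDDING_BACKEND)  # overridden inside the block
print(settings.AI_ENABLED)  # restored afterwards
```

Because the override is scoped to the `with` block, each test sees the renamed backend without leaking state into its neighbours.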
+24 -39
@@ -273,6 +273,7 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.assertEqual(Workflow.objects.count(), 2)
workflow = Workflow.objects.get(name="Workflow 2")
trigger = workflow.triggers.first()
assert trigger is not None
self.assertSetEqual(
set(trigger.filter_has_tags.values_list("id", flat=True)),
{self.t1.id},
@@ -493,44 +494,24 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
workflow = Workflow.objects.get(id=response.data["id"])
self.assertEqual(workflow.name, "Workflow Updated")
self.assertEqual(workflow.triggers.first().filter_has_tags.first(), self.t1)
trigger = workflow.triggers.first()
assert trigger is not None
action = workflow.actions.first()
assert action is not None
self.assertEqual(trigger.filter_has_tags.first(), self.t1)
self.assertEqual(trigger.filter_has_all_tags.first(), self.t2)
self.assertEqual(trigger.filter_has_not_tags.first(), self.t3)
self.assertEqual(trigger.filter_has_any_correspondents.first(), self.c)
self.assertEqual(trigger.filter_has_not_correspondents.first(), self.c2)
self.assertEqual(trigger.filter_has_any_document_types.first(), self.dt)
self.assertEqual(trigger.filter_has_not_document_types.first(), self.dt2)
self.assertEqual(trigger.filter_has_any_storage_paths.first(), self.sp)
self.assertEqual(trigger.filter_has_not_storage_paths.first(), self.sp2)
self.assertEqual(
workflow.triggers.first().filter_has_all_tags.first(),
self.t2,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_tags.first(),
self.t3,
)
self.assertEqual(
workflow.triggers.first().filter_has_any_correspondents.first(),
self.c,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_correspondents.first(),
self.c2,
)
self.assertEqual(
workflow.triggers.first().filter_has_any_document_types.first(),
self.dt,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_document_types.first(),
self.dt2,
)
self.assertEqual(
workflow.triggers.first().filter_has_any_storage_paths.first(),
self.sp,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_storage_paths.first(),
self.sp2,
)
self.assertEqual(
workflow.triggers.first().filter_custom_field_query,
trigger.filter_custom_field_query,
json.dumps(["AND", [[self.cf1.id, "exact", "value"]]]),
)
self.assertEqual(workflow.actions.first().assign_title, "Action New Title")
self.assertEqual(action.assign_title, "Action New Title")
def test_api_update_workflow_no_trigger_actions(self) -> None:
"""
@@ -612,9 +593,13 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_200_OK)
workflow = Workflow.objects.get(id=response.data["id"])
self.assertEqual(WorkflowTrigger.objects.all().count(), 1)
self.assertNotEqual(workflow.triggers.first().id, self.trigger.id)
new_trigger = workflow.triggers.first()
assert new_trigger is not None
self.assertNotEqual(new_trigger.id, self.trigger.id)
self.assertEqual(WorkflowAction.objects.all().count(), 1)
self.assertNotEqual(workflow.actions.first().id, self.action.id)
new_action = workflow.actions.first()
assert new_action is not None
self.assertNotEqual(new_action.id, self.action.id)
def test_email_action_validation(self) -> None:
"""
@@ -873,7 +858,7 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.action.refresh_from_db()
self.assertEqual(self.action.assign_title, "Patched Title")
def test_password_action_passwords_field(self):
def test_password_action_passwords_field(self) -> None:
"""
GIVEN:
- Nothing
@@ -896,7 +881,7 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
self.assertEqual(response.data["passwords"], passwords)
def test_password_action_invalid_passwords_field(self):
def test_password_action_invalid_passwords_field(self) -> None:
"""
GIVEN:
- Nothing
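Beyond type narrowing, the workflow hunk above collapses eight separate `workflow.triggers.first()` calls into one narrowed local. Against a real database each `.first()` is its own query, so the rewrite also trims redundant round trips. A sketch of the before/after shape, using a hypothetical lookup counter in place of a queryset:

```python
from typing import Optional

class Trigger:
    name = "trigger-1"

class TriggerSet:
    """Counts lookups, standing in for a queryset hitting the DB."""

    def __init__(self, trigger: Optional[Trigger]) -> None:
        self._trigger = trigger
        self.lookups = 0

    def first(self) -> Optional[Trigger]:
        self.lookups += 1
        return self._trigger

triggers = TriggerSet(Trigger())

# Before: every assertion re-runs the lookup.
for _ in range(8):
    assert triggers.first().name == "trigger-1"  # type: ignore[union-attr]
before = triggers.lookups

# After: one lookup, narrowed once, reused everywhere.
trigger = triggers.first()
assert trigger is not None
for _ in range(8):
    assert trigger.name == "trigger-1"

print(before, triggers.lookups - before)  # 8 lookups before vs 1 after
```

The hoisted local is also what makes the `assert ... is not None` narrowing apply to every subsequent assertion at once.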
+9 -3
@@ -86,7 +86,7 @@ class TestBarcode(
self.assertDictEqual(separator_page_numbers, {1: False})
@override_settings(CONSUMER_ENABLE_ASN_BARCODE=True)
def test_asn_barcode_duplicate_in_trash_fails(self):
def test_asn_barcode_duplicate_in_trash_fails(self) -> None:
"""
GIVEN:
- A document with ASN barcode 123 is in the trash
@@ -585,6 +585,7 @@ class TestBarcode(
- The barcode config is used
"""
app_config = ApplicationConfiguration.objects.first()
assert app_config is not None
app_config.barcodes_enabled = True
app_config.barcode_string = "CUSTOM BARCODE"
app_config.save()
@@ -771,6 +772,7 @@ class TestAsnBarcode(DirectoriesMixin, SampleDirMixin, GetReaderPluginMixin, Tes
)
document = Document.objects.first()
assert document is not None
self.assertEqual(document.archive_serial_number, 123)
@@ -1059,11 +1061,15 @@ class TestTagBarcode(DirectoriesMixin, SampleDirMixin, GetReaderPluginMixin, Tes
doc2 = documents[1]
self.assertEqual(doc2.tags.count(), 1)
self.assertEqual(doc2.tags.first().name, "invoice")
_tag_1 = doc2.tags.first()
assert _tag_1 is not None
self.assertEqual(_tag_1.name, "invoice")
doc3 = documents[2]
self.assertEqual(doc3.tags.count(), 1)
self.assertEqual(doc3.tags.first().name, "receipt")
_tag_2 = doc3.tags.first()
assert _tag_2 is not None
self.assertEqual(_tag_2.name, "receipt")
@override_settings(
CONSUMER_ENABLE_TAG_BARCODE=True,
+28 -30
@@ -319,8 +319,10 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
[self.doc3.id],
)
# assert reflect document link
_cf_1 = self.doc3.custom_fields.first()
assert _cf_1 is not None
self.assertEqual(
self.doc3.custom_fields.first().value,
_cf_1.value,
[self.doc2.id, self.doc1.id],
)
@@ -334,14 +336,12 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
add_custom_fields={},
remove_custom_fields=[cf3.id],
)
self.assertNotIn(
self.doc3.id,
self.doc1.custom_fields.filter(field=cf3).first().value,
)
self.assertNotIn(
self.doc3.id,
self.doc2.custom_fields.filter(field=cf3).first().value,
)
_cf_2 = self.doc1.custom_fields.filter(field=cf3).first()
assert _cf_2 is not None
self.assertNotIn(self.doc3.id, _cf_2.value)
_cf_3 = self.doc2.custom_fields.filter(field=cf3).first()
assert _cf_3 is not None
self.assertNotIn(self.doc3.id, _cf_3.value)
def test_modify_custom_fields_doclink_self_link(self) -> None:
"""
@@ -363,14 +363,12 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
remove_custom_fields=[],
)
self.assertEqual(
self.doc1.custom_fields.first().value,
[self.doc2.id],
)
self.assertEqual(
self.doc2.custom_fields.first().value,
[self.doc1.id],
)
_cf_4 = self.doc1.custom_fields.first()
assert _cf_4 is not None
self.assertEqual(_cf_4.value, [self.doc2.id])
_cf_5 = self.doc2.custom_fields.first()
assert _cf_5 is not None
self.assertEqual(_cf_5.value, [self.doc1.id])
def test_delete(self) -> None:
self.assertEqual(Document.objects.count(), 5)
@@ -693,7 +691,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self,
mock_consume_file,
mock_delete_documents,
):
) -> None:
"""
GIVEN:
- Existing documents
@@ -932,7 +930,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_chord,
mock_consume_file,
mock_delete_documents,
):
) -> None:
"""
GIVEN:
- Existing documents
@@ -1027,7 +1025,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_consume_file.assert_not_called()
@mock.patch("documents.tasks.consume_file.apply_async")
def test_rotate(self, mock_consume_delay):
def test_rotate(self, mock_consume_delay) -> None:
"""
GIVEN:
- Existing documents
@@ -1054,7 +1052,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self,
mock_pdf_save,
mock_consume_delay,
):
) -> None:
"""
GIVEN:
- Existing documents
@@ -1078,7 +1076,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
def test_rotate_non_pdf(
self,
mock_consume_delay,
):
) -> None:
"""
GIVEN:
- Existing documents
@@ -1105,7 +1103,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_open,
mock_consume_delay,
mock_magic,
):
) -> None:
Document.objects.create(
checksum="B-v1",
title="B version 1",
@@ -1128,7 +1126,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.Pdf.save")
@mock.patch("documents.data_models.magic.from_file", return_value="application/pdf")
def test_delete_pages(self, mock_magic, mock_pdf_save, mock_consume_delay):
def test_delete_pages(self, mock_magic, mock_pdf_save, mock_consume_delay) -> None:
"""
GIVEN:
- Existing documents
@@ -1159,7 +1157,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_open,
mock_consume_delay,
mock_magic,
):
) -> None:
Document.objects.create(
checksum="B-v1",
title="B version 1",
@@ -1181,7 +1179,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
@mock.patch("documents.tasks.consume_file.apply_async")
@mock.patch("pikepdf.Pdf.save")
def test_delete_pages_with_error(self, mock_pdf_save, mock_consume_delay):
def test_delete_pages_with_error(self, mock_pdf_save, mock_consume_delay) -> None:
"""
GIVEN:
- Existing documents
@@ -1300,7 +1298,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self.assertEqual(self.doc2.archive_serial_number, 333)
@mock.patch("documents.tasks.consume_file.apply_async")
def test_edit_pdf_with_update_document(self, mock_consume_delay):
def test_edit_pdf_with_update_document(self, mock_consume_delay) -> None:
"""
GIVEN:
- A single existing PDF document
@@ -1338,7 +1336,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_new,
mock_consume_delay,
mock_magic,
):
) -> None:
Document.objects.create(
checksum="B-v1",
title="B version 1",
@@ -1416,7 +1414,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
self,
mock_consume_file,
mock_group,
):
) -> None:
"""
GIVEN:
- Existing document
@@ -1446,7 +1444,7 @@ class TestPDFActions(DirectoriesMixin, TestCase):
mock_mkdtemp,
mock_consume_delay,
mock_update_document,
):
) -> None:
doc = self.doc1
temp_dir = self.dirs.scratch_dir / "remove-password-update"
temp_dir.mkdir(parents=True, exist_ok=True)
+26
@@ -246,6 +246,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -297,6 +298,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -316,6 +318,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -331,6 +334,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -347,6 +351,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
@@ -363,6 +368,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.document_type.id, dt.id)
self._assert_first_last_send_progress()
@@ -377,6 +383,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.storage_path.id, sp.id)
self._assert_first_last_send_progress()
@@ -393,6 +400,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIn(t1, document.tags.all())
self.assertNotIn(t2, document.tags.all())
@@ -419,6 +427,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
fields_used = [
field_instance.field for field_instance in document.custom_fields.all()
@@ -441,6 +450,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.archive_serial_number, 123)
self._assert_first_last_send_progress()
@@ -460,6 +470,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
now = timezone.now()
self.assertEqual(document.title, f"{c.name}{dt.name} {now.strftime('%m-%y')}")
@@ -475,6 +486,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.owner, testuser)
self._assert_first_last_send_progress()
@@ -493,6 +505,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
user_checker = ObjectPermissionChecker(testuser)
self.assertTrue(user_checker.has_perm("view_document", document))
@@ -565,6 +578,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
document.delete()
with self.assertRaisesMessage(ConsumerError, "document is in the trash"):
@@ -645,6 +659,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.title, "new docs")
self.assertEqual(document.filename, "none/new docs.pdf")
@@ -666,6 +681,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertIsNotNone(document)
assert document is not None
@@ -704,6 +720,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.title, "new docs")
self.assertIsNotNone(document.title)
@@ -724,6 +741,7 @@ class TestConsumer(
document = Document.objects.first()
assert document is not None
assert document is not None
self.assertEqual(document.version_label, "v1")
@@ -940,6 +958,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(document.correspondent, correspondent)
self.assertEqual(document.document_type, dtype)
@@ -957,6 +976,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self._assert_first_last_send_progress()
@@ -987,6 +1007,7 @@ class TestConsumer(
# Move the existing document to trash
document = Document.objects.first()
assert document is not None
document.delete()
dst = self.get_test_file()
@@ -1015,6 +1036,7 @@ class TestConsumer(
consumer.run()
document = Document.objects.first()
assert document is not None
self._assert_first_last_send_progress()
@@ -1173,6 +1195,7 @@ class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(
document.created,
@@ -1203,6 +1226,7 @@ class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(
document.created,
@@ -1233,6 +1257,7 @@ class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(
document.created,
@@ -1265,6 +1290,7 @@ class TestConsumerCreatedDate(DirectoriesMixin, GetConsumerMixin, TestCase):
consumer.run()
document = Document.objects.first()
assert document is not None
self.assertEqual(
document.created,
+2 -2
@@ -1514,7 +1514,7 @@ class TestFilenameGeneration(DirectoriesMixin, TestCase):
Path("somepath/asn-201-400/asn-3xx/Does Matter.pdf"),
)
def test_template_related_context_keeps_legacy_string_coercion(self):
def test_template_related_context_keeps_legacy_string_coercion(self) -> None:
"""
GIVEN:
- A storage path template that uses related objects directly as strings
@@ -1908,7 +1908,7 @@ class TestCustomFieldFilenameUpdates(
self.assertLessEqual(m.call_count, 1)
@override_settings(FILENAME_FORMAT=None)
def test_overlong_storage_path_keeps_existing_filename(self):
def test_overlong_storage_path_keeps_existing_filename(self) -> None:
initial_filename = generate_filename(self.doc)
Document.objects.filter(pk=self.doc.pk).update(filename=str(initial_filename))
self.doc.refresh_from_db()
+5 -5
@@ -83,7 +83,7 @@ class TestDateLocalization:
format_style: str,
locale_str: str,
expected_output: str,
):
) -> None:
"""
Tests `localize_date` with `date` objects across different locales and formats.
"""
@@ -144,7 +144,7 @@ class TestDateLocalization:
format_style: str,
locale_str: str,
expected_output: str,
):
) -> None:
# To handle the non-breaking space in French and other locales
result = localize_date(value, format_style, locale_str)
assert result.replace("\u202f", " ") == expected_output.replace("\u202f", " ")
@@ -161,7 +161,7 @@ class TestDateLocalization:
def test_localize_date_raises_type_error_for_invalid_input(
self,
invalid_value: None | list[object] | dict[Any, Any] | Literal[1698330605],
):
) -> None:
with pytest.raises(TypeError) as excinfo:
localize_date(invalid_value, "medium", "en_US")
@@ -228,7 +228,7 @@ class TestDateLocalization:
format_style: str,
locale_str: str,
expected_output: str,
):
) -> None:
"""
Tests `localize_date` with `date` string across different locales and formats.
"""
@@ -289,7 +289,7 @@ class TestDateLocalization:
format_style: str,
locale_str: str,
expected_output: str,
):
) -> None:
"""
Tests `localize_date` with `date` string across different locales and formats.
"""
+18 -13
@@ -10,10 +10,10 @@ from documents.regex import validate_regex_pattern
class TestValidateRegexPattern:
def test_valid_pattern(self):
def test_valid_pattern(self) -> None:
validate_regex_pattern(r"\d+")
def test_invalid_pattern_raises(self):
def test_invalid_pattern_raises(self) -> None:
with pytest.raises(ValueError):
validate_regex_pattern(r"[invalid")
@@ -40,7 +40,7 @@ class TestSafeRegexSearchAndMatch:
),
],
)
def test_match_found(self, func, pattern, text, expected_group):
def test_match_found(self, func, pattern, text, expected_group) -> None:
result = func(pattern, text)
assert result is not None
assert result.group() == expected_group
@@ -52,7 +52,7 @@ class TestSafeRegexSearchAndMatch:
pytest.param(safe_regex_match, r"\d+", "abc123", id="match-no-match"),
],
)
def test_no_match(self, func, pattern, text):
def test_no_match(self, func, pattern, text) -> None:
assert func(pattern, text) is None
@pytest.mark.parametrize(
@@ -62,7 +62,7 @@ class TestSafeRegexSearchAndMatch:
pytest.param(safe_regex_match, id="match"),
],
)
def test_invalid_pattern_returns_none(self, func):
def test_invalid_pattern_returns_none(self, func) -> None:
assert func(r"[invalid", "test") is None
@pytest.mark.parametrize(
@@ -72,7 +72,7 @@ class TestSafeRegexSearchAndMatch:
pytest.param(safe_regex_match, id="match"),
],
)
def test_flags_respected(self, func):
def test_flags_respected(self, func) -> None:
assert func(r"abc", "ABC", flags=regex.IGNORECASE) is not None
@pytest.mark.parametrize(
@@ -82,7 +82,12 @@ class TestSafeRegexSearchAndMatch:
pytest.param(safe_regex_match, "match", id="match"),
],
)
def test_timeout_returns_none(self, func, method_name, mocker: MockerFixture):
def test_timeout_returns_none(
self,
func,
method_name,
mocker: MockerFixture,
) -> None:
mock_compile = mocker.patch("documents.regex.regex.compile")
getattr(mock_compile.return_value, method_name).side_effect = TimeoutError
assert func(r"\d+", "test") is None
@@ -97,31 +102,31 @@ class TestSafeRegexSub:
pytest.param(r"abc", "X", "ABC", "X", id="flags"),
],
)
def test_substitution(self, pattern, repl, text, expected):
def test_substitution(self, pattern, repl, text, expected) -> None:
flags = regex.IGNORECASE if pattern == r"abc" else 0
result = safe_regex_sub(pattern, repl, text, flags=flags)
assert result == expected
def test_invalid_pattern_returns_none(self):
def test_invalid_pattern_returns_none(self) -> None:
assert safe_regex_sub(r"[invalid", "x", "test") is None
def test_timeout_returns_none(self, mocker: MockerFixture):
def test_timeout_returns_none(self, mocker: MockerFixture) -> None:
mock_compile = mocker.patch("documents.regex.regex.compile")
mock_compile.return_value.sub.side_effect = TimeoutError
assert safe_regex_sub(r"\d+", "X", "test") is None
class TestSafeRegexFinditer:
def test_yields_matches(self):
def test_yields_matches(self) -> None:
pattern = regex.compile(r"\d+")
matches = list(safe_regex_finditer(pattern, "a1b22c333"))
assert [m.group() for m in matches] == ["1", "22", "333"]
def test_no_matches(self):
def test_no_matches(self) -> None:
pattern = regex.compile(r"\d+")
assert list(safe_regex_finditer(pattern, "abcdef")) == []
def test_timeout_stops_iteration(self, mocker: MockerFixture):
def test_timeout_stops_iteration(self, mocker: MockerFixture) -> None:
mock_pattern = mocker.MagicMock()
mock_pattern.finditer.side_effect = TimeoutError
mock_pattern.pattern = r"\d+"
@@ -280,6 +280,7 @@ class ShareLinkBundleBuildTaskTests(DirectoriesMixin, APITestCase):
self.document.archive_filename = f"{self.document.pk:07}.pdf"
self.document.save()
path = self.document.archive_path
assert path is not None
else:
path = self.document.source_path
path.parent.mkdir(parents=True, exist_ok=True)
@@ -304,6 +305,7 @@ class ShareLinkBundleBuildTaskTests(DirectoriesMixin, APITestCase):
self.assertGreater(bundle.size_bytes or 0, 0)
final_path = bundle.absolute_file_path
self.assertIsNotNone(final_path)
assert final_path is not None
self.assertTrue(final_path.exists())
with zipfile.ZipFile(final_path) as zipf:
names = zipf.namelist()
@@ -327,6 +329,7 @@ class ShareLinkBundleBuildTaskTests(DirectoriesMixin, APITestCase):
bundle.refresh_from_db()
final_path = bundle.absolute_file_path
self.assertIsNotNone(final_path)
assert final_path is not None
self.assertTrue(final_path.exists())
self.assertNotEqual(final_path.read_bytes(), b"old")
@@ -354,6 +357,7 @@ class ShareLinkBundleBuildTaskTests(DirectoriesMixin, APITestCase):
bundle.refresh_from_db()
self.assertEqual(bundle.status, ShareLinkBundle.Status.FAILED)
self.assertIsInstance(bundle.last_error, dict)
assert isinstance(bundle.last_error, dict)
self.assertEqual(bundle.last_error.get("message"), "zip failure")
self.assertEqual(bundle.last_error.get("exception_type"), "RuntimeError")
scratch_zips = list(Path(settings.SCRATCH_DIR).glob("*.zip"))
+28 -24
@@ -56,7 +56,11 @@ def send_publish(
@pytest.mark.django_db
class TestBeforeTaskPublishHandler:
def test_creates_task_for_consume_file(self, consume_input_doc, consume_overrides):
def test_creates_task_for_consume_file(
self,
consume_input_doc,
consume_overrides,
) -> None:
task_id = send_publish(
"documents.tasks.consume_file",
(),
@@ -70,18 +74,18 @@ class TestBeforeTaskPublishHandler:
assert task.input_data["filename"] == "invoice.pdf"
assert task.owner_id == consume_overrides.owner_id
def test_creates_task_for_train_classifier(self):
def test_creates_task_for_train_classifier(self) -> None:
task_id = send_publish("documents.tasks.train_classifier", (), {})
task = PaperlessTask.objects.get(task_id=task_id)
assert task.task_type == PaperlessTask.TaskType.TRAIN_CLASSIFIER
assert task.trigger_source == PaperlessTask.TriggerSource.MANUAL
def test_creates_task_for_sanity_check(self):
def test_creates_task_for_sanity_check(self) -> None:
task_id = send_publish("documents.tasks.sanity_check", (), {})
task = PaperlessTask.objects.get(task_id=task_id)
assert task.task_type == PaperlessTask.TaskType.SANITY_CHECK
def test_creates_task_for_process_mail_accounts(self):
def test_creates_task_for_process_mail_accounts(self) -> None:
task_id = send_publish(
"paperless_mail.tasks.process_mail_accounts",
(),
@@ -91,13 +95,13 @@ class TestBeforeTaskPublishHandler:
assert task.task_type == PaperlessTask.TaskType.MAIL_FETCH
assert task.input_data["account_ids"] == [1, 2]
def test_mail_fetch_no_account_ids_stores_empty_input(self):
def test_mail_fetch_no_account_ids_stores_empty_input(self) -> None:
"""Beat-scheduled mail checks pass no account_ids; input_data should be {} not {"account_ids": None}."""
task_id = send_publish("paperless_mail.tasks.process_mail_accounts", (), {})
task = PaperlessTask.objects.get(task_id=task_id)
assert task.input_data == {}
def test_overrides_date_serialized_as_iso_string(self, consume_input_doc):
def test_overrides_date_serialized_as_iso_string(self, consume_input_doc) -> None:
"""A datetime.date in overrides is stored as an ISO string so input_data is JSON-safe."""
overrides = DocumentMetadataOverrides(created=datetime.date(2024, 1, 15))
@@ -110,7 +114,7 @@ class TestBeforeTaskPublishHandler:
task = PaperlessTask.objects.get(task_id=task_id)
assert task.input_data["overrides"]["created"] == "2024-01-15"
def test_overrides_path_serialized_as_string(self, consume_input_doc):
def test_overrides_path_serialized_as_string(self, consume_input_doc) -> None:
"""A Path value in overrides is stored as a plain string so input_data is JSON-safe."""
overrides = DocumentMetadataOverrides()
overrides.filename = Path("/uploads/invoice.pdf") # type: ignore[assignment]
@@ -159,11 +163,11 @@ class TestBeforeTaskPublishHandler:
task = PaperlessTask.objects.get(task_id=task_id)
assert task.trigger_source == expected_trigger_source
def test_ignores_untracked_task(self):
def test_ignores_untracked_task(self) -> None:
send_publish("documents.tasks.some_untracked_task", (), {})
assert PaperlessTask.objects.count() == 0
def test_ignores_none_headers(self):
def test_ignores_none_headers(self) -> None:
before_task_publish_handler(sender=None, headers=None, body=None)
assert PaperlessTask.objects.count() == 0
@@ -185,7 +189,7 @@ class TestBeforeTaskPublishHandler:
@pytest.mark.django_db
class TestTaskPrerunHandler:
def test_marks_task_started(self):
def test_marks_task_started(self) -> None:
task = PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
task_prerun_handler(task_id=task.task_id)
@@ -215,7 +219,7 @@ class TestTaskPostrunHandler:
date_started=timezone.now(),
)
def test_records_success_with_dict_result(self):
def test_records_success_with_dict_result(self) -> None:
task = self._started_task()
task_postrun_handler(
@@ -230,7 +234,7 @@ class TestTaskPostrunHandler:
assert task.duration_seconds is not None
assert task.wait_time_seconds is not None
def test_skips_failure_state(self):
def test_skips_failure_state(self) -> None:
"""postrun skips FAILURE; task_failure_handler owns that path."""
task = self._started_task()
@@ -238,7 +242,7 @@ class TestTaskPostrunHandler:
task.refresh_from_db()
assert task.status == PaperlessTask.Status.STARTED
def test_records_success_with_consume_result(self):
def test_records_success_with_consume_result(self) -> None:
"""ConsumeFileSuccessResult dict is stored directly as result_data."""
from documents.data_models import ConsumeFileSuccessResult
@@ -251,7 +255,7 @@ class TestTaskPostrunHandler:
task.refresh_from_db()
assert task.result_data == {"document_id": 42}
def test_records_stopped_with_reason(self):
def test_records_stopped_with_reason(self) -> None:
"""ConsumeFileStoppedResult dict is stored directly as result_data."""
from documents.data_models import ConsumeFileStoppedResult
@@ -264,14 +268,14 @@ class TestTaskPostrunHandler:
task.refresh_from_db()
assert task.result_data == {"reason": "Barcode splitting complete!"}
def test_none_retval_stores_no_result_data(self):
def test_none_retval_stores_no_result_data(self) -> None:
"""None return value (non-consume tasks) leaves result_data untouched."""
task = self._started_task()
task_postrun_handler(task_id=task.task_id, retval=None, state="SUCCESS")
task.refresh_from_db()
assert task.result_data is None
def test_ignores_unknown_task_id(self):
def test_ignores_unknown_task_id(self) -> None:
task_postrun_handler(
task_id="nonexistent",
@@ -279,7 +283,7 @@ class TestTaskPostrunHandler:
state="SUCCESS",
) # must not raise
def test_records_revoked_state(self):
def test_records_revoked_state(self) -> None:
task = self._started_task()
task_postrun_handler(task_id=task.task_id, retval=None, state="REVOKED")
@@ -289,7 +293,7 @@ class TestTaskPostrunHandler:
@pytest.mark.django_db
class TestTaskFailureHandler:
def test_records_failure_with_exception(self):
def test_records_failure_with_exception(self) -> None:
task = PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.CONSUME_FILE,
@@ -308,7 +312,7 @@ class TestTaskFailureHandler:
assert task.result_data["error_message"] == "PDF parse failed"
assert task.date_done is not None
def test_records_traceback_when_provided(self):
def test_records_traceback_when_provided(self) -> None:
task = PaperlessTaskFactory(
task_type=PaperlessTask.TaskType.CONSUME_FILE,
@@ -331,7 +335,7 @@ class TestTaskFailureHandler:
assert "traceback" in task.result_data
assert len(task.result_data["traceback"]) <= 5000
def test_computes_duration_and_wait_time(self):
def test_computes_duration_and_wait_time(self) -> None:
now = timezone.now()
task = PaperlessTaskFactory(
@@ -350,14 +354,14 @@ class TestTaskFailureHandler:
assert task.duration_seconds == pytest.approx(5.0, abs=1.0)
assert task.wait_time_seconds == pytest.approx(5.0, abs=1.0)
def test_ignores_none_task_id(self):
def test_ignores_none_task_id(self) -> None:
task_failure_handler(task_id=None, exception=ValueError("x"), traceback=None)
@pytest.mark.django_db
class TestTaskRevokedHandler:
def test_marks_task_revoked(self, mocker: pytest_mock.MockerFixture):
def test_marks_task_revoked(self, mocker: pytest_mock.MockerFixture) -> None:
"""task_revoked_handler moves a queued task to REVOKED and stamps date_done."""
task = PaperlessTaskFactory(status=PaperlessTask.Status.PENDING)
request = mocker.MagicMock()
@@ -368,12 +372,12 @@ class TestTaskRevokedHandler:
assert task.status == PaperlessTask.Status.REVOKED
assert task.date_done is not None
def test_ignores_none_request(self):
def test_ignores_none_request(self) -> None:
"""task_revoked_handler must not raise when request is None."""
task_revoked_handler(request=None) # must not raise
def test_ignores_unknown_task_id(self, mocker: pytest_mock.MockerFixture):
def test_ignores_unknown_task_id(self, mocker: pytest_mock.MockerFixture) -> None:
"""task_revoked_handler must not raise for a task_id not in the database."""
request = mocker.MagicMock()
request.id = "nonexistent-id"
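The serialization tests above (`test_overrides_date_serialized_as_iso_string`, `test_overrides_path_serialized_as_string`) pin down that task `input_data` must be JSON-safe. A minimal sketch of the conversion they imply (helper name is hypothetical, not from the repository):

```python
import datetime
from pathlib import Path

def to_json_safe(value: object) -> object:
    # Dates become ISO strings and Paths plain strings, so the value can
    # be stored in a JSON field without a custom encoder.
    if isinstance(value, datetime.date):
        return value.isoformat()
    if isinstance(value, Path):
        return str(value)
    return value
```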
+4 -2
@@ -359,7 +359,7 @@ class TestAISuggestions(DirectoriesMixin, TestCase):
@patch("documents.views.get_ai_document_classification")
@override_settings(
AI_ENABLED=True,
LLM_BACKEND="openai",
LLM_BACKEND="openai-like",
)
def test_suggestions_with_invalid_ai_configuration(
self,
@@ -379,7 +379,9 @@ class TestAISuggestions(DirectoriesMixin, TestCase):
"ai": ["Invalid AI configuration."],
},
)
self.assertIsNone(get_llm_suggestion_cache(self.document.pk, backend="openai"))
self.assertIsNone(
get_llm_suggestion_cache(self.document.pk, backend="openai-like"),
)
def test_invalidate_suggestions_cache(self) -> None:
self.client.force_login(user=self.user)
+117 -68
@@ -187,6 +187,7 @@ class TestWorkflows(
)
document = Document.objects.first()
assert document is not None
self.assertEqual(document.correspondent, self.c)
self.assertEqual(document.document_type, self.dt)
self.assertEqual(list(document.tags.all()), [self.t1, self.t2, self.t3])
@@ -298,6 +299,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertEqual(document.correspondent, self.c)
self.assertEqual(document.document_type, self.dt)
self.assertEqual(list(document.tags.all()), [self.t1, self.t2, self.t3])
@@ -415,6 +417,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
# workflow 1
self.assertEqual(document.document_type, self.dt)
# workflow 2
@@ -483,6 +486,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertEqual(document.title, "Doc fnmatch title")
expected_str = f"Document matched {trigger} from {w}"
@@ -535,6 +539,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -547,7 +552,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(get_groups_with_perms(document).count(), 0)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_users_with_perms(
document,
@@ -555,7 +561,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(get_groups_with_perms(document).count(), 0)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(document.title, "simple")
expected_str = f"Document did not match {w}"
@@ -609,6 +616,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -621,12 +629,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_users_with_perms(
document,
@@ -634,12 +638,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(document.title, "simple")
expected_str = f"Document did not match {w}"
@@ -696,6 +696,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -708,12 +709,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_users_with_perms(
document,
@@ -721,12 +718,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(document.title, "simple")
expected_str = f"Document did not match {w}"
@@ -780,6 +773,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -792,12 +786,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_users_with_perms(
document,
@@ -805,12 +795,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(document.title, "simple")
expected_str = f"Document did not match {w}"
@@ -898,6 +884,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertEqual(
list(document.custom_fields.all().values_list("field", flat=True)),
[self.cf1.pk],
@@ -1968,6 +1955,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertRegex(
document.title,
r"Doc added in \w{3,}",
@@ -2064,11 +2052,11 @@ class TestWorkflows(
format="json",
)
view_users_perms: QuerySet = get_users_with_perms(
view_users_perms: QuerySet[Any] = get_users_with_perms(
doc,
only_with_perms_in=["view_document"],
)
change_users_perms: QuerySet = get_users_with_perms(
change_users_perms: QuerySet[Any] = get_users_with_perms(
doc,
only_with_perms_in=["change_document"],
)
@@ -2079,7 +2067,7 @@ class TestWorkflows(
self.assertIn(self.user3, view_users_perms)
self.assertIn(self.user3, change_users_perms)
group_perms: QuerySet = get_groups_with_perms(doc)
group_perms: QuerySet[Any] = get_groups_with_perms(doc)
# group1 should still have permissions
self.assertIn(self.group1, group_perms)
# group2 should have been added
@@ -2845,9 +2833,72 @@ class TestWorkflows(
self.assertEqual(doc.custom_fields.all().count(), 0)
self.assertFalse(self.user3.has_perm("documents.view_document", doc))
self.assertFalse(self.user3.has_perm("documents.change_document", doc))
group_perms: QuerySet = get_groups_with_perms(doc)
group_perms: QuerySet[Any] = get_groups_with_perms(doc)
self.assertNotIn(self.group1, group_perms)
def test_document_updated_workflow_assignment_persists_when_removing_trigger_tag(
self,
) -> None:
"""
GIVEN:
- A document updated workflow filtered on a tag
- The workflow assigns a new title and removes that same tag
WHEN:
- The document is updated while carrying the trigger tag
THEN:
- The new title persists and the trigger tag is removed
"""
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_UPDATED,
)
trigger.filter_has_tags.add(self.t1)
assignment = WorkflowAction.objects.create(
type=WorkflowAction.WorkflowActionType.ASSIGNMENT,
assign_title="workflow renamed",
order=0,
)
removal = WorkflowAction.objects.create(
type=WorkflowAction.WorkflowActionType.REMOVAL,
order=1,
)
removal.remove_tags.add(self.t1)
removal.save()
workflow = Workflow.objects.create(
name="Workflow rename and remove trigger tag",
order=0,
)
workflow.triggers.add(trigger)
workflow.actions.add(assignment, removal)
workflow.save()
doc = Document.objects.create(
title="sample test",
mime_type="application/pdf",
checksum="rename-remove-trigger-tag",
original_filename="sample.pdf",
)
generated = generate_unique_filename(doc)
destination = (settings.ORIGINALS_DIR / generated).resolve()
create_source_path_directory(destination)
shutil.copy(self.SAMPLE_DIR / "simple.pdf", destination)
Document.objects.filter(pk=doc.pk).update(filename=generated.as_posix())
doc.refresh_from_db()
doc.tags.set([self.t1, self.t2])
superuser = User.objects.create_superuser("superuser")
self.client.force_authenticate(user=superuser)
self.client.patch(
f"/api/documents/{doc.id}/",
{"title": "user update to trigger workflow"},
format="json",
)
doc.refresh_from_db()
self.assertEqual(doc.title, "workflow renamed")
self.assertFalse(doc.tags.filter(pk=self.t1.pk).exists())
self.assertTrue(doc.tags.filter(pk=self.t2.pk).exists())
def test_removal_action_document_updated_removeall(self) -> None:
"""
GIVEN:
@@ -2916,7 +2967,7 @@ class TestWorkflows(
self.assertEqual(doc.custom_fields.all().count(), 0)
self.assertFalse(self.user3.has_perm("documents.view_document", doc))
self.assertFalse(self.user3.has_perm("documents.change_document", doc))
group_perms: QuerySet = get_groups_with_perms(doc)
group_perms: QuerySet[Any] = get_groups_with_perms(doc)
self.assertNotIn(self.group1, group_perms)
def test_removal_action_document_consumed(self) -> None:
@@ -2994,6 +3045,7 @@ class TestWorkflows(
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
@@ -3116,6 +3168,7 @@ class TestWorkflows(
None,
)
document = Document.objects.first()
assert document is not None
self.assertIsNone(document.correspondent)
self.assertIsNone(document.document_type)
self.assertEqual(document.tags.all().count(), 0)
@@ -3129,12 +3182,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
get_users_with_perms(
document,
@@ -3142,12 +3191,8 @@ class TestWorkflows(
).count(),
0,
)
self.assertEqual(
get_groups_with_perms(
document,
).count(),
0,
)
group_perms: QuerySet[Any] = get_groups_with_perms(document)
self.assertEqual(group_perms.count(), 0)
self.assertEqual(
document.custom_fields.all()
.values_list(
@@ -3216,7 +3261,10 @@ class TestWorkflows(
PAPERLESS_URL="http://localhost:8000",
)
@mock.patch("django.core.mail.message.EmailMessage.send")
def test_workflow_assignment_then_email_includes_attachment(self, mock_email_send):
def test_workflow_assignment_then_email_includes_attachment(
self,
mock_email_send,
) -> None:
"""
GIVEN:
- Workflow with assignment and email actions
@@ -3765,7 +3813,7 @@ class TestWorkflows(
def test_workflow_webhook_action_does_not_overwrite_concurrent_tags(
self,
mock_execute_webhook_action,
):
) -> None:
"""
GIVEN:
- A document updated workflow with only a webhook action
@@ -3819,7 +3867,7 @@ class TestWorkflows(
def test_workflow_tag_actions_do_not_overwrite_concurrent_tags(
self,
mock_execute_webhook_action,
):
) -> None:
"""
GIVEN:
- A document updated workflow that clears tags and assigns an inbox tag
@@ -4097,7 +4145,7 @@ class TestWorkflows(
def test_password_removal_action_attempts_multiple_passwords(
self,
mock_remove_password,
):
) -> None:
"""
GIVEN:
- Workflow password removal action
@@ -4151,7 +4199,7 @@ class TestWorkflows(
def test_password_removal_action_fails_without_correct_password(
self,
mock_remove_password,
):
) -> None:
"""
GIVEN:
- Workflow password removal action
@@ -4184,7 +4232,7 @@ class TestWorkflows(
def test_password_removal_action_skips_without_passwords(
self,
mock_remove_password,
):
) -> None:
"""
GIVEN:
- Workflow password removal action with no passwords
@@ -4216,7 +4264,7 @@ class TestWorkflows(
def test_password_removal_consumable_document_deferred(
self,
mock_remove_password,
):
) -> None:
"""
GIVEN:
- Workflow password removal action
@@ -4283,7 +4331,7 @@ class TestWorkflows(
)
assert mock_remove_password.call_count == 2
def test_workflow_trash_action_soft_delete(self):
def test_workflow_trash_action_soft_delete(self) -> None:
"""
GIVEN:
- Document updated workflow with delete action
@@ -4326,7 +4374,7 @@ class TestWorkflows(
PAPERLESS_URL="http://localhost:8000",
)
@mock.patch("django.core.mail.message.EmailMessage.send")
def test_workflow_trash_with_email_action(self, mock_email_send):
def test_workflow_trash_with_email_action(self, mock_email_send) -> None:
"""
GIVEN:
- Workflow with email action, then move to trash action
@@ -4381,7 +4429,7 @@ class TestWorkflows(
PAPERLESS_URL="http://localhost:8000",
)
@mock.patch("documents.workflows.webhooks.send_webhook.apply_async")
def test_workflow_trash_with_webhook_action(self, mock_webhook_delay):
def test_workflow_trash_with_webhook_action(self, mock_webhook_delay) -> None:
"""
GIVEN:
- Workflow with webhook action (include_document=True), then move to trash action
@@ -4514,7 +4562,7 @@ class TestWorkflows(
self.assertEqual(Document.objects.count(), 0)
self.assertEqual(Document.deleted_objects.count(), 1)
def test_multiple_workflows_trash_then_assignment(self):
def test_multiple_workflows_trash_then_assignment(self) -> None:
"""
GIVEN:
- Workflow 1 (order=0) with move to trash action
@@ -4583,7 +4631,7 @@ class TestWorkflows(
log_output,
)
def test_workflow_delete_action_during_consumption(self):
def test_workflow_delete_action_during_consumption(self) -> None:
"""
GIVEN:
- Workflow with consumption trigger and delete action
@@ -4642,7 +4690,7 @@ class TestWorkflows(
# No document should be created
self.assertEqual(Document.objects.count(), 0)
def test_workflow_delete_action_during_consumption_with_assignment(self):
def test_workflow_delete_action_during_consumption_with_assignment(self) -> None:
"""
GIVEN:
- Workflow with consumption trigger, assignment action, then delete action
@@ -5156,4 +5204,5 @@ class TestDateWorkflowLocalization(
None,
)
document = Document.objects.first()
assert document is not None
assert document.title == expected_title
+7 -7
@@ -184,22 +184,22 @@ class FileSystemAssertsMixin:
Utilities for checking various state information of the file system
"""
def assertIsFile(self, path: PathLike | str) -> None:
def assertIsFile(self, path: PathLike[str] | str) -> None:
self.assertTrue(Path(path).resolve().is_file(), f"File does not exist: {path}")
def assertIsNotFile(self, path: PathLike | str) -> None:
def assertIsNotFile(self, path: PathLike[str] | str) -> None:
self.assertFalse(Path(path).resolve().is_file(), f"File does exist: {path}")
def assertIsDir(self, path: PathLike | str) -> None:
def assertIsDir(self, path: PathLike[str] | str) -> None:
self.assertTrue(Path(path).resolve().is_dir(), f"Dir does not exist: {path}")
def assertIsNotDir(self, path: PathLike | str) -> None:
def assertIsNotDir(self, path: PathLike[str] | str) -> None:
self.assertFalse(Path(path).resolve().is_dir(), f"Dir does exist: {path}")
def assertFilesEqual(
self,
path1: PathLike | str,
path2: PathLike | str,
path1: PathLike[str] | str,
path2: PathLike[str] | str,
) -> None:
path1 = Path(path1)
path2 = Path(path2)
@@ -210,7 +210,7 @@ class FileSystemAssertsMixin:
self.assertEqual(hash1, hash2, "File SHA256 mismatch")
def assertFileCountInDir(self, path: PathLike | str, count: int) -> None:
def assertFileCountInDir(self, path: PathLike[str] | str, count: int) -> None:
path = Path(path).resolve()
self.assertTrue(path.is_dir(), f"Path {path} is not a directory")
files = [x for x in path.iterdir() if x.is_file()]
+1 -1
@@ -2009,7 +2009,7 @@ class DocumentViewSet(
)
class ChatStreamingSerializer(serializers.Serializer):
class ChatStreamingSerializer(serializers.Serializer[dict[str, Any]]):
q = serializers.CharField(required=True)
document_id = serializers.IntegerField(required=False, allow_null=True)
+2 -2
@@ -202,7 +202,7 @@ def apply_removal_to_document(
"""
if action.remove_all_tags:
document.tags.clear()
Document.objects.get(pk=document.pk).tags.clear()
else:
tag_ids_to_remove: set[int] = set()
for tag in action.remove_tags.all():
@@ -210,7 +210,7 @@ def apply_removal_to_document(
tag_ids_to_remove.update(int(pk) for pk in tag.get_descendants_pks())
if tag_ids_to_remove:
document.tags.remove(*tag_ids_to_remove)
Document.objects.get(pk=document.pk).tags.remove(*tag_ids_to_remove)
if action.remove_all_correspondents or (
document.correspondent
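The removal-action change above applies tag changes through a fresh `Document.objects.get(pk=...)` query in addition to the in-memory instance, so a stale snapshot cannot clobber concurrent updates. A toy model of the idea, with a plain dict standing in for the database row (all names hypothetical):

```python
# Toy "database" row holding a document's tag ids.
db_row = {"tags": {1, 2, 3}}

def remove_tags(ids_to_remove: set[int]) -> None:
    # Re-read the current tag set from the store before mutating, rather
    # than trusting an in-memory copy that may predate concurrent writes.
    current = set(db_row["tags"])
    db_row["tags"] = current - ids_to_remove

remove_tags({2})
```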
+2 -2
@@ -2,7 +2,7 @@ msgid ""
msgstr ""
"Project-Id-Version: paperless-ngx\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2026-04-27 03:18+0000\n"
"POT-Creation-Date: 2026-04-28 17:07+0000\n"
"PO-Revision-Date: 2022-02-17 04:17\n"
"Last-Translator: \n"
"Language-Team: English\n"
@@ -1771,7 +1771,7 @@ msgid "CMYK"
msgstr ""
#: paperless/models.py:78 paperless/models.py:87
msgid "OpenAI"
msgid "OpenAI-compatible"
msgstr ""
#: paperless/models.py:79
+2 -4
@@ -1,6 +1,4 @@
import grp
import os
import pwd
import shutil
import stat
import subprocess
@@ -38,8 +36,8 @@ def path_check(var: str, directory: Path) -> list[Error]:
except PermissionError:
dir_stat: os.stat_result = Path(directory).stat()
dir_mode: str = stat.filemode(dir_stat.st_mode)
dir_owner: str = pwd.getpwuid(dir_stat.st_uid).pw_name
dir_group: str = grp.getgrgid(dir_stat.st_gid).gr_name
dir_owner: str = ""
dir_group: str = ""
messages.append(
Error(
writeable_message.format(var),
+8 -5
@@ -38,7 +38,9 @@ class OutputTypeConfig(BaseConfig):
def __post_init__(self) -> None:
app_config = self._get_config_instance()
self.output_type = app_config.output_type or settings.OCR_OUTPUT_TYPE
self.output_type = app_config.output_type or OutputTypeChoices(
settings.OCR_OUTPUT_TYPE,
)
@dataclasses.dataclass
@@ -70,12 +72,13 @@ class OcrConfig(OutputTypeConfig):
self.pages = app_config.pages or settings.OCR_PAGES
self.language = app_config.language or settings.OCR_LANGUAGE
self.mode = app_config.mode or settings.OCR_MODE
self.mode = app_config.mode or ModeChoices(settings.OCR_MODE)
self.archive_file_generation = (
app_config.archive_file_generation or settings.ARCHIVE_FILE_GENERATION
app_config.archive_file_generation
or ArchiveFileGenerationChoices(settings.ARCHIVE_FILE_GENERATION)
)
self.image_dpi = app_config.image_dpi or settings.OCR_IMAGE_DPI
self.clean = app_config.unpaper_clean or settings.OCR_CLEAN
self.clean = app_config.unpaper_clean or CleanChoices(settings.OCR_CLEAN)
self.deskew = (
app_config.deskew if app_config.deskew is not None else settings.OCR_DESKEW
)
@@ -92,7 +95,7 @@ class OcrConfig(OutputTypeConfig):
)
self.color_conversion_strategy = (
app_config.color_conversion_strategy
or settings.OCR_COLOR_CONVERSION_STRATEGY
or ColorConvertChoices(settings.OCR_COLOR_CONVERSION_STRATEGY)
)
user_args = None
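The `OcrConfig` change above wraps raw settings strings in their choice enums (e.g. `ModeChoices(settings.OCR_MODE)`) so the config attributes carry precise enum types instead of `str`. Django's `TextChoices` behaves like a str-valued enum; a stdlib sketch of the same coercion (member names are illustrative):

```python
from enum import Enum

class ModeChoices(str, Enum):
    SKIP = "skip"
    REDO = "redo"
    FORCE = "force"
    AUTO = "auto"

# Passing the raw string through the enum constructor both validates the
# value and gives the attribute the precise enum type for checkers.
raw_mode = "auto"
mode = ModeChoices(raw_mode)
```

An unknown string such as `ModeChoices("bogus")` raises `ValueError`, so bad configuration fails loudly at startup instead of propagating as a stray string.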
@@ -34,7 +34,7 @@ class Migration(migrations.Migration):
name="llm_backend",
field=models.CharField(
blank=True,
choices=[("openai", "OpenAI"), ("ollama", "Ollama")],
choices=[("openai-like", "OpenAI-compatible"), ("ollama", "Ollama")],
max_length=128,
null=True,
verbose_name="Sets the LLM backend",
@@ -45,7 +45,10 @@ class Migration(migrations.Migration):
name="llm_embedding_backend",
field=models.CharField(
blank=True,
choices=[("openai", "OpenAI"), ("huggingface", "Huggingface")],
choices=[
("openai-like", "OpenAI-compatible"),
("huggingface", "Huggingface"),
],
max_length=128,
null=True,
verbose_name="Sets the LLM embedding backend",
+2 -2
@@ -75,7 +75,7 @@ class ColorConvertChoices(models.TextChoices):
class LLMEmbeddingBackend(models.TextChoices):
OPENAI = ("openai", _("OpenAI"))
OPENAI_LIKE = ("openai-like", _("OpenAI-compatible"))
HUGGINGFACE = ("huggingface", _("Huggingface"))
@@ -84,7 +84,7 @@ class LLMBackend(models.TextChoices):
Matches to --llm-backend
"""
OPENAI = ("openai", _("OpenAI"))
OPENAI_LIKE = ("openai-like", _("OpenAI-compatible"))
OLLAMA = ("ollama", _("Ollama"))
+2 -2
@@ -1174,9 +1174,9 @@ REMOTE_OCR_ENDPOINT = os.getenv("PAPERLESS_REMOTE_OCR_ENDPOINT")
AI_ENABLED = get_bool_from_env("PAPERLESS_AI_ENABLED", "NO")
LLM_EMBEDDING_BACKEND = os.getenv(
"PAPERLESS_AI_LLM_EMBEDDING_BACKEND",
) # "huggingface" or "openai"
) # "huggingface" or "openai-like"
LLM_EMBEDDING_MODEL = os.getenv("PAPERLESS_AI_LLM_EMBEDDING_MODEL")
LLM_BACKEND = os.getenv("PAPERLESS_AI_LLM_BACKEND") # "ollama" or "openai"
LLM_BACKEND = os.getenv("PAPERLESS_AI_LLM_BACKEND") # "ollama" or "openai-like"
LLM_MODEL = os.getenv("PAPERLESS_AI_LLM_MODEL")
LLM_API_KEY = os.getenv("PAPERLESS_AI_LLM_API_KEY")
LLM_ENDPOINT = os.getenv("PAPERLESS_AI_LLM_ENDPOINT")
@@ -766,7 +766,11 @@ class TestParser:
content=b"Pretend merged PDF content",
)
def test_layout_option(layout_option, expected_calls, expected_pdf_names):
def test_layout_option(
layout_option,
expected_calls,
expected_pdf_names,
) -> None:
mock_mailrule_get.return_value = mock.Mock(pdf_layout=layout_option)
mail_parser.configure(ParserContext(mailrule_id=1))
mail_parser.parse(
@@ -16,6 +16,8 @@ from typing import TYPE_CHECKING
import pytest
from paperless.models import ModeChoices
if TYPE_CHECKING:
from pytest_mock import MockerFixture
@@ -72,7 +74,7 @@ class TestAutoModeWithText:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -106,7 +108,7 @@ class TestAutoModeWithText:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -158,7 +160,7 @@ class TestAutoModeNoText:
mocker.patch.object(tesseract_parser, "extract_text", side_effect=_extract_side)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
multi_page_images_pdf_file,
"application/pdf",
@@ -200,7 +202,7 @@ class TestAutoModeNoText:
mocker.patch.object(tesseract_parser, "extract_text", side_effect=_extract_side)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
multi_page_images_pdf_file,
"application/pdf",
@@ -243,7 +245,7 @@ class TestOffModePdf:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = "off"
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -283,7 +285,7 @@ class TestOffModePdf:
)
mocker.patch("ocrmypdf.pdfa.generate_pdfa_ps")
tesseract_parser.settings.mode = "off"
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -323,7 +325,7 @@ class TestOffModeImage:
"""
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = "off"
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.parse(simple_png_file, "image/png", produce_archive=False)
mock_ocr.assert_not_called()
@@ -355,7 +357,7 @@ class TestOffModeImage:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = "off"
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.parse(simple_png_file, "image/png", produce_archive=True)
mock_convert.assert_called_once_with(simple_png_file)
@@ -429,7 +431,7 @@ class TestProduceArchiveFalse:
)
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
simple_digital_pdf_file,
"application/pdf",
@@ -44,6 +44,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_PAGES=10):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.pages = 5
instance.save()
@@ -62,6 +63,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_LANGUAGE="eng+deu"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.language = "fra+ita"
instance.save()
@@ -80,6 +82,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_OUTPUT_TYPE="pdfa-3"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.output_type = OutputTypeChoices.PDF_A
instance.save()
@@ -100,6 +103,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
# AUTO mode with skip_text=True explicitly passed: skip_text is set
with override_settings(OCR_MODE="redo"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.mode = ModeChoices.AUTO
instance.save()
@@ -118,6 +122,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
# AUTO mode alone (no skip_text): no extra OCR flag is set
with override_settings(OCR_MODE="redo"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.mode = ModeChoices.AUTO
instance.save()
@@ -138,6 +143,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_CLEAN="clean-final"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.unpaper_clean = CleanChoices.CLEAN
instance.save()
@@ -147,6 +153,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
with override_settings(OCR_CLEAN="clean-final"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.unpaper_clean = CleanChoices.FINAL
instance.save()
@@ -166,6 +173,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_DESKEW=False):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.deskew = True
instance.save()
@@ -185,6 +193,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
with override_settings(OCR_ROTATE_PAGES=False, OCR_ROTATE_PAGES_THRESHOLD=30.0):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.rotate_pages = True
instance.rotate_pages_threshold = 15.0
instance.save()
@@ -205,6 +214,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_MAX_IMAGE_PIXELS=2_000_000.0):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.max_image_pixels = 1_000_000.0
instance.save()
@@ -223,6 +233,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
"""
with override_settings(OCR_COLOR_CONVERSION_STRATEGY="LeaveColorUnchanged"):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.color_conversion_strategy = ColorConvertChoices.INDEPENDENT
instance.save()
@@ -246,6 +257,7 @@ class TestParserSettingsFromDb(DirectoriesMixin, FileSystemAssertsMixin, TestCas
OCR_USER_ARGS=json.dumps({"continue_on_soft_render_error": True}),
):
instance = ApplicationConfiguration.objects.all().first()
assert instance is not None
instance.user_args = {"unpaper_args": "--pre-rotate 90"}
instance.save()
@@ -18,6 +18,7 @@ from ocrmypdf import SubprocessOutputError
from documents.parsers import ParseError
from documents.parsers import run_convert
from paperless.models import ModeChoices
from paperless.parsers import ParserProtocol
from paperless.parsers.tesseract import RasterisedDocumentParser
from paperless.parsers.tesseract import post_process_text
@@ -387,8 +388,10 @@ class TestParsePdf:
)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -413,7 +416,7 @@ class TestParsePdf:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = "redo"
tesseract_parser.settings.mode = ModeChoices.REDO
tesseract_parser.parse(
tesseract_samples_dir / "with-form.pdf",
"application/pdf",
@@ -430,7 +433,7 @@ class TestParsePdf:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = "force"
tesseract_parser.settings.mode = ModeChoices.FORCE
tesseract_parser.parse(
tesseract_samples_dir / "with-form.pdf",
"application/pdf",
@@ -445,7 +448,7 @@ class TestParsePdf:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(tesseract_samples_dir / "signed.pdf", "application/pdf")
assert tesseract_parser.archive_path is None
assert_ordered_substrings(
@@ -461,7 +464,7 @@ class TestParsePdf:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
tesseract_samples_dir / "encrypted.pdf",
"application/pdf",
@@ -530,7 +533,9 @@ class TestParseImages:
tesseract_parser.parse(tesseract_samples_dir / "simple-no-dpi.png", "image/png")
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
assert "this is a test document." in tesseract_parser.get_text().lower()
_text = tesseract_parser.get_text()
assert _text is not None
assert "this is a test document." in _text.lower()
def test_no_dpi_no_fallback_raises(
self,
@@ -563,8 +568,10 @@ class TestParseMultiPage:
)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -589,8 +596,10 @@ class TestParseMultiPage:
"application/pdf",
)
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -599,14 +608,16 @@ class TestParseMultiPage:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-images.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -626,13 +637,15 @@ class TestParseMultiPage:
- Pages 1 and 2 extracted; page 3 absent
"""
tesseract_parser.settings.pages = 2
tesseract_parser.settings.mode = "redo"
tesseract_parser.settings.mode = ModeChoices.REDO
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-images.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
text = tesseract_parser.get_text().lower()
text = tesseract_parser.get_text()
assert text is not None
text = text.lower()
assert_ordered_substrings(text, ["page 1", "page 2"])
assert "page 3" not in text
@@ -652,13 +665,15 @@ class TestParseMultiPage:
- Only page 1 extracted
"""
tesseract_parser.settings.pages = 1
tesseract_parser.settings.mode = "force"
tesseract_parser.settings.mode = ModeChoices.FORCE
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-images.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
text = tesseract_parser.get_text().lower()
text = tesseract_parser.get_text()
assert text is not None
text = text.lower()
assert "page 1" in text
assert "page 2" not in text
assert "page 3" not in text
@@ -681,8 +696,10 @@ class TestParseMultiPage:
"image/tiff",
)
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -704,8 +721,10 @@ class TestParseMultiPage:
shutil.copy(tesseract_samples_dir / "multi-page-images-alpha.tiff", dest)
tesseract_parser.parse(dest, "image/tiff")
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -727,8 +746,10 @@ class TestParseMultiPage:
shutil.copy(tesseract_samples_dir / "multi-page-images-alpha-rgb.tiff", dest)
tesseract_parser.parse(dest, "image/tiff")
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -754,15 +775,17 @@ class TestSkipArchive:
- Text extracted from original; no archive created (text exists +
produce_archive=False skips OCRmyPDF entirely)
"""
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-digital.pdf",
"application/pdf",
produce_archive=False,
)
assert tesseract_parser.archive_path is None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -780,14 +803,16 @@ class TestSkipArchive:
THEN:
- Text extracted; archive created (OCR needed, no existing text)
"""
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-images.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3"],
)
@@ -838,13 +863,15 @@ class TestSkipArchive:
- archive_path is set if and only if produce_archive=True
- Text is always extracted
"""
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
tesseract_samples_dir / filename,
"application/pdf",
produce_archive=produce_archive,
)
text = tesseract_parser.get_text().lower()
text = tesseract_parser.get_text()
assert text is not None
text = text.lower()
assert_ordered_substrings(text, ["page 1", "page 2", "page 3"])
if expect_archive:
assert tesseract_parser.archive_path is not None
@@ -868,7 +895,7 @@ class TestSkipArchive:
- Text is extracted from the original via pdftotext
- No archive is produced
"""
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
mock_ocr = mocker.patch("ocrmypdf.ocr")
tesseract_parser.parse(
tesseract_samples_dir / "simple-digital.pdf",
@@ -895,7 +922,7 @@ class TestSkipArchive:
- Archive is produced
- Text is preserved from the original
"""
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
tesseract_samples_dir / "simple-digital.pdf",
"application/pdf",
@@ -925,15 +952,17 @@ class TestParseMixed:
THEN:
- All pages extracted; archive created; sidecar notes skipped pages
"""
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-mixed.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 1", "page 2", "page 3", "page 4", "page 5", "page 6"],
)
sidecar = (tesseract_parser.tempdir / "sidecar.txt").read_text()
@@ -953,15 +982,17 @@ class TestParseMixed:
THEN:
- Both text layer and image text extracted; archive created
"""
tesseract_parser.settings.mode = "redo"
tesseract_parser.settings.mode = ModeChoices.REDO
tesseract_parser.parse(
tesseract_samples_dir / "single-page-mixed.pdf",
"application/pdf",
)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
[
"this is some normal text, present on page 1 of the document.",
"this is some text, but in an image, also on page 1.",
@@ -989,15 +1020,17 @@ class TestParseMixed:
THEN:
- No archive created (produce_archive=False); text from text layer present
"""
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.parse(
tesseract_samples_dir / "multi-page-mixed.pdf",
"application/pdf",
produce_archive=False,
)
assert tesseract_parser.archive_path is None
_text = tesseract_parser.get_text()
assert _text is not None
assert_ordered_substrings(
tesseract_parser.get_text().lower(),
_text.lower(),
["page 4", "page 5", "page 6"],
)
@@ -1013,7 +1046,7 @@ class TestParseRotate:
tesseract_parser: RasterisedDocumentParser,
tesseract_samples_dir: Path,
) -> None:
tesseract_parser.settings.mode = "auto"
tesseract_parser.settings.mode = ModeChoices.AUTO
tesseract_parser.settings.rotate = True
tesseract_parser.parse(tesseract_samples_dir / "rotated.pdf", "application/pdf")
assert_ordered_substrings(
@@ -1052,14 +1085,16 @@ class TestParseRtl:
force-ocr with English Tesseract (producing garbage). Using mode="off" forces
skip_text=True so the Arabic text layer is preserved through PDF/A conversion.
"""
tesseract_parser.settings.mode = "off"
tesseract_parser.settings.mode = ModeChoices.OFF
tesseract_parser.parse(
tesseract_samples_dir / "rtl-test.pdf",
"application/pdf",
)
_text = tesseract_parser.get_text()
assert _text is not None
normalised = "".join(
ch
for ch in unicodedata.normalize("NFKC", tesseract_parser.get_text())
for ch in unicodedata.normalize("NFKC", _text)
if unicodedata.category(ch) != "Cf" and not ch.isspace()
)
assert "ةرازو" in normalised
@@ -1196,7 +1231,9 @@ class TestParserFileTypes:
tesseract_parser.parse(tesseract_samples_dir / filename, mime_type)
assert tesseract_parser.archive_path is not None
assert tesseract_parser.archive_path.is_file()
assert "this is a test document" in tesseract_parser.get_text().lower()
_text = tesseract_parser.get_text()
assert _text is not None
assert "this is a test document" in _text.lower()
def test_heic(
self,
@@ -1205,7 +1242,9 @@ class TestParserFileTypes:
) -> None:
tesseract_parser.parse(tesseract_samples_dir / "simple.heic", "image/heic")
assert tesseract_parser.archive_path is not None
assert "pizza" in tesseract_parser.get_text().lower()
_text = tesseract_parser.get_text()
assert _text is not None
assert "pizza" in _text.lower()
def test_gif_with_explicit_dpi(
self,
@@ -1215,7 +1254,9 @@ class TestParserFileTypes:
tesseract_parser.settings.image_dpi = 200
tesseract_parser.parse(tesseract_samples_dir / "simple.gif", "image/gif")
assert tesseract_parser.archive_path is not None
assert "this is a test document" in tesseract_parser.get_text().lower()
_text = tesseract_parser.get_text()
assert _text is not None
assert "this is a test document" in _text.lower()
def test_webp_with_explicit_dpi(
self,
@@ -1225,9 +1266,11 @@ class TestParserFileTypes:
tesseract_parser.settings.image_dpi = 72
tesseract_parser.parse(tesseract_samples_dir / "document.webp", "image/webp")
assert tesseract_parser.archive_path is not None
_text = tesseract_parser.get_text()
assert _text is not None
assert re.search(
r"this is a ?webp document, created 11/14/2022\.",
tesseract_parser.get_text().lower(),
_text.lower(),
)
@@ -26,7 +26,7 @@ class TestStringToBool:
pytest.param(" True ", id="whitespace_true"),
],
)
def test_true_conversion(self, true_value: str):
def test_true_conversion(self, true_value: str) -> None:
"""Test that various 'true' strings correctly evaluate to True."""
assert str_to_bool(true_value) is True
@@ -41,18 +41,18 @@ class TestStringToBool:
pytest.param(" False ", id="whitespace_false"),
],
)
def test_false_conversion(self, false_value: str):
def test_false_conversion(self, false_value: str) -> None:
"""Test that various 'false' strings correctly evaluate to False."""
assert str_to_bool(false_value) is False
def test_invalid_conversion(self):
def test_invalid_conversion(self) -> None:
"""Test that an invalid string raises a ValueError."""
with pytest.raises(ValueError, match="Cannot convert 'maybe' to a boolean\\."):
str_to_bool("maybe")
class TestParseDictFromString:
def test_empty_and_none_input(self):
def test_empty_and_none_input(self) -> None:
"""Test behavior with None or empty string input."""
assert parse_dict_from_str(None) == {}
assert parse_dict_from_str("") == {}
@@ -62,13 +62,13 @@ class TestParseDictFromString:
# Ensure it returns a copy, not the original object
assert res is not defaults
def test_basic_parsing(self):
def test_basic_parsing(self) -> None:
"""Test simple key-value parsing without defaults or types."""
env_str = "key1=val1, key2=val2"
expected = {"key1": "val1", "key2": "val2"}
assert parse_dict_from_str(env_str) == expected
def test_with_defaults(self):
def test_with_defaults(self) -> None:
"""Test that environment values override defaults correctly."""
defaults = {"host": "localhost", "port": 8000, "user": "default"}
env_str = "port=9090, host=db.example.com"
@@ -76,7 +76,7 @@ class TestParseDictFromString:
result = parse_dict_from_str(env_str, defaults=defaults)
assert result == expected
def test_type_casting(self):
def test_type_casting(self) -> None:
"""Test successful casting of values to specified types."""
env_str = "port=9090, debug=true, timeout=12.5, user=admin"
type_map = {"port": int, "debug": bool, "timeout": float}
@@ -84,7 +84,7 @@ class TestParseDictFromString:
result = parse_dict_from_str(env_str, type_map=type_map)
assert result == expected
def test_type_casting_with_defaults(self):
def test_type_casting_with_defaults(self) -> None:
"""Test casting when values come from both defaults and env string."""
defaults = {"port": 8000, "debug": False, "retries": 3}
env_str = "port=9090, debug=true"
@@ -97,7 +97,7 @@ class TestParseDictFromString:
assert result == expected
assert isinstance(result["retries"], int)
def test_path_casting(self, tmp_path: Path):
def test_path_casting(self, tmp_path: Path) -> None:
"""Test successful casting of a string to a resolved pathlib.Path object."""
# Create a dummy file to resolve against
test_file = tmp_path / "test_file.txt"
@@ -111,14 +111,14 @@ class TestParseDictFromString:
assert isinstance(result["config_path"], Path)
assert result["config_path"] == test_file.resolve()
def test_custom_separator(self):
def test_custom_separator(self) -> None:
"""Test parsing with a custom separator like a semicolon."""
env_str = "host=db; port=5432; user=test"
expected = {"host": "db", "port": "5432", "user": "test"}
result = parse_dict_from_str(env_str, separator=";")
assert result == expected
def test_edge_cases_in_string(self):
def test_edge_cases_in_string(self) -> None:
"""Test malformed strings to ensure robustness."""
# Malformed pair 'debug' is skipped, extra comma is ignored
env_str = "key=val,, debug, foo=bar"
@@ -130,7 +130,7 @@ class TestParseDictFromString:
expected = {"url": "postgres://user:pass@host:5432/db"}
assert parse_dict_from_str(env_str) == expected
def test_casting_error_handling(self):
def test_casting_error_handling(self) -> None:
"""Test that a ValueError is raised for invalid casting."""
env_str = "port=not-a-number"
type_map = {"port": int}
@@ -142,14 +142,14 @@ class TestParseDictFromString:
assert "value 'not-a-number'" in str(excinfo.value)
assert "to type 'int'" in str(excinfo.value)
def test_bool_casting_error(self):
def test_bool_casting_error(self) -> None:
"""Test that an invalid boolean string raises a ValueError."""
env_str = "debug=maybe"
type_map = {"debug": bool}
with pytest.raises(ValueError, match="Error casting key 'debug'"):
parse_dict_from_str(env_str, type_map=type_map)
def test_nested_key_parsing_basic(self):
def test_nested_key_parsing_basic(self) -> None:
"""Basic nested key parsing using dot-notation."""
env_str = "database.host=db.example.com, database.port=5432, logging.level=INFO"
result = parse_dict_from_str(env_str)
@@ -158,7 +158,7 @@ class TestParseDictFromString:
"logging": {"level": "INFO"},
}
def test_nested_overrides_defaults_and_deepcopy(self):
def test_nested_overrides_defaults_and_deepcopy(self) -> None:
"""Nested env keys override defaults and defaults are deep-copied."""
defaults = {"database": {"host": "127.0.0.1", "port": 3306, "user": "default"}}
env_str = "database.host=db.example.com, debug=true"
@@ -176,7 +176,7 @@ class TestParseDictFromString:
assert result is not defaults
assert result["database"] is not defaults["database"]
def test_nested_type_casting(self):
def test_nested_type_casting(self) -> None:
"""Type casting for nested keys (dot-notation) should work."""
env_str = "database.host=db.example.com, database.port=5433, debug=false"
type_map = {"database.port": int, "debug": bool}
@@ -188,7 +188,7 @@ class TestParseDictFromString:
assert result["debug"] is False
assert isinstance(result["debug"], bool)
def test_nested_casting_error_message(self):
def test_nested_casting_error_message(self) -> None:
"""Error messages should include the full dotted key name on failure."""
env_str = "database.port=not-a-number"
type_map = {"database.port": int}
@@ -200,7 +200,7 @@ class TestParseDictFromString:
assert "value 'not-a-number'" in msg
assert "to type 'int'" in msg
def test_type_map_does_not_recast_non_string_defaults(self):
def test_type_map_does_not_recast_non_string_defaults(self) -> None:
"""If a default already provides a non-string value, the caster should skip it."""
defaults = {"database": {"port": 3306}}
type_map = {"database.port": int}
@@ -210,22 +210,22 @@ class TestParseDictFromString:
class TestGetBoolFromEnv:
def test_existing_env_var(self, mocker):
def test_existing_env_var(self, mocker) -> None:
"""Test that an existing environment variable is read and converted."""
mocker.patch.dict(os.environ, {"TEST_VAR": "true"})
assert get_bool_from_env("TEST_VAR") is True
def test_missing_env_var_uses_default_no(self, mocker):
def test_missing_env_var_uses_default_no(self, mocker) -> None:
"""Test that a missing environment variable uses default 'NO' and returns False."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_bool_from_env("MISSING_VAR") is False
def test_missing_env_var_with_explicit_default(self, mocker):
def test_missing_env_var_with_explicit_default(self, mocker) -> None:
"""Test that a missing environment variable uses the provided default."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_bool_from_env("MISSING_VAR", default="yes") is True
def test_invalid_value_raises_error(self, mocker):
def test_invalid_value_raises_error(self, mocker) -> None:
"""Test that an invalid value raises ValueError (delegates to str_to_bool)."""
mocker.patch.dict(os.environ, {"INVALID_VAR": "maybe"})
with pytest.raises(ValueError):
@@ -243,7 +243,7 @@ class TestGetIntFromEnv:
pytest.param("-999", -999, id="large_negative"),
],
)
def test_existing_env_var_valid_ints(self, mocker, env_value, expected):
def test_existing_env_var_valid_ints(self, mocker, env_value, expected) -> None:
"""Test that existing environment variables with valid integers return correct values."""
mocker.patch.dict(os.environ, {"INT_VAR": env_value})
assert get_int_from_env("INT_VAR") == expected
@@ -257,12 +257,12 @@ class TestGetIntFromEnv:
pytest.param(None, None, id="none_default"),
],
)
def test_missing_env_var_with_defaults(self, mocker, default, expected):
def test_missing_env_var_with_defaults(self, mocker, default, expected) -> None:
"""Test that missing environment variables return provided defaults."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_int_from_env("MISSING_VAR", default=default) == expected
def test_missing_env_var_no_default(self, mocker):
def test_missing_env_var_no_default(self, mocker) -> None:
"""Test that missing environment variable with no default returns None."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_int_from_env("MISSING_VAR") is None
@@ -279,7 +279,7 @@ class TestGetIntFromEnv:
pytest.param("1.0", id="decimal"),
],
)
def test_invalid_int_values_raise_error(self, mocker, invalid_value):
def test_invalid_int_values_raise_error(self, mocker, invalid_value) -> None:
"""Test that invalid integer values raise ValueError."""
mocker.patch.dict(os.environ, {"INVALID_INT": invalid_value})
with pytest.raises(ValueError):
@@ -300,7 +300,7 @@ class TestGetFloatFromEnv:
pytest.param("-1.23e4", -12300.0, id="sci_large"),
],
)
def test_existing_env_var_valid_floats(self, mocker, env_value, expected):
def test_existing_env_var_valid_floats(self, mocker, env_value, expected) -> None:
"""Test that existing environment variables with valid floats return correct values."""
mocker.patch.dict(os.environ, {"FLOAT_VAR": env_value})
assert get_float_from_env("FLOAT_VAR") == expected
@@ -314,12 +314,12 @@ class TestGetFloatFromEnv:
pytest.param(None, None, id="none_default"),
],
)
def test_missing_env_var_with_defaults(self, mocker, default, expected):
def test_missing_env_var_with_defaults(self, mocker, default, expected) -> None:
"""Test that missing environment variables return provided defaults."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_float_from_env("MISSING_VAR", default=default) == expected
def test_missing_env_var_no_default(self, mocker):
def test_missing_env_var_no_default(self, mocker) -> None:
"""Test that missing environment variable with no default returns None."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_float_from_env("MISSING_VAR") is None
@@ -336,7 +336,7 @@ class TestGetFloatFromEnv:
pytest.param("1.2.3", id="triple_decimal"),
],
)
def test_invalid_float_values_raise_error(self, mocker, invalid_value):
def test_invalid_float_values_raise_error(self, mocker, invalid_value) -> None:
"""Test that invalid float values raise ValueError."""
mocker.patch.dict(os.environ, {"INVALID_FLOAT": invalid_value})
with pytest.raises(ValueError):
@@ -355,19 +355,19 @@ class TestGetPathFromEnv:
pytest.param("/", id="root"),
],
)
def test_existing_env_var_paths(self, mocker, env_value):
def test_existing_env_var_paths(self, mocker, env_value) -> None:
"""Test that existing environment variables with paths return resolved Path objects."""
mocker.patch.dict(os.environ, {"PATH_VAR": env_value})
result = get_path_from_env("PATH_VAR")
assert isinstance(result, Path)
assert result == Path(env_value).resolve()
def test_missing_env_var_no_default(self, mocker):
def test_missing_env_var_no_default(self, mocker) -> None:
"""Test that missing environment variable with no default returns None."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_path_from_env("MISSING_VAR") is None
def test_missing_env_var_with_none_default(self, mocker):
def test_missing_env_var_with_none_default(self, mocker) -> None:
"""Test that missing environment variable with None default returns None."""
mocker.patch.dict(os.environ, {}, clear=True)
assert get_path_from_env("MISSING_VAR", default=None) is None
@@ -380,7 +380,7 @@ class TestGetPathFromEnv:
pytest.param(".", id="current_default"),
],
)
def test_missing_env_var_with_path_defaults(self, mocker, default_path_str):
def test_missing_env_var_with_path_defaults(self, mocker, default_path_str) -> None:
"""Test that missing environment variables return resolved default Path objects."""
mocker.patch.dict(os.environ, {}, clear=True)
default_path = Path(default_path_str)
@@ -388,7 +388,7 @@ class TestGetPathFromEnv:
assert isinstance(result, Path)
assert result == default_path.resolve()
def test_relative_paths_are_resolved(self, mocker):
def test_relative_paths_are_resolved(self, mocker) -> None:
"""Test that relative paths are properly resolved to absolute paths."""
mocker.patch.dict(os.environ, {"REL_PATH": "relative/path"})
result = get_path_from_env("REL_PATH")
@@ -407,7 +407,7 @@ class TestGetListFromEnv:
pytest.param("a,,b,c", ["a", "b", "c"], id="empty_elements_removed"),
],
)
def test_existing_env_var_basic_parsing(self, mocker, env_value, expected):
def test_existing_env_var_basic_parsing(self, mocker, env_value, expected) -> None:
"""Test that existing environment variables are parsed correctly."""
mocker.patch.dict(os.environ, {"LIST_VAR": env_value})
result = get_list_from_env("LIST_VAR")
@@ -421,7 +421,7 @@ class TestGetListFromEnv:
pytest.param(";", "a;b;c", ["a", "b", "c"], id="semicolon_separator"),
],
)
def test_custom_separators(self, mocker, separator, env_value, expected):
def test_custom_separators(self, mocker, separator, env_value, expected) -> None:
"""Test that custom separators work correctly."""
mocker.patch.dict(os.environ, {"LIST_VAR": env_value})
result = get_list_from_env("LIST_VAR", separator=separator)
@@ -439,19 +439,19 @@ class TestGetListFromEnv:
pytest.param(None, [], id="none_default_returns_empty_list"),
],
)
def test_missing_env_var_with_defaults(self, mocker, default, expected):
def test_missing_env_var_with_defaults(self, mocker, default, expected) -> None:
"""Test that missing environment variables return provided defaults."""
mocker.patch.dict(os.environ, {}, clear=True)
result = get_list_from_env("MISSING_VAR", default=default)
assert result == expected
def test_missing_env_var_no_default(self, mocker):
def test_missing_env_var_no_default(self, mocker) -> None:
"""Test that missing environment variable with no default returns empty list."""
mocker.patch.dict(os.environ, {}, clear=True)
result = get_list_from_env("MISSING_VAR")
assert result == []
def test_required_env_var_missing_raises_error(self, mocker):
def test_required_env_var_missing_raises_error(self, mocker) -> None:
"""Test that missing required environment variable raises ValueError."""
mocker.patch.dict(os.environ, {}, clear=True)
with pytest.raises(
@@ -460,19 +460,19 @@ class TestGetListFromEnv:
):
get_list_from_env("REQUIRED_VAR", required=True)
def test_required_env_var_with_default_does_not_raise(self, mocker):
def test_required_env_var_with_default_does_not_raise(self, mocker) -> None:
"""Test that required environment variable with default does not raise error."""
mocker.patch.dict(os.environ, {}, clear=True)
result = get_list_from_env("REQUIRED_VAR", default=["default"], required=True)
assert result == ["default"]
def test_strip_whitespace_false(self, mocker):
def test_strip_whitespace_false(self, mocker) -> None:
"""Test that whitespace is preserved when strip_whitespace=False."""
mocker.patch.dict(os.environ, {"LIST_VAR": " a , b , c "})
result = get_list_from_env("LIST_VAR", strip_whitespace=False)
assert result == [" a ", " b ", " c "]
def test_remove_empty_false(self, mocker):
def test_remove_empty_false(self, mocker) -> None:
"""Test that empty elements are preserved when remove_empty=False."""
mocker.patch.dict(os.environ, {"LIST_VAR": "a,,b,,c"})
result = get_list_from_env("LIST_VAR", remove_empty=False)
+9 -9
@@ -1,6 +1,7 @@
import hmac
import pickle
from hashlib import sha256
from pathlib import Path
import pytest
from django.test import override_settings
@@ -11,21 +12,20 @@ from paperless.celery import signed_pickle_loads
class TestSignedPickleSerializer:
def test_roundtrip_simple_types(self):
def test_roundtrip_simple_types(self) -> None:
"""Signed pickle can round-trip basic JSON-like types."""
for obj in [42, "hello", [1, 2, 3], {"key": "value"}, None, True]:
assert signed_pickle_loads(signed_pickle_dumps(obj)) == obj
def test_roundtrip_complex_types(self):
def test_roundtrip_complex_types(self) -> None:
"""Signed pickle can round-trip types that JSON cannot."""
from pathlib import Path
obj = {"path": Path("/tmp/test"), "data": {1, 2, 3}}
result = signed_pickle_loads(signed_pickle_dumps(obj))
assert result["path"] == Path("/tmp/test")
assert result["data"] == {1, 2, 3}
def test_tampered_data_rejected(self):
def test_tampered_data_rejected(self) -> None:
"""Flipping a byte in the data portion causes HMAC failure."""
payload = signed_pickle_dumps({"task": "test"})
tampered = bytearray(payload)
@@ -33,7 +33,7 @@ class TestSignedPickleSerializer:
with pytest.raises(ValueError, match="HMAC verification failed"):
signed_pickle_loads(bytes(tampered))
def test_tampered_signature_rejected(self):
def test_tampered_signature_rejected(self) -> None:
"""Flipping a byte in the signature portion causes HMAC failure."""
payload = signed_pickle_dumps({"task": "test"})
tampered = bytearray(payload)
@@ -41,17 +41,17 @@ class TestSignedPickleSerializer:
with pytest.raises(ValueError, match="HMAC verification failed"):
signed_pickle_loads(bytes(tampered))
def test_truncated_payload_rejected(self):
def test_truncated_payload_rejected(self) -> None:
"""A payload shorter than HMAC_SIZE is rejected."""
with pytest.raises(ValueError, match="too short"):
signed_pickle_loads(b"\x00" * (HMAC_SIZE - 1))
def test_empty_payload_rejected(self):
def test_empty_payload_rejected(self) -> None:
with pytest.raises(ValueError, match="too short"):
signed_pickle_loads(b"")
@override_settings(SECRET_KEY="different-secret-key")
def test_wrong_secret_key_rejected(self):
def test_wrong_secret_key_rejected(self) -> None:
"""A message signed with one key cannot be loaded with another."""
original_key = b"test-secret-key-do-not-use-in-production"
obj = {"task": "test"}
@@ -61,7 +61,7 @@ class TestSignedPickleSerializer:
with pytest.raises(ValueError, match="HMAC verification failed"):
signed_pickle_loads(payload)
def test_forged_pickle_rejected(self):
def test_forged_pickle_rejected(self) -> None:
"""A raw pickle payload (no signature) is rejected."""
raw_pickle = pickle.dumps({"task": "test"})
# Raw pickle won't have a valid HMAC prefix
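The serializer these tests exercise can be sketched as an HMAC-SHA256 envelope around pickle. This standalone version uses a module-level key where production code would derive one from Django's `SECRET_KEY`; the constant names mirror the test imports, but the exact wiring in `paperless.celery` is an assumption:

```python
import hmac
import pickle
from hashlib import sha256

# Stand-in for a key derived from settings.SECRET_KEY (assumption)
SECRET_KEY = b"test-secret-key-do-not-use-in-production"
HMAC_SIZE = sha256().digest_size  # 32 bytes


def signed_pickle_dumps(obj: object) -> bytes:
    """Serialize obj with pickle and prepend an HMAC-SHA256 signature."""
    data = pickle.dumps(obj)
    signature = hmac.new(SECRET_KEY, data, sha256).digest()
    return signature + data


def signed_pickle_loads(payload: bytes) -> object:
    """Verify the HMAC before unpickling; reject tampered or truncated input."""
    if len(payload) < HMAC_SIZE:
        raise ValueError("Payload too short to contain an HMAC signature")
    signature, data = payload[:HMAC_SIZE], payload[HMAC_SIZE:]
    expected = hmac.new(SECRET_KEY, data, sha256).digest()
    # Constant-time comparison avoids timing side channels
    if not hmac.compare_digest(signature, expected):
        raise ValueError("HMAC verification failed")
    return pickle.loads(data)
```

Verifying before unpickling is the important design point: untrusted bytes never reach `pickle.loads`, which closes the arbitrary-code-execution hole that raw pickle transports have.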
@@ -45,45 +45,45 @@ class TestMigrateSkipArchiveFile(TestMigrations):
)
return ApplicationConfiguration.objects.get(pk=pk)
def test_skip_mapped_to_auto(self):
def test_skip_mapped_to_auto(self) -> None:
config = self._get_config(1)
assert config.mode == "auto"
def test_skip_archive_always_mapped_to_never(self):
def test_skip_archive_always_mapped_to_never(self) -> None:
config = self._get_config(1)
assert config.archive_file_generation == "never"
def test_redo_unchanged(self):
def test_redo_unchanged(self) -> None:
config = self._get_config(2)
assert config.mode == "redo"
def test_skip_archive_with_text_mapped_to_auto(self):
def test_skip_archive_with_text_mapped_to_auto(self) -> None:
config = self._get_config(2)
assert config.archive_file_generation == "auto"
def test_force_unchanged(self):
def test_force_unchanged(self) -> None:
config = self._get_config(3)
assert config.mode == "force"
def test_skip_archive_never_mapped_to_always(self):
def test_skip_archive_never_mapped_to_always(self) -> None:
config = self._get_config(3)
assert config.archive_file_generation == "always"
def test_skip_noarchive_mapped_to_auto(self):
def test_skip_noarchive_mapped_to_auto(self) -> None:
config = self._get_config(4)
assert config.mode == "auto"
def test_skip_noarchive_implies_archive_never(self):
def test_skip_noarchive_implies_archive_never(self) -> None:
config = self._get_config(4)
assert config.archive_file_generation == "never"
def test_skip_noarchive_explicit_skip_archive_takes_precedence(self):
def test_skip_noarchive_explicit_skip_archive_takes_precedence(self) -> None:
"""skip_archive_file=never maps to always, not overridden by skip_noarchive."""
config = self._get_config(5)
assert config.mode == "auto"
assert config.archive_file_generation == "always"
def test_null_values_remain_null(self):
def test_null_values_remain_null(self) -> None:
config = self._get_config(6)
assert config.mode is None
assert config.archive_file_generation is None
+54
@@ -1,3 +1,4 @@
import json
import logging
import sys
@@ -9,6 +10,8 @@ logger = logging.getLogger("paperless_ai.chat")
MAX_SINGLE_DOC_CONTEXT_CHARS = 15000
SINGLE_DOC_SNIPPET_CHARS = 800
CHAT_METADATA_DELIMITER = "\n\n__PAPERLESS_CHAT_METADATA__"
MAX_CHAT_REFERENCES = 3
CHAT_PROMPT_TMPL = """Context information is below.
---------------------
@@ -19,6 +22,52 @@ CHAT_PROMPT_TMPL = """Context information is below.
Answer:"""
def _build_document_reference(
document: Document,
title: str | None = None,
) -> dict[str, int | str]:
return {
"id": document.pk,
"title": title or document.title or document.filename,
}
def _get_document_references(
documents: list[Document],
top_nodes: list,
) -> list[dict[str, int | str]]:
allowed_documents = {doc.pk: doc for doc in documents}
references: list[dict[str, int | str]] = []
seen_document_ids: set[int] = set()
for node in top_nodes:
try:
document_id = int(node.metadata["document_id"])
except (KeyError, TypeError, ValueError): # pragma: no cover
continue
if document_id in seen_document_ids or document_id not in allowed_documents:
continue
seen_document_ids.add(document_id)
document = allowed_documents[document_id]
references.append(
_build_document_reference(document, node.metadata.get("title")),
)
if len(references) >= MAX_CHAT_REFERENCES: # pragma: no cover
break
return references
def _format_chat_metadata_trailer(references: list[dict[str, int | str]]) -> str:
return (
f"{CHAT_METADATA_DELIMITER}"
f"{json.dumps({'references': references}, separators=(',', ':'))}"
)
def stream_chat_with_documents(query_str: str, documents: list[Document]):
client = AIClient()
index = load_or_build_index()
@@ -49,6 +98,7 @@ def stream_chat_with_documents(query_str: str, documents: list[Document]):
if len(documents) == 1:
# Just one doc — provide full content
doc = documents[0]
references = [_build_document_reference(doc)]
# TODO: include document metadata in the context
content = doc.content or ""
context_body = content
@@ -78,6 +128,7 @@ def stream_chat_with_documents(query_str: str, documents: list[Document]):
yield "Sorry, I couldn't find any content to answer your question."
return
references = _get_document_references(documents, top_nodes)
context = "\n\n".join(
f"TITLE: {node.metadata.get('title')}\n{node.text[:SINGLE_DOC_SNIPPET_CHARS]}"
for node in top_nodes
@@ -102,3 +153,6 @@ def stream_chat_with_documents(query_str: str, documents: list[Document]):
for chunk in response_stream.response_gen:
yield chunk
sys.stdout.flush()
if references:
yield _format_chat_metadata_trailer(references)
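On the consuming side, the accumulated stream can be split back into answer text and references with a single `partition` on the delimiter. This client-side sketch is illustrative only (the frontend's actual parsing code is not shown in this diff); the delimiter and JSON shape match the trailer emitted above:

```python
import json

CHAT_METADATA_DELIMITER = "\n\n__PAPERLESS_CHAT_METADATA__"


def split_chat_stream(full_text: str) -> tuple[str, list[dict]]:
    """Split an accumulated chat response into answer text and reference list."""
    answer, sep, trailer = full_text.partition(CHAT_METADATA_DELIMITER)
    if not sep:
        # No trailer was emitted (e.g. the reference list was empty)
        return full_text, []
    return answer, json.loads(trailer).get("references", [])
```

Because the trailer is yielded last and only when references exist, older clients that ignore the delimiter simply render it as trailing text rather than breaking.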
+10 -6
@@ -1,10 +1,12 @@
import logging
from typing import TYPE_CHECKING
from paperless.models import LLMBackend
if TYPE_CHECKING:
from llama_index.core.llms import ChatMessage
from llama_index.llms.ollama import Ollama
from llama_index.llms.openai import OpenAI
from llama_index.llms.openai_like import OpenAILike
from paperless.config import AIConfig
from paperless.network import validate_outbound_http_url
@@ -22,8 +24,8 @@ class AIClient:
self.settings = AIConfig()
self.llm = self.get_llm()
def get_llm(self) -> "Ollama | OpenAI":
if self.settings.llm_backend == "ollama":
def get_llm(self) -> "Ollama | OpenAILike":
if self.settings.llm_backend == LLMBackend.OLLAMA:
from llama_index.llms.ollama import Ollama
endpoint = self.settings.llm_endpoint or "http://localhost:11434"
@@ -36,8 +38,8 @@ class AIClient:
base_url=endpoint,
request_timeout=120,
)
elif self.settings.llm_backend == "openai":
from llama_index.llms.openai import OpenAI
elif self.settings.llm_backend == LLMBackend.OPENAI_LIKE:
from llama_index.llms.openai_like import OpenAILike
endpoint = self.settings.llm_endpoint or None
if endpoint:
@@ -45,10 +47,12 @@ class AIClient:
endpoint,
allow_internal=self.settings.llm_allow_internal_endpoints,
)
return OpenAI(
return OpenAILike(
model=self.settings.llm_model or "gpt-3.5-turbo",
api_base=endpoint,
api_key=self.settings.llm_api_key,
is_chat_model=True,
is_function_calling_model=True,
)
else:
raise ValueError(f"Unsupported LLM backend: {self.settings.llm_backend}")
+5 -5
@@ -19,8 +19,8 @@ def get_embedding_model() -> "BaseEmbedding":
config = AIConfig()
match config.llm_embedding_backend:
case LLMEmbeddingBackend.OPENAI:
from llama_index.embeddings.openai import OpenAIEmbedding
case LLMEmbeddingBackend.OPENAI_LIKE:
from llama_index.embeddings.openai_like import OpenAILikeEmbedding
endpoint = config.llm_endpoint or None
if endpoint:
@@ -28,8 +28,8 @@ def get_embedding_model() -> "BaseEmbedding":
endpoint,
allow_internal=config.llm_allow_internal_endpoints,
)
return OpenAIEmbedding(
model=config.llm_embedding_model or "text-embedding-3-small",
return OpenAILikeEmbedding(
model_name=config.llm_embedding_model or "text-embedding-3-small",
api_key=config.llm_api_key,
api_base=endpoint,
)
@@ -54,7 +54,7 @@ def get_embedding_dim() -> int:
config = AIConfig()
model = config.llm_embedding_model or (
"text-embedding-3-small"
if config.llm_embedding_backend == "openai"
if config.llm_embedding_backend == LLMEmbeddingBackend.OPENAI_LIKE
else "sentence-transformers/all-MiniLM-L6-v2"
)
+1 -1
@@ -98,7 +98,7 @@ def test_update_llm_index_removes_meta(
config = AIConfig()
expected_model = config.llm_embedding_model or (
"text-embedding-3-small"
if config.llm_embedding_backend == "openai"
if config.llm_embedding_backend == "openai-like"
else "sentence-transformers/all-MiniLM-L6-v2"
)
assert meta == {"embedding_model": expected_model, "dim": 384}
+48 -5
@@ -1,3 +1,4 @@
import json
from unittest.mock import MagicMock
from unittest.mock import patch
@@ -5,6 +6,7 @@ import pytest
from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode
from paperless_ai.chat import CHAT_METADATA_DELIMITER
from paperless_ai.chat import stream_chat_with_documents
@@ -40,6 +42,21 @@ def mock_document():
return doc
def assert_chat_output(
output: list[str],
*,
expected_chunks: list[str],
expected_references: list[dict[str, int | str]],
) -> None:
assert output[:-1] == expected_chunks
trailer = output[-1]
assert trailer.startswith(CHAT_METADATA_DELIMITER)
assert json.loads(trailer.removeprefix(CHAT_METADATA_DELIMITER)) == {
"references": expected_references,
}
def test_stream_chat_with_one_document_full_content(mock_document) -> None:
with (
patch("paperless_ai.chat.AIClient") as mock_client_cls,
@@ -68,7 +85,13 @@ def test_stream_chat_with_one_document_full_content(mock_document) -> None:
output = list(stream_chat_with_documents("What is this?", [mock_document]))
assert output == ["chunk1", "chunk2"]
assert_chat_output(
output,
expected_chunks=["chunk1", "chunk2"],
expected_references=[
{"id": mock_document.pk, "title": "Test Document"},
],
)
def test_stream_chat_with_multiple_documents_retrieval(patch_embed_nodes) -> None:
@@ -100,7 +123,20 @@ def test_stream_chat_with_multiple_documents_retrieval(patch_embed_nodes) -> Non
# Patch as_retriever to return a retriever whose retrieve() returns mock_node1 and mock_node2
mock_retriever = MagicMock()
mock_retriever.retrieve.return_value = [mock_node1, mock_node2]
mock_duplicate_node = TextNode(
text="More content for doc 1.",
metadata={"document_id": "1", "title": "Document 1 Duplicate"},
)
mock_foreign_node = TextNode(
text="Content for doc 3.",
metadata={"document_id": "3", "title": "Document 3"},
)
mock_retriever.retrieve.return_value = [
mock_node1,
mock_duplicate_node,
mock_node2,
mock_foreign_node,
]
mock_as_retriever.return_value = mock_retriever
# Mock response stream
@@ -113,12 +149,19 @@ def test_stream_chat_with_multiple_documents_retrieval(patch_embed_nodes) -> Non
mock_query_engine.query.return_value = mock_response_stream
# Fake documents
doc1 = MagicMock(pk=1)
doc2 = MagicMock(pk=2)
doc1 = MagicMock(pk=1, title="Document 1", filename="doc1.pdf")
doc2 = MagicMock(pk=2, title="Document 2", filename="doc2.pdf")
output = list(stream_chat_with_documents("What's up?", [doc1, doc2]))
assert output == ["chunk1", "chunk2"]
assert_chat_output(
output,
expected_chunks=["chunk1", "chunk2"],
expected_references=[
{"id": 1, "title": "Document 1"},
{"id": 2, "title": "Document 2"},
],
)
def test_stream_chat_no_matching_nodes() -> None:
+6 -4
@@ -25,8 +25,8 @@ def mock_ollama_llm():
@pytest.fixture
def mock_openai_llm():
with patch("llama_index.llms.openai.OpenAI") as MockOpenAI:
yield MockOpenAI
with patch("llama_index.llms.openai_like.OpenAILike") as MockOpenAILike:
yield MockOpenAILike
def test_get_llm_ollama(mock_ai_config, mock_ollama_llm):
@@ -45,7 +45,7 @@ def test_get_llm_ollama(mock_ai_config, mock_ollama_llm):
def test_get_llm_openai(mock_ai_config, mock_openai_llm):
mock_ai_config.llm_backend = "openai"
mock_ai_config.llm_backend = "openai-like"
mock_ai_config.llm_model = "test_model"
mock_ai_config.llm_api_key = "test_api_key"
mock_ai_config.llm_endpoint = "http://test-url"
@@ -56,12 +56,14 @@ def test_get_llm_openai(mock_ai_config, mock_openai_llm):
model="test_model",
api_base="http://test-url",
api_key="test_api_key",
is_chat_model=True,
is_function_calling_model=True,
)
assert client.llm == mock_openai_llm.return_value
def test_get_llm_openai_blocks_internal_endpoint_when_disallowed(mock_ai_config):
mock_ai_config.llm_backend = "openai"
mock_ai_config.llm_backend = "openai-like"
mock_ai_config.llm_model = "test_model"
mock_ai_config.llm_api_key = "test_api_key"
mock_ai_config.llm_endpoint = "http://127.0.0.1:1234"
+9 -7
@@ -54,15 +54,17 @@ def mock_document():
def test_get_embedding_model_openai(mock_ai_config):
mock_ai_config.return_value.llm_embedding_backend = LLMEmbeddingBackend.OPENAI
mock_ai_config.return_value.llm_embedding_backend = LLMEmbeddingBackend.OPENAI_LIKE
mock_ai_config.return_value.llm_embedding_model = "text-embedding-3-small"
mock_ai_config.return_value.llm_api_key = "test_api_key"
mock_ai_config.return_value.llm_endpoint = "http://test-url"
with patch("llama_index.embeddings.openai.OpenAIEmbedding") as MockOpenAIEmbedding:
with patch(
"llama_index.embeddings.openai_like.OpenAILikeEmbedding",
) as MockOpenAIEmbedding:
model = get_embedding_model()
MockOpenAIEmbedding.assert_called_once_with(
model="text-embedding-3-small",
model_name="text-embedding-3-small",
api_key="test_api_key",
api_base="http://test-url",
)
@@ -72,7 +74,7 @@ def test_get_embedding_model_openai(mock_ai_config):
def test_get_embedding_model_openai_blocks_internal_endpoint_when_disallowed(
mock_ai_config,
):
mock_ai_config.return_value.llm_embedding_backend = LLMEmbeddingBackend.OPENAI
mock_ai_config.return_value.llm_embedding_backend = LLMEmbeddingBackend.OPENAI_LIKE
mock_ai_config.return_value.llm_embedding_model = "text-embedding-3-small"
mock_ai_config.return_value.llm_api_key = "test_api_key"
mock_ai_config.return_value.llm_endpoint = "http://127.0.0.1:11434"
@@ -109,7 +111,7 @@ def test_get_embedding_model_invalid_backend(mock_ai_config):
def test_get_embedding_dim_infers_and_saves(temp_llm_index_dir, mock_ai_config):
mock_ai_config.return_value.llm_embedding_backend = "openai"
mock_ai_config.return_value.llm_embedding_backend = "openai-like"
mock_ai_config.return_value.llm_embedding_model = None
class DummyEmbedding:
@@ -129,7 +131,7 @@ def test_get_embedding_dim_infers_and_saves(temp_llm_index_dir, mock_ai_config):
def test_get_embedding_dim_reads_existing_meta(temp_llm_index_dir, mock_ai_config):
mock_ai_config.return_value.llm_embedding_backend = "openai"
mock_ai_config.return_value.llm_embedding_backend = "openai-like"
mock_ai_config.return_value.llm_embedding_model = None
(temp_llm_index_dir / "meta.json").write_text(
@@ -142,7 +144,7 @@ def test_get_embedding_dim_reads_existing_meta(temp_llm_index_dir, mock_ai_confi
def test_get_embedding_dim_raises_on_model_change(temp_llm_index_dir, mock_ai_config):
mock_ai_config.return_value.llm_embedding_backend = "openai"
mock_ai_config.return_value.llm_embedding_backend = "openai-like"
mock_ai_config.return_value.llm_embedding_model = None
(temp_llm_index_dir / "meta.json").write_text(
+108
@@ -1,11 +1,19 @@
from collections.abc import Generator
from typing import TYPE_CHECKING
import pytest
import pytest_mock
from django.contrib.auth.models import User
from django.test import Client
from rest_framework.test import APIClient
from paperless_mail.mail import MailAccountHandler
from paperless_mail.models import MailAccount
from paperless_mail.tests.factories import MailAccountFactory
if TYPE_CHECKING:
from paperless_mail.tests.test_mail import BogusMailBox
@pytest.fixture()
def greenmail_mail_account(db: None) -> Generator[MailAccount, None, None]:
@@ -27,3 +35,103 @@ def greenmail_mail_account(db: None) -> Generator[MailAccount, None, None]:
@pytest.fixture()
def mail_account_handler() -> MailAccountHandler:
return MailAccountHandler()
@pytest.fixture()
def mail_user(
db: None,
django_user_model,
client: Client,
):
"""
Create a user with the `add_mailaccount` permission and log them in via
the test client. Returned so tests can mutate permissions if needed.
"""
from django.contrib.auth.models import Permission
user = django_user_model.objects.create_user("testuser")
user.user_permissions.add(
*Permission.objects.filter(codename__in=["add_mailaccount"]),
)
user.save()
client.force_login(user)
return user
@pytest.fixture()
def oauth_settings(settings):
"""
Apply the OAuth callback / client-id settings the OAuth flow needs. Uses
pytest-django's `settings` fixture so values are reverted automatically.
"""
settings.OAUTH_CALLBACK_BASE_URL = "http://localhost:8000"
settings.GMAIL_OAUTH_CLIENT_ID = "test_gmail_client_id"
settings.GMAIL_OAUTH_CLIENT_SECRET = "test_gmail_client_secret"
settings.OUTLOOK_OAUTH_CLIENT_ID = "test_outlook_client_id"
settings.OUTLOOK_OAUTH_CLIENT_SECRET = "test_outlook_client_secret"
return settings
@pytest.fixture()
def mail_mocker(db: None):
"""
Provides a MailMocker instance with its `MailBox` and
`queue_consumption_tasks` patches active. Cleanups registered via
TestCase.addCleanup are run on teardown by calling doCleanups().
"""
from paperless_mail.tests.test_mail import MailMocker
mocker = MailMocker()
mocker.setUp()
try:
yield mocker
finally:
mocker.doCleanups()
@pytest.fixture()
def mail_api_user(
db: None,
django_user_model: type[User],
) -> User:
"""
Fully-permissioned (regular) user used by the mail API tests.
Has every model-level permission but is NOT a Django superuser/staff:
the owner-aware filtering and bulk_delete permission tests rely on
django-guardian's object-level checks, and `is_superuser` short-circuits
those checks. The name avoids `admin` to make this distinction explicit.
"""
from django.contrib.auth.models import Permission
user = django_user_model.objects.create_user(username="mail_api_user")
user.user_permissions.add(*Permission.objects.all())
user.save()
return user
@pytest.fixture()
def mail_api_client(mail_api_user: User) -> APIClient:
"""
DRF APIClient force-authenticated as `mail_api_user` and pinned to API v10
via the Accept header (matches `documents/tests/conftest.py:admin_client`).
"""
client = APIClient()
client.force_authenticate(user=mail_api_user)
client.credentials(HTTP_ACCEPT="application/json; version=10")
return client
@pytest.fixture()
def bogus_mailbox(mocker: pytest_mock.MockerFixture) -> "BogusMailBox":
"""
Patch `paperless_mail.mail.MailBox` with a `BogusMailBox` instance so the
`/api/mail_accounts/test/` endpoint can run without a real IMAP server.
Returns the bogus mailbox so tests can introspect/manipulate it.
"""
from paperless_mail.tests.test_mail import BogusMailBox
mailbox = BogusMailBox()
mock_mailbox_cls = mocker.patch("paperless_mail.mail.MailBox")
mock_mailbox_cls.return_value = mailbox
return mailbox
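The `mail_mocker` pattern above — a yield fixture driving a unittest-style helper's `setUp()` and `doCleanups()` — generalizes to any `TestCase`-derived mocking helper. A self-contained sketch with a toy helper standing in for `MailMocker` (names and behavior here are illustrative, not the real class):

```python
import unittest


class ToyMocker(unittest.TestCase):
    """Stand-in for MailMocker: registers cleanups during setUp()."""

    def runTest(self) -> None:
        # Required so the TestCase can be instantiated without a test method
        pass

    def setUp(self) -> None:
        self.torn_down = False
        self.addCleanup(self._teardown)

    def _teardown(self) -> None:
        self.torn_down = True


def toy_mocker():
    """Generator mirroring the yield-fixture body: setUp, yield, doCleanups."""
    mocker = ToyMocker()
    mocker.setUp()
    try:
        yield mocker
    finally:
        # Runs everything registered via TestCase.addCleanup, in reverse order
        mocker.doCleanups()
```

The `try`/`finally` around the `yield` is what guarantees `doCleanups()` runs even when the test body raises, matching what pytest does for yield fixtures.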
+3 -3
@@ -9,7 +9,7 @@ from paperless_mail.models import MailRule
from paperless_mail.models import ProcessedMail
class MailAccountFactory(DjangoModelFactory):
class MailAccountFactory(DjangoModelFactory[MailAccount]):
class Meta:
model = MailAccount
@@ -24,7 +24,7 @@ class MailAccountFactory(DjangoModelFactory):
is_token = False
class MailRuleFactory(DjangoModelFactory):
class MailRuleFactory(DjangoModelFactory[MailRule]):
class Meta:
model = MailRule
@@ -44,7 +44,7 @@ class MailRuleFactory(DjangoModelFactory):
stop_processing = False
class ProcessedMailFactory(DjangoModelFactory):
class ProcessedMailFactory(DjangoModelFactory[ProcessedMail]):
class Meta:
model = ProcessedMail
+305 -273
@@ -1,44 +1,39 @@
import json
from unittest import mock
from typing import TYPE_CHECKING
from django.contrib.auth.models import Permission
import pytest
from django.contrib.auth.models import User
from guardian.shortcuts import assign_perm
from rest_framework import status
from rest_framework.test import APITestCase
from rest_framework.test import APIClient
from documents.tests.factories import CorrespondentFactory
from documents.tests.factories import DocumentTypeFactory
from documents.tests.factories import TagFactory
from documents.tests.utils import DirectoriesMixin
from paperless_mail.models import MailAccount
from paperless_mail.models import MailRule
from paperless_mail.models import ProcessedMail
from paperless_mail.tests.factories import MailAccountFactory
from paperless_mail.tests.factories import MailRuleFactory
from paperless_mail.tests.factories import ProcessedMailFactory
from paperless_mail.tests.test_mail import BogusMailBox
if TYPE_CHECKING:
from paperless_mail.tests.test_mail import BogusMailBox
class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
ENDPOINT = "/api/mail_accounts/"
MAIL_ACCOUNTS_ENDPOINT = "/api/mail_accounts/"
MAIL_ACCOUNTS_TEST_ENDPOINT = f"{MAIL_ACCOUNTS_ENDPOINT}test/"
MAIL_RULES_ENDPOINT = "/api/mail_rules/"
PROCESSED_MAIL_ENDPOINT = "/api/processed_mail/"
PROCESSED_MAIL_BULK_DELETE_ENDPOINT = f"{PROCESSED_MAIL_ENDPOINT}bulk_delete/"
def setUp(self) -> None:
self.bogus_mailbox = BogusMailBox()
patcher = mock.patch("paperless_mail.mail.MailBox")
m = patcher.start()
m.return_value = self.bogus_mailbox
self.addCleanup(patcher.stop)
super().setUp()
self.user = User.objects.create_user(username="temp_admin")
self.user.user_permissions.add(*Permission.objects.all())
self.user.save()
self.client.force_authenticate(user=self.user)
def test_get_mail_accounts(self) -> None:
@pytest.mark.django_db
class TestAPIMailAccounts:
def test_get_mail_accounts(
self,
mail_api_client: APIClient,
) -> None:
"""
GIVEN:
- Configured mail accounts
@@ -47,7 +42,6 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
THEN:
- Configured mail accounts are provided
"""
account1 = MailAccountFactory(
name="Email1",
username="username1",
@@ -56,31 +50,30 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
imap_port=443,
)
response = self.client.get(self.ENDPOINT)
response = mail_api_client.get(MAIL_ACCOUNTS_ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 1)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
returned_account1 = response.data["results"][0]
self.assertEqual(returned_account1["name"], account1.name)
self.assertEqual(returned_account1["username"], account1.username)
self.assertEqual(
returned_account1["password"],
"**********",
)
self.assertEqual(returned_account1["imap_server"], account1.imap_server)
self.assertEqual(returned_account1["imap_port"], account1.imap_port)
self.assertEqual(returned_account1["imap_security"], account1.imap_security)
self.assertEqual(returned_account1["character_set"], account1.character_set)
assert returned_account1["name"] == account1.name
assert returned_account1["username"] == account1.username
assert returned_account1["password"] == "**********"
assert returned_account1["imap_server"] == account1.imap_server
assert returned_account1["imap_port"] == account1.imap_port
assert returned_account1["imap_security"] == account1.imap_security
assert returned_account1["character_set"] == account1.character_set
def test_create_mail_account(self) -> None:
def test_create_mail_account(
self,
mail_api_client: APIClient,
) -> None:
"""
WHEN:
- API request is made to add a mail account
THEN:
- A new mail account is created
"""
account1 = {
"name": "Email1",
"username": "username1",
@@ -91,24 +84,27 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
"character_set": "UTF-8",
}
response = self.client.post(
self.ENDPOINT,
response = mail_api_client.post(
MAIL_ACCOUNTS_ENDPOINT,
data=account1,
)
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
assert response.status_code == status.HTTP_201_CREATED
returned_account1 = MailAccount.objects.get(name="Email1")
self.assertEqual(returned_account1.name, account1["name"])
self.assertEqual(returned_account1.username, account1["username"])
self.assertEqual(returned_account1.password, account1["password"])
self.assertEqual(returned_account1.imap_server, account1["imap_server"])
self.assertEqual(returned_account1.imap_port, account1["imap_port"])
self.assertEqual(returned_account1.imap_security, account1["imap_security"])
self.assertEqual(returned_account1.character_set, account1["character_set"])
assert returned_account1.name == account1["name"]
assert returned_account1.username == account1["username"]
assert returned_account1.password == account1["password"]
assert returned_account1.imap_server == account1["imap_server"]
assert returned_account1.imap_port == account1["imap_port"]
assert returned_account1.imap_security == account1["imap_security"]
assert returned_account1.character_set == account1["character_set"]
def test_delete_mail_account(self) -> None:
def test_delete_mail_account(
self,
mail_api_client: APIClient,
) -> None:
"""
GIVEN:
- Existing mail account
@@ -117,18 +113,20 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
THEN:
- Account is deleted
"""
account1 = MailAccountFactory()
response = self.client.delete(
f"{self.ENDPOINT}{account1.pk}/",
response = mail_api_client.delete(
f"{MAIL_ACCOUNTS_ENDPOINT}{account1.pk}/",
)
self.assertEqual(response.status_code, status.HTTP_204_NO_CONTENT)
assert response.status_code == status.HTTP_204_NO_CONTENT
self.assertEqual(len(MailAccount.objects.all()), 0)
assert MailAccount.objects.count() == 0
def test_update_mail_account(self) -> None:
def test_update_mail_account(
self,
mail_api_client: APIClient,
) -> None:
"""
GIVEN:
- Existing mail accounts
@@ -137,49 +135,51 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
THEN:
- The mail account is updated, password only updated if not '****'
"""
account1 = MailAccountFactory()
response = self.client.patch(
f"{self.ENDPOINT}{account1.pk}/",
response = mail_api_client.patch(
f"{MAIL_ACCOUNTS_ENDPOINT}{account1.pk}/",
data={
"name": "Updated Name 1",
"password": "******",
},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
assert response.status_code == status.HTTP_200_OK
returned_account1 = MailAccount.objects.get(pk=account1.pk)
self.assertEqual(returned_account1.name, "Updated Name 1")
self.assertEqual(returned_account1.password, account1.password)
assert returned_account1.name == "Updated Name 1"
assert returned_account1.password == account1.password
response = self.client.patch(
f"{self.ENDPOINT}{account1.pk}/",
response = mail_api_client.patch(
f"{MAIL_ACCOUNTS_ENDPOINT}{account1.pk}/",
data={
"name": "Updated Name 2",
"password": "123xyz",
},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
assert response.status_code == status.HTTP_200_OK
returned_account2 = MailAccount.objects.get(pk=account1.pk)
self.assertEqual(returned_account2.name, "Updated Name 2")
self.assertEqual(returned_account2.password, "123xyz")
assert returned_account2.name == "Updated Name 2"
assert returned_account2.password == "123xyz"
def test_mail_account_test_fail(self) -> None:
def test_mail_account_test_fail(
self,
mail_api_client: APIClient,
bogus_mailbox: "BogusMailBox",
) -> None:
"""
GIVEN:
- Errnoeous mail account details
- Erroneous mail account details
WHEN:
- API call is made to test account
THEN:
- API returns 400 bad request
"""
response = self.client.post(
f"{self.ENDPOINT}test/",
response = mail_api_client.post(
MAIL_ACCOUNTS_TEST_ENDPOINT,
json.dumps(
{
"imap_server": "server.example.com",
@@ -192,9 +192,13 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
content_type="application/json",
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
assert response.status_code == status.HTTP_400_BAD_REQUEST
def test_mail_account_test_success(self) -> None:
def test_mail_account_test_success(
self,
mail_api_client: APIClient,
bogus_mailbox: "BogusMailBox",
) -> None:
"""
GIVEN:
- Working mail account details
@@ -203,9 +207,8 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
THEN:
- API returns success
"""
response = self.client.post(
f"{self.ENDPOINT}test/",
response = mail_api_client.post(
MAIL_ACCOUNTS_TEST_ENDPOINT,
json.dumps(
{
"imap_server": "server.example.com",
@@ -217,10 +220,14 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
),
content_type="application/json",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["success"], True)
assert response.status_code == status.HTTP_200_OK
assert response.data["success"] is True
def test_mail_account_test_existing(self) -> None:
def test_mail_account_test_existing(
self,
mail_api_client: APIClient,
bogus_mailbox: "BogusMailBox",
) -> None:
"""
GIVEN:
- Testing server details for an existing account with obfuscated password (***)
@@ -236,8 +243,8 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
imap_port=443,
)
response = self.client.post(
f"{self.ENDPOINT}test/",
response = mail_api_client.post(
MAIL_ACCOUNTS_TEST_ENDPOINT,
json.dumps(
{
"id": account.pk,
@@ -250,12 +257,16 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
),
content_type="application/json",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["success"], True)
assert response.status_code == status.HTTP_200_OK
assert response.data["success"] is True
def test_mail_account_test_existing_nonexistent_id_forbidden(self) -> None:
response = self.client.post(
f"{self.ENDPOINT}test/",
def test_mail_account_test_existing_nonexistent_id_forbidden(
self,
mail_api_client: APIClient,
bogus_mailbox: "BogusMailBox",
) -> None:
response = mail_api_client.post(
MAIL_ACCOUNTS_TEST_ENDPOINT,
json.dumps(
{
"id": 999999,
@@ -268,10 +279,15 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
),
content_type="application/json",
)
self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
self.assertEqual(response.content.decode(), "Insufficient permissions")
assert response.status_code == status.HTTP_403_FORBIDDEN
assert response.content.decode() == "Insufficient permissions"
def test_get_mail_accounts_owner_aware(self) -> None:
def test_get_mail_accounts_owner_aware(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
"""
GIVEN:
- Configured accounts with different users
@@ -280,36 +296,29 @@ class TestAPIMailAccounts(DirectoriesMixin, APITestCase):
THEN:
- Only unowned, owned by user or granted accounts are provided
"""
user2 = User.objects.create_user(username="temp_admin2")
user2 = django_user_model.objects.create_user(username="temp_admin2")
account1 = MailAccountFactory(name="Email1")
account2 = MailAccountFactory(name="Email2", owner=self.user)
account2 = MailAccountFactory(name="Email2", owner=mail_api_user)
_account3 = MailAccountFactory(name="Email3", owner=user2)
account4 = MailAccountFactory(name="Email4", owner=user2)
assign_perm("view_mailaccount", self.user, account4)
assign_perm("view_mailaccount", mail_api_user, account4)
response = self.client.get(self.ENDPOINT)
response = mail_api_client.get(MAIL_ACCOUNTS_ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 3)
self.assertEqual(response.data["results"][0]["name"], account1.name)
self.assertEqual(response.data["results"][1]["name"], account2.name)
self.assertEqual(response.data["results"][2]["name"], account4.name)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 3
assert response.data["results"][0]["name"] == account1.name
assert response.data["results"][1]["name"] == account2.name
assert response.data["results"][2]["name"] == account4.name
class TestAPIMailRules(DirectoriesMixin, APITestCase):
ENDPOINT = "/api/mail_rules/"
def setUp(self) -> None:
super().setUp()
self.user = User.objects.create_user(username="temp_admin")
self.user.user_permissions.add(*Permission.objects.all())
self.user.save()
self.client.force_authenticate(user=self.user)
def test_get_mail_rules(self) -> None:
@pytest.mark.django_db
class TestAPIMailRules:
def test_get_mail_rules(
self,
mail_api_client: APIClient,
) -> None:
"""
GIVEN:
- Configured mail accounts and rules
@@ -318,7 +327,6 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
THEN:
- Configured mail rules are provided
"""
account1 = MailAccountFactory()
rule1 = MailRuleFactory(
name="Rule1",
@@ -330,34 +338,37 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
filter_attachment_filename_include="file.pdf",
)
response = self.client.get(self.ENDPOINT)
response = mail_api_client.get(MAIL_RULES_ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 1)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
returned_rule1 = response.data["results"][0]
self.assertEqual(returned_rule1["name"], rule1.name)
self.assertEqual(returned_rule1["account"], account1.pk)
self.assertEqual(returned_rule1["folder"], rule1.folder)
self.assertEqual(returned_rule1["filter_from"], rule1.filter_from)
self.assertEqual(returned_rule1["filter_to"], rule1.filter_to)
self.assertEqual(returned_rule1["filter_subject"], rule1.filter_subject)
self.assertEqual(returned_rule1["filter_body"], rule1.filter_body)
self.assertEqual(
returned_rule1["filter_attachment_filename_include"],
rule1.filter_attachment_filename_include,
assert returned_rule1["name"] == rule1.name
assert returned_rule1["account"] == account1.pk
assert returned_rule1["folder"] == rule1.folder
assert returned_rule1["filter_from"] == rule1.filter_from
assert returned_rule1["filter_to"] == rule1.filter_to
assert returned_rule1["filter_subject"] == rule1.filter_subject
assert returned_rule1["filter_body"] == rule1.filter_body
assert (
returned_rule1["filter_attachment_filename_include"]
== rule1.filter_attachment_filename_include
)
self.assertEqual(returned_rule1["maximum_age"], rule1.maximum_age)
self.assertEqual(returned_rule1["action"], rule1.action)
self.assertEqual(returned_rule1["assign_title_from"], rule1.assign_title_from)
self.assertEqual(
returned_rule1["assign_correspondent_from"],
rule1.assign_correspondent_from,
assert returned_rule1["maximum_age"] == rule1.maximum_age
assert returned_rule1["action"] == rule1.action
assert returned_rule1["assign_title_from"] == rule1.assign_title_from
assert (
returned_rule1["assign_correspondent_from"]
== rule1.assign_correspondent_from
)
self.assertEqual(returned_rule1["order"], rule1.order)
self.assertEqual(returned_rule1["attachment_type"], rule1.attachment_type)
assert returned_rule1["order"] == rule1.order
assert returned_rule1["attachment_type"] == rule1.attachment_type
def test_create_mail_rule(self) -> None:
def test_create_mail_rule(
self,
mail_api_client: APIClient,
) -> None:
"""
GIVEN:
- Configured mail account exists
@@ -366,7 +377,6 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
THEN:
- A new mail rule is created
"""
account1 = MailAccountFactory()
tag = TagFactory(name="t")
correspondent = CorrespondentFactory(name="c")
@@ -394,58 +404,51 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
"assign_owner_from_rule": True,
}
response = self.client.post(
self.ENDPOINT,
response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
data=rule1,
)
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
assert response.status_code == status.HTTP_201_CREATED
response = self.client.get(self.ENDPOINT)
response = mail_api_client.get(MAIL_RULES_ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 1)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 1
returned_rule1 = response.data["results"][0]
self.assertEqual(returned_rule1["name"], rule1["name"])
self.assertEqual(returned_rule1["account"], account1.pk)
self.assertEqual(returned_rule1["folder"], rule1["folder"])
self.assertEqual(returned_rule1["filter_from"], rule1["filter_from"])
self.assertEqual(returned_rule1["filter_to"], rule1["filter_to"])
self.assertEqual(returned_rule1["filter_subject"], rule1["filter_subject"])
self.assertEqual(returned_rule1["filter_body"], rule1["filter_body"])
self.assertEqual(
returned_rule1["filter_attachment_filename_include"],
rule1["filter_attachment_filename_include"],
assert returned_rule1["name"] == rule1["name"]
assert returned_rule1["account"] == account1.pk
assert returned_rule1["folder"] == rule1["folder"]
assert returned_rule1["filter_from"] == rule1["filter_from"]
assert returned_rule1["filter_to"] == rule1["filter_to"]
assert returned_rule1["filter_subject"] == rule1["filter_subject"]
assert returned_rule1["filter_body"] == rule1["filter_body"]
assert (
returned_rule1["filter_attachment_filename_include"]
== rule1["filter_attachment_filename_include"]
)
self.assertEqual(returned_rule1["maximum_age"], rule1["maximum_age"])
self.assertEqual(returned_rule1["action"], rule1["action"])
self.assertEqual(
returned_rule1["assign_title_from"],
rule1["assign_title_from"],
assert returned_rule1["maximum_age"] == rule1["maximum_age"]
assert returned_rule1["action"] == rule1["action"]
assert returned_rule1["assign_title_from"] == rule1["assign_title_from"]
assert (
returned_rule1["assign_correspondent_from"]
== rule1["assign_correspondent_from"]
)
self.assertEqual(
returned_rule1["assign_correspondent_from"],
rule1["assign_correspondent_from"],
)
self.assertEqual(returned_rule1["order"], rule1["order"])
self.assertEqual(returned_rule1["attachment_type"], rule1["attachment_type"])
self.assertEqual(returned_rule1["action_parameter"], rule1["action_parameter"])
self.assertEqual(
returned_rule1["assign_correspondent"],
rule1["assign_correspondent"],
)
self.assertEqual(
returned_rule1["assign_document_type"],
rule1["assign_document_type"],
)
self.assertEqual(returned_rule1["assign_tags"], rule1["assign_tags"])
self.assertEqual(
returned_rule1["assign_owner_from_rule"],
rule1["assign_owner_from_rule"],
assert returned_rule1["order"] == rule1["order"]
assert returned_rule1["attachment_type"] == rule1["attachment_type"]
assert returned_rule1["action_parameter"] == rule1["action_parameter"]
assert returned_rule1["assign_correspondent"] == rule1["assign_correspondent"]
assert returned_rule1["assign_document_type"] == rule1["assign_document_type"]
assert returned_rule1["assign_tags"] == rule1["assign_tags"]
assert (
returned_rule1["assign_owner_from_rule"] == rule1["assign_owner_from_rule"]
)
def test_delete_mail_rule(self) -> None:
def test_delete_mail_rule(
self,
mail_api_client: APIClient,
) -> None:
"""
GIVEN:
- Existing mail rule
@@ -454,19 +457,21 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
THEN:
- Rule is deleted
"""
account1 = MailAccountFactory()
rule1 = MailRuleFactory(account=account1)
response = self.client.delete(
f"{self.ENDPOINT}{rule1.pk}/",
response = mail_api_client.delete(
f"{MAIL_RULES_ENDPOINT}{rule1.pk}/",
)
self.assertEqual(response.status_code, status.HTTP_204_NO_CONTENT)
assert response.status_code == status.HTTP_204_NO_CONTENT
self.assertEqual(len(MailRule.objects.all()), 0)
assert MailRule.objects.count() == 0
def test_update_mail_rule(self) -> None:
def test_update_mail_rule(
self,
mail_api_client: APIClient,
) -> None:
"""
GIVEN:
- Existing mail rule
@@ -475,30 +480,33 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
THEN:
- The mail rule is updated
"""
account1 = MailAccountFactory()
rule1 = MailRuleFactory(account=account1)
response = self.client.patch(
f"{self.ENDPOINT}{rule1.pk}/",
response = mail_api_client.patch(
f"{MAIL_RULES_ENDPOINT}{rule1.pk}/",
data={
"name": "Updated Name 1",
"action": MailRule.MailAction.DELETE,
},
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
assert response.status_code == status.HTTP_200_OK
returned_rule1 = MailRule.objects.get(pk=rule1.pk)
self.assertEqual(returned_rule1.name, "Updated Name 1")
self.assertEqual(returned_rule1.action, MailRule.MailAction.DELETE)
assert returned_rule1.name == "Updated Name 1"
assert returned_rule1.action == MailRule.MailAction.DELETE
def test_create_mail_rule_scopes_accounts(self) -> None:
other_user = User.objects.create_user(username="mail-owner")
def test_create_mail_rule_scopes_accounts(
self,
mail_api_client: APIClient,
django_user_model: type[User],
) -> None:
other_user = django_user_model.objects.create_user(username="mail-owner")
foreign_account = MailAccountFactory(name="ForeignEmail", owner=other_user)
response = self.client.post(
self.ENDPOINT,
response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
data={
"name": "Rule1",
"account": foreign_account.pk,
@@ -512,8 +520,8 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
"attachment_type": MailRule.AttachmentProcessing.ATTACHMENTS_ONLY,
},
)
missing_response = self.client.post(
self.ENDPOINT,
missing_response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
data={
"name": "Rule1",
"account": foreign_account.pk + 1000,
@@ -528,21 +536,24 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
},
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertEqual(missing_response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertEqual(response.data["account"][0].code, "does_not_exist")
self.assertEqual(missing_response.data["account"][0].code, "does_not_exist")
self.assertEqual(MailRule.objects.count(), 0)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert missing_response.status_code == status.HTTP_400_BAD_REQUEST
assert response.data["account"][0].code == "does_not_exist"
assert missing_response.data["account"][0].code == "does_not_exist"
assert MailRule.objects.count() == 0
def test_create_mail_rule_allowed_for_granted_account_change_permission(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
other_user = User.objects.create_user(username="mail-owner")
other_user = django_user_model.objects.create_user(username="mail-owner")
foreign_account = MailAccountFactory(name="ForeignEmail", owner=other_user)
assign_perm("change_mailaccount", self.user, foreign_account)
assign_perm("change_mailaccount", mail_api_user, foreign_account)
response = self.client.post(
self.ENDPOINT,
response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
data={
"name": "Rule1",
"account": foreign_account.pk,
@@ -557,25 +568,34 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
},
)
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
self.assertEqual(MailRule.objects.get().account, foreign_account)
assert response.status_code == status.HTTP_201_CREATED
assert MailRule.objects.get().account == foreign_account
def test_update_mail_rule_forbidden_for_unpermitted_account(self) -> None:
def test_update_mail_rule_forbidden_for_unpermitted_account(
self,
mail_api_client: APIClient,
django_user_model: type[User],
) -> None:
own_account = MailAccountFactory()
other_user = User.objects.create_user(username="mail-owner")
other_user = django_user_model.objects.create_user(username="mail-owner")
foreign_account = MailAccountFactory(owner=other_user)
rule1 = MailRuleFactory(account=own_account)
response = self.client.patch(
f"{self.ENDPOINT}{rule1.pk}/",
response = mail_api_client.patch(
f"{MAIL_RULES_ENDPOINT}{rule1.pk}/",
data={"account": foreign_account.pk},
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
assert response.status_code == status.HTTP_400_BAD_REQUEST
rule1.refresh_from_db()
self.assertEqual(rule1.account, own_account)
assert rule1.account == own_account
def test_get_mail_rules_owner_aware(self) -> None:
def test_get_mail_rules_owner_aware(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
"""
GIVEN:
- Configured rules with different users
@@ -584,24 +604,26 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
THEN:
- Only unowned, owned by user or granted mail rules are provided
"""
user2 = User.objects.create_user(username="temp_admin2")
user2 = django_user_model.objects.create_user(username="temp_admin2")
account1 = MailAccountFactory()
rule1 = MailRuleFactory(account=account1, order=0)
rule2 = MailRuleFactory(account=account1, order=1, owner=self.user)
rule2 = MailRuleFactory(account=account1, order=1, owner=mail_api_user)
MailRuleFactory(account=account1, order=2, owner=user2)
rule4 = MailRuleFactory(account=account1, order=3, owner=user2)
assign_perm("view_mailrule", self.user, rule4)
assign_perm("view_mailrule", mail_api_user, rule4)
response = self.client.get(self.ENDPOINT)
response = mail_api_client.get(MAIL_RULES_ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 3)
self.assertEqual(response.data["results"][0]["name"], rule1.name)
self.assertEqual(response.data["results"][1]["name"], rule2.name)
self.assertEqual(response.data["results"][2]["name"], rule4.name)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 3
assert response.data["results"][0]["name"] == rule1.name
assert response.data["results"][1]["name"] == rule2.name
assert response.data["results"][2]["name"] == rule4.name
def test_mailrule_maxage_validation(self) -> None:
def test_mailrule_maxage_validation(
self,
mail_api_client: APIClient,
) -> None:
"""
GIVEN:
- An existing mail account
@@ -629,24 +651,24 @@ class TestAPIMailRules(DirectoriesMixin, APITestCase):
"attachment_type": MailRule.AttachmentProcessing.ATTACHMENTS_ONLY,
}
response = self.client.post(self.ENDPOINT, data=rule_data, format="json")
response = mail_api_client.post(
MAIL_RULES_ENDPOINT,
data=rule_data,
format="json",
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertIn("maximum_age", response.data)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert "maximum_age" in response.data
class TestAPIProcessedMails(DirectoriesMixin, APITestCase):
ENDPOINT = "/api/processed_mail/"
def setUp(self) -> None:
super().setUp()
self.user = User.objects.create_user(username="temp_admin")
self.user.user_permissions.add(*Permission.objects.all())
self.user.save()
self.client.force_authenticate(user=self.user)
def test_get_processed_mails_owner_aware(self) -> None:
@pytest.mark.django_db
class TestAPIProcessedMails:
def test_get_processed_mails_owner_aware(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
"""
GIVEN:
- Configured processed mails with different users
@@ -655,27 +677,31 @@ class TestAPIProcessedMails(DirectoriesMixin, APITestCase):
THEN:
- Only unowned, owned by user or granted processed mails are provided
"""
user2 = User.objects.create_user(username="temp_admin2")
user2 = django_user_model.objects.create_user(username="temp_admin2")
rule = MailRuleFactory()
pm1 = ProcessedMailFactory(rule=rule)
pm2 = ProcessedMailFactory(
rule=rule,
status="FAILED",
error="err",
owner=self.user,
owner=mail_api_user,
)
ProcessedMailFactory(rule=rule, owner=user2)
pm4 = ProcessedMailFactory(rule=rule, owner=user2)
assign_perm("view_processedmail", self.user, pm4)
assign_perm("view_processedmail", mail_api_user, pm4)
response = self.client.get(self.ENDPOINT)
response = mail_api_client.get(PROCESSED_MAIL_ENDPOINT)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["count"], 3)
assert response.status_code == status.HTTP_200_OK
assert response.data["count"] == 3
returned_ids = {r["id"] for r in response.data["results"]}
self.assertSetEqual(returned_ids, {pm1.id, pm2.id, pm4.id})
assert returned_ids == {pm1.id, pm2.id, pm4.id}
def test_get_processed_mails_filter_by_rule(self) -> None:
def test_get_processed_mails_filter_by_rule(
self,
mail_api_client: APIClient,
mail_api_user: User,
) -> None:
"""
GIVEN:
- Processed mails belonging to two different rules
@@ -687,17 +713,22 @@ class TestAPIProcessedMails(DirectoriesMixin, APITestCase):
account = MailAccountFactory()
rule1 = MailRuleFactory(account=account)
rule2 = MailRuleFactory(account=account)
pm1 = ProcessedMailFactory(rule=rule1, owner=self.user)
pm1 = ProcessedMailFactory(rule=rule1, owner=mail_api_user)
pm2 = ProcessedMailFactory(rule=rule1, status="FAILED", error="e")
ProcessedMailFactory(rule=rule2)
response = self.client.get(f"{self.ENDPOINT}?rule={rule1.pk}")
response = mail_api_client.get(f"{PROCESSED_MAIL_ENDPOINT}?rule={rule1.pk}")
self.assertEqual(response.status_code, status.HTTP_200_OK)
assert response.status_code == status.HTTP_200_OK
returned_ids = {r["id"] for r in response.data["results"]}
self.assertSetEqual(returned_ids, {pm1.id, pm2.id})
assert returned_ids == {pm1.id, pm2.id}
def test_bulk_delete_processed_mails(self) -> None:
def test_bulk_delete_processed_mails(
self,
mail_api_client: APIClient,
mail_api_user: User,
django_user_model: type[User],
) -> None:
"""
GIVEN:
- Processed mails belonging to two different rules and different users
@@ -706,7 +737,7 @@ class TestAPIProcessedMails(DirectoriesMixin, APITestCase):
THEN:
- Only the specified processed mails are deleted, respecting ownership and permissions
"""
user2 = User.objects.create_user(username="temp_admin2")
user2 = django_user_model.objects.create_user(username="temp_admin2")
rule = MailRuleFactory()
# unowned, owned by self, and one with explicit object perm
pm_unowned = ProcessedMailFactory(rule=rule)
@@ -714,46 +745,47 @@ class TestAPIProcessedMails(DirectoriesMixin, APITestCase):
rule=rule,
status="FAILED",
error="e",
owner=self.user,
owner=mail_api_user,
)
pm_granted = ProcessedMailFactory(rule=rule, owner=user2)
assign_perm("delete_processedmail", self.user, pm_granted)
assign_perm("delete_processedmail", mail_api_user, pm_granted)
pm_forbidden = ProcessedMailFactory(rule=rule, owner=user2)
# Success for allowed items
response = self.client.post(
f"{self.ENDPOINT}bulk_delete/",
response = mail_api_client.post(
PROCESSED_MAIL_BULK_DELETE_ENDPOINT,
data={
"mail_ids": [pm_unowned.id, pm_owned.id, pm_granted.id],
},
format="json",
)
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.data["result"], "OK")
self.assertSetEqual(
set(response.data["deleted_mail_ids"]),
{pm_unowned.id, pm_owned.id, pm_granted.id},
)
self.assertFalse(ProcessedMail.objects.filter(id=pm_unowned.id).exists())
self.assertFalse(ProcessedMail.objects.filter(id=pm_owned.id).exists())
self.assertFalse(ProcessedMail.objects.filter(id=pm_granted.id).exists())
self.assertTrue(ProcessedMail.objects.filter(id=pm_forbidden.id).exists())
assert response.status_code == status.HTTP_200_OK
assert response.data["result"] == "OK"
assert set(response.data["deleted_mail_ids"]) == {
pm_unowned.id,
pm_owned.id,
pm_granted.id,
}
assert not ProcessedMail.objects.filter(id=pm_unowned.id).exists()
assert not ProcessedMail.objects.filter(id=pm_owned.id).exists()
assert not ProcessedMail.objects.filter(id=pm_granted.id).exists()
assert ProcessedMail.objects.filter(id=pm_forbidden.id).exists()
# 403 and not deleted
response = self.client.post(
f"{self.ENDPOINT}bulk_delete/",
response = mail_api_client.post(
PROCESSED_MAIL_BULK_DELETE_ENDPOINT,
data={
"mail_ids": [pm_forbidden.id],
},
format="json",
)
self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
self.assertTrue(ProcessedMail.objects.filter(id=pm_forbidden.id).exists())
assert response.status_code == status.HTTP_403_FORBIDDEN
assert ProcessedMail.objects.filter(id=pm_forbidden.id).exists()
# missing mail_ids
response = self.client.post(
f"{self.ENDPOINT}bulk_delete/",
response = mail_api_client.post(
PROCESSED_MAIL_BULK_DELETE_ENDPOINT,
data={"mail_ids": "not-a-list"},
format="json",
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
assert response.status_code == status.HTTP_400_BAD_REQUEST
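Per the commit "lift composite test API endpoints to module constants", the conversion above replaces the old per-class `ENDPOINT` attributes with module-level constants and composite paths built from them. A self-contained sketch of that pattern follows; the values for the rule and processed-mail endpoints are taken from the removed `ENDPOINT` strings, while the value of `MAIL_ACCOUNTS_ENDPOINT` is an assumption based on the same naming convention:

```python
# Module-level endpoint constants replacing the old per-class ENDPOINT
# attributes. "/api/mail_rules/" and "/api/processed_mail/" appear in the
# removed unittest-style classes; "/api/mail_accounts/" is assumed to follow
# the same convention.
MAIL_ACCOUNTS_ENDPOINT = "/api/mail_accounts/"
MAIL_ACCOUNTS_TEST_ENDPOINT = f"{MAIL_ACCOUNTS_ENDPOINT}test/"
MAIL_RULES_ENDPOINT = "/api/mail_rules/"
PROCESSED_MAIL_ENDPOINT = "/api/processed_mail/"
PROCESSED_MAIL_BULK_DELETE_ENDPOINT = f"{PROCESSED_MAIL_ENDPOINT}bulk_delete/"

# Composite per-object paths are then built inline in each test, exactly as
# the converted code does with f"{MAIL_ACCOUNTS_ENDPOINT}{account1.pk}/":
pk = 42
detail_url = f"{MAIL_ACCOUNTS_ENDPOINT}{pk}/"
assert detail_url == "/api/mail_accounts/42/"
assert MAIL_ACCOUNTS_TEST_ENDPOINT == "/api/mail_accounts/test/"
```

Keeping the constants at module scope lets the pytest-style test functions reference them without a class, which is what makes the `setUp`-free fixtures (`mail_api_client`, `mail_api_user`) sufficient.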
+5 -4
@@ -1592,7 +1592,7 @@ class TestPostConsumeAction(TestCase):
mock_get_rule_action,
mock_mailbox_login,
mock_get_mailbox,
):
) -> None:
mock_mailbox = mock.MagicMock()
mock_get_mailbox.return_value.__enter__.return_value = mock_mailbox
mock_action = mock.MagicMock()
@@ -1625,7 +1625,7 @@ class TestPostConsumeAction(TestCase):
mock_get_rule_action,
mock_mailbox_login,
mock_get_mailbox,
):
) -> None:
mock_mailbox = mock.MagicMock()
mock_get_mailbox.return_value.__enter__.return_value = mock_mailbox
mock_action = mock.MagicMock()
@@ -1762,7 +1762,7 @@ class TestTasks(TestCase):
self.assertIn("No new", result)
@mock.patch("paperless_mail.tasks.MailAccountHandler.handle_mail_account")
def test_rule_with_stop_processing(self, m):
def test_rule_with_stop_processing(self, m) -> None:
"""
GIVEN:
- Mail account with a rule with stop_processing=True
@@ -1863,7 +1863,7 @@ class TestMailAccountTestView(APITestCase):
def test_mail_account_test_view_refresh_token(
self,
mock_refresh_account_oauth_token,
):
) -> None:
"""
GIVEN:
- Mail account with expired token
@@ -2050,6 +2050,7 @@ class TestMailRuleAPI(APITestCase):
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
self.assertEqual(MailRule.objects.count(), 1)
rule = MailRule.objects.first()
assert rule is not None
self.assertEqual(rule.name, "Test Rule")
def test_mail_rule_action_parameter_required_for_tag_or_move(self) -> None:
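The parametrized callback-URL cases in the converted `test_mail_oauth.py` hunks further down encode a precedence rule: `OAUTH_CALLBACK_BASE_URL` wins when set, otherwise `PAPERLESS_URL` is used, and `BASE_URL` is inserted as a path prefix. The real logic lives in `PaperlessMailOAuth2Manager`; the function below is a hypothetical, framework-free re-implementation of just that rule, matching the three expected URLs from the parametrize table:

```python
from typing import Optional


def oauth_callback_url(
    oauth_callback_base_url: Optional[str],
    paperless_url: str,
    base_url: str = "/",
) -> str:
    # Hypothetical sketch: OAUTH_CALLBACK_BASE_URL takes precedence, with
    # PAPERLESS_URL as the fallback host (param ids "callback-base-url-set"
    # and "falls-back-to-paperless-url" in the test table).
    host = oauth_callback_base_url if oauth_callback_base_url else paperless_url
    # BASE_URL ("/" by default, or e.g. "/paperless/") prefixes the API path
    # (param id "respects-base-url-prefix").
    return host.rstrip("/") + base_url + "api/oauth/callback/"


# The three cases from the parametrize table:
assert (
    oauth_callback_url("http://paperless.example.com", "ignored")
    == "http://paperless.example.com/api/oauth/callback/"
)
assert (
    oauth_callback_url(None, "http://paperless.example.com")
    == "http://paperless.example.com/api/oauth/callback/"
)
assert (
    oauth_callback_url(None, "http://paperless.example.com", "/paperless/")
    == "http://paperless.example.com/paperless/api/oauth/callback/"
)
```

This is only a model of the behavior the tests assert, not the production implementation, but it shows why the parametrize conversion needs `SettingsWrapper` to override three independent settings per case.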
+330 -298
@@ -1,13 +1,14 @@
from datetime import timedelta
from unittest import mock
import pytest
import pytest_mock
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.test import TestCase
from django.test import override_settings
from django.test import Client
from django.utils import timezone
from httpx_oauth.oauth2 import GetAccessTokenError
from httpx_oauth.oauth2 import RefreshTokenError
from pytest_django.fixtures import SettingsWrapper
from rest_framework import status
from paperless_mail.mail import MailAccountHandler
@@ -16,340 +17,374 @@ from paperless_mail.oauth import PaperlessMailOAuth2Manager
from paperless_mail.tests.factories import MailAccountFactory
@override_settings(
OAUTH_CALLBACK_BASE_URL="http://localhost:8000",
GMAIL_OAUTH_CLIENT_ID="test_gmail_client_id",
GMAIL_OAUTH_CLIENT_SECRET="test_gmail_client_secret",
OUTLOOK_OAUTH_CLIENT_ID="test_outlook_client_id",
OUTLOOK_OAUTH_CLIENT_SECRET="test_outlook_client_secret",
)
class TestMailOAuth(
TestCase,
):
def setUp(self) -> None:
self.user = User.objects.create_user("testuser")
self.user.user_permissions.add(
*Permission.objects.filter(
codename__in=[
"add_mailaccount",
],
),
)
self.user.save()
self.client.force_login(self.user)
self.mail_account_handler = MailAccountHandler()
super().setUp()
@pytest.fixture()
def oauth_manager() -> PaperlessMailOAuth2Manager:
return PaperlessMailOAuth2Manager()
def test_generate_paths(self) -> None:
"""
GIVEN:
- Mocked settings for OAuth callback and base URLs
WHEN:
- get_oauth_callback_url and get_oauth_redirect_url are called
THEN:
- Correct URLs are generated
"""
# Callback URL
oauth_manager = PaperlessMailOAuth2Manager()
with override_settings(OAUTH_CALLBACK_BASE_URL="http://paperless.example.com"):
self.assertEqual(
oauth_manager.oauth_callback_url,
@pytest.fixture()
def oauth_session(client: Client) -> Client:
"""Seed the test client session with a known oauth_state."""
session = client.session
session.update({"oauth_state": "test_state"})
session.save()
return client
class TestOAuthUrlGeneration:
"""OAuth callback / redirect URL construction by PaperlessMailOAuth2Manager."""
@pytest.mark.parametrize(
("overrides", "expected"),
[
pytest.param(
{"OAUTH_CALLBACK_BASE_URL": "http://paperless.example.com"},
"http://paperless.example.com/api/oauth/callback/",
)
with override_settings(
OAUTH_CALLBACK_BASE_URL=None,
PAPERLESS_URL="http://paperless.example.com",
):
self.assertEqual(
oauth_manager.oauth_callback_url,
id="callback-base-url-set",
),
pytest.param(
{
"OAUTH_CALLBACK_BASE_URL": None,
"PAPERLESS_URL": "http://paperless.example.com",
},
"http://paperless.example.com/api/oauth/callback/",
)
with override_settings(
OAUTH_CALLBACK_BASE_URL=None,
PAPERLESS_URL="http://paperless.example.com",
BASE_URL="/paperless/",
):
self.assertEqual(
oauth_manager.oauth_callback_url,
id="falls-back-to-paperless-url",
),
pytest.param(
{
"OAUTH_CALLBACK_BASE_URL": None,
"PAPERLESS_URL": "http://paperless.example.com",
"BASE_URL": "/paperless/",
},
"http://paperless.example.com/paperless/api/oauth/callback/",
)
# Redirect URL
with override_settings(DEBUG=True):
self.assertEqual(
oauth_manager.oauth_redirect_url,
"http://localhost:4200/mail",
)
with override_settings(DEBUG=False):
self.assertEqual(
oauth_manager.oauth_redirect_url,
"/mail",
)
@mock.patch(
"paperless_mail.oauth.PaperlessMailOAuth2Manager.get_gmail_access_token",
)
@mock.patch(
"paperless_mail.oauth.PaperlessMailOAuth2Manager.get_outlook_access_token",
)
def test_oauth_callback_view_success(
self,
mock_get_outlook_access_token,
mock_get_gmail_access_token,
):
"""
GIVEN:
- Mocked settings for Gmail and Outlook OAuth client IDs and secrets
WHEN:
- OAuth callback is called with a code and scope
- OAuth callback is called with a code and no scope
THEN:
- Gmail mail account is created
- Outlook mail account is created
"""
mock_get_gmail_access_token.return_value = {
"access_token": "test_access_token",
"refresh_token": "test_refresh_token",
"expires_in": 3600,
}
mock_get_outlook_access_token.return_value = {
"access_token": "test_access_token",
"refresh_token": "test_refresh_token",
"expires_in": 3600,
}
session = self.client.session
session.update(
{
"oauth_state": "test_state",
},
)
session.save()
# Test Google OAuth callback
response = self.client.get(
"/api/oauth/callback/?code=test_code&scope=https://mail.google.com/&state=test_state",
)
self.assertEqual(response.status_code, status.HTTP_302_FOUND)
self.assertIn("oauth_success=1", response.url)
mock_get_gmail_access_token.assert_called_once()
self.assertTrue(
MailAccount.objects.filter(imap_server="imap.gmail.com").exists(),
)
# Test Outlook OAuth callback
response = self.client.get(
"/api/oauth/callback/?code=test_code&state=test_state",
)
self.assertEqual(response.status_code, status.HTTP_302_FOUND)
self.assertIn("oauth_success=1", response.url)
self.assertTrue(
MailAccount.objects.filter(imap_server="outlook.office365.com").exists(),
)
@mock.patch("httpx_oauth.oauth2.BaseOAuth2.get_access_token")
def test_oauth_callback_view_fails(self, mock_get_access_token) -> None:
"""
GIVEN:
- Mocked settings for Gmail and Outlook OAuth client IDs and secrets
WHEN:
- OAuth callback is called and get access token returns an error
THEN:
- No mail account is created
- Error is logged
"""
mock_get_access_token.side_effect = GetAccessTokenError("test_error")
session = self.client.session
session.update(
{
"oauth_state": "test_state",
},
)
session.save()
with self.assertLogs("paperless_mail", level="ERROR") as cm:
# Test Google OAuth callback
response = self.client.get(
"/api/oauth/callback/?code=test_code&scope=https://mail.google.com/&state=test_state",
)
self.assertEqual(response.status_code, status.HTTP_302_FOUND)
self.assertIn("oauth_success=0", response.url)
self.assertFalse(
MailAccount.objects.filter(imap_server="imap.gmail.com").exists(),
)
# Test Outlook OAuth callback
response = self.client.get(
"/api/oauth/callback/?code=test_code&state=test_state",
)
self.assertEqual(response.status_code, status.HTTP_302_FOUND)
self.assertIn("oauth_success=0", response.url)
self.assertFalse(
MailAccount.objects.filter(
imap_server="outlook.office365.com",
).exists(),
)
self.assertIn(
"Error getting access token from OAuth provider",
cm.output[0],
)
def test_oauth_callback_view_insufficient_permissions(self) -> None:
"""
GIVEN:
- Mocked settings for Gmail and Outlook OAuth client IDs and secrets
- User without add_mailaccount permission
WHEN:
- OAuth callback is called
THEN:
- 400 bad request returned, no mail accounts are created
"""
self.user.user_permissions.remove(
*Permission.objects.filter(
codename__in=[
"add_mailaccount",
],
),
)
self.user.save()
response = self.client.get(
"/api/oauth/callback/?code=test_code&scope=https://mail.google.com/",
)
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertFalse(
MailAccount.objects.filter(imap_server="imap.gmail.com").exists(),
)
self.assertFalse(
MailAccount.objects.filter(imap_server="outlook.office365.com").exists(),
)
],
)
def test_oauth_callback_url(
self,
settings: SettingsWrapper,
oauth_manager: PaperlessMailOAuth2Manager,
overrides: dict,
expected: str,
) -> None:
"""
GIVEN:
- Various combinations of OAUTH_CALLBACK_BASE_URL, PAPERLESS_URL, and BASE_URL
WHEN:
- oauth_callback_url is read from the manager
THEN:
- The expected fully-qualified callback URL is produced
"""
for key, value in overrides.items():
setattr(settings, key, value)
assert oauth_manager.oauth_callback_url == expected
@pytest.mark.parametrize(
("debug", "expected"),
[
pytest.param(
True,
"http://localhost:4200/mail",
id="debug-redirects-to-ng-dev",
),
pytest.param(False, "/mail", id="prod-redirects-to-relative-path"),
],
)
def test_oauth_redirect_url(
self,
settings: SettingsWrapper,
oauth_manager: PaperlessMailOAuth2Manager,
debug: bool, # noqa: FBT001
expected: str,
) -> None:
"""
GIVEN:
- DEBUG is toggled on or off
WHEN:
- oauth_redirect_url is read from the manager
THEN:
- In DEBUG mode the Angular dev server URL is returned, otherwise a relative path
"""
settings.DEBUG = debug
assert oauth_manager.oauth_redirect_url == expected
@pytest.mark.django_db
class TestOAuthCallbackView:
"""End-to-end behavior of the /api/oauth/callback/ endpoint."""
def test_no_code(
self,
client: Client,
mail_user: User,
oauth_settings: SettingsWrapper,
) -> None:
"""
GIVEN:
- OAuth client IDs and secrets configured
WHEN:
- The OAuth callback is called without a code parameter
THEN:
- 400 Bad Request is returned and no mail account is created
"""
response = client.get("/api/oauth/callback/")
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert not MailAccount.objects.filter(imap_server="imap.gmail.com").exists()
assert not MailAccount.objects.filter(
imap_server="outlook.office365.com",
).exists()
def test_invalid_state(
self,
client: Client,
mail_user: User,
oauth_settings: SettingsWrapper,
) -> None:
"""
GIVEN:
- OAuth client IDs and secrets configured
WHEN:
- The OAuth callback is called with a state that does not match the session
THEN:
- 400 Bad Request is returned and no mail account is created
"""
response = client.get(
"/api/oauth/callback/?code=test_code&state=invalid_state",
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert not MailAccount.objects.filter(imap_server="imap.gmail.com").exists()
assert not MailAccount.objects.filter(
imap_server="outlook.office365.com",
).exists()
def test_insufficient_permissions(
self,
client: Client,
mail_user: User,
oauth_settings: SettingsWrapper,
) -> None:
"""
GIVEN:
- OAuth client IDs and secrets configured
- User without add_mailaccount permission
WHEN:
- The OAuth callback is called
THEN:
- 400 Bad Request is returned and no mail account is created
"""
mail_user.user_permissions.remove(
*Permission.objects.filter(codename__in=["add_mailaccount"]),
)
mail_user.save()
response = client.get(
"/api/oauth/callback/?code=test_code&scope=https://mail.google.com/",
)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert not MailAccount.objects.filter(imap_server="imap.gmail.com").exists()
assert not MailAccount.objects.filter(
imap_server="outlook.office365.com",
).exists()
@pytest.mark.parametrize(
("provider", "callback_query", "expected_imap"),
[
pytest.param(
"gmail",
"code=test_code&scope=https://mail.google.com/&state=test_state",
"imap.gmail.com",
id="gmail",
),
pytest.param(
"outlook",
"code=test_code&state=test_state",
"outlook.office365.com",
id="outlook",
),
],
)
def test_success(
self,
client: Client,
mail_user: User,
oauth_settings: SettingsWrapper,
oauth_session: Client,
mocker: pytest_mock.MockerFixture,
provider: str,
callback_query: str,
expected_imap: str,
) -> None:
"""
GIVEN:
- OAuth client IDs and secrets configured for Gmail and Outlook
- A valid oauth_state seeded in the session
WHEN:
- The OAuth callback is called with a code and provider-specific scope
THEN:
- A redirect with oauth_success=1 is returned
- The provider's access-token method is invoked
- A mail account for the matching provider is created
"""
token_payload = {
"access_token": "test_access_token",
"refresh_token": "test_refresh_token",
"expires_in": 3600,
}
target = (
"paperless_mail.oauth.PaperlessMailOAuth2Manager.get_gmail_access_token"
if provider == "gmail"
else "paperless_mail.oauth.PaperlessMailOAuth2Manager.get_outlook_access_token"
)
mocked = mocker.patch(target, return_value=token_payload)
response = client.get(f"/api/oauth/callback/?{callback_query}")
assert response.status_code == status.HTTP_302_FOUND
assert "oauth_success=1" in response.url
mocked.assert_called_once()
assert MailAccount.objects.filter(imap_server=expected_imap).exists()
@pytest.mark.parametrize(
("callback_query", "imap_server"),
[
pytest.param(
"code=test_code&scope=https://mail.google.com/&state=test_state",
"imap.gmail.com",
id="gmail",
),
pytest.param(
"code=test_code&state=test_state",
"outlook.office365.com",
id="outlook",
),
],
)
def test_provider_error(
self,
client: Client,
mail_user: User,
oauth_settings: SettingsWrapper,
oauth_session: Client,
mocker: pytest_mock.MockerFixture,
caplog: pytest.LogCaptureFixture,
callback_query: str,
imap_server: str,
) -> None:
"""
GIVEN:
- OAuth client IDs and secrets configured
- The provider's access-token endpoint raises GetAccessTokenError
WHEN:
- The OAuth callback is called with a code (Gmail or Outlook)
THEN:
- A redirect with oauth_success=0 is returned
- No mail account is created
- The failure is logged at ERROR level
"""
mocker.patch(
"httpx_oauth.oauth2.BaseOAuth2.get_access_token",
side_effect=GetAccessTokenError("test_error"),
)
with caplog.at_level("ERROR", logger="paperless_mail"):
response = client.get(f"/api/oauth/callback/?{callback_query}")
assert response.status_code == status.HTTP_302_FOUND
assert "oauth_success=0" in response.url
assert not MailAccount.objects.filter(imap_server=imap_server).exists()
assert any(
"Error getting access token from OAuth provider" in record.message
for record in caplog.records
)
@pytest.mark.django_db
class TestRefreshTokenOnHandleMailAccount:
"""OAuth refresh-token flow exercised through MailAccountHandler.handle_mail_account."""
@pytest.mark.parametrize(
("account_type", "name"),
[
pytest.param(
MailAccount.MailAccountType.GMAIL_OAUTH,
"Test Gmail",
id="gmail",
),
pytest.param(
MailAccount.MailAccountType.OUTLOOK_OAUTH,
"Test Outlook",
id="outlook",
),
],
)
def test_refresh_token_called(
self,
mocker: pytest_mock.MockerFixture,
mail_account_handler: MailAccountHandler,
account_type: MailAccount.MailAccountType,
name: str,
) -> None:
"""
GIVEN:
- An OAuth-backed mail account with a refresh token and an expired access token
WHEN:
- handle_mail_account is called
THEN:
- The OAuth refresh_token endpoint is invoked exactly once
"""
mock_mailbox = mocker.MagicMock()
mocker.patch(
"paperless_mail.mail.get_mailbox",
).return_value.__enter__.return_value = mock_mailbox
mock_refresh = mocker.patch(
"httpx_oauth.oauth2.BaseOAuth2.refresh_token",
return_value={
"access_token": "test_access_token",
"refresh_token": "test_refresh_token",
"expires_in": 3600,
},
)
account = MailAccountFactory(
name=name,
username="test_username",
account_type=account_type,
is_token=True,
refresh_token="test_refresh_token",
expiration=timezone.now() - timedelta(days=1),
)
mail_account_handler.handle_mail_account(account)
mock_refresh.assert_called_once()
def test_refresh_token_failure(
self,
mocker: pytest_mock.MockerFixture,
caplog: pytest.LogCaptureFixture,
mail_account_handler: MailAccountHandler,
) -> None:
"""
GIVEN:
- An OAuth-backed mail account with a refresh token and an expired access token
- The OAuth refresh_token endpoint raises RefreshTokenError
WHEN:
- handle_mail_account is called
THEN:
- 0 processed mails is returned
- The failure is logged at ERROR level with the account context
"""
mock_mailbox = mocker.MagicMock()
mocker.patch(
"paperless_mail.mail.get_mailbox",
).return_value.__enter__.return_value = mock_mailbox
mock_refresh = mocker.patch(
"httpx_oauth.oauth2.BaseOAuth2.refresh_token",
side_effect=RefreshTokenError("test_error"),
)
account = MailAccountFactory(
name="Test Gmail Mail Account",
username="test_username",
account_type=MailAccount.MailAccountType.GMAIL_OAUTH,
@@ -358,16 +393,13 @@ class TestMailOAuth(
expiration=timezone.now() - timedelta(days=1),
)
with caplog.at_level("ERROR", logger="paperless_mail"):
result = mail_account_handler.handle_mail_account(account)
assert result == 0
mock_refresh.assert_called_once()
assert any(
f"Failed to refresh oauth token for account {account}: test_error"
in record.message
for record in caplog.records
)
+154 -126
@@ -1,28 +1,31 @@
import email
import email.contentmanager
import subprocess
from collections.abc import Generator
from email.message import Message
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from pathlib import Path
import gnupg
import pytest
from imap_tools import MailMessage
from paperless_mail.mail import MailAccountHandler
from paperless_mail.models import MailRule
from paperless_mail.preprocessor import MailMessageDecryptor
from paperless_mail.tests.factories import MailAccountFactory
from paperless_mail.tests.test_mail import _AttachmentDef
class MessageEncryptor:
"""
Test helper: generates a throwaway GPG keypair in a tempdir and exposes
`encrypt(MailMessage) -> MailMessage`.
"""
def __init__(self, gpg_home: Path) -> None:
self.gpg_home = str(gpg_home)
self.gpg = gnupg.GPG(gnupghome=self.gpg_home)
self._testUser = "testuser@example.com"
# Generate a new key
@@ -36,9 +39,9 @@ class MessageEncryptor:
)
self.gpg.gen_key(input_data)
def kill_agent(self) -> None:
"""
Kill the gpg-agent so pytest can remove the GPG home.
This uses gpgconf to properly terminate the agent, which is the officially
recommended cleanup method from the GnuPG project. python-gnupg does not
@@ -57,9 +60,6 @@ class MessageEncryptor:
# gpgconf not found or hung - agent will timeout eventually
pass
@staticmethod
def get_email_body_without_headers(email_message: Message) -> bytes:
"""
@@ -74,7 +74,7 @@ class MessageEncryptor:
]
return message_copy.as_bytes()
def encrypt(self, message) -> MailMessage:
original_email: email.message.Message = message.obj
encrypted_data = self.gpg.encrypt(
self.get_email_body_without_headers(original_email),
@@ -104,152 +104,181 @@ class MessageEncryptor:
)
new_email.attach(encrypted_part)
return MailMessage(
[(f"UID {message.uid}".encode(), new_email.as_bytes())],
)
@pytest.fixture(scope="session")
def message_encryptor(
tmp_path_factory: pytest.TempPathFactory,
) -> Generator[MessageEncryptor, None, None]:
"""
Session-scoped: GPG keypair generation is slow (~1s+), and nothing in
these tests mutates the keyring after creation. The GPG home directory
comes from `tmp_path_factory` so pytest cleans it up at session end;
we still kill the gpg-agent ourselves so the dir is removable.
"""
gpg_home = tmp_path_factory.mktemp("gpg-home")
encryptor = MessageEncryptor(gpg_home)
yield encryptor
encryptor.kill_agent()
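The yield-style fixture above runs everything after `yield` as teardown once the last dependent test finishes; a minimal standalone illustration of that setup/teardown ordering, using a context manager as a stand-in (the `session_resource` name and `events` log are hypothetical):

```python
import contextlib

events = []

@contextlib.contextmanager
def session_resource():
    # Setup half: analogous to the code before `yield` in the fixture.
    events.append("created")
    yield "resource"
    # Teardown half: runs after use, like encryptor.kill_agent() above.
    events.append("killed")

with session_resource() as r:
    events.append(f"used {r}")

# events == ["created", "used resource", "killed"]
```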
@pytest.fixture()
def gpg_settings(settings, message_encryptor: MessageEncryptor):
settings.EMAIL_GNUPG_HOME = message_encryptor.gpg_home
settings.EMAIL_ENABLE_GPG_DECRYPTOR = True
return settings
@pytest.fixture()
def encrypted_pair(mail_mocker, message_encryptor: MessageEncryptor):
"""
Build a (encrypted, plaintext) MailMessage pair sharing the same UID and
headers, with two PDF attachments on the plaintext side.
"""
plaintext = mail_mocker.messageBuilder.create_message(
body="Test message with 2 attachments",
attachments=[
_AttachmentDef(filename="f1.pdf", disposition="inline"),
_AttachmentDef(filename="f2.pdf"),
],
)
encrypted = message_encryptor.encrypt(plaintext)
return encrypted, plaintext
class TestMailMessageDecryptorAbleToRun:
"""`MailMessageDecryptor.able_to_run()` configuration matrix — no DB needed."""
@pytest.mark.parametrize(
("settings_overrides", "expected"),
[
pytest.param(
{
"EMAIL_GNUPG_HOME": "_gpg_home_marker",
"EMAIL_ENABLE_GPG_DECRYPTOR": True,
},
True,
id="enabled-with-valid-home",
),
pytest.param(
{"EMAIL_GNUPG_HOME": None, "EMAIL_ENABLE_GPG_DECRYPTOR": True},
True,
id="enabled-with-default-home",
),
pytest.param(
{"EMAIL_ENABLE_GPG_DECRYPTOR": False},
False,
id="disabled",
),
pytest.param(
{
"EMAIL_ENABLE_GPG_DECRYPTOR": True,
"EMAIL_GNUPG_HOME": "_)@# notapath &%#$",
},
False,
id="enabled-with-bogus-path",
),
],
)
def test_able_to_run(
self,
settings,
message_encryptor: MessageEncryptor,
settings_overrides: dict,
*,
expected: bool,
) -> None:
for key, value in settings_overrides.items():
if value == "_gpg_home_marker":
value = message_encryptor.gpg_home
setattr(settings, key, value)
assert MailMessageDecryptor.able_to_run() is expected
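The `"_gpg_home_marker"` sentinel above works around the fact that parametrize values are evaluated at collection time, before any fixture (and hence the real GPG home path) exists; the test swaps the marker for the runtime value before applying the overrides. A standalone sketch of that substitution step (the `resolve_overrides` helper name is hypothetical):

```python
def resolve_overrides(overrides: dict, sentinels: dict) -> dict:
    """Replace sentinel strings in parametrized override values with
    runtime values that were not known at collection time."""
    return {
        key: sentinels.get(value, value) if isinstance(value, str) else value
        for key, value in overrides.items()
    }

# Mirrors the "enabled-with-valid-home" case from the parametrize table.
resolved = resolve_overrides(
    {"EMAIL_GNUPG_HOME": "_gpg_home_marker", "EMAIL_ENABLE_GPG_DECRYPTOR": True},
    {"_gpg_home_marker": "/tmp/example-gpg-home"},
)
```

Non-string values (booleans, `None`) pass through untouched, so the same loop handles every parametrized case.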
@pytest.mark.django_db
class TestMailMessageDecryptor:
"""End-to-end decrypt and consumption flow with a real GPG keyring."""
def test_fails_at_initialization(self, settings, mocker) -> None:
settings.EMAIL_ENABLE_GPG_DECRYPTOR = True
mocker.patch(
"gnupg.GPG.__init__",
side_effect=OSError("Cannot find 'gpg' binary"),
)
handler = MailAccountHandler()
assert len(handler._message_preprocessors) == 0
def test_decrypt_fails(self, settings, encrypted_pair, tmp_path: Path) -> None:
"""
A decryptor pointed at a fresh empty GPG home cannot decrypt the
message; ensure it surfaces an exception rather than silently passing
bytes through.
"""
encrypted_message, _ = encrypted_pair
empty_gpg_home = tmp_path / "empty-gpg-home"
empty_gpg_home.mkdir()
settings.EMAIL_ENABLE_GPG_DECRYPTOR = True
settings.EMAIL_GNUPG_HOME = str(empty_gpg_home)
decryptor = MailMessageDecryptor()
try:
with pytest.raises(Exception):
decryptor.run(encrypted_message)
finally:
# Clean up the temporary GPG home used only by this test
try:
subprocess.run(
["gpgconf", "--kill", "gpg-agent"],
env={"GNUPGHOME": str(empty_gpg_home)},
check=False,
capture_output=True,
timeout=5,
)
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
def test_decrypt_encrypted_mail(self, gpg_settings, encrypted_pair) -> None:
"""
Creates a mail with attachments. Then encrypts it with a new key.
Verifies that this encrypted message can be decrypted with attachments intact.
"""
encrypted_message, plaintext = encrypted_pair
headers = plaintext.headers
text = plaintext.text
assert len(encrypted_message.attachments) == 1
assert encrypted_message.attachments[0].filename == "encrypted.asc"
assert encrypted_message.text == ""
decryptor = MailMessageDecryptor()
assert decryptor.able_to_run()
decrypted = decryptor.run(encrypted_message)
assert len(decrypted.attachments) == 2
assert decrypted.attachments[0].filename == "f1.pdf"
assert decrypted.attachments[1].filename == "f2.pdf"
assert decrypted.headers == headers
assert decrypted.text == text
assert decrypted.uid == plaintext.uid
def test_handle_encrypted_message(
self,
gpg_settings,
mail_mocker,
message_encryptor: MessageEncryptor,
) -> None:
plaintext = mail_mocker.messageBuilder.create_message(
subject="the message title",
from_="Myself",
attachments=2,
body="Test mail",
)
encrypted = message_encryptor.encrypt(plaintext)
account = MailAccountFactory()
rule = MailRule(
@@ -259,18 +288,17 @@ class TestMailMessageGpgDecryptor(TestMail):
)
rule.save()
handler = MailAccountHandler()
result = handler._handle_message(encrypted, rule)
assert result == 3
mail_mocker._queue_consumption_tasks_mock.assert_called()
mail_mocker.assert_queue_consumption_tasks_call_args(
[
[
{
"override_title": plaintext.subject,
"override_filename": f"{plaintext.subject}.eml",
},
],
[
Generated
+87 -51
@@ -2199,6 +2199,18 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/4e/d1/4bb0b80f4057903110060f617ef519197194b3ff5dd6153d850c8f5676fa/llama_index_embeddings_openai-0.6.0-py3-none-any.whl", hash = "sha256:039bb1007ad4267e25ddb89a206dfdab862bfb87d58da4271a3919e4f9df4d61", size = 7666, upload-time = "2026-03-12T20:21:28.079Z" },
]
[[package]]
name = "llama-index-embeddings-openai-like"
version = "0.3.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "llama-index-embeddings-openai", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/b2/df/79e4748196213b55931d5f8377141fff41135f5988d5501860824cc95390/llama_index_embeddings_openai_like-0.3.1.tar.gz", hash = "sha256:cef7af4bce284e8e6730532dbd0aa325e77398a5d5524edb2d2e3acb122fb5b6", size = 3854, upload-time = "2026-03-13T16:15:20.647Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/01/8e/b9ea889f88318f2faa20b615989e12a15a133c9273630f9266fcf69f35a6/llama_index_embeddings_openai_like-0.3.1-py3-none-any.whl", hash = "sha256:167c7e462cde7d53ea907ceaffbbf10a750676c7c9f7bcc9bc9686a41921387a", size = 3631, upload-time = "2026-03-13T16:15:19.58Z" },
]
[[package]]
name = "llama-index-instrumentation"
version = "0.4.2"
@@ -2238,6 +2250,19 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/63/62/a847e9a94c2f92926c30188259f9f86e019dcc45122bbb222dea35a74c02/llama_index_llms_openai-0.7.5-py3-none-any.whl", hash = "sha256:c302c6386873420df3714c3d538f45379b6de27ab6a531f30c67419b39a538f5", size = 28492, upload-time = "2026-03-30T16:30:32.979Z" },
]
[[package]]
name = "llama-index-llms-openai-like"
version = "0.7.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "llama-index-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-llms-openai", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/3b/a3/16410b28d131aa113ada79f856b78cb68a8e92a1e27255ea9c36c27a5dec/llama_index_llms_openai_like-0.7.2.tar.gz", hash = "sha256:ed9ff73f975dce470f98ac61c982151ba78eedfa3fb9b03894bc1d1312b213ff", size = 5389, upload-time = "2026-04-23T23:05:32.525Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f4/0c/fdddaee5391d915d3d568d2d8dbdb7c95647e65bb94d4ddb31d47cef5daf/llama_index_llms_openai_like-0.7.2-py3-none-any.whl", hash = "sha256:1f45a7b1cec8fb3f5997684327ffe6c19f93e789c2fff35dc5522465850faf0b", size = 6602, upload-time = "2026-04-23T23:05:31.708Z" },
]
[[package]]
name = "llama-index-vector-stores-faiss"
version = "0.6.0"
@@ -2881,9 +2906,9 @@ dependencies = [
{ name = "langdetect", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-embeddings-huggingface", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-embeddings-openai", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-embeddings-openai-like", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-llms-ollama", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-llms-openai", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-llms-openai-like", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "llama-index-vector-stores-faiss", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "nltk", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "ocrmypdf", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
@@ -2987,6 +3012,7 @@ typing = [
{ name = "types-python-dateutil", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "types-pytz", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "types-redis", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "types-regex", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "types-setuptools", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
@@ -3030,9 +3056,9 @@ requires-dist = [
{ name = "langdetect", specifier = "~=1.0.9" },
{ name = "llama-index-core", specifier = ">=0.14.12" },
{ name = "llama-index-embeddings-huggingface", specifier = ">=0.6.1" },
{ name = "llama-index-embeddings-openai", specifier = ">=0.5.1" },
{ name = "llama-index-embeddings-openai-like", specifier = ">=0.2.2" },
{ name = "llama-index-llms-ollama", specifier = ">=0.9.1" },
{ name = "llama-index-llms-openai", specifier = ">=0.6.13" },
{ name = "llama-index-llms-openai-like", specifier = ">=0.7.1" },
{ name = "llama-index-vector-stores-faiss", specifier = ">=0.5.2" },
{ name = "mysqlclient", marker = "extra == 'mariadb'", specifier = "~=2.2.7" },
{ name = "nltk", specifier = "~=3.9.1" },
@@ -3056,7 +3082,7 @@ requires-dist = [
{ name = "scikit-learn", specifier = "~=1.8.0" },
{ name = "sentence-transformers", specifier = ">=4.1" },
{ name = "setproctitle", specifier = "~=1.3.4" },
{ name = "tantivy", specifier = ">=0.25.1" },
{ name = "tantivy", specifier = "~=0.26.0" },
{ name = "tika-client", specifier = "~=0.11.0" },
{ name = "torch", specifier = "~=2.11.0", index = "https://download.pytorch.org/whl/cpu" },
{ name = "watchfiles", specifier = ">=1.1.1" },
@@ -3125,6 +3151,7 @@ typing = [
{ name = "types-python-dateutil" },
{ name = "types-pytz" },
{ name = "types-redis" },
{ name = "types-regex" },
{ name = "types-setuptools" },
]
@@ -4660,30 +4687,30 @@ wheels = [
[[package]]
name = "tantivy"
version = "0.25.1"
version = "0.26.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/1b/f9/0cd3955d155d3e3ef74b864769514dd191e5dacba9f0beb7af2d914942ce/tantivy-0.25.1.tar.gz", hash = "sha256:68a3314699a7d18fcf338b52bae8ce46a97dde1128a3e47e33fa4db7f71f265e", size = 75120, upload-time = "2025-12-02T11:57:12.997Z" }
sdist = { url = "https://files.pythonhosted.org/packages/57/74/ec8c3f7bb3599af86c19f1a774c37e36a6e7524d3563f3aeb99220981f6f/tantivy-0.26.0.tar.gz", hash = "sha256:7c9507fcc62bac4ef1d40b1ed37ff7fa07e44b5043b30288f63bcf4fdc62644a", size = 93615, upload-time = "2026-04-29T11:51:31.115Z" }
wheels = [
-    { url = "https://files.pythonhosted.org/packages/4e/7a/8a277f377e8a151fc0e71d4ffc1114aefb6e5e1c7dd609fed0955cf34ed8/tantivy-0.25.1-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:d363d7b4207d3a5aa7f0d212420df35bed18bdb6bae26a2a8bd57428388b7c29", size = 7637033, upload-time = "2025-12-02T11:56:18.104Z" },
-    { url = "https://files.pythonhosted.org/packages/71/31/8b4acdedfc9f9a2d04b1340d07eef5213d6f151d1e18da0cb423e5f090d2/tantivy-0.25.1-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:8f4389cf1d889a1df7c5a3195806b4b56c37cee10d8a26faaa0dea35a867b5ff", size = 3932180, upload-time = "2025-12-02T11:56:19.833Z" },
-    { url = "https://files.pythonhosted.org/packages/2f/dc/3e8499c21b4b9795e8f2fc54c68ce5b92905aaeadadaa56ecfa9180b11b1/tantivy-0.25.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:99864c09fc54652c3c2486cdf13f86cdc8200f4b481569cb291e095ca5d496e5", size = 4197620, upload-time = "2025-12-02T11:56:21.496Z" },
-    { url = "https://files.pythonhosted.org/packages/f8/8e/f2ce62fffc811eb62bead92c7b23c2e218f817cbd54c4f3b802e03ba1438/tantivy-0.25.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:05abf37ddbc5063c575548be0d62931629c086bff7a5a1b67cf5a8f5ebf4cd8c", size = 4183794, upload-time = "2025-12-02T11:56:23.215Z" },
-    { url = "https://files.pythonhosted.org/packages/41/e7/6849c713ed0996c7628324c60512c4882006f0a62145e56c624a93407f90/tantivy-0.25.1-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:90fd919e5f611809f746560ecf36eb9be824dec62e21ae17a27243759edb9aa1", size = 7621494, upload-time = "2025-12-02T11:56:27.069Z" },
-    { url = "https://files.pythonhosted.org/packages/c5/22/c3d8294600dc6e7fa350daef9ff337d3c06e132b81df727de9f7a50c692a/tantivy-0.25.1-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:4613c7cf6c23f3a97989819690a0f956d799354957de7a204abcc60083cebe02", size = 3925219, upload-time = "2025-12-02T11:56:29.403Z" },
-    { url = "https://files.pythonhosted.org/packages/41/fc/cbb1df71dd44c9110eff4eaaeda9d44f2d06182fe0452193be20ddfba93f/tantivy-0.25.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c477bd20b4df804d57dfc5033431bef27cde605695ae141b03abbf6ebc069129", size = 4198699, upload-time = "2025-12-02T11:56:31.359Z" },
-    { url = "https://files.pythonhosted.org/packages/47/4d/71abb78b774073c3ce12a4faa4351a9d910a71ffa3659526affba163873d/tantivy-0.25.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f9b1a1ba1113c523c7ff7b10f282d6c4074006f7ef8d71e1d973d51bf7291ddb", size = 4183585, upload-time = "2025-12-02T11:56:33.317Z" },
-    { url = "https://files.pythonhosted.org/packages/3d/25/73cfbcf1a8ea49be6c42817431cac46b70a119fe64da903fcc2d92b5b511/tantivy-0.25.1-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:f51ff7196c6f31719202080ed8372d5e3d51e92c749c032fb8234f012e99744c", size = 7622530, upload-time = "2025-12-02T11:56:36.839Z" },
-    { url = "https://files.pythonhosted.org/packages/12/c8/c0d7591cdf4f7e7a9fc4da786d1ca8cd1aacffaa2be16ea6d401a8e4a566/tantivy-0.25.1-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:550e63321bfcacc003859f2fa29c1e8e56450807b3c9a501c1add27cfb9236d9", size = 3925637, upload-time = "2025-12-02T11:56:38.425Z" },
-    { url = "https://files.pythonhosted.org/packages/3a/09/bedfc223bffec7641b417dd7ab071134b2ef8f8550e9b1fb6014657ef52e/tantivy-0.25.1-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fde31cc8d6e122faf7902aeea32bc008a429a6e8904e34d3468126a3ec01b016", size = 4197322, upload-time = "2025-12-02T11:56:40.411Z" },
-    { url = "https://files.pythonhosted.org/packages/f5/f1/1fa5183500c8042200c9f2b840d34f5bbcfb434a1ee750e7132262d2a5c9/tantivy-0.25.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b11bd5a518b0be645320b47af8493f6a40c4f3234313e37adcf4534a564d27dd", size = 4183143, upload-time = "2025-12-02T11:56:42.048Z" },
-    { url = "https://files.pythonhosted.org/packages/8b/2f/581519492226f97d23bd0adc95dad991ebeaa73ea6abc8bff389a3096d9a/tantivy-0.25.1-cp313-cp313t-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:dae99e75b7eaa9bf5bd16ab106b416370f08c135aed0e117d62a3201cd1ffe36", size = 7610316, upload-time = "2025-12-02T11:56:45.927Z" },
-    { url = "https://files.pythonhosted.org/packages/91/40/5d7bc315ab9e6a22c5572656e8ada1c836cfa96dccf533377504fbc3c9d9/tantivy-0.25.1-cp313-cp313t-macosx_10_12_x86_64.whl", hash = "sha256:506e9533c5ef4d3df43bad64ffecc0aa97c76e361ea610815dc3a20a9d6b30b3", size = 3919882, upload-time = "2025-12-02T11:56:48.469Z" },
-    { url = "https://files.pythonhosted.org/packages/02/b9/e0ef2f57a6a72444cb66c2ffbc310ab33ffaace275f1c4b0319d84ea3f18/tantivy-0.25.1-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5dbd4f8f264dacbcc9dee542832da2173fd53deaaea03f082d95214f8b5ed6bc", size = 4196031, upload-time = "2025-12-02T11:56:50.151Z" },
-    { url = "https://files.pythonhosted.org/packages/1e/02/bf3f8cacfd08642e14a73f7956a3fb95d58119132c98c121b9065a1f8615/tantivy-0.25.1-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:824c643ccb640dd9e35e00c5d5054ddf3323f56fe4219d57d428a9eeea13d22c", size = 4183437, upload-time = "2025-12-02T11:56:51.818Z" },
-    { url = "https://files.pythonhosted.org/packages/ff/44/9f1d67aa5030f7eebc966c863d1316a510a971dd8bb45651df4acdfae9ed/tantivy-0.25.1-cp314-cp314-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:7f5d29ae85dd0f23df8d15b3e7b341d4f9eb5a446bbb9640df48ac1f6d9e0c6c", size = 7623723, upload-time = "2025-12-02T11:56:55.066Z" },
-    { url = "https://files.pythonhosted.org/packages/db/30/6e085bd3ed9d12da3c91c185854abd70f9dfd35fb36a75ea98428d42c30b/tantivy-0.25.1-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:f2d2938fb69a74fc1bb36edfaf7f0d1596fa1264db0f377bda2195c58bcb6245", size = 3926243, upload-time = "2025-12-02T11:56:57.058Z" },
-    { url = "https://files.pythonhosted.org/packages/32/f5/a00d65433430f51718e5cc6938df571765d7c4e03aedec5aef4ab567aa9b/tantivy-0.25.1-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4f5ff124c4802558e627091e780b362ca944169736caba5a372eef39a79d0ae0", size = 4207186, upload-time = "2025-12-02T11:56:58.803Z" },
-    { url = "https://files.pythonhosted.org/packages/19/63/61bdb12fc95f2a7f77bd419a5149bfa9f28caa76cb569bf2b6b06e1d033e/tantivy-0.25.1-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:43b80ef62a340416139c93d19264e5f808da48e04f9305f1092b8ed22be0a5be", size = 4187312, upload-time = "2025-12-02T11:57:00.595Z" },
+    { url = "https://files.pythonhosted.org/packages/3d/94/832f4ee26ed2b3aae69923e827183cb28af8239c893430fbe6240351df9c/tantivy-0.26.0-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:5748304687042b2c643e30b646047f254e8a21441515929ea0e54c6de0ed493b", size = 8335068, upload-time = "2026-04-29T11:50:38.042Z" },
+    { url = "https://files.pythonhosted.org/packages/d2/dc/88eb02ad37acb5b022ded4fdad984562b2246ec28fe16d55580e678a1251/tantivy-0.26.0-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:d8617af05dfc030acca0d93c14594b247e18550d803359946cb8cf76620a767e", size = 4298822, upload-time = "2026-04-29T11:50:40.27Z" },
+    { url = "https://files.pythonhosted.org/packages/60/85/5a553d130be2704302c807fccebad575241f2a7bce2b765240ad9d4c697f/tantivy-0.26.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3dbb17edf7ee25c77f60e7b757488bb8a127c28fc6cac3d8484e2d1ae862d77f", size = 4612952, upload-time = "2026-04-29T11:50:42.644Z" },
+    { url = "https://files.pythonhosted.org/packages/9a/c9/36c3e17d0a8efb2fab1ce0c748ddd1aa3234ead83aaa011fd5c6c780e043/tantivy-0.26.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:62cfb03e3ca4cb0f41213acf0a3fd04ffe7f359ca6e3e91cab37c06799e68d3c", size = 4567017, upload-time = "2026-04-29T11:50:44.943Z" },
+    { url = "https://files.pythonhosted.org/packages/88/34/fbdcbcb6862cdea4f2ed7d62b99c4785b89560c1ae72558a6cec724cf64c/tantivy-0.26.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:a4676c1989d8d7e61ebc4d6d4b46224fa9351eedb6e077a815a5f7233c4e20b6", size = 8301385, upload-time = "2026-04-29T11:50:48.769Z" },
+    { url = "https://files.pythonhosted.org/packages/d0/40/09e6a400d2505024a412d4c2a67f58ece4703799a53ba432215b90de3835/tantivy-0.26.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:df1e2f21f69ffe9ea12ca7f3f3900eadea4670cf6f8c15aa4fec0bb3a1185577", size = 4284335, upload-time = "2026-04-29T11:50:51.201Z" },
+    { url = "https://files.pythonhosted.org/packages/2f/9d/1e7db895aa5cfac5609bd03cc7bf44a6f8b9f2892baf9225f9189121316d/tantivy-0.26.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d2d22d53c9fd718b6d74eea3b0e8a26ec8bd7ca8a594ba15c7a38ac5e9e3d33b", size = 4610238, upload-time = "2026-04-29T11:50:53.01Z" },
+    { url = "https://files.pythonhosted.org/packages/f8/05/205758c98a5e450ec22c08904dbf3fe10dce4d39c3e1c6a05c9daaa436be/tantivy-0.26.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ff0765cfbc9e10e96b71c78074e29f25fee388e5bc8a0a7c8aa166bf80304769", size = 4570688, upload-time = "2026-04-29T11:50:55.057Z" },
+    { url = "https://files.pythonhosted.org/packages/fb/56/aef45e8ec7fddbca4885516bc1d8cc61950f666ba8b44ef9e50b8db51f91/tantivy-0.26.0-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:7509b06fab07d4209bf37759e3c0c407f6c53fa3184d694ec82416fc8d189e7d", size = 8301236, upload-time = "2026-04-29T11:50:59.299Z" },
+    { url = "https://files.pythonhosted.org/packages/ef/3b/8ce1a1662e6e6c303a65055a42206853adfeaa14596e62b6f218b5af5526/tantivy-0.26.0-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:0c77962afc03f8a7081991fee088d09891acaa3401cdf882b1cc40d9d839a683", size = 4284342, upload-time = "2026-04-29T11:51:01.626Z" },
+    { url = "https://files.pythonhosted.org/packages/32/15/4f4e74669bdc26e7508d527f27be1a12d249e1d9ab6bba05f47ba1cdedfa/tantivy-0.26.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9f017f6991752da8092c46c670e0eae349a1266d2315720ff5b252ed1ea4acd4", size = 4610161, upload-time = "2026-04-29T11:51:03.966Z" },
+    { url = "https://files.pythonhosted.org/packages/50/3e/4648f7fc34834f3f4c99465111f50add73f0517f21ceae7c81771942c99b/tantivy-0.26.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bb7df28cc98497c767b86ebc5f167e3f2552739f748a1c28c10e14e7db726cd1", size = 4571125, upload-time = "2026-04-29T11:51:06.162Z" },
+    { url = "https://files.pythonhosted.org/packages/59/3d/a85ffd178a6b00813cd144e4af2db100023a5a7cd9cae3a5f998e74d5cbd/tantivy-0.26.0-cp313-cp313t-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:7b9a350e59513e330fbcb0930ef2f574d86a8aa46047a1ad1c5b4c91838e3aec", size = 8296228, upload-time = "2026-04-29T11:51:10.844Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/81/c78c3ae37c52e1244da340139e11cca0d44ee742e227bba19ecefbbf54b5/tantivy-0.26.0-cp313-cp313t-macosx_10_12_x86_64.whl", hash = "sha256:e1bdf6dcbd25fdc3244e6b09f5114253a81ec533ad5346eff135ee98668f04c7", size = 4279507, upload-time = "2026-04-29T11:51:13.065Z" },
+    { url = "https://files.pythonhosted.org/packages/39/df/a409dad06800793d7dd2cc831aeb46b808b4ce00df7d506e6760acee2cc2/tantivy-0.26.0-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a9450186861d6f350ec5969ff2e349377d2ed5617f084a8ad0f303ed7a91a2e3", size = 4605550, upload-time = "2026-04-29T11:51:15.289Z" },
+    { url = "https://files.pythonhosted.org/packages/bf/24/68245511021df5b19879d262ef2b1adf214ec67cc69761e1a83fd28571b1/tantivy-0.26.0-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e69569ec68c0e8d1c3aa3cf52423ced6b9b9b5bb0156e8b32088390919d41964", size = 4566249, upload-time = "2026-04-29T11:51:17.398Z" },
+    { url = "https://files.pythonhosted.org/packages/98/53/0c9ec136930dad07c8f24cd31210181e7ba2228aeea9fd75d751b9d716e5/tantivy-0.26.0-cp314-cp314-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:7983206d75c7334fdcb1c49372adf6afe8aeb89ad2abfc7a5fd9701254735134", size = 8319819, upload-time = "2026-04-29T11:51:21.451Z" },
+    { url = "https://files.pythonhosted.org/packages/10/5b/35b7f2af5101883dde8f0523a69944bd5b2471ce399339e97e2e3775a227/tantivy-0.26.0-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:4a970d35402612208b077de5feb9da978380c173e7835335c50d15d0f13cbaba", size = 4289015, upload-time = "2026-04-29T11:51:23.435Z" },
+    { url = "https://files.pythonhosted.org/packages/ef/4b/fbc1293fc8108eaa2f11718d77070e09cefe4aac4055fd2036548a776946/tantivy-0.26.0-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:22f6acf0c4d9dff2aecb161071db921c9a0bd3e2485dbc8304fd4561fe50c6a7", size = 4622332, upload-time = "2026-04-29T11:51:25.483Z" },
+    { url = "https://files.pythonhosted.org/packages/86/68/d1ae76c42b523b076434a357b675c37fd9684ac0fc6e0e63f703a1f1014a/tantivy-0.26.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a92825f3215fc4b5ef55da8a577eb9ad30c7ee5c7371daea82329e67aa4c7d12", size = 4573494, upload-time = "2026-04-29T11:51:27.512Z" },
 ]
 
 [[package]]
@@ -4919,12 +4946,12 @@ dependencies = [
     { name = "typing-extensions", marker = "sys_platform == 'darwin'" },
 ]
 wheels = [
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:d75eadcd97fe0dc7cd0eedc4d72152484c19cb2cfe46ce55766c8e129116425f" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:43b35116802c85fb88d99f4a396b8bd4472bfca1dd82e69499e5a4f9b8b4e252" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:442ec9dc78592564fdad69cf0beaa9da2f82ab810ccb4f13903869a90bf3f15d" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:cc3a195701bba2239c313ee311487f80f8aaebe9e89b9073dddbcf2f93b5a0ba" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:072a0d6e4865e8b0dc0dbfe6ebed68fae235124222835ef03e5814d414d8c012" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:23ec7789017da9d95b6d543d790814785e6f30905c5443efa8257d1490d73f79" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:d75eadcd97fe0dc7cd0eedc4d72152484c19cb2cfe46ce55766c8e129116425f", upload-time = "2026-03-23T15:16:54Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:43b35116802c85fb88d99f4a396b8bd4472bfca1dd82e69499e5a4f9b8b4e252", upload-time = "2026-03-23T15:16:58Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:442ec9dc78592564fdad69cf0beaa9da2f82ab810ccb4f13903869a90bf3f15d", upload-time = "2026-03-23T15:17:02Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:cc3a195701bba2239c313ee311487f80f8aaebe9e89b9073dddbcf2f93b5a0ba", upload-time = "2026-03-23T15:17:06Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:072a0d6e4865e8b0dc0dbfe6ebed68fae235124222835ef03e5814d414d8c012", upload-time = "2026-03-23T15:17:10Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:23ec7789017da9d95b6d543d790814785e6f30905c5443efa8257d1490d73f79", upload-time = "2026-03-23T15:17:14Z" },
 ]
 
 [[package]]
@@ -4947,24 +4974,24 @@ dependencies = [
     { name = "typing-extensions", marker = "sys_platform == 'linux'" },
 ]
 wheels = [
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp311-cp311-linux_s390x.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp311-cp311-manylinux_2_28_aarch64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp311-cp311-manylinux_2_28_x86_64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp312-cp312-linux_s390x.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp312-cp312-manylinux_2_28_aarch64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp312-cp312-manylinux_2_28_x86_64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313-linux_s390x.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313-manylinux_2_28_aarch64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313-manylinux_2_28_x86_64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313t-linux_s390x.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313t-manylinux_2_28_aarch64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313t-manylinux_2_28_x86_64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314-linux_s390x.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314-manylinux_2_28_aarch64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314-manylinux_2_28_x86_64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314t-linux_s390x.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314t-manylinux_2_28_aarch64.whl" },
-    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314t-manylinux_2_28_x86_64.whl" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp311-cp311-linux_s390x.whl", hash = "sha256:5214b203ee187f8746c66f1378b72611b7c1e15c5cb325037541899e705ea24e", upload-time = "2026-04-27T21:55:40Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:46fbb0aa257bb781efbfad648f5b045c0e232573b661f1461593db61342e9096", upload-time = "2026-04-28T00:05:38Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:8a56a8c95531ef0e454510ba8bbd9d11dc7a9000337265210b10f6bfeacdd485", upload-time = "2026-04-28T00:05:47Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp312-cp312-linux_s390x.whl", hash = "sha256:2db3ae5404e32cb42b5fcbd94f13607761eaec0cf1687fde95095289d1e26cfb", upload-time = "2026-04-28T00:06:06Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:70ecb2659af6373b7c5336e692e665605b0201ea21ff51aaea47e1d75ea6b5aa", upload-time = "2026-04-28T00:06:14Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:f82e2ae20c1545bb03997d1cc3143d94e14b800038669ee1aca45808a9acc338", upload-time = "2026-04-28T00:06:24Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313-linux_s390x.whl", hash = "sha256:d1eff25ccc454faf21c9666c81bfab8e405e87c12d300708d4559620bc191a36", upload-time = "2026-04-28T00:06:42Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:48b3e21a311445acdd0b27f13830e21d93adef70d4721e051e9f059baeb9b8f9", upload-time = "2026-04-28T00:06:51Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:45025d7752dbc6b4c784c03afaee9c5f19730ce084b2e43fc9a2fe1677d9ff86", upload-time = "2026-04-28T00:07:02Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313t-linux_s390x.whl", hash = "sha256:65d427a196ab0abe359b93c5bffedd76ded02df2b1b1d2d9f11a2609b69f426a", upload-time = "2026-04-28T00:07:19Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:8f13dc7075ae04ca5f876a9f40b4e47522a04c23e30824b4409f42a3f3e57aa4", upload-time = "2026-04-28T00:07:27Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:8713bb8679376ea0ec25742100b6cfb8447e0904c48bddefb9eb0ac1abbfa60a", upload-time = "2026-04-28T00:07:37Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314-linux_s390x.whl", hash = "sha256:c9a14c367f470623b978e273a4e1915995b4ba7a0ae999178b06c273eea3536f", upload-time = "2026-04-28T00:07:54Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:71676f6a9a84bbd385e010198b51fa1c2324fb8f3c512a32d2c81af65f68f4c9", upload-time = "2026-04-28T00:08:02Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:f8481ea9088e4e5b81178a75aabdbb658bde8639bc1a15fd5d8f930abc966735", upload-time = "2026-04-28T00:08:11Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314t-linux_s390x.whl", hash = "sha256:825f1596878280a3a4c861441674888bc2d792e4ab7b045cb35feeab3f4f5dd7", upload-time = "2026-04-28T00:08:27Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:c8a0bdfb2fd915b6c2cd27c856f63f729c366a4917772eba6b2b02aa3bce70d5", upload-time = "2026-04-28T00:08:36Z" },
+    { url = "https://download-r2.pytorch.org/whl/cpu/torch-2.11.0%2Bcpu-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:768f22924a25cad2adeb9c6cbac5159e71067c8d4019b1511960d7435a5ca652", upload-time = "2026-04-28T00:08:47Z" },
 ]
 
 [[package]]
@@ -5208,6 +5235,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/55/82/7d25dce10aad92d2226b269bce2f85cfd843b4477cd50245d7d40ecf8f89/types_redis-4.6.0.20241004-py3-none-any.whl", hash = "sha256:ef5da68cb827e5f606c8f9c0b49eeee4c2669d6d97122f301d3a55dc6a63f6ed", size = 58737, upload-time = "2024-10-04T02:43:57.968Z" },
 ]
 
+[[package]]
+name = "types-regex"
+version = "2026.4.4.20260408"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/92/42/d7c691fc5a8a8ecfba3f23c1c4c087a089af0767610d88c29201193d8f60/types_regex-2026.4.4.20260408.tar.gz", hash = "sha256:86b2975ff11b06e7f538839821510daea2566d9cb18bb8acde47834315409cf9", size = 13182, upload-time = "2026-04-08T04:31:11.887Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e1/92/e109654a804d11d9b60d67c7b29d64b2beac6b2e3209ea075e268e5a1021/types_regex-2026.4.4.20260408-py3-none-any.whl", hash = "sha256:d436bcc409abf9b06747b7e038014afc6d40ef7b72329655c353a1955534068f", size = 11116, upload-time = "2026-04-08T04:31:11.01Z" },
+]
+
 [[package]]
 name = "types-setuptools"
 version = "80.10.0.20260124"