diff --git a/docs/superpowers/plans/2026-06-02-lancedb-vector-store.md b/docs/superpowers/plans/2026-06-02-lancedb-vector-store.md
deleted file mode 100644
index 94503abdd..000000000
--- a/docs/superpowers/plans/2026-06-02-lancedb-vector-store.md
+++ /dev/null
@@ -1,1721 +0,0 @@
-# LanceDB Vector Store Implementation Plan
-
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-
-**Goal:** Replace the AI feature's FAISS + `SimpleDocumentStore` + `SimpleIndexStore` llama-index storage with a single LanceDB table fronted by a custom `BasePydanticVectorStore` adapter, eliminating fake deletes, whole-file rewrites, the custom chat retriever, and pandas.
-
-**Architecture:** A new `paperless_ai/vector_store.py` defines `PaperlessLanceVectorStore`, a llama-index `BasePydanticVectorStore` talking to `lancedb` + `pyarrow` directly. `indexing.py` is rewired to build the index from that store alone (`VectorStoreIndex.from_vector_store`), add/update via atomic `merge_insert` upsert, remove via predicate delete, and query/similar/chat via stock retrievers with `MetadataFilters`. Disk bloat from MVCC is reclaimed with `optimize(cleanup_older_than=...)` folded into the scheduled `update_llm_index`.
-
-**Tech Stack:** Python 3.11+, Django, llama-index-core, lancedb 0.33.x, pyarrow, pytest + pytest-django + pytest-mock, factory-boy, uv.
-
-**Reference spec:** `docs/superpowers/specs/2026-06-02-lancedb-vector-store-design.md`
-
----
-
-## Conventions for this plan
-
-- Backend tests are **pytest-style, grouped in classes**, with `@pytest.mark.django_db` on the class when DB access is needed. Annotate fixture params, fixture return types, and test signatures. Use the `mocker` fixture (pytest-mock), not bare `patch`. Build models with `DocumentFactory` from `documents/tests/factories.py`. (See `CLAUDE.md`.)
-- Run a single test: `uv run pytest src/paperless_ai/tests/test_vector_store.py::TestClass::test_x -v`
-- Lint/format with the **global** `ruff`: `ruff check src/paperless_ai` and `ruff format src/paperless_ai` (not `uv run ruff`).
-- **Tests cannot be executed in the authoring session on this machine** — where a step says "Run … Expected: PASS", the implementer runs it and confirms before moving on.
-- Commit messages end with the trailer:
-  `Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>`
-
----
-
-## HARD CONSTRAINT: lazy imports of AI libraries
-
-**`llama_index`, `lancedb`, and `pyarrow` must never be imported at module-load time of any module reachable from a non-AI entry point.** A simple management command must not transitively drag in gigabytes of AI libraries. This is a known past regression and a hard requirement.
-
-The existing code already enforces this pattern, and this plan must preserve it:
-
-- `documents/tasks.py` imports `paperless_ai.indexing` **at module top**, and management commands import `documents.tasks` / `documents.models`. So anything `indexing.py` pulls at module-load time lands in the light path.
-- Today `indexing.py` and `embedding.py` keep all `llama_index` / `faiss` imports **function-local** (e.g. `build_document_node` imports `LlamaDocument` inside the function). Importing `indexing.py` does **not** import `llama_index`.
-
-Rules for this plan:
-
-1. `paperless_ai/vector_store.py` (the new adapter) **may** import `lancedb` / `pyarrow` / `llama_index` at its top level — it is pure AI code.
-2. **`indexing.py` must import `vector_store` only inside functions** (e.g. inside `get_vector_store()`), never at module top. Use `if TYPE_CHECKING:` for type hints.
-3. **Any `llama_index` symbol used in `indexing.py` / `chat.py` (including `MetadataMode`, retrievers, filters) must be imported inside the function that uses it**, never at module top.
-4. Test modules under `paperless_ai/tests/` **may** import these at the top — they are AI tests.
-5. A subprocess guard test (Task 6) asserts that importing `documents.tasks` leaves `lancedb` / `pyarrow` / `llama_index` absent from `sys.modules`.
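-
-The guard in rule 5 could look roughly like the sketch below. It is illustrative only: `leaked_modules` is a hypothetical helper name, and stdlib modules (`json`, `statistics`, `fractions`) stand in for `documents.tasks` and the AI libraries so that the snippet runs anywhere. The actual Task 6 test may be structured differently.
-
-```python
-import json
-import subprocess
-import sys
-
-
-def leaked_modules(entry_module: str, forbidden: list[str]) -> list[str]:
-    """Import ``entry_module`` in a fresh interpreter and return the
-    ``forbidden`` top-level modules that ended up in ``sys.modules``."""
-    probe = (
-        "import importlib, json, sys\n"
-        f"importlib.import_module({entry_module!r})\n"
-        f"forbidden = {forbidden!r}\n"
-        "print(json.dumps([m for m in forbidden if m in sys.modules]))\n"
-    )
-    # A subprocess is required: the test runner itself has usually imported
-    # the heavy libraries already, so an in-process check would be meaningless.
-    result = subprocess.run(
-        [sys.executable, "-c", probe],
-        capture_output=True,
-        text=True,
-        check=True,
-    )
-    # The probe prints one JSON list as its last line of output.
-    return json.loads(result.stdout.strip().splitlines()[-1])
-
-
-if __name__ == "__main__":
-    # Stand-ins: ``statistics`` imports ``fractions`` at module top, ``json`` does not.
-    print(leaked_modules("json", ["fractions"]))
-    print(leaked_modules("statistics", ["fractions"]))
-```
-
-In the real guard the call would presumably be `leaked_modules("documents.tasks", ["lancedb", "pyarrow", "llama_index"])`, with Django settings configured in the child process, and the test would assert that the result is empty.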
-
----
-
-## File Structure
-
-- **Create** `src/paperless_ai/vector_store.py`: `PaperlessLanceVectorStore` adapter (schema, add, upsert_document, delete, get_nodes, query, `_build_where`, `maybe_create_ann_index`, `ensure_document_id_scalar_index`, `compact`). Single responsibility: the LanceDB ↔ llama-index storage boundary.
-- **Create** `src/paperless_ai/tests/test_vector_store.py` — adapter unit/integration tests.
-- **Modify** `src/paperless_ai/indexing.py` — factory + load/build/add/update/remove/similar functions rewired to the adapter; delete `get_or_create_storage_context`, `remove_document_docstore_nodes`; change `build_document_node`, `vector_store_file_exists`, `update_llm_index`, `llm_index_add_or_update_document`, `llm_index_remove_document`, `query_similar_documents`, `load_or_build_index`.
-- **Modify** `src/paperless_ai/chat.py` — delete `_get_document_filtered_retriever`; use stock `VectorIndexRetriever` with filters; switch the no-content pre-check to `store.get_nodes`.
-- **Modify** `src/documents/tasks.py` — call adapter compaction at the end of `update_llm_index` (via indexing helper) — no new beat task.
-- **Modify** `pyproject.toml` — drop `faiss-cpu`, `llama-index-vector-stores-faiss`; add `lancedb`, `pyarrow`.
-- **Modify** `src/paperless_ai/embedding.py` — add a `current_embedding_dim()` helper used by the dimension-mismatch guard (logic already mostly present in `get_embedding_dim`).
-- **Modify** `src/paperless_ai/tests/test_ai_indexing.py`, `src/paperless_ai/tests/test_chat.py` — update tests that referenced FAISS/docstore internals.
-
----
-
-## Task 1: Swap dependencies
-
-**Files:**
-
-- Modify: `pyproject.toml:45` (remove `faiss-cpu`), `pyproject.toml:60` (remove `llama-index-vector-stores-faiss`), and add `lancedb` + `pyarrow` in alphabetical position.
-
-- [ ] **Step 1: Remove the FAISS dependencies**
-
-In `pyproject.toml`, delete these two lines from the `dependencies` array:
-
-```toml
-  "faiss-cpu>=1.10",
-```
-
-```toml
-  "llama-index-vector-stores-faiss>=0.5.2",
-```
-
-- [ ] **Step 2: Add lancedb and pyarrow**
-
-In the same `dependencies` array, add (keep the array alphabetized — `lancedb` goes just before `langdetect`, `pyarrow` just before `python-dateutil`):
-
-```toml
-  "lancedb~=0.33.0",
-```
-
-```toml
-  "pyarrow>=16",
-```
-
-- [ ] **Step 3: Resolve the lockfile**
-
-Run: `uv sync --group dev`
-Expected: resolves and installs; `faiss-cpu` and `llama-index-vector-stores-faiss` are removed, `lancedb`/`pyarrow` added. No `pandas` is added.
-
-- [ ] **Step 4: Verify pandas is absent and lancedb imports**
-
-Run: `uv run python -c "import importlib.util as u; import lancedb, pyarrow; print('lancedb', lancedb.__version__); print('pandas present:', u.find_spec('pandas') is not None)"`
-Expected: prints the lancedb version and `pandas present: False`.
-
-- [ ] **Step 5: Verify multi-arch wheels resolved**
-
-Run: `uv pip show lancedb pyarrow`
-Expected: both shown with versions. This only confirms the install for the current platform. Linux x86_64 and aarch64 wheels exist for lancedb 0.33.x, so confirm the other architecture in the CI Docker build later.
-
-- [ ] **Step 6: Commit**
-
-```bash
-git add pyproject.toml uv.lock
-git commit -m "build: replace faiss-cpu with lancedb for the AI vector store
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 2: The adapter — schema, add, delete, get_nodes, query
-
-**Files:**
-
-- Create: `src/paperless_ai/vector_store.py`
-- Test: `src/paperless_ai/tests/test_vector_store.py`
-
-- [ ] **Step 1: Write the failing test for add → query round-trip**
-
-Create `src/paperless_ai/tests/test_vector_store.py`:
-
-```python
-from pathlib import Path
-
-import pytest
-from llama_index.core.schema import NodeRelationship
-from llama_index.core.schema import RelatedNodeInfo
-from llama_index.core.schema import TextNode
-from llama_index.core.vector_stores.types import FilterOperator
-from llama_index.core.vector_stores.types import MetadataFilter
-from llama_index.core.vector_stores.types import MetadataFilters
-from llama_index.core.vector_stores.types import VectorStoreQuery
-
-from paperless_ai.vector_store import PaperlessLanceVectorStore
-
-DIM = 8
-
-
-def _node(node_id: str, document_id: str, text: str, vec: float) -> TextNode:
-    # ``ref_doc_id`` is a read-only property derived from the SOURCE
-    # relationship, so set the relationship rather than assigning to it.
-    return TextNode(
-        id_=node_id,
-        text=text,
-        metadata={"document_id": document_id},
-        embedding=[vec] * DIM,
-        relationships={
-            NodeRelationship.SOURCE: RelatedNodeInfo(node_id=document_id),
-        },
-    )
-
-
-class TestPaperlessLanceVectorStoreCrud:
-    @pytest.fixture
-    def store(self, tmp_path: Path) -> PaperlessLanceVectorStore:
-        return PaperlessLanceVectorStore(uri=str(tmp_path / "idx"))
-
-    def test_add_then_query_returns_node(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        store.add([_node("1-0", "1", "alpha", 0.1), _node("2-0", "2", "beta", 0.9)])
-
-        result = store.query(
-            VectorStoreQuery(query_embedding=[0.1] * DIM, similarity_top_k=1),
-        )
-
-        assert len(result.nodes) == 1
-        assert result.nodes[0].metadata["document_id"] == "1"
-
-    def test_query_empty_table_returns_empty_no_raise(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        result = store.query(
-            VectorStoreQuery(query_embedding=[0.1] * DIM, similarity_top_k=5),
-        )
-        assert result.nodes == []
-        assert result.ids == []
-```
-
-- [ ] **Step 2: Run the test to verify it fails**
-
-Run: `uv run pytest src/paperless_ai/tests/test_vector_store.py -v`
-Expected: FAIL with `ModuleNotFoundError: No module named 'paperless_ai.vector_store'`.
-
-- [ ] **Step 3: Write the adapter (add/delete/get_nodes/query/client)**
-
-Create `src/paperless_ai/vector_store.py`:
-
-```python
-import json
-import logging
-from typing import Any
-
-import lancedb
-import pyarrow as pa
-from llama_index.core.bridge.pydantic import PrivateAttr
-from llama_index.core.schema import BaseNode
-from llama_index.core.vector_stores.types import BasePydanticVectorStore
-from llama_index.core.vector_stores.types import FilterCondition
-from llama_index.core.vector_stores.types import FilterOperator
-from llama_index.core.vector_stores.types import MetadataFilters
-from llama_index.core.vector_stores.types import VectorStoreQuery
-from llama_index.core.vector_stores.types import VectorStoreQueryResult
-from llama_index.core.vector_stores.utils import metadata_dict_to_node
-from llama_index.core.vector_stores.utils import node_to_metadata_dict
-
-logger = logging.getLogger("paperless_ai.vector_store")
-
-DEFAULT_TABLE_NAME = "documents"
-
-
-def _escape(value: str) -> str:
-    return str(value).replace("'", "''")
-
-
-def _build_where(filters: MetadataFilters | None) -> str | None:
-    """Translate the EQ / IN filters we use into a Lance SQL predicate on the
-    top-level ``document_id`` column."""
-    if filters is None or not filters.filters:
-        return None
-    clauses: list[str] = []
-    for f in filters.filters:
-        if f.operator == FilterOperator.IN:
-            vals = ",".join(f"'{_escape(v)}'" for v in f.value)
-            clauses.append(f"{f.key} IN ({vals})")
-        elif f.operator == FilterOperator.EQ:
-            clauses.append(f"{f.key} = '{_escape(f.value)}'")
-        else:  # pragma: no cover - we only ever build EQ/IN filters
-            raise NotImplementedError(f"Unsupported filter operator: {f.operator}")
-    joiner = " OR " if filters.condition == FilterCondition.OR else " AND "
-    return joiner.join(clauses)
-
-
-class PaperlessLanceVectorStore(BasePydanticVectorStore):
-    """A llama-index vector store backed directly by a LanceDB table.
-
-    Stores one row per node with the node id, its document id (both as the
-    ``ref_doc_id`` delete key ``doc_id`` and a top-level filter column
-    ``document_id``), the embedding, and the serialised node (text + metadata)
-    as JSON. ``stores_text`` lets llama-index run off this store alone, with no
-    separate docstore or index store.
-    """
-
-    stores_text: bool = True
-    flat_metadata: bool = True
-
-    _uri: str = PrivateAttr()
-    _table_name: str = PrivateAttr()
-    _conn: Any = PrivateAttr()
-    _table: Any = PrivateAttr()
-
-    def __init__(self, uri: str, table_name: str = DEFAULT_TABLE_NAME) -> None:
-        super().__init__()
-        self._uri = uri
-        self._table_name = table_name
-        self._conn = lancedb.connect(uri)
-        existing = list(self._conn.table_names())
-        self._table = (
-            self._conn.open_table(table_name) if table_name in existing else None
-        )
-
-    @property
-    def client(self) -> Any:
-        return self._conn
-
-    def table_exists(self) -> bool:
-        return self._table_name in list(self._conn.table_names())
-
-    def vector_dim(self) -> int | None:
-        if self._table is None:
-            return None
-        return self._table.schema.field("vector").type.list_size
-
-    def drop_table(self) -> None:
-        if self.table_exists():
-            self._conn.drop_table(self._table_name)
-        self._table = None
-
-    @staticmethod
-    def _schema(dim: int) -> pa.Schema:
-        return pa.schema(
-            [
-                pa.field("id", pa.string()),
-                pa.field("doc_id", pa.string()),
-                pa.field("document_id", pa.string()),
-                pa.field("vector", pa.list_(pa.float32(), dim)),
-                pa.field("node_content", pa.string()),
-            ],
-        )
-
-    def _row(self, node: BaseNode) -> dict[str, Any]:
-        meta = node_to_metadata_dict(
-            node,
-            remove_text=False,
-            flat_metadata=self.flat_metadata,
-        )
-        return {
-            "id": node.node_id,
-            "doc_id": node.ref_doc_id,
-            "document_id": str(node.metadata.get("document_id")),
-            "vector": node.get_embedding(),
-            "node_content": json.dumps(meta),
-        }
-
-    def add(self, nodes: list[BaseNode], **add_kwargs: Any) -> list[str]:
-        if not nodes:
-            return []
-        rows = [self._row(node) for node in nodes]
-        if self._table is None:
-            dim = len(nodes[0].get_embedding())
-            self._table = self._conn.create_table(
-                self._table_name,
-                rows,
-                schema=self._schema(dim),
-            )
-        else:
-            self._table.add(rows)
-        return [node.node_id for node in nodes]
-
-    def delete(self, ref_doc_id: str, **delete_kwargs: Any) -> None:
-        if self._table is not None:
-            self._table.delete(f"doc_id = '{_escape(ref_doc_id)}'")
-
-    def delete_nodes(
-        self,
-        node_ids: list[str] | None = None,
-        filters: MetadataFilters | None = None,
-        **delete_kwargs: Any,
-    ) -> None:
-        if self._table is None:
-            return
-        if node_ids:
-            ids = ",".join(f"'{_escape(n)}'" for n in node_ids)
-            self._table.delete(f"id IN ({ids})")
-        elif filters is not None:
-            where = _build_where(filters)
-            if where:
-                self._table.delete(where)
-
-    def _rows_to_nodes(self, rows: list[dict[str, Any]]) -> list[BaseNode]:
-        nodes: list[BaseNode] = []
-        for row in rows:
-            node = metadata_dict_to_node(json.loads(row["node_content"]))
-            node.embedding = list(row["vector"])
-            nodes.append(node)
-        return nodes
-
-    def get_nodes(
-        self,
-        node_ids: list[str] | None = None,
-        filters: MetadataFilters | None = None,
-        **kwargs: Any,
-    ) -> list[BaseNode]:
-        if self._table is None:
-            return []
-        query = self._table.search()
-        where = _build_where(filters)
-        if node_ids:
-            ids = ",".join(f"'{_escape(n)}'" for n in node_ids)
-            query = query.where(f"id IN ({ids})")
-        elif where:
-            query = query.where(where)
-        return self._rows_to_nodes(query.to_list())
-
-    def query(
-        self,
-        query: VectorStoreQuery,
-        **kwargs: Any,
-    ) -> VectorStoreQueryResult:
-        if self._table is None:
-            return VectorStoreQueryResult(nodes=[], similarities=[], ids=[])
-        top_k = query.similarity_top_k or 10
-        search = self._table.search(query.query_embedding).limit(top_k)
-        where = _build_where(query.filters)
-        if where:
-            search = search.where(where)
-        rows = search.to_list()
-        nodes = self._rows_to_nodes(rows)
-        # LanceDB returns squared-L2 distance; map to a descending similarity.
-        sims = [1.0 / (1.0 + float(row["_distance"])) for row in rows]
-        ids = [row["id"] for row in rows]
-        return VectorStoreQueryResult(nodes=nodes, similarities=sims, ids=ids)
-```
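-
-Lance filter strings are SQL expressions, where string literals take single quotes and an embedded quote is doubled, which is the convention `_escape` assumes. A minimal standalone sketch of that quoting, using hypothetical helper names (`sql_literal`, `in_predicate`, `eq_predicate`) that are not part of the adapter:
-
-```python
-def sql_literal(value: object) -> str:
-    """Render a value as a single-quoted SQL string literal."""
-    escaped = str(value).replace("'", "''")
-    return f"'{escaped}'"
-
-
-def in_predicate(column: str, values: list[str]) -> str:
-    """Build ``column IN ('a','b')`` from raw, unescaped values."""
-    return f"{column} IN ({','.join(sql_literal(v) for v in values)})"
-
-
-def eq_predicate(column: str, value: str) -> str:
-    """Build ``column = 'a'`` from a raw, unescaped value."""
-    return f"{column} = {sql_literal(value)}"
-
-
-if __name__ == "__main__":
-    print(in_predicate("document_id", ["1", "o'neil"]))
-    print(eq_predicate("document_id", "7"))
-```
-
-Keeping the delimiter and the escaping rule in one helper avoids the two drifting apart across `delete`, `delete_nodes`, and `get_nodes`.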
-
-- [ ] **Step 4: Run the tests to verify they pass**
-
-Run: `uv run pytest src/paperless_ai/tests/test_vector_store.py -v`
-Expected: PASS (both tests).
-
-- [ ] **Step 5: Add delete / filter / get_nodes / fresh-process tests**
-
-Append to `src/paperless_ai/tests/test_vector_store.py` inside `TestPaperlessLanceVectorStoreCrud`:
-
-```python
-    def test_delete_removes_all_chunks_of_document(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        store.add([_node("1-0", "1", "a", 0.1), _node("1-1", "1", "b", 0.2)])
-        store.add([_node("2-0", "2", "c", 0.9)])
-
-        store.delete("1")
-
-        assert store.client.open_table("documents").count_rows() == 1
-
-    def test_query_with_in_filter_scopes_results(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        store.add([_node("1-0", "1", "a", 0.1), _node("2-0", "2", "b", 0.1)])
-
-        result = store.query(
-            VectorStoreQuery(
-                query_embedding=[0.1] * DIM,
-                similarity_top_k=5,
-                filters=MetadataFilters(
-                    filters=[
-                        MetadataFilter(
-                            key="document_id",
-                            operator=FilterOperator.IN,
-                            value=["2"],
-                        ),
-                    ],
-                ),
-            ),
-        )
-
-        assert [n.metadata["document_id"] for n in result.nodes] == ["2"]
-
-    def test_get_nodes_filter_returns_empty_cleanly(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        store.add([_node("1-0", "1", "a", 0.1)])
-        nodes = store.get_nodes(
-            filters=MetadataFilters(
-                filters=[
-                    MetadataFilter(
-                        key="document_id",
-                        operator=FilterOperator.IN,
-                        value=["999"],
-                    ),
-                ],
-            ),
-        )
-        assert nodes == []
-
-    def test_fresh_instance_filters_existing_table(
-        self,
-        tmp_path: Path,
-    ) -> None:
-        uri = str(tmp_path / "idx")
-        PaperlessLanceVectorStore(uri=uri).add(
-            [_node("1-0", "1", "a", 0.1), _node("2-0", "2", "b", 0.1)],
-        )
-
-        reopened = PaperlessLanceVectorStore(uri=uri)
-        result = reopened.query(
-            VectorStoreQuery(
-                query_embedding=[0.1] * DIM,
-                similarity_top_k=5,
-                filters=MetadataFilters(
-                    filters=[
-                        MetadataFilter(
-                            key="document_id",
-                            operator=FilterOperator.IN,
-                            value=["1"],
-                        ),
-                    ],
-                ),
-            ),
-        )
-        assert [n.metadata["document_id"] for n in result.nodes] == ["1"]
-
-    def test_table_exists_and_drop(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        assert store.table_exists() is False
-        store.add([_node("1-0", "1", "a", 0.1)])
-        assert store.table_exists() is True
-        assert store.vector_dim() == DIM
-        store.drop_table()
-        assert store.table_exists() is False
-```
-
-- [ ] **Step 6: Run all adapter tests**
-
-Run: `uv run pytest src/paperless_ai/tests/test_vector_store.py -v`
-Expected: PASS (all 7 tests).
-
-- [ ] **Step 7: Lint and commit**
-
-```bash
-ruff check src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-ruff format src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-git add src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-git commit -m "feat(ai): add LanceDB-backed vector store adapter
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 3: Atomic upsert (`upsert_document`)
-
-**Files:**
-
-- Modify: `src/paperless_ai/vector_store.py`
-- Test: `src/paperless_ai/tests/test_vector_store.py`
-
-- [ ] **Step 1: Write the failing test for shrink-on-update pruning + single commit**
-
-Append a new class to `src/paperless_ai/tests/test_vector_store.py`:
-
-```python
-class TestPaperlessLanceVectorStoreUpsert:
-    @pytest.fixture
-    def store(self, tmp_path: Path) -> PaperlessLanceVectorStore:
-        s = PaperlessLanceVectorStore(uri=str(tmp_path / "idx"))
-        s.add(
-            [
-                _node("1-0", "1", "old0", 0.1),
-                _node("1-1", "1", "old1", 0.2),
-                _node("1-2", "1", "old2", 0.3),
-                _node("2-0", "2", "keep", 0.9),
-            ],
-        )
-        return s
-
-    def test_upsert_prunes_stale_chunks_and_keeps_others(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        store.upsert_document(
-            "1",
-            [_node("1-0", "1", "new0", 0.1), _node("1-1", "1", "new1", 0.2)],
-        )
-
-        table = store.client.open_table("documents")
-        doc1 = sorted(
-            r["id"] for r in table.search().where("document_id = '1'").to_list()
-        )
-        assert doc1 == ["1-0", "1-1"]  # 1-2 pruned
-        assert table.count_rows() == 3  # 2 new doc1 + 1 doc2
-
-    def test_upsert_is_single_commit(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        table = store.client.open_table("documents")
-        before = table.version
-        store.upsert_document("1", [_node("1-0", "1", "new0", 0.1)])
-        assert store.client.open_table("documents").version == before + 1
-```
-
-- [ ] **Step 2: Run to verify it fails**
-
-Run: `uv run pytest src/paperless_ai/tests/test_vector_store.py::TestPaperlessLanceVectorStoreUpsert -v`
-Expected: FAIL with `AttributeError: 'PaperlessLanceVectorStore' object has no attribute 'upsert_document'`.
-
-- [ ] **Step 3: Implement `upsert_document`**
-
-Add to `PaperlessLanceVectorStore` in `src/paperless_ai/vector_store.py`, after `add`:
-
-```python
-    def upsert_document(self, document_id: str, nodes: list[BaseNode]) -> list[str]:
-        """Atomically replace all stored chunks of ``document_id`` with ``nodes``.
-
-        A single ``merge_insert`` commit: matching node ids are updated, new ids
-        inserted, and any existing rows for this document that are not in the new
-        set are deleted (``when_not_matched_by_source_delete``). This prunes stale
-        trailing chunks when an edit reduces a document's chunk count, with no
-        transient empty state for concurrent lock-free readers.
-        """
-        if not nodes:
-            # No indexable content: treat as a removal.
-            self.delete(document_id)
-            return []
-        rows = [self._row(node) for node in nodes]
-        if self._table is None:
-            dim = len(nodes[0].get_embedding())
-            self._table = self._conn.create_table(
-                self._table_name,
-                rows,
-                schema=self._schema(dim),
-            )
-            return [node.node_id for node in nodes]
-        (
-            self._table.merge_insert("id")
-            .when_matched_update_all()
-            .when_not_matched_insert_all()
-            .when_not_matched_by_source_delete(
-                f"document_id = '{_escape(document_id)}'",
-            )
-            .execute(rows)
-        )
-        return [node.node_id for node in nodes]
-```
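-
-The row-level effect of that `merge_insert` chain can be modelled in plain Python. This is only a mental model over a dict keyed by `id`, not how LanceDB executes the merge, and `merge_upsert` is a hypothetical name:
-
-```python
-Row = dict[str, str]
-
-
-def merge_upsert(table: dict[str, Row], document_id: str, new_rows: list[Row]) -> dict[str, Row]:
-    """Return the table contents after upserting ``new_rows`` for one document."""
-    incoming = {row["id"]: row for row in new_rows}
-    merged: dict[str, Row] = {}
-    for row_id, row in table.items():
-        if row_id in incoming:
-            continue  # matched: replaced by the incoming row below
-        if row["document_id"] == document_id:
-            continue  # not matched by source and in scope: deleted
-        merged[row_id] = row  # other documents are left untouched
-    merged.update(incoming)  # matched rows updated, new ids inserted
-    return merged
-
-
-if __name__ == "__main__":
-    before = {
-        "1-0": {"id": "1-0", "document_id": "1", "text": "old0"},
-        "1-1": {"id": "1-1", "document_id": "1", "text": "old1"},
-        "1-2": {"id": "1-2", "document_id": "1", "text": "old2"},
-        "2-0": {"id": "2-0", "document_id": "2", "text": "keep"},
-    }
-    after = merge_upsert(
-        before,
-        "1",
-        [
-            {"id": "1-0", "document_id": "1", "text": "new0"},
-            {"id": "1-1", "document_id": "1", "text": "new1"},
-        ],
-    )
-    print(sorted(after))
-```
-
-The scoping predicate is what keeps document `2` alive: without it, every row absent from the incoming batch would count as "not matched by source" and be deleted.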
-
-- [ ] **Step 4: Run to verify it passes**
-
-Run: `uv run pytest src/paperless_ai/tests/test_vector_store.py::TestPaperlessLanceVectorStoreUpsert -v`
-Expected: PASS (both tests).
-
-- [ ] **Step 5: Lint and commit**
-
-```bash
-ruff check src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-ruff format src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-git add src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-git commit -m "feat(ai): atomic upsert_document on the LanceDB store
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 4: ANN index threshold, scalar index, and compaction
-
-**Files:**
-
-- Modify: `src/paperless_ai/vector_store.py`
-- Test: `src/paperless_ai/tests/test_vector_store.py`
-
-- [ ] **Step 1: Write the failing tests**
-
-Append a new class to `src/paperless_ai/tests/test_vector_store.py`:
-
-```python
-class TestPaperlessLanceVectorStoreMaintenance:
-    @pytest.fixture
-    def store(self, tmp_path: Path) -> PaperlessLanceVectorStore:
-        return PaperlessLanceVectorStore(uri=str(tmp_path / "idx"))
-
-    def test_maybe_create_ann_index_noop_below_threshold(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        store.add([_node("1-0", "1", "a", 0.1)])
-        # Threshold far above row count -> no index attempted, no error.
-        store.maybe_create_ann_index(min_rows=1000)
-        # Still queryable.
-        result = store.query(
-            VectorStoreQuery(query_embedding=[0.1] * DIM, similarity_top_k=1),
-        )
-        assert len(result.nodes) == 1
-
-    def test_maybe_create_ann_index_non_divisible_dim_falls_back(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        # DIM=8 is not divisible by ANN_PQ_SUB_VECTORS (96), so IVF_PQ is not
-        # an option here. Index creation must not raise and must leave the
-        # table queryable (IVF_FLAT fallback or skipped).
-        for i in range(40):
-            store.add([_node(f"1-{i}", "1", f"t{i}", float(i))])
-        store.maybe_create_ann_index(min_rows=10)
-        result = store.query(
-            VectorStoreQuery(query_embedding=[1.0] * DIM, similarity_top_k=3),
-        )
-        assert len(result.nodes) == 3
-
-    def test_compact_reduces_to_single_version(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        for i in range(5):
-            store.add([_node(f"1-{i}", "1", f"t{i}", float(i))])
-        assert len(store.client.open_table("documents").list_versions()) > 1
-        store.compact(retention_seconds=0)
-        assert len(store.client.open_table("documents").list_versions()) == 1
-```
-
-- [ ] **Step 2: Run to verify they fail**
-
-Run: `uv run pytest src/paperless_ai/tests/test_vector_store.py::TestPaperlessLanceVectorStoreMaintenance -v`
-Expected: FAIL (`maybe_create_ann_index` / `compact` not defined).
-
-- [ ] **Step 3: Implement maintenance methods**
-
-Add to the top of `src/paperless_ai/vector_store.py` (module constants, after `DEFAULT_TABLE_NAME`):
-
-```python
-# Below this many chunks, LanceDB's exact (brute-force) search is sufficient and
-# faster than building an ANN index (per LanceDB guidance, ~100K vectors).
-ANN_INDEX_MIN_ROWS = 100_000
-# IVF_PQ default; num_sub_vectors must evenly divide the embedding dimension.
-ANN_PQ_SUB_VECTORS = 96
-```
-
-Add these methods to `PaperlessLanceVectorStore`:
-
-```python
-    def _has_vector_index(self) -> bool:
-        try:
-            return any(
-                "vector" in (getattr(idx, "columns", []) or [])
-                for idx in self._table.list_indices()
-            )
-        except Exception:  # pragma: no cover - older lancedb without list_indices
-            return False
-
-    def maybe_create_ann_index(self, min_rows: int = ANN_INDEX_MIN_ROWS) -> None:
-        """Best-effort: build an IVF index once the table is large enough.
-
-        IVF_PQ is used when ``num_sub_vectors`` divides the embedding dimension,
-        otherwise IVF_FLAT (no divisor constraint). Any failure is logged and
-        leaves the table on exact search, which is always correct.
-        """
-        if self._table is None:
-            return
-        rows = self._table.count_rows()
-        if rows < min_rows or self._has_vector_index():
-            return
-        num_partitions = max(1, rows // 4096)
-        # Embedding dim from the schema's fixed-size list column.
-        dim = self._table.schema.field("vector").type.list_size
-        try:
-            if dim % ANN_PQ_SUB_VECTORS == 0:
-                self._table.create_index(
-                    metric="l2",
-                    num_partitions=num_partitions,
-                    num_sub_vectors=ANN_PQ_SUB_VECTORS,
-                    index_type="IVF_PQ",
-                )
-            else:
-                self._table.create_index(
-                    metric="l2",
-                    num_partitions=num_partitions,
-                    index_type="IVF_FLAT",
-                )
-        except Exception as e:  # pragma: no cover - depends on data/dim
-            logger.warning("Skipping ANN index creation: %s", e)
-
-    def ensure_document_id_scalar_index(self) -> None:
-        """Create a scalar index on the filter column (never on the merge key
-        ``id`` — see LanceDB #3177)."""
-        if self._table is None:
-            return
-        try:
-            self._table.create_scalar_index("document_id", replace=True)
-        except Exception as e:  # pragma: no cover
-            logger.warning("Skipping document_id scalar index: %s", e)
-
-    def compact(self, retention_seconds: int) -> None:
-        """Compact fragments and prune old MVCC versions in one call."""
-        if self._table is None:
-            return
-        from datetime import timedelta
-
-        self._table.optimize(cleanup_older_than=timedelta(seconds=retention_seconds))
-```
-
-> **Note for the implementer:** verify `list_size` is the right attribute for a `pyarrow` fixed-size list on the installed pyarrow (`pa.list_(pa.float32(), 8).list_size == 8`). If the installed pyarrow exposes it differently, adjust the accessor accordingly (this same accessor is used by `vector_dim()` in Task 2).
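-
-The branch logic in `maybe_create_ann_index` reduces to a small pure function, sketched here for reasoning about thresholds without a table. `choose_index` is a hypothetical helper, not part of the adapter; the constants mirror the ones defined in this step, and the existing-index check is left out:
-
-```python
-ANN_INDEX_MIN_ROWS = 100_000
-ANN_PQ_SUB_VECTORS = 96
-
-
-def choose_index(
-    rows: int,
-    dim: int,
-    min_rows: int = ANN_INDEX_MIN_ROWS,
-) -> tuple[str, int] | None:
-    """Return ``(index_type, num_partitions)``, or ``None`` to stay on exact search."""
-    if rows < min_rows:
-        return None
-    # Same partition sizing as the adapter: one partition per 4096 rows, at least one.
-    num_partitions = max(1, rows // 4096)
-    # IVF_PQ needs the sub-vector count to divide the dimension evenly.
-    index_type = "IVF_PQ" if dim % ANN_PQ_SUB_VECTORS == 0 else "IVF_FLAT"
-    return index_type, num_partitions
-
-
-if __name__ == "__main__":
-    print(choose_index(50_000, 768))
-    print(choose_index(200_000, 768))
-    print(choose_index(40, 8, min_rows=10))
-```
-
-With `DIM = 8` and 40 rows, as in the fallback test above, this selects `IVF_FLAT` with a single partition.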
-
-- [ ] **Step 4: Run to verify they pass**
-
-Run: `uv run pytest src/paperless_ai/tests/test_vector_store.py::TestPaperlessLanceVectorStoreMaintenance -v`
-Expected: PASS (all three).
-
-- [ ] **Step 5: Add the upsert-after-optimize regression test (#3177 guard)**
-
-Append to `TestPaperlessLanceVectorStoreMaintenance`:
-
-```python
-    def test_upsert_after_optimize_with_scalar_index(
-        self,
-        store: PaperlessLanceVectorStore,
-    ) -> None:
-        store.add(
-            [
-                _node("1-0", "1", "old0", 0.1),
-                _node("1-1", "1", "old1", 0.2),
-                _node("1-2", "1", "old2", 0.3),
-                _node("2-0", "2", "keep", 0.9),
-            ],
-        )
-        store.ensure_document_id_scalar_index()
-        store.compact(retention_seconds=0)
-
-        store.upsert_document("1", [_node("1-0", "1", "new0", 0.1)])
-
-        table = store.client.open_table("documents")
-        doc1 = sorted(
-            r["id"] for r in table.search().where("document_id = '1'").to_list()
-        )
-        assert doc1 == ["1-0"]
-        assert table.count_rows() == 2
-```
-
-- [ ] **Step 6: Run the full maintenance class**
-
-Run: `uv run pytest src/paperless_ai/tests/test_vector_store.py::TestPaperlessLanceVectorStoreMaintenance -v`
-Expected: PASS (all four).
-
-- [ ] **Step 7: Lint and commit**
-
-```bash
-ruff check src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-ruff format src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-git add src/paperless_ai/vector_store.py src/paperless_ai/tests/test_vector_store.py
-git commit -m "feat(ai): ANN index threshold, scalar index, and compaction
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 5: Node identity — set `LlamaDocument.id_` to the document id
-
-**Files:**
-
-- Modify: `src/paperless_ai/indexing.py` (the `build_document_node` function, around `indexing.py:132-149`)
-- Test: `src/paperless_ai/tests/test_ai_indexing.py`
-
-- [ ] **Step 1: Write the failing test**
-
-Add to `src/paperless_ai/tests/test_ai_indexing.py`:
-
-```python
-@pytest.mark.django_db
-def test_build_document_node_sets_ref_doc_id(real_document) -> None:
-    nodes = indexing.build_document_node(real_document)
-    assert nodes
-    for node in nodes:
-        assert node.ref_doc_id == str(real_document.id)
-```
-
-- [ ] **Step 2: Run to verify it fails**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py::test_build_document_node_sets_ref_doc_id -v`
-Expected: FAIL — `ref_doc_id` is a random uuid, not `str(real_document.id)`.
-
-- [ ] **Step 3: Set the LlamaDocument id**
-
-In `src/paperless_ai/indexing.py`, in `build_document_node`, change the `LlamaDocument(...)` construction to set `id_`:
-
-```python
-    doc = LlamaDocument(
-        id_=str(document.id),
-        text=text,
-        metadata=metadata,
-        excluded_embed_metadata_keys=list(metadata.keys()),
-    )
-```
-
-- [ ] **Step 4: Run to verify it passes**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py::test_build_document_node_sets_ref_doc_id -v`
-Expected: PASS.
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add src/paperless_ai/indexing.py src/paperless_ai/tests/test_ai_indexing.py
-git commit -m "feat(ai): tie LlamaDocument id to the paperless document id
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 6: Vector-store factory + load/build in `indexing.py`
-
-**Files:**
-
-- Modify: `src/paperless_ai/indexing.py` — replace `get_or_create_storage_context` and `load_or_build_index`; add `get_vector_store` and `LLM_INDEX_TABLE`.
-- Test: `src/paperless_ai/tests/test_ai_indexing.py`
-
-- [ ] **Step 1: Write the failing test**
-
-Add to `src/paperless_ai/tests/test_ai_indexing.py`:
-
-```python
-@pytest.mark.django_db
-def test_get_vector_store_roundtrip(
-    temp_llm_index_dir,
-    mock_embed_model,
-) -> None:
-    from paperless_ai.vector_store import PaperlessLanceVectorStore
-
-    store = indexing.get_vector_store()
-    assert isinstance(store, PaperlessLanceVectorStore)
-```
-
-- [ ] **Step 2: Run to verify it fails**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py::test_get_vector_store_roundtrip -v`
-Expected: FAIL — `indexing.get_vector_store` not defined.
-
-- [ ] **Step 3: Add the factory and rewrite the index builder**
-
-In `src/paperless_ai/indexing.py`:
-
-1. Add near the top (after `logger = ...`). **Do not** add a top-level `import` of `vector_store` — only a constant and a `TYPE_CHECKING`-only hint (see the lazy-import constraint):
-
-```python
-LLM_INDEX_TABLE = "documents"
-```
-
-There is already a `from typing import TYPE_CHECKING` block at the top of `indexing.py`; add the adapter to it for type hints only:
-
-```python
-if TYPE_CHECKING:
-    from paperless_ai.vector_store import PaperlessLanceVectorStore
-```
-
-2. Replace the entire `get_or_create_storage_context(...)` function with (note the **function-local** import of `vector_store` and the string type hint):
-
-```python
-def get_vector_store() -> "PaperlessLanceVectorStore":
-    """Open (or lazily create) the LanceDB-backed vector store.
-
-    Imports ``vector_store`` lazily so that importing ``indexing`` (which
-    ``documents.tasks`` does at module top) never drags in lancedb/llama_index.
-    """
-    from paperless_ai.vector_store import PaperlessLanceVectorStore
-
-    settings.LLM_INDEX_DIR.mkdir(parents=True, exist_ok=True)
-    return PaperlessLanceVectorStore(
-        uri=str(settings.LLM_INDEX_DIR),
-        table_name=LLM_INDEX_TABLE,
-    )
-```
-
-3. Replace `load_or_build_index(...)` with:
-
-```python
-def load_or_build_index(nodes=None):
-    """Load the VectorStoreIndex backed by the LanceDB store.
-
-    With ``stores_text=True`` the index runs off the vector store alone — no
-    docstore or index store. ``nodes`` is accepted for signature compatibility
-    but unused; the store is the source of truth.
-    """
-    import llama_index.core.settings as llama_settings
-    from llama_index.core import VectorStoreIndex
-
-    embed_model = get_embedding_model()
-    llama_settings.Settings.embed_model = embed_model
-    vector_store = get_vector_store()
-    return VectorStoreIndex.from_vector_store(
-        vector_store=vector_store,
-        embed_model=embed_model,
-    )
-```
-
-4. Replace `vector_store_file_exists()` (it must use the store before Task 7 relies on it):
-
-```python
-def vector_store_file_exists() -> bool:
-    """True when the LanceDB table exists."""
-    return get_vector_store().table_exists()
-```
-
-5. Remove the now-unused imports for `StorageContext`, `SimpleDocumentStore`, `SimpleIndexStore`, and `faiss`. **Keep `shutil`** — it is used by Task 9's migration cleanup.
-
-- [ ] **Step 4: Run to verify it passes**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py::test_get_vector_store_roundtrip -v`
-Expected: PASS.
-
-- [ ] **Step 5: Write the lazy-import guard test**
-
-This guards the hard constraint: importing `documents.tasks` (the light path that management commands traverse) must not pull in any AI library. It runs in a **subprocess** because the pytest process has already imported these libs via other tests.
-
-Create `src/paperless_ai/tests/test_lazy_imports.py`:
-
-```python
-import subprocess
-import sys
-
-
-class TestLazyAiImports:
-    def test_importing_tasks_does_not_load_ai_libraries(self) -> None:
-        code = (
-            "import os, django, sys\n"
-            "os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'paperless.settings')\n"
-            "django.setup()\n"
-            "import documents.tasks  # noqa: F401\n"
-            "leaked = [m for m in ('lancedb', 'pyarrow', 'llama_index') "
-            "if m in sys.modules]\n"
-            "assert not leaked, f'AI libraries leaked into the light path: {leaked}'\n"
-        )
-        result = subprocess.run(
-            [sys.executable, "-c", code],
-            capture_output=True,
-            text=True,
-            cwd="src",
-        )
-        assert result.returncode == 0, result.stdout + result.stderr
-```
-
-- [ ] **Step 6: Run the guard test**
-
-Run: `uv run pytest src/paperless_ai/tests/test_lazy_imports.py -v`
-Expected: PASS. If it FAILS, find the offending top-level import (`git grep -n "^from llama_index\|^import lancedb\|^import pyarrow\|^from paperless_ai.vector_store" src/paperless_ai src/documents`) and make it function-local.
-
-- [ ] **Step 7: Commit**
-
-```bash
-git add src/paperless_ai/indexing.py src/paperless_ai/tests/test_ai_indexing.py src/paperless_ai/tests/test_lazy_imports.py
-git commit -m "refactor(ai): build the index from the LanceDB store alone (lazy import)
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 7: Rewire add / update / remove / rebuild
-
-**Files:**
-
-- Modify: `src/paperless_ai/indexing.py` — `llm_index_add_or_update_document`, `llm_index_remove_document`, `update_llm_index`; delete `remove_document_docstore_nodes`.
-- Test: `src/paperless_ai/tests/test_ai_indexing.py`
-
-- [ ] **Step 1: Write the failing tests (CRUD against the real store)**
-
-Add to `src/paperless_ai/tests/test_ai_indexing.py`:
-
-```python
-@pytest.mark.django_db
-def test_add_then_remove_document(
-    temp_llm_index_dir,
-    mock_embed_model,
-    real_document,
-) -> None:
-    indexing.llm_index_add_or_update_document(real_document)
-    store = indexing.get_vector_store()
-    table = store.client.open_table(indexing.LLM_INDEX_TABLE)
-    assert table.count_rows() >= 1
-
-    indexing.llm_index_remove_document(real_document)
-    assert store.client.open_table(indexing.LLM_INDEX_TABLE).count_rows() == 0
-
-
-@pytest.mark.django_db
-def test_update_shrinks_chunks_without_orphans(
-    temp_llm_index_dir,
-    mock_embed_model,
-    real_document,
-) -> None:
-    real_document.content = "word " * 4000  # many chunks
-    real_document.save()
-    indexing.llm_index_add_or_update_document(real_document)
-    store = indexing.get_vector_store()
-    big = store.client.open_table(indexing.LLM_INDEX_TABLE).count_rows()
-
-    real_document.content = "short"  # one chunk
-    real_document.save()
-    indexing.llm_index_add_or_update_document(real_document)
-
-    rows = store.client.open_table(indexing.LLM_INDEX_TABLE).count_rows()
-    assert rows < big
-    assert rows >= 1
-```
-
-- [ ] **Step 2: Run to verify they fail**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py -k "add_then_remove or shrinks" -v`
-Expected: FAIL (current implementation still uses the docstore path).
-
-- [ ] **Step 3: Rewrite the functions**
-
-In `src/paperless_ai/indexing.py` (per the lazy-import constraint, **do not** add a top-level `llama_index` import — `MetadataMode` is imported inside each function that uses it):
-
-1. **Delete** the `remove_document_docstore_nodes(...)` function entirely.
-
-2. Replace `llm_index_add_or_update_document`:
-
-```python
-def llm_index_add_or_update_document(document: Document):
-    """Add or atomically replace a document's chunks in the LLM index."""
-    from llama_index.core.schema import MetadataMode
-
-    new_nodes = build_document_node(document, chunk_size=get_rag_chunk_size())
-
-    embed_model = get_embedding_model()
-    for node in new_nodes:
-        node.embedding = embed_model.get_text_embedding(
-            node.get_content(metadata_mode=MetadataMode.EMBED),
-        )
-
-    with FileLock(_index_lock_path()):
-        store = get_vector_store()
-        store.upsert_document(str(document.id), new_nodes)
-        store.ensure_document_id_scalar_index()
-```
-
-> Note: `upsert_document` with an empty `new_nodes` list deletes the document (handles the "no indexable content" case the old code logged-and-skipped).
-
-3. Replace `llm_index_remove_document`:
-
-```python
-def llm_index_remove_document(document: Document):
-    """Remove a document's chunks from the LLM index."""
-    with FileLock(_index_lock_path()):
-        store = get_vector_store()
-        store.delete(str(document.id))
-```
-
-4. Rewrite `update_llm_index`, both the rebuild and incremental branches. The rebuild path drops and recreates the table, then bulk-inserts; the incremental path upserts only documents whose stored `modified` timestamp differs from the database value. Replace the function with:
-
-```python
-def update_llm_index(
-    *,
-    iter_wrapper: IterWrapper[Document] = identity,
-    rebuild=False,
-) -> str:
-    """Rebuild or incrementally update the LLM index."""
-    from llama_index.core.schema import MetadataMode
-
-    documents = Document.objects.all()
-    if not documents.exists():
-        logger.warning("No documents found to index.")
-        if not rebuild and not vector_store_file_exists():
-            return "No documents found to index."
-
-    chunk_size = AIConfig().llm_embedding_chunk_size
-    embed_model = get_embedding_model()
-
-    with FileLock(_index_lock_path()):
-        if rebuild or not vector_store_file_exists():
-            (settings.LLM_INDEX_DIR / "meta.json").unlink(missing_ok=True)
-            logger.info("Rebuilding LLM index.")
-            store = get_vector_store()
-            store.drop_table()  # defined in Task 2; bulk-insert into a fresh table
-            for document in iter_wrapper(documents):
-                nodes = build_document_node(document, chunk_size=chunk_size)
-                for node in nodes:
-                    node.embedding = embed_model.get_text_embedding(
-                        node.get_content(metadata_mode=MetadataMode.EMBED),
-                    )
-                store.add(nodes)
-            msg = "LLM index rebuilt successfully."
-        else:
-            store = get_vector_store()
-            existing = {
-                str(row["document_id"]): json.loads(row["node_content"])
-                for row in _iter_existing_modified(store)
-            }
-            changed = 0
-            for document in iter_wrapper(documents):
-                doc_id = str(document.id)
-                node_meta = existing.get(doc_id)
-                if node_meta is not None:
-                    stored_modified = node_meta.get("modified")
-                    if stored_modified == document.modified.isoformat():
-                        continue
-                nodes = build_document_node(document, chunk_size=chunk_size)
-                for node in nodes:
-                    node.embedding = embed_model.get_text_embedding(
-                        node.get_content(metadata_mode=MetadataMode.EMBED),
-                    )
-                store.upsert_document(doc_id, nodes)
-                changed += 1
-            msg = (
-                "LLM index updated successfully."
-                if changed
-                else "No changes detected in LLM index."
-            )
-
-        store.ensure_document_id_scalar_index()
-        store.maybe_create_ann_index()
-        store.compact(retention_seconds=get_llm_index_compaction_retention())
-    return msg
-```
-
-5. Add the helpers used above near the other small helpers in `indexing.py`:
-
-```python
-def _iter_existing_modified(store) -> list[dict]:
-    """One representative row per document_id, for modified-time comparison."""
-    table_name = LLM_INDEX_TABLE
-    if table_name not in store.client.table_names():
-        return []
-    seen: dict[str, dict] = {}
-    for row in store.client.open_table(table_name).search().to_list():
-        seen.setdefault(str(row["document_id"]), row)
-    return list(seen.values())
-
-
-def get_llm_index_compaction_retention() -> int:
-    """Seconds of MVCC version history to keep during compaction."""
-    return 60 * 60  # 1 hour: safe for in-flight readers, reclaims daily
-```
-
-6. Ensure `import json` is present at the top of `indexing.py` (it is used by the incremental branch of `update_llm_index` to decode `node_content`).
-
-- [ ] **Step 4: Run to verify they pass**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py -k "add_then_remove or shrinks" -v`
-Expected: PASS.
-
-- [ ] **Step 5: Run the full indexing test module**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py -v`
-Expected: PASS, except tests that asserted on the old docstore internals — fix or delete those in Task 11.
-
-- [ ] **Step 6: Commit**
-
-```bash
-git add src/paperless_ai/indexing.py src/paperless_ai/tests/test_ai_indexing.py
-git commit -m "refactor(ai): add/update/remove/rebuild via LanceDB upsert + delete
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 8: `query_similar_documents` via metadata filter
-
-**Files:**
-
-- Modify: `src/paperless_ai/indexing.py` — `query_similar_documents` (around `indexing.py:394-463`)
-- Test: `src/paperless_ai/tests/test_ai_indexing.py`
-
-- [ ] **Step 1: Write the failing test**
-
-Add to `src/paperless_ai/tests/test_ai_indexing.py`:
-
-```python
-@pytest.mark.django_db
-def test_query_similar_documents_respects_allowed_ids(
-    temp_llm_index_dir,
-    mock_embed_model,
-) -> None:
-    from documents.tests.factories import DocumentFactory
-
-    a = DocumentFactory.create(content="alpha shared content here")
-    b = DocumentFactory.create(content="beta shared content here")
-    c = DocumentFactory.create(content="gamma shared content here")
-    for doc in (a, b, c):
-        indexing.llm_index_add_or_update_document(doc)
-
-    results = indexing.query_similar_documents(a, document_ids=[b.id])
-
-    assert results  # guard against a vacuous pass on an empty result list
-    assert all(doc.id == b.id for doc in results)
-```
-
-- [ ] **Step 2: Run to verify it fails**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py::test_query_similar_documents_respects_allowed_ids -v`
-Expected: FAIL (current implementation scans the docstore via `index.docstore.docs`, which no longer exists).
-
-- [ ] **Step 3: Rewrite `query_similar_documents`**
-
-Replace the function body in `src/paperless_ai/indexing.py`:
-
-```python
-def query_similar_documents(
-    document: Document,
-    top_k: int = 5,
-    document_ids: Iterable[int | str] | None = None,
-) -> list[Document]:
-    """Return up to ``top_k`` Documents most similar to ``document``."""
-    allowed_document_ids = normalize_document_ids(document_ids)
-    if allowed_document_ids is not None and not allowed_document_ids:
-        return []
-
-    if not vector_store_file_exists():
-        queue_llm_index_update_if_needed(
-            rebuild=False,
-            reason="LLM index not found for similarity query.",
-        )
-        return []
-
-    from llama_index.core.retrievers import VectorIndexRetriever
-    from llama_index.core.vector_stores.types import FilterOperator
-    from llama_index.core.vector_stores.types import MetadataFilter
-    from llama_index.core.vector_stores.types import MetadataFilters
-
-    index = load_or_build_index()
-
-    filters = None
-    if allowed_document_ids is not None:
-        filters = MetadataFilters(
-            filters=[
-                MetadataFilter(
-                    key="document_id",
-                    operator=FilterOperator.IN,
-                    value=sorted(allowed_document_ids),
-                ),
-            ],
-        )
-
-    retriever = VectorIndexRetriever(
-        index=index,
-        similarity_top_k=top_k,
-        filters=filters,
-    )
-
-    config = AIConfig()
-    query_text = truncate_content(
-        (document.title or "") + "\n" + (document.content or ""),
-        chunk_size=config.llm_embedding_chunk_size,
-        context_size=config.llm_context_size,
-    )
-    results = retriever.retrieve(query_text)
-
-    retrieved_document_ids: list[int] = []
-    for node in results:
-        document_id = node.metadata.get("document_id")
-        if document_id is None:
-            continue
-        normalized = str(document_id)
-        if allowed_document_ids is not None and normalized not in allowed_document_ids:
-            continue
-        try:
-            retrieved_document_ids.append(int(normalized))
-        except ValueError:
-            logger.warning(
-                "Skipping LLM index result with invalid document_id %r.",
-                document_id,
-            )
-
-    return list(Document.objects.filter(pk__in=retrieved_document_ids))
-```
-
-- [ ] **Step 4: Run to verify it passes**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py::test_query_similar_documents_respects_allowed_ids -v`
-Expected: PASS.
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add src/paperless_ai/indexing.py src/paperless_ai/tests/test_ai_indexing.py
-git commit -m "refactor(ai): query_similar_documents via metadata filter
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 9: Dimension guard, migration cleanup, and embedding helper
-
-> Note: the store primitives `table_exists` / `vector_dim` / `drop_table` were added in Task 2, and `vector_store_file_exists()` was rewritten in Task 6. This task adds only the indexing-level migration and dimension-mismatch guard plus the embedding helper they need.
-
-**Files:**
-
-- Modify: `src/paperless_ai/indexing.py` — add migration cleanup + dimension-mismatch guard; wire them into `update_llm_index`.
-- Modify: `src/paperless_ai/embedding.py` — add `current_embedding_dim`.
-- Test: `src/paperless_ai/tests/test_ai_indexing.py`
-
-- [ ] **Step 1: Write the failing test**
-
-Add to `test_ai_indexing.py`:
-
-```python
-@pytest.mark.django_db
-def test_migration_wipes_stale_faiss_files(temp_llm_index_dir) -> None:
-    stale = temp_llm_index_dir / "default__vector_store.json"
-    stale.write_text("{}")
-    indexing.migrate_stale_faiss_index()
-    assert not stale.exists()
-```
-
-- [ ] **Step 2: Run to verify it fails**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py::test_migration_wipes_stale_faiss_files -v`
-Expected: FAIL (`indexing.migrate_stale_faiss_index` not defined).
-
-- [ ] **Step 3: Add the embedding helper**
-
-In `src/paperless_ai/embedding.py`, add:
-
-```python
-def current_embedding_dim() -> int:
-    """Embedding dimension for the configured model (probes if not cached)."""
-    return get_embedding_dim()
-```
-
-- [ ] **Step 4: Add migration cleanup + dimension guard**
-
-In `src/paperless_ai/indexing.py` (note: `vector_store_file_exists` was already rewritten in Task 6 — do not redefine it):
-
-```python
-def migrate_stale_faiss_index() -> None:
-    """Remove a pre-LanceDB FAISS index directory so it is rebuilt fresh."""
-    stale_marker = settings.LLM_INDEX_DIR / "default__vector_store.json"
-    if stale_marker.exists():
-        logger.info("Removing stale FAISS LLM index; it will be rebuilt.")
-        shutil.rmtree(settings.LLM_INDEX_DIR, ignore_errors=True)
-        settings.LLM_INDEX_DIR.mkdir(parents=True, exist_ok=True)
-
-
-def embedding_dim_mismatch() -> bool:
-    """True when the stored table's vector dim differs from the current model."""
-    store = get_vector_store()
-    stored = store.vector_dim()
-    if stored is None:
-        return False
-    from paperless_ai.embedding import current_embedding_dim
-
-    return stored != current_embedding_dim()
-```
-
-Then wire them into `update_llm_index`: add the following at the very top of the function body, **before** the `with FileLock(...)` block. Task 7's rewrite of `update_llm_index` does not include these lines, so add them now:
-
-```python
-    migrate_stale_faiss_index()
-    if not rebuild and vector_store_file_exists() and embedding_dim_mismatch():
-        logger.warning("Embedding dimension changed; forcing LLM index rebuild.")
-        rebuild = True
-```
-
-- [ ] **Step 5: Run to verify it passes**
-
-Run: `uv run pytest src/paperless_ai/tests/test_ai_indexing.py::test_migration_wipes_stale_faiss_files -v`
-Expected: PASS.
-
-- [ ] **Step 6: Commit**
-
-```bash
-git add src/paperless_ai/indexing.py src/paperless_ai/embedding.py src/paperless_ai/tests/test_ai_indexing.py
-git commit -m "feat(ai): dimension guard and FAISS index migration
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 10: Chat — stock retriever with filters
-
-**Files:**
-
-- Modify: `src/paperless_ai/chat.py` — delete `_get_document_filtered_retriever`; rewrite `_stream_chat_with_documents`.
-- Test: `src/paperless_ai/tests/test_chat.py`
-
-- [ ] **Step 1: Read the existing chat tests**
-
-Run: `uv run pytest src/paperless_ai/tests/test_chat.py -v`
-Expected: a baseline of which tests currently pass (note which tests reference `_get_document_filtered_retriever` or FAISS internals; those will be updated).
-
-- [ ] **Step 2: Write/adjust the failing test**
-
-Add to `src/paperless_ai/tests/test_chat.py` (a class-grouped, mocker-based test):
-
-```python
-import pytest
-
-from paperless_ai import chat
-
-
-@pytest.mark.django_db
-class TestStreamChatRetrieval:
-    def test_no_nodes_yields_no_content_message(
-        self,
-        temp_llm_index_dir,
-        mock_embed_model,
-        mocker,
-    ) -> None:
-        from documents.tests.factories import DocumentFactory
-
-        doc = DocumentFactory.create(content="hello world")
-        # Nothing indexed for this document yet.
-        out = list(chat.stream_chat_with_documents("question?", [doc]))
-        assert chat.CHAT_NO_CONTENT_MESSAGE in out
-```
-
-(`mock_embed_model` is the fixture in `test_ai_indexing.py`; move it into `conftest.py` in Step 4 so both modules can use it.)
-
-- [ ] **Step 3: Run to verify it fails or errors on the docstore reach-in**
-
-Run: `uv run pytest src/paperless_ai/tests/test_chat.py::TestStreamChatRetrieval -v`
-Expected: FAIL/ERROR — current `_stream_chat_with_documents` reads `index.docstore.docs`, which no longer exists.
-
-- [ ] **Step 4: Move `mock_embed_model` + `FakeEmbedding` to conftest**
-
-Cut the `FakeEmbedding` class and `mock_embed_model` fixture from `test_ai_indexing.py` and paste them into `src/paperless_ai/tests/conftest.py` (so both test modules share them). Leave `temp_llm_index_dir` as-is.
-
-- [ ] **Step 5: Rewrite chat**
-
-In `src/paperless_ai/chat.py`:
-
-1. **Delete** `_get_document_filtered_retriever(...)` entirely.
-
-2. Rewrite `_stream_chat_with_documents`:
-
-```python
-def _stream_chat_with_documents(query_str: str, documents: list[Document]):
-    from llama_index.core.prompts import PromptTemplate
-    from llama_index.core.query_engine import RetrieverQueryEngine
-    from llama_index.core.response_synthesizers import get_response_synthesizer
-    from llama_index.core.retrievers import VectorIndexRetriever
-    from llama_index.core.vector_stores.types import FilterOperator
-    from llama_index.core.vector_stores.types import MetadataFilter
-    from llama_index.core.vector_stores.types import MetadataFilters
-
-    client = AIClient()
-    index = load_or_build_index()
-
-    doc_ids = [str(doc.pk) for doc in documents]
-    filters = MetadataFilters(
-        filters=[
-            MetadataFilter(
-                key="document_id",
-                operator=FilterOperator.IN,
-                value=doc_ids,
-            ),
-        ],
-    )
-
-    # No indexed content for these documents -> bail early.
-    if not index.vector_store.get_nodes(filters=filters):
-        logger.warning("No nodes found for the given documents.")
-        yield CHAT_NO_CONTENT_MESSAGE
-        return
-
-    retriever = VectorIndexRetriever(
-        index=index,
-        similarity_top_k=CHAT_RETRIEVER_TOP_K,
-        filters=filters,
-    )
-
-    top_nodes = retriever.retrieve(query_str)
-    if len(top_nodes) == 0:
-        logger.warning("Retriever returned no nodes for the given documents.")
-        yield CHAT_NO_CONTENT_MESSAGE
-        return
-
-    references = _get_document_references(documents, top_nodes)
-
-    prompt_template = PromptTemplate(template=CHAT_PROMPT_TMPL)
-    response_synthesizer = get_response_synthesizer(
-        llm=client.llm,
-        prompt_helper=get_rag_prompt_helper(),
-        text_qa_template=prompt_template,
-        streaming=True,
-    )
-    query_engine = RetrieverQueryEngine.from_args(
-        retriever=retriever,
-        llm=client.llm,
-        response_synthesizer=response_synthesizer,
-        streaming=True,
-    )
-
-    logger.debug("Document chat query: %s", query_str)
-    response_stream = query_engine.query(query_str)
-    for chunk in response_stream.response_gen:
-        yield chunk
-        sys.stdout.flush()
-
-    if references:
-        yield _format_chat_metadata_trailer(references)
-```
-
-- [ ] **Step 6: Run to verify it passes**
-
-Run: `uv run pytest src/paperless_ai/tests/test_chat.py -v`
-Expected: PASS (update or remove any remaining test that asserted on `DocumentFilteredFaissRetriever` / FAISS internals).
-
-- [ ] **Step 7: Lint and commit**
-
-```bash
-ruff check src/paperless_ai/chat.py
-ruff format src/paperless_ai/chat.py
-git add src/paperless_ai/chat.py src/paperless_ai/tests/test_chat.py src/paperless_ai/tests/conftest.py src/paperless_ai/tests/test_ai_indexing.py
-git commit -m "refactor(ai): chat uses a stock filtered retriever
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 11: Sweep remaining FAISS references and fix stale tests
-
-**Files:**
-
-- Modify: `src/paperless_ai/tests/test_ai_indexing.py`, `src/paperless_ai/tests/test_chat.py` (delete/adjust FAISS-internal assertions)
-- Modify: any remaining references in `src/paperless_ai/` to `FaissVectorStore`, `SimpleDocumentStore`, `SimpleIndexStore`, `get_or_create_storage_context`, `remove_document_docstore_nodes`.
-
-- [ ] **Step 1: Find remaining references**
-
-Run: `git grep -n "Faiss\|FaissVectorStore\|SimpleDocumentStore\|SimpleIndexStore\|get_or_create_storage_context\|remove_document_docstore_nodes\|_faiss_index\|index_struct.nodes_dict\|docstore.docs" src/`
-Expected: matches only in tests to be updated, or none.
-
-- [ ] **Step 2: Update or delete each stale reference**
-
-For each match in a test, replace the docstore/FAISS-internal assertion with the equivalent store-level assertion (`store.client.open_table(...).count_rows()`, `store.query(...)`, `store.get_nodes(...)`). Delete tests that only validated old internals (e.g. a test asserting `remove_document_docstore_nodes` left FAISS vectors behind).
-
-- [ ] **Step 3: Run the whole AI suite**
-
-Run: `uv run pytest src/paperless_ai/ -v`
-Expected: PASS, no references to removed symbols.
-
-- [ ] **Step 4: Run the documents task tests that touch the LLM index**
-
-Run: `uv run pytest src/documents/tests/test_tasks.py -k "llm or index" -v`
-Expected: PASS.
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add src/paperless_ai/tests/
-git commit -m "test(ai): drop FAISS-internal assertions
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 12: Full-suite verification and config docs
-
-**Files:**
-
-- Modify: `paperless.conf.example` (document the unchanged `PAPERLESS_LLM_INDEX_TASK_CRON`; no new vars required — ANN threshold and compaction retention are internal constants).
-
-- [ ] **Step 1: Confirm no new env vars needed**
-
-The ANN threshold (`ANN_INDEX_MIN_ROWS`) and compaction retention (`get_llm_index_compaction_retention`) are internal constants per the spec. No `paperless.conf.example` change is required unless a maintainer wants them tunable. Skip unless requested.
-
-- [ ] **Step 2: Run the full backend AI + tasks suite**
-
-Run: `uv run pytest src/paperless_ai/ src/documents/tests/test_tasks.py -v`
-Expected: PASS.
-
-- [ ] **Step 3: Run the app-config API test (it references the index status)**
-
-Run: `uv run pytest src/documents/tests/test_api_app_config.py -v`
-Expected: PASS.
-
-- [ ] **Step 4: Lint the whole AI package**
-
-Run: `ruff check src/paperless_ai && ruff format --check src/paperless_ai`
-Expected: clean.
-
-- [ ] **Step 5: Verify the consume → index path manually (smoke)**
-
-Run a quick smoke test per the spec: with `PAPERLESS_LLM_INDEX_ENABLED` on, consume a document and confirm `llm_index_add_or_update_document` writes a row (the test in Task 7 covers this in CI; this is an optional manual check).
-
-- [ ] **Step 6: Final commit / branch is ready for PR**
-
-```bash
-git add -A
-git commit -m "chore(ai): finalize LanceDB vector store migration
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Task 13: Type-check the new/changed AI code against pyrefly
-
-**Goal:** the code this branch adds/changes passes `pyrefly` cleanly **without growing
-`.pyrefly-baseline.json`**. The baseline (~600 KB) suppresses pre-existing repo errors; our
-new code must not add to it. Run this **last**, once all implementation tasks are done, so
-every new file/symbol exists.
-
-**Files (likely to need annotations/fixes):**
-
-- `src/paperless_ai/vector_store.py`, `src/paperless_ai/indexing.py`,
-  `src/paperless_ai/chat.py`, `src/paperless_ai/embedding.py`, and the new test modules.
-
-**Environment:** pyrefly needs the dependencies installed to resolve third-party types, so
-run it **on the Linux VM** (where the venv has `lancedb`/`pyarrow`/`llama_index`). The
-`[tool.pyrefly]` config already sets `search-path = ["src"]`, `python-platform = "linux"`,
-and `baseline = ".pyrefly-baseline.json"`, so `pyrefly check` from the repo root applies
-the baseline automatically and reports only non-baselined (i.e. new) errors.
-
-- [ ] **Step 1: Run pyrefly on the VM and capture NEW errors**
-
-```bash
-tar czf - src pyproject.toml uv.lock .pyrefly-baseline.json \
-  | ssh -o BatchMode=yes -p 2244 trenton@localhost 'tar xzf - -C ~/projects/paperless-ngx'
-ssh -o BatchMode=yes -p 2244 trenton@localhost \
-  'bash -lc "cd ~/projects/paperless-ngx && uv run pyrefly check"'
-```
-
-Expected at first: a list of errors located in the changed `paperless_ai` files (anything
-already in the baseline is suppressed). Note each `file:line` + error code.
-
-- [ ] **Step 2: Fix the type errors at the source**
-
-Prefer real fixes over suppressions:
-
-- Add/repair annotations on our functions, fixtures, and the adapter methods so signatures
-  match `BasePydanticVectorStore` (e.g. `Sequence[BaseNode]`, `list[str]`, the
-  `MetadataFilters | None` params, `VectorStoreQueryResult` return).
-- Annotate `PrivateAttr` fields and the lazy `get_vector_store() -> "PaperlessLanceVectorStore"`
-  (string annotation under `TYPE_CHECKING`).
-- For genuine third-party stub gaps (`lancedb`/`pyarrow` ship little/no type info; some
-  `llama_index` returns are dynamic), use a **targeted, commented** suppression on that exact
-  line — `# type: ignore[<code>]  # lancedb has no type stubs` — not a blanket file-level
-  ignore.
-
-- [ ] **Step 3: Do NOT grow the baseline**
-
-Do not regenerate or append to `.pyrefly-baseline.json`. The goal is zero new baseline entries.
-If — and only if — an error is genuinely impossible to fix or suppress inline (rare), stop
-and report it as DONE_WITH_CONCERNS describing the specific error, rather than silently
-baselining it.
-
-- [ ] **Step 4: Re-run until clean**
-
-Re-run the Step 1 command. Expected: no errors in the `paperless_ai` files we touched (the
-overall run still passes via the unchanged baseline for the rest of the repo).
-
-- [ ] **Step 5: Lint and commit**
-
-```bash
-ruff check src/paperless_ai
-ruff format src/paperless_ai
-git add src/paperless_ai
-git commit -m "types(ai): pass pyrefly for the LanceDB vector store code
-
-Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
-```
-
----
-
-## Self-Review notes (for the implementer)
-
-- **Lazy imports are a hard requirement** (see the constraint section). After Tasks 6, 7, and 10, the guard test (`test_lazy_imports.py`) must stay green: importing `documents.tasks` must not load `lancedb` / `pyarrow` / `llama_index`. Every `llama_index` symbol in `indexing.py`/`chat.py` (retrievers, filters, `MetadataMode`) and the `vector_store` import itself must be function-local; only `vector_store.py` and test modules import these at top level.
-
-- **`MetadataMode.EMBED`** is passed to `get_content` when embedding in the add/incremental/rebuild paths. Because `build_document_node` sets `excluded_embed_metadata_keys` to every metadata key, `EMBED` yields just the chunk text — exactly what llama-index's own embedding pipeline would feed the model, preserving current behavior. The import `from llama_index.core.schema import MetadataMode` is added in Task 7.
-- **`list_size`** is the pyarrow attribute for a fixed-size list's length, used by `vector_dim()` (Task 2) and `maybe_create_ann_index()` (Task 4). Confirm on the installed pyarrow (`pa.list_(pa.float32(), 8).list_size`); adjust the accessor in both places if the version differs.
-- **`merge_insert` match key `id` must never get a scalar index** (LanceDB #3177). The only scalar index is on `document_id` (`ensure_document_id_scalar_index`). Task 4's `test_upsert_after_optimize_with_scalar_index` guards this.
-- **`embed_model.get_text_embedding`** is called per node in the rebuild/incremental/add paths because we bypass `index.insert_nodes` and write to the store directly. This matches the proven probe. For large rebuilds, consider batching with `get_text_embedding_batch` as a later optimization (YAGNI for now).
-- **Compaction retention** defaults to 1 hour (`get_llm_index_compaction_retention`); tests call `compact(retention_seconds=0)` directly to force a single version.
diff --git a/docs/superpowers/specs/2026-06-02-lancedb-vector-store-design.md b/docs/superpowers/specs/2026-06-02-lancedb-vector-store-design.md
deleted file mode 100644
index 722709a59..000000000
--- a/docs/superpowers/specs/2026-06-02-lancedb-vector-store-design.md
+++ /dev/null
@@ -1,448 +0,0 @@
-# Replace the FAISS vector store with LanceDB
-
-**Date:** 2026-06-02
-**Status:** Design — pending implementation plan
-**Area:** `src/paperless_ai/` (AI / LLM index feature)
-
-## Problem
-
-The optional AI feature stores document embeddings in a llama-index `StorageContext`
-made of three file-backed components persisted under `DATA_DIR/llm_index/`:
-
-| Component                                | Role                                                 | Backing           |
-| ---------------------------------------- | ---------------------------------------------------- | ----------------- |
-| `FaissVectorStore` (`faiss.IndexFlatL2`) | the vectors                                          | binary faiss file |
-| `SimpleDocumentStore`                    | node text + metadata (source of truth for retrieval) | one large JSON    |
-| `SimpleIndexStore`                       | `vector_id → node_id` map                            | JSON              |
-
-`faiss.IndexFlatL2` is append-only and has no metadata filtering, and all three
-components are whole-file, load-everything-into-RAM structures. That combination —
-not FAISS alone — drives the bulk of the surrounding complexity and is what fails
-on large installs:
-
-1. **Deletes are fake.** On update/remove, `remove_document_docstore_nodes`
-   (`indexing.py:182`) deletes nodes from the _docstore_ only; the FAISS vectors
-   physically remain forever. The only way to truly reclaim them is a full
-   `rebuild=True` (re-embed every document).
-2. **No metadata filtering** forces the entire custom `DocumentFilteredFaissRetriever`
-   (`chat.py:78-151`) with its expanding `top_k *= 2` loop to emulate a
-   `document_id IN (...)` filter.
-3. **Whole-docstore Python scans.** `query_similar_documents` (`indexing.py:419`)
-   iterates the full docstore in Python to translate `document_id → node_id`.
-4. **Write amplification.** Every single-document add/update/remove takes a global
-   `FileLock` and calls `storage_context.persist()`, which rewrites the entire
-   multi-GB JSON docstore — O(N) memory and O(N) disk per document operation.
-5. **Brute-force query.** `IndexFlatL2` is O(N·d) per search with no ANN.
-
-We cannot predict or bound a user's install size, so the replacement must scale from
-a handful of documents to very large corpora on a single node, with no extra service.
-
-## Constraints (decided during brainstorming)
-
-- **Engine-agnostic, on-disk store.** Paperless supports SQLite, PostgreSQL _and_
-  MariaDB, so DB-integrated vectors (e.g. pgvector) are out — the vector store stays
-  a self-contained on-disk artifact like today's `llm_index` dir, identical across DB
-  backends.
-- **Swap the storage layer only.** Keep llama-index as the framework. `VectorStoreIndex`,
-  the retrievers, the chat query engine + response synthesizer, `SimpleNodeParser`, and
-  the embedding-model abstraction are all unchanged. Only the `StorageContext` trio is
-  replaced.
-- **Store: LanceDB**, integrated via a **custom `BasePydanticVectorStore` adapter** we
-  own (`PaperlessLanceVectorStore`) talking to `lancedb` + `pyarrow` directly — _not_ the
-  official `llama-index-vector-stores-lancedb` wrapper. The wrapper was evaluated and
-  rejected: it hard-requires `pandas`, hides `index_type` behind `**kwargs`, and _raises_
-  on empty query results. A ~150-180 line adapter against llama-index's stable public
-  interfaces avoids all three and lets us own the table schema. (See "Why a custom
-  adapter".)
-- **ANN: auto threshold.** Small installs use LanceDB's exact (brute-force) kNN, which
-  LanceDB's own docs call sufficient for datasets up to ~100K vectors. Past a threshold
-  we build an IVF index automatically, best-effort, with exact search as the
-  always-valid fallback.
-- **pandas is eliminated.** `llama-index-core` does not depend on pandas, and the custom
-  adapter materializes LanceDB results via `pyarrow` (`.to_list()`), so pandas never
-  enters the dependency tree. `pyarrow` is a direct dep but arrives transitively through
-  `lancedb` regardless.
-
-## Why LanceDB
-
-LanceDB is the only embedded, serverless candidate architected for **disk-resident,
-memory-mapped** operation — RAM does not scale with the corpus, which is the single
-most important property for "tiny or very large, equally." It provides real CRUD
-(predicate `delete`, `add`), filtered search, and IVF ANN, all writing to a directory on
-disk. Because our adapter declares `stores_text = True`, llama-index runs off the vector
-store alone — so both `SimpleDocumentStore` and `SimpleIndexStore` are deleted outright.
-
-Verified against `lancedb 0.33.0` with functional probes:
-
-- A `lancedb` table on disk is memory-mapped; writes are durable on call (connect = a
-  directory, table = a Lance dataset). **No `persist()` and no whole-file rewrite.**
-- `table.delete('doc_id = "..."')` is a real predicate delete that physically removes
-  rows (probe: a 2-chunk doc dropped to 0 rows).
-- `table.add(rows)` appends; `merge_insert(...).when_not_matched_by_source_delete(...)`
-  provides an atomic upsert that also prunes stale chunks — the incremental update path
-  (see §3). Verified: a doc going 5→3 chunks ends with exactly the 3 new chunks.
-- `table.search(embedding).where('document_id IN (...)').limit(k).to_list()` returns
-  plain dicts via `pyarrow` (**no pandas**), and returns `[]` cleanly on no match — **no
-  raise**.
-
-## Why a custom adapter (not `llama-index-vector-stores-lancedb`)
-
-The custom adapter was proven end-to-end through llama-index's real
-`VectorStoreIndex` → `VectorIndexRetriever` path with a `MockEmbedding`: build, update
-(delete+insert with **zero orphan rows**), `MetadataFilters(IN)` forwarded through the
-retriever, empty-filter → `[]`, and remove — all with **`pandas` never imported**. The
-adapter is ~120 lines in the probe (≈150-180 production-ready) and uses only llama-index's
-**stable public** primitives: `BasePydanticVectorStore`, `node_to_metadata_dict` /
-`metadata_dict_to_node`, `VectorStoreQuery` / `VectorStoreQueryResult`.
-
-Choosing the adapter over the wrapper converts several wrapper-specific liabilities into
-non-issues:
-
-| Wrapper liability                                                                       | With the custom adapter                                                                            |
-| --------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
-| Hard-imports `pandas` (`base.py:33`), uses `.to_pandas()`                               | Eliminated — `pyarrow.to_list()`                                                                   |
-| `create_index` hides `index_type`, hard-defaults `num_sub_vectors=96` (base.py:333-368) | We call `table.create_index(...)` with explicit `index_type` / partitions / sub-vectors            |
-| `query()` _raises_ `Warning` on empty results (base.py:560-563)                         | Our `query()` returns an empty `VectorStoreQueryResult` natively                                   |
-| `_to_lance_filter` prefixes `metadata.<key>`; fragile when `_metadata_keys is None`     | Dedicated top-level `document_id` column; filter is plain `document_id IN (...)`, scalar-indexable |
-| Third-party package to pin and track for API drift                                      | No integration package; depend only on stable llama-index core interfaces                          |
-
-The cost is ~150-180 lines we own and test (vs. a ~10-line subclass) — but we were
-already subclassing to swallow the empty-result `Warning` and add the ANN threshold, so
-the net additional code is modest and removes a dependency.
-
-## Design
-
-### 1. Storage layer
-
-Replace `get_or_create_storage_context()` with a vector-store factory that returns a
-`PaperlessLanceVectorStore` pointed at `settings.LLM_INDEX_DIR` with an **explicit,
-pinned `table_name`** (e.g. `LLM_INDEX_TABLE = "documents"`) used consistently by the
-factory, the existence check (§7), and the migration detection (§8). The index is built
-with `VectorStoreIndex.from_vector_store(vector_store, embed_model=...)` for the
-load/query path, and `VectorStoreIndex(nodes=..., storage_context=...)` (storage context
-holding only the vector store) for the rebuild path. No docstore, no index store.
-
-`meta.json` (embedding model name + dimension) is **kept** for embedding-model-change
-detection that forces a rebuild — unchanged from today (`embedding.py:get_embedding_dim`).
-
-### 2. `PaperlessLanceVectorStore(BasePydanticVectorStore)`
-
-A custom adapter (~150-180 lines) implementing llama-index's vector-store contract
-directly against `lancedb` + `pyarrow`. Class flags: `stores_text = True`,
-`flat_metadata = True`.
-
-**Table schema** (explicit `pyarrow` schema, created lazily on first `add`):
-
-| Column         | Type                            | Purpose                                                           |
-| -------------- | ------------------------------- | ----------------------------------------------------------------- |
-| `id`           | `string`                        | node id (`node.node_id`)                                          |
-| `doc_id`       | `string`                        | `node.ref_doc_id` (= `str(document.id)`, see §3) — the delete key |
-| `document_id`  | `string`                        | top-level filter column (mirrors `metadata["document_id"]`)       |
-| `vector`       | `fixed_size_list<float32>[dim]` | embedding                                                         |
-| `node_content` | `string`                        | `json.dumps(node_to_metadata_dict(node, remove_text=False))`      |
-
-A dedicated top-level `document_id` column (rather than the wrapper's nested
-`metadata.<key>` struct) makes filtering a plain `document_id IN (...)` predicate and
-allows an optional LanceDB **scalar index** on it for fast filtered scans.
-
-**Methods:**
-
-- `add(nodes)` — serialize each node via `node_to_metadata_dict(node, remove_text=False,
-flat_metadata=True)` into the schema above; lazily `create_table` (with the explicit
-  schema sized to the embedding dim) or `table.add(rows)` (plain append). Used by the
-  **rebuild** path (bulk insert into a fresh table) and as llama-index's `add` hook.
-  Returns node ids.
-- `upsert_document(document_id, nodes)` — the **incremental** add/update path. A single
-  `merge_insert("id").when_matched_update_all().when_not_matched_insert_all()
-.when_not_matched_by_source_delete("document_id = '<id>'").execute(rows)` — atomic
-  replace-with-prune for one document (see §3). All nodes passed must belong to the one
-  `document_id`. Nodes are embedded before the call (the incremental path embeds with the
-  configured `embed_model` rather than going through `index.insert_nodes`).
-- `delete(ref_doc_id)` — `table.delete(f'doc_id = "{ref_doc_id}"')` (parameter-escaped).
-  Used for document removal.
-- `delete_nodes(node_ids)` — `table.delete('id IN (...)')` (for completeness).
-- `get_nodes(node_ids=None, filters=None)` — `table.search().where(...).to_list()`,
-  rebuild nodes via `metadata_dict_to_node(json.loads(row["node_content"]))`. Returns `[]`
-  cleanly when empty — the correct primitive for the chat no-content pre-check.
-- `query(VectorStoreQuery)` — `table.search(query.query_embedding).where(_build_where(
-query.filters)).limit(query.similarity_top_k).to_list()`; rebuild nodes, map LanceDB L2
-  `_distance` → a similarity score, and **return an empty `VectorStoreQueryResult` on no
-  match (no raise)**.
-- `client` property → the `lancedb` connection.
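-
-The distance-to-score mapping in `query()` is not pinned down above. A minimal sketch,
-assuming a simple monotonic `1 / (1 + distance)` mapping (an illustration only, not
-necessarily what the adapter uses) over rows shaped like LanceDB's `.to_list()` output.
-The helper names are hypothetical:
-
-```python
-from typing import Any
-
-
-def distance_to_similarity(distance: float) -> float:
-    """Map an L2 distance (>= 0) to a score in (0, 1]; closer rows score higher."""
-    # Assumed mapping: 1 / (1 + d). Negative inputs are clamped to 0 defensively.
-    return 1.0 / (1.0 + max(distance, 0.0))
-
-
-def rows_to_scores(rows: list[dict[str, Any]]) -> tuple[list[str], list[float]]:
-    """Split result rows into parallel id / similarity lists."""
-    ids = [row["id"] for row in rows]
-    similarities = [distance_to_similarity(row["_distance"]) for row in rows]
-    return ids, similarities
-
-
-# An empty result set maps to empty lists rather than an exception.
-empty_ids, empty_scores = rows_to_scores([])
-ids, scores = rows_to_scores(
-    [{"id": "a", "_distance": 0.0}, {"id": "b", "_distance": 1.0}],
-)
-```
-
-Any strictly decreasing function of the distance would preserve the ranking; the choice
-only affects the absolute score values callers see.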
-
-**Filter translation** — `_build_where(MetadataFilters)` handles exactly the operators we
-use (`EQ`, `IN`) on the top-level `document_id` column, string-escaping values. This is
-small, fully owned, and free of the wrapper's `metadata.`-prefix / `_metadata_keys`
-behavior.
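-
-A minimal sketch of the translation described above. To stay free of third-party imports
-it uses hypothetical stand-ins (`Op`, `Filter`) for llama-index's `FilterOperator` and
-`MetadataFilter`, and it assumes single-quoted SQL string literals with embedded quotes
-doubled; both points are assumptions, not the adapter's confirmed behavior:
-
-```python
-from dataclasses import dataclass
-from enum import Enum
-
-
-class Op(Enum):  # hypothetical stand-in for FilterOperator
-    EQ = "=="
-    IN = "in"
-
-
-@dataclass
-class Filter:  # hypothetical stand-in for MetadataFilter
-    key: str
-    value: str | list[str]
-    operator: Op = Op.EQ
-
-
-def quote(value: object) -> str:
-    """Render a SQL string literal, doubling embedded single quotes."""
-    return "'" + str(value).replace("'", "''") + "'"
-
-
-def build_where(filters: list[Filter]) -> str | None:
-    """Translate EQ / IN filters on document_id into a WHERE predicate."""
-    clauses: list[str] = []
-    for f in filters:
-        if f.key != "document_id":
-            raise ValueError(f"unsupported filter key: {f.key}")
-        if f.operator is Op.EQ:
-            clauses.append(f"document_id = {quote(f.value)}")
-        elif f.operator is Op.IN:
-            values = f.value if isinstance(f.value, list) else [f.value]
-            if not values:
-                # Assumption: callers short-circuit an empty IN list themselves.
-                raise ValueError("empty IN list")
-            joined = ", ".join(quote(v) for v in values)
-            clauses.append(f"document_id IN ({joined})")
-        else:
-            raise ValueError(f"unsupported operator: {f.operator}")
-    # None means "unconstrained": the caller omits .where() entirely.
-    return " AND ".join(clauses) if clauses else None
-```
-
-For example, `build_where([Filter("document_id", ["1", "2"], Op.IN)])` yields
-`document_id IN ('1', '2')`, and an empty filter list yields `None`.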
-
-**Auto ANN index** — `maybe_create_ann_index()`, called after build/update writes **while
-holding the global write lock** (it is itself a write path): if the table row count
-exceeds `ANN_INDEX_MIN_ROWS` (~100K chunks, per LanceDB guidance) and no vector index
-exists yet, best-effort `table.create_index(...)`:
-
-- **Index type by divisibility.** IVF_PQ requires `num_sub_vectors` to _evenly divide_
-  the embedding dimension — LanceDB raises a hard `RuntimeError` otherwise (verified). The
-  dimension is detected at runtime from a user-configurable model and many common dims
-  (e.g. 1024) are **not** divisible by 96. So: pick a `num_sub_vectors` that divides the
-  dim and build **IVF_PQ**; if none exists, build **IVF_FLAT** (`index_type="IVF_FLAT"`),
-  which has no divisor constraint and still gives IVF/ANN speedup — strictly better than
-  reverting to full brute-force. (Talking to `lancedb` directly, `index_type` is just a
-  named argument — none of the wrapper's kwargs-smuggling.)
-- `num_partitions`: LanceDB guidance is ≈ `num_rows / 4096`; clamp to a sane minimum.
-- Wrapped in `try/except` — a failure logs and leaves the table on exact search, which is
-  always correct.
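-
-The parameter selection above can be sketched as a pure function. The candidate list
-`PQ_CANDIDATES`, the floor `MIN_PARTITIONS`, and the function name are hypothetical values
-chosen for illustration; only the divisibility rule and the `num_rows / 4096` heuristic
-come from the text:
-
-```python
-PQ_CANDIDATES = (96, 64, 48, 32, 16, 8)  # hypothetical preference order
-MIN_PARTITIONS = 8  # hypothetical "sane minimum"
-
-
-def choose_index_params(dim: int, num_rows: int) -> dict[str, object]:
-    """Pick IVF_PQ when some candidate divides dim, otherwise IVF_FLAT."""
-    num_partitions = max(MIN_PARTITIONS, num_rows // 4096)
-    for num_sub_vectors in PQ_CANDIDATES:
-        if dim % num_sub_vectors == 0:
-            return {
-                "index_type": "IVF_PQ",
-                "num_partitions": num_partitions,
-                "num_sub_vectors": num_sub_vectors,
-            }
-    # No divisor found: IVF_FLAT has no sub-vector constraint.
-    return {"index_type": "IVF_FLAT", "num_partitions": num_partitions}
-
-
-params_768 = choose_index_params(768, 200_000)
-params_1024 = choose_index_params(1024, 200_000)
-params_odd = choose_index_params(1001, 10_000)
-```
-
-With these assumed candidates a 768-dim model gets 96 sub-vectors, a 1024-dim model falls
-through to 64 (since 96 does not divide 1024), and a dim with no even divisor such as 1001
-lands on IVF_FLAT. The resulting dict is meant to be passed to `table.create_index(...)`
-inside the `try/except` described above.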
-
-### 3. Node identity
-
-In `build_document_node` (`indexing.py:109`), set the `LlamaDocument` `id_` to
-`str(document.id)`. `SimpleNodeParser` propagates that as each chunk node's
-`ref_doc_id`, and the adapter stores it in the `doc_id` column. Result: every chunk of a
-paperless document shares `ref_doc_id == str(document.id)`, so one `delete(str(doc.id))`
-clears exactly that document's chunks (verified end-to-end). `document_id` also remains in
-node metadata (and is mirrored to the top-level filter column) for filtering and result
-mapping.
-
-**Update = native upsert via `merge_insert` (one atomic commit).** The incremental
-add/update path uses a single `merge_insert`, not delete-then-add:
-
-```
-table.merge_insert("id")
-     .when_matched_update_all()
-     .when_not_matched_insert_all()
-     .when_not_matched_by_source_delete(f"document_id = '{document_id}'")
-     .execute(new_rows)
-```
-
-The `when_not_matched_by_source_delete` clause — scoped to the document's `document_id`
-— prunes stale trailing chunks (the case where an edit reduces a document's chunk count)
-**atomically in the same commit**. Verified on 0.33.0: a doc going 5→3 chunks ends with
-exactly the 3 new chunks, other documents untouched, and it works whether or not chunk
-ids are deterministic (non-matching ids become a full replace).
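-
-To make the three clauses concrete, here is a pure-Python model of the chain's intended
-semantics, using a list of dicts as a stand-in for the table. It is a sketch of the
-behavior described above, not LanceDB code, and `simulate_upsert` is a hypothetical name:
-
-```python
-from typing import Any
-
-Row = dict[str, Any]
-
-
-def simulate_upsert(table: list[Row], document_id: str, new_rows: list[Row]) -> list[Row]:
-    """Model merge_insert("id") with update, insert, and scoped delete clauses."""
-    source = {row["id"]: row for row in new_rows}
-    result: list[Row] = []
-    for row in table:
-        if row["id"] in source:
-            # when_matched_update_all: the source row replaces the target row.
-            result.append(source.pop(row["id"]))
-        elif row["document_id"] == document_id:
-            # when_not_matched_by_source_delete: stale chunk of this document.
-            continue
-        else:
-            # Other documents fail the delete predicate and are left alone.
-            result.append(row)
-    # when_not_matched_insert_all: remaining source rows are new.
-    result.extend(source.values())
-    return result
-
-
-old = [{"id": f"7-{i}", "document_id": "7", "text": f"old {i}"} for i in range(5)]
-other = [{"id": "9-0", "document_id": "9", "text": "other"}]
-new = [{"id": f"7-{i}", "document_id": "7", "text": f"new {i}"} for i in range(3)]
-after = simulate_upsert(old + other, "7", new)
-```
-
-Document 7 shrinks from five chunks to three and document 9 is untouched. Dropping the
-`elif` branch reproduces the plain-upsert failure mode: chunks `7-3` and `7-4` would stay
-behind as live rows.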
-
-This is strictly better than delete-then-add on three axes:
-
-- **Atomicity / no transient empty state.** Queries take no lock (§6), so delete-then-add
-  exposes a window between the delete commit and the add commit in which a concurrent
-  reader sees the document with **zero chunks**. A single `merge_insert` commit eliminates
-  that window — a reader sees either the old or the new chunk set.
-- **Half the version growth.** One commit per update instead of two, directly halving the
-  MVCC version accumulation that compaction (§10) must reclaim.
-- **Correctness preserved** without a separate delete call.
-
-> **Important:** `optimize()` prunes old _versions_, **not** dead _rows_ in the live
-> version. A plain upsert (update+insert without the delete clause) would leave stale
-> chunks as live rows that `optimize` can never remove — so the
-> `when_not_matched_by_source_delete` clause is mandatory, not optional.
-
-> **Index caveat (LanceDB #3177):** `merge_insert` can fail _silently_ after `optimize()`
-> when a scalar index exists on the **match column**. We match on `id`, so a scalar index
-> must **never** be created on `id`. The optional scalar index for filtering goes on
-> `document_id` only (§2), which is not the match column.
-
-### 4. The four operations collapse
-
-| Operation    | Before                                                                                                      | After                                                                                   |
-| ------------ | ----------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
-| add / update | load whole index → `remove_document_docstore_nodes` (fake delete) → `insert_nodes` → `persist` (rewrite GB) | `store.upsert_document(str(doc.id), embedded_nodes)` (one atomic `merge_insert` commit) |
-| remove       | load → docstore delete → persist                                                                            | `store.delete(str(doc.id))`                                                             |
-| similar      | load whole docstore, Python scan for node ids, custom retriever                                             | `VectorIndexRetriever(index, similarity_top_k=k, filters=document_id IN allowed)`       |
-| chat         | custom `DocumentFilteredFaissRetriever` (74 lines)                                                          | stock `VectorIndexRetriever(filters=document_id IN doc_ids)`                            |
-
-Deleted entirely: `remove_document_docstore_nodes`, the whole-docstore scan in
-`query_similar_documents`, and `_get_document_filtered_retriever` /
-`DocumentFilteredFaissRetriever` in `chat.py`. `chat.py`'s direct reaches into
-`index.docstore.docs`, `index.vector_store._faiss_index`, and
-`index.index_struct.nodes_dict` all disappear. The "no content" pre-check in
-`_stream_chat_with_documents` becomes a `store.get_nodes(filters=...)` existence check —
-the adapter's `get_nodes` returns `[]` cleanly on no match, so it is the correct
-primitive for an existence test. References are still derived from returned nodes'
-`metadata["document_id"]` / `metadata["title"]`, so `_get_document_references` is
-unchanged.
-
-Because the adapter's `query()` returns an empty result on no match (it never raises),
-both similar-docs and chat — which retrieve through `VectorIndexRetriever` /
-`RetrieverQueryEngine` calling `vector_store.query()` internally — get a clean empty
-result set instead of an exception. This was the wrapper's most disruptive wart and is
-designed out, not worked around.
-
-### 5. Filtering
-
-Both similar-document and chat retrieval pass a `MetadataFilters` with a single
-`MetadataFilter(key="document_id", operator=FilterOperator.IN, value=[...])` (omitted
-when unconstrained). The adapter's `_build_where` translates this to the plain predicate
-`document_id IN ("...","...")` against the top-level `document_id` column — no struct
-path, no `_metadata_keys` dependence, so filtering is unconditionally correct on a freshly
-opened table and across process restarts (proven by the fresh-process probe).
-
-This replaces today's `query_similar_documents` mechanism, which pre-scans the docstore
-for node ids and passes `doc_ids=` to `VectorIndexRetriever` (`indexing.py:416-434`) — a
-_different_ retriever mechanism. The new path relies on
-`VectorIndexRetriever(filters=MetadataFilters(...))` forwarding `query.filters` into
-`vector_store.query()` — **verified end-to-end in the probe** (a retriever with an `IN`
-filter returned only the matching document). Still covered by a regression test (see
-Testing) since it is load-bearing for both similar-docs and chat.
-
-### 6. Concurrency
-
-Keep the existing `FileLock(_index_lock_path())` around **writes** only. Each write is now
-a small delta append/upsert instead of a multi-GB rewrite, so the lock is held briefly.
-Queries take no lock (LanceDB reads are MVCC snapshot-consistent). The lock-free read path
-is safe for updates specifically because the incremental update is a single atomic
-`merge_insert` commit (§3) — a reader never observes a document mid-update. The lock still
-serializes _writers_ across Celery processes to avoid `CommitConflictError`.
-
-**Why the global lock is load-bearing.** LanceDB's MVCC tolerates concurrent _appends_,
-but concurrent _delete/update_ operations frequently conflict and fail with
-`CommitConflictError` after exhausting retries (LanceDB issues #1597, #3086). Paperless's
-add/update path is a `merge_insert` that updates and deletes rows (§3), which falls in
-exactly this conflict-prone class, and it runs from **separate Celery worker
-processes**. The design is safe only because `_index_lock_path()` is a single shared lock
-file under `LLM_INDEX_DIR` that serializes _all_ writers. This lock must:
-
-- remain a single global lock (do **not** relax to per-document granularity), and
-- cover every write path — add, update, remove, **and** `maybe_create_ann_index()`.
-
-### 7. Index existence / rebuild trigger
-
-Replace `vector_store_file_exists()` with a check for the LanceDB table's existence
-(`LLM_INDEX_DIR` present and the pinned `LLM_INDEX_TABLE` in `connection.table_names()`).
-The existing `queue_llm_index_update_if_needed` / `load_or_build_index` rebuild-on-missing
-logic is otherwise unchanged.
-
-**Dimension-mismatch guard.** The Lance table's vector column dimension is fixed at the
-first `add()`. Beyond the `meta.json` model-change detection (which forces a rebuild when
-the _model name_ changes), guard against a dimension mismatch directly: if the current
-embedding dim differs from the existing table's vector dim, force a rebuild rather than
-letting `add()` fail with a hard dimension error. This covers the gaps `meta.json` can't —
-a missing/corrupt `meta.json`, or two models sharing a name but differing in dim.
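-
-A minimal sketch of how the two checks could combine into one rebuild decision. The
-`IndexState` type and `needs_rebuild` name are hypothetical, and treating a missing
-`meta.json` as "fall back to the dimension check alone" is an assumption:
-
-```python
-from dataclasses import dataclass
-
-
-@dataclass(frozen=True)
-class IndexState:
-    model_name: str | None  # from meta.json; None if missing or unreadable
-    table_dim: int | None  # vector column dim of the table; None if no table
-
-
-def needs_rebuild(state: IndexState, current_model: str, current_dim: int) -> bool:
-    """Return True when the on-disk index cannot serve the current model."""
-    if state.table_dim is None:
-        return True  # no table yet
-    if state.table_dim != current_dim:
-        return True  # add() would fail with a dimension error
-    if state.model_name is not None and state.model_name != current_model:
-        return True  # meta.json records a different model
-    return False
-
-
-same = needs_rebuild(IndexState("model-a", 768), "model-a", 768)
-renamed = needs_rebuild(IndexState("model-a", 768), "model-b", 768)
-resized = needs_rebuild(IndexState("model-a", 768), "model-a", 1024)
-no_meta = needs_rebuild(IndexState(None, 768), "model-a", 768)
-```
-
-The dimension check runs before the name check, so two models that share a name but
-differ in dimension still trigger a rebuild.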
-
-### 8. Migration
-
-The index is fully derived data, rebuildable from `Document` rows. On first run of the
-new code, detect the stale FAISS format (presence of `default__vector_store.json` /
-faiss files with no LanceDB table), wipe `LLM_INDEX_DIR`, and trigger a rebuild through
-the existing `queue_llm_index_update_if_needed(rebuild=...)` path. No data migration and
-no user action beyond the automatic background rebuild.
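-
-The detection step might look like the sketch below. It checks the filesystem directly
-instead of calling `connection.table_names()`, and it assumes LanceDB stores a table as a
-`<name>.lance` directory; both are assumptions made to keep the example standard-library
-only. `has_stale_faiss_index` is a hypothetical name, and only the
-`default__vector_store.json` marker is checked here:
-
-```python
-import tempfile
-from pathlib import Path
-
-LLM_INDEX_TABLE = "documents"  # the pinned table name suggested in §1
-
-
-def has_stale_faiss_index(index_dir: Path, table_name: str = LLM_INDEX_TABLE) -> bool:
-    """True when old llama-index files exist and no LanceDB table does."""
-    if not index_dir.is_dir():
-        return False
-    has_faiss_marker = (index_dir / "default__vector_store.json").exists()
-    # Assumption: a LanceDB table lives in "<table_name>.lance" under the db dir.
-    has_lance_table = (index_dir / f"{table_name}.lance").is_dir()
-    return has_faiss_marker and not has_lance_table
-
-
-with tempfile.TemporaryDirectory() as tmp:
-    index_dir = Path(tmp)
-    (index_dir / "default__vector_store.json").write_text("{}")
-    stale_before = has_stale_faiss_index(index_dir)
-    (index_dir / "documents.lance").mkdir()
-    stale_after = has_stale_faiss_index(index_dir)
-```
-
-When the function returns `True`, the caller would wipe `LLM_INDEX_DIR` and queue the
-rebuild as described above.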
-
-### 9. Dependencies (`pyproject.toml`)
-
-- **Remove:** `faiss-cpu`, `llama-index-vector-stores-faiss`.
-- **Add:** `lancedb` (pulls in `pyarrow`, `numpy`, `pydantic`, `tqdm`) and `pyarrow`
-  (declared directly since the adapter imports it, even though `lancedb` pulls it
-  transitively). **No `llama-index-vector-stores-lancedb`, no `pandas`** — `llama-index-core`
-  does not require pandas (verified) and the adapter uses `pyarrow.to_list()`.
-- Confirm multi-arch wheels (linux x86_64 + aarch64, the paperless Docker targets) for
-  `lancedb`/`pyarrow` resolve in the lockfile. (`lancedb 0.33.0` ships manylinux x86_64 +
-  aarch64 wheels, matching the paperless Docker build matrix.)
-
-### 10. Maintenance / compaction — **required, not optional**
-
-MVCC has a real disk cost that this design must actively manage. LanceDB writes a **new
-fragment + version on every `add`/`delete`** and retains the superseded files until
-cleanup. Paperless adds/updates documents **one at a time**, so the store bloats
-continuously without maintenance. Measured on 2000 × 768-dim vectors (raw float32 =
-6000 KiB):
-
-| Scenario                                                | On disk                | Versions |
-| ------------------------------------------------------- | ---------------------- | -------- |
-| One bulk insert (= a rebuild)                           | 6016 KiB               | 1        |
-| 2000 single-row adds (= per-document writes)            | **172,848 KiB (~28×)** | 2001     |
-| After `table.optimize(cleanup_older_than=timedelta(0))` | **6344 KiB**           | 1        |
-
-Implications:
-
-- **Full rebuilds are naturally compact** (bulk insert ≈ raw vector bytes), so a rebuild
-  resets accumulated bloat.
-- **The atomic upsert (§3) halves _update_ version growth** (one commit instead of
-  delete-then-add's two), but every new-document insert is still its own version, so
-  versions accumulate over time regardless — compaction remains required.
-- **Per-document writes must be compacted periodically.** Run
-  `table.optimize(cleanup_older_than=<retention>)` — a **single call** that compacts
-  fragments _and_ drops old versions — folded into the existing scheduled LLM-index
-  maintenance task, under the global write lock. Use a small but non-zero retention in
-  production (e.g. minutes–hours) so the files backing an old version are not deleted
-  while an in-flight reader still uses them; `timedelta(0)` is for tests/rebuild-time only.
-- **Do not use the older `cleanup_old_versions()`** API: it requires the separate
-  `pylance` package (not pulled by `lancedb` core) and is superseded by
-  `optimize(cleanup_older_than=...)`.
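-
-The raw-size figure and the ratios implied by the table above can be re-derived with
-simple arithmetic. The helper name is hypothetical; the inputs are the measured values
-quoted in the table:
-
-```python
-def raw_vector_kib(num_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
-    """Size of the bare float32 vectors in KiB, with no metadata or overhead."""
-    return num_vectors * dim * bytes_per_value / 1024
-
-
-raw_kib = raw_vector_kib(2000, 768)  # 2000 * 768 * 4 bytes
-bulk_kib, single_row_kib, compacted_kib = 6016, 172_848, 6344
-
-bloat_ratio = single_row_kib / bulk_kib  # per-document writes vs. one bulk insert
-compacted_overhead = compacted_kib / bulk_kib - 1  # residual after optimize()
-```
-
-Here `raw_kib` is 6000.0, `bloat_ratio` rounds to 28.7, and `compacted_overhead` is about
-5.5%, so compaction brings the table back to within a few percent of a fresh bulk insert.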
-
-**On the "larger on disk than FAISS" observation:** at small scale LanceDB stores vectors
-as **raw `float32`** (identical per-vector bytes to FAISS `IndexFlatL2`); vector
-_compression_ only comes from the IVF_PQ index, which only exists past the ANN threshold
-(§2). So a small dataset is expected to be _comparable to, not smaller than_, FAISS — and any
-large discrepancy is version accumulation, fixed by the compaction above.
-
-> **Windows note:** the probe hit an `Access is denied` error writing a version-hint file
-> during cleanup on Windows (temp-dir file locking). Paperless production is Linux
-> containers, so this does not affect the deployment target, but bare-metal Windows dev
-> installs may need attention.
-
-## Testing
-
-Per project conventions (pytest-style, classes with `@pytest.mark.django_db`,
-pytest-mock, factory-boy, type-annotated fixtures/tests, default config). LanceDB writes
-to a real directory, so tests point `settings.LLM_INDEX_DIR` at `tmp_path` and exercise a
-**real** (tiny) LanceDB table with a stub embedding model returning deterministic vectors
-— no mocking of store internals.
-
-- **add → query** returns the document.
-- **update** via `upsert_document` leaves no orphan rows — re-index a document whose chunk
-  count _shrinks_ (e.g. 5→3) and assert exactly the new chunks remain and other documents
-  are untouched (this is the guarantee the old fake delete could not provide, and it proves
-  `when_not_matched_by_source_delete` prunes stale chunks).
-- **update is one commit** — assert the table version advances by exactly 1 per
-  `upsert_document` (guards the atomicity / version-growth property).
-- **remove** drops all of a document's chunks.
-- **filtered query** scopes results to the given `document_id`s and excludes others.
-- **empty query** returns `[]` (the adapter's `query()` never raises).
-- **node round-trip**: a node serialized via `node_to_metadata_dict` and reconstructed via
-  `metadata_dict_to_node` preserves text + metadata (`document_id`, `title`).
-- **embedding-model change** → `meta.json` mismatch forces rebuild (existing behavior).
-- **dimension-mismatch guard** → a current embedding dim differing from the stored table
-  dim forces a rebuild rather than a hard `add()` failure.
-- **ANN threshold** trigger logic with a low test threshold: `maybe_create_ann_index`
-  attempts an index past the threshold and is a no-op below it; a `create_index` failure
-  is non-fatal and leaves exact search working.
-- **ANN fallback on a non-divisible dim**: with an embedding dim not divisible by the PQ
-  `num_sub_vectors` (e.g. 1024), `maybe_create_ann_index` builds IVF_FLAT (or the
-  try/except fallback fires) and leaves the table queryable, not broken/unindexed.
-- **Fresh-process filtering**: construct a brand-new `PaperlessLanceVectorStore` against an
-  existing on-disk table and assert an `IN` filter still returns the right rows — the
-  cross-restart path.
-- **Retriever forwards filters**: assert `VectorIndexRetriever(filters=MetadataFilters(...))`
-  built on `VectorStoreIndex.from_vector_store(...)` actually scopes results — the
-  load-bearing integration seam for similar-docs and chat.
-- **Compaction reclaims versions**: after several single-document writes, the maintenance
-  `optimize(cleanup_older_than=...)` call reduces the table to a single version and
-  results stay queryable afterward.
-- **Upsert after optimize** (LanceDB #3177 guard): with a scalar index on `document_id`
-  (and none on `id`), an `upsert_document` performed _after_ `optimize()` still prunes and
-  replaces correctly. This is already verified, but pin it with a test so that a future
-  index-placement change or LanceDB regression is caught.
-- Parametrize the add/update/remove variations rather than duplicating bodies.
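
The "no orphan rows" expectation in the **update** bullet can be sketched without LanceDB at all. The block below is a minimal in-memory model, not the adapter: `upsert_document` here is a hypothetical stand-in that applies the three merge rules (update matched ids, insert new ids, delete this document's rows that are missing from the source), and the `"{document_id}-{i}"` chunk-id scheme is an assumption made only for illustration.

```python
from typing import TypedDict


class Row(TypedDict):
    id: str
    document_id: int
    text: str


def upsert_document(table: list[Row], document_id: int, new_rows: list[Row]) -> list[Row]:
    """Return the table after replacing one document's chunks.

    In-memory stand-in for the merge rules, scoped to `document_id`:
    matched ids are updated, unseen ids are inserted, and rows of this
    document that are absent from `new_rows` are deleted.
    """
    incoming = {row["id"]: row for row in new_rows}
    result: list[Row] = []
    for row in table:
        if row["document_id"] != document_id:
            result.append(row)  # other documents are never touched
        elif row["id"] in incoming:
            result.append(incoming.pop(row["id"]))  # matched: replace the row
        # else: not matched by source, so the stale chunk is dropped
    result.extend(incoming.values())  # not matched: insert
    return result


def chunks(document_id: int, count: int, label: str) -> list[Row]:
    # Hypothetical id scheme: "<document_id>-<chunk index>".
    return [
        Row(id=f"{document_id}-{i}", document_id=document_id, text=f"{label} chunk {i}")
        for i in range(count)
    ]


# Document 1 shrinks from 5 chunks to 3; document 2 must be untouched.
table = chunks(1, 5, "old") + chunks(2, 2, "other")
table = upsert_document(table, 1, chunks(1, 3, "new"))

doc1_ids = sorted(r["id"] for r in table if r["document_id"] == 1)
doc2_ids = sorted(r["id"] for r in table if r["document_id"] == 2)
```

The real test would make the same two assertions (`doc1_ids` holds only the three new chunk ids, `doc2_ids` is unchanged) against rows read back from the on-disk table, plus the version-count check from the next bullet.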
-
-## Out of scope
-
-- Replacing llama-index for chunking, embeddings, or the chat query engine.
-- Any DB-integrated (pgvector-style) path.
-- Hybrid / full-text / reranked search modes offered by LanceDB (vector search only,
-  matching current behavior).
-- Tuning embedding models or chunking parameters.
-
-## Open risks
-
-- **pyarrow/lancedb footprint.** `lancedb` + `pyarrow` (native wheels) enlarge the optional
-  AI feature's dependency tree; verify image-size impact when updating the lockfile. (Still
-  lighter than the wrapper path, which added `pandas` on top of these.)
-- **ANN index parameters.** The IVF_PQ-vs-IVF_FLAT-by-divisibility logic (§2) plus the
-  best-effort/exact fallback limits the correctness risk, but the row threshold and
-  `num_partitions` heuristic should still be validated on a large fixture for actual query
-  latency.
-- **We own the adapter.** We depend on llama-index's `BasePydanticVectorStore` interface
-  and the `node_to_metadata_dict` / `metadata_dict_to_node` helpers. These are stable core
-  APIs (far more stable than the integration package), but a major llama-index bump should
-  re-run the end-to-end retriever test. Pin a known-good `lancedb` and `llama-index-core`.
-- **`merge_insert` + scalar index on the match column (LanceDB #3177).** `merge_insert` can
-  fail _silently_ after `optimize()` if a scalar index exists on the match column. We match
-  on `id` and only index `document_id`, so we are unaffected, but this is an invariant to
-  enforce (never index `id`) and to cover with a test that exercises
-  upsert-after-optimize.
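
The threshold and divisibility decision named under **ANN index parameters** can be isolated as a pure function, which makes it easy to parametrize in tests. This is a sketch under assumptions: `choose_ann_index_type` is a hypothetical helper (the plan only names `maybe_create_ann_index`), treating a row count equal to the threshold as "past" it is a guess, and `num_sub_vectors=96` in the usage lines is an illustrative value, not a confirmed setting.

```python
from typing import Literal, Optional

IndexType = Literal["IVF_PQ", "IVF_FLAT"]


def choose_ann_index_type(
    row_count: int,
    dim: int,
    *,
    row_threshold: int,
    num_sub_vectors: int,
) -> Optional[IndexType]:
    """Pick the ANN index to attempt, or None to stay on exact search.

    Assumption: the index is attempted once row_count reaches the
    threshold; the plan only says "past the threshold".
    """
    if row_count < row_threshold:
        return None  # below the threshold: no-op, exact search remains
    if dim % num_sub_vectors == 0:
        return "IVF_PQ"  # PQ needs dim to split evenly into sub-vectors
    return "IVF_FLAT"  # non-divisible dim: fall back to unquantized IVF


# Illustrative values only; 96 is an assumed num_sub_vectors.
below = choose_ann_index_type(10, 1024, row_threshold=100, num_sub_vectors=96)
flat = choose_ann_index_type(500, 1024, row_threshold=100, num_sub_vectors=96)
pq = choose_ann_index_type(500, 768, row_threshold=100, num_sub_vectors=96)
```

Keeping the decision separate from the `create_index` call means the try/except fallback only has to wrap the LanceDB call itself, and the latency validation on a large fixture can vary `row_threshold` without touching index-building code.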