import { CodeTabs } from '../views/docs/CodeTabs';
import { CodeBlock } from '../views/docs/CodeBlock';
import { Callout, NextLink } from '../views/docs/prose';
import {
  DOCUMENT_QA_SIG_PY,
  DOCUMENT_QA_SIG_TS,
  DOCUMENT_QA_EX_PY,
  DOCUMENT_QA_EX_TS,
  DOCUMENT_QA_MGMT_EX_PY,
} from './helpers-constants';

## DocumentQA

`DocumentQA` answers caller questions against documents using retrieval-augmented generation (RAG). It operates in one of two modes:

- **Server mode (default):** Documents are uploaded to the Guava server and questions are answered server-side. Intended for simple use cases with few documents.
- **Local mode:** Bring your own vector store and generation model for full control over the RAG pipeline. Guava provides ready-made backends for ChromaDB, LanceDB, pgvector, and Pinecone.

### Constructor

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `store` | `VectorStore \| None` | No | `None` | Vector store for local mode. When omitted, server mode is used automatically. |
| `documents` | `list[str] \| str \| None` | No | `None` | Documents to index at construction time. Accepts a single string or a list. |
| `ids` | `list[str] \| None` | No | `None` | Caller-provided IDs for each document, enabling later `upsert_document` / `delete_document`. Length must match `documents` if provided. |
| `chunk_size` | `int` | No | `5000` | Maximum characters per chunk (local mode only). |
| `chunk_overlap` | `int` | No | `200` | Overlap between consecutive chunks in characters (local mode only). |
| `instructions` | `str \| None` | No | `None` | System instruction for the generation model. Overrides the built-in default. |
| `generation_model` | `GenerationModel \| None` | Local mode | `None` | Generation model for producing answers. Required when `store` is provided. |
| `namespace` | `str \| None` | Server mode | `None` | Stable string to scope this instance's documents on the server. |

**Namespace requirement:** In server mode, `namespace` is required when running multiple `DocumentQA` instances concurrently, even across different files. Without a namespace, concurrent instances may interfere with each other's document stores.

### Methods

**`ask(question: str, k: int = 5) -> str`**: Retrieve relevant chunks and generate an answer. In server mode, `k` is ignored (the server uses full document context).

**`upsert_document(key: str, text: str) -> None`**: Add or replace a document by key. Stale chunks from a previously longer document are deleted automatically.

**`add_document(text: str) -> None`**: Add a document without specifying a key. In server mode, uses a content-derived key (SHA-256 hash).

**`delete_document(key: str) -> None`**: Delete a previously upserted document by key.

**`clear() -> None`**: Remove all documents from the store.

### Available VectorStore Backends (Local Mode)

| Class | Import | Install | Default Embedding |
|-------|--------|---------|-------------------|
| `ChromaVectorStore` | `guava.helpers.chromadb` | `pip install 'gridspace-guava[chromadb]'` | Built-in `all-MiniLM-L6-v2` (no API needed) |
| `LanceDBStore` | `guava.helpers.lancedb` | `pip install 'gridspace-guava[lancedb]'` | Required: pass an `EmbeddingModel` |
| `PgVectorStore` | `guava.helpers.pgvector` | `pip install 'gridspace-guava[pgvector]'` | Required: pass an `EmbeddingModel` |
| `PineconeVectorStore` | `guava.helpers.pinecone` | `pip install 'gridspace-guava[pinecone]'` | `multilingual-e5-large` via Pinecone Inference |

See the Vector Stores reference for full constructor details and backend-specific options.

### Examples

### Incremental Document Management

Use `ids` to assign stable keys to documents at construction time, then use `upsert_document`, `delete_document`, and `clear` to manage documents without re-creating the `DocumentQA` instance.
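
To make the keyed-update semantics concrete, the following is a minimal in-memory sketch of how chunking with `chunk_size` / `chunk_overlap` and stale-chunk cleanup on upsert might fit together. It is not Guava's implementation: `chunk_text`, `InMemoryDocs`, and the `key:index` chunk ID scheme are hypothetical names chosen for illustration, and a real vector store would also embed each chunk.

```python
def chunk_text(text: str, chunk_size: int = 5000, chunk_overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


class InMemoryDocs:
    """Hypothetical stand-in for a keyed document store (not a Guava class)."""

    def __init__(self, chunk_size: int = 5000, chunk_overlap: int = 200) -> None:
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        self.chunks: dict[str, str] = {}  # chunk ID ("key:index") -> chunk text
        self.counts: dict[str, int] = {}  # document key -> number of stored chunks

    def upsert_document(self, key: str, text: str) -> None:
        new_chunks = chunk_text(text, self.chunk_size, self.chunk_overlap)
        for i, chunk in enumerate(new_chunks):
            self.chunks[f"{key}:{i}"] = chunk
        # Delete stale chunks left over from a previously longer document.
        for i in range(len(new_chunks), self.counts.get(key, 0)):
            del self.chunks[f"{key}:{i}"]
        self.counts[key] = len(new_chunks)

    def delete_document(self, key: str) -> None:
        # Deleting an unknown key is a no-op in this sketch.
        for i in range(self.counts.pop(key, 0)):
            del self.chunks[f"{key}:{i}"]

    def clear(self) -> None:
        self.chunks.clear()
        self.counts.clear()


docs = InMemoryDocs(chunk_size=10, chunk_overlap=2)
docs.upsert_document("faq", "a" * 25)  # stored as faq:0, faq:1, faq:2
docs.upsert_document("faq", "short")   # now one chunk; faq:1 and faq:2 are removed
```

Tracking the chunk count per key is what lets a shorter replacement remove leftover chunks, so a later query cannot retrieve text from the old version of the document.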