Vector Stores
Guava provides four ready-made VectorStore implementations that can be passed directly to DocumentQA as the store argument. Each wraps a popular vector database and handles embedding, indexing, and similarity search.
Python only: Vector store backends are currently available only in the Python SDK. TypeScript equivalents are not yet available.
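To make "similarity search" concrete, here is a minimal brute-force sketch of what a vector store does at query time. It is not how any of the four backends is implemented (they use their own indexes), and the helper names `cosine_similarity` and `top_k` as well as the toy three-dimensional vectors are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k chunk texts whose vectors are closest to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "index": (chunk text, embedding) pairs. Real embeddings have
# hundreds of dimensions and come from an embedding model.
chunks = [
    ("deductible is $500", [0.9, 0.1, 0.0]),
    ("claims are filed online", [0.1, 0.9, 0.0]),
    ("deductible resets yearly", [0.8, 0.2, 0.1]),
]
result = top_k([1.0, 0.0, 0.0], chunks, k=2)
print(result)
```

A real backend replaces the linear scan with an approximate nearest-neighbor index, which is why each store manages its own indexing.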
Installation
Install only the backend(s) you need:
pip install 'guava-sdk[chromadb]'
pip install 'guava-sdk[lancedb]'
pip install 'guava-sdk[pgvector]'
pip install 'guava-sdk[pinecone]'
Embedding and generation provider extras:
pip install 'guava-sdk[genai]'   # Google Gemini (backs GenAIEmbedding / GenAIGeneration)
pip install 'guava-sdk[openai]'  # OpenAI (backs OpenAIEmbedding / OpenAIGeneration)
Importing a backend class without the corresponding extra installed raises ImportError with an install hint.
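The pattern behind this behavior can be sketched as follows. This is an illustration of the general optional-dependency technique, not Guava's actual source; the helper name `require_extra` and the exact wording of the message are assumptions.

```python
import importlib


def require_extra(module_name, extra):
    """Import an optional dependency or fail with an install hint.

    Hypothetical helper: the real SDK may structure this differently.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is not installed. "
            f"Install it with: pip install 'guava-sdk[{extra}]'"
        ) from exc


# A module that exists imports normally.
json_module = require_extra("json", "unused")

# A missing module produces an ImportError that names the extra to install.
try:
    require_extra("surely_not_an_installed_module", "chromadb")
except ImportError as err:
    hint = str(err)
    print(hint)
```

In application code, wrapping the backend import in `try`/`except ImportError` is enough to surface the hint to the user.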
ChromaVectorStore
from guava.helpers.chromadb import ChromaVectorStore
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| path | str \| None | No | "./chroma_data" | Directory for persistent storage. Pass None for an in-memory ephemeral store. |
| collection_name | str | No | "chunks" | ChromaDB collection name. |
| embedding_model | EmbeddingModel \| None | No | None | External embedding model. When omitted, ChromaDB's built-in all-MiniLM-L6-v2 model is used, so no external API is needed. |
LanceDBStore
from guava.helpers.lancedb import LanceDBStore
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| path | str | No | "./lancedb_data" | Local path or GCS URI (e.g. "gs://bucket/lancedb") for storage. |
| table_name | str | No | "chunks" | LanceDB table name. |
| embedding_model | EmbeddingModel | Yes | — | Embedding model to use. Pass a configured instance such as GenAIEmbedding or OpenAIEmbedding. |
Note: LanceDB silently drops tables that predate the current schema version. This triggers a full re-index the next time DocumentQA ingests documents.
PgVectorStore
from guava.helpers.pgvector import PgVectorStore
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| db_url | str | Yes | — | PostgreSQL connection string (e.g. "postgresql://user:pass@host/db"). |
| table_name | str | No | "guava_chunks" | Table name for stored chunks. |
| embedding_model | EmbeddingModel | Yes | — | Embedding model to use. Pass a configured instance such as GenAIEmbedding or OpenAIEmbedding. |
PgVectorStore creates the vector extension, chunks table, and HNSW cosine index automatically on first connect. If the connecting user lacks CREATE EXTENSION privileges, initialization will fail.
Managed Postgres: Managed services (Cloud SQL, AlloyDB, RDS) are untested but expected to work since the implementation uses standard psycopg.
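For readers who need to run the setup by hand (for example, because a database administrator must create the extension), the three initialization steps correspond roughly to the SQL below. This is a sketch under assumptions: the column names (`id`, `content`, `embedding`), the index name, and the helper `pgvector_setup_sql` are hypothetical, and the statements PgVectorStore actually issues may differ. The `vector` type, the `hnsw` index method, and the `vector_cosine_ops` operator class are standard pgvector syntax.

```python
def pgvector_setup_sql(table_name="guava_chunks", dimensionality=768):
    """Build illustrative DDL for a pgvector-backed chunk table.

    Column and index names are assumptions, not the SDK's real schema.
    """
    # Identifiers cannot be bound as query parameters, so validate the
    # name before interpolating it into SQL.
    if not table_name.isidentifier():
        raise ValueError(f"unsafe table name: {table_name!r}")
    return [
        # Requires a role that is allowed to run CREATE EXTENSION.
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE IF NOT EXISTS {table_name} ("
        "id TEXT PRIMARY KEY, "
        "content TEXT NOT NULL, "
        f"embedding vector({dimensionality}));",
        # HNSW index using cosine distance.
        f"CREATE INDEX IF NOT EXISTS {table_name}_embedding_hnsw "
        f"ON {table_name} USING hnsw (embedding vector_cosine_ops);",
    ]


statements = pgvector_setup_sql()
for statement in statements:
    print(statement)
```

The vector column width has to match the embedding model's output size, so the 768 default here mirrors the GenAIEmbedding default and would need to change for other models.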
PineconeVectorStore
from guava.helpers.pinecone import PineconeVectorStore
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| api_key | str \| None | No | env PINECONE_API_KEY | Pinecone API key. If omitted, reads from the environment. |
| index_name | str | No | "guava-chunks" | Pinecone index name. Created automatically if it does not exist. |
| cloud | str | No | "aws" | Serverless cloud provider for index creation. Ignored if the index already exists. |
| region | str | No | "us-east-1" | Serverless region for index creation. Ignored if the index already exists. |
| embedding_model | EmbeddingModel \| None | No | PineconeInferenceEmbedding | Defaults to multilingual-e5-large (1024-dim) via Pinecone's hosted Inference API. |
Cold start: Pinecone index creation can take 30–60 seconds on first use. Subsequent instantiations with the same index_name skip creation and connect immediately.
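The `api_key` fallback described in the table can be pictured as the small resolution step below. It is a sketch of the common pattern only; the helper name `resolve_api_key` and the error raised when no key is found are assumptions, not documented SDK behavior.

```python
import os


def resolve_api_key(api_key=None, env_var="PINECONE_API_KEY"):
    """Prefer an explicit key, otherwise fall back to the environment.

    Hypothetical helper; the SDK's own error handling may differ.
    """
    key = api_key or os.environ.get(env_var)
    if not key:
        raise ValueError(f"No API key given and {env_var} is not set")
    return key


# Demo values only; never hard-code a real key.
os.environ["PINECONE_API_KEY"] = "key-from-environment"
from_env = resolve_api_key()
explicit = resolve_api_key("key-passed-explicitly")
print(from_env, explicit)
```

Reading the key from the environment keeps it out of source control, which is why the examples on this page call `PineconeVectorStore()` with no arguments.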
PineconeInferenceEmbedding
from guava.helpers.pinecone import PineconeInferenceEmbedding
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| pc | Pinecone | Yes | — | A configured Pinecone client instance. |
| model | str | No | "multilingual-e5-large" | Pinecone inference model name. |
| dimensionality | int | No | 1024 | Output vector size. |
GenAIEmbedding / GenAIGeneration
from guava.helpers.genai import GenAIEmbedding, GenAIGeneration
Install: pip install 'guava-sdk[genai]'
GenAIEmbedding — EmbeddingModel backed by Google Gemini. Works with either a Vertex AI client (genai.Client(vertexai=True, project=..., location=...)) or an AI Studio client (genai.Client(api_key=...)).
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| client | google.genai.Client | Yes | — | Configured Gemini client. |
| model | str | No | "gemini-embedding-001" | Gemini embedding model name. |
| dimensionality | int | No | 768 | Output vector size. |
GenAIEmbedding uses different task types under the hood: RETRIEVAL_DOCUMENT for embed_documents and QUESTION_ANSWERING for embed_query. Embedding documents and queries with task-specific settings improves retrieval quality compared with a single generic embedding.
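The task-type split can be sketched against the google-genai SDK as shown below. This is an approximation of the idea, not GenAIEmbedding's source: the mapping `TASK_TYPE_FOR` and the function `embed_with_task_type` are hypothetical names, and the `google.genai` import is deferred so the snippet loads even when the `genai` extra is not installed.

```python
# Which Gemini task type to request for each kind of input.
TASK_TYPE_FOR = {
    "documents": "RETRIEVAL_DOCUMENT",  # used when indexing chunks
    "query": "QUESTION_ANSWERING",      # used when embedding a question
}


def embed_with_task_type(client, texts, kind,
                         model="gemini-embedding-001", dimensionality=768):
    """Embed texts with the task type matching their role (illustrative)."""
    # Imported lazily; requires the google-genai package.
    from google.genai import types

    config = types.EmbedContentConfig(
        task_type=TASK_TYPE_FOR[kind],
        output_dimensionality=dimensionality,
    )
    response = client.models.embed_content(
        model=model, contents=texts, config=config
    )
    return [embedding.values for embedding in response.embeddings]


print(TASK_TYPE_FOR["documents"], TASK_TYPE_FOR["query"])
```

Calling `embed_with_task_type(client, ["What is the deductible?"], "query")` needs a configured `genai.Client` and network access, so it is left out of the snippet.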
GenAIGeneration — GenerationModel backed by Google Gemini.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| client | google.genai.Client | Yes | — | Configured Gemini client. |
| model | str | No | "gemini-2.5-flash" | Gemini chat model name. |
| thinking_budget | int \| None | No | 0 | Token budget for the model's internal thinking step. Default 0 disables thinking on gemini-2.5-flash for faster responses. Pass None for non-thinking models (e.g. gemini-1.5-flash). Pass a positive integer (e.g. 8192) to enable extended thinking. |
thinking_budget compatibility: The default thinking_budget=0 works with gemini-2.5-flash. If you switch to a non-thinking model like gemini-1.5-flash, pass thinking_budget=None — that model raises an error when it receives a thinking_config.
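If the model name is configurable in your application, one way to avoid the mismatch is to keep the budget next to the model name. The lookup below is a sketch that covers only the two models named on this page; the table `THINKING_BUDGET_BY_MODEL` and the helper `thinking_budget_for` are hypothetical and not part of the SDK.

```python
# Budgets for the two models discussed above. Other models are
# deliberately left out: check their documentation before adding them.
THINKING_BUDGET_BY_MODEL = {
    "gemini-2.5-flash": 0,     # thinking model; 0 disables the thinking step
    "gemini-1.5-flash": None,  # non-thinking model; send no thinking config
}


def thinking_budget_for(model):
    """Return the thinking_budget to pass for a known model name."""
    try:
        return THINKING_BUDGET_BY_MODEL[model]
    except KeyError:
        raise ValueError(
            f"No thinking_budget recorded for {model!r}; "
            "check the model's documentation"
        ) from None


# Intended use (needs the genai extra and a configured client):
# GenAIGeneration(client=client, model=name,
#                 thinking_budget=thinking_budget_for(name))
print(thinking_budget_for("gemini-2.5-flash"))
print(thinking_budget_for("gemini-1.5-flash"))
```

Failing loudly on unknown names is a design choice here: a silent default of 0 would reproduce the error described above for any non-thinking model.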
OpenAIEmbedding / OpenAIGeneration
from guava.helpers.openai import OpenAIEmbedding, OpenAIGeneration
Install: pip install 'guava-sdk[openai]'
The caller supplies a configured openai.OpenAI instance. The same wrappers work against Azure OpenAI or any OpenAI-compatible base URL — the client object decides where requests go.
OpenAIEmbedding — EmbeddingModel backed by the OpenAI Embeddings API.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| client | openai.OpenAI | Yes | — | Configured OpenAI client. |
| model | str | No | "text-embedding-3-small" | OpenAI embedding model name. |
| dimensionality | int | No | 1536 | Output vector size. |
OpenAIGeneration — GenerationModel backed by OpenAI chat.completions.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| client | openai.OpenAI | Yes | — | Configured OpenAI client. |
| model | str | No | "gpt-5-mini" | OpenAI chat model name. |
The system_instruction argument to generate() is mapped to a {"role": "system", ...} message prepended to the user prompt.
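The mapping can be pictured as the message-building step below. It is a sketch of the described behavior rather than OpenAIGeneration's source; the function name `build_messages` is an assumption, while the `role`/`content` message shape is the standard chat.completions format.

```python
def build_messages(prompt, system_instruction=None):
    """Build a chat.completions message list for one prompt (illustrative)."""
    messages = []
    if system_instruction:
        # The system message goes first so it frames the user prompt.
        messages.append({"role": "system", "content": system_instruction})
    messages.append({"role": "user", "content": prompt})
    return messages


with_system = build_messages(
    "What is the deductible?",
    system_instruction="Answer only from the provided context.",
)
without_system = build_messages("What is the deductible?")
print(with_system)
print(without_system)
```

The resulting list is what would be passed as `messages=` to `client.chat.completions.create(...)`.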
GenerationModel
Any implementation of the guava.helpers.rag.GenerationModel interface works with DocumentQA in local mode. The examples on this page use GenAIGeneration, but OpenAIGeneration (above) or any custom GenerationModel subclass works equally well.
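A custom model can be as small as the stub below, which is handy for tests that should not call a real LLM. The interface is not spelled out on this page, so treat everything here as an assumption: the class `CannedGeneration` is hypothetical, the `generate(prompt, system_instruction=None)` signature is inferred from the `system_instruction` argument mentioned above, and returning a plain string is a guess. In real code the class would subclass `guava.helpers.rag.GenerationModel`; the stub stands alone so the snippet runs without the SDK.

```python
class CannedGeneration:
    """Test double that returns a fixed reply and records its inputs.

    Assumed interface: generate(prompt, system_instruction=None) -> str.
    Subclass guava.helpers.rag.GenerationModel when using it with the SDK.
    """

    def __init__(self, reply="I don't know."):
        self.reply = reply
        self.calls = []  # (prompt, system_instruction) pairs, in call order

    def generate(self, prompt, system_instruction=None):
        self.calls.append((prompt, system_instruction))
        return self.reply


model = CannedGeneration(reply="The deductible is $500.")
answer = model.generate(
    "What is the deductible?", system_instruction="Be concise."
)
print(answer)
```

Recording the calls makes it possible to assert on the exact prompt a pipeline built, without asserting on model output.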
Examples
from guava.helpers.rag import DocumentQA
from guava.helpers.genai import GenAIEmbedding, GenAIGeneration
from google import genai
client = genai.Client(vertexai=True, project="my-project", location="us-central1")
embedding = GenAIEmbedding(client=client) # gemini-embedding-001, 768-dim
generation = GenAIGeneration(client=client) # gemini-2.5-flash
# ChromaDB — no external embedding API required; persists to disk by default
from guava.helpers.chromadb import ChromaVectorStore
store = ChromaVectorStore() # persistent; path="./chroma_data" by default
store = ChromaVectorStore(path=None) # alternative: in-memory/ephemeral store (replaces the line above)
qa = DocumentQA(store=store, generation_model=generation, documents=[doc1, doc2])
answer = qa.ask("What is the deductible?")
# LanceDB — local path or GCS URI; requires an embedding model
from guava.helpers.lancedb import LanceDBStore
store = LanceDBStore("./lancedb_data", embedding_model=embedding) # local path
store = LanceDBStore("gs://my-bucket/lancedb", embedding_model=embedding) # alternative: GCS URI (replaces the line above)
qa = DocumentQA(store=store, generation_model=generation, documents=[doc1, doc2])
answer = qa.ask("What is the deductible?")
# pgvector — Postgres connection string; table and indexes created automatically
from guava.helpers.pgvector import PgVectorStore
store = PgVectorStore(
db_url="postgresql://user:password@localhost:5432/mydb",
embedding_model=embedding,
)
qa = DocumentQA(store=store, generation_model=generation, documents=[doc1, doc2])
answer = qa.ask("What is the deductible?")
# Pinecone — set PINECONE_API_KEY; index and embeddings are fully managed
from guava.helpers.pinecone import PineconeVectorStore
store = PineconeVectorStore() # index_name="guava-chunks" by default
qa = DocumentQA(store=store, generation_model=generation, documents=[doc1, doc2])
answer = qa.ask("What is the deductible?")