Skip to content

Vector Stores

Neurosurfer’s VectorDB layer provides a unified interface for storing and retrieving embeddings across different backends (e.g., Chroma, in-memory). Implementations share the same contract so you can swap backends without changing app code.

  • Base Concepts


    Core contracts and data structures. Start here to understand the abstraction used by all backends.

    BaseVectorDB Doc

  • Chroma


    Persistent, production-ready store using chromadb with PersistentClient, metadata filtering, and similarity search.

    Documentation

  • In-Memory


    Lightweight baseline store ideal for tests, demos, and small prototypes. Implements cosine similarity over Python lists.

    Documentation


⚡ Quick Example

from neurosurfer.vectorstores import ChromaVectorStore
from neurosurfer.vectorstores.base import Doc
from neurosurfer.models.embedders.sentence_transformer import SentenceTransformerEmbedder

# Create vector store and embedder
vectorstore = ChromaVectorStore(
    collection_name="my_docs",
    persist_directory="./chroma_db"
)
embedder = SentenceTransformerEmbedder(model_name="intfloat/e5-large-v2")

# Create and add documents
docs = [
    Doc(id="1", text="Document 1", 
        embedding=embedder.embed("Document 1"),
        metadata={"source": "doc1.txt"}),
    Doc(id="2", text="Document 2", 
        embedding=embedder.embed("Document 2"),
        metadata={"source": "doc2.txt"})
]
vectorstore.add_documents(docs)

# Search (requires embedding the query)
query_embedding = embedder.embed("query")
results = vectorstore.similarity_search(query_embedding, top_k=5)

for doc, score in results:
    print(f"[{score:.3f}] {doc.text}")

Tip: When using the ingestion pipeline, see RAG Ingestor for batching, deduplication, and automatic ID strategy. The vector store API is intentionally minimal to keep backends interchangeable.