BaseVectorDB¶
Module: neurosurfer.vectorstores.base
Type: Abstract Base Class
Overview¶
The BaseVectorDB is the abstract base class for all vector database implementations in Neurosurfer.
Required Abstract Methods:
add_documents(docs: List[Doc])- Add documents with embeddingssimilarity_search(query_embedding, top_k, metadata_filter, similarity_threshold)- Search for similar documentscount()- Get document countlist_all_documents(metadata_filter)- Retrieve all documentsdelete_documents(ids)- Delete specific documentsdelete_collection()- Delete the collectionclear_collection()- Clear all documents
Doc Dataclass¶
Documents are represented using the Doc dataclass:
@dataclass
class Doc:
id: str
text: str
embedding: Optional[List[float]] = None
metadata: Dict[str, Any] = field(default_factory=dict)
Creating Custom Vector Stores¶
from neurosurfer.vectorstores.base import BaseVectorDB, Doc
from typing import List, Dict, Optional, Tuple, Any
class CustomVectorStore(BaseVectorDB):
def __init__(self, collection_name: str):
self.collection_name = collection_name
# Your initialization
def add_documents(self, docs: List[Doc]):
"""Add documents with embeddings to the store."""
# Your implementation
pass
def similarity_search(
self,
query_embedding: List[float],
top_k: int = 5,
metadata_filter: Optional[Dict[str, Any]] = None,
similarity_threshold: Optional[float] = None
) -> List[Tuple[Doc, float]]:
"""Search for similar documents.
Returns:
List of (Doc, similarity_score) tuples, sorted by score descending.
"""
# Your implementation
pass
def count(self) -> int:
"""Return total number of documents."""
# Your implementation
pass
def list_all_documents(
self,
metadata_filter: Optional[Dict[str, Any]] = None
) -> List[Doc]:
"""Retrieve all documents, optionally filtered by metadata."""
# Your implementation
pass
def delete_documents(self, ids: List[str]):
"""Delete specific documents by ID."""
# Your implementation
pass
def clear_collection(self):
"""Clear all documents but keep collection."""
# Your implementation
pass
def delete_collection(self):
"""Delete the entire collection."""
# Your implementation
pass