Getting Started¶
Welcome to Neurosurfer — a production‑grade AI framework with multi‑LLM, RAG, agents, tools, and a FastAPI server. This page walks you through installation, the CLI, and basic Python usage. It’s written in a hybrid style: short explanatory paragraphs with bullets where they help.
🚀 Quick Navigation¶
- Installation: install via pip or from source, and learn about CPU/GPU (CUDA/MPS) options.
- CLI Usage: serve the backend and (optionally) the NeurowebUI with one command.
- Basic Usage: import models & agents, call an LLM, and plug in RAG.
- Configuration: configure API keys, models, server options, and environment variables.
📋 Prerequisites¶
Neurosurfer installs as a lightweight core so you can get started quickly. A typical setup uses Python 3.9+ and a virtual environment. If you plan to run the NeurowebUI locally, you’ll also need Node.js and npm. Hardware acceleration is optional but recommended when working with larger models.
- Python >= 3.9 and pip (or uv/pipx/poetry)
- Node.js + npm for the NeurowebUI dev server (optional)
- GPU (optional): NVIDIA CUDA on Linux x86_64, or Apple Silicon (MPS) on macOS arm64. CPU‑only works fine for smaller models or demos.
Keep installs lightweight
Neurosurfer’s core installs without heavy model deps. Add the full LLM stack only when you need it via the torch extra.
🧰 Installation¶
Installation is flexible: minimal core for APIs, or the full LLM stack for local inference/finetuning. If you’re unsure, start minimal; you can always add the extra later.
Minimal core¶
This keeps dependencies light — ideal for API servers, docs, or CI.
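Assuming the package is published on PyPI under the same name as the repository (`neurosurfer`), a minimal install is a single command:

```shell
# Minimal core: no heavy model dependencies
pip install neurosurfer
```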
Full LLM stack (recommended for model work)¶
This extra adds torch, transformers, sentence-transformers, accelerate, bitsandbytes, and unsloth.
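Assuming the same PyPI package name, the full stack is pulled in via the `torch` extra (the quotes keep shells like zsh from interpreting the brackets):

```shell
# Full LLM stack: torch, transformers, sentence-transformers, accelerate, bitsandbytes, unsloth
pip install 'neurosurfer[torch]'
```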
OS Platform
Neurosurfer has only been tested on Ubuntu Linux x86_64. It is expected to work on Windows and macOS, but you may need to verify CUDA-related dependencies.
GPU / CUDA notes (PyTorch)¶
If you want GPU acceleration, the pip wheel already bundles CUDA; you do not need to install the system CUDA toolkit. You do need a compatible NVIDIA driver. Use these commands to verify and install appropriately:
- Check your GPU and driver: confirm the driver is active and note the CUDA version the driver supports.
- Install a CUDA‑enabled torch wheel (Linux x86_64).
- Install bitsandbytes (optional, Linux x86_64). On macOS/Windows, bitsandbytes wheels are limited; prefer CPU or other quantization strategies.
- Apple Silicon (MPS): PyTorch includes MPS support on macOS arm64. No extra toolkit required.
- CPU‑only (portable).
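The steps above map to the following commands. The index URLs come from PyTorch's official install matrix; `cu124` is one example tag, so pick the one your driver supports:

```shell
# Check your GPU and driver; note the "CUDA Version" the driver supports
nvidia-smi

# CUDA-enabled torch wheel (Linux x86_64)
pip install torch --index-url https://download.pytorch.org/whl/cu124

# Optional: bitsandbytes for 4/8-bit quantization (Linux x86_64)
pip install bitsandbytes

# CPU-only wheel (portable; also the default behavior on macOS arm64, which includes MPS)
pip install torch --index-url https://download.pytorch.org/whl/cpu
```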
Driver/toolkit alignment
If `torch.cuda.is_available()` is False despite a GPU, your driver may be outdated. Update the NVIDIA driver to a version compatible with the wheel you installed (e.g., cu124) and restart. You generally do not need the full CUDA toolkit for pip wheels.
Quick GPU sanity check
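A minimal sanity check, assuming `torch` is installed:

```python
import torch

print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # False -> driver/wheel mismatch or CPU wheel
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
print("MPS available:", torch.backends.mps.is_available())  # Apple Silicon (macOS arm64)
```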
Build from source¶
```shell
git clone https://github.com/NaumanHSA/neurosurfer.git
cd neurosurfer
pip install -U pip wheel build
pip install -e .            # editable
# or
pip install -e '.[torch]'   # editable + full LLM stack
```
Quick import check¶
You’ll see a banner; if optional deps are missing, you’ll get a single consolidated warning with install hints.
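Assuming a standard install, a one-liner confirms the package imports cleanly:

```shell
python -c "import neurosurfer; print('neurosurfer OK')"
```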
Useful environment flags
- `NEUROSURF_SILENCE=1` — hide banner & warnings
- `NEUROSURF_EAGER_RUNTIME_ASSERT=1` — fail fast at import if LLM deps missing (opt‑in)
🖥️ CLI Usage¶
The CLI runs the backend API and (optionally) the NeurowebUI dev server. Start simple with neurosurfer serve, then refine with flags as you scale.
Help¶
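The standard `--help` flag (an assumption for any subcommands beyond `serve`) prints available options:

```shell
neurosurfer --help
neurosurfer serve --help
```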
Start backend + UI (dev)¶
The backend binds to NEUROSURF_BACKEND_HOST / NEUROSURF_BACKEND_PORT (from config). The UI root is auto‑detected; pass --ui-root if needed. On first run, an npm install may occur — you can control this with --npm-install.
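The simplest invocation relies on the configured defaults; the port values below are illustrative, not defaults confirmed by this page:

```shell
# Defaults come from NEUROSURF_BACKEND_HOST / NEUROSURF_BACKEND_PORT
neurosurfer serve

# Or pin things explicitly (port values here are illustrative)
neurosurfer serve --backend-port 8000 --ui-port 5173 --npm-install auto
```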
Common options¶
- Backend app (`--backend-app`): a module path like `pkg.module:app_or_factory()` or a Python file with a `NeurosurferApp` instance. The default is `neurosurfer.examples.quickstart_app:ns`.
- Backend: `--backend-host`, `--backend-port`, `--backend-log-level`, `--backend-reload`, `--backend-workers`, `--backend-worker-timeout`.
- UI: `--ui-root`, `--ui-host`, `--ui-port`, `--ui-strict-port`, `--ui-open`, `--npm-install {auto|always|never}`.
- Split: `--only-backend` or `--only-ui`.
Examples¶
- Backend only
- Serve your own file
- Serve a module
- Run UI only
- Public backend URL for the UI (when binding 0.0.0.0)
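Sketches of each example using only the flags documented above; `./my_app.py` and `./ui` are hypothetical paths:

```shell
# Backend only
neurosurfer serve --only-backend

# Serve your own file (my_app.py must expose a NeurosurferApp instance)
neurosurfer serve --backend-app ./my_app.py

# Serve a module (here, the documented default app)
neurosurfer serve --backend-app neurosurfer.examples.quickstart_app:ns

# Run UI only
neurosurfer serve --only-ui --ui-root ./ui

# Bind the backend publicly; the UI then needs the public backend URL
neurosurfer serve --backend-host 0.0.0.0 --backend-port 8000
```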
🤖 Basic Usage¶
A minimal model call shows the shape of the API; agents and RAG come next. If LLM dependencies are missing, you’ll get clear, actionable errors instructing you to install the extra.
Minimal model call¶
```python
from neurosurfer.models.chat_models.transformers import TransformersModel

llm = TransformersModel(
    model_name="unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
    enable_thinking=False,
    stop_words=[],
)

resp = llm.ask(user_prompt="Say hi!", system_prompt="You are helpful.", stream=False)
print(resp.choices[0].message.content)
```
Agents¶
```python
from neurosurfer.agents import ReActAgent
from neurosurfer.tools import Toolkit
from neurosurfer.models.chat_models.transformers import TransformersModel

llm = TransformersModel(model_name="meta-llama/Llama-3.2-3B-Instruct")
agent = ReActAgent(toolkit=Toolkit(), llm=llm)
print(agent.run("What is the capital of France?"))
```
RAG Basics¶
```python
from neurosurfer.models.embedders.sentence_transformer import SentenceTransformerEmbedder
from neurosurfer.rag.chunker import Chunker
from neurosurfer.rag.filereader import FileReader
from neurosurfer.server.services.rag_orchestrator import RAGOrchestrator

embedder = SentenceTransformerEmbedder(model_name="intfloat/e5-large-v2")
rag = RAGOrchestrator(
    embedder=embedder,
    chunker=Chunker(),
    file_reader=FileReader(),
    persist_dir="./rag_store",
    max_context_tokens=2000,
    top_k=15,
    min_top_sim_default=0.35,
    min_top_sim_when_explicit=0.15,
    min_sim_to_keep=0.20,
)

aug = rag.apply(
    actor_id=1,
    thread_id="demo",
    user_query="Summarize the attached report",
    files=[{"name": "report.pdf", "content": "...base64...", "type": "application/pdf"}],
)
print("Augmented query:", aug.augmented_query)
```
⚙️ Configuration¶
The full configuration guide covers API keys, model providers, server settings, logging, and environment variables. Read it here → Configuration.
➡️ Next Steps & Help¶
Explore the broader documentation and examples to deepen your setup:
- API Reference — classes, methods, schemas
- Examples — runnable code samples
- CLI — command line interface
Installation issues? Try the full stack extra:
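Assuming the PyPI name matches the repository, upgrading into the full extra often resolves missing-dependency errors:

```shell
pip install -U 'neurosurfer[torch]'
```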
GPU problems? Verify with nvidia-smi, confirm driver versions, and test torch.cuda.is_available(). For quantization on non‑Linux platforms, consider CPU or alternative strategies.
Ready? Jump to Installation or try the CLI.