Getting Started¶
Welcome to Neurosurfer — a production‑grade AI framework with multi‑LLM, RAG, agents, tools, and a FastAPI server. This page walks you through installation, the CLI, and basic Python usage. It’s written in a hybrid style: short explanatory paragraphs with bullets where they help.
🚀 Quick Navigation¶
- Installation: install via pip or from source, and learn about CPU/GPU (CUDA/MPS) options.
- CLI Usage: serve the backend and (optionally) the NeurowebUI with one command.
- Basic Usage: import models & agents, call an LLM, and plug in RAG.
- Configuration: configure API keys, models, server options, and environment variables.
📋 Prerequisites¶
Neurosurfer installs as a lightweight core so you can get started quickly. A typical setup uses Python 3.9+ and a virtual environment. If you plan to run the NeurowebUI locally, you’ll also need Node.js and npm. Hardware acceleration is optional but recommended when working with larger models.
- Python >= 3.9 and pip (or uv/pipx/poetry)
- Node.js + npm for the NeurowebUI dev server (optional)
- GPU (optional): NVIDIA CUDA on Linux x86_64, or Apple Silicon (MPS) on macOS arm64. CPU‑only works fine for smaller models or demos.
Keep installs lightweight
Neurosurfer’s core installs without heavy model deps. Add the full LLM stack only when you need it via the torch extra.
🧰 Installation¶
Installation is flexible: minimal core for APIs, or the full LLM stack for local inference/finetuning. If you’re unsure, start minimal; you can always add the extra later.
Minimal core¶
This keeps dependencies light — ideal for API servers, docs, or CI.
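Assuming the package is published on PyPI under the same name as the repository (`neurosurfer`), a minimal install is a single command:

```shell
# Minimal core: no heavy model dependencies
pip install neurosurfer
```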
Full LLM stack (recommended for model work)¶
This extra adds torch, transformers, sentence-transformers, accelerate, bitsandbytes, and unsloth.
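Assuming the same PyPI package name, the full stack is pulled in via the `torch` extra (the quotes keep shells like zsh from interpreting the brackets):

```shell
# Full LLM stack: torch, transformers, sentence-transformers, accelerate, bitsandbytes, unsloth
pip install 'neurosurfer[torch]'
```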
OS Platform
Neurosurfer has only been tested on Ubuntu Linux x86_64. It is expected to work on Windows and macOS, but you may need to verify CUDA-related dependencies.
GPU / CUDA notes (PyTorch)¶
If you want GPU acceleration, the pip wheel already bundles CUDA; you do not need to install the system CUDA toolkit. You do need a compatible NVIDIA driver. Use these commands to verify and install appropriately:
- Check your GPU and driver: confirm the driver is active and note the CUDA version the driver supports.
- Install a CUDA‑enabled torch wheel (Linux x86_64).
- Install bitsandbytes (optional, Linux x86_64). On macOS/Windows, bitsandbytes wheels are limited; prefer CPU or other quantization strategies.
- Apple Silicon (MPS): PyTorch includes MPS support on macOS arm64. No extra toolkit required.
- CPU‑only (portable).
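The steps above map to the following commands. The index URLs come from PyTorch's official install matrix; `cu124` is one example tag, so pick the one your driver supports:

```shell
# Check your GPU and driver; note the "CUDA Version" the driver supports
nvidia-smi

# CUDA-enabled torch wheel (Linux x86_64)
pip install torch --index-url https://download.pytorch.org/whl/cu124

# Optional: bitsandbytes for 4/8-bit quantization (Linux x86_64)
pip install bitsandbytes

# CPU-only wheel (portable; also the default behavior on macOS arm64, which includes MPS)
pip install torch --index-url https://download.pytorch.org/whl/cpu
```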
Driver/toolkit alignment
If `torch.cuda.is_available()` is False despite a GPU, your driver may be outdated. Update the NVIDIA driver to a version compatible with the wheel you installed (e.g., cu124) and restart. You generally do not need the full CUDA toolkit for pip wheels.
Quick GPU sanity check
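A minimal sanity check, assuming `torch` is installed:

```python
import torch

print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # False -> driver/wheel mismatch or CPU wheel
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
print("MPS available:", torch.backends.mps.is_available())  # Apple Silicon (macOS arm64)
```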
Build from source¶
```shell
git clone https://github.com/NaumanHSA/neurosurfer.git
cd neurosurfer
pip install -U pip wheel build
pip install -e .            # editable
# or
pip install -e '.[torch]'   # editable + full LLM stack
```
Quick import check¶
You’ll see a banner; if optional deps are missing, you’ll get a single consolidated warning with install hints.
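Assuming a standard install, a one-liner confirms the package imports cleanly:

```shell
python -c "import neurosurfer; print('neurosurfer OK')"
```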
Useful environment flags
- `NEUROSURF_SILENCE=1` — hide banner & warnings
- `NEUROSURF_EAGER_RUNTIME_ASSERT=1` — fail fast at import if LLM deps missing (opt‑in)
🖥️ CLI Usage¶
The CLI runs the backend API and (optionally) the NeurowebUI dev server. Start simple with neurosurfer serve, then refine with flags as you scale.
Help¶
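The standard `--help` flag (an assumption for any subcommands beyond `serve`) prints available options:

```shell
neurosurfer --help
neurosurfer serve --help
```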
Start backend + UI (dev)¶
The backend binds to NEUROSURF_BACKEND_HOST / NEUROSURF_BACKEND_PORT (from config). The UI root is auto‑detected; pass --ui-root if needed. On first run, an npm install may occur — you can control this with --npm-install.
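The simplest invocation relies on the configured defaults; the port values below are illustrative, not defaults confirmed by this page:

```shell
# Defaults come from NEUROSURF_BACKEND_HOST / NEUROSURF_BACKEND_PORT
neurosurfer serve

# Or pin things explicitly (port values here are illustrative)
neurosurfer serve --backend-port 8000 --ui-port 5173 --npm-install auto
```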
Common options¶
- Backend app (`--backend-app`): a module path like `pkg.module:app_or_factory()` or a Python file with a `NeurosurferApp` instance. The default is `neurosurfer.examples.quickstart_app:ns`.
- Backend: `--backend-host`, `--backend-port`, `--backend-log-level`, `--backend-reload`, `--backend-workers`, `--backend-worker-timeout`.
- UI: `--ui-root`, `--ui-host`, `--ui-port`, `--ui-strict-port`, `--ui-open`, `--npm-install {auto|always|never}`.
- Split: `--only-backend` or `--only-ui`.
Examples¶
- Backend only
- Serve your own file
- Serve a module
- Run UI only
- Public backend URL for the UI (when binding 0.0.0.0)
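Sketches of each example using only the flags documented above; `./my_app.py` and `./ui` are hypothetical paths:

```shell
# Backend only
neurosurfer serve --only-backend

# Serve your own file (my_app.py must expose a NeurosurferApp instance)
neurosurfer serve --backend-app ./my_app.py

# Serve a module (here, the documented default app)
neurosurfer serve --backend-app neurosurfer.examples.quickstart_app:ns

# Run UI only
neurosurfer serve --only-ui --ui-root ./ui

# Bind the backend publicly; the UI then needs the public backend URL
neurosurfer serve --backend-host 0.0.0.0 --backend-port 8000
```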
🤖 Basic Usage¶
A minimal model call shows the shape of the API; agents and RAG come next. If LLM dependencies are missing, you’ll get clear, actionable errors instructing you to install the extra.
Minimal model call¶
```python
from neurosurfer.models.chat_models.transformers import TransformersModel

llm = TransformersModel(
    model_name="unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
    enable_thinking=False,
    stop_words=[],
)

resp = llm.ask(user_prompt="Say hi!", system_prompt="You are helpful.", stream=False)
print(resp.choices[0].message.content)
```
Agents¶
```python
from neurosurfer.agents import ReActAgent
from neurosurfer.tools import Toolkit
from neurosurfer.models.chat_models.transformers import TransformersModel

llm = TransformersModel(model_name="meta-llama/Llama-3.2-3B-Instruct")
agent = ReActAgent(toolkit=Toolkit(), llm=llm)
print(agent.run("What is the capital of France?"))
```
RAG Basics¶
```python
from neurosurfer.models.embedders.sentence_transformer import SentenceTransformerEmbedder
from neurosurfer.rag.chunker import Chunker
from neurosurfer.rag.filereader import FileReader
from neurosurfer.server.services.rag_orchestrator import RAGOrchestrator

embedder = SentenceTransformerEmbedder(model_name="intfloat/e5-large-v2")
rag = RAGOrchestrator(
    embedder=embedder,
    chunker=Chunker(),
    file_reader=FileReader(),
    persist_dir="./rag_store",
    max_context_tokens=2000,
    top_k=15,
    min_top_sim_default=0.35,
    min_top_sim_when_explicit=0.15,
    min_sim_to_keep=0.20,
)

aug = rag.apply(
    actor_id=1,
    thread_id="demo",
    user_query="Summarize the attached report",
    files=[{"name": "report.pdf", "content": "...base64...", "type": "application/pdf"}],
)
print("Augmented query:", aug.augmented_query)
```
⚙️ Configuration¶
The full configuration guide covers API keys, model providers, server settings, logging, and environment variables. Read it here → Configuration.
➡️ Next Steps & Help¶
Explore the broader documentation and examples to deepen your setup:
- API Reference — classes, methods, schemas
- Examples — runnable code samples
- CLI — command line interface
Installation issues? Try the full stack extra:
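Assuming the PyPI name matches the repository, upgrading into the full extra often resolves missing-dependency errors:

```shell
pip install -U 'neurosurfer[torch]'
```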
GPU problems? Verify with nvidia-smi, confirm driver versions, and test torch.cuda.is_available(). For quantization on non‑Linux platforms, consider CPU or alternative strategies.
Ready? Jump to Installation or try the CLI.