Core Concepts
Understand the moving pieces—projects, sessions, runtimes, and the service architecture—before you customize or extend LlamaFarm.
Architecture Overview
```text
┌────────────┐         ┌───────────────┐       ┌──────────────┐
│   lf CLI   │────────▶│   LlamaFarm   │──────▶│ Runtime Host │
│            │  HTTP   │  Server (API) │       │ (Ollama/vLLM,│
└─────┬──────┘         │               │       │  OpenAI,...) │
      │                │  ┌─────────┐  │       └──────────────┘
      │  WebSocket     │  │ Celery  │◀─┼─┐
      │  (streaming)   │  │ Workers │  │ │ ingest jobs
      │                │  └─────────┘  │ │
      │                │       ▲       │ │
      ▼                │       │       │ │
┌────────────┐         │  ┌─────────┐  │ │   ┌──────────────┐
│   Config   │◀────────┘  │   RAG   │◀─┼─┼───│ Vector Store │
│  Watcher   │  updates   │ Worker  │  │ │   │ (Chroma,...) │
└────────────┘            └─────────┘  │ │   └──────────────┘
                                         │
                                         ▼
                                  ┌────────────┐
                                  │  Dataset   │
                                  │  Storage   │
                                  └────────────┘
```
- CLI (`lf`) orchestrates everything: talking to the API, streaming responses, uploading datasets, and watching config changes.
- Server exposes REST endpoints under `/v1/projects/{namespace}/{project}/...` for chat completions, datasets, and RAG queries (illustrated after this list).
- Celery workers handle ingestion tasks asynchronously; the CLI polls and surfaces progress.
- Runtime hosts can be local (Ollama) or remote OpenAI-compatible endpoints (vLLM, Together). Configuration controls provider, base URL, API key, and instructor mode.
- RAG worker processes documents via configured pipelines and writes to vector databases (default Chroma, configurable).
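As a concrete sketch, a chat completion call against the server could look like the following. The path prefix comes from the endpoint pattern above; the host, port, and OpenAI-style JSON body are assumptions, so treat this as illustrative rather than the authoritative contract.

```bash
# Illustrative only: the /v1/projects/{namespace}/{project}/... prefix is documented
# above, but the host/port and the OpenAI-style body shape are assumptions.
curl -X POST "http://localhost:8000/v1/projects/default/my-project/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Summarize the onboarding docs."}]}'
```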
Projects & Namespaces
- A project is a configuration bundle stored in `llamafarm.yaml` plus server-side metadata.
- Projects live within a namespace (defaults to `default`). Namespaces isolate resources, dataset names, and sessions.
- `lf init` creates a project using the server's template; you can list existing projects with `lf projects list --namespace my-team` (see the example below).
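For example:

```bash
# Scaffold a project from the server's template (lands in the default namespace)
lf init

# List projects in another namespace
lf projects list --namespace my-team
```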
Sessions
- `lf chat` creates or resumes a session when you pass a `--session-id` or use the `LLAMAFARM_SESSION_ID` environment variable.
- `lf start` opens a stateful dev session whose history persists under `.llamafarm/projects/<namespace>/<project>/dev/context`.
- `lf chat --no-rag` is stateless by default unless you provide a session identifier.
- API consumers pass `session_id` directly to `/chat/completions` to control continuity (see the sketch below).
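A quick sketch of those options; the session name is a placeholder, and placing `session_id` in the JSON body is an assumption about the payload shape:

```bash
# Resume a named session explicitly
lf chat --session-id support-123

# Or pin the session through the environment
LLAMAFARM_SESSION_ID=support-123 lf chat

# API consumers include session_id alongside the messages (assumed body shape)
curl -X POST "http://localhost:8000/v1/projects/default/my-project/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "support-123", "messages": [{"role": "user", "content": "Where were we?"}]}'
```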
Configuration-Driven Behaviour
- `llamafarm.yaml` defines runtime, prompts, and RAG strategies (see the Configuration Guide and the sketch below).
- Changes to the file trigger the config watcher; the CLI reloads live during dev sessions.
- Missing runtime fields (`provider`/`base_url`/`api_key`) are treated as errors; there are no hidden defaults.
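A minimal sketch of the shape; beyond the documented `provider`, `base_url`, and `api_key` fields, the key names here are illustrative, not the authoritative schema:

```yaml
# Illustrative llamafarm.yaml sketch -- see the Configuration Guide for the real schema.
runtime:
  provider: ollama                  # required; there is no hidden default
  base_url: http://localhost:11434  # required; where the runtime host listens
  api_key: your-api-key-here        # required; placeholder value
  instructor_mode: json             # illustrative name for the instructor-mode setting
prompts:
  system: You are a helpful assistant for this project.
rag:
  # databases and data processing strategies -- see the next section
```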
RAG Strategies
- RAG configuration is composed of databases and data processing strategies.
- Each dataset references a strategy and database; CLI enforces this relationship when creating datasets.
- Strategies describe parsers, extractors, metadata processors, and embedding choices.
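A hedged sketch of how those pieces might compose; every field name below is illustrative, and `rag/schema.yaml` (mentioned in the next section) is the place to confirm the real structure:

```yaml
# Illustrative shape only -- field names are assumptions, not the real schema.
rag:
  databases:
    - name: main
      type: chroma              # default store; swappable
  strategies:
    - name: pdf-ingest
      parsers: [pdf]
      extractors: [text]
      metadata_processors: [basic]
      embedding:
        model: nomic-embed-text # illustrative embedding choice
datasets:
  - name: product-docs
    strategy: pdf-ingest        # each dataset references a strategy...
    database: main              # ...and a database; the CLI enforces both
```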
Extensibility Mindset
Everything in LlamaFarm is intended to be swapped or extended:
- Point `runtime.base_url` to a vLLM or custom OpenAI-compatible gateway (see the sketch after this list).
- Register a new vector store backend, update `rag/schema.yaml`, and regenerate types.
- Add parsers/extractors to support new file formats.
- Create new CLI subcommands under `cli/cmd` to automate workflows.
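For example, retargeting the runtime at a local vLLM gateway might look like this; `runtime.base_url` is the documented key, while the provider value and port are illustrative (vLLM serves an OpenAI-compatible API on port 8000 by default):

```yaml
runtime:
  provider: openai                    # illustrative: any OpenAI-compatible provider
  base_url: http://localhost:8000/v1  # vLLM's OpenAI-compatible endpoint
  api_key: your-gateway-key           # placeholder
```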
See Extending LlamaFarm for detailed instructions.
Component Health
When commands run, you might see a summary like:
```text
⚠️ Server is degraded
Summary: server=healthy, storage=healthy, ollama=healthy, celery=degraded, rag-service=healthy, project=healthy
⚠️ celery degraded No workers replied to ping (latency: 533ms)
```
- Degraded does not always mean failure; ingestion may continue in the background.
- `lf rag health` reports the live status of the embedder, store, and processing pipeline.
- Address warnings before production deployment (ensure Celery workers are running, Ollama/vLLM is accessible, etc.).
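To check the pipeline directly:

```bash
# Live status of the embedder, vector store, and processing pipeline
lf rag health
```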
Next Steps
- Quickstart – run through the onboarding flow if you haven’t already.
- CLI Reference – learn each command in detail.
- RAG Guide – configure databases, strategies, and retrieval.