Quickstart
Get the CLI installed, ingest a dataset, and run your first RAG-powered chat in minutes.
1. Prerequisites
Docker is used to run the API and RAG worker automatically when you invoke `lf start`.
2. Install the CLI
```bash
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.sh | bash
```
- Windows users: download the latest `lf.exe` from the releases page and add it to your PATH.
Confirm everything is wired up:
```bash
lf --help
```
3. Tune Your Runtime (Ollama)
For best RAG results with longer documents, increase the Ollama context window to match production expectations (e.g., 100K tokens):
- Open the Ollama app.
- Navigate to Settings → Advanced.
- Adjust the context window to your desired size.
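If you drive Ollama from the terminal instead of the app, you can get the same effect per model with a Modelfile. The sketch below rests on assumptions: the base model (llama3.1), the derived model name, and the 100K-token size are placeholders for your own setup.

```bash
# Sketch: derive a model with a larger context window via a Modelfile.
# Base model and num_ctx value are assumptions — substitute your own.
cat > Modelfile <<'EOF'
FROM llama3.1
PARAMETER num_ctx 102400
EOF
ollama create llama3.1-100k -f Modelfile
```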
4. Create a Project
```bash
lf init my-project
```
This reaches the server (auto-started if needed) and writes `llamafarm.yaml` with default runtime, prompts, and RAG configuration.
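As a rough orientation, the generated file covers sections like the ones sketched below. This is a hypothetical outline only — field names and defaults belong to your lf version, so treat the real `llamafarm.yaml` as the source of truth.

```yaml
# Hypothetical sketch, not the authoritative schema — inspect the
# generated llamafarm.yaml for the real field names and defaults.
runtime:
  provider: ollama          # assumed local runtime
  model: llama3.1           # assumed default model
prompts:
  system: "You are a helpful assistant."   # illustrative only
rag:
  strategies:
    - pdf_ingest            # name used later with `lf datasets create -s`
  databases:
    - main_db               # name used later with `--database` / `-b`
```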
5. Start the Local Stack
```bash
lf start
```
- Spins up the FastAPI server and RAG worker via Docker.
- Starts a config watcher and opens the interactive dev chat TUI.
- Shows health diagnostics for Ollama, Celery, and the rag-service.
Hit `Ctrl+C` to exit the chat UI when you’re done.
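If you want to confirm the Docker side independently, listing the running containers is enough; the name filter below is an assumption, since the actual container names depend on the compose setup.

```bash
# List the auto-started containers — the name filter is illustrative
docker ps --filter "name=llamafarm"
```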
Running Services Manually (no Docker auto-start)
If you want to control each service yourself (useful when hacking on code), launch them with Nx from the repository root:
```bash
git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm
npm install -g nx
nx init --useDotNxInstallation --interactive=false

# Option A: start both services together
nx dev

# Option B: start in separate terminals
nx start rag     # Terminal 1
nx start server  # Terminal 2
```
Open another terminal to run `lf` commands against the locally running stack.
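A quick one-off chat (introduced in the next step) makes a handy smoke test that the manually started services are wired up:

```bash
# Smoke test against the manually started stack
lf chat "What can you do?"
```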
6. Chat with Your Project
```bash
# Interactive chat (opens TUI using project from llamafarm.yaml)
lf chat

# One-off message
lf chat "What can you do?"
```
Options you'll likely use:
- `--no-rag` – bypass retrieval and hit the runtime directly.
- `--database`, `--retrieval-strategy` – override RAG behaviour.
- `--curl` – print the sanitized `curl` command instead of executing it.
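For instance, combining the chat command with the flags above (the prompts are placeholders):

```bash
# Skip retrieval and talk to the runtime directly
lf chat --no-rag "Summarize yourself in one sentence."

# Print the sanitized curl command instead of sending the request
lf chat --curl "What can you do?"
```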
7. Create and Populate a Dataset
```bash
# Create dataset with configured strategy/database
lf datasets create -s pdf_ingest -b main_db research-notes

# Upload documents (supports globs/directories)
lf datasets upload research-notes ./examples/fda_rag/files/*.pdf
```
The CLI validates strategy and database names against your `rag` configuration and reports upload successes and failures.
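Because directories are accepted as well as globs, pointing the upload at a folder also works:

```bash
# Upload an entire directory instead of a glob
lf datasets upload research-notes ./examples/fda_rag/files/
```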
8. Process Documents
```bash
lf datasets process research-notes
```
- Sends an ingestion job to Celery.
- Shows heartbeat dots (TTY only) so long-running jobs feel alive.
- For large PDFs, the worker may need extra time—rerun the command if you see a timeout message.
9. Query with RAG
```bash
lf rag query --database main_db "Which FDA letters mention clinical trial data?"
```
Useful flags: `--top-k 10`, `--filter "file_type:pdf"`, `--include-metadata`, `--include-score`.
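The flags compose, so a wider search that also returns metadata and scores looks like this:

```bash
# Top 10 PDF chunks, with metadata and similarity scores in the output
lf rag query --database main_db --top-k 10 --filter "file_type:pdf" \
  --include-metadata --include-score \
  "Which FDA letters mention clinical trial data?"
```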
10. Reset Sessions (Optional)
For stateless testing, clear dev history by removing `.llamafarm/projects/<namespace>/<project>/dev/context`, or start a new namespace/project.
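In shell terms (substitute your own namespace and project names):

```bash
# Remove the dev chat context for a fresh session
rm -rf .llamafarm/projects/<namespace>/<project>/dev/context
```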
11. Next Steps
- Configuration Guide — deep dive into `llamafarm.yaml`.
- RAG Guide — strategies, parsers, and retrieval.
- Extending LlamaFarm — add new providers, stores, or parsers.
- Examples — run the FDA and Raleigh demos end-to-end.
Need help? Chat with us on Discord or open a discussion.