Quickstart
Get the CLI installed, ingest a dataset, and run your first RAG-powered chat in minutes.
1. Prerequisites
Docker is used to run the API and RAG worker automatically when you invoke lf start.
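You can confirm Docker is available before continuing:
# Should print daemon details; if it errors, start Docker Desktop (or your Docker daemon) first
docker info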
2. Install the CLI
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.sh | bash
- Windows users: download the latest lf.exe from the releases page and add it to your PATH.
Confirm everything is wired up:
lf --help
3. Tune Your Runtime (Ollama)
For best RAG results with longer documents, increase the Ollama context window to match production expectations (e.g., 100K tokens):
- Open the Ollama app.
- Navigate to Settings → Advanced.
- Adjust the context window to your desired size.
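If you run Ollama headless (no desktop app), a common alternative is to bake a larger context window into a model variant via a Modelfile. The model name and context size below are only examples; match num_ctx to the size you chose above and use a model you have already pulled.
# Create a model variant with a larger context window (example names/sizes)
cat > Modelfile <<'EOF'
FROM llama3.1
PARAMETER num_ctx 32768
EOF
ollama create llama3.1-32k -f Modelfile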
4. Create a Project
lf init my-project
This reaches the server (auto-started if needed) and writes llamafarm.yaml with default runtime, prompts, and RAG configuration.
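To see what those defaults look like before changing anything, open the generated file; cat is shown here for brevity, and you may need to cd into the directory lf init wrote it to.
# Inspect the generated project configuration
cat llamafarm.yaml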
5. Start the Local Stack
lf start
- Spins up the FastAPI server and RAG worker via Docker.
- Starts a config watcher and opens the interactive dev chat TUI.
- Shows health diagnostics for Ollama, Celery, and the rag-service.
- Launches the Designer web UI at http://localhost:8000 for visual project management.
Hit Ctrl+C to exit the chat UI when you're done.
Prefer a visual interface? Open http://localhost:8000 in your browser to access the Designer, where you can manage projects, upload datasets, configure models, and test prompts—all without touching the command line.
See the Designer documentation for details.
Running Services Manually (no Docker auto-start)
If you want to control each service yourself (useful when hacking on code), launch them with Nx from the repository root:
git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm
npm install -g nx
nx init --useDotNxInstallation --interactive=false
# Option A: start both services together
nx dev
# Option B: start in separate terminals
nx start rag # Terminal 1
nx start server # Terminal 2
Open another terminal to run lf commands against the locally running stack.
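For example, with both services up you can exercise the stack using the chat command from step 6:
# Run from a separate terminal while Nx keeps the services in the foreground
lf chat "What can you do?"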
6. Chat with Your Project
# Interactive chat (opens TUI using project from llamafarm.yaml)
lf chat
# One-off message
lf chat "What can you do?"
Options you'll likely use:
- --no-rag – bypass retrieval and hit the runtime directly.
- --database, --retrieval-strategy – override RAG behaviour.
- --curl – print the sanitized curl command instead of executing it.
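A couple of combinations you might reach for (flag placement here is illustrative; lf chat --help shows the authoritative syntax):
# Skip retrieval and talk to the runtime directly
lf chat --no-rag "Summarize what this project is configured to do."
# Print the sanitized curl command instead of sending the request
lf chat --curl "What can you do?"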
7. Create and Populate a Dataset
# Create dataset with configured strategy/database
lf datasets create -s pdf_ingest -b main_db research-notes
# Upload documents (supports globs/directories)
lf datasets upload research-notes ./examples/fda_rag/files/*.pdf
The CLI validates strategy and database names against your rag configuration and reports upload successes/failures.
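Because upload accepts directories as well as globs, you can also point it at a folder and let the CLI pick up everything inside:
# Upload every file in a directory instead of matching a glob
lf datasets upload research-notes ./examples/fda_rag/files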
8. Process Documents
lf datasets process research-notes
- Sends an ingestion job to Celery.
- Shows heartbeat dots (TTY only) so long-running jobs feel alive.
- For large PDFs, the worker may need extra time—rerun the command if you see a timeout message.
9. Query with RAG
lf rag query --database main_db "Which FDA letters mention clinical trial data?"
Useful flags: --top-k 10, --filter "file_type:pdf", --include-metadata, --include-score.
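For instance, combining those flags to narrow results to PDFs and show provenance for each hit:
# Return the top 10 PDF matches with metadata and relevance scores
lf rag query --database main_db --top-k 10 --filter "file_type:pdf" --include-metadata --include-score "Which FDA letters mention clinical trial data?"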
10. Reset Sessions (Optional)
For stateless testing, clear dev history by removing .llamafarm/projects/<namespace>/<project>/dev/context, or start a new namespace/project.
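On macOS/Linux that amounts to the following (substitute your own namespace and project before running):
# Remove the dev chat context for one project
rm -rf .llamafarm/projects/<namespace>/<project>/dev/context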
11. Next Steps
- Designer Web UI — visual interface for managing projects.
- Configuration Guide — deep dive into llamafarm.yaml.
- RAG Guide — strategies, parsers, and retrieval.
- Extending LlamaFarm — add new providers, stores, or parsers.
- Examples — run the FDA and Raleigh demos end-to-end.
Need help? Chat with us on Discord or open a discussion.