Quickstart
Get the CLI installed, ingest a dataset, and run your first RAG-powered chat in minutes.
1. Prerequisites
Docker is used to run the API and RAG worker automatically when you invoke lf start.
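You can confirm Docker is available before continuing:
# Should print daemon details; if it errors, start Docker Desktop (or your Docker daemon) first
docker info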
2. Install the CLI
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.sh | bash
- Windows users: download the latest lf.exe from the releases page and add it to your PATH.
Confirm everything is wired up:
lf --help
3. Tune Your Runtime (Ollama)
For best RAG results with longer documents, increase the Ollama context window to match production expectations (e.g., 100K tokens):
- Open the Ollama app.
- Navigate to Settings → Advanced.
- Adjust the context window to your desired size.
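If you run Ollama headless (no desktop app), a common alternative is to bake a larger context window into a model variant via a Modelfile. The model name and context size below are only examples; match num_ctx to the size you chose above and use a model you have already pulled.
# Create a model variant with a larger context window (example names/sizes)
cat > Modelfile <<'EOF'
FROM llama3.1
PARAMETER num_ctx 32768
EOF
ollama create llama3.1-32k -f Modelfile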
4. Create a Project
lf init my-project
This reaches the server (auto-started if needed) and writes llamafarm.yaml with default runtime, prompts, and RAG configuration.
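To see what those defaults look like before changing anything, open the generated file; cat is shown here for brevity, and you may need to cd into the directory lf init wrote it to.
# Inspect the generated project configuration
cat llamafarm.yaml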
5. Start the Local Stack
lf start
- Spins up the FastAPI server and RAG worker via Docker.
- Starts a config watcher and opens the interactive dev chat TUI.
- Shows health diagnostics for Ollama, Celery, and the rag-service.
- Launches the Designer web UI at http://localhost:8000 for visual project management.
Hit Ctrl+C to exit the chat UI when you're done.
Prefer a visual interface? Open http://localhost:8000 in your browser to access the Designer, where you can manage projects, upload datasets, configure models, and test prompts—all without touching the command line.
See the Designer documentation for details.
Running Services Manually (no Docker auto-start)
If you want to control each service yourself (useful when hacking on code), launch them with Nx from the repository root:
git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm
npm install -g nx
nx init --useDotNxInstallation --interactive=false
# Option A: start both services together
nx dev
# Option B: start in separate terminals
nx start rag # Terminal 1
nx start server # Terminal 2
Open another terminal to run lf commands against the locally running stack.
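For example, with both services up you can exercise the stack using the chat command from step 6:
# Run from a separate terminal while Nx keeps the services in the foreground
lf chat "What can you do?"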
6. Chat with Your Project
# Interactive chat (opens TUI using project from llamafarm.yaml)
lf chat
# One-off message
lf chat "What can you do?"
Options you'll likely use:
- --no-rag – bypass retrieval and hit the runtime directly.
- --database, --retrieval-strategy – override RAG behaviour.
- --curl – print the sanitized curl command instead of executing it.
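A couple of combinations you might reach for (flag placement here is illustrative; lf chat --help shows the authoritative syntax):
# Skip retrieval and talk to the runtime directly
lf chat --no-rag "Summarize what this project is configured to do."
# Print the sanitized curl command instead of sending the request
lf chat --curl "What can you do?"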
7. Create and Populate a Dataset
# Create dataset with configured strategy/database
lf datasets create -s pdf_ingest -b main_db research-notes
# Upload documents (supports globs/directories)
lf datasets upload research-notes ./examples/fda_rag/files/*.pdf
The CLI validates strategy and database names against your rag configuration and reports upload successes/failures.
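Because upload accepts directories as well as globs, you can also point it at a folder and let the CLI pick up everything inside:
# Upload every file in a directory instead of matching a glob
lf datasets upload research-notes ./examples/fda_rag/files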
8. Process Documents
lf datasets process research-notes
- Sends an ingestion job to Celery.
- Shows heartbeat dots (TTY only) so long-running jobs feel alive.
- For large PDFs, the worker may need extra time—rerun the command if you see a timeout message.
9. Query with RAG
lf rag query --database main_db "Which FDA letters mention clinical trial data?"
Useful flags: --top-k 10, --filter "file_type:pdf", --include-metadata, --include-score.
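For instance, combining those flags to narrow results to PDFs and show provenance for each hit:
# Return the top 10 PDF matches with metadata and relevance scores
lf rag query --database main_db --top-k 10 --filter "file_type:pdf" --include-metadata --include-score "Which FDA letters mention clinical trial data?"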
10. Reset Sessions (Optional)
For stateless testing, clear dev history by removing .llamafarm/projects/<namespace>/<project>/dev/context, or start a new namespace/project.
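On macOS/Linux that amounts to the following (substitute your own namespace and project before running):
# Remove the dev chat context for one project
rm -rf .llamafarm/projects/<namespace>/<project>/dev/context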
11. Next Steps
- Designer Web UI — visual interface for managing projects.
- Configuration Guide — deep dive into llamafarm.yaml.
- RAG Guide — strategies, parsers, and retrieval.
- Extending LlamaFarm — add new providers, stores, or parsers.
- Examples — run the FDA and Raleigh demos end-to-end.
Need help? Chat with us on Discord or open a discussion.