Quickstart

Get the CLI installed, ingest a dataset, and run your first RAG-powered chat in minutes.

1. Prerequisites

  • Docker
  • Ollama (local runtime today; additional providers coming soon)

Docker is used to run the API and RAG worker automatically when you invoke lf start.
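
Both tools expose version commands, so a quick check confirms they are installed and on your PATH:

# Verify the prerequisites before continuing
docker --version
ollama --version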

2. Install the CLI

# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.sh | bash
  • Windows users: download the latest lf.exe from the releases page and add it to your PATH.

Confirm everything is wired up:

lf --help

3. Tune Your Runtime (Ollama)

For best RAG results with longer documents, increase the Ollama context window to match production expectations (e.g., 100K tokens):

  1. Open the Ollama app.
  2. Navigate to Settings → Advanced.
  3. Adjust the context window to your desired size.
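
If you would rather script this than click through the app, Ollama can also bake a larger context window into a model via a Modelfile. A minimal sketch, assuming you have already pulled a model (the model name and new tag below are examples; num_ctx is Ollama's context-length parameter):

# Create a variant of a model with a 100K-token context window
cat > Modelfile <<'EOF'
FROM llama3.1
PARAMETER num_ctx 100000
EOF
ollama create llama3.1-100k -f Modelfile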

4. Create a Project

lf init my-project

This contacts the server (auto-starting it if needed) and writes llamafarm.yaml with default runtime, prompts, and RAG configuration.
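
The exact schema depends on your CLI version, so inspect the generated file to see the defaults you received:

# Review the generated defaults before starting the stack
cat llamafarm.yaml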

5. Start the Local Stack

lf start
  • Spins up the FastAPI server and RAG worker via Docker.
  • Starts a config watcher and opens the interactive dev chat TUI.
  • Shows health diagnostics for Ollama, Celery, and the rag-service.

Hit Ctrl+C to exit the chat UI when you’re done.
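
lf start manages the containers for you, but if a health check fails you can look at them with plain Docker (container names vary; this just lists whatever is running):

# Optional sanity check on the auto-started containers
docker ps --filter "status=running"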

Running Services Manually (no Docker auto-start)

If you want to control each service yourself (useful when hacking on code), launch them with Nx from the repository root:

git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm

npm install -g nx
nx init --useDotNxInstallation --interactive=false

# Option A: start both services together
nx dev

# Option B: start in separate terminals
nx start rag # Terminal 1
nx start server # Terminal 2

Open another terminal to run lf commands against the locally running stack.

6. Chat with Your Project

# Interactive chat (opens TUI using project from llamafarm.yaml)
lf chat

# One-off message
lf chat "What can you do?"

Options you'll likely use:

  • --no-rag – bypass retrieval and hit the runtime directly.
  • --database, --retrieval-strategy – override RAG behaviour.
  • --curl – print the sanitized curl command instead of executing.
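
These flags compose. For example, to inspect the exact request a retrieval-free, one-off message would send (both flags are documented above; combining them is an assumption):

# Print the sanitized curl for a one-off message that bypasses RAG
lf chat --no-rag --curl "What can you do?"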

7. Create and Populate a Dataset

# Create dataset with configured strategy/database
lf datasets create -s pdf_ingest -b main_db research-notes

# Upload documents (supports globs/directories)
lf datasets upload research-notes ./examples/fda_rag/files/*.pdf

The CLI validates strategy and database names against your rag configuration and reports upload successes/failures.

8. Process Documents

lf datasets process research-notes
  • Sends an ingestion job to Celery.
  • Shows heartbeat dots (TTY only) so long-running jobs feel alive.
  • For large PDFs, the worker may need extra time; if you see a timeout message, rerun the command (a retry sketch follows below).
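
If you would rather script the rerun, a naive retry loop works. This sketch assumes lf exits non-zero on a timeout, which the docs above do not guarantee:

# Retry processing up to three times (assumes non-zero exit on timeout)
for attempt in 1 2 3; do
  lf datasets process research-notes && break
  echo "Attempt $attempt failed; retrying..."
done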

9. Query with RAG

lf rag query --database main_db "Which FDA letters mention clinical trial data?"

Useful flags: --top-k 10, --filter "file_type:pdf", --include-metadata, --include-score.
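
All four flags come straight from the list above and can be combined in a single query, for example:

# Query with a larger result set, a type filter, and extra output fields
lf rag query --database main_db --top-k 10 --filter "file_type:pdf" \
  --include-metadata --include-score \
  "Which FDA letters mention clinical trial data?"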

10. Reset Sessions (Optional)

For stateless testing, clear dev history by removing .llamafarm/projects/<namespace>/<project>/dev/context, or start a new namespace/project.
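
On macOS/Linux that comes down to one command. Keep the angle-bracket placeholders quoted and substitute your own namespace and project before running:

# Remove stored dev chat context for a single project
rm -rf ".llamafarm/projects/<namespace>/<project>/dev/context"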

11. Next Steps

Need help? Chat with us on Discord or open a discussion.