API Reference
LlamaFarm provides a comprehensive REST API for managing projects, datasets, chat interactions, and RAG (Retrieval-Augmented Generation) operations. The API follows RESTful conventions and is compatible with OpenAI's chat completion format.
Base URL
The API is served at: http://localhost:8000
All versioned endpoints use the /v1 prefix:
http://localhost:8000/v1
Finding Your Namespace and Project Name
Understanding Namespaces and Projects
LlamaFarm organizes your work into namespaces (organizational containers) and projects (individual configurations):
- Namespace: A top-level organizational unit (e.g., your username, team name, or organization)
- Project Name: The unique identifier for a specific LlamaFarm project within a namespace.
From Your llamafarm.yaml
The easiest way to find your namespace and project name is to check your llamafarm.yaml configuration file:
version: v1
name: my-project # This is your project name
namespace: my-org # This is your namespace
From the File System
Projects are stored in:
~/.llamafarm/projects/{namespace}/{project_name}/
For example, if you see:
~/.llamafarm/projects/acme-corp/chatbot/
Then:
- Namespace: acme-corp
- Project name: chatbot
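You can list both directly from the shell (assuming the default data directory):
ls ~/.llamafarm/projects/            # one directory per namespace
ls ~/.llamafarm/projects/acme-corp/  # one directory per project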
Using the API
You can also list projects programmatically:
# List all projects in a namespace
curl http://localhost:8000/v1/projects/my-org
Custom Data Directory
If you've set a custom data directory using the LF_DATA_DIR environment variable, check:
$LF_DATA_DIR/projects/{namespace}/{project_name}/
Authentication
Currently, the API does not require authentication. This is designed for local development environments. For production deployments, implement authentication at the reverse proxy or load balancer level.
Response Format
Success Responses
Successful requests return JSON with appropriate HTTP status codes (200, 201, etc.):
{
"field": "value",
...
}
Error Responses
Error responses follow a consistent format with appropriate HTTP status codes (400, 404, 500, etc.):
{
"detail": "Error message description"
}
Common HTTP status codes:
- 200 OK - Request succeeded
- 201 Created - Resource created successfully
- 400 Bad Request - Invalid request parameters
- 404 Not Found - Resource not found
- 422 Unprocessable Entity - Validation error
- 500 Internal Server Error - Server error
API Endpoints Overview
Projects
- GET /v1/projects/{namespace} - List projects
- POST /v1/projects/{namespace} - Create project
- GET /v1/projects/{namespace}/{project} - Get project details
- PUT /v1/projects/{namespace}/{project} - Update project configuration
- DELETE /v1/projects/{namespace}/{project} - Delete project
Chat
- POST /v1/projects/{namespace}/{project}/chat/completions - Send chat message (OpenAI-compatible)
- GET /v1/projects/{namespace}/{project}/chat/sessions/{session_id}/history - Get chat history
- DELETE /v1/projects/{namespace}/{project}/chat/sessions/{session_id} - Delete chat session
- DELETE /v1/projects/{namespace}/{project}/chat/sessions - Delete all sessions
- GET /v1/projects/{namespace}/{project}/models - List available models
Datasets
- GET /v1/projects/{namespace}/{project}/datasets - List datasets
- POST /v1/projects/{namespace}/{project}/datasets - Create dataset
- DELETE /v1/projects/{namespace}/{project}/datasets/{dataset} - Delete dataset
- POST /v1/projects/{namespace}/{project}/datasets/{dataset}/data - Upload file to dataset
- POST /v1/projects/{namespace}/{project}/datasets/{dataset}/actions - Trigger dataset actions (ingest/process) via Celery tasks
- DELETE /v1/projects/{namespace}/{project}/datasets/{dataset}/data/{file_hash} - Remove file from dataset
RAG (Retrieval-Augmented Generation)
- POST /v1/projects/{namespace}/{project}/rag/query - Query RAG system
- GET /v1/projects/{namespace}/{project}/rag/health - Check RAG health
- GET /v1/projects/{namespace}/{project}/rag/stats - Get RAG statistics
- GET /v1/projects/{namespace}/{project}/rag/databases - List databases
- GET /v1/projects/{namespace}/{project}/rag/databases/{database} - Get database details
- GET /v1/projects/{namespace}/{project}/rag/databases/{database}/documents - List documents in database
- POST /v1/projects/{namespace}/{project}/rag/databases - Create database
- PATCH /v1/projects/{namespace}/{project}/rag/databases/{database} - Update database
- DELETE /v1/projects/{namespace}/{project}/rag/databases/{database} - Delete database
Tasks
- GET /v1/projects/{namespace}/{project}/tasks/{task_id} - Get async task status
- DELETE /v1/projects/{namespace}/{project}/tasks/{task_id} - Cancel running task
Event Logs
- GET /v1/projects/{namespace}/{project}/event_logs - List event logs
- GET /v1/projects/{namespace}/{project}/event_logs/{event_id} - Get event details
Examples
- GET /v1/examples - List available examples
- GET /v1/examples/{example_id}/datasets - List example datasets
- POST /v1/examples/{example_id}/import-project - Import example as new project
- POST /v1/examples/{example_id}/import-data - Import example data into existing project
- POST /v1/examples/{example_id}/import-dataset - Import specific dataset from example
Models Cache
- GET /v1/models - List cached models
- POST /v1/models/download - Download/cache a model
- POST /v1/models/validate-download - Check disk space before download
- GET /v1/models/{model_id}/quantizations - List GGUF quantization options
- DELETE /v1/models/{model_name} - Delete cached model
Vision (OCR & Document Extraction)
- POST /v1/vision/ocr - OCR text extraction (accepts file upload or base64)
- POST /v1/vision/documents/extract - Document extraction/VQA (accepts file upload or base64)
ML (Custom Classifiers & Anomaly Detection)
- POST /v1/ml/classifier/fit - Train custom text classifier (SetFit few-shot)
- POST /v1/ml/classifier/predict - Classify texts using trained model
- POST /v1/ml/classifier/save - Save trained classifier to disk
- POST /v1/ml/classifier/load - Load classifier from disk
- GET /v1/ml/classifier/models - List saved classifiers
- DELETE /v1/ml/classifier/models/{name} - Delete saved classifier
- POST /v1/ml/anomaly/fit - Train anomaly detector
- POST /v1/ml/anomaly/score - Score data points for anomalies
- POST /v1/ml/anomaly/detect - Detect anomalies (returns only anomalous points)
- POST /v1/ml/anomaly/save - Save trained anomaly model
- POST /v1/ml/anomaly/load - Load anomaly model from disk
- GET /v1/ml/anomaly/models - List saved anomaly models
- DELETE /v1/ml/anomaly/models/{filename} - Delete saved anomaly model
Health
- GET /health - Overall health check
- GET /health/liveness - Liveness probe
System
- GET / - Basic hello endpoint
- GET /info - System information
- GET /v1/system/version-check - Check for CLI updates
- GET /v1/system/disk - Get disk space information
Projects API
List Projects
List all projects in a namespace.
Endpoint: GET /v1/projects/{namespace}
Parameters:
- namespace (path, required): The namespace to list projects from
Response:
{
"total": 2,
"projects": [
{
"namespace": "my-org",
"name": "chatbot",
"config": {
"version": "v1",
"name": "chatbot",
"namespace": "my-org",
"runtime": { ... },
...
}
}
]
}
Example:
curl http://localhost:8000/v1/projects/my-org
Create Project
Create a new project in a namespace.
Endpoint: POST /v1/projects/{namespace}
Parameters:
- namespace (path, required): The namespace to create the project in
Request Body:
{
"name": "my-new-project",
"config_template": "server" // Optional: server, rag, or custom template name
}
Response:
{
"project": {
"namespace": "my-org",
"name": "my-new-project",
"config": { ... }
}
}
Example:
curl -X POST http://localhost:8000/v1/projects/my-org \
-H "Content-Type: application/json" \
-d '{"name": "chatbot", "config_template": "server"}'
Get Project
Get details of a specific project.
Endpoint: GET /v1/projects/{namespace}/{project}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Response:
{
"project": {
"namespace": "my-org",
"name": "chatbot",
"config": {
"version": "v1",
"name": "chatbot",
"namespace": "my-org",
"runtime": {
"provider": "ollama",
"model": "llama3.2:3b"
},
...
}
}
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot
Update Project
Update a project's configuration.
Endpoint: PUT /v1/projects/{namespace}/{project}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Request Body:
{
"config": {
"version": "v1",
"name": "chatbot",
"namespace": "my-org",
"runtime": {
"provider": "ollama",
"model": "llama3.2:3b"
},
...
}
}
Response:
{
"project": {
"namespace": "my-org",
"name": "chatbot",
"config": { ... }
}
}
Example:
curl -X PUT http://localhost:8000/v1/projects/my-org/chatbot \
-H "Content-Type: application/json" \
-d @updated-config.json
Delete Project
Delete a project (currently returns project info; actual deletion not implemented).
Endpoint: DELETE /v1/projects/{namespace}/{project}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Response:
{
"project": {
"namespace": "my-org",
"name": "chatbot",
"config": { ... }
}
}
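Example:
curl -X DELETE http://localhost:8000/v1/projects/my-org/chatbot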
Chat API
Send Chat Message (OpenAI-Compatible)
Send a chat message to the LLM. This endpoint is compatible with OpenAI's chat completion API.
Endpoint: POST /v1/projects/{namespace}/{project}/chat/completions
Headers:
- X-Session-ID (optional): Session ID for stateful conversations. If not provided, a new session is created.
- X-No-Session (optional): Set to any value for stateless mode (no conversation history)
Request Body:
{
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "What is LlamaFarm?"
}
],
"stream": false,
"temperature": 0.7,
"max_tokens": 1000,
"rag_enabled": true,
"database": "main_db",
"rag_top_k": 5,
"rag_score_threshold": 0.7
}
Request Fields:
- messages (required): Array of chat messages with role and content
- model (optional): Select which model to use (OpenAI-compatible, added in PR #263 multi-model support)
- stream (optional): Enable streaming responses (Server-Sent Events)
- temperature (optional): Sampling temperature (0.0-2.0)
- max_tokens (optional): Maximum tokens to generate for the answer (thinking tokens are separate)
- top_p (optional): Nucleus sampling parameter
- top_k (optional): Top-k sampling parameter
- rag_enabled (optional): Enable/disable RAG (uses config default if not specified)
- database (optional): Database to use for RAG queries
- rag_top_k (optional): Number of RAG results to retrieve
- rag_score_threshold (optional): Minimum similarity score for RAG results
- rag_queries (optional): Array of custom queries for RAG retrieval, overriding the user message. Can be a single query ["my query"] or multiple queries ["query1", "query2"]; results from multiple queries are executed concurrently, merged, and deduplicated
- think (optional): Enable thinking/reasoning mode for supported models like Qwen3 (default: false)
- thinking_budget (optional): Maximum tokens for thinking process when think: true (default: 1024)
Response (Non-Streaming):
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "llama3.2:3b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "LlamaFarm is a framework for building AI applications..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 20,
"completion_tokens": 100,
"total_tokens": 120
}
}
Response Headers:
- X-Session-ID: The session ID (only in stateful mode)
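To see the session ID from the shell so you can reuse it, one option is curl's -i flag, which prints response headers (a sketch):
curl -i -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
# The response headers include X-Session-ID; pass its value back on
# the next request with: -H "X-Session-ID: <that-value>"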
Streaming Response:
When stream: true, the response is sent as Server-Sent Events:
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"llama3.2:3b","choices":[{"index":0,"delta":{"role":"assistant","content":"Llama"},"finish_reason":null}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"llama3.2:3b","choices":[{"index":0,"delta":{"content":"Farm"},"finish_reason":null}]}
data: [DONE]
Example (Non-Streaming):
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "Hello!"}
]
}'
Example (Streaming):
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "Hello!"}
],
"stream": true
}'
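To reassemble the streamed tokens in the shell, you can strip the SSE data: prefix and join the content deltas (a sketch; assumes jq is installed):
curl -sN -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}], "stream": true}' \
  | sed -n 's/^data: //p' \
  | grep -v '^\[DONE\]' \
  | jq -rj '.choices[0].delta.content // empty'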
Example (Stateless):
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
-H "Content-Type: application/json" \
-H "X-No-Session: true" \
-d '{
"messages": [
{"role": "user", "content": "Hello!"}
]
}'
Example (With RAG):
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "What are the FDA regulations?"}
],
"rag_enabled": true,
"database": "fda_db",
"rag_top_k": 10
}'
Example (With Thinking/Reasoning):
For models that support chain-of-thought reasoning (like Qwen3), enable thinking mode to see the model's reasoning process:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "What is 15% of 85?"}
],
"think": true,
"thinking_budget": 512,
"max_tokens": 200
}'
Response with Thinking:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "Qwen3-1.7B",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "15% of 85 is **12.75**."
},
"finish_reason": "stop"
}
],
"thinking": {
"content": "To find 15% of 85, I need to multiply 85 by 0.15. Let me calculate: 85 × 0.15 = 12.75.",
"tokens": null
}
}
Token Allocation with Thinking:
- max_tokens: Controls the answer length (default: 512)
- thinking_budget: Controls the thinking length (default: 1024 when enabled)
- Total generation = thinking_budget + max_tokens
This ensures your answer isn't cut short by the thinking process.
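For example, the request above allows up to 512 thinking tokens plus 200 answer tokens, i.e. 712 generated tokens in total.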
Example (Custom RAG Query):
Override the default RAG query (which uses the user message) with a custom search query. This is useful when the user's question is conversational but you want specific technical retrieval:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "Can you summarize the key findings?"}
],
"rag_enabled": true,
"database": "research_db",
"rag_queries": ["clinical trial results primary endpoints efficacy safety"]
}'
Example (Multiple Custom RAG Queries):
Execute multiple search queries concurrently and merge the results. This is useful for comparative analysis or comprehensive retrieval:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "Compare the two approaches"}
],
"rag_enabled": true,
"database": "research_db",
"rag_queries": [
"machine learning neural network methodology",
"traditional statistical analysis regression"
],
"rag_top_k": 10
}'
Results from multiple queries are automatically executed concurrently, merged, deduplicated by content, sorted by relevance score, and limited to rag_top_k total results.
Get Chat History
Retrieve conversation history for a session.
Endpoint: GET /v1/projects/{namespace}/{project}/chat/sessions/{session_id}/history
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- session_id (path, required): Session ID
Response:
{
"messages": [
{
"role": "user",
"content": "Hello!"
},
{
"role": "assistant",
"content": "Hi! How can I help you today?"
}
]
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/chat/sessions/abc-123/history
Delete Chat Session
Delete a specific chat session and its history.
Endpoint: DELETE /v1/projects/{namespace}/{project}/chat/sessions/{session_id}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- session_id (path, required): Session ID
Response:
{
"message": "Session abc-123 deleted"
}
Example:
curl -X DELETE http://localhost:8000/v1/projects/my-org/chatbot/chat/sessions/abc-123
Delete All Chat Sessions
Delete all chat sessions for a project.
Endpoint: DELETE /v1/projects/{namespace}/{project}/chat/sessions
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Response:
{
"message": "Deleted 5 session(s)",
"count": 5
}
Example:
curl -X DELETE http://localhost:8000/v1/projects/my-org/chatbot/chat/sessions
List Available Models
List all configured models for a project.
Endpoint: GET /v1/projects/{namespace}/{project}/models
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Response:
{
"total": 2,
"models": [
{
"name": "fast-model",
"provider": "ollama",
"model": "llama3.2:3b",
"base_url": "http://localhost:11434/v1",
"default": true,
"description": "Fast model for quick responses"
},
{
"name": "smart-model",
"provider": "ollama",
"model": "llama3.2:70b",
"base_url": "http://localhost:11434/v1",
"default": false,
"description": "Larger model for complex tasks"
}
]
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/models
Datasets API
List Datasets
List all datasets in a project.
Endpoint: GET /v1/projects/{namespace}/{project}/datasets
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- include_extra_details (query, optional): Include detailed file information (default: true)
Response:
{
"total": 2,
"datasets": [
{
"name": "research_papers",
"database": "main_db",
"data_processing_strategy": "universal_processor",
"files": ["abc123", "def456"],
"details": {
"total_files": 2,
"total_size_bytes": 1048576,
"file_details": [
{
"hash": "abc123",
"original_filename": "paper1.pdf",
"size": 524288,
"timestamp": 1677652288.0
}
]
}
}
]
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/datasets
Get Available Strategies
Get available data processing strategies and databases for a project.
Endpoint: GET /v1/projects/{namespace}/{project}/datasets/strategies
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Response:
{
"data_processing_strategies": ["universal_processor", "custom_strategy"],
"databases": ["main_db", "research_db"]
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/datasets/strategies
Create Dataset
Create a new dataset.
Endpoint: POST /v1/projects/{namespace}/{project}/datasets
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Request Body:
{
"name": "research_papers",
"data_processing_strategy": "universal_processor",
"database": "main_db"
}
Response:
{
"dataset": {
"name": "research_papers",
"database": "main_db",
"data_processing_strategy": "universal_processor",
"files": []
}
}
Example:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/datasets \
-H "Content-Type: application/json" \
-d '{
"name": "research_papers",
"data_processing_strategy": "universal_processor",
"database": "main_db"
}'
Delete Dataset
Delete a dataset.
Endpoint: DELETE /v1/projects/{namespace}/{project}/datasets/{dataset}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- dataset (path, required): Dataset name
Response:
{
"dataset": {
"name": "research_papers",
"database": "main_db",
"data_processing_strategy": "universal_processor",
"files": []
}
}
Example:
curl -X DELETE http://localhost:8000/v1/projects/my-org/chatbot/datasets/research_papers
Upload File to Dataset
Upload a file to a dataset (stores the file but does not process it).
Endpoint: POST /v1/projects/{namespace}/{project}/datasets/{dataset}/data
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- dataset (path, required): Dataset name
Request:
- Content-Type: multipart/form-data
- Body: File upload with field name file
Response:
{
"filename": "paper1.pdf",
"hash": "abc123def456",
"processed": false
}
Example:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/datasets/research_papers/data \
-F "file=@paper1.pdf"
Process Dataset
Processing is now driven exclusively through the dataset actions endpoint, which queues Celery tasks and returns a task ID you can poll later.
Endpoint: POST /v1/projects/{namespace}/{project}/datasets/{dataset}/actions
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- dataset (path, required): Dataset name
- action_type (body, required): "process" (alias "ingest")
Request Body:
{
"action_type": "process"
}
Response:
{
"message": "Accepted",
"task_uri": "http://localhost:8000/v1/projects/my-org/chatbot/tasks/8f6f9c2a",
"task_id": "8f6f9c2a"
}
Use task_uri/task_id with GET /v1/projects/{namespace}/{project}/tasks/{task_id} to monitor progress. When the Celery task finishes, the result payload matches the historical ProcessDatasetResponse structure (processed/skipped/failed counts plus per-file details).
Example:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/datasets/research_papers/actions \
-H "Content-Type: application/json" \
-d '{"action_type":"process"}'
Remove File from Dataset
Remove a file from a dataset.
Endpoint: DELETE /v1/projects/{namespace}/{project}/datasets/{dataset}/data/{file_hash}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- dataset (path, required): Dataset name
- file_hash (path, required): Hash of the file to remove
- remove_from_disk (query, optional): Also delete the file from disk (default: false)
Response:
{
"file_hash": "abc123"
}
Example:
curl -X DELETE http://localhost:8000/v1/projects/my-org/chatbot/datasets/research_papers/data/abc123
Example (Remove from disk):
curl -X DELETE "http://localhost:8000/v1/projects/my-org/chatbot/datasets/research_papers/data/abc123?remove_from_disk=true"
RAG API
Query RAG System
Perform a semantic search query against a RAG database.
Endpoint: POST /v1/projects/{namespace}/{project}/rag/query
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Request Body:
{
"query": "What are the clinical trial requirements?",
"database": "fda_db",
"top_k": 5,
"score_threshold": 0.7,
"retrieval_strategy": "hybrid",
"metadata_filters": {
"document_type": "regulation"
}
}
Request Fields:
- query (required): The search query text
- database (optional): Database name (uses default if not specified)
- top_k (optional): Number of results to return (default: 5)
- score_threshold (optional): Minimum similarity score
- retrieval_strategy (optional): Strategy to use for retrieval
- metadata_filters (optional): Filter results by metadata
- distance_metric (optional): Distance metric to use
- hybrid_alpha (optional): Alpha parameter for hybrid search
- rerank_model (optional): Model to use for reranking
- query_expansion (optional): Enable query expansion
- max_tokens (optional): Maximum tokens in results
Response:
{
"query": "What are the clinical trial requirements?",
"results": [
{
"content": "Clinical trials must follow FDA regulations...",
"score": 0.92,
"metadata": {
"document_id": "fda_21cfr312",
"page": 5,
"document_type": "regulation"
},
"chunk_id": "chunk_123",
"document_id": "fda_21cfr312"
}
],
"total_results": 5,
"processing_time_ms": 45.2,
"retrieval_strategy_used": "hybrid",
"database_used": "fda_db"
}
Example:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/rag/query \
-H "Content-Type: application/json" \
-d '{
"query": "What are the clinical trial requirements?",
"database": "fda_db",
"top_k": 5
}'
List RAG Databases
List all configured RAG databases and their associated strategies for a project.
Endpoint: GET /v1/projects/{namespace}/{project}/rag/databases
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Response:
{
"databases": [
{
"name": "main_db",
"type": "ChromaStore",
"is_default": true,
"embedding_strategies": [
{
"name": "default_embeddings",
"type": "OllamaEmbedder",
"priority": 0,
"is_default": true
}
],
"retrieval_strategies": [
{
"name": "basic_search",
"type": "BasicSimilarityStrategy",
"is_default": true
}
]
}
],
"default_database": "main_db"
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/rag/databases
Get Database Details
Get detailed information about a specific RAG database including its configuration and dependent datasets.
Endpoint: GET /v1/projects/{namespace}/{project}/rag/databases/{database_name}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- database_name (path, required): Name of the database
Response:
{
"name": "main_db",
"type": "ChromaStore",
"config": {
"collection_name": "documents",
"distance_function": "cosine"
},
"embedding_strategies": [
{
"name": "default_embeddings",
"type": "OllamaEmbedder",
"config": {
"model": "nomic-embed-text",
"dimension": 768
},
"priority": 0
}
],
"retrieval_strategies": [
{
"name": "basic_search",
"type": "BasicSimilarityStrategy",
"config": { "top_k": 10 },
"default": true
}
],
"default_embedding_strategy": "default_embeddings",
"default_retrieval_strategy": "basic_search",
"dependent_datasets": ["research_papers", "documentation"]
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/rag/databases/main_db
Create Database
Create a new RAG database in the project configuration.
Endpoint: POST /v1/projects/{namespace}/{project}/rag/databases
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
Request Body:
{
"name": "new_database",
"type": "ChromaStore",
"config": {
"collection_name": "my_collection",
"distance_function": "cosine"
},
"embedding_strategies": [
{
"name": "embeddings",
"type": "OllamaEmbedder",
"config": {
"model": "nomic-embed-text",
"dimension": 768
}
}
],
"retrieval_strategies": [
{
"name": "basic_search",
"type": "BasicSimilarityStrategy",
"config": { "top_k": 10 },
"default": true
}
]
}
Response (201 Created):
{
"database": {
"name": "new_database",
"type": "ChromaStore",
"is_default": false,
"embedding_strategies": [...],
"retrieval_strategies": [...]
}
}
Example:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/rag/databases \
-H "Content-Type: application/json" \
-d '{
"name": "new_database",
"type": "ChromaStore",
"embedding_strategies": [
{"name": "embeddings", "type": "OllamaEmbedder", "config": {"model": "nomic-embed-text"}}
],
"retrieval_strategies": [
{"name": "basic", "type": "BasicSimilarityStrategy", "config": {}, "default": true}
]
}'
Update Database
Update a RAG database's mutable fields. Note: name and type are immutable.
Endpoint: PATCH /v1/projects/{namespace}/{project}/rag/databases/{database_name}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- database_name (path, required): Name of the database
Request Body (all fields optional):
{
"config": {
"distance_function": "euclidean"
},
"embedding_strategies": [...],
"retrieval_strategies": [...],
"default_embedding_strategy": "new_default",
"default_retrieval_strategy": "reranked_search"
}
Response:
{
"name": "main_db",
"type": "ChromaStore",
"config": {...},
"embedding_strategies": [...],
"retrieval_strategies": [...],
"default_embedding_strategy": "new_default",
"default_retrieval_strategy": "reranked_search",
"dependent_datasets": []
}
Example - Add a reranking strategy:
curl -X PATCH http://localhost:8000/v1/projects/my-org/chatbot/rag/databases/main_db \
-H "Content-Type: application/json" \
-d '{
"retrieval_strategies": [
{"name": "basic_search", "type": "BasicSimilarityStrategy", "config": {"top_k": 10}},
{"name": "reranked_search", "type": "CrossEncoderRerankedStrategy", "config": {"model_name": "reranker", "initial_k": 30}}
],
"default_retrieval_strategy": "reranked_search"
}'
Delete Database
Delete a RAG database from the project. Fails if any datasets depend on this database.
Endpoint: DELETE /v1/projects/{namespace}/{project}/rag/databases/{database_name}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- database_name (path, required): Name of the database
- delete_collection (query, optional): Whether to delete the underlying vector store collection. Set to false to only remove from config. Default: true
Response (200 OK):
{
"message": "Database 'old_db' deleted successfully",
"database": {
"name": "old_db",
"type": "ChromaStore",
...
},
"collection_deleted": true
}
Error Response (409 Conflict - has dependent datasets):
{
"detail": "Cannot delete database 'main_db': 2 dataset(s) depend on it. Delete or reassign these datasets first: ['dataset1', 'dataset2']"
}
Example:
# Delete database and its collection
curl -X DELETE http://localhost:8000/v1/projects/my-org/chatbot/rag/databases/old_db
# Only remove from config, keep the vector store data
curl -X DELETE "http://localhost:8000/v1/projects/my-org/chatbot/rag/databases/old_db?delete_collection=false"
Check RAG Health
Get health status of the RAG system and databases.
Endpoint: GET /v1/projects/{namespace}/{project}/rag/health
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- database (query, optional): Specific database to check (uses default if not specified)
Response:
{
"status": "healthy",
"database": "main_db",
"components": {
"vector_store": {
"name": "vector_store",
"status": "healthy",
"latency": 15.2,
"message": "Vector store operational"
},
"embeddings": {
"name": "embeddings",
"status": "healthy",
"latency": 8.5,
"message": "Embedding service operational"
}
},
"last_check": "2024-01-15T10:30:00Z",
"issues": null
}
Response Fields:
- status: Overall health status (healthy, degraded, unhealthy)
- database: Database that was checked
- components: Individual component health checks
- last_check: Timestamp of the health check
- issues: Array of issues if any problems detected
Component Health:
- name: Component identifier
- status: Component status (healthy, degraded, unhealthy)
- latency: Response time in milliseconds
- message: Optional status message
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/rag/health
Example (Specific database):
curl "http://localhost:8000/v1/projects/my-org/chatbot/rag/health?database=main_db"
Get RAG Statistics
Get statistics for a RAG database including vector counts and storage usage.
Endpoint: GET /v1/projects/{namespace}/{project}/rag/stats
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- database (query, optional): Specific database to get stats for (uses default if not specified)
Response:
{
"database": "main_db",
"vector_count": 1250,
"storage_bytes": 52428800,
"storage_human": "50 MB",
"embedding_dimension": 768,
"collection_name": "documents",
"last_updated": "2024-01-15T10:30:00Z"
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/rag/stats
Example (Specific database):
curl "http://localhost:8000/v1/projects/my-org/chatbot/rag/stats?database=research_db"
List Documents in Database
List all documents stored in a RAG database with their metadata.
Endpoint: GET /v1/projects/{namespace}/{project}/rag/databases/{database_name}/documents
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- database_name (path, required): Name of the database
- limit (query, optional): Maximum documents to return (1-1000, default: 50)
Response:
[
{
"id": "abc123def456",
"filename": "clinical_trial_report.pdf",
"chunk_count": 45,
"size_bytes": 1048576,
"parser_used": "PDFParser_LlamaIndex",
"date_ingested": "2024-01-15T10:30:00Z",
"metadata": {
"document_type": "report",
"source_dataset": "research_papers"
}
},
{
"id": "789xyz",
"filename": "fda_guidelines.pdf",
"chunk_count": 128,
"size_bytes": 2097152,
"parser_used": "PDFParser_PyPDF2",
"date_ingested": "2024-01-14T15:20:00Z",
"metadata": null
}
]
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/rag/databases/main_db/documents
Example (With limit):
curl "http://localhost:8000/v1/projects/my-org/chatbot/rag/databases/main_db/documents?limit=100"
Tasks API
Get Task Status
Get the status of an asynchronous task.
Endpoint: GET /v1/projects/{namespace}/{project}/tasks/{task_id}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- task_id (path, required): Task ID returned from async operations
Response:
{
"task_id": "task-123-abc",
"state": "SUCCESS",
"meta": {
"current": 2,
"total": 2
},
"result": {
"processed_files": 2,
"failed_files": 0
},
"error": null,
"traceback": null
}
Task States:
- PENDING - Task is queued
- STARTED - Task is running
- SUCCESS - Task completed successfully
- FAILURE - Task failed with error
- RETRY - Task is being retried
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/tasks/task-123-abc
Cancel Task
Cancel a running task and revert any files that were successfully processed.
Endpoint: DELETE /v1/projects/{namespace}/{project}/tasks/{task_id}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- task_id (path, required): Task ID to cancel
Description:
This endpoint is primarily used for cancelling dataset processing operations. When a task is cancelled:
- All pending Celery tasks are revoked (prevented from starting)
- Running tasks are gracefully stopped (current work finishes)
- The task is marked as cancelled in the backend
- Any files that were successfully processed have their chunks removed from the vector store
Response:
{
"message": "Task cancelled and 3 file(s) reverted",
"task_id": "task-123-abc",
"cancelled": true,
"pending_tasks_cancelled": 5,
"running_tasks_at_cancel": 2,
"files_reverted": 3,
"files_failed_to_revert": 0,
"errors": null,
"already_completed": false,
"already_cancelled": false
}
Response Fields:
- message - Human-readable status message
- task_id - The ID of the cancelled task
- cancelled - Whether the task was successfully cancelled
- pending_tasks_cancelled - Number of queued tasks that were stopped
- running_tasks_at_cancel - Number of tasks that were running when cancelled
- files_reverted - Number of files whose chunks were successfully removed
- files_failed_to_revert - Number of files that failed to clean up
- errors - Array of cleanup errors (if any)
- already_completed - True if task had already completed before cancellation
- already_cancelled - True if task was already cancelled
Edge Cases:
Task Already Completed:
{
"message": "Task already success",
"task_id": "task-123-abc",
"cancelled": false,
"already_completed": true,
...
}
Task Already Cancelled:
{
"message": "Task already cancelled",
"task_id": "task-123-abc",
"cancelled": true,
"already_cancelled": true,
"files_reverted": 3,
...
}
Cleanup Failures:
{
"message": "Task cancelled with cleanup issues: 3 reverted, 1 failed",
"task_id": "task-123-abc",
"cancelled": true,
"files_reverted": 3,
"files_failed_to_revert": 1,
"errors": [
{
"file_hash": "abc123def456",
"error": "Vector store connection timeout"
}
],
...
}
Example:
# Cancel a running dataset processing task
curl -X DELETE http://localhost:8000/v1/projects/my-org/chatbot/tasks/task-123-abc
HTTP Status Codes:
- 200 OK - Task cancellation succeeded (or task already completed/cancelled)
- 404 Not Found - Task not found or not a cancellable task type
- 500 Internal Server Error - Server error during cancellation
Notes:
- Only group tasks (dataset processing) can be cancelled
- Security: Tasks can only be cancelled by the namespace/project they belong to
- Cancellation is idempotent (safe to call multiple times)
- Cleanup failures don't prevent cancellation from succeeding
- Successfully processed files are automatically reverted
- Manual cleanup is available via POST /v1/projects/{namespace}/{project}/datasets/{dataset}/cleanup/{file_hash} if automatic cleanup fails (see the example below)
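Example (manual cleanup, using the file hash reported in the errors array):
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/datasets/research_papers/cleanup/abc123def456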
Examples API
List Examples
List all available example projects.
Endpoint: GET /v1/examples
Response:
{
"examples": [
{
"id": "fda_rag",
"slug": "fda_rag",
"title": "FDA RAG Example",
"description": "Example using FDA warning letters",
"primaryModel": "llama3.2:3b",
"tags": ["rag", "healthcare"],
"dataset_count": 1,
"data_size_bytes": 2097152,
"data_size_human": "2.0MB",
"project_size_bytes": 4096,
"project_size_human": "4.0KB",
"updated_at": "2024-01-15T10:30:00"
}
]
}
Example:
curl http://localhost:8000/v1/examples
Import Example as Project
Import an example as a new project.
Endpoint: POST /v1/examples/{example_id}/import-project
Parameters:
- example_id (path, required): Example ID to import
Request Body:
{
"namespace": "my-org",
"name": "my-fda-project",
"process": true
}
Response:
{
"project": "my-fda-project",
"namespace": "my-org",
"datasets": ["fda_letters"],
"task_ids": ["task-123-abc"]
}
Example:
curl -X POST http://localhost:8000/v1/examples/fda_rag/import-project \
-H "Content-Type: application/json" \
-d '{
"namespace": "my-org",
"name": "my-fda-project",
"process": true
}'
Import Example Data
Import example data into an existing project.
Endpoint: POST /v1/examples/{example_id}/import-data
Parameters:
- example_id (path, required): Example ID to import data from
Request Body:
{
"namespace": "my-org",
"project": "my-project",
"include_strategies": true,
"process": true
}
Response:
{
"project": "my-project",
"namespace": "my-org",
"datasets": ["fda_letters"],
"task_ids": ["task-456-def"]
}
Example:
curl -X POST http://localhost:8000/v1/examples/fda_rag/import-data \
-H "Content-Type: application/json" \
-d '{
"namespace": "my-org",
"project": "my-project",
"include_strategies": true,
"process": true
}'
Health API
Overall Health Check
Check overall system health.
Endpoint: GET /health
Response:
{
"status": "healthy",
"services": {
"api": "healthy",
"celery": "healthy",
"redis": "healthy"
}
}
Example:
curl http://localhost:8000/health
Liveness Probe
Simple liveness check for container orchestration.
Endpoint: GET /health/liveness
Response:
{
"status": "alive"
}
Example:
curl http://localhost:8000/health/liveness
System Info
Root Endpoint
Basic hello endpoint.
Endpoint: GET /
Response:
{
"message": "Hello, World!"
}
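Example:
curl http://localhost:8000/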
System Information
Get system version and configuration info.
Endpoint: GET /info
Response:
{
"version": "0.1.0",
"data_directory": "/Users/username/.llamafarm"
}
Example:
curl http://localhost:8000/info
Check for CLI Updates
Check if a newer version of the CLI is available.
Endpoint: GET /v1/system/version-check
Response:
{
"current_version": "0.0.17",
"latest_version": "0.0.18",
"name": "v0.0.18",
"release_notes": "### Features\n- New feature X\n- Improved Y",
"release_url": "https://github.com/llama-farm/llamafarm/releases/tag/v0.0.18",
"published_at": "2024-01-15T10:30:00Z",
"from_cache": false,
"install": {
"mac_linux": "curl -fsSL https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.sh | bash",
"windows": "winget install LlamaFarm.CLI"
}
}
Example:
curl http://localhost:8000/v1/system/version-check
Get Disk Space
Get disk space information for the HuggingFace cache and system disk.
Endpoint: GET /v1/system/disk
Response:
{
"cache": {
"total_bytes": 500000000000,
"used_bytes": 200000000000,
"free_bytes": 300000000000,
"path": "/Users/username/.cache/huggingface",
"percent_free": 60.0
},
"system": {
"total_bytes": 1000000000000,
"used_bytes": 400000000000,
"free_bytes": 600000000000,
"path": "/",
"percent_free": 60.0
}
}
Response Fields:
- cache: Disk info for the HuggingFace cache directory (where models are stored)
- system: Disk info for the system root directory
- Each contains: total_bytes, used_bytes, free_bytes, path, percent_free
Example:
curl http://localhost:8000/v1/system/disk
Event Logs API
The Event Logs API provides observability into project operations including inference calls, RAG processing, and other events.
List Event Logs
List event logs for a project with optional filtering.
Endpoint: GET /v1/projects/{namespace}/{project}/event_logs
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- type (query, optional): Filter by event type (e.g., "inference", "rag_processing")
- start_time (query, optional): Filter events after this timestamp (ISO 8601 format)
- end_time (query, optional): Filter events before this timestamp (ISO 8601 format)
- limit (query, optional): Maximum number of events to return (1-100, default: 10)
- offset (query, optional): Number of events to skip for pagination
Response:
{
"total": 42,
"events": [
{
"event_id": "evt_20240115_103000_inference_abc123",
"type": "inference",
"timestamp": "2024-01-15T10:30:00Z",
"summary": {
"model": "llama3.2:3b",
"tokens": 150,
"duration_ms": 1200
}
}
],
"limit": 10,
"offset": 0
}
Example:
# List recent events
curl http://localhost:8000/v1/projects/my-org/chatbot/event_logs
# Filter by type with pagination
curl "http://localhost:8000/v1/projects/my-org/chatbot/event_logs?type=inference&limit=20"
# Filter by time range
curl "http://localhost:8000/v1/projects/my-org/chatbot/event_logs?start_time=2024-01-15T00:00:00Z"
Get Event Details
Get full details of a specific event including all sub-events.
Endpoint: GET /v1/projects/{namespace}/{project}/event_logs/{event_id}
Parameters:
- namespace (path, required): Project namespace
- project (path, required): Project name
- event_id (path, required): Event ID
Response:
{
"event_id": "evt_20240115_103000_inference_abc123",
"type": "inference",
"timestamp": "2024-01-15T10:30:00Z",
"data": {
"model": "llama3.2:3b",
"messages": [...],
"response": {...},
"tokens": {
"prompt": 50,
"completion": 100,
"total": 150
},
"duration_ms": 1200
},
"sub_events": [...]
}
Example:
curl http://localhost:8000/v1/projects/my-org/chatbot/event_logs/evt_20240115_103000_inference_abc123
Models Cache API
The Models Cache API allows you to manage locally cached models (primarily HuggingFace models used by Universal Runtime).
List Cached Models
List all models cached on disk.
Endpoint: GET /v1/models
Parameters:
- provider (query, optional): Model provider (default: "universal")
Response:
{
"data": [
{
"model_id": "cross-encoder/ms-marco-MiniLM-L-6-v2",
"size_bytes": 90000000,
"size_human": "90MB",
"last_modified": "2024-01-15T10:30:00Z",
"revisions": ["main"]
}
]
}
Example:
curl http://localhost:8000/v1/models
Download/Cache Model
Download and cache a model. Returns a streaming response with progress events.
Endpoint: POST /v1/models/download
Request Body:
{
"provider": "universal",
"model_name": "cross-encoder/ms-marco-MiniLM-L-6-v2"
}
Response: Server-Sent Events stream with progress updates:
data: {"event": "progress", "downloaded": 45000000, "total": 90000000, "percent": 50}
data: {"event": "complete", "model_name": "cross-encoder/ms-marco-MiniLM-L-6-v2"}
Example:
curl -X POST http://localhost:8000/v1/models/download \
-H "Content-Type: application/json" \
-d '{"model_name": "cross-encoder/ms-marco-MiniLM-L-6-v2"}'
Validate Download
Check if there's sufficient disk space for a model download before starting.
Endpoint: POST /v1/models/validate-download
Request Body:
{
"model_name": "unsloth/Qwen3-1.7B-GGUF"
}
Response:
{
"can_download": true,
"warning": false,
"message": "Sufficient disk space available",
"available_bytes": 107374182400,
"required_bytes": 1073741824,
"cache_info": {
"total_bytes": 500000000000,
"used_bytes": 200000000000,
"free_bytes": 300000000000,
"path": "/Users/username/.cache/huggingface",
"percent_free": 60.0
},
"system_info": {
"total_bytes": 1000000000000,
"used_bytes": 400000000000,
"free_bytes": 600000000000,
"path": "/",
"percent_free": 60.0
}
}
Response Fields:
- can_download: Whether download can proceed (false if critically low space)
- warning: Whether space is low but download can proceed
- message: Human-readable status message
- available_bytes: Available disk space in bytes
- required_bytes: Estimated space required for download
- cache_info: Disk info for HuggingFace cache location
- system_info: Disk info for system root
Example:
curl -X POST http://localhost:8000/v1/models/validate-download \
-H "Content-Type: application/json" \
-d '{"model_name": "unsloth/Qwen3-1.7B-GGUF"}'
Get GGUF Quantization Options
List all available GGUF quantization options for a model with file sizes.
Endpoint: GET /v1/models/{model_id}/quantizations
Parameters:
- model_id (path, required): HuggingFace model identifier (e.g., "unsloth/Qwen3-1.7B-GGUF")
Response:
{
"options": [
{
"filename": "Qwen3-1.7B-Q4_K_M.gguf",
"quantization": "Q4_K_M",
"size_bytes": 1073741824,
"size_human": "1.0 GB"
},
{
"filename": "Qwen3-1.7B-Q8_0.gguf",
"quantization": "Q8_0",
"size_bytes": 1879048192,
"size_human": "1.75 GB"
},
{
"filename": "Qwen3-1.7B-F16.gguf",
"quantization": "F16",
"size_bytes": 3489660928,
"size_human": "3.25 GB"
}
]
}
Example:
curl http://localhost:8000/v1/models/unsloth/Qwen3-1.7B-GGUF/quantizations
Delete Cached Model
Delete a cached model from disk.
Endpoint: DELETE /v1/models/{model_name}
Parameters:
- model_name (path, required): The model identifier to delete
- provider (query, optional): Model provider (default: "universal")
Response:
{
"model_name": "cross-encoder/ms-marco-MiniLM-L-6-v2",
"revisions_deleted": 1,
"size_freed": 90000000,
"path": "/Users/username/.cache/huggingface/hub/models--cross-encoder--ms-marco-MiniLM-L-6-v2"
}
Example:
curl -X DELETE "http://localhost:8000/v1/models/cross-encoder/ms-marco-MiniLM-L-6-v2"
Multi-Model Support
As of PR #263, LlamaFarm supports multiple models per project, with an OpenAI-compatible model field in chat requests.
Configuration
In your llamafarm.yaml:
runtime:
  models:
    - name: fast-model
      provider: ollama
      model: llama3.2:3b
    - name: smart-model
      provider: ollama
      model: llama3.2:70b
  default_model: fast-model
Using Different Models
The model field in chat completions requests is OpenAI-compatible and allows you to select which configured model to use:
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "smart-model",
"messages": [
{"role": "user", "content": "Explain quantum computing"}
]
}'
If model is not specified, the default_model from your configuration is used. This makes LlamaFarm a drop-in replacement for OpenAI's API with the added benefit of model selection.
Get Available Models
curl http://localhost:8000/v1/projects/my-org/chatbot/models
Rate Limiting and Performance
Concurrent Requests
The API handles concurrent requests efficiently:
- Chat sessions are thread-safe with internal locking
- Dataset processing can run asynchronously with Celery
- Multiple chat sessions can be active simultaneously
Session Management
- Sessions expire after 30 minutes of inactivity
- Use X-No-Session header for stateless requests
- Session cleanup happens automatically
Async Processing
For long-running operations (dataset processing):
- POST to the dataset actions endpoint ({"action_type":"process"}) to queue a Celery task
- Poll the task endpoint to check status
- Retrieve final results when state is SUCCESS
MCP (Model Context Protocol) Compatible Endpoints
LlamaFarm's API is compatible with the Model Context Protocol (MCP), allowing AI agents to interact with LlamaFarm programmatically. The following endpoints are tagged with mcp for easy discovery by MCP clients:
MCP-Compatible Operations
Project Management:
- GET /v1/projects/{namespace} - List projects (operation ID: projects_list)
- POST /v1/projects/{namespace} - Create project (operation ID: project_create)
- GET /v1/projects/{namespace}/{project} - Get project (operation ID: project_get)
- PUT /v1/projects/{namespace}/{project} - Update project (operation ID: project_update)
- DELETE /v1/projects/{namespace}/{project} - Delete project (operation ID: project_delete)
Model Management:
- GET /v1/projects/{namespace}/{project}/models - List models (operation ID: models_list)
Dataset Operations:
- GET /v1/projects/{namespace}/{project}/datasets - List datasets (operation ID: dataset_list)
- GET /v1/projects/{namespace}/{project}/datasets/strategies - List strategies (operation ID: dataset_strategies_list)
- POST /v1/projects/{namespace}/{project}/datasets - Create dataset (operation ID: dataset_create)
- DELETE /v1/projects/{namespace}/{project}/datasets/{dataset} - Delete dataset (operation ID: dataset_delete)
- POST /v1/projects/{namespace}/{project}/datasets/{dataset}/actions - Dataset actions (operation ID: dataset_actions)
- POST /v1/projects/{namespace}/{project}/datasets/{dataset}/data - Upload data (operation ID: dataset_data_upload)
RAG Operations:
- POST /v1/projects/{namespace}/{project}/rag/query - Query RAG (operation ID: rag_query)
- POST /v1/projects/{namespace}/{project}/rag/databases - Create database (operation ID: database_create)
- GET /v1/projects/{namespace}/{project}/rag/databases/{database} - Get database (operation ID: database_get)
- PATCH /v1/projects/{namespace}/{project}/rag/databases/{database} - Update database (operation ID: database_update)
- DELETE /v1/projects/{namespace}/{project}/rag/databases/{database} - Delete database (operation ID: database_delete)
Task Management:
- GET /v1/projects/{namespace}/{project}/tasks/{task_id} - Get task status (operation ID: task_get)
Using LlamaFarm with MCP Servers
You can configure LlamaFarm projects to expose tools through MCP servers, giving AI agents access to filesystems, databases, APIs, and custom business logic. See the MCP documentation for configuration examples.
Example: AI Agent with LlamaFarm MCP Tools
# AI agent can use LlamaFarm endpoints as tools
# through MCP protocol
{
"tool": "llamafarm_rag_query",
"arguments": {
"query": "What are the clinical trial requirements?",
"database": "fda_db",
"top_k": 5
}
}
Best Practices
Error Handling
Always check HTTP status codes and handle errors:
response=$(curl -s -w "\n%{http_code}" http://localhost:8000/v1/projects/my-org/chatbot)
http_code=$(echo "$response" | tail -n 1)
body=$(echo "$response" | sed '$d')  # portable alternative to GNU-only head -n -1

if [ "$http_code" -eq 200 ]; then
  echo "Success: $body"
else
  echo "Error ($http_code): $body"
fi
Using RAG Effectively
- Create datasets first: Upload and process files before querying
- Use appropriate top_k: Start with 5-10 results
- Set score thresholds: Filter low-quality results (e.g., 0.7)
- Test queries: Use the RAG query endpoint to test retrieval before chat
Chat Sessions
- Stateful conversations: Reuse session IDs for context
- Stateless queries: Use X-No-Session for one-off questions
- Clean up: Delete old sessions to free resources
- History access: Use history endpoint to debug conversations
- History access: Use history endpoint to debug conversations
Dataset Processing
- Upload first, process later: Separate ingestion from processing
- Use async for large datasets: Enable async processing for >10 files
- Monitor task status: Poll task endpoint for progress
- Handle duplicates: System automatically skips duplicate files
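Putting these together, a typical upload-and-process flow looks like this (a sketch combining the endpoints documented above; assumes jq is installed):
# 1. Upload a file (stored, not yet processed)
curl -X POST http://localhost:8000/v1/projects/my-org/chatbot/datasets/research_papers/data \
  -F "file=@paper1.pdf"

# 2. Queue processing and capture the task ID
task_id=$(curl -s -X POST http://localhost:8000/v1/projects/my-org/chatbot/datasets/research_papers/actions \
  -H "Content-Type: application/json" \
  -d '{"action_type":"process"}' | jq -r '.task_id')

# 3. Poll until the task completes
curl -s http://localhost:8000/v1/projects/my-org/chatbot/tasks/$task_id | jq '.state'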
API Clients
Python Example
import requests

class LlamaFarmClient:
    def __init__(self, base_url="http://localhost:8000"):
        self.base_url = base_url
        self.session = requests.Session()

    def chat(self, namespace, project, message, session_id=None):
        url = f"{self.base_url}/v1/projects/{namespace}/{project}/chat/completions"
        headers = {}
        if session_id:
            headers["X-Session-ID"] = session_id
        response = self.session.post(
            url,
            headers=headers,
            json={
                "messages": [{"role": "user", "content": message}]
            }
        )
        response.raise_for_status()
        return response.json()

    def query_rag(self, namespace, project, query, database=None, top_k=5):
        url = f"{self.base_url}/v1/projects/{namespace}/{project}/rag/query"
        response = self.session.post(
            url,
            json={
                "query": query,
                "database": database,
                "top_k": top_k
            }
        )
        response.raise_for_status()
        return response.json()

# Usage
client = LlamaFarmClient()
result = client.chat("my-org", "chatbot", "Hello!")
print(result["choices"][0]["message"]["content"])
JavaScript/TypeScript Example
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  messages: ChatMessage[];
  stream?: boolean;
  rag_enabled?: boolean;
  database?: string;
}

class LlamaFarmClient {
  constructor(private baseUrl: string = "http://localhost:8000") {}

  async chat(
    namespace: string,
    project: string,
    messages: ChatMessage[],
    sessionId?: string
  ): Promise<any> {
    const headers: Record<string, string> = {
      "Content-Type": "application/json",
    };
    if (sessionId) {
      headers["X-Session-ID"] = sessionId;
    }
    const response = await fetch(
      `${this.baseUrl}/v1/projects/${namespace}/${project}/chat/completions`,
      {
        method: "POST",
        headers,
        body: JSON.stringify({ messages }),
      }
    );
    if (!response.ok) {
      throw new Error(`API error: ${response.status}`);
    }
    return response.json();
  }

  async queryRAG(
    namespace: string,
    project: string,
    query: string,
    database?: string,
    topK: number = 5
  ): Promise<any> {
    const response = await fetch(
      `${this.baseUrl}/v1/projects/${namespace}/${project}/rag/query`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          query,
          database,
          top_k: topK,
        }),
      }
    );
    if (!response.ok) {
      throw new Error(`API error: ${response.status}`);
    }
    return response.json();
  }
}

// Usage
const client = new LlamaFarmClient();
const result = await client.chat("my-org", "chatbot", [
  { role: "user", content: "Hello!" },
]);
console.log(result.choices[0].message.content);
Troubleshooting
Common Issues
Problem: 404 Not Found when accessing project
- Solution: Verify namespace and project name are correct. List projects to confirm.
Problem: Chat returns empty or error
- Solution: Check that the model is configured correctly and Ollama is running.
Problem: RAG query returns no results
- Solution: Ensure dataset is processed and database exists. Check RAG health endpoint.
Problem: Dataset processing stuck
- Solution: Check Celery worker status. Use async processing and poll task endpoint.
Problem: Session not persisting
- Solution: Ensure you're passing the X-Session-ID header and not using X-No-Session.
Debugging Tips
- Check logs: Server logs are written to stdout
- Verify configuration: Use GET /v1/projects/{namespace}/{project} to inspect config
- Test health: Use the /health endpoint to verify services are running
- Inspect tasks: For async operations, poll the task endpoint for detailed error info
- Use curl verbose: Add the -v flag to curl for detailed request/response info
Vision API (OCR & Document Extraction)
The Vision API provides OCR and document extraction capabilities through the main LlamaFarm API server. These endpoints proxy to the Universal Runtime, handling file uploads and base64 image conversion automatically.
Base URL: http://localhost:8000/v1/vision
OCR Endpoint
Extract text from images and PDFs using multiple OCR backends.
Endpoint: POST /v1/vision/ocr
Content-Type: multipart/form-data
Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| file | file | No* | - | PDF or image file to process |
| images | string | No* | - | Base64-encoded images as JSON array |
| model | string | No | surya | OCR backend: surya, easyocr, paddleocr, tesseract |
| languages | string | No | en | Comma-separated language codes (e.g., en,fr) |
| return_boxes | boolean | No | false | Return bounding boxes for detected text |
*Either file or images must be provided.
Supported File Types: PDF, PNG, JPG, JPEG, GIF, WebP, BMP, TIFF
Response:
{
"object": "list",
"data": [
{
"index": 0,
"text": "Extracted text from the document...",
"confidence": 0.95
}
],
"model": "surya",
"usage": {"images_processed": 1}
}
Example (File Upload):
curl -X POST http://localhost:8000/v1/vision/ocr \
-F "file=@document.pdf" \
-F "model=easyocr" \
-F "languages=en"
Example (Base64 Images):
curl -X POST http://localhost:8000/v1/vision/ocr \
-F 'images=["data:image/png;base64,iVBORw0KGgo..."]' \
-F "model=surya" \
-F "languages=en"
Document Extraction Endpoint
Extract structured data from documents using vision-language models.
Endpoint: POST /v1/vision/documents/extract
Content-Type: multipart/form-data
Parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| file | file | No* | - | PDF or image file to process |
| images | string | No* | - | Base64-encoded images as JSON array |
| model | string | Yes | - | HuggingFace model ID (e.g., naver-clova-ix/donut-base-finetuned-docvqa) |
| prompts | string | No | - | Comma-separated prompts for VQA task |
| task | string | No | extraction | Task type: extraction, vqa, classification |
*Either file or images must be provided.
Supported Models:
| Model | Description |
|---|---|
| naver-clova-ix/donut-base-finetuned-cord-v2 | Receipt/invoice extraction |
| naver-clova-ix/donut-base-finetuned-docvqa | Document Q&A |
| microsoft/layoutlmv3-base-finetuned-docvqa | Document Q&A with layout |
Response:
{
"object": "list",
"data": [
{
"index": 0,
"confidence": 0.9,
"text": "<s_docvqa><s_question>What is the total?</s_question><s_answer>$15.99</s_answer>",
"fields": [
{"key": "question", "value": "What is the total?", "confidence": 0.9},
{"key": "answer", "value": "$15.99", "confidence": 0.9}
]
}
],
"model": "naver-clova-ix/donut-base-finetuned-docvqa",
"task": "vqa",
"usage": {"documents_processed": 1}
}
Example (Document VQA with File Upload):
curl -X POST http://localhost:8000/v1/vision/documents/extract \
-F "file=@receipt.pdf" \
-F "model=naver-clova-ix/donut-base-finetuned-docvqa" \
-F "prompts=What is the store name?,What is the total amount?" \
-F "task=vqa"
Example (Extraction with Base64):
curl -X POST http://localhost:8000/v1/vision/documents/extract \
-F 'images=["data:image/png;base64,iVBORw0KGgo..."]' \
-F "model=naver-clova-ix/donut-base-finetuned-cord-v2" \
-F "task=extraction"
ML API (Custom Classifiers & Anomaly Detection)
The ML API provides custom text classification and anomaly detection capabilities through the main LlamaFarm API server. These endpoints proxy to the Universal Runtime with automatic model versioning support.
Base URL: http://localhost:8000/v1/ml
Model Versioning
When overwrite: false (the default), models are saved with a timestamp suffix, e.g. my-model_20251215_155054. Use the -latest suffix (e.g., my-model-latest) to automatically resolve to the newest version.
Endpoints supporting -latest resolution:
- /v1/ml/classifier/predict, /v1/ml/classifier/load
- /v1/ml/anomaly/score, /v1/ml/anomaly/detect, /v1/ml/anomaly/load
Note: The fit endpoints do NOT support -latest since they create new models.
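For example, suppose intent-classifier has been fitted twice without overwrite, producing two timestamped versions (the timestamps below are illustrative). Predicting against the -latest alias resolves to the newer one:
# Versions on disk (illustrative): intent-classifier_20251215_155054,
# intent-classifier_20251216_091210 -- "-latest" picks the newest
curl -X POST http://localhost:8000/v1/ml/classifier/predict \
  -H "Content-Type: application/json" \
  -d '{"model": "intent-classifier-latest", "texts": ["Book me a flight"]}'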
Custom Text Classification (SetFit)
Train custom classifiers with as few as 8-16 examples per class using SetFit (Sentence Transformer Fine-tuning).
Fit Classifier
Train a new text classifier.
Endpoint: POST /v1/ml/classifier/fit
Request Body:
{
"model": "intent-classifier",
"base_model": "sentence-transformers/all-MiniLM-L6-v2",
"training_data": [
{"text": "I need to book a flight to NYC", "label": "booking"},
{"text": "Reserve a hotel room for me", "label": "booking"},
{"text": "Cancel my reservation please", "label": "cancellation"},
{"text": "I want to cancel my booking", "label": "cancellation"},
{"text": "What time does the flight leave?", "label": "inquiry"},
{"text": "How much does it cost?", "label": "inquiry"}
],
"num_iterations": 20,
"batch_size": 16,
"overwrite": false
}
Request Fields:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | Base name for the classifier |
| base_model | string | No | sentence-transformers/all-MiniLM-L6-v2 | Sentence transformer to fine-tune |
| training_data | array | Yes | - | List of {"text", "label"} objects |
| num_iterations | int | No | 20 | Contrastive learning iterations |
| batch_size | int | No | 16 | Training batch size |
| overwrite | bool | No | false | If false, version with timestamp |
To add a description to a model, use the /v1/ml/classifier/save endpoint after fitting. The description parameter in fit requests is not persisted.
Response (from a training run that also included a complaint class, as in the curl example below):
{
"object": "fit_result",
"model": "intent-classifier-test_20260102_202450",
"base_model": "sentence-transformers/all-MiniLM-L6-v2",
"samples_fitted": 40,
"num_classes": 4,
"labels": ["booking", "cancellation", "complaint", "inquiry"],
"training_time_ms": 6751.66,
"status": "fitted",
"auto_saved": true,
"saved_path": "~/.llamafarm/models/classifier/intent-classifier-test_20260102_202450",
"base_name": "intent-classifier-test",
"versioned_name": "intent-classifier-test_20260102_202450",
"overwrite": false
}
Example:
curl -X POST http://localhost:8000/v1/ml/classifier/fit \
-H "Content-Type: application/json" \
-d '{
"model": "intent-classifier",
"training_data": [
{"text": "I need to book a flight to NYC", "label": "booking"},
{"text": "Reserve a hotel room for me", "label": "booking"},
{"text": "Cancel my reservation please", "label": "cancellation"},
{"text": "I want to cancel my booking", "label": "cancellation"},
{"text": "What time does the flight leave?", "label": "inquiry"},
{"text": "How much does it cost?", "label": "inquiry"},
{"text": "I am very unhappy with the service", "label": "complaint"},
{"text": "This is unacceptable quality", "label": "complaint"}
],
"num_iterations": 20
}'
Predict with Classifier
Classify texts using a trained model.
Endpoint: POST /v1/ml/classifier/predict
Request Body:
{
"model": "intent-classifier-latest",
"texts": [
"Book me a flight to Paris tomorrow",
"Cancel my upcoming trip",
"What are the check-in times?",
"This is absolutely terrible service"
]
}
Response:
{
"object": "list",
"data": [
{
"text": "Book me a flight to Paris tomorrow",
"label": "booking",
"score": 0.66,
"all_scores": {"booking": 0.66, "cancellation": 0.11, "complaint": 0.11, "inquiry": 0.13}
},
{
"text": "Cancel my upcoming trip",
"label": "cancellation",
"score": 0.79,
"all_scores": {"booking": 0.09, "cancellation": 0.79, "complaint": 0.07, "inquiry": 0.05}
},
{
"text": "What are the check-in times?",
"label": "inquiry",
"score": 0.68,
"all_scores": {"booking": 0.15, "cancellation": 0.06, "complaint": 0.11, "inquiry": 0.68}
},
{
"text": "This is absolutely terrible service",
"label": "complaint",
"score": 0.77,
"all_scores": {"booking": 0.06, "cancellation": 0.08, "complaint": 0.77, "inquiry": 0.10}
}
],
"total_count": 4,
"model": "intent-classifier-test_20260102_202450"
}
Example:
curl -X POST http://localhost:8000/v1/ml/classifier/predict \
-H "Content-Type: application/json" \
-d '{
"model": "intent-classifier-latest",
"texts": [
"Book me a flight to Paris tomorrow",
"Cancel my upcoming trip",
"What are the check-in times?",
"This is absolutely terrible service"
]
}'
Save Classifier
Save a trained classifier to disk for production use.
Endpoint: POST /v1/ml/classifier/save
Request Body:
{
"model": "intent-classifier_20251215_155054"
}
Response:
{
"object": "save_result",
"model": "intent-classifier_20251215_155054",
"path": "~/.llamafarm/models/classifier/intent-classifier_20251215_155054",
"status": "saved"
}
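Example (sending the request body above):
curl -X POST http://localhost:8000/v1/ml/classifier/save \
  -H "Content-Type: application/json" \
  -d '{"model": "intent-classifier_20251215_155054"}'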
Load Classifier
Load a previously saved classifier.
Endpoint: POST /v1/ml/classifier/load
Request Body:
{
"model": "intent-classifier-latest"
}
Response:
{
"object": "load_result",
"model": "intent-classifier_20251215_155054",
"status": "loaded"
}
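Example (sending the request body above):
curl -X POST http://localhost:8000/v1/ml/classifier/load \
  -H "Content-Type: application/json" \
  -d '{"model": "intent-classifier-latest"}'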
List Classifier Models
Endpoint: GET /v1/ml/classifier/models
Response:
{
"object": "list",
"data": [
{"name": "intent-classifier_20251215_155054", "labels": ["booking", "cancellation", "inquiry"]}
]
}
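Example:
curl http://localhost:8000/v1/ml/classifier/models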
Delete Classifier Model
Endpoint: DELETE /v1/ml/classifier/models/{model_name}
The model_name is the directory name (e.g., intent-classifier_20251215_155054). SetFit classifiers are stored as directories under ~/.llamafarm/models/classifier/.
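Example (using the versioned directory name above):
curl -X DELETE http://localhost:8000/v1/ml/classifier/models/intent-classifier_20251215_155054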
Anomaly Detection
Train anomaly detectors on normal data and detect outliers in new data.
Fit Anomaly Detector
Train an anomaly detection model.
Endpoint: POST /v1/ml/anomaly/fit
Request Body (Numeric Data):
{
"model": "sensor-detector",
"backend": "isolation_forest",
"data": [
[22.1, 1024], [23.5, 1100], [21.8, 980],
[24.2, 1050], [22.7, 1080], [23.1, 990]
],
"contamination": 0.1,
"overwrite": false
}
Request Body (Mixed Data with Schema):
{
"model": "api-monitor",
"backend": "isolation_forest",
"data": [
{"response_time_ms": 100, "bytes": 1024, "method": "GET", "user_agent": "Mozilla/5.0"},
{"response_time_ms": 105, "bytes": 1100, "method": "POST", "user_agent": "Chrome/90.0"}
],
"schema": {
"response_time_ms": "numeric",
"bytes": "numeric",
"method": "label",
"user_agent": "hash"
},
"contamination": 0.1
}
Request Fields:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | No | "default" | Model identifier |
| backend | string | No | "isolation_forest" | Algorithm: isolation_forest, one_class_svm, local_outlier_factor, autoencoder |
| data | array | Yes | - | Training data (numeric arrays or dicts) |
| schema | object | No | - | Feature encoding schema (required for dict data) |
| contamination | float | No | 0.1 | Expected proportion of anomalies (must be >0 and ≤0.5) |
| overwrite | bool | No | false | If false, version with timestamp |
Response (from the temperature-sensor example below):
{
"object": "fit_result",
"model": "sensor_anomaly_detector_20260102_202438",
"backend": "isolation_forest",
"samples_fitted": 30,
"training_time_ms": 85.77,
"model_params": {
"backend": "isolation_forest",
"contamination": 0.05,
"threshold": 0.894,
"input_dim": 1
},
"status": "fitted",
"base_name": "sensor_anomaly_detector",
"versioned_name": "sensor_anomaly_detector_20260102_202438",
"overwrite": false
}
Example (Temperature Sensor Data):
# Train on normal temperature readings (20-25°C range)
curl -X POST http://localhost:8000/v1/ml/anomaly/fit \
-H "Content-Type: application/json" \
-d '{
"model": "sensor_anomaly_detector",
"backend": "isolation_forest",
"data": [
[22.1], [23.5], [21.8], [24.2], [22.7],
[23.1], [21.5], [24.8], [22.3], [23.9],
[21.2], [24.5], [22.8], [23.2], [21.9],
[24.1], [22.5], [23.7], [21.6], [24.3]
],
"contamination": 0.05
}'
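Example (Mixed Data with Schema): the dict-based request body shown earlier is sent the same way:
curl -X POST http://localhost:8000/v1/ml/anomaly/fit \
  -H "Content-Type: application/json" \
  -d '{
    "model": "api-monitor",
    "backend": "isolation_forest",
    "data": [
      {"response_time_ms": 100, "bytes": 1024, "method": "GET", "user_agent": "Mozilla/5.0"},
      {"response_time_ms": 105, "bytes": 1100, "method": "POST", "user_agent": "Chrome/90.0"}
    ],
    "schema": {
      "response_time_ms": "numeric",
      "bytes": "numeric",
      "method": "label",
      "user_agent": "hash"
    },
    "contamination": 0.1
  }'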
Score Anomalies
Score all data points for anomalies.
Endpoint: POST /v1/ml/anomaly/score
Request Body:
{
"model": "sensor-detector-latest",
"backend": "isolation_forest",
"data": [[22.0], [23.5], [0.0], [100.0], [21.5]],
"threshold": 0.5
}
Response:
{
"object": "list",
"data": [
{"index": 0, "score": 0.23, "is_anomaly": false, "raw_score": 0.12},
{"index": 1, "score": 0.21, "is_anomaly": false, "raw_score": 0.10},
{"index": 2, "score": 0.89, "is_anomaly": true, "raw_score": -0.45},
{"index": 3, "score": 0.95, "is_anomaly": true, "raw_score": -0.52},
{"index": 4, "score": 0.22, "is_anomaly": false, "raw_score": 0.11}
],
"summary": {
"total_points": 5,
"anomaly_count": 2,
"anomaly_rate": 0.4,
"threshold": 0.5
}
}
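Example (sending the request body above):
curl -X POST http://localhost:8000/v1/ml/anomaly/score \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sensor-detector-latest",
    "backend": "isolation_forest",
    "data": [[22.0], [23.5], [0.0], [100.0], [21.5]],
    "threshold": 0.5
  }'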
Detect Anomalies
Detect anomalies (returns only anomalous points).
Endpoint: POST /v1/ml/anomaly/detect
Same request format as /score, but the response includes only the anomalous points.
Example (Detecting Temperature Anomalies):
# Test with mix of normal and anomalous readings
curl -X POST http://localhost:8000/v1/ml/anomaly/detect \
-H "Content-Type: application/json" \
-d '{
"model": "sensor_anomaly_detector-latest",
"backend": "isolation_forest",
"data": [[22.0], [23.5], [0.0], [21.5], [100.0], [24.0], [-10.0], [22.8], [35.0], [23.2]],
"threshold": 0.5
}'
Response:
{
"object": "list",
"data": [
{"index": 2, "score": 0.61, "raw_score": 0.65},
{"index": 4, "score": 0.60, "raw_score": 0.64},
{"index": 6, "score": 0.61, "raw_score": 0.65},
{"index": 8, "score": 0.60, "raw_score": 0.64}
],
"total_count": 4,
"model": "sensor_anomaly_detector_20260102_202438",
"backend": "isolation_forest",
"summary": {
"anomalies_detected": 4,
"threshold": 0.5
}
}
The anomalies detected are:
- Index 2: 0.0°C (freezing - way below normal)
- Index 4: 100.0°C (boiling - way above normal)
- Index 6: -10.0°C (sub-freezing)
- Index 8: 35.0°C (elevated temperature)
Save Anomaly Model
Endpoint: POST /v1/ml/anomaly/save
Request Body:
{
"model": "sensor-detector_20251215_160000",
"backend": "isolation_forest"
}
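Example (sending the request body above):
curl -X POST http://localhost:8000/v1/ml/anomaly/save \
  -H "Content-Type: application/json" \
  -d '{"model": "sensor-detector_20251215_160000", "backend": "isolation_forest"}'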
Load Anomaly Model
Endpoint: POST /v1/ml/anomaly/load
Request Body:
{
"model": "sensor-detector-latest",
"backend": "isolation_forest"
}
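Example (sending the request body above):
curl -X POST http://localhost:8000/v1/ml/anomaly/load \
  -H "Content-Type: application/json" \
  -d '{"model": "sensor-detector-latest", "backend": "isolation_forest"}'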
List Anomaly Models
Endpoint: GET /v1/ml/anomaly/models
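Example:
curl http://localhost:8000/v1/ml/anomaly/models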
Delete Anomaly Model
Endpoint: DELETE /v1/ml/anomaly/models/{filename}
The filename includes the file extension (e.g., sensor-detector_isolation_forest.pkl). Anomaly models are stored as .pkl files under ~/.llamafarm/models/anomaly/.
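Example (using the .pkl filename above):
curl -X DELETE http://localhost:8000/v1/ml/anomaly/models/sensor-detector_isolation_forest.pkl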
Universal Runtime API
The Universal Runtime is a separate service (port 11540) that provides specialized ML endpoints for document processing, text analysis, embeddings, and anomaly detection.
Base URL: http://localhost:11540
For OCR and document extraction, you can use either:
- LlamaFarm API (/v1/vision/*) - Accepts file uploads directly and converts PDFs to images automatically
- Universal Runtime (/v1/ocr, /v1/documents/extract) - Accepts base64 images or file IDs
Starting the Universal Runtime
nx start universal-runtime
Universal Runtime Endpoints
| Category | Endpoint | Description |
|---|---|---|
| Health | GET /health | Runtime health and loaded models |
| Models | GET /v1/models | List currently loaded models |
| Chat | POST /v1/chat/completions | OpenAI-compatible chat completions |
| Embeddings | POST /v1/embeddings | Generate text embeddings |
| Files | POST /v1/files | Upload files for processing |
| Files | GET /v1/files | List uploaded files |
| Files | GET /v1/files/{id} | Get file metadata |
| Files | GET /v1/files/{id}/images | Get file as base64 images |
| Files | DELETE /v1/files/{id} | Delete uploaded file |
| OCR | POST /v1/ocr | Extract text from images (base64) |
| Documents | POST /v1/documents/extract | Extract structured data (base64) |
| Classification | POST /v1/classify | Classify text using pre-trained models (sentiment, etc.) |
| Custom Classifier | POST /v1/classifier/fit | Train custom classifier (SetFit few-shot) |
| Custom Classifier | POST /v1/classifier/predict | Classify with trained custom model |
| Custom Classifier | POST /v1/classifier/save | Save trained classifier |
| Custom Classifier | POST /v1/classifier/load | Load saved classifier |
| Custom Classifier | GET /v1/classifier/models | List saved classifiers |
| Custom Classifier | DELETE /v1/classifier/models/{name} | Delete saved classifier |
| NER | POST /v1/ner | Named entity recognition |
| Reranking | POST /v1/rerank | Rerank documents by relevance |
| Anomaly | POST /v1/anomaly/fit | Train anomaly detector |
| Anomaly | POST /v1/anomaly/score | Score data for anomalies |
| Anomaly | POST /v1/anomaly/detect | Detect anomalies (filtered) |
| Anomaly | POST /v1/anomaly/save | Save trained model |
| Anomaly | POST /v1/anomaly/load | Load saved model |
| Anomaly | GET /v1/anomaly/models | List saved models |
| Anomaly | DELETE /v1/anomaly/models/{filename} | Delete saved model |
- /v1/classify (Universal Runtime only) - Use pre-trained HuggingFace models for sentiment, spam detection, etc. This endpoint is NOT proxied through the main LlamaFarm server.
- /v1/ml/classifier/* (LlamaFarm server) - Train and use custom classifiers with your own categories via SetFit. Available at http://localhost:8000.
Quick Examples
OCR:
curl -X POST http://localhost:11540/v1/ocr \
-H "Content-Type: application/json" \
-d '{"model": "surya", "images": ["base64..."], "languages": ["en"]}'
Embeddings:
curl -X POST http://localhost:11540/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"model": "sentence-transformers/all-MiniLM-L6-v2", "input": "Hello world"}'
Custom Classification (SetFit):
# Train a custom classifier with few examples
curl -X POST http://localhost:11540/v1/classifier/fit \
-H "Content-Type: application/json" \
-d '{
"model": "intent-classifier",
"training_data": [
{"text": "Book a flight", "label": "booking"},
{"text": "Cancel my order", "label": "cancellation"}
]
}'
# Classify new texts
curl -X POST http://localhost:11540/v1/classifier/predict \
-H "Content-Type: application/json" \
-d '{"model": "intent-classifier", "texts": ["I need to reserve a room"]}'
Anomaly Detection:
# Train
curl -X POST http://localhost:11540/v1/anomaly/fit \
-H "Content-Type: application/json" \
-d '{"model": "my-detector", "backend": "isolation_forest", "data": [[1,2],[3,4]]}'
# Detect
curl -X POST http://localhost:11540/v1/anomaly/detect \
-H "Content-Type: application/json" \
-d '{"model": "my-detector", "data": [[1,2],[100,200]]}'
For complete documentation, see:
- Specialized ML Models - OCR, documents, classification, NER, reranking
- Anomaly Detection Guide - Complete anomaly detection documentation
- Models & Runtime - Runtime configuration
Next Steps
- Learn about Configuration
- Explore RAG concepts
- Review Examples
- Check Troubleshooting