Changelog
Stay up to date with the latest features, improvements, and fixes in LlamaFarm.
Latest Release
v0.0.30 — 2026-04-09
Release Highlights: 0.0.30
This release focuses on improving the user experience by adding new command-line tools and streamlining the release process, making it easier than ever to work with LlamaFarm.
New Features
Native CLI Commands for LlamaFarm Models
You can now interact with LlamaFarm models directly from the command line, without starting a server: launch, stop, and manage models with simple commands. This is faster and more convenient, especially for users who prefer the CLI, and gives you greater control and flexibility.
Improvements
Streamlined Release Process
The release process has been made more efficient, ensuring that each version of LlamaFarm is ready to use with minimal setup. This means users can get the latest features and improvements faster, with fewer steps to follow.
We’re excited to share this update and look forward to your feedback as you explore the new capabilities. Let us know how you're using LlamaFarm, and we'll be here to help!
v0.0.29 — 2026-04-08
Release Highlights: 0.0.29
This release focuses on improving performance and usability for edge deployments, while also enhancing reliability and clarity in model management.
New Features
Edge Optimization
We've introduced model preloading and cache pinning for edge environments, which helps reduce latency and improves responsiveness when running AI models on devices with limited resources like Raspberry Pi or Jetson boards. This means your models will load faster and stay ready for use, even in low-power settings.
Offline Mode Support
Now you can deploy LlamaFarm in offline mode by specifying the model path directly, either through the lf models path command or the environment variable LLAMAFARM_MODEL_DIR. This is perfect for environments without internet access or where you want to keep models localized for privacy and reliability.
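As a minimal sketch, the environment-variable route can be set up as follows. Only the `LLAMAFARM_MODEL_DIR` name comes from this release; the directory path is illustrative, so point it at wherever your models actually live.

```python
import os

# Sketch of offline deployment via the LLAMAFARM_MODEL_DIR environment
# variable named in this release. The path below is illustrative.
os.environ["LLAMAFARM_MODEL_DIR"] = "/opt/llamafarm/models"

# Any LlamaFarm process launched from this environment inherits the
# variable and loads models from the local path instead of downloading.
print(os.environ["LLAMAFARM_MODEL_DIR"])
```

In practice you would export the variable in your shell or service unit before launching LlamaFarm, so every child process picks it up.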
Standalone Edge Runtime
For users deploying on devices like Raspberry Pi or Jetson, we’ve added a standalone edge runtime. This allows you to run LlamaFarm without relying on a full server setup, making it easier to deploy AI models on resource-constrained hardware.
Improvements
Enhanced Logging
Log file handlers now force UTF-8 encoding to prevent encoding issues, and the edge runtime has been added to our CI pipeline. Together these keep logs consistent and reliable, especially in environments where character encoding can be tricky.
CI Reliability
An issue in our CI workflow was producing duplicate SHA256 hashes, which led to failed releases. This has been fixed, so pyapp release uploads now complete smoothly and without errors.
Bug Fixes
Stable Releases
The release process has been refined to ensure that version 0.0.29 is properly and reliably published, with all necessary components correctly built and uploaded.
With 0.0.29, we're making it easier than ever to run LlamaFarm on edge devices and in offline environments. Whether you're working with low-powered hardware or need to keep your models private, this release has something for you. Let us know how you're using LlamaFarm — we're excited to hear your stories! 🚀
v0.0.28 — 2026-03-05
LlamaFarm 0.0.28: Building Blocks for Smarter AI Workflows
This release focuses on expanding the capabilities of LlamaFarm, making it easier to build, deploy, and manage AI models with greater flexibility and reliability. We’ve introduced new tools, improved performance, and fixed issues that were impacting the user experience.
New Features & Enhancements
Deploy Models with Ease
We've added a new deploy command to the CLI and a bundled packaging system, making it simple to package and deploy models as self-contained units. This is especially useful for sharing models with teams or deploying in production environments.
Visual Designer for Deployment
A new Bundle UI has been added to the Designer, allowing users to create and manage deployment workflows visually. This makes it easier to build complex model pipelines without needing to write code.
ML Addons for Specialized Tasks
We've introduced several new ML addons, including support for time series analysis, drift detection, and CatBoost. These can be easily enabled or disabled, giving users the flexibility to tailor their AI workflows to specific needs.
Enhanced Log Support for GGUF Models
Users can now access log probabilities for GGUF chat completions, which is helpful for understanding model behavior and improving the quality of generated responses.
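Assuming the chat completions endpoint follows the OpenAI convention for log probabilities (the field names and model name below are assumptions, not confirmed LlamaFarm identifiers), a request body might be built like this:

```python
import json

# Hypothetical request body for an OpenAI-style chat completions call with
# log probabilities enabled. Adjust names to your actual deployment.
payload = {
    "model": "my-gguf-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello!"}],
    "logprobs": True,     # return per-token log probabilities
    "top_logprobs": 5,    # also return the 5 most likely alternatives per token
}

print(json.dumps(payload, indent=2))
```

The per-token log probabilities in the response are what you would inspect to gauge model confidence and compare candidate completions.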
Improved Server-Side KV Cache
We’ve implemented a more efficient server-side key-value cache that supports multi-turn chaining and pre-warming, which helps maintain performance during long conversations with large models.
Built-in Tools System
A new tools system has been added to the server, including a tasks tool. This allows users to run custom functions directly within the AI platform, opening up new possibilities for automation and integration.
Structured Output Support
Models can now return structured outputs, which is especially useful for applications that require precise data formatting, such as data pipelines or API integrations.
Vision API Improvements
The vision API has seen significant improvements, including better evaluation pipelines and object tracking. This makes it easier to build and test computer vision models within LlamaFarm.
Vision UI for Designer
A new Vision UI has been added to the Designer, allowing users to build and manage vision workflows visually, including detection, classification, and training.
Vision MVP with Basic Functionality
We've launched a vision MVP that includes core capabilities like detection, classification, training, and feedback loops. This provides a solid foundation for building more advanced vision models.
Bug Fixes & Stability Improvements
CI Process Optimization
We fixed an issue where the prose changelog was causing an ARG_MAX overflow in the CI pipeline, making the build process more reliable.
Addon Registry Integration
The addon registry is now embedded into the binary for released builds, ensuring that models and tools are available without needing to download external packages.
Content Budget Calculations
We've improved the math behind content budget calculations, ensuring that model usage is tracked accurately and efficiently.
Remote Access for Designer
Users can now access the Designer remotely, which is especially useful for collaborative workflows or when working with headless environments.
Improved Addon Bundling
We’ve fixed an issue where base-install dependencies were being included in addon wheel bundles, ensuring that addons are self-contained and easier to manage.
Better Error Handling for Timeseries
If a timeseries backend is unavailable, the system now returns a 422 error instead of a 500, which helps users understand and resolve issues more quickly.
Other Updates
We've also completed the release process for version 0.0.28, ensuring that everything is ready for users to try out and provide feedback.
LlamaFarm 0.0.28 is a major step forward in making AI development more intuitive, efficient, and powerful. Whether you're building models, managing workflows, or exploring new capabilities like vision, there's something here to help you get more done. Let us know what you think!
v0.0.27 — 2026-02-16
LlamaFarm 0.0.27: Addons, Smarter RAG, and Runtime Resilience
This release introduces the addons system, smarter RAG defaults, and significant runtime stability improvements.
New Features
Addons System
LlamaFarm now supports addons — modular extensions you can install and enable to expand your platform's capabilities. The Designer includes a polished UX for browsing, installing, and managing addons, with sequential installation and auto-enable on install for a smooth experience.
Per-Model RAG Defaults
You can now configure default RAG retrieval strategies on a per-model basis. This means different models can automatically use the retrieval settings that work best for them — no manual configuration needed each time.
RAG Source Chunks in Test Outputs
The Designer now shows RAG source chunks directly in test outputs, so you can see exactly which documents your model is referencing. Great for debugging retrieval quality and understanding model responses.
Cascading Data Processing Strategies
The server now supports cascading default data processing strategies, making it easier to set up sensible defaults that flow through your entire pipeline.
Anomaly Detection Documentation
Comprehensive docs, use-cases, and a full demo for the anomaly detection feature introduced in v0.0.24 — making it much easier to get started with outlier detection.
Infrastructure
- Binary component builds for faster CI and distribution
- Server port change — default port moved from 8000 to 14345 to avoid conflicts
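If you have client code still pointing at the old default port, updating it is a one-line change; a minimal sketch (the URLs are illustrative):

```python
OLD_BASE_URL = "http://localhost:8000"   # default before this release
NEW_BASE_URL = "http://localhost:14345"  # default from this release onward

def migrate_base_url(url: str) -> str:
    """Rewrite a client base URL that still targets the old default port."""
    return url.replace(":8000", ":14345")

print(migrate_base_url("http://localhost:8000/v1"))
```

Anything hard-coded in scripts, reverse-proxy configs, or firewall rules that referenced port 8000 needs the same update.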
Bug Fixes
- Smart GPU allocation — prevents multi-model OOM crashes by intelligently managing GPU memory across loaded models
- Event loop protection — model loading in the Universal Runtime no longer blocks the event loop, improving responsiveness during heavy loads
- API system prompts — fixed a bug where API-provided system prompts were being overridden by config-level system prompts
- Designer improvements — better delete UX, ghost project handling, fixed 404 on train button, improved onboarding checklist updates after demo project conversion
- Audio error handling — improved error handling in the Designer for audio features
v0.0.26 — 2026-01-27
LlamaFarm 0.0.26: Smarter, Faster, and More Accessible
This release brings a range of improvements to make LlamaFarm more intuitive, efficient, and accessible across different platforms and use cases.
New Features and Enhancements
Reusability and Configuration Improvements
We've introduced reusable components in the configuration system, allowing you to define and reuse common settings across different parts of your application. This makes managing complex configurations much simpler and reduces duplication.
Enhanced RAG Capabilities
Universal RAG - We've added zero-config default strategies that work out of the box for most use cases. No more complex setup required to get started with retrieval-augmented generation.
Document Preview - You can now preview documents with strategy selection directly in the Designer, making it easier to understand how your RAG pipeline processes different file types.
Dataset Management
New sample datasets for gardening and home repair scenarios help you get started quickly with realistic data. Plus, datasets now auto-process on upload, eliminating manual processing steps.
Developer Experience
Dynamic Value Substitution - Prompts and tools now support dynamic variable substitution, making your configurations more flexible and powerful.
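To illustrate the concept of dynamic variable substitution in a prompt template: the `{placeholder}` syntax below is Python's `str.format`, not necessarily LlamaFarm's template syntax, so consult the docs for the real notation.

```python
# Concept illustration only: substitute runtime values into a prompt
# template. The {role}/{topic} syntax here is Python's str.format.
template = "You are a {role}. Answer questions about {topic}."

def render(template: str, **values: str) -> str:
    """Fill a prompt template with dynamic values."""
    return template.format(**values)

print(render(template, role="gardening assistant", topic="composting"))
```

The same prompt definition can then serve many configurations, with the concrete values supplied at call time.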
Service Status Panel - A new status panel in the Designer header gives you real-time visibility into your LlamaFarm services, so you know exactly what's running.
Audio and Speech
This release introduces a full-duplex speech reasoning pipeline with audio processing capabilities in the Universal Runtime. Build voice-enabled AI applications with ease.
Cross-Platform Support
- Desktop App Improvements - Better splash screen UX and enhanced cross-platform support
- Intel Mac Support - Added support for Intel Macs (x86_64) with PyTorch 2.2.2
- Jetson/Tegra Optimization - Improved CUDA optimization and unified memory GPU support
Bug Fixes
- Fixed dev builds stopping running services
- Resolved sample project creation failures
- Fixed chat input clearing during streaming
- Improved error display and Service Status panel reliability
Recent Releases
v0.0.25 — 2026-01-14
LlamaFarm 0.0.25: Native Tool Calling and Developer Productivity
This release focuses on improving the developer experience with better tooling, native tool calling support, and automatic file processing capabilities.
New Features
Native Tool Calling
The Universal Runtime now supports native tool calling, enabling your AI models to interact with external tools and APIs more efficiently. This is a major step forward for building agentic AI applications that can take actions in the real world.
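A tool definition in the widely used OpenAI `tools` convention, which many runtimes mirror, looks like the sketch below. The function name and parameters are invented for illustration; verify the accepted shape against your LlamaFarm version.

```python
import json

# Hypothetical OpenAI-style tool definition for a chat completions request.
# The get_weather function is a placeholder, not a LlamaFarm built-in.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

payload = {
    "model": "my-model",  # placeholder
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": tools,
}
print(json.dumps(payload)[:40])
```

When the model decides a tool is needed, the response carries a structured tool call (name plus arguments) that your application executes before returning the result to the model.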
Automatic File Processing
Files uploaded to datasets now process automatically, eliminating the manual processing step and streamlining your workflow. Just upload and go.
Enhanced Designer Development Tools
The Designer now includes comprehensive API call logging in the dev tools panel, making it easier to debug and understand how your application communicates with the backend. See every request and response in real-time.
Streaming Model Downloads
Embedding model downloads now use SSE streaming, providing real-time progress updates so you always know exactly what's happening during long downloads.
Extended Testing Capabilities
The test space now includes support for anomaly detection and classifier tests, giving you more ways to validate your AI models before deployment.
Bug Fixes
- Fixed config validation error output for clearer debugging
- Resolved install and run failures on Windows with NVIDIA GPUs
- Removed parser fallback to prevent unexpected behavior
- Enabled offline GGUF model loading for air-gapped environments
v0.0.24 — 2026-01-06
LlamaFarm 0.0.24: Anomaly Detection
This release introduces anomaly detection capabilities to help identify outliers and unusual patterns in your data.
New Features
Anomaly Detection
The Universal Runtime now supports anomaly detection with configurable normalization methods for scoring. Whether you're monitoring for fraud, equipment failures, or data quality issues, LlamaFarm can now help identify when something doesn't look right.
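The release notes say score normalization is configurable; as a generic sketch of two standard normalization techniques (the function names below are textbook methods, not LlamaFarm identifiers):

```python
# Generic sketch of two common score-normalization methods used in
# anomaly detection pipelines.

def minmax_normalize(scores: list[float]) -> list[float]:
    """Rescale raw anomaly scores into [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def zscore_normalize(scores: list[float]) -> list[float]:
    """Center scores on the mean, in units of standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    std = (sum((s - mean) ** 2 for s in scores) / n) ** 0.5 or 1.0
    return [(s - mean) / std for s in scores]

raw = [0.1, 0.2, 0.15, 0.9]  # one clear outlier
print(minmax_normalize(raw))
```

Min-max normalization gives scores a fixed range, which is convenient for thresholding; z-scores express how far a point sits from typical behavior, which is more robust when the score range drifts over time.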
Designer UX for Anomaly Detection
The Designer includes a new interface for configuring and testing anomaly detection models, making it easy to set up detection pipelines and visualize results.
Bug Fixes
- Fixed anomaly and classifier UX issues in the Designer for smoother workflows
v0.0.23 — 2025-12-20
LlamaFarm 0.0.23: Stability Improvements
A focused stability release addressing a critical logging issue in the Universal Runtime.
Bug Fixes
- Fixed broken pipe errors caused by problematic logging in the Universal Runtime, improving reliability for long-running inference tasks
v0.0.22 — 2025-12-19
LlamaFarm 0.0.22: Inference Fix
A quick bug fix release addressing an issue with logits processor handling.
Bug Fixes
- Fixed logits_processor to be passed as callable instead of list, resolving inference issues with certain model configurations
v0.0.21 — 2025-12-19
LlamaFarm 0.0.21: Specialized ML Models and Vision API
A feature-packed holiday release bringing specialized ML models, vision capabilities, and major Designer enhancements.
New Features
Specialized ML Models
Added support for OCR, document extraction, and anomaly detection models in the Universal Runtime. These specialized models expand what you can build with LlamaFarm beyond text generation - now you can extract text from images, parse documents, and detect anomalies.
Vision API
New vision router and model versioning for ML endpoints, enabling image understanding capabilities in your applications. Build apps that can see and understand visual content.
Designer Improvements
- Santa's Holiday Helper Demo - A festive demo project to help new users get started
- Enhanced RAG UX - Improved retrieval strategy settings in test chat
- Data Enhancements - Better tools for managing your datasets
- Global Project Listing - Easily see all your projects in one place
Cross-Platform Support
Native llama-cpp bindings now included for all platforms, and Windows builds correctly include the .exe extension for seamless installation.
Bug Fixes
- Fixed upgrade failures on Linux
- Ensured multi-arch Linux builds work correctly
- Fixed model unload cleanup and OpenAI message validation
- Removed console log spam in Designer
Older Releases
v0.0.20 — 2025-12-10
Auto-Start Services, RAG Stats, and Reliability Improvements
New Features
- Auto-Start Service Flag - Services can now start automatically when you run LlamaFarm
- More GGUF Download Options - More quantization options for model downloads in Designer
- RAG Database Listing - List all documents in your RAG databases
- RAG Statistics - View detailed stats about your RAG setup
- Chunk Cleanup - Automatically remove database chunks when files are deleted
- Data Processing Control - Start and stop data processing from the API
Bug Fixes
- Fixed first-run startup failures for new users
- Improved path resolution with ~ expansion
- Better process manager locking to prevent conflicts
- Fixed upgrade hang caused by process stop deadlock
- Prevented storage of failed vectors in RAG
v0.0.19 — 2025-12-03
Automatic Model Downloads, Custom RAG Queries, and Reasoning Models
New Features
- Automatic Model Download Management - Models download automatically when needed
- Custom RAG Queries - Send custom RAG queries through the chat/completions endpoint
- Thinking/Reasoning Model Support - Support for models that show their reasoning process
- Database CRUD API - Full create, read, update, delete operations for databases
- Better Day-2 UX - Improved experience for returning users
- Disk Space Checking - Check available disk space before downloading models
- GGUF Model Listing - Browse available GGUF models for download
Bug Fixes
- Fixed datasets endpoint trailing slash requirement
- Improved cross-filesystem data moves
- Fixed PDF parsing issues in RAG
- Addressed demo timeout issues
v0.0.18 — 2025-11-25
v0.0.17 — 2025-11-24
Bug Fixes and Documentation
Bug Fixes
- Fixed empty prompts array for new projects
- Added troubleshooting documentation
- Fixed HuggingFace progress bar crashes
v0.0.16 — 2025-11-23
v0.0.15 — 2025-11-22
Desktop App Launch and GGUF Model Support
New Features
- Desktop App - Full Electron desktop app with auto-updates and polished UI
- GGUF Model Support - Run quantized GGUF models in the Universal Runtime
- Demo Project System - Interactive demo projects to help new users get started
- Universal Event Logging - Comprehensive observability across the platform
- Enhanced Tool Calling - Improved tool calling capabilities
- Project Cloning - Create new projects from existing ones
Bug Fixes
- Fixed upgrade failures on Unix-like systems
- Improved RAG integration and chat context management
- Fixed database tab switching in Designer
- Better dataset validation and status display
v0.0.14 — 2025-11-13
v0.0.13 — 2025-11-11
v0.0.12 — 2025-11-11
Project Management and Config Editor
New Features
- Delete Projects - Remove projects from CLI and API
- Config Editor Enhancements - Copy button, search, anchor points, unsaved changes prompts
- Embedding Strategies API - Configure embedding strategies via API
- MCP Server Config - Add MCP server configuration to runtime
- Project Context Provider - Better project context management
v0.0.11 and earlier
For releases v0.0.11 and earlier, please see the full changelog on GitHub.
About These Release Notes
These release notes are generated from our conventional commit history. For the complete structured changelog with commit links and PR references, see the CHANGELOG.md on GitHub.
Stay Updated
- GitHub Releases: github.com/llama-farm/llamafarm/releases
- Reddit: r/LlamaFarm
- Discord: Join our community