# Personal Medical Assistant 🏥
A 100% local, privacy-first medical assistant that helps you understand your medical records using AI and evidence-based medical knowledge. Built with Next.js and LlamaFarm, all PDF processing happens entirely in your browser – your health data never leaves your device.
## Video Demo

Watch the full demonstration of the Medical Records Helper in action, showing how it processes medical documents locally and provides intelligent, evidence-based responses to your health questions.
## ✨ Key Features
### 🔒 Complete Privacy

- **PDFs parsed client-side** – All document processing happens in your browser
- **No server uploads** – Your files never leave your device
- **PHI protection** – Protected Health Information stays completely private
- **Local-first architecture** – A HIPAA-aligned approach to handling sensitive data
### 🤖 Multi-Hop Agentic RAG

- **AI orchestration** – Intelligent query generation and refinement
- **Knowledge retrieval** – Semantic search across medical literature
- **Response synthesis** – Combines multiple sources for comprehensive answers
- **Chain-of-thought reasoning** – A transparent decision-making process (a minimal sketch of the loop follows this list)
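The sketch below is an illustrative version of that loop, not the app's actual implementation. It assumes an OpenAI-compatible `/v1/chat/completions` endpoint on LlamaFarm's default port (8000) and the `fast-model`/`capable-model` names from the configuration shown later on this page; `Search` is a placeholder for whatever vector-store query your setup provides.

```ts
// Minimal multi-hop RAG loop (illustrative only). Assumes an OpenAI-compatible
// /v1/chat/completions endpoint; `Search` is a placeholder for your retriever.
type Chunk = { text: string; score: number };
type Search = (query: string, topK: number) => Promise<Chunk[]>;

async function chat(model: string, prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8000/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  });
  return (await res.json()).choices[0].message.content;
}

async function answer(question: string, search: Search): Promise<string> {
  // Hop 1: the fast model rewrites the question into focused search queries.
  const raw = await chat("fast-model", `List 3 short search queries, one per line, for: ${question}`);
  const queries = raw.split("\n").map((q) => q.trim()).filter(Boolean);

  // Hop 2: run every query against the knowledge base and pool the results.
  const chunks = (await Promise.all(queries.map((q) => search(q, 5)))).flat();
  const context = chunks.map((c) => c.text).join("\n---\n");

  // Hop 3: the capable model synthesizes a final answer from the pooled context.
  return chat("capable-model", `Context:\n${context}\n\nQuestion: ${question}`);
}
```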
### 📚 Medical Knowledge Base

- **125,830 knowledge chunks** from authoritative medical textbooks
- **18 medical textbooks** from the MedRAG dataset
- **Evidence-based information** – Vetted medical knowledge sources
- **Semantic indexing** – Fast, accurate retrieval of relevant information
### ⚡ Two-Tier AI Architecture

- **Fast model** for query generation and routing (e.g., Llama 3.2 3B)
- **Capable model** for comprehensive medical responses (e.g., Qwen 2.5 14B)
- **Optimized inference** – A balance between speed and quality (see the routing sketch after this list)
- **Streaming responses** – Real-time output for better UX
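As a tiny illustration of the tiering idea, orchestration work goes to the small model while only final answers hit the large one; the model names below match the `llamafarm.yaml` example later on this page.

```ts
type Task = "query-generation" | "routing" | "medical-response";

// Cheap orchestration steps run on the fast tier; only the final medical
// response uses the capable (slower, higher-quality) tier.
function pickModel(task: Task): "fast-model" | "capable-model" {
  return task === "medical-response" ? "capable-model" : "fast-model";
}
```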
### 💬 Streaming Chat Interface

- **Real-time streaming** – See responses as they're generated
- **Collapsible agent reasoning** – Inspect the AI's decision-making process
- **Conversation history** – Maintain context across multiple queries
- **Citation support** – Link responses to source documents
### 📄 Smart Document Analysis

- **Semantic chunking** – Intelligent splitting of medical documents
- **Medical context awareness** – Understands clinical terminology and structure
- **Cross-document synthesis** – Correlate information across multiple records
- **Metadata extraction** – Automatically identify key information (toy sketch after this list)
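As a toy example of the metadata-extraction idea (not the app's actual implementation), dates can be pulled from parsed text with a simple pattern:

```ts
// Toy date extractor: matches ISO (2024-01-30) and US (01/30/2024) styles.
// Real extraction would also normalize formats and capture other fields.
const DATE_RE = /\b(\d{4}-\d{2}-\d{2}|\d{1,2}\/\d{1,2}\/\d{4})\b/g;

function extractDates(text: string): string[] {
  return [...new Set(text.match(DATE_RE) ?? [])];
}
```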
### ⚙️ Configurable Retrieval

- **Adjustable top-k** – Control how many documents to retrieve
- **Score thresholds** – Filter out low-relevance results (see the settings sketch after this list)
- **Local document toggle** – Choose between uploaded docs and the knowledge base
- **Hybrid search** – Combine keyword and semantic search
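A sketch of how the first two knobs compose (illustrative names, not LlamaFarm's actual API): drop low-scoring chunks, then keep the best k.

```ts
type Chunk = { text: string; score: number };

interface RetrievalSettings {
  topK: number;           // maximum number of chunks to keep
  scoreThreshold: number; // drop chunks scoring below this (0..1)
  useLocalDocs: boolean;  // search uploaded PDFs instead of the knowledge base
}

// Apply the user's retrieval settings to raw search results. `useLocalDocs`
// selects the search target elsewhere and is included only for completeness.
function applySettings(results: Chunk[], s: RetrievalSettings): Chunk[] {
  return [...results]
    .filter((c) => c.score >= s.scoreThreshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, s.topK);
}
```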
### 🎨 Modern UI

- **Built with shadcn/ui** – Beautiful, accessible components
- **Tailwind CSS** – Responsive, mobile-friendly design
- **Dark mode support** – Easy on the eyes for extended use
- **Intuitive workflow** – Upload, chat, understand
## Architecture Overview

```mermaid
graph TB
    A[User's Browser] -->|Upload PDF| B[Client-Side PDF Parser]
    B -->|Extracted Text| C[LlamaFarm RAG Pipeline]
    C -->|Query| D[Fast Model - Query Generation]
    D -->|Refined Query| E[Vector Database]
    E -->|Relevant Chunks| F[Medical Knowledge Base<br/>125,830 chunks]
    E -->|User Documents| G[Uploaded Medical Records]
    F -->|Context| H[Capable Model - Response]
    G -->|Context| H
    H -->|Streaming Response| I[Chat Interface]
    I -->|User Question| D

    style A fill:#e1f5ff
    style C fill:#fff3cd
    style F fill:#d4edda
    style G fill:#f8d7da
    style H fill:#d1ecf1
```
## How It Works

1. **Document Upload** – Drop your medical PDF into the browser interface
2. **Client-Side Parsing** – The PDF is parsed with PDF.js entirely in JavaScript
3. **Semantic Chunking** – Text is split into meaningful chunks with medical context (see the sketch after this list)
4. **Query Processing** – A fast model analyzes your question and generates optimal search queries
5. **RAG Retrieval** – Relevant information is retrieved from both your documents and the medical knowledge base
6. **Response Generation** – A more capable model synthesizes the information into a comprehensive answer
7. **Stream & Display** – The response is streamed in real time with citations and reasoning
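Step 3 can be approximated by a fixed-size chunker with overlap, mirroring the `chunk_size`/`chunk_overlap` values in the configuration below; true semantic chunking additionally respects sentence and section boundaries.

```ts
// Naive chunker with overlap (a stand-in for semantic chunking). The overlap
// preserves context that would otherwise be cut at a chunk boundary.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const step = Math.max(1, chunkSize - overlap); // guard against overlap >= size
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}
```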
## Use Cases

### 🩺 Understanding Test Results

Ask questions like:

- "What does my hemoglobin A1c level of 6.8% mean?"
- "Should I be concerned about elevated liver enzymes?"
- "Explain my cholesterol panel results"

### 💊 Medication Information

Get context about prescriptions:

- "What are the side effects of metformin?"
- "Why was I prescribed this medication?"
- "Are there any drug interactions I should know about?"

### 🏥 Procedure Preparation

Prepare for medical procedures:

- "What should I expect before my colonoscopy?"
- "How do I prepare for an MRI with contrast?"
- "What are the risks of this surgery?"

### 📋 Medical History Synthesis

Consolidate your records:

- "Summarize my visits from the last 6 months"
- "What were my blood pressure trends over time?"
- "List all medications I've been prescribed"

### 🔍 Second Opinion Research

Research conditions and treatments:

- "What are alternative treatments for hypertension?"
- "What does current research say about this diagnosis?"
- "Are there any clinical trials for my condition?"
## Getting Started

### Prerequisites

- **LlamaFarm installed and running** – Follow the Quickstart Guide
- **Two models configured**:
  - Fast model (e.g., `llama3.2:3b`) for query generation
  - Capable model (e.g., `qwen2.5:14b`) for responses
- **Medical knowledge base** – Download and process the MedRAG dataset
### Installation

```bash
# Clone the local-ai-apps repository
git clone https://github.com/llama-farm/local-ai-apps.git
cd local-ai-apps/Medical-Records-Helper

# Install dependencies
npm install

# Configure environment variables
cp .env.example .env.local
# Edit .env.local with your LlamaFarm API endpoint

# Start the development server
npm run dev
```
### Configuration

Edit your `llamafarm.yaml` to configure the two-tier model setup:

```yaml
version: v1
name: medical-assistant
namespace: personal

runtime:
  models:
    fast-model:
      description: "Fast model for query generation"
      provider: ollama
      model: llama3.2:3b
      base_url: http://localhost:11434/v1
    capable-model:
      description: "Capable model for medical responses"
      provider: ollama
      model: qwen2.5:14b
      base_url: http://localhost:11434/v1
  default_model: capable-model

rag:
  databases:
    - name: medical_knowledge
      type: chroma
      path: ./data/medical_knowledge
  data_processing_strategies:
    - name: medical_processor
      parser: pdf
      extractors:
        - type: text
      chunker:
        type: semantic
        chunk_size: 1000
        chunk_overlap: 200
      embedder:
        model: all-MiniLM-L6-v2
```
### Setting Up the Medical Knowledge Base

```bash
# Download the MedRAG dataset
wget https://github.com/Teddy-XiongGZ/MedRAG/releases/download/v1.0/textbooks.zip
unzip textbooks.zip

# Create dataset
lf datasets create medical_texts -s medical_processor -b medical_knowledge

# Ingest the textbooks
lf datasets ingest medical_texts textbooks/*.pdf

# Process into the vector database (this may take a while)
lf datasets process medical_texts
```
## Privacy & Security

### HIPAA Compliance Considerations
While this application is designed with privacy in mind, consider these factors:
✅ What's Private:
- All PDF parsing happens client-side
- Documents are not stored on any server
- Queries are processed locally via LlamaFarm
- No data is sent to external APIs
⚠️ Important Notes:
- LlamaFarm must be running locally (not exposed to the internet)
- Ensure your machine is secured (encrypted disk, screensaver lock)
- Consider using this on a dedicated, air-gapped machine for maximum security
- Review your local network security if accessing from multiple devices
### Data Flow

```text
Your Device Only:
┌────────────────────────────────────────────────────────────┐
│ Browser (PDF Parsing) ──> LlamaFarm (Local) ──> Models     │
│ No Internet Required      No Server Upload      Local RAM  │
└────────────────────────────────────────────────────────────┘
```
## Performance Tips

### Hardware Recommendations
Minimum Configuration:
- CPU: 4 cores (8 recommended)
- RAM: 16GB (32GB recommended)
- Storage: 50GB free space for models and knowledge base
- GPU: Optional, but significantly speeds up inference
Optimal Configuration:
- CPU: 8+ cores
- RAM: 32GB+
- GPU: NVIDIA GPU with 8GB+ VRAM (for faster inference)
- Storage: SSD recommended
### Model Selection

Choose models based on your hardware:

| Hardware | Fast Model | Capable Model |
|---|---|---|
| CPU Only | llama3.2:1b | llama3.2:3b |
| 8GB GPU | llama3.2:3b | qwen2.5:7b |
| 16GB GPU | llama3.2:3b | qwen2.5:14b |
| 24GB+ GPU | llama3.2:3b | qwen2.5:32b |
## Technical Details

### PDF Processing Pipeline
The client-side PDF processing uses PDF.js to:
- Extract text content page by page
- Preserve formatting and structure
- Identify tables and lists
- Extract metadata (dates, patient info)
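A minimal sketch of the extraction step with the `pdfjs-dist` npm package follows; worker setup varies by bundler, so the worker path below is an assumption to adjust for your build.

```ts
import * as pdfjs from "pdfjs-dist";

// PDF.js needs a worker; this URL pattern works with most modern bundlers,
// but adjust it to your build setup.
pdfjs.GlobalWorkerOptions.workerSrc = new URL(
  "pdfjs-dist/build/pdf.worker.min.mjs",
  import.meta.url,
).toString();

// Extract the text of each page of an uploaded PDF, entirely in the browser.
async function extractText(file: File): Promise<string[]> {
  const pdf = await pdfjs.getDocument({ data: await file.arrayBuffer() }).promise;
  const pages: string[] = [];
  for (let i = 1; i <= pdf.numPages; i++) {
    const content = await (await pdf.getPage(i)).getTextContent();
    // Text items carry their string in `str`; marked-content items do not.
    pages.push(content.items.map((item) => ("str" in item ? item.str : "")).join(" "));
  }
  return pages;
}
```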
### RAG Strategy

The multi-hop agentic RAG approach:

1. **Query Analysis** – The fast model analyzes the user's question
2. **Query Generation** – Multiple search queries are generated to capture different aspects
3. **Retrieval** – Semantic search across the knowledge base and user documents
4. **Re-ranking** – Results are scored and filtered by relevance
5. **Synthesis** – The capable model generates a comprehensive response with citations (prompt sketch below)
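For the synthesis step, one simple way to get citations (a sketch, not necessarily the app's exact prompt) is to number the retrieved chunks so the model can reference them inline:

```ts
// Number the retrieved chunks so the model can cite them as [1], [2], ...
function buildCitedPrompt(question: string, chunks: string[]): string {
  const context = chunks.map((c, i) => `[${i + 1}] ${c}`).join("\n\n");
  return [
    "Answer the question using only the numbered sources below,",
    "and cite them inline as [n].",
    "",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```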
### Streaming Implementation

Responses stream to the browser using Server-Sent Events (SSE):

```js
// Client-side streaming handler. EventSource issues a GET request and
// receives one `data:` event per generated chunk; appendToChat is the
// app's own helper for updating the chat view.
const eventSource = new EventSource('/api/chat');
eventSource.onmessage = (event) => {
  const chunk = JSON.parse(event.data);
  appendToChat(chunk.content);
};
eventSource.onerror = () => eventSource.close(); // stop on stream end or error
```
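On the server side, a Next.js App Router handler can bridge that EventSource to the model. The sketch below assumes LlamaFarm exposes an OpenAI-compatible streaming endpoint on its default port (8000) and that the question arrives as a hypothetical `q` query parameter; it re-emits each token as the `{ content }` events the client above expects.

```ts
// app/api/chat/route.ts — illustrative SSE bridge (App Router), not the
// app's actual implementation.
async function* sseLines(body: ReadableStream<Uint8Array>) {
  const reader = body.pipeThrough(new TextDecoderStream()).getReader();
  let buf = "";
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buf += value;
    const lines = buf.split("\n");
    buf = lines.pop()!; // keep any partial line for the next read
    yield* lines;
  }
}

export async function GET(req: Request): Promise<Response> {
  const question = new URL(req.url).searchParams.get("q") ?? "";
  const upstream = await fetch("http://localhost:8000/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "capable-model",
      stream: true,
      messages: [{ role: "user", content: question }],
    }),
  });

  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      for await (const line of sseLines(upstream.body!)) {
        if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
        // OpenAI-style stream chunks carry the token in choices[0].delta.content.
        const token = JSON.parse(line.slice(6)).choices?.[0]?.delta?.content ?? "";
        controller.enqueue(encoder.encode(`data: ${JSON.stringify({ content: token })}\n\n`));
      }
      controller.close();
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream", "Cache-Control": "no-cache" },
  });
}
```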
## Limitations

### What This Tool Is NOT

- ❌ **Not a replacement for medical professionals** – Always consult qualified healthcare providers
- ❌ **Not for emergencies** – Call 911 or go to the ER for urgent medical issues
- ❌ **Not diagnostic** – It cannot diagnose conditions or prescribe treatments
- ❌ **Not medical advice** – For informational and educational purposes only
### Known Limitations

- **Language support** – Currently optimized for English medical documents
- **Handwritten notes** – Cannot process handwritten records (OCR is not included)
- **Image analysis** – Cannot interpret medical images (X-rays, CT scans, etc.)
- **Complex tables** – May have difficulty with intricate tabular data
- **Real-time data** – Cannot access current labs or vitals from healthcare systems
## Troubleshooting

### PDFs Not Parsing

**Problem:** An uploaded PDF shows no content

**Solutions:**
- Check if PDF is encrypted (password-protected)
- Ensure PDF contains text (not scanned images without OCR)
- Try a different PDF viewer to verify file integrity
- Check browser console for JavaScript errors
### Slow Response Times

**Problem:** AI responses take too long

**Solutions:**
- Switch to smaller/faster models
- Reduce RAG top-k setting (retrieve fewer chunks)
- Increase score threshold (more selective retrieval)
- Check system resources (CPU/RAM usage)
- Consider GPU acceleration
### Poor Answer Quality

**Problem:** Responses are vague or incorrect

**Solutions:**
- Use a more capable model for response generation
- Increase top-k to retrieve more context
- Ensure medical knowledge base is properly processed
- Refine your questions to be more specific
- Check that relevant documents are uploaded
### LlamaFarm Connection Issues

**Problem:** Cannot connect to the LlamaFarm API

**Solutions:**

- Verify LlamaFarm is running: `lf start`
- Check the API endpoint in `.env.local`
- Ensure the correct port (default: 8000)
- Check firewall settings
- Review LlamaFarm logs for errors
## Contributing
This example is part of the local-ai-apps repository. Contributions welcome!
### Ideas for Enhancement

- 📊 **Visualization** – Chart trends from lab results over time
- 🌍 **Multi-language** – Support for non-English medical documents
- 🎙️ **Voice interface** – Ask questions using speech-to-text
- 📱 **Mobile app** – Native iOS/Android versions
- 🔗 **EHR integration** – Connect to healthcare system APIs (with proper authorization)
- 🧪 **Lab result interpretation** – Automated flagging of abnormal values
- 📋 **Appointment preparation** – Generate questions for your next doctor visit
## Resources

### Medical Knowledge Sources

- MedRAG Dataset – 18 authoritative medical textbooks
- PubMed Central – Free full-text archive of biomedical literature
- UpToDate – Evidence-based clinical decision support (subscription)

### Related Projects

- LlamaFarm Documentation – Full platform documentation
- LlamaFarm GitHub – Main repository
- Local AI Apps – Collection of privacy-first applications
## Legal Disclaimer
IMPORTANT: This application is for educational and informational purposes only. It is not intended to be a substitute for professional medical advice, diagnosis, or treatment. Always seek the advice of your physician or other qualified health provider with any questions you may have regarding a medical condition. Never disregard professional medical advice or delay in seeking it because of something you have read through this application.
The medical knowledge base includes information from publicly available medical textbooks and is provided "as is" without warranty of any kind. The developers of this application are not liable for any damages or health issues arising from the use of this tool.
If you think you may have a medical emergency, call your doctor or 911 immediately.
## Next Steps
- Learn about RAG configuration
- Explore multi-model setups
- Review privacy best practices
- Check out other examples