
Personal Medical Assistant 🏥

A 100% local, privacy-first medical assistant that helps you understand your medical records using AI and evidence-based medical knowledge. Built with Next.js and LlamaFarm, it performs all PDF processing entirely in your browser – your health data never leaves your device.

Video Demo

Watch the full demonstration of the Medical Records Helper in action, showing how it processes medical documents locally and provides intelligent, evidence-based responses to your health questions.

✨ Key Features

🔒 Complete Privacy

  • PDFs parsed client-side – All document processing happens in your browser
  • No server uploads – Your files never leave your device
  • PHI protection – Protected Health Information stays completely private
  • Local-first architecture – A HIPAA-aligned approach to sensitive data

🤖 Multi-Hop Agentic RAG

  • AI orchestration – Intelligent query generation and refinement
  • Knowledge retrieval – Semantic search across medical literature
  • Response synthesis – Combines multiple sources for comprehensive answers
  • Chain-of-thought reasoning – Transparent decision-making process

📚 Medical Knowledge Base

  • 125,830 knowledge chunks from authoritative medical textbooks
  • 18 medical textbooks from the MedRAG dataset
  • Evidence-based information – Vetted medical knowledge sources
  • Semantic indexing – Fast, accurate retrieval of relevant information

⚡ Two-Tier AI Architecture

  • Fast model for query generation and routing (e.g., Llama 3.2 3B)
  • Capable model for comprehensive medical responses (e.g., Qwen 2.5 14B)
  • Optimized inference – Balance between speed and quality
  • Streaming responses – Real-time output for better UX

💬 Streaming Chat Interface

  • Real-time streaming – See responses as they're generated
  • Collapsible agent reasoning – Inspect the AI's decision-making process
  • Conversation history – Maintain context across multiple queries
  • Citation support – Link responses to source documents

📄 Smart Document Analysis

  • Semantic chunking – Intelligent splitting of medical documents
  • Medical context awareness – Understands clinical terminology and structure
  • Cross-document synthesis – Correlate information across multiple records
  • Metadata extraction – Automatically identify key information
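As a rough illustration of the semantic chunking idea, a sentence-aware splitter might look like the sketch below. The `chunkText` helper is hypothetical (not the app's actual chunker); the 1000/200 size and overlap defaults mirror the values used in the example configuration later in this page.

```javascript
// Illustrative sentence-aware chunker (hypothetical helper, not the app's code).
// Packs whole sentences into chunks of at most `chunkSize` characters and
// carries `overlap` trailing characters into the next chunk for continuity.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  const sentences = text.match(/[^.!?]+[.!?]+(\s+|$)/g) || [text];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    if (current.length > 0 && current.length + sentence.length > chunkSize) {
      chunks.push(current.trim());
      current = current.slice(-overlap); // overlap preserves cross-chunk context
    }
    current += sentence;
  }
  if (current.trim().length > 0) chunks.push(current.trim());
  return chunks;
}
```

A real semantic chunker would split on embedding-similarity boundaries rather than raw character counts, but the size/overlap mechanics are the same.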

⚙️ Configurable Retrieval

  • Adjustable top-k – Control how many documents to retrieve
  • Score thresholds – Filter low-relevance results
  • Local document toggle – Choose between uploaded docs or knowledge base
  • Hybrid search – Combine keyword and semantic search
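The top-k and score-threshold settings behave roughly like the post-retrieval filter below. This is an illustrative sketch; `filterResults` and the result shape are assumptions, not the app's actual API.

```javascript
// Illustrative post-retrieval filter (hypothetical, not the app's API):
// drop results below `scoreThreshold`, sort by descending similarity score,
// and keep at most `topK` of the survivors.
function filterResults(results, { topK = 5, scoreThreshold = 0.3 } = {}) {
  return results
    .filter((r) => r.score >= scoreThreshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Raising the threshold trades recall for precision; lowering top-k shortens the context the model must read, which also speeds up generation.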

🎨 Modern UI

  • Built with shadcn/ui – Beautiful, accessible components
  • Tailwind CSS – Responsive, mobile-friendly design
  • Dark mode support – Easy on the eyes for extended use
  • Intuitive workflow – Upload, chat, understand

Architecture Overview

graph TB
A[User's Browser] -->|Upload PDF| B[Client-Side PDF Parser]
B -->|Extracted Text| C[LlamaFarm RAG Pipeline]
C -->|Query| D[Fast Model - Query Generation]
D -->|Refined Query| E[Vector Database]
E -->|Relevant Chunks| F[Medical Knowledge Base<br/>125,830 chunks]
E -->|User Documents| G[Uploaded Medical Records]
F -->|Context| H[Capable Model - Response]
G -->|Context| H
H -->|Streaming Response| I[Chat Interface]
I -->|User Question| D

style A fill:#e1f5ff
style C fill:#fff3cd
style F fill:#d4edda
style G fill:#f8d7da
style H fill:#d1ecf1

How It Works

  1. Document Upload – Drop your medical PDF into the browser interface
  2. Client-Side Parsing – The PDF is parsed using PDF.js entirely in JavaScript
  3. Semantic Chunking – Text is split into meaningful chunks with medical context
  4. Query Processing – Your question is analyzed by a fast model to generate optimal search queries
  5. RAG Retrieval – Relevant information is retrieved from both your documents and the medical knowledge base
  6. Response Generation – A more capable model synthesizes the information into a comprehensive answer
  7. Stream & Display – The response is streamed in real time with citations and reasoning
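The steps above can be sketched as one async pipeline. Every name here (`generateQueries`, `search`, `synthesize`, the `deps` object) is a hypothetical stand-in for the app's internals, shown only to make the data flow concrete.

```javascript
// Hypothetical end-to-end flow; each dependency stands in for a real component.
async function answerQuestion(question, userDocs, deps) {
  // The fast model turns the question into one or more search queries.
  const queries = await deps.fastModel.generateQueries(question);

  // Retrieve context from both the knowledge base and the user's documents.
  const hits = [];
  for (const q of queries) {
    hits.push(...(await deps.vectorDb.search(q)));
    hits.push(...(await deps.vectorDb.search(q, { source: userDocs })));
  }

  // The capable model synthesizes a cited answer from the retrieved chunks.
  return deps.capableModel.synthesize(question, hits);
}
```

The two-model split is the key design choice: the cheap model runs once per question for routing, while the expensive model runs only on the final, already-assembled context.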

Use Cases

🩺 Understanding Test Results

Ask questions like:

  • "What does my hemoglobin A1c level of 6.8% mean?"
  • "Should I be concerned about elevated liver enzymes?"
  • "Explain my cholesterol panel results"

💊 Medication Information

Get context about prescriptions:

  • "What are the side effects of metformin?"
  • "Why was I prescribed this medication?"
  • "Are there any drug interactions I should know about?"

๐Ÿฅ Procedure Preparationโ€‹

Prepare for medical procedures:

  • "What should I expect before my colonoscopy?"
  • "How do I prepare for an MRI with contrast?"
  • "What are the risks of this surgery?"

📋 Medical History Synthesis

Consolidate your records:

  • "Summarize my visits from the last 6 months"
  • "What were my blood pressure trends over time?"
  • "List all medications I've been prescribed"

🔍 Second Opinion Research

Research conditions and treatments:

  • "What are alternative treatments for hypertension?"
  • "What does current research say about this diagnosis?"
  • "Are there any clinical trials for my condition?"

Getting Started

Prerequisites

  1. LlamaFarm installed and running – Follow the Quickstart Guide
  2. Two models configured:
    • Fast model (e.g., llama3.2:3b) for query generation
    • Capable model (e.g., qwen2.5:14b) for responses
  3. Medical knowledge base – Download and process the MedRAG dataset

Installation

# Clone the local-ai-apps repository
git clone https://github.com/llama-farm/local-ai-apps.git
cd local-ai-apps/Medical-Records-Helper

# Install dependencies
npm install

# Configure environment variables
cp .env.example .env.local
# Edit .env.local with your LlamaFarm API endpoint

# Start the development server
npm run dev

Configuration

Edit your llamafarm.yaml to configure the two-tier model setup:

version: v1
name: medical-assistant
namespace: personal

runtime:
  models:
    fast-model:
      description: "Fast model for query generation"
      provider: ollama
      model: llama3.2:3b
      base_url: http://localhost:11434/v1

    capable-model:
      description: "Capable model for medical responses"
      provider: ollama
      model: qwen2.5:14b
      base_url: http://localhost:11434/v1

  default_model: capable-model

rag:
  databases:
    - name: medical_knowledge
      type: chroma
      path: ./data/medical_knowledge

  data_processing_strategies:
    - name: medical_processor
      parser: pdf
      extractors:
        - type: text
      chunker:
        type: semantic
        chunk_size: 1000
        chunk_overlap: 200
      embedder:
        model: all-MiniLM-L6-v2

Setting Up the Medical Knowledge Base

# Download the MedRAG dataset
wget https://github.com/Teddy-XiongGZ/MedRAG/releases/download/v1.0/textbooks.zip
unzip textbooks.zip

# Create dataset
lf datasets create medical_texts -s medical_processor -b medical_knowledge

# Ingest the textbooks
lf datasets ingest medical_texts textbooks/*.pdf

# Process into vector database (this may take a while)
lf datasets process medical_texts

Privacy & Security

HIPAA Compliance Considerations

While this application is designed with privacy in mind, consider these factors:

✅ What's Private:

  • All PDF parsing happens client-side
  • Documents are not stored on any server
  • Queries are processed locally via LlamaFarm
  • No data is sent to external APIs

⚠️ Important Notes:

  • LlamaFarm must be running locally (not exposed to internet)
  • Ensure your machine is secured (encrypted disk, screensaver lock)
  • Consider using this on a dedicated, air-gapped machine for maximum security
  • Review your local network security if accessing from multiple devices

Data Flow

Your Device Only:
┌──────────────────────────────────────────────────────────┐
│ Browser (PDF Parsing) ──> LlamaFarm (Local) ──> Models   │
│ No Internet Required      No Server Upload    Local RAM  │
└──────────────────────────────────────────────────────────┘

Performance Tips

Hardware Recommendations

Minimum Configuration:

  • CPU: 4 cores (8 recommended)
  • RAM: 16GB (32GB recommended)
  • Storage: 50GB free space for models and knowledge base
  • GPU: Optional, but significantly speeds up inference

Optimal Configuration:

  • CPU: 8+ cores
  • RAM: 32GB+
  • GPU: NVIDIA GPU with 8GB+ VRAM (for faster inference)
  • Storage: SSD recommended

Model Selection

Choose models based on your hardware:

Hardware     Fast Model     Capable Model
CPU Only     llama3.2:1b    llama3.2:3b
8GB GPU      llama3.2:3b    qwen2.5:7b
16GB GPU     llama3.2:3b    qwen2.5:14b
24GB+ GPU    llama3.2:3b    qwen2.5:32b

Technical Details

PDF Processing Pipeline

The client-side PDF processing uses PDF.js to:

  1. Extract text content page by page
  2. Preserve formatting and structure
  3. Identify tables and lists
  4. Extract metadata (dates, patient info)
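PDF.js exposes each page's text as positioned items, each with a `str` and a 6-element `transform` matrix whose last entry is the item's y-coordinate. A minimal sketch of regrouping those items into reading-order lines (an illustrative helper, not the app's actual extraction code):

```javascript
// Illustrative helper (not the app's code): group PDF.js text items that share
// a vertical position (transform[5]) into lines, top of the page first.
function itemsToLines(items, yTolerance = 2) {
  const lines = [];
  for (const item of items) {
    const y = item.transform[5];
    const line = lines.find((l) => Math.abs(l.y - y) <= yTolerance);
    if (line) line.parts.push(item.str);
    else lines.push({ y, parts: [item.str] });
  }
  return lines
    .sort((a, b) => b.y - a.y) // PDF y-axis grows upward, so larger y = higher
    .map((l) => l.parts.join(" "));
}
```

The `yTolerance` slack matters in practice: items on the same visual line often differ by a fraction of a point due to sub/superscripts in lab units.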

RAG Strategy

The multi-hop agentic RAG approach:

  1. Query Analysis – Fast model analyzes the user's question
  2. Query Generation – Generate multiple search queries to capture different aspects
  3. Retrieval – Semantic search across knowledge base + user documents
  4. Re-ranking – Score and filter results by relevance
  5. Synthesis – Capable model generates comprehensive response with citations
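Because step 2 issues several queries, steps 3–4 must merge overlapping results before synthesis. A merge step like the following (illustrative sketch, not the app's implementation) deduplicates hits by chunk id while keeping each chunk's best score:

```javascript
// Illustrative multi-query merge (hypothetical, not the app's code):
// deduplicate chunks retrieved by different queries, keep the best score
// per chunk, and return the survivors ranked by that score.
function mergeHits(hitLists) {
  const best = new Map();
  for (const hits of hitLists) {
    for (const hit of hits) {
      const prev = best.get(hit.id);
      if (!prev || hit.score > prev.score) best.set(hit.id, hit);
    }
  }
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```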

Streaming Implementation

Responses stream using Server-Sent Events (SSE):

// Client-side streaming handler
const eventSource = new EventSource('/api/chat');
eventSource.onmessage = (event) => {
  const chunk = JSON.parse(event.data);
  appendToChat(chunk.content);
};
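On the wire, each SSE event is a `data:` line carrying (here, assuming the JSON payload shape the handler above expects) one response chunk. A minimal parser for that raw format, shown purely for illustration:

```javascript
// Illustrative parser for raw SSE text (assumes each `data:` line carries
// a JSON payload, matching the EventSource handler's expectations).
function parseSseChunks(raw) {
  return raw
    .split("\n")
    .filter((line) => line.startsWith("data:"))
    .map((line) => JSON.parse(line.slice(5).trim()));
}
```

In the browser, `EventSource` does this framing for you; a parser like this is only needed when consuming the stream manually, e.g. via `fetch` with a readable stream.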

Limitations

What This Tool Is NOT

โŒ Not a replacement for medical professionals โ€“ Always consult qualified healthcare providers โŒ Not for emergencies โ€“ Call 911 or go to the ER for urgent medical issues โŒ Not diagnostic โ€“ Cannot diagnose conditions or prescribe treatments โŒ Not medical advice โ€“ For informational and educational purposes only

Known Limitations

  • Language support – Currently optimized for English medical documents
  • Handwritten notes – Cannot process handwritten records (OCR not included)
  • Image analysis – Cannot interpret medical images (X-rays, CT scans, etc.)
  • Complex tables – May have difficulty with intricate tabular data
  • Real-time data – Cannot access current labs or vitals from healthcare systems

Troubleshooting

PDFs Not Parsing

Problem: Uploaded PDF shows no content

Solutions:

  • Check if PDF is encrypted (password-protected)
  • Ensure PDF contains text (not scanned images without OCR)
  • Try a different PDF viewer to verify file integrity
  • Check browser console for JavaScript errors

Slow Response Times

Problem: AI responses take too long

Solutions:

  • Switch to smaller/faster models
  • Reduce RAG top-k setting (retrieve fewer chunks)
  • Increase score threshold (more selective retrieval)
  • Check system resources (CPU/RAM usage)
  • Consider GPU acceleration

Poor Answer Quality

Problem: Responses are vague or incorrect

Solutions:

  • Use a more capable model for response generation
  • Increase top-k to retrieve more context
  • Ensure medical knowledge base is properly processed
  • Refine your questions to be more specific
  • Check that relevant documents are uploaded

LlamaFarm Connection Issues

Problem: Cannot connect to LlamaFarm API

Solutions:

  • Verify LlamaFarm is running: lf start
  • Check API endpoint in .env.local
  • Ensure correct port (default: 8000)
  • Check firewall settings
  • Review LlamaFarm logs for errors

Contributing

This example is part of the local-ai-apps repository. Contributions welcome!

Ideas for Enhancement

  • 📊 Visualization – Chart trends from lab results over time
  • 🌍 Multi-language – Support for non-English medical documents
  • 🔊 Voice interface – Ask questions using speech-to-text
  • 📱 Mobile app – Native iOS/Android versions
  • 🔗 EHR integration – Connect to healthcare system APIs (with proper authorization)
  • 🧪 Lab result interpretation – Automated flagging of abnormal values
  • 📅 Appointment preparation – Generate questions for your next doctor visit

Resources

Medical Knowledge Sources

  • MedRAG Dataset – 18 authoritative medical textbooks
  • PubMed Central – Free full-text archive of biomedical literature
  • UpToDate – Evidence-based clinical decision support (subscription)

Further Reading


IMPORTANT: This application is for educational and informational purposes only. It is not intended to be a substitute for professional medical advice, diagnosis, or treatment. Always seek the advice of your physician or other qualified health provider with any questions you may have regarding a medical condition. Never disregard professional medical advice or delay in seeking it because of something you have read through this application.

The medical knowledge base includes information from publicly available medical textbooks and is provided "as is" without warranty of any kind. The developers of this application are not liable for any damages or health issues arising from the use of this tool.

If you think you may have a medical emergency, call your doctor or 911 immediately.


Next Steps