Kubernetes

Deploy LlamaFarm to a Kubernetes cluster with either plain manifests or a Helm chart.

Example manifest

k8s-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llamafarm
  namespace: ai
spec:
  replicas: 3
  selector:
    matchLabels:
      app: llamafarm
  template:
    metadata:
      labels:
        app: llamafarm
    spec:
      containers:
        - name: llamafarm
          image: llamafarm/llamafarm:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: '8Gi'
              cpu: '4'
              nvidia.com/gpu: 1
            limits:
              memory: '16Gi'
              cpu: '8'
              nvidia.com/gpu: 1
          env:
            - name: MODEL_PATH
              value: '/models'
          volumeMounts:
            - name: models
              mountPath: /models
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: models-pvc
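
The Deployment references a claim named models-pvc but does not define it. A minimal sketch of a matching PersistentVolumeClaim might look like the following. The access mode and the storage size are assumptions, not values from the manifest above. ReadWriteMany is chosen because three replicas may land on different nodes and all mount the same volume, and it requires a storage class that supports shared access.

```yaml
# Hypothetical claim backing the "models" volume in k8s-deployment.yaml.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: models-pvc        # must match claimName in the Deployment
  namespace: ai           # PVCs are namespaced; must match the Deployment
spec:
  accessModes:
    - ReadWriteMany       # assumption: replicas on several nodes share the volume
  resources:
    requests:
      storage: 100Gi      # assumption: size this to the model files you store
  # storageClassName is omitted, so the cluster's default class is used.
```

If the pods only read model files, ReadOnlyMany is a tighter choice where the storage backend supports it.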

Helm

helm install llamafarm ./charts/llamafarm \
  --set model.type=llama2-13b \
  --set replicas=3 \
  --set gpu.enabled=true
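
The same settings can be kept in a values file instead of repeated --set flags, which is easier to review and version. The sketch below only restates the three keys used in the command above. The file name values-gpu.yaml is hypothetical, and no other chart keys are assumed.

```yaml
# values-gpu.yaml (hypothetical file name)
# Each dotted --set key maps to a nested YAML key.
model:
  type: llama2-13b
replicas: 3
gpu:
  enabled: true
```

Pass the file with the -f flag: helm install llamafarm ./charts/llamafarm -f values-gpu.yaml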

Tips

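  • Use separate node pools for CPU and GPU workloads, so GPU nodes run only pods that request nvidia.com/gpu
  • Configure a HorizontalPodAutoscaler (HPA) or VerticalPodAutoscaler (VPA) for scaling
  • Mount secrets from Kubernetes Secret objects or through the Secrets Store CSI Driver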
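
As a rough illustration of the first two tips, the fragment below pins the pods to a GPU node pool. The node label pool: gpu and the taint key are assumptions. Use whatever labels and taints your cluster or cloud provider applies to its GPU nodes.

```yaml
# Fragment to merge into spec.template.spec of the Deployment.
nodeSelector:
  pool: gpu                 # hypothetical node label on the GPU pool
tolerations:
  - key: nvidia.com/gpu     # assumption: GPU nodes carry this taint
    operator: Exists
    effect: NoSchedule
```

A HorizontalPodAutoscaler for the same Deployment could be sketched like this. The replica bounds and the 70% CPU target are illustrative values, not recommendations from the source.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: llamafarm
  namespace: ai
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: llamafarm
  minReplicas: 3            # matches replicas in the manifest
  maxReplicas: 6            # assumption: each extra replica needs a free GPU
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the requested CPU
```

Because every replica requests one GPU, the upper bound is only reachable if the GPU pool can grow to match it.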