Scale Agent Deployments

This guide covers scaling strategies for Omnia agent deployments.

Scale manually by setting the spec.runtime.replicas field:

apiVersion: omnia.altairalabs.ai/v1alpha1
kind: AgentRuntime
metadata:
  name: my-agent
spec:
  runtime:
    replicas: 3
    # ...

Or use kubectl:

kubectl patch agentruntime my-agent --type=merge \
-p '{"spec":{"runtime":{"replicas":5}}}'

Enable built-in HPA autoscaling:

spec:
  runtime:
    autoscaling:
      enabled: true
      type: hpa
      minReplicas: 2
      maxReplicas: 10
      targetMemoryUtilizationPercentage: 70
      targetCPUUtilizationPercentage: 90

The HPA adjusts the replica count between minReplicas and maxReplicas to keep average memory and CPU utilization near these targets. Inspect its current state with:

kubectl get hpa
kubectl describe hpa my-agent
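
To get a feel for how the targets above translate into replica counts, the sketch below applies the standard Kubernetes HPA calculation, desired = ceil(current replicas x current utilization / target utilization), where utilization is measured against the pod's resource requests. The current replica count and the observed utilization figures are hypothetical inputs chosen for illustration, not measurements from an Omnia deployment.

```shell
#!/bin/sh
# Hypothetical observations: 4 replicas averaging 90% memory and 60% CPU.
current_replicas=4
mem_util=90; mem_target=70   # targetMemoryUtilizationPercentage
cpu_util=60; cpu_target=90   # targetCPUUtilizationPercentage
min_replicas=2; max_replicas=10

# Integer ceiling division: ceil(a / b) == (a + b - 1) / b
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

by_mem=$(ceil_div $(( current_replicas * mem_util )) "$mem_target")
by_cpu=$(ceil_div $(( current_replicas * cpu_util )) "$cpu_target")

# With several metrics, the HPA takes the largest proposed replica count.
desired=$(( by_mem > by_cpu ? by_mem : by_cpu ))

# The result is clamped to the configured bounds.
if [ "$desired" -lt "$min_replicas" ]; then desired=$min_replicas; fi
if [ "$desired" -gt "$max_replicas" ]; then desired=$max_replicas; fi

echo "memory suggests $by_mem, CPU suggests $by_cpu, desired replicas: $desired"
```

With these inputs memory is the metric that drives the scale-up, which is the usual pattern when the memory target is set lower than the CPU target.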

For custom metrics and scale-to-zero capabilities, use KEDA:

spec:
  runtime:
    autoscaling:
      enabled: true
      type: keda
      minReplicas: 1
      maxReplicas: 20
      keda:
        pollingInterval: 30
        cooldownPeriod: 300
        triggers:
          - type: prometheus
            metadata:
              serverAddress: "http://prometheus:9090"
              query: 'sum(omnia_agent_connections_active{agent="my-agent"})'
              threshold: "10"

See Autoscaling Explained for detailed KEDA configuration.
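
As a rough guide to what the threshold in the trigger above means, the sketch below assumes KEDA's usual behavior for the Prometheus trigger, where the threshold is treated as a target value per replica, so the replica count settles near ceil(query result / threshold) within the configured bounds. The connection counts passed in are hypothetical, and replicas_for is an illustrative helper, not part of Omnia or KEDA.

```shell
#!/bin/sh
# Values taken from the KEDA example above.
threshold=10
min_replicas=1
max_replicas=20

# replicas_for <active_connections>
# Prints ceil(connections / threshold), clamped to [min_replicas, max_replicas].
replicas_for() {
  n=$(( ($1 + threshold - 1) / threshold ))
  if [ "$n" -lt "$min_replicas" ]; then n=$min_replicas; fi
  if [ "$n" -gt "$max_replicas" ]; then n=$max_replicas; fi
  echo "$n"
}

# Hypothetical query results: idle, moderate load, and load beyond maxReplicas.
for connections in 0 95 250; do
  echo "$connections connections -> $(replicas_for "$connections") replicas"
done
```

Lowering the threshold makes each replica responsible for fewer connections, so the same load produces more replicas.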

Configure CPU and memory for predictable performance:

spec:
  runtime:
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "1000m"
        memory: "512Mi"

Workload   CPU Request   Memory Request
Light      250m          128Mi
Medium     500m          256Mi
Heavy      1000m         512Mi
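
When autoscaling is enabled, it can help to check that the cluster has room for the agent at its maximum size. The sketch below multiplies the Medium requests from the table by a maxReplicas of 10, the value used in the HPA example. It counts only the requests shown here; any other containers in the pod are an unknown and are left out.

```shell
#!/bin/sh
# Per-replica requests for a Medium workload (from the table above).
cpu_request_m=500     # millicores
mem_request_mi=256    # MiB
max_replicas=10       # assumed upper bound, as in the HPA example

# Total requests the scheduler must be able to place at full scale.
total_cpu_m=$(( cpu_request_m * max_replicas ))
total_mem_mi=$(( mem_request_mi * max_replicas ))

echo "At $max_replicas replicas: CPU ${total_cpu_m}m, memory ${total_mem_mi}Mi"
```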

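When running multiple replicas, make sure every request in a session can reach that session's state, either through a shared session store or through session affinity.

Redis-backed sessions work with any replica, because the session state lives outside the pod: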

spec:
  session:
    type: redis
    storeRef:
      name: redis-credentials

If you use in-memory sessions (not recommended for production), configure session affinity on the Service so that a client keeps reaching the same replica:

apiVersion: v1
kind: Service
metadata:
  name: my-agent
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600

Check replica status:

kubectl get agentruntime my-agent -o wide

View status conditions:

kubectl describe agentruntime my-agent

View autoscaling metrics:

# With type: hpa
kubectl get hpa my-agent

# With type: keda (KEDA manages an HPA named keda-hpa-<scaledobject name>)
kubectl get scaledobject my-agent
kubectl get hpa keda-hpa-my-agent