Skip to content

Production Setup

This tutorial walks through deploying Omnia for production use with:

  • Istio service mesh for traffic management and mTLS
  • Gateway API for external agent access
  • Observability stack (Prometheus, Grafana, Loki, Tempo)
  • JWT authentication for secure agent access
  • Kubernetes cluster (1.28+)
  • kubectl, helm, and istioctl installed
  • Domain name with DNS configured (for external access)

Install Istio with the Gateway API integration:

Terminal window
curl -L https://istio.io/downloadIstio | sh -
cd istio-*
export PATH=$PWD/bin:$PATH
istioctl install --set profile=default -y
kubectl label namespace default istio-injection=enabled

Verify Istio is running:

Terminal window
kubectl get pods -n istio-system

Create a values file for production:

replicaCount: 2
prometheus:
enabled: true
server:
persistentVolume:
enabled: true
size: 50Gi
grafana:
enabled: true
adminPassword: "your-secure-password" # Change this!
persistence:
enabled: true
size: 10Gi
loki:
enabled: true
singleBinary:
persistence:
enabled: true
size: 50Gi
alloy:
enabled: true
tempo:
enabled: true
persistence:
enabled: true
size: 50Gi
istio:
enabled: true
gateway:
enabled: true
className: istio
listeners:
https:
enabled: true
tlsSecretName: omnia-tls # Create this secret with your TLS cert
internalGateway:
enabled: true
className: istio
keda:
enabled: true

Install Omnia:

Terminal window
kubectl create namespace omnia-system
helm install omnia omnia/omnia \
-n omnia-system \
-f production-values.yaml

Create a TLS secret for HTTPS:

Terminal window
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: omnia-tls
namespace: omnia-system
spec:
secretName: omnia-tls
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
dnsNames:
- agents.yourdomain.com
EOF
kubectl create secret tls omnia-tls \
-n omnia-system \
--cert=path/to/tls.crt \
--key=path/to/tls.key

Set up JWT authentication for agent access:

authentication:
enabled: true
jwt:
issuer: "https://your-identity-provider.com"
audiences:
- "omnia-agents"

See Configure Authentication for detailed provider configuration.

First, create a Provider for your LLM:

apiVersion: v1
kind: Secret
metadata:
name: llm-credentials
namespace: default
stringData:
ANTHROPIC_API_KEY: "sk-ant-..."
---
apiVersion: omnia.altairalabs.ai/v1alpha1
kind: Provider
metadata:
name: claude-production
namespace: default
spec:
type: claude
model: claude-sonnet-4-20250514
secretRef:
name: llm-credentials
defaults:
temperature: "0.7"
maxTokens: 4096

Then create an AgentRuntime with production settings:

apiVersion: omnia.altairalabs.ai/v1alpha1
kind: AgentRuntime
metadata:
name: production-agent
namespace: default
spec:
promptPackRef:
name: my-promptpack
providerRef:
name: claude-production
facade:
type: websocket
port: 8080
handler: runtime # Production mode
session:
type: redis
storeRef:
name: redis-credentials
ttl: "24h"
runtime:
replicas: 2
resources:
requests:
cpu: "500m"
memory: "256Mi"
limits:
cpu: "1000m"
memory: "512Mi"
autoscaling:
enabled: true
type: keda
minReplicas: 1
maxReplicas: 10
keda:
pollingInterval: 15
cooldownPeriod: 300
triggers:
- type: prometheus
metadata:
serverAddress: "http://omnia-prometheus-server.omnia-system.svc"
query: 'sum(omnia_facade_connections_active{agent="production-agent"})'
threshold: "50"

Create an HTTPRoute to expose the agent:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: production-agent-route
namespace: default
spec:
parentRefs:
- name: omnia-agents
namespace: omnia-system
hostnames:
- "agents.yourdomain.com"
rules:
- matches:
- path:
type: PathPrefix
value: /production-agent
backendRefs:
- name: production-agent
port: 8080

The internal gateway exposes Grafana and Prometheus. Port-forward to access:

Terminal window
# Get the internal gateway IP
kubectl get gateway omnia-internal -n omnia-system
# Or port-forward directly
kubectl port-forward svc/omnia-grafana 3000:80 -n omnia-system

Access Grafana at http://localhost:3000 with:

  • Username: admin
  • Password: (from your values file)

Pre-configured dashboards include:

  • Omnia Overview: Request rates, latency, and errors
  • Agent Metrics: Per-agent performance and scaling
  • Cost Tracking: Token usage and estimated costs

With the external gateway configured, connect via WebSocket:

Terminal window
websocat "wss://agents.yourdomain.com/production-agent/ws?agent=production-agent" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"

Or using the internal service (from within the cluster):

Terminal window
websocat "ws://production-agent.default.svc:8080/ws?agent=production-agent"
  • Istio pods running in istio-system
  • Omnia operator running in omnia-system
  • Prometheus scraping metrics
  • Grafana dashboards loading
  • Gateway receiving external traffic
  • Agent pods running with Istio sidecar
  • WebSocket connections working through Gateway