
Multi-Tenancy Architecture

This document explains how Omnia provides multi-tenancy through Workspaces, the design decisions behind the architecture, and how it scales from small teams to enterprise deployments.

Workspaces provide logical isolation for teams sharing an Omnia cluster. Each workspace has:

  • Dedicated namespace - Kubernetes namespace for resource isolation
  • Role-based access - Three roles (owner, editor, viewer) with scoped permissions
  • Resource quotas - Limits on compute, objects, and Omnia-specific resources
  • Network isolation - Automatic NetworkPolicy generation to restrict cross-tenant traffic
  • Cost attribution - Tags for tracking spend by team
graph TB
subgraph cluster["Kubernetes Cluster"]
subgraph ws1["Workspace: Team A"]
ns1[Namespace: omnia-team-a]
sa1[ServiceAccounts]
rb1[RoleBindings]
q1[ResourceQuota]
np1[NetworkPolicy]
a1[Agents]
end
subgraph ws2["Workspace: Team B"]
ns2[Namespace: omnia-team-b]
sa2[ServiceAccounts]
rb2[RoleBindings]
q2[ResourceQuota]
np2[NetworkPolicy]
a2[Agents]
end
subgraph shared["Shared Resources"]
providers[Providers]
tools[ToolRegistries]
end
a1 --> providers
a1 --> tools
a2 --> providers
a2 --> tools
end
users1((Team A Users)) --> ws1
users2((Team B Users)) --> ws2

Each workspace maps to exactly one Kubernetes namespace. This provides:

  • Resource scoping - Agents, PromptPacks, and Arena jobs are namespace-scoped
  • Network isolation - NetworkPolicies can restrict cross-namespace traffic
  • RBAC boundaries - Permissions are scoped to the namespace

The controller creates the namespace when spec.namespace.create: true:

spec:
  namespace:
    name: omnia-customer-support
    create: true
    labels:
      environment: production
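With create: true set as above, the Namespace the controller produces would look roughly like the following (a sketch; any labels the controller itself adds are omitted):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: omnia-customer-support
  labels:
    environment: production   # propagated from spec.namespace.labels
```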

Workspaces define three roles with increasing permissions:

| Role   | View Resources | Create/Edit | Delete | Manage Members |
|--------|----------------|-------------|--------|----------------|
| viewer | Yes            | No          | No     | No             |
| editor | Yes            | Yes         | Yes    | No             |
| owner  | Yes            | Yes         | Yes    | Yes            |

The controller creates a ServiceAccount and RoleBinding for each role:

graph LR
subgraph workspace["Workspace Controller"]
ws[Workspace CRD]
end
subgraph namespace["Workspace Namespace"]
sa1[SA: ws-owner]
sa2[SA: ws-editor]
sa3[SA: ws-viewer]
rb1[RoleBinding → owner]
rb2[RoleBinding → editor]
rb3[RoleBinding → viewer]
end
subgraph clusterRoles["Cluster Roles"]
cr1[omnia-workspace-owner]
cr2[omnia-workspace-editor]
cr3[omnia-workspace-viewer]
end
ws --> sa1
ws --> sa2
ws --> sa3
sa1 --> rb1
sa2 --> rb2
sa3 --> rb3
rb1 --> cr1
rb2 --> cr2
rb3 --> cr3
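As a concrete sketch of what the diagram describes, the editor binding in a hypothetical omnia-team-a workspace might be rendered as follows (names are taken from the diagrams; the controller's actual manifests may differ):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ws-editor                      # name taken from the diagram
  namespace: omnia-team-a
subjects:
  - kind: ServiceAccount
    name: ws-editor
    namespace: omnia-team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole                    # shared ClusterRole, bound per namespace
  name: omnia-workspace-editor
```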

When a user accesses the dashboard, authorization happens at the application layer:

sequenceDiagram
participant U as User
participant D as Dashboard
participant K as Kubernetes API
U->>D: Request workspace resources
D->>D: Extract user identity (OIDC claims)
D->>K: Get Workspace CRD
K-->>D: Workspace spec
D->>D: Match user groups to roleBindings
D->>D: Determine role (owner/editor/viewer)
D->>K: Get SA token for role
K-->>D: Short-lived token
D->>K: API call with SA token
K-->>D: Resources
D-->>U: Scoped response
  1. Identity extraction - User’s email and groups from OIDC token
  2. Role determination - Match groups against spec.roleBindings
  3. Token acquisition - Fetch ServiceAccount token for the determined role
  4. Scoped API calls - Use token to make workspace-scoped K8s API calls

Kubernetes RBAC has scaling limitations:

| Scale  | Users  | RoleBindings | Problem                             |
|--------|--------|--------------|-------------------------------------|
| Small  | <50    | <100         | Works well                          |
| Medium | 50-500 | 500-5000     | etcd pressure, slow reconciliation  |
| Large  | 500+   | 5000+        | Unmanageable, audit nightmare       |

Application-layer authorization keeps the Workspace CRD small (10-20 group entries) while supporting thousands of users. User management happens in your identity provider, not Kubernetes.

Groups are the primary access mechanism:

spec:
  roleBindings:
    - groups:
        - "omnia-admins@acme.com"  # Azure AD group
        - "engineering-team"       # Okta group
      role: editor

When a user’s JWT contains matching groups, they get the associated role. Multiple groups can map to the same role, and a user who matches several groups receives the highest-privilege role among them.
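For example, with the following bindings (the group names here are hypothetical), a user whose JWT carries both groups would be granted editor, the higher of the two matching roles:

```yaml
spec:
  roleBindings:
    - groups:
        - "support-analysts"   # hypothetical group -> viewer
      role: viewer
    - groups:
        - "support-leads"      # hypothetical group -> editor (wins)
      role: editor
```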

For CI/CD and automation, ServiceAccounts use native Kubernetes RBAC:

spec:
  roleBindings:
    - serviceAccounts:
        - name: argocd-application-controller
          namespace: argocd
      role: editor

The controller creates actual RoleBindings for ServiceAccounts, allowing direct K8s API access without going through the dashboard.

The dashboard uses the Kubernetes TokenRequest API to get short-lived ServiceAccount tokens:

sequenceDiagram
participant D as Dashboard
participant K as Kubernetes API
D->>K: TokenRequest (SA, 1h expiry)
K-->>D: JWT token
D->>D: Cache token (50 min)
D->>K: API calls with token
Note over D,K: Token auto-refreshes before expiry
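The dashboard's request is equivalent to posting a TokenRequest to the ServiceAccount's token subresource, roughly as below (a sketch matching the 1-hour expiry shown in the diagram):

```yaml
apiVersion: authentication.k8s.io/v1
kind: TokenRequest
spec:
  expirationSeconds: 3600   # 1-hour expiry; the dashboard caches the token for 50 minutes
```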

Benefits:

  • Security - Tokens expire automatically
  • Least privilege - Token matches user’s role
  • Audit - Token identity appears in K8s audit logs

Tokens are cached to avoid excessive TokenRequest calls:

  • TTL: Tokens have 1-hour expiry
  • Cache duration: 50 minutes (refresh before expiry)
  • Cache key: workspace + role combination

Some resources are shared across workspaces:

LLM providers are typically cluster-wide:

apiVersion: omnia.altairalabs.ai/v1alpha1
kind: Provider
metadata:
  name: claude-sonnet
  namespace: omnia-shared
  labels:
    omnia.altairalabs.ai/shared: "true"

Agents reference shared providers:

spec:
  providerRef:
    name: claude-sonnet
    namespace: omnia-shared

Common tools can be shared:

apiVersion: omnia.altairalabs.ai/v1alpha1
kind: ToolRegistry
metadata:
  name: common-tools
  namespace: omnia-shared
  labels:
    omnia.altairalabs.ai/shared: "true"

Access to shared resources requires explicit reference. Workspaces can’t enumerate or modify resources in other namespaces.

The controller can create ResourceQuotas in workspace namespaces:

spec:
  quotas:
    compute:
      requests.cpu: "50"
      requests.memory: "100Gi"

This maps directly to Kubernetes ResourceQuota objects.
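As an illustration, the compute quota above would translate into a ResourceQuota roughly like the following (the object name and namespace here are assumptions):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: workspace-compute            # assumed name
  namespace: omnia-customer-support
spec:
  hard:
    requests.cpu: "50"
    requests.memory: "100Gi"
```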

Additional quotas control Omnia resources:

spec:
  quotas:
    agents:
      maxAgentRuntimes: 20
      maxReplicasPerAgent: 10
    arena:
      maxConcurrentJobs: 10
      maxJobsPerDay: 100

These are enforced by admission webhooks (planned) or controller validation.

The one-namespace-per-workspace design favors simplicity and Kubernetes alignment:

  • Native RBAC - Permissions naturally scope to namespaces
  • Resource isolation - Standard Kubernetes isolation model
  • Familiar model - Teams understand namespace boundaries
  • Tool compatibility - Works with existing K8s tooling

Future versions may support multiple namespaces per workspace for complex organizational structures.

Unlike Kubeflow Profiles, which require Istio and additional infrastructure, Workspaces are:

  • Self-contained - No external dependencies
  • Lightweight - Minimal resource overhead
  • Focused - Built for Omnia’s specific needs

Third-party multi-tenancy solutions add operational complexity:

  • Capsule - Powerful but requires learning new abstractions
  • HNC (Hierarchical Namespace Controller) - Archived in April 2025
  • Workspaces - Integrated into Omnia, purpose-built

Kubernetes RBAC doesn’t scale for human users:

  • RoleBinding explosion - N users × M workspaces = N×M bindings
  • No dynamic groups - Changes require RoleBinding updates
  • No directory sync - Manual user provisioning

Application-layer auth with IdP groups provides:

  • Scalability - Thousands of users, minimal CRD entries
  • Dynamic membership - Add users in IdP, instant access
  • Centralized management - Single source of truth in IdP

Least privilege:

  • Users get the minimum role needed
  • Anonymous access defaults to viewer
  • ServiceAccount tokens are scoped to the workspace

Token security:

  • Short-lived tokens (1 hour default)
  • Tokens are cached server-side, not exposed to the browser
  • Automatic refresh before expiry

Auditability:

  • Kubernetes audit logs capture all API calls
  • Token identity shows which workspace ServiceAccount was used
  • The dashboard can log user actions separately

Workspaces can automatically generate NetworkPolicies to enforce network boundaries between tenants. When enabled, the controller creates a NetworkPolicy that restricts traffic while allowing essential communication.

spec:
  networkPolicy:
    isolate: true

The controller creates a NetworkPolicy named workspace-{name}-isolation with these default rules:

graph TB
subgraph workspace["Workspace Namespace"]
pods[All Pods]
end
subgraph defaults["Default Allowed Traffic"]
dns[DNS to kube-system]
same[Same Namespace]
shared[Shared Namespaces]
external[External APIs]
end
pods --> dns
pods <--> same
pods <--> shared
pods --> external
style dns fill:#90EE90
style same fill:#90EE90
style shared fill:#90EE90
style external fill:#90EE90
| Default Rule                             | Direction | Purpose                              |
|------------------------------------------|-----------|--------------------------------------|
| DNS (port 53)                            | Egress    | Allow pods to resolve DNS            |
| Same namespace                           | Both      | Allow intra-workspace communication  |
| Shared namespaces                        | Both      | Allow access to shared Providers/Tools |
| External IPs (0.0.0.0/0)                 | Egress    | Allow LLM API calls                  |
| Private IPs (10.x, 172.16.x, 192.168.x)  | Blocked   | Prevent cross-tenant access          |
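A simplified sketch of the generated policy for a workspace named team-a, assuming shared namespaces carry the omnia.altairalabs.ai/shared: "true" label (the controller's actual selectors may differ):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: workspace-team-a-isolation
  namespace: omnia-team-a
spec:
  podSelector: {}                    # applies to every pod in the workspace
  policyTypes: ["Ingress", "Egress"]
  ingress:
    - from:
        - podSelector: {}            # same namespace
        - namespaceSelector:         # shared namespaces (assumed label)
            matchLabels:
              omnia.altairalabs.ai/shared: "true"
  egress:
    - to:                            # DNS
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
    - to:
        - podSelector: {}            # same namespace
        - namespaceSelector:
            matchLabels:
              omnia.altairalabs.ai/shared: "true"
    - to:                            # external APIs, private ranges excluded
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
```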

For fine-grained control, add custom ingress and egress rules:

spec:
  networkPolicy:
    isolate: true
    allowExternalAPIs: true        # Default: true
    allowSharedNamespaces: true    # Default: true
    allowFrom:
      - peers:
          - namespaceSelector:
              matchLabels:
                kubernetes.io/metadata.name: ingress-nginx
    allowTo:
      - peers:
          - ipBlock:
              cidr: 10.0.0.0/8     # Internal database network
        ports:
          - protocol: TCP
            port: 5432

For maximum isolation, disable external API access and explicitly allow only required endpoints:

spec:
  networkPolicy:
    isolate: true
    allowExternalAPIs: false
    allowTo:
      # Only allow specific LLM provider
      - peers:
          - ipBlock:
              cidr: 104.18.0.0/16  # Anthropic API
        ports:
          - protocol: TCP
            port: 443

This approach provides defense-in-depth: even if application-layer authorization is compromised, network-layer controls prevent unauthorized data exfiltration or cross-tenant access.

For local development where agents need to access services on private networks (e.g., local Ollama):

spec:
  networkPolicy:
    isolate: true
    allowPrivateNetworks: true     # Removes RFC 1918 exclusions

This is useful when running LLM providers locally or accessing internal development services. Do not use in production.

Small teams can use the default setup:

  • Anonymous mode for development
  • Basic OIDC for production
  • Single workspace per team

Growing organizations can enable full workspace features:

  • OIDC with group claims
  • Multiple workspaces, organized by team
  • Resource quotas per workspace

Large enterprises can leverage the scalability design:

  • IdP groups for all access (no direct grants)
  • Automated workspace provisioning via GitOps
  • Monitoring and alerting on quota usage
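Tying these pieces together for GitOps-managed provisioning, a complete Workspace manifest might combine the fields shown throughout this document (a sketch; the apiVersion is assumed to match the other Omnia CRDs):

```yaml
apiVersion: omnia.altairalabs.ai/v1alpha1
kind: Workspace
metadata:
  name: customer-support
spec:
  namespace:
    name: omnia-customer-support
    create: true
  roleBindings:
    - groups:
        - "omnia-admins@acme.com"
      role: owner
  quotas:
    compute:
      requests.cpu: "50"
      requests.memory: "100Gi"
  networkPolicy:
    isolate: true
```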

Consider additional patterns:

  • External policy engine (OPA, Cedar) for complex rules
  • Hierarchical workspaces (parent/child)
  • Cross-workspace sharing policies