ArenaConfig CRD

The ArenaConfig custom resource defines a test configuration that combines an ArenaSource with providers and evaluation settings. It bridges PromptKit bundles with Omnia’s existing Provider and ToolRegistry CRDs.

```yaml
apiVersion: omnia.altairalabs.ai/v1alpha1
kind: ArenaConfig
```

ArenaConfig provides:

  • Source binding: Reference an ArenaSource for the PromptKit bundle
  • Provider selection: Test scenarios against multiple LLM providers
  • Tool access: Make ToolRegistry tools available during evaluation
  • Scenario filtering: Include/exclude patterns for scenario selection
  • Self-play support: Configure agent vs agent evaluation
  • Evaluation tuning: Configure timeouts, retries, and concurrency

`sourceRef`: Reference to the ArenaSource containing the PromptKit bundle.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | Yes | Name of the ArenaSource |

```yaml
spec:
  sourceRef:
    name: customer-support-prompts
```

`scenarios`: Filters which scenarios to run from the bundle.

| Field | Type | Description |
|-------|------|-------------|
| include | []string | Glob patterns for scenarios to include |
| exclude | []string | Glob patterns for scenarios to exclude |

```yaml
spec:
  scenarios:
    include:
      - "scenarios/billing-*.yaml"
      - "scenarios/support-*.yaml"
    exclude:
      - "*-wip.yaml"
      - "scenarios/experimental/*"
```

Pattern matching:

  • Uses glob syntax (`*` matches any characters, `**` matches across path segments)
  • Exclusions are applied after inclusions
  • If include is empty, all scenarios are included by default
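
Putting these rules together: because an empty `include` selects every scenario, an exclude-only filter is a concise way to run everything except, say, work-in-progress files (an illustrative sketch; the pattern shown is hypothetical):

```yaml
spec:
  scenarios:
    # include omitted: every scenario in the bundle is selected first
    exclude:
      - "*-wip.yaml"   # then work-in-progress scenarios are removed
```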

`providers`: List of Provider CRDs to use for LLM credentials. Each provider is tested against all selected scenarios.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | Yes | Name of the Provider |
| namespace | string | No | Namespace (defaults to the config's namespace) |

```yaml
spec:
  providers:
    - name: claude-sonnet
    - name: gpt-4o
    - name: gemini-pro
      namespace: shared-providers
```

`toolRegistries`: List of ToolRegistry CRDs to make available during evaluation. Tools from all registries are merged.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | Yes | Name of the ToolRegistry |
| namespace | string | No | Namespace (defaults to the config's namespace) |

```yaml
spec:
  toolRegistries:
    - name: customer-tools
    - name: billing-tools
```
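
Because `namespace` defaults to the ArenaConfig's own namespace, pulling in a registry shared across teams only requires naming it explicitly (a sketch; the `shared-tools` namespace is hypothetical):

```yaml
spec:
  toolRegistries:
    - name: customer-tools        # resolved in the ArenaConfig's namespace
    - name: billing-tools
      namespace: shared-tools     # hypothetical cross-namespace reference
```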

`selfPlay`: Configures self-play evaluation, where agents compete against each other.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| enabled | boolean | false | Enable self-play mode |
| rounds | integer | 1 | Number of rounds per scenario |
| swapRoles | boolean | false | Alternate roles between rounds |

```yaml
spec:
  selfPlay:
    enabled: true
    rounds: 3
    swapRoles: true
```
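
With `swapRoles: true`, roles alternate between rounds, so an even round count gives each agent the same number of turns in each role (a sketch of the semantics described in the table above):

```yaml
spec:
  selfPlay:
    enabled: true
    rounds: 2          # round 1: original roles; round 2: roles swapped
    swapRoles: true
```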

`evaluation`: Configures evaluation criteria and execution settings.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| metrics | []string | - | Metrics to collect (latency, tokens, cost, quality) |
| timeout | string | "5m" | Maximum duration per evaluation |
| maxRetries | integer | 3 | Maximum retries for failures (0-10) |
| concurrency | integer | 1 | Parallel evaluations per worker (1-100) |

```yaml
spec:
  evaluation:
    metrics:
      - latency
      - tokens
      - cost
      - quality
    timeout: 10m
    maxRetries: 3
    concurrency: 5
```

`suspend`: When true, prevents new jobs from being created. Existing jobs continue running.

```yaml
spec:
  suspend: true
```

The status `phase` field reports validation progress:

| Value | Description |
|-------|-------------|
| Pending | Config is being validated |
| Ready | Config is valid and ready for jobs |
| Invalid | Config has validation errors |
| Error | Error occurred during validation |

Information about the resolved ArenaSource.

| Field | Description |
|-------|-------------|
| revision | Artifact revision from the source |
| url | Artifact download URL |
| scenarioCount | Number of scenarios matching the filter |

List of validated provider names.

Status conditions report the result of each validation step:

| Type | Description |
|------|-------------|
| Ready | Overall readiness of the config |
| SourceResolved | ArenaSource successfully resolved |
| ProvidersValid | All provider references are valid |
| ToolRegistriesValid | All tool registry references are valid |

Timestamp of the last successful validation.
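
Taken together, a fully validated config's status might look like the following (a sketch assuming standard Kubernetes condition objects; values are illustrative):

```yaml
status:
  phase: Ready
  conditions:
    - type: SourceResolved
      status: "True"
    - type: ProvidersValid
      status: "True"
    - type: ToolRegistriesValid
      status: "True"
    - type: Ready
      status: "True"
```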

Basic evaluation:

```yaml
apiVersion: omnia.altairalabs.ai/v1alpha1
kind: ArenaConfig
metadata:
  name: basic-eval
  namespace: arena
spec:
  sourceRef:
    name: my-prompts
  providers:
    - name: claude-provider
  evaluation:
    timeout: 5m
```
Provider comparison across multiple models:

```yaml
apiVersion: omnia.altairalabs.ai/v1alpha1
kind: ArenaConfig
metadata:
  name: provider-comparison
  namespace: arena
spec:
  sourceRef:
    name: customer-support-prompts
  scenarios:
    include:
      - "scenarios/*.yaml"
    exclude:
      - "*-experimental.yaml"
  providers:
    - name: claude-sonnet
    - name: gpt-4o
    - name: gemini-pro
  evaluation:
    metrics:
      - latency
      - tokens
      - cost
      - quality
    timeout: 10m
    concurrency: 10
```
Self-play debate evaluation:

```yaml
apiVersion: omnia.altairalabs.ai/v1alpha1
kind: ArenaConfig
metadata:
  name: debate-eval
  namespace: arena
spec:
  sourceRef:
    name: debate-prompts
  providers:
    - name: claude-sonnet
  selfPlay:
    enabled: true
    rounds: 5
    swapRoles: true
  evaluation:
    timeout: 15m
    maxRetries: 2
```
Tool-enabled agent evaluation:

```yaml
apiVersion: omnia.altairalabs.ai/v1alpha1
kind: ArenaConfig
metadata:
  name: tool-eval
  namespace: arena
spec:
  sourceRef:
    name: agent-prompts
  providers:
    - name: claude-sonnet
  toolRegistries:
    - name: search-tools
    - name: calculator-tools
  evaluation:
    timeout: 5m
    concurrency: 5
```

ArenaConfig is referenced by ArenaJob to execute test runs:

```yaml
apiVersion: omnia.altairalabs.ai/v1alpha1
kind: ArenaJob
metadata:
  name: evaluation-001
  namespace: arena
spec:
  sourceRef:
    name: provider-comparison
```
  1. Create ArenaSource - Define where to fetch PromptKit bundles
  2. Create Providers - Configure LLM credentials
  3. Create ArenaConfig - Combine source with providers and settings
  4. Create ArenaJob - Execute the evaluation
```
ArenaSource ──┐
              ├──▶ ArenaConfig ──▶ ArenaJob ──▶ Results
Provider(s) ──┘
```