
Session Management

This document explains how Omnia manages conversation sessions for AI agents.

A session represents a single conversation between a client and an agent. It maintains:

  • Conversation history - All messages exchanged
  • Agent state - Internal state maintained by the agent
  • Metadata - Custom data attached to the session

Sessions enable multi-turn conversations where the agent remembers previous context.

stateDiagram-v2
    [*] --> New: Client connects
    New --> Active: Session created
    Active --> Active: Message exchanged (TTL reset)
    Active --> Expired: TTL exceeded
    Expired --> [*]: Data deleted
    Active --> Resumed: Client reconnects with session_id
    Resumed --> Active: History loaded

A new session is created when:

  1. A client connects without providing a session ID
  2. A client provides a session ID that doesn’t exist or has expired

The server assigns a unique session ID and returns it in the connected message.
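
As a rough illustration of ID assignment, the sketch below generates a random identifier. The function name newSessionID, the 16-byte length, and the "sess-" prefix (borrowed from the example ID used later in this document) are assumptions, not Omnia's actual scheme.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newSessionID returns a random identifier such as "sess-3f9a...".
// Hypothetical helper: the prefix and length are illustrative only.
func newSessionID() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	// 16 random bytes encode to 32 hex characters.
	return "sess-" + hex.EncodeToString(buf), nil
}

func main() {
	id, err := newSessionID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

Drawing the bytes from crypto/rand rather than a counter keeps IDs hard to guess, which matters because the ID alone is enough to resume a session.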

While active, a session:

  • Stores new messages as they’re exchanged
  • Maintains the agent’s internal state
  • Tracks the last activity timestamp

Sessions expire after a period of inactivity defined by session.ttl. When expired:

  • The session data is deleted
  • Attempting to resume creates a new session
  • The conversation history is lost

The in-memory store is the simplest option, suitable for development:

session:
  type: memory
  ttl: 1h

Characteristics:

  • Fast access
  • No external dependencies
  • Lost on pod restart
  • Not suitable for multiple replicas

The Redis store provides production-ready distributed storage:

session:
  type: redis
  ttl: 24h
  storeRef:
    name: redis-credentials
    key: url

Characteristics:

  • Persistent across restarts
  • Works with multiple replicas
  • Supports large session counts
  • Requires Redis infrastructure

Clients can resume a session by including its session ID in a message:

{
  "type": "message",
  "session_id": "sess-abc123",
  "content": "What did we discuss earlier?"
}

The resume flow works as follows:

  1. Client sends message with session ID
  2. Server looks up session in store
  3. If found and not expired:
    • Load conversation history
    • Process new message with context
  4. If not found or expired:
    • Create new session
    • Process message without history
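
The lookup steps above can be sketched with a small in-memory store. Store, Resolve, and Append are hypothetical names, the types are trimmed to the fields the flow needs, and the counter-based IDs are for readability only. The sketch is also not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

type Message struct {
	Role    string
	Content string
}

type Session struct {
	ID        string
	Messages  []Message
	UpdatedAt time.Time
}

// Store is a hypothetical in-memory session store keyed by session ID.
type Store struct {
	ttl      time.Duration
	sessions map[string]*Session
	nextID   int
}

func NewStore(ttl time.Duration) *Store {
	return &Store{ttl: ttl, sessions: make(map[string]*Session)}
}

// Resolve returns the session for id when it exists and has not expired.
// Otherwise it creates a fresh session. resumed reports which path was taken.
func (s *Store) Resolve(id string, now time.Time) (sess *Session, resumed bool) {
	if existing, ok := s.sessions[id]; ok {
		if now.Sub(existing.UpdatedAt) <= s.ttl {
			return existing, true
		}
		delete(s.sessions, id) // expired: drop the stale data
	}
	s.nextID++
	sess = &Session{ID: fmt.Sprintf("sess-%d", s.nextID), UpdatedAt: now}
	s.sessions[sess.ID] = sess
	return sess, false
}

// Append records a message and refreshes the activity timestamp.
func (s *Store) Append(sess *Session, m Message, now time.Time) {
	sess.Messages = append(sess.Messages, m)
	sess.UpdatedAt = now
}

func main() {
	store := NewStore(time.Hour)
	t0 := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

	first, _ := store.Resolve("", t0) // no ID supplied: new session
	store.Append(first, Message{Role: "user", Content: "hello"}, t0)

	again, resumed := store.Resolve(first.ID, t0.Add(30*time.Minute))
	fmt.Println(resumed, len(again.Messages)) // true 1

	late, resumed := store.Resolve(first.ID, t0.Add(3*time.Hour))
	fmt.Println(resumed, len(late.Messages)) // false 0
}
```

Passing the current time in as a parameter, rather than calling time.Now inside the store, keeps the expiry logic easy to test.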

With Redis sessions, clients can resume on any replica:

graph LR
    C((Client)) --> LB[Load Balancer]
    LB --> P1[Agent Pod 1]
    LB --> P2[Agent Pod 2]
    LB --> P3[Agent Pod 3]
    P1 --> R[(Redis)]
    P2 --> R
    P3 --> R

Each session stores:

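type Session struct {
	ID        string                 // Unique session identifier
	AgentName string                 // Name of the agent serving the session
	Messages  []Message              // Conversation history
	State     map[string]interface{} // Agent-specific state
	Metadata  map[string]interface{} // Client-provided metadata
	CreatedAt time.Time
	UpdatedAt time.Time // Last activity; refreshed on each message
}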

The conversation history is a list of messages:

type Message struct {
	Role      string // "user", "assistant", "tool"
	Content   string
	Timestamp time.Time
}

State holds agent-specific data that persists across messages. It is useful for:

  • Tracking conversation progress
  • Storing extracted entities
  • Managing multi-step workflows
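
One practical detail when reading values back from a map[string]interface{}: if the store serializes state as JSON (an assumption here; this document does not specify the encoding), Go's encoding/json decodes every number into float64. The sketch below shows a workflow step counter surviving a round trip. The helpers roundTrip and stepOf are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip simulates saving state to a JSON-based store and loading it back.
func roundTrip(state map[string]interface{}) (map[string]interface{}, error) {
	raw, err := json.Marshal(state)
	if err != nil {
		return nil, err
	}
	var out map[string]interface{}
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, err
	}
	return out, nil
}

// stepOf reads the workflow step whether it is still an int (fresh state)
// or has been decoded as float64 (state loaded from JSON).
func stepOf(state map[string]interface{}) int {
	switch v := state["step"].(type) {
	case int:
		return v
	case float64:
		return int(v)
	default:
		return 0 // missing or unexpected type: treat as not started
	}
}

func main() {
	state := map[string]interface{}{
		"step":     2,
		"entities": []string{"order-42"},
	}
	restored, err := roundTrip(state)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T %d\n", restored["step"], stepOf(restored)) // float64 2
}
```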

Clients can attach metadata to a message:

{
  "type": "message",
  "content": "...",
  "metadata": {
    "user_id": "user-123",
    "source": "mobile-app"
  }
}

Use Case                 Recommended TTL
Quick queries            15m - 1h
Support conversations    1h - 4h
Ongoing projects         24h - 168h
Persistent context       Use external state

Longer TTLs require more storage:

  • Each message adds to session size
  • Tool calls include full arguments/results
  • Consider message pruning for long sessions
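
One simple pruning strategy is to keep only the most recent messages. The sketch below assumes a hypothetical pruneMessages helper and a fixed message limit. Other strategies, such as summarizing older turns, are equally possible.

```go
package main

import "fmt"

type Message struct {
	Role    string
	Content string
}

// pruneMessages keeps only the most recent limit messages.
func pruneMessages(msgs []Message, limit int) []Message {
	if limit <= 0 {
		return nil
	}
	if len(msgs) <= limit {
		return msgs
	}
	// Copy into a new slice so the dropped messages can be garbage collected.
	kept := make([]Message, limit)
	copy(kept, msgs[len(msgs)-limit:])
	return kept
}

func main() {
	history := []Message{
		{Role: "user", Content: "a"},
		{Role: "assistant", Content: "b"},
		{Role: "user", Content: "c"},
	}
	kept := pruneMessages(history, 2)
	fmt.Println(len(kept), kept[0].Content) // 2 b
}
```

Dropping the oldest messages bounds session size, at the cost of the agent losing early context.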

TTL is refreshed on every activity:

  1. Client sends message
  2. Server updates UpdatedAt timestamp
  3. TTL countdown restarts
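
This sliding expiration can be sketched as a timestamp comparison. Touch and Expired are hypothetical method names. With a Redis store the same effect would typically come from resetting the key's expiry on each write, but that detail is not specified here.

```go
package main

import (
	"fmt"
	"time"
)

type Session struct {
	ID        string
	UpdatedAt time.Time
}

// Touch records activity, which restarts the TTL countdown.
func (s *Session) Touch(now time.Time) {
	s.UpdatedAt = now
}

// Expired reports whether more than ttl has passed since the last activity.
func (s *Session) Expired(ttl time.Duration, now time.Time) bool {
	return now.After(s.UpdatedAt.Add(ttl))
}

func main() {
	ttl := time.Hour
	t0 := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	s := &Session{ID: "sess-abc123", UpdatedAt: t0}

	// A message at +45m pushes the expiry from +60m out to +105m.
	s.Touch(t0.Add(45 * time.Minute))

	fmt.Println(s.Expired(ttl, t0.Add(90*time.Minute))) // false
	fmt.Println(s.Expired(ttl, t0.Add(2*time.Hour)))    // true
}
```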

With one replica, in-memory sessions work, with two limitations:

  • Sessions are lost on pod restart
  • No horizontal scaling

With multiple replicas, use Redis so that every replica sees the same sessions:

  1. Install Redis in your cluster
  2. Configure session.type: redis
  3. All replicas share session state

If you can’t use Redis, configure service affinity:

apiVersion: v1
kind: Service
spec:
  sessionAffinity: ClientIP

This routes the same client to the same pod, but:

  • Sessions are still lost on pod restart
  • Uneven load distribution
  • Not recommended for production

  1. Use Redis for production - Always use Redis with multiple replicas
  2. Set appropriate TTLs - Balance memory usage with user experience
  3. Handle expiration gracefully - Clients should expect session loss
  4. Don’t store sensitive data - Sessions may be logged or cached
  5. Monitor session counts - Alert on unusual growth
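
To handle expiration gracefully, a client can compare the session ID it last held with the one the server returns on connect. The sketch below assumes the connected message is JSON with "type" and "session_id" fields, mirroring the message examples in this document. The exact wire format, the Client type, and HandleConnected are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connectedMsg models the server's connected message. The field names are
// assumptions based on the keys used in the client message examples.
type connectedMsg struct {
	Type      string `json:"type"`
	SessionID string `json:"session_id"`
}

// Client remembers the session ID it was last given.
type Client struct {
	sessionID string
}

// HandleConnected stores the session ID from the server and reports whether
// a previously held session was replaced, meaning its history is gone.
func (c *Client) HandleConnected(raw []byte) (lost bool, err error) {
	var msg connectedMsg
	if err := json.Unmarshal(raw, &msg); err != nil {
		return false, err
	}
	if msg.Type != "connected" {
		return false, fmt.Errorf("unexpected message type %q", msg.Type)
	}
	lost = c.sessionID != "" && c.sessionID != msg.SessionID
	c.sessionID = msg.SessionID
	return lost, nil
}

func main() {
	c := &Client{}

	lost, _ := c.HandleConnected([]byte(`{"type":"connected","session_id":"sess-abc123"}`))
	fmt.Println(lost) // false: first connection, nothing to lose

	lost, _ = c.HandleConnected([]byte(`{"type":"connected","session_id":"sess-def456"}`))
	fmt.Println(lost) // true: the server issued a new session
}
```

When lost is true, the client can tell the user that earlier context is unavailable instead of silently continuing.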