Session Management
This document explains how Omnia manages conversation sessions for AI agents.
What is a Session?
A session represents a single conversation between a client and an agent. It maintains:
- Conversation history - All messages exchanged
- Agent state - Internal state maintained by the agent
- Metadata - Custom data attached to the session
Sessions enable multi-turn conversations where the agent remembers previous context.
Session Lifecycle
    stateDiagram-v2
        [*] --> New: Client connects
        New --> Active: Session created
        Active --> Active: Message exchanged (TTL reset)
        Active --> Expired: TTL exceeded
        Expired --> [*]: Data deleted
        Active --> Resumed: Client reconnects with session_id
        Resumed --> Active: History loaded

Creation
A new session is created when:
- A client connects without providing a session ID
- A client provides a session ID that doesn’t exist or has expired
The server assigns a unique session ID and returns it in the connected message.
Active
While active, a session:
- Stores new messages as they’re exchanged
- Maintains the agent’s internal state
- Tracks the last activity timestamp
Expiration
Sessions expire after a period of inactivity defined by session.ttl. When expired:
- The session data is deleted
- Attempting to resume creates a new session
- The conversation history is lost
Session Stores
In-Memory Store
The simplest option, suitable for development:
    session:
      type: memory
      ttl: 1h

Characteristics:
- Fast access
- No external dependencies
- Lost on pod restart
- Not suitable for multiple replicas
Redis Store
Production-ready distributed storage:
    session:
      type: redis
      ttl: 24h
      storeRef:
        name: redis-credentials
        key: url

Characteristics:
- Persistent across restarts
- Works with multiple replicas
- Supports large session counts
- Requires Redis infrastructure
Session Resumption
Clients can resume sessions by including the session ID:
    {
      "type": "message",
      "session_id": "sess-abc123",
      "content": "What did we discuss earlier?"
    }

Resumption Flow
- Client sends a message with a session ID
- Server looks up the session in the store
- If found and not expired:
  - Load conversation history
  - Process new message with context
- If not found or expired:
  - Create new session
  - Process message without history
Cross-Replica Resumption
With Redis sessions, clients can resume on any replica:
    graph LR
        C((Client)) --> LB[Load Balancer]
        LB --> P1[Agent Pod 1]
        LB --> P2[Agent Pod 2]
        LB --> P3[Agent Pod 3]
        P1 --> R[(Redis)]
        P2 --> R
        P3 --> R

Session Data Structure
Each session stores:
    type Session struct {
        ID        string
        AgentName string
        Messages  []Message
        State     map[string]interface{}
        Metadata  map[string]interface{}
        CreatedAt time.Time
        UpdatedAt time.Time
    }

Messages
The conversation history:
    type Message struct {
        Role      string // "user", "assistant", "tool"
        Content   string
        Timestamp time.Time
    }

State
Agent-specific state that persists across messages. Useful for:
- Tracking conversation progress
- Storing extracted entities
- Managing multi-step workflows
Metadata
Client-provided metadata:
    {
      "type": "message",
      "content": "...",
      "metadata": {
        "user_id": "user-123",
        "source": "mobile-app"
      }
    }

TTL Considerations
Choosing a TTL
| Use Case | Recommended TTL |
|---|---|
| Quick queries | 15m - 1h |
| Support conversations | 1h - 4h |
| Ongoing projects | 24h - 168h |
| Persistent context | Use external state |
TTL and Memory
Longer TTLs require more storage:
- Each message adds to session size
- Tool calls include full arguments/results
- Consider message pruning for long sessions
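One simple pruning strategy is to keep only the most recent messages. The sketch below is illustrative: `pruneMessages` and the cutoff of two messages are hypothetical choices, and a real agent may prefer summarizing older turns or pruning by size rather than by count.

```go
package main

import "fmt"

type message struct {
	Role    string
	Content string
}

// pruneMessages keeps at most max of the most recent messages.
// It copies the kept tail into a new slice so the dropped messages
// are no longer referenced and can be garbage collected.
func pruneMessages(msgs []message, max int) []message {
	if max <= 0 {
		return nil
	}
	if len(msgs) <= max {
		return msgs
	}
	kept := make([]message, max)
	copy(kept, msgs[len(msgs)-max:])
	return kept
}

func main() {
	history := []message{
		{Role: "user", Content: "first"},
		{Role: "assistant", Content: "second"},
		{Role: "user", Content: "third"},
		{Role: "assistant", Content: "fourth"},
	}
	pruned := pruneMessages(history, 2)
	fmt.Println(len(pruned), pruned[0].Content) // 2 third
}
```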
TTL Refresh
TTL is refreshed on every activity:
- Client sends message
- Server updates the UpdatedAt timestamp
- TTL countdown restarts
Scaling Considerations
Single Replica
With one replica, in-memory sessions work fine. But:
- Sessions are lost on pod restart
- No horizontal scaling
Multiple Replicas
With multiple replicas, you must use Redis:
- Install Redis in your cluster
- Configure session.type: redis
- All replicas share session state
Session Affinity Alternative
If you can’t use Redis, configure service affinity:
    apiVersion: v1
    kind: Service
    spec:
      sessionAffinity: ClientIP

This routes the same client to the same pod, but:
- Sessions are still lost on pod restart
- Uneven load distribution
- Not recommended for production
Best Practices
- Use Redis for production - Always use Redis with multiple replicas
- Set appropriate TTLs - Balance memory usage with user experience
- Handle expiration gracefully - Clients should expect session loss
- Don’t store sensitive data - Sessions may be logged or cached
- Monitor session counts - Alert on unusual growth