DuckDB Backend¶
Overview¶
DuckDB is an embedded analytical database (OLAP) optimized for complex queries and aggregations. While not designed for high-concurrency transactional workloads, DuckDB excels at session analytics, reporting, and embedded use cases with zero-configuration setup.
Key Features:
Embedded Database: No server setup required, single-file or in-memory
Native JSON Support: Efficient JSON type for state and metadata storage
Columnar Storage: Optimized for analytical queries on session data
ACID Transactions: Reliable transaction support
SQL Analytics: Advanced SQL features for session analysis
Zero Configuration: Perfect for development and testing
Ideal Use Cases:
Development and testing environments
Session analytics and reporting dashboards
Embedded applications requiring local data storage
Offline analysis of exported session logs
Prototyping AI agent applications
Warning
DuckDB is optimized for OLAP workloads, not high-frequency transactional operations. For production AI agents with concurrent user sessions and frequent writes, use PostgreSQL or MySQL. DuckDB is best suited for analytics, development, and embedded scenarios.
Installation¶
Install SQLSpec with DuckDB support:
pip install sqlspec[duckdb] google-genai
# or
uv pip install sqlspec[duckdb] google-genai
Quick Start¶
Basic File-Based Database¶
from sqlspec.adapters.duckdb import DuckDBConfig
from sqlspec.adapters.duckdb.adk import DuckdbADKStore
from sqlspec.extensions.adk import SQLSpecSessionService
# Create file-based database
config = DuckDBConfig(database="./agent_sessions.duckdb")
store = DuckdbADKStore(config)
store.create_tables()
service = SQLSpecSessionService(store)
# Create session
session = service.create_session(
app_name="analytics_agent",
user_id="analyst_1",
state={"dashboard": "active"}
)
In-Memory Database (Testing)¶
from sqlspec.adapters.duckdb import DuckDBConfig
from sqlspec.adapters.duckdb.adk import DuckdbADKStore
# Create in-memory database
config = DuckDBConfig(database=":memory:")
store = DuckdbADKStore(config)
store.create_tables()
Tip
In-memory databases are perfect for unit tests and ephemeral workloads. All data is lost when the process exits.
Configuration¶
Basic Configuration¶
from sqlspec.adapters.duckdb import DuckDBConfig
config = DuckDBConfig(
database="/path/to/database.duckdb", # or ":memory:"
read_only=False,
config={
"threads": 4,
"max_memory": "1GB",
}
)
Custom Table Names¶
store = DuckdbADKStore(
config,
session_table="agent_sessions",
events_table="agent_events"
)
Schema¶
Sessions Table¶
CREATE TABLE IF NOT EXISTS adk_sessions (
id VARCHAR PRIMARY KEY,
app_name VARCHAR NOT NULL,
user_id VARCHAR NOT NULL,
state JSON NOT NULL, -- Native JSON type
create_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
update_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_adk_sessions_app_user
ON adk_sessions(app_name, user_id);
CREATE INDEX idx_adk_sessions_update_time
ON adk_sessions(update_time DESC);
Events Table¶
CREATE TABLE IF NOT EXISTS adk_events (
id VARCHAR PRIMARY KEY,
session_id VARCHAR NOT NULL,
app_name VARCHAR NOT NULL,
user_id VARCHAR NOT NULL,
invocation_id VARCHAR,
author VARCHAR,
actions BLOB,
long_running_tool_ids_json VARCHAR,
branch VARCHAR,
timestamp TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
content JSON,
grounding_metadata JSON,
custom_metadata JSON,
partial BOOLEAN,
turn_complete BOOLEAN,
interrupted BOOLEAN,
error_code VARCHAR,
error_message VARCHAR,
FOREIGN KEY (session_id) REFERENCES adk_sessions(id)
);
CREATE INDEX idx_adk_events_session
ON adk_events(session_id, timestamp ASC);
Note
DuckDB supports foreign keys but does not support CASCADE deletes. The store manually deletes events when a session is deleted.
Analytical Queries¶
DuckDB’s strength is analytical SQL. Here are examples for session analysis:
Session Activity by User¶
SELECT
user_id,
COUNT(*) as total_sessions,
AVG(julianday(update_time) - julianday(create_time)) as avg_duration_days
FROM adk_sessions
WHERE app_name = 'my_agent'
GROUP BY user_id
ORDER BY total_sessions DESC
LIMIT 10;
Event Distribution¶
SELECT
author,
COUNT(*) as event_count,
COUNT(DISTINCT session_id) as sessions_with_events
FROM adk_events
WHERE app_name = 'my_agent'
GROUP BY author;
Most Active Sessions¶
SELECT
s.id,
s.user_id,
COUNT(e.id) as event_count,
MIN(e.timestamp) as first_event,
MAX(e.timestamp) as last_event
FROM adk_sessions s
LEFT JOIN adk_events e ON s.id = e.session_id
WHERE s.app_name = 'my_agent'
GROUP BY s.id, s.user_id
ORDER BY event_count DESC
LIMIT 20;
JSON Extraction¶
-- Extract values from session state
SELECT
id,
user_id,
json_extract(state, '$.dashboard') as dashboard_type,
json_extract(state, '$.filters.date_range') as date_range
FROM adk_sessions
WHERE app_name = 'analytics_bot';
Time-Series Analysis¶
-- Events per hour
SELECT
DATE_TRUNC('hour', timestamp) as hour,
COUNT(*) as event_count,
COUNT(DISTINCT session_id) as active_sessions
FROM adk_events
WHERE app_name = 'my_agent'
AND timestamp >= CURRENT_TIMESTAMP - INTERVAL 24 HOUR
GROUP BY hour
ORDER BY hour;
Use Cases¶
Development & Testing¶
DuckDB’s zero-configuration setup makes it ideal for development:
# Quick setup for development
config = DuckDBConfig(database=":memory:")
store = DuckdbADKStore(config)
store.create_tables()
# No database server needed!
service = SQLSpecSessionService(store)
session = service.create_session("dev_app", "dev_user", {})
Session Analytics Dashboard¶
Build analytics on top of session data:
import duckdb
# Connect to existing DuckDB database
conn = duckdb.connect("agent_sessions.duckdb")
# Run analytical query
result = conn.execute("""
SELECT
DATE_TRUNC('day', create_time) as day,
COUNT(*) as sessions_created,
COUNT(DISTINCT user_id) as unique_users
FROM adk_sessions
WHERE app_name = 'my_agent'
GROUP BY day
ORDER BY day DESC
LIMIT 30
""").fetchall()
for day, sessions, users in result:
print(f"{day}: {sessions} sessions, {users} unique users")
Embedded Applications¶
Embed DuckDB in desktop applications:
from pathlib import Path
# Store database in application data directory
app_data = Path.home() / ".my_agent" / "sessions.duckdb"
app_data.parent.mkdir(parents=True, exist_ok=True)
config = DuckDBConfig(database=str(app_data))
store = DuckdbADKStore(config)
store.create_tables()
Performance Characteristics¶
Strengths¶
Analytical Queries: Excellent for aggregations, joins, and complex analytics
Columnar Storage: Efficient for scanning large datasets
Single-File Portability: Easy to backup, copy, and deploy
Memory Efficiency: Can handle datasets larger than RAM
SQL Features: Advanced SQL analytics functions available
Limitations¶
Concurrent Writes: Limited support for concurrent INSERT/UPDATE/DELETE
No CASCADE Deletes: Must manually handle cascading deletes
Transaction Model: Optimized for read-heavy workloads
Single Writer: Only one write transaction at a time
When to Use DuckDB¶
Ideal For:
✅ Development and testing environments ✅ Session analytics and reporting ✅ Embedded applications (desktop, mobile) ✅ Offline analysis of session logs ✅ Prototyping and demos ✅ Data science workflows on session data
Consider PostgreSQL Instead When:
❌ High-concurrency production AI agent (many simultaneous users) ❌ Frequent transactional updates required ❌ Need server-based deployment with connection pooling ❌ Require JSONB indexing for performance ❌ Need CASCADE deletes and full referential integrity
Comparison: DuckDB vs PostgreSQL¶
Feature |
DuckDB |
PostgreSQL |
|---|---|---|
Setup Complexity |
Zero config, embedded |
Requires server setup |
Concurrent Writes |
Limited |
Excellent |
Analytical Queries |
Excellent |
Good |
JSON Support |
Native JSON type |
Native JSONB with indexes |
Deployment |
Single file |
Client-server |
Best Use Case |
Analytics, development |
Production AI agents |
Example: Full Application¶
Follow the same script as ADK Session Service (AioSQLite) and swap in
DuckDBConfig plus the DuckDB ADK store. Because DuckDB runs in-process, you can drop the
connection pooling configuration entirely while keeping the session service logic identical.
Troubleshooting¶
Foreign Key Constraint Errors¶
If you see foreign key errors, ensure the session exists before creating events:
# Always create session first
session = service.create_session("app", "user", {})
# Then create events
event = service.append_event(session, user_event)
Database File Locked¶
DuckDB uses file locking. If you see “database is locked” errors:
# Close connection properly
store.close() # If available
# Or use in-memory for testing
config = DuckDBConfig(database=":memory:")
Migration from DuckDB to PostgreSQL¶
When your prototype becomes production, migrate to PostgreSQL:
# Export from DuckDB
import duckdb
duck_conn = duckdb.connect("agent_sessions.duckdb")
sessions = duck_conn.execute("SELECT * FROM adk_sessions").fetchall()
events = duck_conn.execute("SELECT * FROM adk_events").fetchall()
# Import to PostgreSQL
from sqlspec.adapters.asyncpg import AsyncpgConfig
from sqlspec.adapters.asyncpg.adk import AsyncpgADKStore
pg_config = AsyncpgConfig(pool_config={"dsn": "postgresql://..."})
pg_store = AsyncpgADKStore(pg_config)
await pg_store.create_tables()
# Insert data (handle async properly)
for session in sessions:
await pg_store.create_session(
session_id=session[0],
app_name=session[1],
user_id=session[2],
state=session[3]
)
API Reference¶
- class sqlspec.adapters.duckdb.adk.DuckdbADKStore[source]
Bases:
BaseSyncADKStore[DuckDBConfig]DuckDB ADK store for Google Agent Development Kit.
Implements session and event storage for Google Agent Development Kit using DuckDB’s synchronous driver. Provides: - Session state management with native JSON type - Event history tracking with BLOB-serialized actions - Native TIMESTAMP type support - Foreign key constraints (manual cascade in delete_session) - Columnar storage for analytical queries
- Parameters:
config¶ (
DuckDBConfig) – DuckDBConfig with extension_config[“adk”] settings.
Example
from sqlspec.adapters.duckdb import DuckDBConfig from sqlspec.adapters.duckdb.adk import DuckdbADKStore
- config = DuckDBConfig(
database=”sessions.ddb”, extension_config={
- “adk”: {
“session_table”: “my_sessions”, “events_table”: “my_events”, “owner_id_column”: “tenant_id INTEGER REFERENCES tenants(id)”
}
}
) store = DuckdbADKStore(config) store.create_tables()
- session = store.create_session(
session_id=”session-123”, app_name=”my-app”, user_id=”user-456”, state={“context”: “conversation”}
)
Notes
Uses DuckDB native JSON type (not JSONB)
TIMESTAMP for date/time storage with microsecond precision
BLOB for binary actions data
BOOLEAN native type support
Columnar storage provides excellent analytical query performance
DuckDB doesn’t support CASCADE in foreign keys (manual cascade required)
Optimized for OLAP workloads; for high-concurrency writes use PostgreSQL
Configuration is read from config.extension_config[“adk”]
- __init__(config)[source]
Initialize DuckDB ADK store.
- Parameters:
config¶ (
DuckDBConfig) – DuckDBConfig instance.
Notes
Configuration is read from config.extension_config[“adk”]: - session_table: Sessions table name (default: “adk_sessions”) - events_table: Events table name (default: “adk_events”) - owner_id_column: Optional owner FK column DDL (default: None)
- create_tables()[source]
Create both sessions and events tables if they don’t exist.
- Return type:
- create_session(session_id, app_name, user_id, state, owner_id=None)[source]
Create a new session.
- Parameters:
- Return type:
- Returns:
Created session record.
Notes
Uses current UTC timestamp for create_time and update_time. State is JSON-serialized using SQLSpec serializers.
- get_session(session_id)[source]
Get session by ID.
- Parameters:
- Return type:
- Returns:
Session record or None if not found.
Notes
DuckDB returns datetime objects for TIMESTAMP columns. JSON is parsed from database storage.
- update_session_state(session_id, state)[source]
Update session state.
- Parameters:
- Return type:
Notes
This replaces the entire state dictionary. Update time is automatically set to current UTC timestamp.
- delete_session(session_id)[source]
Delete session and all associated events.
Notes
DuckDB doesn’t support CASCADE in foreign keys, so we manually delete events first.
- list_sessions(app_name, user_id=None)[source]
List sessions for an app, optionally filtered by user.
- Parameters:
- Return type:
- Returns:
List of session records ordered by update_time DESC.
Notes
Uses composite index on (app_name, user_id) when user_id is provided.
- create_event(event_id, session_id, app_name, user_id, author=None, actions=None, content=None, **kwargs)[source]
Create a new event.
- Parameters:
- Return type:
- Returns:
Created event record.
Notes
Uses current UTC timestamp if not provided in kwargs. JSON fields are serialized using SQLSpec serializers.
- get_event(event_id)[source]
Get event by ID.
- Parameters:
- Return type:
- Returns:
Event record or None if not found.
- list_events(session_id)[source]
List events for a session ordered by timestamp.
- Parameters:
- Return type:
- Returns:
List of event records ordered by timestamp ASC.
- property config: ConfigT
Return the database configuration.
- property events_table: str
Return the events table name.
- property owner_id_column_ddl: str | None
Return the full owner ID column DDL (or None if not configured).
- property owner_id_column_name: str | None
Return the owner ID column name only (or None if not configured).
- property session_table: str
Return the sessions table name.
See Also¶
Quick Start - Quick start guide
Database Adapters - Adapter comparison
Schema Reference - Database schema details
ADK Session Service (AioSQLite) - Reference implementation (swap in DuckDB config)
DuckDB Documentation - Official DuckDB documentation
DuckDB SQL Reference - SQL syntax and functions