DuckDB¶

Sync DuckDB adapter with full Arrow integration, extension management, and secret configuration. DuckDB excels at analytical workloads and can query Parquet, CSV, and JSON files directly.

Configuration¶

class sqlspec.adapters.duckdb.DuckDBConfig[source]¶

Bases: SyncDatabaseConfig[DuckDBPyConnection, DuckDBConnectionPool, DuckDBDriver]

DuckDB configuration with connection pooling.

This configuration supports DuckDB's features including:

Connection pooling
Extension management and installation
Secret management for API integrations
Auto configuration settings
Arrow integration
Direct file querying capabilities
Configurable type handlers for JSON serialization and UUID conversion

DuckDB Connection Pool Configuration:

Default pool size is 1-4 connections (DuckDB uses single connection by default)
Connection recycling is set to 24 hours by default (set to 0 to disable)
Shared memory databases use :memory:shared_db for proper concurrency

Type Handler Configuration via driver_features:

json_serializer: Custom JSON serializer for dict/list parameters.
Defaults to sqlspec.utils.serializers.to_json if not provided. Accepts serializer callables that return text.

enable_uuid_conversion: Enable automatic UUID string conversion (default: True).

When True, UUID strings in query results are automatically converted to UUID objects. When False, UUID strings are treated as regular strings.

driver_type¶: alias of DuckDBDriver

__init__(*, connection_config=None, connection_instance=None, migration_config=None, statement_config=None, driver_features=None, bind_key=None, extension_config=None, observability_config=None, **kwargs)[source]¶

Initialize DuckDB configuration.

Parameters:

connection_config¶ (DuckDBPoolParams | dict[str, typing.Any] | None) -- Connection and pool configuration parameters
connection_instance¶ (DuckDBConnectionPool | None) -- Pre-created pool instance
migration_config¶ (dict[str, typing.Any] | None) -- Migration configuration
statement_config¶ (StatementConfig | None) -- Statement configuration override
driver_features¶ (DuckDBDriverFeatures | dict[str, typing.Any] | None) -- DuckDB-specific driver features including json_serializer and enable_uuid_conversion options
bind_key¶ (str | None) -- Optional unique identifier for this configuration
extension_config¶ (dict[str, dict[str, Any] | LitestarConfig | FastAPIConfig | StarletteConfig | SanicConfig | FlaskConfig | ADKConfig | EventsConfig | OpenTelemetryConfig | PrometheusConfig] | None) -- Extension-specific configuration
observability_config¶ (ObservabilityConfig | None) -- Adapter-level observability overrides for lifecycle hooks and observers
**kwargs¶ (Any) -- Additional keyword arguments passed to the base configuration.

create_connection()[source]¶

Get a DuckDB connection from the pool.

This method ensures the pool is created and returns a connection from the pool. The connection is checked out from the pool and must be properly managed by the caller.

Returns:: A connection from the pool
Return type:: DuckDBPyConnection

get_signature_namespace()[source]¶

Get the signature namespace for DuckDB types.

This provides all DuckDB-specific types that Litestar needs to recognize to avoid serialization attempts.

Return type:: dict[str, typing.Any]
Returns:: Dictionary mapping type names to types.

get_event_runtime_hints()[source]¶

Return polling defaults optimized for DuckDB.

Return type:: EventRuntimeHints

class sqlspec.adapters.duckdb.config.DuckDBExtensionConfig[source]¶

Bases: TypedDict

DuckDB extension configuration for auto-management.

name: str¶: Name of the extension to install/load.

version: NotRequired[str]¶: Specific version of the extension.

repository: NotRequired[str]¶: Repository for the extension (core, community, or custom URL).

repository_url: NotRequired[str]¶: Custom repository URL for the extension.

force_install: NotRequired[bool]¶: Force reinstallation of the extension.

class sqlspec.adapters.duckdb.config.DuckDBSecretConfig[source]¶

Bases: TypedDict

DuckDB secret configuration for AI/API integrations.

secret_type: str¶: Type of secret.

name: str¶: Name of the secret.

value: NotRequired[dict[str, Any]]¶: Secret configuration values.

provider: NotRequired[str]¶: Secret provider, such as config or credential_chain.

scope: NotRequired[str]¶: Optional path or storage-prefix scope for the secret.

persistent: NotRequired[bool]¶: Persist the secret to DuckDB's configured secret directory.

Driver¶

class sqlspec.adapters.duckdb.DuckDBDriver[source]¶

Bases: SyncDriverAdapterBase

Synchronous DuckDB database driver.

Provides SQL statement execution, transaction management, and result handling for DuckDB databases. Supports multiple parameter styles including QMARK, NUMERIC, and NAMED_DOLLAR formats.

The driver handles script execution, batch operations, and integrates with the sqlspec.core modules for statement processing and caching.

__init__(connection, statement_config=None, driver_features=None)[source]¶

Initialize driver adapter with connection and configuration.

Parameters:

connection¶ (DuckDBPyConnection) -- Database connection instance
statement_config¶ (StatementConfig | None) -- Statement configuration for the driver
driver_features¶ (dict[str, typing.Any] | None) -- Driver-specific features like extensions, secrets, and connection callbacks
observability¶ -- Optional runtime handling lifecycle hooks, observers, and spans

dispatch_execute(cursor, statement)[source]¶

Execute single SQL statement with data handling.

Executes a SQL statement with parameter binding and processes the results. Handles both data-returning queries and data modification operations.

Parameters:

cursor¶ (DuckDBPyConnection) -- DuckDB cursor object
statement¶ (SQL) -- SQL statement to execute

Return type:

ExecutionResult

Returns:

ExecutionResult with execution metadata

dispatch_execute_many(cursor, statement)[source]¶

Execute SQL with multiple parameter sets using batch processing.

Uses DuckDB's executemany method for batch operations and calculates row counts for both data modification and query operations.

Parameters:

cursor¶ (DuckDBPyConnection) -- DuckDB cursor object
statement¶ (SQL) -- SQL statement with multiple parameter sets

Return type:

ExecutionResult

Returns:

ExecutionResult with batch execution metadata

dispatch_execute_script(cursor, statement)[source]¶

Execute SQL script with statement splitting and parameter handling.

Parses multi-statement scripts and executes each statement sequentially with the provided parameters.

Parameters:

cursor¶ (DuckDBPyConnection) -- DuckDB cursor object
statement¶ (SQL) -- SQL statement with script content

Return type:

ExecutionResult

Returns:

ExecutionResult with script execution metadata

begin()[source]¶

Begin a database transaction.

Return type:: None

commit()[source]¶

Commit the current transaction.

Return type:: None

rollback()[source]¶

Rollback the current transaction.

Return type:: None

set_migration_session_schema(schema)[source]¶

Set DuckDB search_path for migration SQL.

Return type:: None

has_schema(schema)[source]¶

Return whether a DuckDB schema exists.

Return type:: bool

with_cursor(connection)[source]¶

Create context manager for DuckDB cursor.

Parameters:: connection¶ (DuckDBPyConnection) -- DuckDB connection instance
Return type:: DuckDBCursor
Returns:: DuckDBCursor context manager instance

handle_database_exceptions()[source]¶

Handle database-specific exceptions and wrap them appropriately.

Return type:: DuckDBExceptionHandler
Returns:: Exception handler with deferred exception pattern for mypyc compatibility.

select_to_arrow(statement, /, *parameters, statement_config=None, return_format='table', native_only=False, batch_size=None, arrow_schema=None, **kwargs)[source]¶

Execute query and return results as Apache Arrow (DuckDB native path).

DuckDB provides native Arrow support via cursor.arrow(). This is the fastest path due to DuckDB's columnar architecture.

Parameters:

statement¶ -- SQL statement, string, or QueryBuilder
*parameters¶ -- Query parameters or filters
statement_config¶ -- Optional statement configuration override
return_format¶ -- "table" for pyarrow.Table (default), "batch" for RecordBatch, "batches" for list of RecordBatch, "reader" for RecordBatchReader
native_only¶ -- Ignored for DuckDB (always uses native path)
batch_size¶ -- Batch size hint (for future streaming implementation)
arrow_schema¶ -- Optional pyarrow.Schema for type casting
**kwargs¶ -- Additional keyword arguments

Returns:

ArrowResult with native Arrow data

select_to_storage(statement, destination, /, *parameters, statement_config=None, partitioner=None, format_hint=None, telemetry=None, **kwargs)[source]¶: Persist DuckDB query output to a storage backend using Arrow fast paths.

load_from_arrow(table, source, *, partitioner=None, overwrite=False, telemetry=None)[source]¶

Load Arrow data into DuckDB using temporary table registration.

Return type:: StorageBridgeJob

load_from_storage(table, source, *, file_format, partitioner=None, overwrite=False)[source]¶

Read an artifact from storage and load it into DuckDB.

Return type:: StorageBridgeJob

property data_dictionary: DuckDBDataDictionary¶

Get the data dictionary for this driver.

Returns:: Data dictionary instance for metadata queries

collect_rows(cursor, fetched)[source]¶

Collect DuckDB rows for the direct execution path.

Return type:: tuple[list[typing.Any], list[str], int]

resolve_rowcount(cursor)[source]¶

Resolve rowcount from DuckDB cursor for the direct execution path.

Return type:: int

Connection Pool¶

final class sqlspec.adapters.duckdb.DuckDBConnectionPool[source]¶

Bases: object

Thread-local connection manager for DuckDB.

Uses thread-local storage to ensure each thread gets its own DuckDB connection, preventing the thread-safety issues that cause segmentation faults when multiple cursors share the same connection concurrently.

This design trades traditional pooling for thread safety, which is essential for DuckDB since connections and cursors are not thread-safe.

__init__(connection_config, pool_recycle_seconds=86400, health_check_interval=30.0, extensions=None, extension_flags=None, secrets=None, on_connection_create=None)[source]¶

Initialize the thread-local connection manager.

Parameters:

connection_config¶ -- DuckDB connection configuration
pool_recycle_seconds¶ -- Connection recycle time in seconds
health_check_interval¶ -- Seconds of idle time before running health check
extensions¶ -- List of extensions to install/load
extension_flags¶ -- Connection-level SET statements applied after creation
secrets¶ -- List of secrets to create
on_connection_create¶ -- Callback executed when connection is created

get_connection()[source]¶

Get a thread-local connection.

Each thread gets its own dedicated DuckDB connection to prevent thread-safety issues with concurrent cursor operations.

For file-based databases, the connection is closed when the context manager exits to release DuckDB's file lock, allowing subsequent connections with different configurations.

For in-memory databases, connections are kept alive to preserve data, as in-memory data is lost when the last connection closes.

Yields:: DuckDBConnection -- A thread-local connection.

close()[source]¶

Close the thread-local connection if it exists.

Return type:: None

size()[source]¶

Get current pool size (always 1 for thread-local).

Return type:: int

checked_out()[source]¶

Get number of checked out connections (always 0 for thread-local).

Return type:: int

acquire()[source]¶

Acquire a thread-local connection.

Each thread gets its own dedicated DuckDB connection to prevent thread-safety issues with concurrent cursor operations.

Returns:: A thread-local connection
Return type:: DuckDBPyConnection

release(connection)[source]¶

Release a connection (no-op for thread-local connections).

Parameters:: connection¶ (DuckDBPyConnection) -- The connection to release (ignored)
Return type:: None

Data Dictionary¶

class sqlspec.adapters.duckdb.data_dictionary.DuckDBDataDictionary[source]¶

Bases: SyncDataDictionaryBase

DuckDB-specific sync data dictionary.

dialect: ClassVar[str] = 'duckdb'¶: Dialect identifier. Must be defined by subclasses as a class attribute.

__init__()[source]¶

get_version(driver)[source]¶

Get DuckDB database version information.

Parameters:: driver¶ (DuckDBDriver) -- DuckDB driver instance.
Return type:: VersionInfo | None
Returns:: DuckDB version information or None if detection fails.

get_feature_flag(driver, feature)[source]¶

Check if DuckDB database supports a specific feature.

Parameters:

driver¶ (DuckDBDriver) -- DuckDB driver instance.
feature¶ (str) -- Feature name to check.

Return type:

bool

Returns:

True if feature is supported, False otherwise.

get_optimal_type(driver, type_category)[source]¶

Get optimal DuckDB type for a category.

Parameters:

driver¶ (DuckDBDriver) -- DuckDB driver instance.
type_category¶ (str) -- Type category.

Return type:

str

Returns:

DuckDB-specific type name.

get_tables(driver, schema=None)[source]¶

Get tables sorted by topological dependency order using DuckDB catalog.

Return type:: list[TableMetadata]

get_columns(driver, table=None, schema=None)[source]¶

Get column information for a table or schema.

Return type:: list[ColumnMetadata]

get_indexes(driver, table=None, schema=None)[source]¶

Get index metadata for a table or schema.

Return type:: list[IndexMetadata]

get_foreign_keys(driver, table=None, schema=None)[source]¶

Get foreign key metadata.

Return type:: list[ForeignKeyMetadata]