arrow-odbc

Synchronous Arrow-over-ODBC adapter built on arrow-odbc. It streams query results as a pyarrow.RecordBatchReader from any ODBC-compliant driver, which makes it a good fit for read-heavy analytical transfers between ODBC sources (SQL Server, PostgreSQL, MySQL, and others) and the Arrow ecosystem.

SQL Server coverage is exercised in CI against SQL Server 2022 through pytest-databases and Microsoft ODBC Driver 18. The shared contract matrix verifies native Arrow reads, Arrow reader/batch output, and Arrow bulk ingest for this adapter. Row-oriented execute_many() is intentionally unsupported; use load_from_arrow() for bulk writes.
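Because row-oriented execute_many() is unsupported, parameter sets that would normally be passed row by row have to be turned into column-oriented Arrow data first. The following is a minimal sketch of that pivot using only the standard library; rows_to_columns is a hypothetical helper, not part of the adapter.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented parameter sets into column-oriented lists."""
    if not rows:
        return {}
    columns: dict[str, list[Any]] = {name: [] for name in rows[0]}
    for row in rows:
        if row.keys() != columns.keys():
            raise ValueError("every row must have the same keys")
        for name, value in row.items():
            columns[name].append(value)
    return columns


rows = [{"id": 1, "name": "ada"}, {"id": 2, "name": "grace"}]
columns = rows_to_columns(rows)
# With pyarrow installed, pyarrow.table(columns) would build the Arrow table
# to pass as the source argument of load_from_arrow().
```

The resulting mapping has one list per column, which is the shape pyarrow.table() accepts.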

Extension support is SQL Server-backed. The adapter exports a table-backed events queue store, a Litestar session store, and Google ADK session/event and memory stores for SQL Server connections through Microsoft ODBC Driver 18.

Configuration

class sqlspec.adapters.arrow_odbc.ArrowOdbcConfig

Bases: NoPoolSyncConfig[Connection, ArrowOdbcDriver]

Configuration for synchronous arrow-odbc connections.

driver_type: alias of ArrowOdbcDriver

connection_type: alias of Connection

__init__(*, connection_config=None, connection_instance=None, migration_config=None, statement_config=None, driver_features=None, bind_key=None, extension_config=None, observability_config=None, **kwargs)

Initialize arrow-odbc configuration.

create_connection()

Create and return a new arrow-odbc connection.

Return type: Connection

provide_connection(*args, **kwargs)

Provide a connection context manager.

Return type: ArrowOdbcConnectionContext

provide_session(*_args, statement_config=None, **_kwargs)

Provide a driver session context manager.

Return type: ArrowOdbcSessionContext
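A typical flow is to build a config, open a session with provide_session(), and run a query on it. The sketch below assumes that the session yielded by the context manager is an ArrowOdbcDriver and that connection_config accepts a "connection_string" key; that key name is a guess, so check ArrowOdbcConnectionParams for the actual fields. The helper names build_connection_string and fetch_as_arrow are hypothetical.

```python
def build_connection_string(server: str, database: str, user: str, password: str) -> str:
    """Assemble a semicolon-delimited ODBC connection string for SQL Server."""
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",
        "Server": server,
        "Database": database,
        "UID": user,
        "PWD": password,
        # Convenient for local test containers; avoid in production.
        "TrustServerCertificate": "yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


def fetch_as_arrow(connection_string: str, query: str):
    """Open a session and run one query through select_to_arrow()."""
    # Imported lazily so the helper above stays usable without sqlspec installed.
    from sqlspec.adapters.arrow_odbc import ArrowOdbcConfig

    # ASSUMPTION: "connection_string" is a hypothetical key name; consult
    # ArrowOdbcConnectionParams for the real parameters.
    config = ArrowOdbcConfig(connection_config={"connection_string": connection_string})
    with config.provide_session() as session:
        return session.select_to_arrow(query)


dsn = build_connection_string("localhost,1433", "analytics", "sa", "example-password")
```

Values containing semicolons or braces would need additional quoting, which this sketch does not handle.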

get_signature_namespace()

Get the signature namespace for ArrowOdbcConfig types.

Return type: dict[str, typing.Any]

get_event_runtime_hints()

Return polling defaults suitable for generic ODBC sources.

Return type: EventRuntimeHints

class sqlspec.adapters.arrow_odbc.ArrowOdbcConnectionParams

Bases: TypedDict

arrow-odbc connection parameters.

class sqlspec.adapters.arrow_odbc.ArrowOdbcDriverFeatures

Bases: TypedDict

arrow-odbc driver feature flags.

Driver

class sqlspec.adapters.arrow_odbc.ArrowOdbcDriver

Bases: SyncDriverAdapterBase

Sync driver for generic ODBC connections with Arrow-native transfer.

__init__(connection, statement_config=None, driver_features=None)

Initialize driver adapter with connection and configuration.

Parameters:

connection (Connection) -- Database connection instance
statement_config (StatementConfig | None) -- Statement configuration for the driver
driver_features (dict[str, typing.Any] | None) -- Driver-specific features like extensions, secrets, and connection callbacks
observability -- Optional runtime handling lifecycle hooks, observers, and spans

property data_dictionary: ArrowOdbcDataDictionary

Get the data dictionary for this driver.

Returns: Data dictionary instance for metadata queries

dispatch_execute(cursor, statement)

Execute a single SQL statement.

Must be implemented by each driver for database-specific execution logic.

Parameters:

cursor (Connection) -- Database cursor/connection object
statement (SQL) -- SQL statement object with all necessary data and configuration

Return type: ExecutionResult

Returns: ExecutionResult with execution data

dispatch_execute_many(cursor, statement)

Execute SQL with multiple parameter sets (executemany).

Must be implemented by each driver for database-specific executemany logic.

Parameters:

cursor (Connection) -- Database cursor/connection object
statement (SQL) -- SQL statement object with all necessary data and configuration

Return type: ExecutionResult

Returns: ExecutionResult with execution data for the many operation

dispatch_execute_script(cursor, statement)

Execute a SQL script containing multiple statements.

Default implementation splits the script and executes statements individually. Drivers can override for database-specific script execution methods.

Parameters:

cursor (Connection) -- Database cursor/connection object
statement (SQL) -- SQL statement object with all necessary data and configuration

Return type: ExecutionResult

Returns: ExecutionResult with script execution data including statement counts

dispatch_select_stream(statement, chunk_size)

Return a native Arrow ODBC row stream backed by record batches.

Return type: Optional[SyncRowStream[dict[str, typing.Any]]]
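To illustrate what a row stream backed by record batches means, the sketch below lazily turns column-oriented batches into one dict per row. Plain dicts of lists stand in for Arrow record batches, and iter_rows is a hypothetical name; this is not the adapter's implementation.

```python
from collections.abc import Iterable, Iterator
from typing import Any

# Stand-in for an Arrow record batch: column name -> list of values.
Batch = dict[str, list[Any]]


def iter_rows(batches: Iterable[Batch]) -> Iterator[dict[str, Any]]:
    """Lazily yield one dict per row, consuming one batch at a time."""
    for batch in batches:
        names = list(batch)
        # zip(*columns) walks the columns in lockstep, producing row tuples.
        for values in zip(*(batch[name] for name in names)):
            yield dict(zip(names, values))


batches = [
    {"id": [1, 2], "name": ["ada", "grace"]},
    {"id": [3], "name": ["edsger"]},
]
streamed = list(iter_rows(batches))
```

Only one batch needs to be held in memory at a time, so memory use is governed by the batch size (chunk_size in the signature above) rather than by the size of the result set.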

collect_rows(cursor, fetched)

Collect rows from cursor after fetchall for the direct execution path.

Adapters should override this method to provide optimized row collection that bypasses full dispatch_execute overhead.

Parameters:

cursor (Connection) -- Database cursor with description metadata.
fetched (list[typing.Any]) -- Rows returned from cursor.fetchall().

Return type: tuple[list[typing.Any], list[str], int]

Returns: Tuple of (data, column_names, row_count).

Raises: NotImplementedError -- If the adapter does not implement this method.

resolve_rowcount(cursor)

Resolve the number of affected rows from cursor for the direct execution path.

Adapters should override this method to provide optimized rowcount resolution that bypasses full dispatch_execute overhead.

Parameters: cursor (Connection) -- Database cursor with rowcount metadata.
Return type: int
Returns: Number of affected rows, or 0 when unknown.
Raises: NotImplementedError -- If the adapter does not implement this method.
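The return contracts of collect_rows and resolve_rowcount can be shown with any DB-API cursor. The sketch below uses the standard library's sqlite3 as a stand-in; the arrow-odbc adapter works on its own connection type, so treat this as an illustration of the shapes involved rather than the actual code.

```python
import sqlite3
from typing import Any


def collect_rows(cursor: sqlite3.Cursor, fetched: list[Any]) -> tuple[list[Any], list[str], int]:
    """Return (data, column_names, row_count) for an already-fetched result."""
    column_names = [column[0] for column in cursor.description or []]
    return fetched, column_names, len(fetched)


def resolve_rowcount(cursor: sqlite3.Cursor) -> int:
    """Return the affected-row count, or 0 when the driver reports it as unknown."""
    rowcount = cursor.rowcount
    return rowcount if rowcount is not None and rowcount > 0 else 0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
insert_cursor = conn.execute("INSERT INTO t VALUES (1, 'ada'), (2, 'grace')")
affected = resolve_rowcount(insert_cursor)

select_cursor = conn.execute("SELECT id, name FROM t ORDER BY id")
data, names, count = collect_rows(select_cursor, select_cursor.fetchall())
unknown = resolve_rowcount(select_cursor)  # sqlite3 reports -1 for SELECT
conn.close()
```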

begin()

Begin a database transaction on the current connection.

Return type: None

commit()

Commit the current transaction on the current connection.

Return type: None

rollback()

Rollback the current transaction on the current connection.

Return type: None

with_cursor(connection)

Create and return a context manager for cursor acquisition and cleanup.

Returns a context manager that yields a cursor for database operations. Concrete implementations handle database-specific cursor creation and cleanup.

Return type: ArrowOdbcCursor

handle_database_exceptions()

Handle database-specific exceptions and wrap them appropriately.

Return type: ArrowOdbcExceptionHandler
Returns: Exception handler with deferred exception pattern for mypyc compatibility. The handler stores mapped exceptions in pending_exception rather than raising from __exit__ to avoid ABI boundary violations.
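The deferred exception pattern described above can be sketched as follows. The handler maps the driver error inside __exit__, stores it in pending_exception, suppresses the original, and leaves it to the caller to raise the mapped error afterwards. The class names and the use of sqlite3 are assumptions for illustration; ArrowOdbcExceptionHandler itself may differ in detail.

```python
import sqlite3
from typing import Optional


class QueryError(Exception):
    """Hypothetical library-level exception used for this sketch."""


class DeferredExceptionHandler:
    """Map driver errors in __exit__, but let the caller raise them afterwards."""

    def __init__(self) -> None:
        self.pending_exception: Optional[Exception] = None

    def __enter__(self) -> "DeferredExceptionHandler":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        if exc_value is None:
            return False
        if isinstance(exc_value, sqlite3.Error):
            # Store the mapped error and suppress the original instead of raising here.
            self.pending_exception = QueryError(str(exc_value))
            return True
        return False  # unrelated exceptions propagate unchanged


def run(conn: sqlite3.Connection, sql: str) -> None:
    handler = DeferredExceptionHandler()
    with handler:
        conn.execute(sql)
    if handler.pending_exception is not None:
        # Raised from ordinary code, after __exit__ has already returned.
        raise handler.pending_exception from None


conn = sqlite3.connect(":memory:")
try:
    run(conn, "SELECT * FROM missing_table")
    outcome = "ok"
except QueryError as exc:
    outcome = f"mapped: {exc}"
conn.close()
```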

create_savepoint(name)

Create a savepoint within the current transaction.

Return type: None

release_savepoint(name)

Release a previously created savepoint.

Return type: None

rollback_to_savepoint(name)

Roll back the current transaction to a previously created savepoint.

Return type: None
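The three savepoint operations correspond to the usual create, roll back to, and release sequence. The sketch below shows that sequence with the standard library's sqlite3 and ANSI-style statements. Note that SQL Server's T-SQL spells these SAVE TRANSACTION and ROLLBACK TRANSACTION and has no release statement; the SQL the adapter emits for a given backend is not specified here.

```python
import sqlite3

# isolation_level=None disables implicit transactions so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT sp1")              # create_savepoint("sp1")
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO SAVEPOINT sp1")  # rollback_to_savepoint("sp1") undoes row 2
conn.execute("RELEASE SAVEPOINT sp1")      # release_savepoint("sp1")
conn.execute("COMMIT")

rows = [row[0] for row in conn.execute("SELECT id FROM t ORDER BY id")]
conn.close()
```

Work done before the savepoint survives the partial rollback and is committed with the outer transaction.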

select_to_arrow(statement, /, *parameters, statement_config=None, return_format='table', native_only=False, batch_size=None, arrow_schema=None, **kwargs)

Execute a query and return native Arrow results.

bulk_insert_arrow(target_table, source, *, chunk_size=None)

Insert an Arrow table or reader into a database table.

Return type: None

load_from_arrow(table, source, *, partitioner=None, overwrite=False, telemetry=None)

Load Arrow data into a table via arrow-odbc bulk insert.

Return type: StorageBridgeJob

load_from_storage(table, source, *, file_format, partitioner=None, overwrite=False)

Load staged artifacts from storage into a table via arrow-odbc bulk insert.

Return type: StorageBridgeJob

Data Dictionary

class sqlspec.adapters.arrow_odbc.data_dictionary.ArrowOdbcDataDictionary

Bases: SyncDataDictionaryBase

Runtime-dialect data dictionary for generic ODBC connections.

dialect: ClassVar[str] = 'sqlite'

Dialect identifier. Must be defined by subclasses as a class attribute.

__init__(dialect='sqlite')

get_dialect_config()

Return the runtime dialect configuration for this data dictionary.

Return type: DialectConfig

get_query(name)

Return a named SQL query for the runtime dialect.

Return type: SQL

get_query_text(name)

Return raw SQL text for a named runtime dialect query.

Return type: str

get_metadata_capabilities(driver, domains=None)

Report Arrow ODBC metadata capabilities without claiming raw catalog support.

Return type: MetadataCapabilityProfile

resolve_schema(schema)

Return a schema name using runtime dialect defaults when missing.

Return type: str | None

resolve_identifier(identifier)

Return a runtime-dialect-normalized identifier.

Return type: str

get_version(driver)

Get database version information when the runtime dialect provides a query.

Return type: VersionInfo | None

get_feature_flag(driver, feature)

Check whether the runtime dialect supports a feature.

Return type: bool

get_optimal_type(driver, type_category)

Get the optimal runtime dialect type for a category.

Return type: str

get_tables(driver, schema=None)

Get table metadata for dialects with bundled catalog queries.

Return type: list[TableMetadata]

get_columns(driver, table=None, schema=None)

Get column metadata for dialects with bundled catalog queries.

Return type: list[ColumnMetadata]

get_indexes(driver, table=None, schema=None)

Get index metadata for dialects with bundled catalog queries.

Return type: list[IndexMetadata]

get_foreign_keys(driver, table=None, schema=None)

Get foreign-key metadata for dialects with bundled catalog queries.

Return type: list[ForeignKeyMetadata]

get_ddl(driver, object_name, schema=None, *, object_type='table', include_dependencies=True, prefer_native=True, redact=True)

Fail closed because arrow-odbc does not expose lossless DDL metadata.

Return type: DDLResult

Extensions

class sqlspec.adapters.arrow_odbc.events.ArrowOdbcEventQueueStore

Bases: BaseEventQueueStore[ArrowOdbcConfig]

Event queue DDL for arrow-odbc SQL Server configs.

class sqlspec.adapters.arrow_odbc.litestar.ArrowOdbcStore

Bases: BaseSQLSpecStore[ArrowOdbcConfig]

SQL Server-backed session store using arrow-odbc sessions.

__init__(config)

Initialize the session store.

Parameters: config (ArrowOdbcConfig) -- SQLSpec database configuration.

async create_table()

Create the session table if it doesn't exist.

Return type: None

async get(key, renew_for=None)

Get a session value by key.

Return type: bytes | None

async set(key, value, expires_in=None)

Store a session value.

Return type: None

async delete(key)

Delete a session by key.

Return type: None

async delete_all()

Delete all sessions from the store.

Return type: None

async exists(key)

Check if a session key exists and is not expired.

Return type: bool

async expires_in(key)

Get the time in seconds until the session expires.

Return type: int | None

async delete_expired()

Delete all expired sessions.

Return type: int

class sqlspec.adapters.arrow_odbc.adk.ArrowOdbcADKStore

Bases: BaseSyncADKStore[ArrowOdbcConfig]

Synchronous SQL Server ADK session/event store using arrow-odbc.

create_tables()

Create all ADK session tables if they do not exist.

Return type: None

create_session(session_id, app_name, user_id, state, owner_id=None)

Create a new ADK session.

Return type: SessionRecord

get_session(app_name, user_id, session_id, *, renew_for=None)

Return a scoped session or None if absent.

Return type: SessionRecord | None

update_session_state(app_name, user_id, session_id, state)

Replace a session's durable state.

Return type: None

list_sessions(app_name, user_id=None)

List ADK sessions for an application, optionally scoped to a user.

Return type: list[SessionRecord]

delete_session(app_name, user_id, session_id)

Delete a session. Event rows cascade through the FK.

Return type: None
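The cascade mentioned for delete_session relies on a foreign key declared with ON DELETE CASCADE. The sketch below demonstrates the effect with the standard library's sqlite3 and hypothetical table and column names; the actual ADK schema for SQL Server is created by create_tables() and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY,"
    " session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE)"
)
conn.execute("INSERT INTO sessions VALUES ('s1'), ('s2')")
conn.executemany("INSERT INTO events (session_id) VALUES (?)", [("s1",), ("s1",), ("s2",)])

# Deleting the parent row removes its child event rows automatically.
conn.execute("DELETE FROM sessions WHERE id = ?", ("s1",))
remaining = conn.execute("SELECT session_id FROM events").fetchall()
conn.close()
```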

append_event(event_record)

Append an event to a session.

Return type: None

append_event_and_update_state(event_record, app_name, user_id, session_id, state, *, app_state=None, user_state=None)

Atomically append an event and update durable session/scoped state.

Return type: SessionRecord
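Atomic here means that the event insert and the state update either both commit or both roll back. A minimal sketch of that guarantee follows, using sqlite3 and a simplified hypothetical schema; the real method also handles app-scoped and user-scoped state and returns a SessionRecord.

```python
import json
import sqlite3


def append_event_and_update_state(conn, session_id: str, payload: str, state: dict) -> None:
    """Insert the event and replace the session state in one transaction."""
    with conn:  # commits on success, rolls back if either statement raises
        conn.execute(
            "INSERT INTO events (session_id, payload) VALUES (?, ?)", (session_id, payload)
        )
        updated = conn.execute(
            "UPDATE sessions SET state = ? WHERE id = ?", (json.dumps(state), session_id)
        )
        if updated.rowcount != 1:
            raise LookupError(f"unknown session: {session_id}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
conn.execute("CREATE TABLE events (session_id TEXT NOT NULL, payload TEXT NOT NULL)")
conn.execute("INSERT INTO sessions VALUES ('s1', '{}')")
conn.commit()

append_event_and_update_state(conn, "s1", "hello", {"turn": 1})
try:
    append_event_and_update_state(conn, "missing", "orphan", {"turn": 1})
except LookupError:
    pass  # the orphan event insert was rolled back together with the failed update

event_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
state = json.loads(conn.execute("SELECT state FROM sessions WHERE id = 's1'").fetchone()[0])
conn.close()
```

Without the shared transaction, the second call would leave an event row that belongs to no session.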

get_events(app_name, user_id, session_id, after_timestamp=None, limit=None)

Return events for a scoped session ordered by event timestamp.

Return type: list[EventRecord]

delete_expired_events(before)

Delete events older than the before cutoff.

Return type: int

delete_idle_sessions(updated_before)

Delete sessions whose update_time is older than updated_before.

Return type: int

get_app_state(app_name)

Return app-scoped state.

Return type: dict[str, typing.Any] | None

get_user_state(app_name, user_id)

Return user-scoped state.

Return type: dict[str, typing.Any] | None

upsert_app_state(app_name, state)

Insert or replace app-scoped state.

Return type: None

upsert_user_state(app_name, user_id, state)

Insert or replace user-scoped state.

Return type: None

get_metadata(key)

Return an ADK metadata value.

Return type: str | None

set_metadata(key, value)

Set an ADK metadata value.

Return type: None

class sqlspec.adapters.arrow_odbc.adk.ArrowOdbcADKMemoryStore

Bases: BaseSyncADKMemoryStore[ArrowOdbcConfig]

SQL Server ADK memory store using arrow-odbc.

create_tables()

Create the memory table if memory storage is enabled.

Return type: None

insert_memory_entries(entries, owner_id=None)

Insert memory entries, skipping duplicates by event_id.

Return type: int
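Skipping duplicates by event_id and returning the number of rows actually inserted can be sketched as below. This uses sqlite3 with INSERT OR IGNORE and a hypothetical two-column table; SQL Server has no INSERT OR IGNORE, so the adapter necessarily uses a different statement there, which this page does not describe.

```python
import sqlite3


def insert_memory_entries(conn: sqlite3.Connection, entries: list[tuple[str, str]]) -> int:
    """Insert (event_id, content) pairs and return how many were actually new."""
    inserted = 0
    with conn:
        for event_id, content in entries:
            cursor = conn.execute(
                "INSERT OR IGNORE INTO memory (event_id, content) VALUES (?, ?)",
                (event_id, content),
            )
            inserted += cursor.rowcount  # 1 when inserted, 0 when skipped
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory (event_id TEXT PRIMARY KEY, content TEXT NOT NULL)")
first = insert_memory_entries(conn, [("e1", "alpha"), ("e2", "beta")])
second = insert_memory_entries(conn, [("e2", "beta"), ("e3", "gamma")])
total = conn.execute("SELECT COUNT(*) FROM memory").fetchone()[0]
conn.close()
```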

search_entries(query, app_name, user_id, limit=None)

Search memory entries with SQL Server LIKE matching.

Return type: list[MemoryRecord]
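LIKE matching treats % and _ as wildcards, so user-supplied search text usually needs escaping before it is wrapped for substring search. The sketch below shows one way to do that with sqlite3 as a stand-in. Whether the store escapes wildcards is not stated here, like_pattern is a hypothetical helper, and on SQL Server the bracket character [ is also a pattern character and TOP replaces LIMIT.

```python
import sqlite3


def like_pattern(query: str) -> str:
    """Escape LIKE wildcards in user text and wrap it for substring matching."""
    escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory (app_name TEXT, user_id TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO memory VALUES (?, ?, ?)",
    [
        ("app", "u1", "discount is 100% today"),
        ("app", "u1", "discount is 1000 points"),
        ("app", "u2", "100% unrelated user"),
    ],
)
matches = conn.execute(
    "SELECT content FROM memory"
    " WHERE app_name = ? AND user_id = ? AND content LIKE ? ESCAPE '\\'"
    " LIMIT ?",
    ("app", "u1", like_pattern("100%"), 10),
).fetchall()
conn.close()
```

With the escape in place, the search text "100%" matches only the literal percent sign and not "1000 points".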

delete_entries_by_session(session_id)

Delete all memory entries for a specific session.

Return type: int

delete_entries_older_than(days)

Delete memory entries older than the given number of days.

Return type: int

Schema Discovery

ArrowOdbcDataDictionary.get_columns first tries the catalog queries bundled for the detected dialect. If no bundled query exists for that dialect, or the query returns no rows, and a table name was given, the driver falls back to a zero-row probe (SELECT * FROM "schema"."table" WHERE 1=0) and derives column names, ordering, nullability, and SQL type names from the schema of the resulting Arrow reader. Arrow-derived type names are approximations: any string column, for example, is reported as VARCHAR. mssql_python and other ODBC adapters without native metadata APIs remain SQL-only.
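The zero-row probe can be demonstrated with any SQL connection. The sketch below uses sqlite3, where cursor.description exposes only column names; the Arrow reader schema used by the adapter additionally carries types and nullability. probe_columns is a hypothetical helper and "main" is SQLite's default schema name.

```python
import sqlite3


def probe_columns(conn: sqlite3.Connection, table: str, schema: str = "main") -> list[str]:
    """Return column names in declaration order without fetching any rows."""
    # Double any embedded quote so the identifier cannot break out of its quoting.
    quoted = ".".join('"' + part.replace('"', '""') + '"' for part in (schema, table))
    cursor = conn.execute(f"SELECT * FROM {quoted} WHERE 1=0")
    assert cursor.fetchall() == []  # the predicate guarantees zero rows
    return [column[0] for column in cursor.description]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER NOT NULL, customer TEXT, total REAL)")
columns = probe_columns(conn, "orders")
conn.close()
```

Because the predicate is always false, the database only has to plan the query and describe its result set, so the probe stays cheap even on large tables.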