Execution Flow¶

Understanding how SQLSpec processes a query from input to result is essential for using the library effectively and debugging issues. SQLSpec employs a sophisticated “parse once, transform once, validate once” pipeline that ensures both performance and security.

This guide provides a comprehensive overview of the execution flow, from the moment you create a SQL statement to receiving typed Python objects as results.

Execution Pipeline Overview¶

SQLSpec’s execution flow can be visualized as a series of well-defined stages that transform user input into database results. The architecture is designed around single-pass processing with multi-tier caching for optimal performance.

High-Level Flow Diagram¶

        sequenceDiagram
    autonumber
    actor User
    participant SQL as SQL Object
    participant Core as SQLSpec Core
    participant Driver as Database Driver
    participant DB as Database
    participant Result as SQLResult

    Note over User,SQL: Stage 1: SQL Creation
    User->>SQL: Create SQL statement<br/>(string/builder/file)
    SQL->>SQL: Store parameters<br/>Initialize lazy flags

    Note over SQL,Core: Stage 2: Core Processing Pipeline
    SQL->>Core: Trigger compilation
    Core->>Core: Extract parameters
    Core->>Core: Parse to AST (SQLGlot)
    Core->>Core: Validate SQL
    Core->>Core: Transform AST
    Core->>Core: Compile to dialect

    Note over Core,DB: Stage 3: Database Execution
    Core->>Driver: Pass compiled SQL + params
    Driver->>Driver: Convert parameter style
    Driver->>DB: Execute query
    DB-->>Driver: Return raw results

    Note over Driver,Result: Stage 4: Result Processing
    Driver->>Result: Create SQLResult object
    Result->>Result: Map to schema types
    Result-->>User: Return typed Python objects

    Note right of Result: Supports: Pydantic,<br/>msgspec, attrs,<br/>dataclasses, TypedDict

Detailed Execution Stages¶

Stage 1: SQL Object Creation¶

The execution flow begins when you create a SQL object. SQLSpec accepts multiple input formats:

Direct SQL Creation

from sqlspec import SQL

# Raw SQL string with positional parameters
sql = SQL("SELECT * FROM users WHERE id = ?", 1)

# Named parameters
sql = SQL("SELECT * FROM users WHERE email = :email", email="[email protected]")

Using the Query Builder

from sqlspec import sql

# Build SQL programmatically
query = sql.select("id", "name", "email").from_("users").where("status = ?", "active")

From SQL Files

from sqlspec.loader import SQLFileLoader

loader = SQLFileLoader()
queries_path = Path(__file__).resolve().parents[1] / "queries" / "users.sql"
loader.load_sql(queries_path)
sql = loader.get_sql("get_user_by_id")

During initialization, the SQL object:

Stores the statement (string, QueryBuilder, or sqlglot expression)
Captures positional and named parameters with type information
Initializes lazy processing flags for deferred compilation
Prepares for pipeline execution

Stage 2: The Core Processing Pipeline¶

When the SQL object is compiled for execution, it passes through a sophisticated processing pipeline. This is where SQLSpec’s “parse once, transform once, validate once” philosophy is implemented.

Step 1: Parameter Extraction

The first step extracts and preserves parameter information before any SQL modifications:

positional = SQL("SELECT * FROM users WHERE id = ? AND status = ?", 1, "active")
positional_map = dict(enumerate(positional.positional_parameters))

named = SQL("SELECT * FROM users WHERE email = :email", email="[email protected]")
named_map = named.named_parameters

This step uses ParameterValidator to ensure parameters are properly formatted and positions are tracked.

Step 2: AST Generation with SQLGlot

The SQL string is parsed into an Abstract Syntax Tree (AST) using SQLGlot:

import sqlglot

# Parse SQL into structured AST
expression = sqlglot.parse_one("SELECT * FROM users WHERE id = ?", dialect="sqlite")

The AST represents your query as a tree structure that can be analyzed and modified programmatically. This is crucial for the validation and transformation steps.

Why AST Processing?

Instead of treating SQL as plain text, SQLSpec uses the AST to:

Understand the query structure (SELECT, WHERE, JOIN clauses)
Identify security risks (SQL injection patterns)
Detect performance issues (missing JOINs, unbounded queries)
Transform queries safely (add filters, parameterize literals)

Step 3: Compilation

The AST is compiled into the target SQL dialect:

import sqlglot

# Compile AST to target dialect
compiled_sql = sqlglot.parse_one("SELECT * FROM users WHERE id = ?", dialect="sqlite").sql(dialect="postgres")
# Result: "SELECT * FROM users WHERE id = $1"

Step 4: Parameter Processing

Parameters are converted to the appropriate style for the target database:

statement_config = StatementConfig(
    dialect="sqlite",
    parameter_config=ParameterStyleConfig(
        default_parameter_style=ParameterStyle.NUMERIC, has_native_list_expansion=False
    ),
)

sql = SQL("SELECT * FROM users WHERE id = ? AND status = ?", 1, "active", statement_config=statement_config)

compiled_sql, execution_params = sql.compile()

This ensures compatibility across different database drivers.

Step 5: Statement Execution

The compiled SQL and processed parameters are sent to the database:

sql_statement = SQL("SELECT ? AS message", "pipeline-complete")

with db_manager.provide_session(db) as session:
    result = session.execute(sql_statement)
    print(result.rows_affected)
    print(result.parameters)
    message = result.scalar()

The driver handles database-specific execution patterns and result retrieval.

Stage 3: Driver Execution¶

Once the SQL is compiled, it’s sent to the database-specific driver for execution:

# Driver receives compiled SQL and parameters
db_manager = SQLSpec()
db = db_manager.add_config(SqliteConfig(connection_config={"database": ":memory:"}))
with db_manager.provide_session(db) as session:
    message = session.select_value("SELECT 'test' as message")
    print(message)

Template Method Pattern

SQLSpec drivers use the Template Method pattern for consistent execution:

Special Handling Check: Try database-specific optimizations (PostgreSQL COPY, bulk operations)
Execution Routing: Route to appropriate method based on query type:
- _execute_statement: Single statement execution
- _execute_many: Batch execution (executemany)
- _execute_script: Multi-statement scripts
Database Interaction: Execute via DBAPI connection
Result Building: Package raw results into SQLResult

Example: SQLite Driver Execution

from typing import Any

from sqlspec.driver import ExecutionResult, SyncDriverAdapterBase

class SqliteDriver(SyncDriverAdapterBase):
    def _execute_statement(self, cursor: Any, statement: Any) -> ExecutionResult:
        sql, params = self._get_compiled_sql(statement)
        cursor.execute(sql, params or ())
        return self.create_execution_result(cursor)

    def _execute_many(self, cursor: Any, statement: Any) -> ExecutionResult:
        sql, params = self._get_compiled_sql(statement)
        cursor.executemany(sql, params)
        return self.create_execution_result(cursor)

Stage 4: Result Handling¶

After database execution, raw results are transformed into typed Python objects.

SQLResult Object

All query results are wrapped in a SQLResult object:

with db_manager.provide_session(db) as session:
    result = session.select("SELECT 'test' as col1, 'value' as col2")
    print(len(result))  # length of result set

    # Access result data (SQLResult is iterable over rows)
    print(result)

Convenience Methods

    result = session.execute("SELECT * FROM test WHERE id = 1")

    # Get exactly one row (raises if not exactly one)
    user = result.one()

    # Get one or None
    user = result.one_or_none()

    # Get scalar value (first column of first row)
    result2 = session.execute("SELECT COUNT(*) FROM test")
    count = result2.scalar()

Schema Mapping

SQLSpec can automatically map results to typed objects:

from pydantic import BaseModel
from rich import print

from sqlspec import SQLSpec
from sqlspec.adapters.sqlite import SqliteConfig

class User(BaseModel):
    id: int
    name: str
    email: str
    is_active: bool | None = True

db_manager = SQLSpec()
db = db_manager.add_config(SqliteConfig(connection_config={"database": ":memory:"}))

with db_manager.provide_session(db) as session:
    # Create test table
    session.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT, is_active INTEGER)")
    session.execute("INSERT INTO users VALUES (1, 'Alice', '[email protected]', 1)")

    # Execute query
    result = session.select("SELECT id, name, email, is_active FROM users", schema_type=User)
    print(len(result))

    # Or get single typed user
    single_result = session.select_one(
        "SELECT id, name, email, is_active FROM users WHERE id = ?", 1, schema_type=User
    )
    print(single_result)

Supported Schema Types

Pydantic models (v1 and v2)
msgspec Structs
attrs classes
dataclasses
TypedDict

Performance Optimizations¶

SQLSpec’s pipeline includes several performance optimizations:

Multi-Tier Caching¶

SQLSpec implements caching at multiple levels:

  from sqlglot import Expression

  # Cache types and their purposes:
  sql_cache: dict[str, str] = {}  # Compiled SQL strings
  optimized_cache: dict[str, Expression] = {}  # Post-optimization AST

Cache Benefits

Avoids recompiling identical SQL statements
Skips redundant AST processing for repeated queries
Caches validation and transformation results
Reuses parameter conversion for identical patterns

Single-Pass Processing¶

Each SQL statement is processed exactly once through the pipeline:

Parse once → AST generation happens once
Transform once → Modifications applied once to AST
Validate once → Security and performance checks run once
Compile once → SQL generation happens once per dialect

This eliminates redundant work and ensures consistent results.

Configuration-Driven Processing¶

StatementConfig controls pipeline behavior:

  from sqlspec import ParameterStyle, ParameterStyleConfig, StatementConfig

  config = StatementConfig(
      dialect="postgres",
      enable_parsing=True,  # AST generation
      enable_validation=True,  # Security/performance checks
      enable_transformations=True,  # AST transformations
      enable_caching=True,  # Multi-tier caching
      parameter_config=ParameterStyleConfig(
          default_parameter_style=ParameterStyle.NUMERIC, has_native_list_expansion=False
      ),
  )

Disable features you don’t need for maximum performance.

Understanding the Flow Benefits¶

By understanding this execution flow, you can:

Debug Issues Effectively

Know where to look when queries fail
Understand validation errors
Trace parameter binding issues

Optimize Performance

Leverage caching appropriately
Understand when AST processing occurs
Choose the right statement configuration

Extend SQLSpec

Write custom transformers
Create new validators
Implement custom drivers

Write Better Queries

Understand how parameterization works
Know what triggers validation errors
Use the right query patterns for your database

Next Steps¶

Now that you understand the execution flow, learn how to:

Configuration - Configure database connections and statement processing
Drivers and Querying - Execute queries with different database drivers
Query Builder - Build queries programmatically with the fluent API