Builder¶
The SQL builder provides a fluent, type-safe API for constructing SQL queries programmatically. It supports SELECT, INSERT, UPDATE, DELETE, and complex operations like JOINs, CTEs, and window functions.
Warning
The builder API is experimental and subject to breaking changes in future releases. Use with caution in production code.
Overview¶
The builder uses method chaining to construct SQL queries:
from sqlspec import sql
query = (
sql.select("name", "email", "department")
.from_("users")
.where("active = TRUE")
.where(("salary", ">", 50000))
.order_by("name")
.limit(10)
)
# Convert to SQL statement
stmt = query.to_statement()
# Execute with driver
result = await driver.execute(stmt)
Builder Factory¶
- class sqlspec.builder.SQLFactory[source]¶
Bases:
objectFactory for creating SQL builders and column expressions.
The main entry point for building SQL queries.
Available as:
from sqlspec import sqlMethods:
select(*columns)- Create SELECT queryinsert(table)- Create INSERT queryupdate(table)- Create UPDATE querydelete(table)- Create DELETE querymerge(dialect=None)- Create MERGE query (PostgreSQL 15+, Oracle, BigQuery)
- __call__(statement, dialect=None)[source]¶
Create a SelectBuilder from a SQL string, or SQL object for DML with RETURNING.
- Parameters:
- Return type:
typing.Any
- Returns:
SelectBuilder instance for SELECT/WITH statements, SQL object for DML statements with RETURNING clause.
- Raises:
SQLBuilderError – If the SQL is not a SELECT/CTE/DML+RETURNING statement.
- explain(statement, *, analyze=False, verbose=False, format=None, dialect=None)[source]¶
Create an EXPLAIN builder for a SQL statement.
Wraps any SQL statement in an EXPLAIN clause with dialect-aware syntax generation.
- Parameters:
statement¶ (
str|Expression|SQL|SQLBuilderProtocol) – SQL statement to explain (string, expression, SQL object, or builder)analyze¶ (
bool) – Execute the statement and show actual runtime statisticsformat¶ (
ExplainFormat|str|None) – Output format (TEXT, JSON, XML, YAML, TREE, TRADITIONAL)dialect¶ (
Union[str,Dialect,Type[Dialect],None]) – Optional SQL dialect override
- Return type:
Explain- Returns:
Explain builder for further configuration
Examples
- Basic EXPLAIN:
plan = sql.explain(“SELECT * FROM users”).build()
- With options:
- plan = (
sql.explain(“SELECT * FROM users”, analyze=True, format=”json”) .buffers() .timing() .build()
)
- From QueryBuilder:
query = sql.select(“*”).from_(“users”).where(“id = :id”, id=1) plan = sql.explain(query, analyze=True).build()
- Chained configuration:
- plan = (
sql.explain(sql.select(“*”).from_(“large_table”)) .analyze() .format(“json”) .buffers() .timing() .build()
)
- property merge_: Merge¶
Create a new MERGE builder (property shorthand).
Property that returns a new Merge builder instance using the factory’s default dialect. Cleaner syntax alternative to merge() method.
Examples
query = sql.merge_.into(“products”).using(data, alias=”src”) query = sql.merge_.into(“products”, alias=”t”).on(“t.id = src.id”)
- Returns:
New Merge builder instance
- upsert(table, dialect=None)[source]¶
Create an upsert builder (MERGE or INSERT ON CONFLICT).
Automatically selects the appropriate builder based on database dialect: - PostgreSQL 15+, Oracle, BigQuery: Returns MERGE builder - SQLite, DuckDB, MySQL: Returns INSERT builder with ON CONFLICT support
- Parameters:
- Return type:
- Returns:
MERGE builder for supported databases, INSERT builder otherwise
Examples
- PostgreSQL/Oracle/BigQuery (uses MERGE):
- upsert_query = (
sql.upsert(“products”, dialect=”postgres”) .using([{“id”: 1, “name”: “Product 1”}], alias=”src”) .on(“t.id = src.id”) .when_matched_then_update(name=”src.name”) .when_not_matched_then_insert(id=”src.id”, name=”src.name”)
)
- SQLite/DuckDB/MySQL (uses INSERT ON CONFLICT):
- upsert_query = (
sql.upsert(“products”, dialect=”sqlite”) .values(id=1, name=”Product 1”) .on_conflict(“id”) .do_update(name=”EXCLUDED.name”)
)
- create_materialized_view(view_name, dialect=None)[source]¶
Create a CREATE MATERIALIZED VIEW builder.
- copy_from(table, source, *, columns=None, options=None, dialect=None)[source]¶
Build a COPY … FROM statement.
- Return type:
- copy_to(table, target, *, columns=None, options=None, dialect=None)[source]¶
Build a COPY … TO statement.
- Return type:
- copy(table, *, source=None, target=None, columns=None, options=None, dialect=None)[source]¶
Build a COPY statement, inferring direction from provided arguments.
- Return type:
- property case_: Case¶
Create a CASE expression builder.
- Returns:
Case builder instance for CASE expression building.
- property row_number_: WindowFunctionBuilder¶
Create a ROW_NUMBER() window function builder.
- property rank_: WindowFunctionBuilder¶
Create a RANK() window function builder.
- property dense_rank_: WindowFunctionBuilder¶
Create a DENSE_RANK() window function builder.
- property lag_: WindowFunctionBuilder¶
Create a LAG() window function builder.
- property lead_: WindowFunctionBuilder¶
Create a LEAD() window function builder.
- property exists_: SubqueryBuilder¶
Create an EXISTS subquery builder.
- property in_: SubqueryBuilder¶
Create an IN subquery builder.
- property any_: SubqueryBuilder¶
Create an ANY subquery builder.
- property all_: SubqueryBuilder¶
Create an ALL subquery builder.
- property inner_join_: JoinBuilder¶
Create an INNER JOIN builder.
- property left_join_: JoinBuilder¶
Create a LEFT JOIN builder.
- property right_join_: JoinBuilder¶
Create a RIGHT JOIN builder.
- property full_join_: JoinBuilder¶
Create a FULL OUTER JOIN builder.
- property cross_join_: JoinBuilder¶
Create a CROSS JOIN builder.
- property lateral_join_: JoinBuilder¶
Create a LATERAL JOIN builder.
- Returns:
JoinBuilder configured for LATERAL JOIN
- property left_lateral_join_: JoinBuilder¶
Create a LEFT LATERAL JOIN builder.
- Returns:
JoinBuilder configured for LEFT LATERAL JOIN
- property cross_lateral_join_: JoinBuilder¶
Create a CROSS LATERAL JOIN builder.
- Returns:
JoinBuilder configured for CROSS LATERAL JOIN
- __getattr__(name)[source]¶
Dynamically create column references.
- Parameters:
- Return type:
Column- Returns:
Column object for the given name.
Note
Special SQL constructs like case_, row_number_, etc. are handled as properties for type safety.
- static raw(sql_fragment, **parameters)[source]¶
Create a raw SQL expression from a string fragment with optional parameters.
- Parameters:
- Return type:
Expression|SQL- Returns:
SQLGlot expression from the parsed SQL fragment (if no parameters). SQL statement object (if parameters provided).
- Raises:
SQLBuilderError – If the SQL fragment cannot be parsed.
Example
```python expr = sql.raw(“COALESCE(name, ‘Unknown’)”)
- stmt = sql.raw(
“LOWER(name) LIKE LOWER(:pattern)”, pattern=f”%{query}%”
)
- expr = sql.raw(
“price BETWEEN :min_price AND :max_price”, min_price=100, max_price=500,
)
- query = sql.select(
“name”, sql.raw(
“ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC)”
),
- static to_literal(value)[source]¶
Convert a Python value to a SQLGlot literal expression.
Uses SQLGlot’s built-in exp.convert() function for literal creation. Handles all Python primitive types: - None -> exp.Null (renders as NULL) - bool -> exp.Boolean (renders as TRUE/FALSE or 1/0 based on dialect) - int/float -> exp.Literal with is_number=True - str -> exp.Literal with is_string=True - exp.Expression -> returned as-is (passthrough)
- static decode(column, *args)[source]¶
Create a DECODE expression (Oracle-style conditional logic).
DECODE compares column to each search value and returns the corresponding result. If no match is found, returns the default value (if provided) or NULL.
- Parameters:
- Raises:
ValueError – If fewer than two search/result pairs are provided.
- Return type:
FunctionExpression- Returns:
CASE expression equivalent to DECODE.
- static nvl(column, substitute_value)[source]¶
Create an NVL (Oracle-style) expression using COALESCE.
- static nvl2(column, value_if_not_null, value_if_null)[source]¶
Create an NVL2 (Oracle-style) expression using CASE.
NVL2 returns value_if_not_null if column is not NULL, otherwise returns value_if_null.
- Parameters:
- Return type:
ConversionExpression- Returns:
CASE expression equivalent to NVL2.
Example
`python sql.nvl2("salary", "Has Salary", "No Salary") `
- static bulk_insert(table_name, column_count, placeholder_style='?')[source]¶
Create bulk INSERT expression for executemany operations.
For bulk loading operations like CSV ingestion where an INSERT expression with placeholders for executemany() is needed.
- Parameters:
- Return type:
FunctionExpression- Returns:
INSERT expression with placeholders for bulk operations
- static case()[source]¶
Create a CASE expression builder.
- Return type:
Case- Returns:
CaseExpressionBuilder for building CASE expressions.
SELECT Queries¶
- class sqlspec.builder.Select[source]¶
Bases:
QueryBuilder,WhereClauseMixin,OrderByClauseMixin,LimitOffsetClauseMixin,SelectClauseMixin,JoinClauseMixin,HavingClauseMixin,SetOperationMixin,CommonTableExpressionMixin,PivotClauseMixin,UnpivotClauseMixin,ExplainMixinBuilder for SELECT queries.
Provides a fluent interface for constructing SQL SELECT statements with parameter binding and validation.
Example
>>> class User(BaseModel): ... id: int ... name: str >>> builder = Select("id", "name").from_("users") >>> result = driver.execute(builder)
Builder for SELECT statements with support for:
Column selection
FROM clauses
JOINs (INNER, LEFT, RIGHT, FULL)
WHERE conditions
GROUP BY
HAVING
ORDER BY
LIMIT/OFFSET
CTEs (WITH)
Window functions
Subqueries
- __init__(*columns, **kwargs)[source]¶
Initialize SELECT with optional columns.
- Parameters:
Examples
Select(“id”, “name”) # Shorthand for Select().select(“id”, “name”) Select() # Same as Select() - start empty
- with_hint(hint, *, location='statement', table=None, dialect=None)[source]¶
Attach an optimizer or dialect-specific hint to the query.
- Parameters:
- Return type:
Self- Returns:
The current builder instance for method chaining.
- for_update(*, skip_locked=False, nowait=False, of=None)[source]¶
Add FOR UPDATE clause to SELECT statement for row-level locking.
Add FOR SHARE clause for shared row-level locking.
Add FOR KEY SHARE clause (PostgreSQL-specific).
FOR KEY SHARE is like FOR SHARE, but the lock is weaker: SELECT FOR UPDATE is blocked, but not SELECT FOR NO KEY UPDATE.
- Return type:
Self- Returns:
Self for method chaining
Basic SELECT¶
# Simple select
query = sql.select("*").from_("users")
# Specific columns
query = sql.select("id", "name", "email").from_("users")
# Column aliases
query = sql.select(
"id",
"name AS full_name",
"COUNT(*) OVER () AS total_count"
).from_("users")
WHERE Conditions¶
Multiple ways to specify conditions:
# Raw SQL string
query = sql.select("*").from_("users").where("active = TRUE")
# Tuple format (column, operator, value)
query = sql.select("*").from_("users").where(("age", ">=", 18))
# Multiple conditions (AND)
query = (
sql.select("*")
.from_("users")
.where("active = TRUE")
.where(("age", ">=", 18))
.where(("department", "=", "Engineering"))
)
# OR conditions
query = sql.select("*").from_("users").where_or(
("role", "=", "admin"),
("role", "=", "moderator")
)
JOINs¶
# INNER JOIN
query = (
sql.select("u.name", "o.total", "p.name as product")
.from_("users u")
.inner_join("orders o", "u.id = o.user_id")
.inner_join("products p", "o.product_id = p.id")
)
# LEFT JOIN
query = (
sql.select("u.name", "COUNT(o.id) as order_count")
.from_("users u")
.left_join("orders o", "u.id = o.user_id")
.group_by("u.id", "u.name")
)
# Multiple JOIN types
query = (
sql.select("*")
.from_("users u")
.inner_join("orders o", "u.id = o.user_id")
.left_join("reviews r", "o.id = r.order_id")
.right_join("products p", "o.product_id = p.id")
)
Aggregation¶
# GROUP BY with aggregates
query = (
sql.select(
"department",
"COUNT(*) as employee_count",
"AVG(salary) as avg_salary",
"MAX(salary) as max_salary"
)
.from_("users")
.group_by("department")
.having("COUNT(*) > 5")
.order_by("avg_salary DESC")
)
CTEs (WITH Clauses)¶
# Simple CTE
high_earners = (
sql.select("id", "name", "salary")
.from_("users")
.where(("salary", ">", 100000))
)
query = (
sql.select("*")
.with_("high_earners", high_earners)
.from_("high_earners")
.order_by("salary DESC")
)
# Multiple CTEs
dept_stats = sql.select(
"department",
"AVG(salary) as avg_salary"
).from_("users").group_by("department")
user_ranks = sql.select(
"id", "name", "salary",
"ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as rank"
).from_("users")
query = (
sql.select("u.name", "u.salary", "d.avg_salary")
.with_("dept_stats", dept_stats)
.with_("user_ranks", user_ranks)
.from_("user_ranks u")
.inner_join("dept_stats d", "u.department = d.department")
.where("u.rank <= 3")
)
Subqueries¶
# Subquery in WHERE
high_value_customers = (
sql.select("user_id")
.from_("orders")
.group_by("user_id")
.having("SUM(total_amount) > 10000")
)
query = (
sql.select("*")
.from_("users")
.where_in("id", high_value_customers)
)
# EXISTS subquery
has_orders = (
sql.select("1")
.from_("orders")
.where("orders.user_id = users.id")
)
query = (
sql.select("*")
.from_("users")
.where_exists(has_orders)
)
INSERT Queries¶
- class sqlspec.builder.Insert[source]¶
Bases:
QueryBuilder,ReturningClauseMixin,InsertValuesMixin,InsertFromSelectMixin,InsertIntoClauseMixin,ExplainMixinBuilder for INSERT statements.
Constructs SQL INSERT queries with parameter binding and validation.
Builder for INSERT statements.
- values_from_dict(data)[source]¶
Adds a row of values from a dictionary.
This is a convenience method that automatically sets columns based on the dictionary keys and values based on the dictionary values.
- values_from_dicts(data)[source]¶
Adds multiple rows of values from a sequence of dictionaries.
This is a convenience method for bulk inserts from structured data.
- Parameters:
data¶ (
Sequence[Mapping[str, typing.Any]]) – A sequence of mappings, each representing a row of data.- Return type:
Self- Returns:
The current builder instance for method chaining.
- Raises:
SQLBuilderError – If into() has not been called to set the table, or if dictionaries have inconsistent keys.
- on_conflict_do_nothing(*columns)[source]¶
Adds an ON CONFLICT DO NOTHING clause (convenience method).
- Parameters:
*columns¶ (
str) – Column names that define the conflict. If no columns provided, creates an ON CONFLICT without specific columns.- Return type:
- Returns:
The current builder instance for method chaining.
Note
This is a convenience method. For more control, use on_conflict().do_nothing().
- on_duplicate_key_update(**kwargs)[source]¶
Adds MySQL-style ON DUPLICATE KEY UPDATE clause.
- Parameters:
**kwargs¶ (
Any) – Column-value pairs to update on duplicate key.- Return type:
- Returns:
The current builder instance for method chaining.
Note
This method creates MySQL-specific ON DUPLICATE KEY UPDATE syntax. For PostgreSQL, use on_conflict() instead.
# Insert single row
query = (
sql.insert("users")
.values(name="Alice", email="[email protected]", age=30)
)
# Insert multiple rows
query = (
sql.insert("users")
.columns("name", "email", "age")
.values(
("Alice", "[email protected]", 30),
("Bob", "[email protected]", 25),
("Charlie", "[email protected]", 35)
)
)
# INSERT with RETURNING
query = (
sql.insert("users")
.values(name="Alice", email="[email protected]")
.returning("id", "created_at")
)
# INSERT from SELECT
query = (
sql.insert("archived_users")
.columns("id", "name", "email")
.from_select(
sql.select("id", "name", "email")
.from_("users")
.where("last_login < NOW() - INTERVAL '1 year'")
)
)
UPDATE Queries¶
- class sqlspec.builder.Update[source]¶
Bases:
QueryBuilder,WhereClauseMixin,ReturningClauseMixin,UpdateSetClauseMixin,UpdateFromClauseMixin,UpdateTableClauseMixin,ExplainMixinBuilder for UPDATE statements.
Constructs SQL UPDATE statements with parameter binding and validation.
Example
Update() .table(“users”) .set_(name=”John Doe”) .set_(email=”john@example.com”) .where(“id = 1”)
)
- update_query = (
Update(“users”).set_(name=”John Doe”).where(“id = 1”)
)
- update_query = (
Update() .table(“users”) .set_(status=”active”) .where_eq(“id”, 123)
)
- update_query = (
Update() .table(“users”, “u”) .set_(name=”Updated Name”) .from_(“profiles”, “p”) .where(“u.id = p.user_id AND p.is_verified = true”)
Builder for UPDATE statements.
- join(table, on, alias=None, join_type='INNER')[source]¶
Add JOIN clause to the UPDATE statement.
- Parameters:
- Return type:
Self- Returns:
The current builder instance for method chaining.
- Raises:
SQLBuilderError – If the current expression is not an UPDATE statement.
# Simple update
query = (
sql.update("users")
.set(active=False)
.where(("last_login", "<", "2024-01-01"))
)
# Multiple columns
query = (
sql.update("users")
.set(
active=True,
last_login="NOW()",
login_count="login_count + 1"
)
.where(("id", "=", 123))
)
# UPDATE with JOIN
query = (
sql.update("users u")
.set(premium=True)
.from_("orders o")
.where("u.id = o.user_id")
.where(("o.total_amount", ">", 10000))
)
# UPDATE with RETURNING
query = (
sql.update("users")
.set(salary="salary * 1.1")
.where(("department", "=", "Engineering"))
.returning("id", "name", "salary")
)
DELETE Queries¶
- class sqlspec.builder.Delete[source]¶
Bases:
QueryBuilder,WhereClauseMixin,ReturningClauseMixin,DeleteFromClauseMixin,ExplainMixinBuilder for DELETE statements.
Constructs SQL DELETE statements with parameter binding and validation. Does not support JOIN operations to maintain cross-dialect compatibility.
Builder for DELETE statements.
# Simple delete
query = sql.delete("users").where(("active", "=", False))
# Delete with multiple conditions
query = (
sql.delete("users")
.where("last_login < NOW() - INTERVAL '2 years'")
.where(("email_verified", "=", False))
)
# DELETE with RETURNING
query = (
sql.delete("users")
.where(("id", "=", 123))
.returning("id", "name", "email")
)
MERGE Queries (UPSERT)¶
- class sqlspec.builder.Merge[source]¶
Bases:
QueryBuilder,MergeUsingClauseMixin,MergeOnClauseMixin,MergeMatchedClauseMixin,MergeNotMatchedClauseMixin,MergeIntoClauseMixin,MergeNotMatchedBySourceClauseMixin,ExplainMixinBuilder for MERGE statements.
Constructs SQL MERGE statements (also known as UPSERT in some databases) with parameter binding and validation.
Builder for MERGE statements (INSERT or UPDATE based on condition).
Database Support:
✅ PostgreSQL 15+
✅ Oracle 9i+
✅ BigQuery
❌ MySQL (use INSERT … ON DUPLICATE KEY UPDATE)
❌ SQLite (use INSERT … ON CONFLICT)
Basic MERGE¶
# Simple upsert from dict
query = (
sql.merge(dialect="postgres")
.into("products", alias="t")
.using({"id": 1, "name": "Product A", "price": 19.99}, alias="src")
.on("t.id = src.id")
.when_matched_then_update(name="src.name", price="src.price")
.when_not_matched_then_insert(id="src.id", name="src.name", price="src.price")
)
# MERGE from table source
query = (
sql.merge(dialect="postgres")
.into("products", alias="t")
.using("staging_products", alias="s")
.on("t.id = s.id")
.when_matched_then_update(name="s.name", price="s.price")
.when_not_matched_then_insert(columns=["id", "name", "price"])
)
Conditional MERGE¶
# Update only if condition met
query = (
sql.merge(dialect="postgres")
.into("products", alias="t")
.using(staging_data, alias="src")
.on("t.id = src.id")
.when_matched_then_update(
condition="t.price < src.price",
price="src.price"
)
.when_not_matched_then_insert(id="src.id", name="src.name", price="src.price")
)
# Delete matched rows conditionally
query = (
sql.merge(dialect="postgres")
.into("products", alias="t")
.using({"id": 1, "discontinued": True}, alias="src")
.on("t.id = src.id")
.when_matched_then_delete(condition="src.discontinued = TRUE")
)
SQL Server Extensions¶
SQL Server supports additional WHEN NOT MATCHED BY SOURCE clauses:
# SQL Server: Handle rows in target not in source
query = (
sql.merge(dialect="tsql")
.into("products", alias="t")
.using(current_products, alias="src")
.on("t.id = src.id")
.when_matched_then_update(name="src.name", price="src.price")
.when_not_matched_then_insert(id="src.id", name="src.name", price="src.price")
.when_not_matched_by_source_then_delete() # Delete obsolete products
)
NULL Value Handling¶
MERGE automatically handles NULL values in source data:
# NULL values are properly typed
query = (
sql.merge(dialect="postgres")
.into("products", alias="t")
.using({"id": 1, "name": "Updated", "price": None}, alias="src")
.on("t.id = src.id")
.when_matched_then_update(name="src.name", price="src.price")
)
# Sets price to NULL if matched
Note
When all values for a column are NULL, PostgreSQL defaults to NUMERIC type. For other column types, provide at least one non-NULL value for accurate type inference.
Query Mixins¶
Base Classes¶
Vector Distance Functions¶
The query builder provides portable vector similarity search functions that generate dialect-specific SQL across PostgreSQL (pgvector), MySQL 9+, Oracle 23ai+, BigQuery, DuckDB, and other databases.
Note
Vector functions are designed for AI/ML similarity search with embedding vectors. The SQL is generated at build(dialect=X) time, enabling portable query definitions that execute against multiple database types.
Column Methods¶
- Column.vector_distance(other_vector, metric='euclidean')[source]¶
Calculate vector distance using the specified metric.
Generates dialect-specific SQL for vector distance operations.
- Parameters:
- Returns:
FunctionColumn expression for use in SELECT, WHERE, ORDER BY
- Return type:
FunctionColumn
Supported Metrics:
euclidean- L2 distance (default)cosine- Cosine distanceinner_product- Negative inner product (for similarity ranking)euclidean_squared- L2² distance (Oracle only)
Examples:
from sqlspec import sql from sqlspec.builder import Column query_vector = [0.1, 0.2, 0.3] # Basic distance query query = ( sql.select("id", "title", Column("embedding").vector_distance(query_vector).alias("distance")) .from_("documents") .where(Column("embedding").vector_distance(query_vector) < 0.5) .order_by("distance") .limit(10) ) # Using dynamic attribute access query = ( sql.select("*") .from_("docs") .order_by(sql.embedding.vector_distance(query_vector, metric="cosine")) .limit(10) ) # Compare two vector columns query = ( sql.select("*") .from_("pairs") .where(Column("vec1").vector_distance(Column("vec2"), metric="euclidean") < 0.3) )
- Column.cosine_similarity(other_vector)[source]¶
Calculate cosine similarity (1 - cosine_distance).
Convenience method that computes similarity instead of distance. Returns values in range [-1, 1] where 1 = identical vectors.
- Parameters:
other_vector (list[float] | Column | exp.Expression) – Vector to compare against
- Returns:
FunctionColumn expression computing
1 - cosine_distance(self, other_vector)- Return type:
FunctionColumn
Example:
from sqlspec import sql query_vector = [0.5, 0.5, 0.5] # Find most similar documents query = ( sql.select("id", "title", sql.embedding.cosine_similarity(query_vector).alias("similarity")) .from_("documents") .order_by(sql.column("similarity").desc()) .limit(10) )
Database Compatibility¶
Vector functions generate dialect-specific SQL:
Database |
Euclidean |
Cosine |
Inner Product |
|---|---|---|---|
PostgreSQL (pgvector) |
|
|
|
MySQL 9+ |
|
|
|
Oracle 23ai+ |
|
|
|
BigQuery |
|
|
|
DuckDB (VSS extension) |
|
|
|
Generic |
|
|
|
Usage Examples¶
Basic Similarity Search
from sqlspec import sql
# Find documents similar to query vector
query_vector = [0.1, 0.2, 0.3]
query = (
sql.select("id", "title", sql.embedding.vector_distance(query_vector).alias("distance"))
.from_("documents")
.order_by("distance")
.limit(10)
)
# PostgreSQL generates: SELECT id, title, embedding <-> '[0.1,0.2,0.3]' AS distance ...
# MySQL generates: SELECT id, title, DISTANCE(embedding, STRING_TO_VECTOR('[0.1,0.2,0.3]'), 'EUCLIDEAN') AS distance ...
# Oracle generates: SELECT id, title, VECTOR_DISTANCE(embedding, TO_VECTOR('[0.1,0.2,0.3]'), EUCLIDEAN) AS distance ...
Threshold Filtering
# Find documents within distance threshold
query = (
sql.select("*")
.from_("documents")
.where(sql.embedding.vector_distance(query_vector, metric="euclidean") < 0.5)
.order_by(sql.embedding.vector_distance(query_vector))
)
Similarity Ranking
# Rank by cosine similarity (higher = more similar)
query = (
sql.select("id", "content", sql.embedding.cosine_similarity(query_vector).alias("score"))
.from_("articles")
.order_by(sql.column("score").desc())
.limit(5)
)
Multiple Metrics
# Compare different distance metrics in single query
query = (
sql.select(
"id",
sql.embedding.vector_distance(query_vector, metric="euclidean").alias("l2_dist"),
sql.embedding.vector_distance(query_vector, metric="cosine").alias("cos_dist"),
sql.embedding.cosine_similarity(query_vector).alias("similarity")
)
.from_("documents")
.limit(10)
)
Combined Filters
# Vector search with additional filters
query = (
sql.select("*")
.from_("products")
.where("category = ?")
.where("in_stock = TRUE")
.where(sql.embedding.vector_distance(query_vector) < 0.3)
.order_by(sql.embedding.vector_distance(query_vector))
.limit(20)
)
Dialect-Agnostic Construction¶
Queries are constructed once and executed against multiple databases:
from sqlspec import sql
# Define query once
query = (
sql.select("id", "title", sql.embedding.vector_distance([0.1, 0.2, 0.3]).alias("distance"))
.from_("documents")
.order_by("distance")
.limit(10)
)
# Execute with different adapters
pg_result = await pg_session.execute(query) # → PostgreSQL SQL with <-> operator
mysql_result = await mysql_session.execute(query) # → MySQL SQL with DISTANCE()
oracle_result = await oracle_session.execute(query) # → Oracle SQL with VECTOR_DISTANCE()
The dialect is selected at build(dialect=X) time based on the driver, not at query construction time.
Filter Integration¶
The builder integrates with SQLSpec’s filter system:
from sqlspec.core import LimitOffsetFilter, SearchFilter, OrderByFilter
# Base query
query = sql.select("*").from_("users")
# Apply filters
filtered = query.append_filter(
SearchFilter("name", "John"),
LimitOffsetFilter(10, 0),
OrderByFilter("created_at", "desc")
)
# Execute
stmt = filtered.to_statement()
Statement Conversion¶
Convert builder to executable SQL:
# Build query
query = sql.select("*").from_("users").limit(10)
# Convert to SQL statement
stmt = query.to_statement()
# SQL object has .sql and .parameters
print(stmt.sql) # "SELECT * FROM users LIMIT 10"
print(stmt.parameters) # None or parameter dict
# Execute with driver
async with db.provide_session(config) as session:
result = await session.execute(stmt)
See Also¶
Query Builder - Builder usage guide
Core - SQL statement and filter system
SQLSpec Example Library - Code examples including standalone demo
Driver - Query execution