Fundamentals: Partial Query Results
Query evaluation in samtSQL terminates for one of three reasons. First, query evaluation terminates after samtSQL applies all AI operators to all relevant rows. In this scenario, the query result produced by samtSQL is complete. Second, query evaluation also terminates if user-defined thresholds limiting processing overheads are reached (e.g., query evaluation time or LLM token consumption limits). In this scenario, samtSQL may not have applied all AI operators to all input data. Hence, the query result produced by samtSQL is a partial result. Third, samtSQL may terminate evaluating SQL queries with a LIMIT clause once the required number of rows is available. While that matches standard SQL semantics, samtSQL may use rows with unevaluated AI operators to satisfy the LIMIT clause. This can lead to surprising behavior if the query formulation does not take this possibility into account.
Dealing with partial results
Whenever the result of an AI operator for a given row is unavailable, samtSQL uses the SQL NULL value as default value. It is not guaranteed that each row appearing in the final query result (after fully evaluating AI operators) appears in the partial result.
To make partial results maximally useful, it is recommended to anticipate early termination already when formulating the query. In particular, the query should check for SQL NULL values in the results of AI operators, indicating unevaluated rows.
NULL checks as well as other processing logic.
The following examples show partial result handling for a few common query types.
Retrieving rows passing AI filters
A common query type uses one or multiple AI filter operators to retrieve rows satisfying certain conditions. This query type handles partial results in a natural way. If query evaluation terminates before AI operators have been applied to all rows, the remaining rows are automatically assigned to SQL NULL values. Such rows will not appear in the query result.
SELECT description
FROM cars_images
WHERE AIFILTER(content, 'this is a red car')
LIMIT 10;
Applying AI map operator to input rows
The following query is intended to retrieve summaries of ten movie reviews.
SELECT AIMAP(text, 'summarize in three sentences') AS summary
FROM reviews
LIMIT 10;
However, as written, the query may terminate quickly resulting in ten result rows whose summary column is set to NULL. Even before evaluating the AI map operator on result rows, they are automatically assigned to a value of NULL. Hence, they technically satisfy the SQL LIMIT clause, but without generating the desired results.
The following extended version of the query prevents such cases and returns useful, partial results, even if query evaluation terminates early.
SELECT summary
FROM (
SELECT AIMAP(text, 'summarize in three sentences') AS summary
FROM reviews
) AS sub
WHERE summary IS NOT NULL
LIMIT 10;
A CTE form is equivalent and often easier to read for longer queries:
WITH labeled AS (
SELECT AIMAP(text, 'summarize in three sentences') AS summary
FROM reviews
)
SELECT summary
FROM labeled
WHERE summary IS NOT NULL
LIMIT 10;
The outer WHERE summary IS NOT NULL filters out rows that did not finish processing before LIMIT 10 is applied, so LIMIT counts only fully-evaluated rows.
Anti-pattern: do not repeat the AI call inside WHERE
The following query formulation will not lead to the desired result:
-- DO NOT DO THIS
SELECT AIMAP(text, 'summarize in three sentences') AS summary
FROM reviews
WHERE AIMAP(text, 'summarize in three sentences') IS NOT NULL
LIMIT 10;
The two AIMAP occurrences become two separate operator instances. This increases costs without producing reliable results, since it is not guaranteed that having evaluated the second operator instance on a specific row implies having evaluated the first operator instance on that same row as well.
samtSQL may emit advisory warnings for query shapes like this. See Warnings.
Rejected statement types
The /query endpoint executes only SELECT statements (including set operations such as UNION and CTEs). UPDATE, DELETE, INSERT, CREATE, DROP, and other statement types are rejected with HTTP 400 and a clear error message. Statements that combine non-SELECT shapes with AI operators (e.g. DELETE FROM t WHERE AIFILTER(...)) are rejected with a more specific message because AI operators are currently supported only inside SELECT queries.