DQL Syntax Reference
DQL (Datafi Query Language) is a SQL-compatible query language supported across Datafi's connected datasources. If you have written SQL before, DQL will feel immediately familiar. This page documents every supported keyword, clause, and function with examples.
SELECT
SELECT specifies the columns to return. Use * for all columns or list specific columns with optional aliases.
SELECT first_name AS name, total_amount AS amount
FROM orders
WHERE
WHERE filters rows based on a condition.
SELECT order_id, total_amount
FROM orders
WHERE total_amount > 1000
Comparison Operators
| Operator | Meaning |
|---|---|
= | Equal to |
!= or <> | Not equal to |
> / < | Greater than / Less than |
>= / <= | Greater or equal / Less or equal |
AND / OR
Combine multiple conditions in a WHERE clause. Use parentheses to control precedence.
SELECT *
FROM orders
WHERE (status = 'pending' OR status = 'processing')
AND total_amount > 200
LIKE
LIKE performs pattern matching on text columns. Use % as a wildcard for any sequence of characters and _ for a single character.
SELECT customer_id, email
FROM customers
WHERE email LIKE '%@datafi.us'
JOIN
JOIN combines rows from two or more tables. DQL supports INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.
SELECT orders.order_id, customers.first_name, orders.total_amount
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id
SELECT customers.first_name, orders.order_id
FROM customers
LEFT JOIN orders ON customers.customer_id = orders.customer_id
CSV datasources do not support JOINs. See DQL Compatibility for the full matrix.
GROUP BY
GROUP BY groups rows that share a value in one or more columns so you can run aggregate functions on each group.
SELECT region, status, SUM(total_amount) AS revenue
FROM orders
GROUP BY region, status
Aggregate Functions
| Function | Description | Example |
|---|---|---|
COUNT(col) | Number of non-null values. | COUNT(order_id) |
SUM(col) | Sum of numeric values. | SUM(total_amount) |
AVG(col) | Average of numeric values. | AVG(total_amount) |
MIN(col) | Minimum value. | MIN(order_date) |
MAX(col) | Maximum value. | MAX(total_amount) |
SELECT
COUNT(order_id) AS total_orders,
SUM(total_amount) AS total_revenue,
AVG(total_amount) AS avg_order_value,
MIN(total_amount) AS smallest_order,
MAX(total_amount) AS largest_order
FROM orders
ORDER BY
ORDER BY sorts the result set. Use ASC for ascending (default) and DESC for descending.
SELECT region, customer_id, total_amount
FROM orders
ORDER BY region ASC, total_amount DESC
LIMIT
LIMIT restricts the number of rows returned.
SELECT order_id, total_amount
FROM orders
ORDER BY total_amount DESC
LIMIT 10
Always use LIMIT during exploration to avoid pulling large result sets from the source.
Putting It All Together
SELECT
customers.region,
COUNT(orders.order_id) AS order_count,
SUM(orders.total_amount) AS total_revenue,
AVG(orders.total_amount) AS avg_order_value
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id
WHERE orders.order_date >= '2025-01-01'
AND orders.status = 'completed'
GROUP BY customers.region
ORDER BY total_revenue DESC
LIMIT 20
This query joins orders with customers, filters for completed orders in 2025, groups by region, computes aggregate metrics, sorts by revenue, and returns the top 20 regions.
Next Steps
- DQL Compatibility -- Check which DQL features are supported on each connector.
- Query Examples -- Browse practical, runnable examples for real-world scenarios.
- PRQL Guide -- Explore the pipeline-based alternative to DQL.