Query-Level Security
Query-level security is Datafi's pre-execution validation layer. Before any SQL statement reaches your data sources, Datafi parses and introspects the query to verify that the requesting user has permission to access every referenced table and column. Unauthorized queries are rejected immediately -- they never execute against your database.
How SQL Introspection Works
When a query arrives at the Datafi coordinator, it passes through the following pipeline:
Step 1: Parse the SQL Abstract Syntax Tree
Datafi parses the incoming SQL into an abstract syntax tree (AST). This allows the engine to understand the query's structure regardless of formatting, aliasing, or complexity.
-- Original query
SELECT c.name, c.email, o.total
FROM customers AS c
JOIN orders AS o ON c.id = o.customer_id
WHERE c.region = 'us-west';
The parser extracts:
- Tables:
customers,orders - Columns:
customers.name,customers.email,orders.total,customers.id,orders.customer_id,customers.region
Step 2: Validate Table Permissions
Each extracted table is checked against the user's access policies. If the user does not have an explicit Allow rule for a table, the query is rejected.
{
"user": "analyst_jane",
"permitted_tables": ["customers", "orders", "products"],
"denied_tables": ["payroll", "credentials"]
}
Step 3: Validate Column Permissions
Even if a user has access to a table, they may not have access to every column. Column-level policies are evaluated next.
{
"table": "customers",
"user": "analyst_jane",
"permitted_columns": ["name", "email", "region", "city"],
"denied_columns": ["ssn", "dob", "salary"]
}
Step 4: Decision
If all tables and columns are permitted, the query proceeds to execution. If any table or column is denied, the entire query is rejected with a 403 Forbidden response.
Query Rejection
When a query is rejected, Datafi returns an error response that identifies the violation without leaking schema information.
{
"error": {
"code": "QUERY_UNAUTHORIZED",
"message": "Access denied: you do not have permission to query one or more referenced resources.",
"request_id": "req_7f8a9b2c"
}
}
Error messages are deliberately vague about which specific table or column caused the rejection. This prevents unauthorized users from probing the schema through trial-and-error queries.
Supported SQL Constructs
The introspection engine handles a wide range of SQL constructs, including those that attempt to obscure the tables and columns being accessed.
| Construct | Handled | Notes |
|---|---|---|
SELECT | Yes | All selected columns are validated |
JOIN | Yes | Tables and join columns are validated |
WHERE | Yes | Columns in filter conditions are validated |
GROUP BY | Yes | Grouping columns are validated |
ORDER BY | Yes | Sorting columns are validated |
Subqueries | Yes | Recursively introspected |
CTEs (WITH) | Yes | Each CTE is independently validated |
UNION / INTERSECT | Yes | All branches are validated |
Aliases | Yes | Aliases are resolved to source tables/columns |
SELECT * | Yes | Expanded to all columns and validated individually |
Functions | Yes | Column arguments inside functions are validated |
Handling SELECT *
When a user submits a SELECT * query, Datafi expands the wildcard to the full column list and validates each column individually. If any column is denied, the query is rejected.
-- User submits:
SELECT * FROM customers;
-- Datafi expands to:
-- SELECT id, name, email, ssn, dob, salary, region FROM customers;
-- Then validates each column against the user's permissions.
Encourage your users to specify explicit column lists rather than using SELECT *. This avoids surprises when column-level policies are in place and makes queries more efficient.
Agent Query Validation
When an AI agent generates and executes SQL, the same introspection pipeline applies. The agent's queries are validated against the invoking user's permissions, not the agent's own identity.
SQL Injection Prevention
The introspection engine also serves as a defense against SQL injection. Because every query is parsed into an AST before execution, malformed or injected SQL fragments are detected during parsing.
- Parameterized queries are enforced at the coordinator level.
- Multiple statement injection (e.g.,
; DROP TABLE) is detected and rejected. - Comment-based injection (e.g.,
--or/* */used to bypass filters) is neutralized during AST parsing.
-- Attempted injection:
SELECT * FROM customers WHERE id = '1'; DROP TABLE customers; --'
-- Datafi detects multiple statements and rejects the query.
Monitoring and Auditing
All query validation decisions are logged for auditing purposes. Each log entry includes:
| Field | Description |
|---|---|
request_id | Unique identifier for the request |
user_id | The authenticated user |
query_hash | SHA-256 hash of the submitted query |
tables_accessed | List of tables referenced |
columns_accessed | List of columns referenced |
decision | ALLOWED or DENIED |
denied_reason | Specific table or column that caused denial (internal only) |
timestamp | ISO 8601 timestamp |
Audit logs for denied queries retain the full query text for forensic analysis. These logs are accessible only to tenant administrators.
Best Practices
- Define column-level policies for sensitive tables. Table-level access alone is rarely granular enough for compliance requirements.
- Discourage
SELECT *in production. Wildcard selects are expanded and validated against all columns, which can lead to unexpected denials. - Review audit logs regularly. A high volume of denied queries from a single user may indicate misconfiguration or a probing attempt.
- Combine with RLS. Query-level security controls which tables and columns a user can reference. RLS controls which rows they can see. Use both for comprehensive governance.