Skip to main content

Query-Level Security

Query-level security is Datafi's pre-execution validation layer. Before any SQL statement reaches your data sources, Datafi parses and introspects the query to verify that the requesting user has permission to access every referenced table and column. Unauthorized queries are rejected immediately -- they never execute against your database.

How SQL Introspection Works

When a query arrives at the Datafi coordinator, it passes through the following pipeline:

Step 1: Parse the SQL Abstract Syntax Tree

Datafi parses the incoming SQL into an abstract syntax tree (AST). This allows the engine to understand the query's structure regardless of formatting, aliasing, or complexity.

-- Original query
SELECT c.name, c.email, o.total
FROM customers AS c
JOIN orders AS o ON c.id = o.customer_id
WHERE c.region = 'us-west';

The parser extracts:

  • Tables: customers, orders
  • Columns: customers.name, customers.email, orders.total, customers.id, orders.customer_id, customers.region

Step 2: Validate Table Permissions

Each extracted table is checked against the user's access policies. If the user does not have an explicit Allow rule for a table, the query is rejected.

{
"user": "analyst_jane",
"permitted_tables": ["customers", "orders", "products"],
"denied_tables": ["payroll", "credentials"]
}

Step 3: Validate Column Permissions

Even if a user has access to a table, they may not have access to every column. Column-level policies are evaluated next.

{
"table": "customers",
"user": "analyst_jane",
"permitted_columns": ["name", "email", "region", "city"],
"denied_columns": ["ssn", "dob", "salary"]
}

Step 4: Decision

If all tables and columns are permitted, the query proceeds to execution. If any table or column is denied, the entire query is rejected with a 403 Forbidden response.

Query Rejection

When a query is rejected, Datafi returns an error response that identifies the violation without leaking schema information.

{
"error": {
"code": "QUERY_UNAUTHORIZED",
"message": "Access denied: you do not have permission to query one or more referenced resources.",
"request_id": "req_7f8a9b2c"
}
}
warning

Error messages are deliberately vague about which specific table or column caused the rejection. This prevents unauthorized users from probing the schema through trial-and-error queries.

Supported SQL Constructs

The introspection engine handles a wide range of SQL constructs, including those that attempt to obscure the tables and columns being accessed.

ConstructHandledNotes
SELECTYesAll selected columns are validated
JOINYesTables and join columns are validated
WHEREYesColumns in filter conditions are validated
GROUP BYYesGrouping columns are validated
ORDER BYYesSorting columns are validated
SubqueriesYesRecursively introspected
CTEs (WITH)YesEach CTE is independently validated
UNION / INTERSECTYesAll branches are validated
AliasesYesAliases are resolved to source tables/columns
SELECT *YesExpanded to all columns and validated individually
FunctionsYesColumn arguments inside functions are validated

Handling SELECT *

When a user submits a SELECT * query, Datafi expands the wildcard to the full column list and validates each column individually. If any column is denied, the query is rejected.

-- User submits:
SELECT * FROM customers;

-- Datafi expands to:
-- SELECT id, name, email, ssn, dob, salary, region FROM customers;
-- Then validates each column against the user's permissions.
tip

Encourage your users to specify explicit column lists rather than using SELECT *. This avoids surprises when column-level policies are in place and makes queries more efficient.

Agent Query Validation

When an AI agent generates and executes SQL, the same introspection pipeline applies. The agent's queries are validated against the invoking user's permissions, not the agent's own identity.

SQL Injection Prevention

The introspection engine also serves as a defense against SQL injection. Because every query is parsed into an AST before execution, malformed or injected SQL fragments are detected during parsing.

  • Parameterized queries are enforced at the coordinator level.
  • Multiple statement injection (e.g., ; DROP TABLE) is detected and rejected.
  • Comment-based injection (e.g., -- or /* */ used to bypass filters) is neutralized during AST parsing.
-- Attempted injection:
SELECT * FROM customers WHERE id = '1'; DROP TABLE customers; --'

-- Datafi detects multiple statements and rejects the query.

Monitoring and Auditing

All query validation decisions are logged for auditing purposes. Each log entry includes:

FieldDescription
request_idUnique identifier for the request
user_idThe authenticated user
query_hashSHA-256 hash of the submitted query
tables_accessedList of tables referenced
columns_accessedList of columns referenced
decisionALLOWED or DENIED
denied_reasonSpecific table or column that caused denial (internal only)
timestampISO 8601 timestamp
info

Audit logs for denied queries retain the full query text for forensic analysis. These logs are accessible only to tenant administrators.

Best Practices

  1. Define column-level policies for sensitive tables. Table-level access alone is rarely granular enough for compliance requirements.
  2. Discourage SELECT * in production. Wildcard selects are expanded and validated against all columns, which can lead to unexpected denials.
  3. Review audit logs regularly. A high volume of denied queries from a single user may indicate misconfiguration or a probing attempt.
  4. Combine with RLS. Query-level security controls which tables and columns a user can reference. RLS controls which rows they can see. Use both for comprehensive governance.