Chat Interface
The Chat Interface lets you query your data using natural language. You type a question in plain English, and Datafi translates it into a validated, policy-enforced query that runs against your connected data sources. Results appear directly in the conversation thread alongside the generated query, so you can inspect exactly what was executed.
NL-to-SQL Pipeline
Every natural language question passes through a multi-stage pipeline that transforms your intent into a safe, executable query.
Stage 1: Context Assembly
When you submit a question, the pipeline assembles a rich context package that guides the LLM toward an accurate query. This context includes:
| Context Component | Description |
|---|---|
| Schema metadata | Table names, column names, data types, and relationships from your connected data sources. |
| Example queries | Curated PRQL examples that demonstrate patterns relevant to your schema. |
| Tenant customizations | Business-specific terminology, column aliases, and domain definitions configured by your administrator. |
| Conversation history | Prior questions and results from the current thread, enabling multi-turn follow-ups. |
The quality of the generated query depends heavily on context. Datafi automatically selects the most relevant schema elements and examples for your question, rather than sending your entire data catalog to the LLM. This keeps token usage efficient and improves accuracy.
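The relevance-based selection described above can be pictured as a scorer over the catalog. Everything in this sketch (the function name, the keyword-overlap heuristic, the sample catalog) is illustrative, not Datafi's actual implementation, which would use richer retrieval:

```python
# Toy sketch of relevance-based context selection: score each table by how many
# question words match its name or column names, then keep only the top hits
# instead of shipping the whole catalog to the LLM. All names are illustrative.

def select_relevant_tables(question: str, catalog: dict, top_k: int = 2) -> list:
    words = {w.strip("?,.").lower() for w in question.split()}
    scores = {
        table: len(words & {table.lower(), *(c.lower() for c in columns)})
        for table, columns in catalog.items()
    }
    ranked = sorted(catalog, key=lambda t: scores[t], reverse=True)
    return [t for t in ranked[:top_k] if scores[t] > 0]

catalog = {
    "sales": ["region", "amount", "quarter"],
    "employees": ["name", "salary", "department"],
    "inventory": ["sku", "warehouse", "stock"],
}
print(select_relevant_tables("What were sales by region last quarter?", catalog))
# → ['sales']
```

A production system would presumably use embedding similarity rather than word overlap, but the budget-trimming shape is the same: rank, cut, and send only what the question needs.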
Stage 2: LLM Generation
The assembled context and your question are sent to your tenant's configured LLM provider (OpenAI, Azure OpenAI, AWS Bedrock, or Snowflake Cortex). The LLM generates a response in PRQL (Pipelined Relational Query Language), which serves as a database-agnostic intermediate representation.
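One way to picture the hand-off is a provider-neutral chat payload that per-vendor adapters then translate into each provider's request format. The shape and field names below are assumptions for illustration, not Datafi's wire format:

```python
# Hypothetical prompt assembly: schema, curated examples, and prior turns are
# folded into one chat payload; adapters for OpenAI, Azure OpenAI, Bedrock, or
# Cortex would translate this neutral shape into each vendor's API request.

def build_prompt(schema: str, examples: list, history: list, question: str) -> list:
    system = "Translate the user's question into PRQL.\n\nSchema:\n" + schema
    if examples:
        system += "\n\nExamples:\n" + "\n".join(examples)
    messages = [{"role": "system", "content": system}]
    for past_q, past_a in history:
        messages.append({"role": "user", "content": past_q})
        messages.append({"role": "assistant", "content": past_a})
    messages.append({"role": "user", "content": question})
    return messages
```

Note that conversation history rides along as alternating user/assistant turns, which is what makes the multi-turn follow-ups described later possible.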
Stage 3: PRQL to SQL Compilation
The generated PRQL is compiled into database-specific SQL by the Datafi query engine. This compilation step handles dialect differences across Snowflake, PostgreSQL, MSSQL, MySQL, BigQuery, and Redshift, so the LLM does not need to know which database you are targeting.
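Dialect handling is easiest to see with a row limit, which SQL Server spells differently from the other supported databases. This mini-renderer is a stand-in for the real compiler, which covers far more (identifier quoting, date functions, pagination):

```python
# Illustrative dialect divergence: most supported databases accept LIMIT n,
# but SQL Server uses SELECT TOP n instead.

def compile_limit(select_sql: str, n: int, dialect: str) -> str:
    if dialect == "mssql":
        return select_sql.replace("SELECT", f"SELECT TOP {n}", 1)
    # Snowflake, PostgreSQL, MySQL, BigQuery, and Redshift all accept LIMIT.
    return f"{select_sql} LIMIT {n}"

print(compile_limit("SELECT region, amount FROM sales", 5, "snowflake"))
# → SELECT region, amount FROM sales LIMIT 5
print(compile_limit("SELECT region, amount FROM sales", 5, "mssql"))
# → SELECT TOP 5 region, amount FROM sales
```

Because the compiler absorbs these differences, the LLM can emit one PRQL pipeline regardless of which database the query ultimately targets.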
Stage 4: Access Control Validation
Before execution, the compiled SQL is validated against your ABAC policies. The validation engine checks:
- Table-level access -- Can you query these tables?
- Column-level access -- Are any restricted columns referenced?
- Row-level filters -- Do any row-level policies need to be injected?
- Data masking -- Should any columns be masked or redacted in the result?
If the query violates a policy, it is rejected with an explanation. The LLM-generated query never bypasses governance.
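A minimal sketch of the four checks, assuming the query arrives in parsed form (table, columns, filters); the policy fields and names here are hypothetical, not Datafi's policy schema:

```python
# Hypothetical ABAC validation over a parsed query. Table and column checks
# reject outright; row-level policies are injected as extra filters; masked
# columns are tagged so the result layer can redact them.

class PolicyViolation(Exception):
    pass

def enforce(query: dict, policy: dict) -> dict:
    if query["table"] not in policy["allowed_tables"]:
        raise PolicyViolation(f"no access to table {query['table']}")
    blocked = set(query["columns"]) & set(policy.get("restricted_columns", ()))
    if blocked:
        raise PolicyViolation(f"restricted columns referenced: {sorted(blocked)}")
    governed = dict(query)
    row_filter = policy.get("row_filters", {}).get(query["table"])
    if row_filter:
        governed["filters"] = list(query.get("filters", [])) + [row_filter]
    governed["masked"] = [c for c in query["columns"]
                          if c in policy.get("masked_columns", ())]
    return governed
```

Rejection raises with an explanation, matching the behavior described above: the generated query either comes out governed or does not run at all.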
Stage 5: Execution and Results
The validated query is routed to the appropriate Edge node and executed against your data source, and the results are returned to the chat thread. You see both the answer and the generated PRQL, so you can verify the logic.
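Taken together, the five stages compose into a sequential pipeline. As a toy illustration, with every stage body below a stand-in for the real one, and with the understanding that a failing stage (such as a policy rejection) raises and halts the run:

```python
def run_pipeline(question: str, stages: list):
    value = question
    for stage in stages:
        value = stage(value)  # any stage may raise to halt the pipeline
    return value

# Stand-in stages; the real ones are described in Stages 1-5 above.
stages = [
    lambda q: {"question": q, "context": "toy schema"},         # 1: context assembly
    lambda ctx: "from sales | aggregate {total = sum amount}",  # 2: LLM -> PRQL
    lambda prql: "SELECT SUM(amount) AS total FROM sales",      # 3: PRQL -> SQL
    lambda sql: sql,                                            # 4: ABAC (pass-through here)
]
print(run_pipeline("What were total sales?", stages))           # 5: execute + return
```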
Conversation Threads
Each chat session is organized as a conversation thread. Threads maintain context across multiple questions, so you can ask follow-up questions without restating your intent.
Multi-Turn Conversations
The pipeline includes conversation history in the context assembly stage. This means you can have exchanges like:
You: What were total sales last quarter?
Datafi: Total sales for Q3 2025 were $4.2M. [shows PRQL + results]
You: Break that down by region.
Datafi: [generates query referencing the same time period and metric, grouped by region]
You: Which region had the highest growth compared to Q2?
Datafi: [generates a comparative query across Q2 and Q3, ranked by growth]
Each follow-up question is interpreted in the context of the previous exchange. The pipeline carries forward the relevant table references, filters, and time ranges from earlier turns.
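Carry-forward can be pictured as merging the parsed follow-up over the previous turn's query, with unstated fields inherited; the field names here are invented for illustration:

```python
from typing import Optional

# Toy follow-up resolution: fields the new question leaves unsaid ("break that
# down by region" names no table, metric, or time range) are inherited from
# the previous turn; anything the new question does state overrides.

def resolve_followup(prior: Optional[dict], parsed: dict) -> dict:
    if prior is None:
        return dict(parsed)
    resolved = dict(prior)
    resolved.update({k: v for k, v in parsed.items() if v is not None})
    return resolved

turn1 = {"table": "sales", "metric": "sum amount",
         "filters": ["quarter == '2025-Q3'"], "group_by": None}
turn2 = {"table": None, "metric": None, "filters": None, "group_by": "region"}
print(resolve_followup(turn1, turn2))
```

The merged result keeps the Q3 filter and the sales metric from the first turn while adding the regional grouping from the second, which is exactly the behavior shown in the exchange above.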
Thread Management
- New thread -- Start a fresh conversation with no prior context.
- Thread history -- Access previous threads from the sidebar. Each thread preserves the full question-and-answer sequence.
- Pin a thread -- Pin important threads for quick access.
- Share a thread -- Share a thread with a colleague. The recipient sees the questions and results, subject to their own access policies.
Voice Input
Datafi supports speech-to-text input, allowing you to ask questions using your microphone. Voice input is processed through a speech recognition service and then fed into the same NL-to-SQL pipeline as typed questions.
To use voice input:
- Click the microphone icon in the chat input bar.
- Speak your question clearly.
- Review the transcribed text in the input field.
- Press Enter or click Send to submit.
Voice input works best for concise, direct questions. For complex queries with specific column names or technical terms, you may want to type the question or edit the transcription before submitting.
Generated Query Inspection
Every response in the chat thread includes the generated PRQL query. You can:
- Copy the PRQL -- Use it in a Data View or query panel for further exploration.
- View the compiled SQL -- Expand the query detail to see the database-specific SQL that was executed.
- Edit and re-run -- Modify the generated PRQL and re-execute it without starting a new question.
Accuracy and Feedback
If a generated query does not match your intent, you can provide feedback directly in the thread:
- Thumbs up / thumbs down -- Rate the quality of the response. This feedback is used to improve context assembly over time.
- Correct and re-run -- Edit the generated PRQL to fix the query and submit the correction. The corrected query is stored as an example for future context assembly.
Corrections you provide are added to your tenant's example query library. Over time, this makes the NL-to-SQL pipeline more accurate for your specific schema and business terminology.
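The feedback loop amounts to a tenant-scoped example store that future context assembly retrieves from. This sketch uses naive word overlap where a real system would use similarity search, and the PRQL strings are illustrative:

```python
# Hypothetical tenant example library: corrections are stored as
# question -> PRQL pairs and surfaced during future context assembly.

class ExampleLibrary:
    def __init__(self):
        self.examples = []  # list of (question, corrected_prql)

    def record_correction(self, question: str, corrected_prql: str) -> None:
        self.examples.append((question, corrected_prql))

    def most_relevant(self, question: str, k: int = 3) -> list:
        words = set(question.lower().split())
        return sorted(self.examples,
                      key=lambda ex: len(words & set(ex[0].lower().split())),
                      reverse=True)[:k]

lib = ExampleLibrary()
lib.record_correction("total sales by region",
                      "from sales | group region (aggregate {total = sum amount})")
lib.record_correction("headcount by department",
                      "from employees | group department (aggregate {n = count this})")
print(lib.most_relevant("sales by region last month", k=1)[0][0])
# → total sales by region
```

Each stored correction becomes a candidate example for Stage 1, which is why accuracy improves for your specific schema and terminology as the library grows.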
Next Steps
- Document Intelligence -- Extract structured data from documents and images.
- Agent Catalog -- Run pre-built agents for common data tasks.
- Query Reference -- Learn PRQL syntax for manual query authoring.