# MCP Integration
Datafi includes a built-in Model Context Protocol (MCP) server that enables AI assistants to interact with your data through a standardized interface. The MCP server runs on port 8002 of the coordinator and exposes Datafi's capabilities as tools that AI assistants can invoke.
## What Is MCP?
The Model Context Protocol is an open standard for connecting AI models to external data sources and tools. When an AI assistant connects to Datafi's MCP server, it can:
- Query federated data sources using natural language
- Ask questions about uploaded documents
- Execute AI agents
- Explore schemas and metadata
## Capabilities
The MCP server exposes the following capabilities as tools that AI assistants can discover and invoke.
| Capability | MCP Tool | Description |
|---|---|---|
| Data Federation | `datafi_query` | Execute SQL queries across federated data sources |
| Document Q&A | `datafi_document_qa` | Ask questions about documents stored in Datafi |
| Agent Execution | `datafi_run_agent` | Invoke a pre-configured Datafi AI agent |
| Natural Language Queries | `datafi_nl_query` | Convert a natural language question to SQL and execute it |
| Schema Exploration | `datafi_list_tables` | List available tables and their schemas |
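Per the MCP standard, clients do not hard-code this table; they discover the available tools at runtime by sending a JSON-RPC 2.0 `tools/list` request. As a minimal sketch (assuming Datafi's server follows the standard MCP envelope; the request `id` is arbitrary):

```python
import json

def tools_list_request(request_id=1):
    """Build a JSON-RPC 2.0 tools/list request, per the MCP specification."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/list",
        "params": {},
    }

# Serialize for sending over HTTP to the MCP endpoint.
payload = json.dumps(tools_list_request())
```

The server's response lists each tool's name, description, and input schema, which is how assistants like Claude learn the argument shapes shown in the sections below.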
### Data Federation

The `datafi_query` tool allows AI assistants to execute SQL queries across your connected data sources. All governance policies (RBAC, ABAC, query-level security, RLS) are enforced.

```json
{
  "tool": "datafi_query",
  "arguments": {
    "sql": "SELECT region, SUM(revenue) as total FROM sales GROUP BY region",
    "connection_id": "conn_abc123"
  }
}
```
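On the wire, an MCP client wraps a tool invocation like the one above in a JSON-RPC 2.0 `tools/call` envelope. A sketch of that wrapping, assuming Datafi's server follows the standard MCP request format (the tool name and arguments match the example above; the same helper works for every tool in the capabilities table):

```python
import json

def tools_call_request(tool, arguments, request_id=1):
    """Wrap a tool invocation in a JSON-RPC 2.0 tools/call envelope (MCP standard)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

req = tools_call_request(
    "datafi_query",
    {
        "sql": "SELECT region, SUM(revenue) as total FROM sales GROUP BY region",
        "connection_id": "conn_abc123",
    },
)
print(json.dumps(req, indent=2))
```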
### Document Q&A

The `datafi_document_qa` tool enables AI assistants to answer questions using documents that have been uploaded and indexed in Datafi.

```json
{
  "tool": "datafi_document_qa",
  "arguments": {
    "question": "What is our refund policy for enterprise customers?",
    "document_collection": "policies"
  }
}
```
### Agent Execution

The `datafi_run_agent` tool invokes a pre-configured Datafi agent. The agent's response is returned to the AI assistant, which can incorporate it into its reply.

```json
{
  "tool": "datafi_run_agent",
  "arguments": {
    "agent_id": "agent_sales_analyst",
    "message": "What were the top 5 products by revenue last quarter?"
  }
}
```
### Natural Language Queries

The `datafi_nl_query` tool converts a natural language question into SQL, executes it, and returns the results, all in a single step.

```json
{
  "tool": "datafi_nl_query",
  "arguments": {
    "question": "How many customers signed up last month?",
    "connection_id": "conn_abc123"
  }
}
```
## Connecting AI Assistants
### Claude Desktop

Add Datafi as an MCP server in your Claude Desktop configuration:

```json
{
  "mcpServers": {
    "datafi": {
      "url": "https://api.datafi.io:8002",
      "headers": {
        "Authorization": "Bearer YOUR_API_TOKEN"
      }
    }
  }
}
```
### Cursor

Configure Datafi as an MCP server in Cursor's settings:

```json
{
  "mcp": {
    "servers": {
      "datafi": {
        "url": "https://api.datafi.io:8002",
        "headers": {
          "Authorization": "Bearer YOUR_API_TOKEN"
        }
      }
    }
  }
}
```
### Generic MCP Client

Any MCP-compatible client can connect to Datafi by pointing to the MCP endpoint:

- Endpoint: `https://api.datafi.io:8002`
- Authentication: Bearer token in the `Authorization` header
- Protocol: HTTP + JSON (MCP standard)
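As a sketch, assuming the server accepts JSON-RPC 2.0 over a plain HTTP POST to that endpoint (the token value is a placeholder), an authenticated request can be prepared with the Python standard library:

```python
import json
import urllib.request

MCP_ENDPOINT = "https://api.datafi.io:8002"
API_TOKEN = "YOUR_API_TOKEN"  # placeholder; use a real integration token

def build_mcp_request(method, params):
    """Prepare an authenticated JSON-RPC request for the Datafi MCP endpoint.

    The request is only constructed here; pass it to
    urllib.request.urlopen(req) to actually send it.
    """
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": method,
        "params": params,
    }).encode("utf-8")
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_mcp_request("tools/list", {})
```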
## Governance and Security
MCP requests are subject to the same governance pipeline as all other Datafi API requests.
The MCP server authenticates requests using the same JWT tokens as the gRPC and HTTP APIs. The AI assistant must include a valid Bearer token with every request.
### User Context

When an AI assistant invokes a Datafi tool, the request is executed in the context of the authenticated user, not the AI assistant itself. This means:
- The user's RBAC roles determine which resources are accessible.
- The user's ABAC attributes are evaluated against resource conditions.
- Query-level security validates table and column access against the user's permissions.
- RLS filters the results based on the user's attributes.
## Disabling MCP

If you do not need the MCP server, you can disable it at startup using the `--no-mcp` flag.
### Docker

```bash
docker run -d \
  --name datafi-coordinator \
  -e MODE=coordinator \
  datafi/coordinator:latest --no-mcp
```
### Kubernetes

```yaml
containers:
  - name: coordinator
    image: datafi/coordinator:1.12.0
    args: ["--no-mcp"]
    ports:
      - containerPort: 8000
      - containerPort: 50051
      - containerPort: 8001
      # Port 8002 is not exposed when MCP is disabled
```
When MCP is disabled:
- Port 8002 is not opened.
- No MCP-related resources are initialized.
- All other API protocols (gRPC, gRPC-Web, HTTP) continue to function normally.
Disable MCP in environments where AI assistant integration is not needed to reduce the coordinator's surface area.
## Monitoring MCP Usage
MCP requests are logged with the same structured format as other API requests.
```json
{
  "level": "info",
  "timestamp": "2025-01-15T10:30:00Z",
  "protocol": "mcp",
  "tool": "datafi_query",
  "user_id": "user_abc123",
  "tenant_id": "tenant_001",
  "duration_ms": 230,
  "status": "success",
  "request_id": "req_mcp_001"
}
```
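Because the entries are structured JSON, they are easy to aggregate. A minimal sketch that tallies tool invocations per user from a stream of log lines (the sample lines mirror the format above, trimmed to the relevant fields):

```python
import json
from collections import Counter

def count_mcp_tool_calls(log_lines):
    """Tally (user_id, tool) pairs from MCP log entries, skipping other protocols."""
    counts = Counter()
    for line in log_lines:
        entry = json.loads(line)
        if entry.get("protocol") == "mcp":
            counts[(entry["user_id"], entry["tool"])] += 1
    return counts

logs = [
    '{"protocol": "mcp", "tool": "datafi_query", "user_id": "user_abc123", "status": "success"}',
    '{"protocol": "mcp", "tool": "datafi_query", "user_id": "user_abc123", "status": "success"}',
    '{"protocol": "http", "tool": null, "user_id": "user_abc123", "status": "success"}',
]
print(count_mcp_tool_calls(logs))  # Counter({('user_abc123', 'datafi_query'): 2})
```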
## Best Practices
- Use integration tokens for AI assistants. Create a dedicated API token for each AI assistant integration rather than using a personal user token.
- Scope tokens to minimum permissions. The token used by an AI assistant should have only the roles and attributes needed for its intended use case.
- Monitor MCP usage. Track which tools are being invoked, by which users, and how frequently to identify unusual patterns.
- Disable MCP when not needed. If your deployment does not use AI assistant integrations, disable the MCP server to reduce surface area.