MCP Integration

Datafi includes a built-in Model Context Protocol (MCP) server that enables AI assistants to interact with your data through a standardized interface. The MCP server runs on port 8002 of the coordinator and exposes Datafi's capabilities as tools that AI assistants can invoke.

What Is MCP?

The Model Context Protocol is an open standard for connecting AI models to external data sources and tools. When an AI assistant connects to Datafi's MCP server, it can:

  • Query federated data sources using natural language
  • Ask questions about uploaded documents
  • Execute AI agents
  • Explore schemas and metadata

Capabilities

The MCP server exposes the following capabilities as tools that AI assistants can discover and invoke.

Capability                  MCP Tool             Description
Data Federation             datafi_query         Execute SQL queries across federated data sources
Document Q&A                datafi_document_qa   Ask questions about documents stored in Datafi
Agent Execution             datafi_run_agent     Invoke a pre-configured Datafi AI agent
Natural Language Queries    datafi_nl_query      Convert a natural language question to SQL and execute it
Schema Exploration          datafi_list_tables   List available tables and their schemas

Data Federation

The datafi_query tool allows AI assistants to execute SQL queries across your connected data sources. All governance policies (RBAC, ABAC, query-level security, RLS) are enforced.

{
  "tool": "datafi_query",
  "arguments": {
    "sql": "SELECT region, SUM(revenue) as total FROM sales GROUP BY region",
    "connection_id": "conn_abc123"
  }
}
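On the wire, a tool invocation like the one above is carried inside an MCP `tools/call` request, which uses a JSON-RPC 2.0 envelope. The sketch below builds that payload in Python; the tool name and arguments come from the example above, while the helper function itself is illustrative, not part of any Datafi SDK.

```python
import json

def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> str:
    """Wrap a tool invocation in an MCP tools/call JSON-RPC 2.0 envelope."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

payload = build_tool_call("datafi_query", {
    "sql": "SELECT region, SUM(revenue) as total FROM sales GROUP BY region",
    "connection_id": "conn_abc123",
})
print(payload)
```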

Document Q&A

The datafi_document_qa tool enables AI assistants to answer questions using documents that have been uploaded and indexed in Datafi.

{
  "tool": "datafi_document_qa",
  "arguments": {
    "question": "What is our refund policy for enterprise customers?",
    "document_collection": "policies"
  }
}

Agent Execution

The datafi_run_agent tool invokes a pre-configured Datafi agent. The agent's response is returned to the AI assistant, which can incorporate it into its reply.

{
  "tool": "datafi_run_agent",
  "arguments": {
    "agent_id": "agent_sales_analyst",
    "message": "What were the top 5 products by revenue last quarter?"
  }
}

Natural Language Queries

The datafi_nl_query tool converts a natural language question into SQL, executes it, and returns the results -- all in a single step.

{
  "tool": "datafi_nl_query",
  "arguments": {
    "question": "How many customers signed up last month?",
    "connection_id": "conn_abc123"
  }
}

Connecting AI Assistants

Claude Desktop

Add Datafi as an MCP server in your Claude Desktop configuration:

{
  "mcpServers": {
    "datafi": {
      "url": "https://api.datafi.io:8002",
      "headers": {
        "Authorization": "Bearer YOUR_API_TOKEN"
      }
    }
  }
}

Cursor

Configure Datafi as an MCP server in Cursor's settings:

{
  "mcp": {
    "servers": {
      "datafi": {
        "url": "https://api.datafi.io:8002",
        "headers": {
          "Authorization": "Bearer YOUR_API_TOKEN"
        }
      }
    }
  }
}

Generic MCP Client

Any MCP-compatible client can connect to Datafi by pointing to the MCP endpoint:

Endpoint: https://api.datafi.io:8002
Authentication: Bearer token in the Authorization header
Protocol: HTTP + JSON (MCP standard)
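Putting the three pieces together, a minimal client prepares an HTTP POST to the endpoint with the Bearer token in the Authorization header and a JSON body. The Python sketch below uses only the standard library; the `tools/call` envelope is the MCP standard shape, and the request is built but not sent (sending it requires a live coordinator and a valid token).

```python
import json
import urllib.request

MCP_ENDPOINT = "https://api.datafi.io:8002"

def make_request(token: str, tool: str, arguments: dict) -> urllib.request.Request:
    """Prepare an authenticated MCP tools/call request (not yet sent)."""
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }).encode()
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = make_request("YOUR_API_TOKEN", "datafi_list_tables", {})
# Send with: urllib.request.urlopen(req)
```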

Governance and Security

MCP requests are subject to the same governance pipeline as all other Datafi API requests.

info

The MCP server authenticates requests using the same JWT tokens as the gRPC and HTTP APIs. The AI assistant must include a valid Bearer token with every request.

User Context

When an AI assistant invokes a Datafi tool, the request is executed in the context of the authenticated user -- not the AI assistant itself. This means:

  • The user's RBAC roles determine which resources are accessible.
  • The user's ABAC attributes are evaluated against resource conditions.
  • Query-level security validates table and column access against the user's permissions.
  • RLS filters the results based on the user's attributes.
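To make the last point concrete, here is a toy illustration of row-level filtering. This is not Datafi's implementation — the real RLS enforcement happens server-side in the governance pipeline — and the `regions` attribute is a hypothetical ABAC attribute chosen for the example.

```python
# Toy RLS sketch: keep only rows whose region is in the user's
# allowed regions. Datafi enforces this server-side; this just
# illustrates the behavior described above.
def apply_rls(rows, user_attributes):
    allowed = set(user_attributes.get("regions", []))
    return [row for row in rows if row["region"] in allowed]

rows = [
    {"region": "EMEA", "total": 120000},
    {"region": "APAC", "total": 95000},
    {"region": "AMER", "total": 143000},
]
user = {"regions": ["EMEA", "AMER"]}  # hypothetical ABAC attributes
filtered = apply_rls(rows, user)
print(filtered)  # the APAC row is filtered out
```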

Disabling MCP

If you do not need the MCP server, you can disable it at startup using the --no-mcp flag.

Docker

docker run -d \
  --name datafi-coordinator \
  -e MODE=coordinator \
  datafi/coordinator:latest --no-mcp

Kubernetes

containers:
  - name: coordinator
    image: datafi/coordinator:1.12.0
    args: ["--no-mcp"]
    ports:
      - containerPort: 8000
      - containerPort: 50051
      - containerPort: 8001
      # Port 8002 is not exposed when MCP is disabled

When MCP is disabled:

  • Port 8002 is not opened.
  • No MCP-related resources are initialized.
  • All other API protocols (gRPC, gRPC-Web, HTTP) continue to function normally.
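One quick way to confirm the flag took effect is to check whether anything is accepting TCP connections on port 8002. A small sketch using the standard library (the hostname in the comment is a placeholder for your coordinator's address):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# With --no-mcp, the coordinator should not listen on 8002, e.g.:
# port_open("coordinator.example.internal", 8002)  # expect False
```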

tip

Disable MCP in environments where AI assistant integration is not needed to reduce the coordinator's surface area.

Monitoring MCP Usage

MCP requests are logged with the same structured format as other API requests.

{
  "level": "info",
  "timestamp": "2025-01-15T10:30:00Z",
  "protocol": "mcp",
  "tool": "datafi_query",
  "user_id": "user_abc123",
  "tenant_id": "tenant_001",
  "duration_ms": 230,
  "status": "success",
  "request_id": "req_mcp_001"
}
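Because the logs are structured JSON, tool usage is easy to aggregate. The sketch below counts invocations per user and tool, assuming one JSON object per line and the `protocol`, `tool`, and `user_id` fields shown above; the sample log lines are illustrative.

```python
import json
from collections import Counter

def mcp_tool_counts(log_lines):
    """Count MCP tool invocations per (user_id, tool) from JSON log lines."""
    counts = Counter()
    for line in log_lines:
        entry = json.loads(line)
        if entry.get("protocol") == "mcp":
            counts[(entry["user_id"], entry["tool"])] += 1
    return counts

logs = [
    '{"protocol": "mcp", "tool": "datafi_query", "user_id": "user_abc123"}',
    '{"protocol": "mcp", "tool": "datafi_query", "user_id": "user_abc123"}',
    '{"protocol": "http", "path": "/v1/query", "user_id": "user_abc123"}',
]
counts = mcp_tool_counts(logs)
print(counts)  # user_abc123 invoked datafi_query twice; the HTTP entry is ignored
```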

Best Practices

  1. Use integration tokens for AI assistants. Create a dedicated API token for each AI assistant integration rather than using a personal user token.
  2. Scope tokens to minimum permissions. The token used by an AI assistant should have only the roles and attributes needed for its intended use case.
  3. Monitor MCP usage. Track which tools are being invoked, by which users, and how frequently to identify unusual patterns.
  4. Disable MCP when not needed. If your deployment does not use AI assistant integrations, disable the MCP server to reduce surface area.