API Overview
Datafi exposes a multi-protocol API surface that supports gRPC, gRPC-Web, HTTP, and the Model Context Protocol (MCP). All protocols share the same authentication model, authorization pipeline, and governance enforcement.
Multi-Protocol Strategy
Datafi supports multiple protocols to serve different client types and use cases.
| Protocol | Port | Best For | Transport |
|---|---|---|---|
| gRPC | 50051 | Backend services, high-throughput clients | HTTP/2 + Protobuf |
| gRPC-Web | 8001 | Browser-based applications | HTTP/1.1 or HTTP/2 + Protobuf |
| HTTP | 8000 | REST clients, health checks, simple integrations | HTTP/1.1 + JSON |
| MCP | 8002 | AI assistants (Claude, ChatGPT, Copilot) | HTTP + JSON |
All four protocols connect to the same coordinator and enforce identical authentication, authorization, and governance policies. The protocol choice does not affect security posture.
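As a rough illustration, the protocol table can be carried into client configuration as a small lookup. This is only a sketch: the port numbers and transports are copied from the table above, while `ProtocolEndpoint`, `PROTOCOLS`, `endpoint_for`, and the host name are hypothetical and not part of any Datafi SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolEndpoint:
    """One row of the protocol table: listening port and transport."""
    port: int
    transport: str


# Ports and transports copied from the protocol table above.
PROTOCOLS = {
    "grpc": ProtocolEndpoint(50051, "HTTP/2 + Protobuf"),
    "grpc-web": ProtocolEndpoint(8001, "HTTP/1.1 or HTTP/2 + Protobuf"),
    "http": ProtocolEndpoint(8000, "HTTP/1.1 + JSON"),
    "mcp": ProtocolEndpoint(8002, "HTTP + JSON"),
}


def endpoint_for(host: str, protocol: str) -> str:
    """Return "host:port" for a protocol name (case-insensitive)."""
    try:
        endpoint = PROTOCOLS[protocol.lower()]
    except KeyError:
        raise ValueError(f"unknown protocol: {protocol!r}") from None
    return f"{host}:{endpoint.port}"


if __name__ == "__main__":
    # "datafi.internal" is a placeholder host, not a real Datafi address.
    print(endpoint_for("datafi.internal", "gRPC"))
```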
Authentication
Every API request must carry a valid JWT as a Bearer token: in the authorization metadata key for gRPC and gRPC-Web, or in the Authorization header for HTTP.
gRPC
// Include the token in gRPC metadata
metadata: {
"authorization": "Bearer eyJhbGciOiJSUzI1NiIs..."
}
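For Python clients, the same metadata could be built as shown in this sketch. It assumes the grpcio library, where per-call metadata is passed as a sequence of key/value tuples through the `metadata=` keyword. The helper name `bearer_metadata` and the stub in the usage comment are hypothetical.

```python
from typing import Tuple


def bearer_metadata(token: str) -> Tuple[Tuple[str, str], ...]:
    """Build per-call gRPC metadata carrying the JWT as a Bearer token."""
    if not token:
        raise ValueError("token must be non-empty")
    # gRPC metadata keys must be lowercase, hence "authorization".
    return (("authorization", f"Bearer {token}"),)


# Usage with a generated stub (stub and request are hypothetical here):
# response = stub.ExecuteQuery(request, metadata=bearer_metadata(token))
```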
HTTP
curl -X POST https://api.datafi.io/v1/query \
-H "Authorization: Bearer eyJhbGciOiJSUzI1NiIs..." \
-H "Content-Type: application/json" \
-d '{"sql": "SELECT * FROM customers LIMIT 10"}'
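The same request could be assembled in Python using only the standard library. This is a sketch, not an official client: the URL and JSON body mirror the curl example above, and `build_query_request` is a hypothetical helper name. The request is built but not sent, so the snippet runs without network access.

```python
import json
import urllib.request

# URL taken from the curl example above.
QUERY_URL = "https://api.datafi.io/v1/query"


def build_query_request(token: str, sql: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the JWT Bearer token."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        QUERY_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_query_request("<your-jwt>", "SELECT * FROM customers LIMIT 10")
    # Sending is left to the caller, e.g. urllib.request.urlopen(req, timeout=30).
    print(req.get_method(), req.full_url)
```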
gRPC-Web
// gRPC-Web client with authorization metadata
const metadata = {
'authorization': 'Bearer eyJhbGciOiJSUzI1NiIs...'
};
client.query(request, metadata, (err, response) => {
// Handle response
});
For details on token lifecycle, identity provider configuration, and JWKS rotation, see the Authentication page.
Response Format
Datafi returns query results in Apache Arrow columnar format for gRPC and gRPC-Web clients. HTTP clients receive JSON.
Why Apache Arrow?
| Benefit | Description |
|---|---|
| Columnar layout | Efficient for analytical queries that access a subset of columns |
| Zero-copy reads | Clients can process results without deserialization overhead |
| Language support | Arrow libraries available for Python, Java, Go, Rust, JavaScript, and more |
| Compression | Built-in support for LZ4 and ZSTD compression |
gRPC Response Example
message QueryResponse {
bytes arrow_record_batch = 1; // Serialized Arrow RecordBatch
QueryMetadata metadata = 2;
}
message QueryMetadata {
int64 row_count = 1;
int64 execution_time_ms = 2;
string query_id = 3;
bool from_cache = 4;
}
HTTP Response Example
{
"data": [
{"id": 1, "name": "Alice", "region": "us-west"},
{"id": 2, "name": "Bob", "region": "us-east"}
],
"metadata": {
"row_count": 2,
"execution_time_ms": 142,
"query_id": "qry_abc123",
"from_cache": false
}
}
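A client might unpack this JSON shape as in the sketch below. The field names come from the HTTP response example above. The `QueryMetadata` dataclass and `parse_query_response` are hypothetical helpers, and the sketch assumes every metadata field is present in a successful response.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class QueryMetadata:
    """Client-side mirror of the "metadata" object in the HTTP response."""
    row_count: int
    execution_time_ms: int
    query_id: str
    from_cache: bool


def parse_query_response(payload: str) -> Tuple[List[Dict[str, Any]], QueryMetadata]:
    """Split a JSON query response into its rows and its metadata."""
    doc = json.loads(payload)
    meta = doc["metadata"]
    metadata = QueryMetadata(
        row_count=meta["row_count"],
        execution_time_ms=meta["execution_time_ms"],
        query_id=meta["query_id"],
        from_cache=meta["from_cache"],
    )
    return doc["data"], metadata
```

Keeping the rows as plain dictionaries avoids assuming a fixed schema, since the columns depend on the query.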
API Surface
The Datafi API is divided between the coordinator (central control plane) and the edge server (data plane).
Coordinator API
The coordinator exposes 81 RPCs organized into four categories:
| Category | Description | Example RPCs |
|---|---|---|
| Catalog | Schema management, records, assets | GetSchema, ListTables, GetRecords |
| Query | Query execution, building, validation | ExecuteQuery, BuildQuery, ValidateQuery |
| Agent | AI agent lifecycle management | CreateAgent, RunAgent, ListAgents |
| Admin | Tenant and user administration | GetTenant, UpdateUser, ListRoles |
Edge API
The edge server exposes a deliberately minimal surface of three RPCs:
| RPC | Description |
|---|---|
| GetSchema | Retrieve the schema of a connected data source |
| Query | Execute a query against a connected data source |
| Ping | Health check |
Rate Limiting
API requests are subject to rate limiting based on your plan tier.
| Tier | Requests per Minute | Concurrent Queries |
|---|---|---|
| Free | 60 | 5 |
| Pro | 600 | 25 |
| Enterprise | Custom | Custom |
Rate-limited requests receive a 429 Too Many Requests response with a Retry-After header.
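One way a client could honor the Retry-After header is sketched below. It assumes the header carries a number of seconds; the HTTP specification also allows an HTTP-date, and this sketch falls back to a default delay in that case. The `send` callable, the response tuple shape, and both function names are hypothetical.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, headers, body) as returned by a caller-supplied transport.
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds; use `default` if absent or not numeric."""
    # Note: a plain dict lookup is case-sensitive; normalize keys if needed.
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting and retrying while the server answers 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        is_last_attempt = attempt == max_attempts - 1
        if status != 429 or is_last_attempt:
            return response
        sleep(retry_after_seconds(headers))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to exercise in tests without real delays.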
Error Handling
All errors follow a consistent format across protocols.
{
"error": {
"code": "QUERY_UNAUTHORIZED",
"message": "Access denied: you do not have permission to query one or more referenced resources.",
"request_id": "req_7f8a9b2c"
}
}
| Error Code | HTTP Status | Description |
|---|---|---|
| UNAUTHENTICATED | 401 | Missing or invalid JWT token |
| UNAUTHORIZED | 403 | Valid token but insufficient permissions |
| NOT_FOUND | 404 | Resource does not exist or is not visible to the tenant |
| QUERY_UNAUTHORIZED | 403 | Query references unauthorized tables or columns |
| RATE_LIMITED | 429 | Too many requests |
| INTERNAL | 500 | Server error (details logged internally) |
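A client could turn the error envelope into a typed exception along the following lines. The codes and HTTP statuses are copied from the table above. `DatafiError`, `parse_error`, and the `retryable` rule are illustrative assumptions rather than part of a Datafi SDK.

```python
import json
from typing import Optional

# Error codes and HTTP statuses copied from the table above.
ERROR_STATUS = {
    "UNAUTHENTICATED": 401,
    "UNAUTHORIZED": 403,
    "NOT_FOUND": 404,
    "QUERY_UNAUTHORIZED": 403,
    "RATE_LIMITED": 429,
    "INTERNAL": 500,
}


class DatafiError(Exception):
    """Hypothetical client-side exception wrapping the error envelope."""

    def __init__(self, code: str, message: str, request_id: Optional[str]):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id

    @property
    def http_status(self) -> Optional[int]:
        """HTTP status for a known code, or None for an unrecognized one."""
        return ERROR_STATUS.get(self.code)

    @property
    def retryable(self) -> bool:
        # Assumption: only rate limiting and server errors are worth retrying.
        return self.code in {"RATE_LIMITED", "INTERNAL"}


def parse_error(payload: str) -> DatafiError:
    """Build a DatafiError from the JSON error envelope."""
    err = json.loads(payload)["error"]
    return DatafiError(err["code"], err["message"], err.get("request_id"))
```

Logging `request_id` alongside the code gives an identifier to quote when reporting a problem.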
What to Read Next
- Coordinator API -- full catalog, query, agent, and admin operation details.
- Edge API -- minimal edge server surface area.
- MCP Integration -- connect AI assistants to Datafi.
- Integration Tokens -- create and manage API tokens for programmatic access.