Multi-Protocol APIs
The Datafi Coordinator exposes four distinct API protocols, each optimized for different client environments and use cases. All four protocols provide access to the same underlying platform capabilities, authenticated and authorized through the same JWT and ABAC mechanisms.
Protocol Overview
Port Configuration
| Protocol | Port | Transport | Serialization | Primary Use Case |
|---|---|---|---|---|
| gRPC | 50051 | HTTP/2 | Protocol Buffers | Server-to-server communication, high-throughput pipelines. |
| gRPC-Web | 8001 | HTTP/1.1 or HTTP/2 | Protocol Buffers | Browser-based applications using the Client Library. |
| HTTP | 8000 | HTTP/1.1 or HTTP/2 | JSON | REST-style integrations, scripts, CLI tools, webhooks. |
| MCP | 8002 | HTTP/1.1 or HTTP/2 | JSON | AI agent communication via the Model Context Protocol. |
Edge nodes expose a separate set of ports: gRPC on port 50051 for Coordinator-to-Edge communication, and HTTP on port 80 for health checks. You do not interact with Edge ports directly -- the Coordinator handles all routing.
gRPC (Port 50051)
gRPC is the highest-performance protocol available on the Coordinator. It uses HTTP/2 for multiplexed streaming and Protocol Buffers for compact binary serialization.
When to use gRPC:
- You are building a server-side application in a language with strong gRPC support (Go, Java, Python, Rust, C#, Node.js).
- You need bidirectional streaming for real-time data feeds.
- You want the lowest possible latency and the smallest message overhead.
- You are building internal microservices that communicate with the Coordinator.
Capabilities:
- Access to all 81 Coordinator RPC methods.
- Unary, server-streaming, and bidirectional-streaming call patterns (server streaming and per-call deadlines are sketched after the Python example below).
- Deadline propagation and cancellation.
- Built-in retry and backoff policies via gRPC interceptors.
Authentication:
Include the JWT as metadata on every call:
```
metadata: {"authorization": "Bearer <your-jwt>"}
```
Example (Python):
```python
import grpc

from datafi.v1 import coordinator_pb2, coordinator_pb2_grpc

# Open a TLS channel to the Coordinator's gRPC port.
channel = grpc.secure_channel(
    "coordinator.example.com:50051",
    grpc.ssl_channel_credentials(),
)
stub = coordinator_pb2_grpc.CoordinatorServiceStub(channel)

# Attach the JWT as call metadata; it is required on every RPC.
metadata = [("authorization", "Bearer eyJhbGciOiJSUzI1NiIs...")]

response = stub.ExecuteQuery(
    coordinator_pb2.ExecuteQueryRequest(query="..."),
    metadata=metadata,
)
```
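For long-running or large queries, the same stub supports per-call deadlines and server streaming. The sketch below continues the example above (reusing the stub and metadata); the streaming method name StreamQueryResults is hypothetical and stands in for whichever streaming RPC the generated stubs expose.
```python
# Per-call deadline: the RPC is cancelled if it has not completed within 30 seconds.
response = stub.ExecuteQuery(
    coordinator_pb2.ExecuteQueryRequest(query="..."),
    metadata=metadata,
    timeout=30,
)

# Server streaming: iterate over result chunks as the Coordinator produces them.
# StreamQueryResults is a hypothetical method name used for illustration only.
for chunk in stub.StreamQueryResults(
    coordinator_pb2.ExecuteQueryRequest(query="..."),
    metadata=metadata,
):
    handle_chunk(chunk)  # placeholder for your own processing
```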
gRPC-Web (Port 8001)
gRPC-Web adapts the gRPC protocol for browser environments. It provides the same Protocol Buffers serialization and type safety as native gRPC, but works over HTTP/1.1 and does not require HTTP/2 end-to-end.
When to use gRPC-Web:
- You are building a browser-based application using the Datafi Client Library.
- You want the performance benefits of Protocol Buffers in a web context.
- You need server-side streaming to the browser (e.g., progressive result loading).
Capabilities:
- Access to all 81 Coordinator RPC methods.
- Unary and server-streaming call patterns (bidirectional streaming is not supported in browsers).
- Automatic integration with the WebAssembly-based Client Library.
How the Client Library uses gRPC-Web:
The Datafi Client Library handles gRPC-Web communication transparently. You interact with a GraphQL API in your application code, and the library translates your queries into gRPC-Web calls under the hood.
```typescript
import { DatafiClient } from "@datafi/client";

const client = new DatafiClient({
  coordinatorUrl: "https://coordinator.example.com:8001", // gRPC-Web port
  token: "eyJhbGciOiJSUzI1NiIs...",
});

// The library translates this GraphQL query into gRPC-Web calls under the hood.
const result = await client.query(`
  query {
    employees(filter: { department: "engineering" }, limit: 50) {
      employee_id
      name
      title
    }
  }
`);
```
HTTP (Port 8000)
The HTTP API provides a conventional REST-style interface using JSON serialization. It is the most accessible protocol for integrations, scripts, and tools that do not support gRPC.
When to use HTTP:
- You are integrating with third-party tools, webhooks, or no-code platforms.
- You are writing quick scripts in languages without gRPC libraries.
- You prefer working with JSON and standard HTTP methods.
- You are using curl, Postman, or similar HTTP tools for testing and exploration.
Capabilities:
- Access to all 81 Coordinator RPC methods via RESTful endpoints.
- Standard HTTP methods (GET, POST, PUT, DELETE).
- JSON request and response bodies.
- Standard HTTP status codes for error handling.
Authentication:
Include the JWT in the Authorization header:
```
Authorization: Bearer <your-jwt>
```
Example (curl):
```bash
curl -X POST https://coordinator.example.com:8000/v1/query \
  -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIs..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT employee_id, name, title FROM employees WHERE department = '\''engineering'\'' LIMIT 50"
  }'
```
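The equivalent call from a quick script, using Python's requests library; this simply mirrors the curl request above, and nothing beyond that endpoint and payload is assumed.
```python
import requests

COORDINATOR_URL = "https://coordinator.example.com:8000"
TOKEN = "eyJhbGciOiJSUzI1NiIs..."

response = requests.post(
    f"{COORDINATOR_URL}/v1/query",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    json={
        "query": "SELECT employee_id, name, title FROM employees WHERE department = 'engineering' LIMIT 50"
    },
    timeout=300,  # matches the platform's 5-minute default request timeout
)
response.raise_for_status()
print(response.json())
```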
MCP (Port 8002)
The Model Context Protocol (MCP) is designed for AI agent communication. It enables large language models and autonomous agents to interact with the Datafi platform programmatically.
When to use MCP:
- You are building AI agents that need to query, explore, or analyze data.
- You want to expose your data catalog to LLM-based tools.
- You are integrating with agent frameworks that support MCP (e.g., Claude, LangChain, custom orchestrators).
Capabilities:
- Schema discovery -- Agents can explore available datasets, columns, and data types.
- Natural language to query translation -- Combined with the Coordinator's AI/ML orchestration layer.
- Policy-aware responses -- All ABAC policies are enforced, ensuring agents only access authorized data.
- Tool-use interface -- Structured tool definitions that agents can invoke.
Authentication:
MCP uses the same JWT authentication as all other protocols:
```
Authorization: Bearer <your-jwt>
```
When you configure an AI agent in Datafi, the platform automatically provisions the appropriate MCP endpoint and credentials. You do not need to manage MCP connections manually in most cases.
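For illustration only, the sketch below shows what a raw MCP tool-discovery call against the MCP port could look like. MCP messages are JSON-RPC 2.0; the endpoint path, framing details, and response handling here are assumptions, a real client would first perform the MCP initialize handshake (normally via an MCP SDK or agent framework), and in most cases the platform-provisioned agent configuration does all of this for you.
```python
import requests

# Endpoint path is an assumption for illustration; the platform provisions the
# real MCP endpoint when you configure an agent.
MCP_URL = "https://coordinator.example.com:8002/mcp"
TOKEN = "eyJhbGciOiJSUzI1NiIs..."

# MCP messages are JSON-RPC 2.0. tools/list asks the server to enumerate the
# structured tool definitions the authenticated agent is allowed to invoke.
list_tools_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
    "params": {},
}

response = requests.post(
    MCP_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=list_tools_request,
    timeout=30,
)
response.raise_for_status()
print(response.json())
```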
Performance Specifications
The following specifications apply across all protocols unless otherwise noted.
| Parameter | Value | Notes |
|---|---|---|
| Maximum message size | 1 GB | Applies to both request and response payloads. |
| Default request timeout | 5 minutes | Configurable per request via deadline or timeout headers. |
| Result serialization | Apache Arrow | Results are serialized in Apache Arrow columnar format for high-performance processing. |
| Concurrent connections | No fixed limit | Bounded by available Coordinator resources and load balancer configuration. |
| TLS | Required | All protocols require TLS in production deployments. |
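Because results are serialized in the Apache Arrow columnar format, clients can hand them straight to columnar tooling. A minimal sketch with pyarrow, assuming you already hold the raw Arrow IPC stream bytes of a result; how you obtain those bytes depends on the protocol and SDK you use.
```python
import pyarrow as pa

def decode_result(arrow_bytes: bytes) -> pa.Table:
    """Read an Arrow IPC stream into an in-memory columnar table."""
    reader = pa.ipc.open_stream(arrow_bytes)
    return reader.read_all()

# table = decode_result(result_payload)
# table.to_pandas()  # hand off to pandas, Polars, DuckDB, etc.
```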
While the maximum message size is 1 GB, you should use pagination or streaming for large result sets. Streaming is available on the gRPC and gRPC-Web protocols; for the HTTP protocol, use cursor-based pagination.
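For HTTP clients, a cursor-paginated fetch loop might look like the following sketch. The cursor request field and the rows and next_cursor response fields are hypothetical names used for illustration; consult the endpoint reference for the actual pagination contract.
```python
import requests

COORDINATOR_URL = "https://coordinator.example.com:8000"
TOKEN = "eyJhbGciOiJSUzI1NiIs..."

def fetch_all_rows(sql: str):
    """Page through a large result set instead of pulling one oversized response."""
    cursor = None
    while True:
        body = {"query": sql}
        if cursor:
            body["cursor"] = cursor  # hypothetical continuation-token field
        resp = requests.post(
            f"{COORDINATOR_URL}/v1/query",
            headers={"Authorization": f"Bearer {TOKEN}"},
            json=body,
            timeout=300,
        )
        resp.raise_for_status()
        page = resp.json()
        yield from page.get("rows", [])   # hypothetical response shape
        cursor = page.get("next_cursor")  # hypothetical continuation token
        if not cursor:
            break
```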
Protocol Comparison
| Feature | gRPC | gRPC-Web | HTTP | MCP |
|---|---|---|---|---|
| Serialization | Protocol Buffers | Protocol Buffers | JSON | JSON |
| Streaming | Bidirectional | Server-side only | No | No |
| Browser support | No | Yes | Yes | Via agent frameworks |
| Type safety | Strong (protobuf) | Strong (protobuf) | Weak (JSON) | Weak (JSON) |
| Latency | Lowest | Low | Moderate | Moderate |
| Payload size | Smallest | Small | Largest | Largest |
| Best for | Services | Web apps | Scripts / integrations | AI agents |
When to Use Each Protocol
- Building a web application? Use gRPC-Web through the Client Library. You get near-native performance, type safety, and automatic authentication handling.
- Building a backend service? Use gRPC for the best performance and strongest typing. Use HTTP if your language or framework lacks gRPC support.
- Writing a script or one-off integration? Use HTTP. It requires no code generation, and you can test endpoints with curl.
- Building an AI agent? Use MCP. It provides structured tool definitions and integrates with the platform's AI orchestration layer.
RPC Method Distribution
| Service | Protocol | Method Count |
|---|---|---|
| Coordinator | gRPC / gRPC-Web / HTTP / MCP | 81 methods |
| Edge | gRPC (TLS) | 3 methods (Query, GetSchema, Ping) |
The 81 Coordinator methods span all platform operations including authentication, workspace management, connector configuration, dataset operations, policy management, data app administration, agent orchestration, and workflow execution.
Next Steps
- Architecture -- Review how the protocols fit into the three-service architecture.
- Request Lifecycle -- Understand the full query flow from authentication through result aggregation.