
Multi-Protocol APIs

The Datafi Coordinator exposes four distinct API protocols, each optimized for different client environments and use cases. All four protocols provide access to the same underlying platform capabilities, authenticated and authorized through the same JWT and ABAC mechanisms.

Protocol Overview

Port Configuration

| Protocol | Port  | Transport          | Serialization    | Primary Use Case |
|----------|-------|--------------------|------------------|------------------|
| gRPC     | 50051 | HTTP/2             | Protocol Buffers | Server-to-server communication, high-throughput pipelines. |
| gRPC-Web | 8001  | HTTP/1.1 or HTTP/2 | Protocol Buffers | Browser-based applications using the Client Library. |
| HTTP     | 8000  | HTTP/1.1 or HTTP/2 | JSON             | REST-style integrations, scripts, CLI tools, webhooks. |
| MCP      | 8002  | HTTP/1.1 or HTTP/2 | JSON             | AI agent communication via the Model Context Protocol. |
Edge Node Ports

Edge nodes expose a separate set of ports: gRPC on port 50051 for Coordinator-to-Edge communication, and HTTP on port 80 for health checks. You do not interact with Edge ports directly -- the Coordinator handles all routing.

gRPC (Port 50051)

gRPC is the highest-performance protocol available on the Coordinator. It uses HTTP/2 for multiplexed streaming and Protocol Buffers for compact binary serialization.

When to use gRPC:

  • You are building a server-side application in a language with strong gRPC support (Go, Java, Python, Rust, C#, Node.js).
  • You need bidirectional streaming for real-time data feeds.
  • You want the lowest possible latency and the smallest message overhead.
  • You are building internal microservices that communicate with the Coordinator.

Capabilities:

  • Access to all 81 Coordinator RPC methods.
  • Unary, server-streaming, and bidirectional-streaming call patterns.
  • Deadline propagation and cancellation.
  • Built-in retry and backoff policies via gRPC interceptors.
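Retry and backoff policies can be expressed declaratively as a gRPC service config and passed as a channel option. A minimal sketch, assuming the fully qualified service name `datafi.v1.CoordinatorService` (the actual name may differ):

```python
import json

# Hedged sketch: a gRPC retry policy as a service config. The service
# name below is an assumption, not a confirmed Coordinator identifier.
service_config = json.dumps({
    "methodConfig": [{
        "name": [{"service": "datafi.v1.CoordinatorService"}],
        "retryPolicy": {
            "maxAttempts": 4,
            "initialBackoff": "0.1s",
            "maxBackoff": "2s",
            "backoffMultiplier": 2,
            "retryableStatusCodes": ["UNAVAILABLE"],
        },
    }]
})

# The config is attached when the channel is created, e.g.:
# channel = grpc.secure_channel(
#     "coordinator.example.com:50051",
#     grpc.ssl_channel_credentials(),
#     options=[("grpc.service_config", service_config)],
# )
```

Per-call deadlines are set with the `timeout=` argument on stub calls; gRPC propagates the deadline downstream and cancels work once it expires.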

Authentication:

Include the JWT as metadata on every call:

metadata: {"authorization": "Bearer <your-jwt>"}

Example (Python):

import grpc
from datafi.v1 import coordinator_pb2, coordinator_pb2_grpc

channel = grpc.secure_channel("coordinator.example.com:50051", grpc.ssl_channel_credentials())
stub = coordinator_pb2_grpc.CoordinatorServiceStub(channel)

metadata = [("authorization", "Bearer eyJhbGciOiJSUzI1NiIs...")]
response = stub.ExecuteQuery(
    coordinator_pb2.ExecuteQueryRequest(query="..."),
    metadata=metadata,
)

gRPC-Web (Port 8001)

gRPC-Web adapts the gRPC protocol for browser environments. It provides the same Protocol Buffers serialization and type safety as native gRPC, but works over HTTP/1.1 and does not require HTTP/2 end-to-end.

When to use gRPC-Web:

  • You are building a browser-based application using the Datafi Client Library.
  • You want the performance benefits of Protocol Buffers in a web context.
  • You need server-side streaming to the browser (e.g., progressive result loading).

Capabilities:

  • Access to all 81 Coordinator RPC methods.
  • Unary and server-streaming call patterns (bidirectional streaming is not supported in browsers).
  • Automatic integration with the WebAssembly-based Client Library.

How the Client Library uses gRPC-Web:

The Datafi Client Library handles gRPC-Web communication transparently. You interact with a GraphQL API in your application code, and the library translates your queries into gRPC-Web calls under the hood.

import { DatafiClient } from "@datafi/client";

const client = new DatafiClient({
  coordinatorUrl: "https://coordinator.example.com:8001",
  token: "eyJhbGciOiJSUzI1NiIs...",
});

const result = await client.query(`
  query {
    employees(filter: { department: "engineering" }, limit: 50) {
      employee_id
      name
      title
    }
  }
`);

HTTP (Port 8000)

The HTTP API provides a conventional REST-style interface using JSON serialization. It is the most accessible protocol for integrations, scripts, and tools that do not support gRPC.

When to use HTTP:

  • You are integrating with third-party tools, webhooks, or no-code platforms.
  • You are writing quick scripts in languages without gRPC libraries.
  • You prefer working with JSON and standard HTTP methods.
  • You are using curl, Postman, or similar HTTP tools for testing and exploration.

Capabilities:

  • Access to all 81 Coordinator RPC methods via RESTful endpoints.
  • Standard HTTP methods (GET, POST, PUT, DELETE).
  • JSON request and response bodies.
  • Standard HTTP status codes for error handling.

Authentication:

Include the JWT in the Authorization header:

Authorization: Bearer <your-jwt>

Example (curl):

curl -X POST https://coordinator.example.com:8000/v1/query \
  -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIs..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT employee_id, name, title FROM employees WHERE department = '\''engineering'\'' LIMIT 50"
  }'
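The same request can be built in Python with only the standard library; the host, path, and token here mirror the curl example above:

```python
import json
import urllib.request

# Mirrors the curl example: POST a SQL query to the HTTP API on port 8000.
payload = json.dumps({
    "query": "SELECT employee_id, name, title FROM employees "
             "WHERE department = 'engineering' LIMIT 50"
}).encode()

req = urllib.request.Request(
    "https://coordinator.example.com:8000/v1/query",
    data=payload,
    headers={
        "Authorization": "Bearer eyJhbGciOiJSUzI1NiIs...",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Sending the request (requires a reachable Coordinator):
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```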

MCP (Port 8002)

The Model Context Protocol (MCP) is designed for AI agent communication. It enables large language models and autonomous agents to interact with the Datafi platform programmatically.

When to use MCP:

  • You are building AI agents that need to query, explore, or analyze data.
  • You want to expose your data catalog to LLM-based tools.
  • You are integrating with agent frameworks that support MCP (e.g., Claude, LangChain, custom orchestrators).

Capabilities:

  • Schema discovery -- Agents can explore available datasets, columns, and data types.
  • Natural language to query translation -- Combined with the Coordinator's AI/ML orchestration layer.
  • Policy-aware responses -- All ABAC policies are enforced, ensuring agents only access authorized data.
  • Tool-use interface -- Structured tool definitions that agents can invoke.

Authentication:

MCP uses the same JWT authentication as all other protocols:

Authorization: Bearer <your-jwt>
tip

When you configure an AI agent in Datafi, the platform automatically provisions the appropriate MCP endpoint and credentials. You do not need to manage MCP connections manually in most cases.
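For orientation, MCP's tool-use interface follows JSON-RPC 2.0 framing, with tool invocations carried in a `tools/call` request. The tool name and arguments below are illustrative assumptions, not documented Coordinator tools:

```python
import json

# Hedged sketch of an MCP tool-call request body (JSON-RPC 2.0 framing).
# "query_dataset" and its arguments are hypothetical examples.
rpc_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_dataset",
        "arguments": {"dataset": "employees", "limit": 50},
    },
}
body = json.dumps(rpc_request)

# The body would be POSTed to the Coordinator's MCP endpoint on port 8002
# with the same "Authorization: Bearer <your-jwt>" header as the other protocols.
```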

Performance Specifications

The following specifications apply across all protocols unless otherwise noted.

| Parameter | Value | Notes |
|-----------|-------|-------|
| Maximum message size | 1 GB | Applies to both request and response payloads. |
| Default request timeout | 5 minutes | Configurable per request via deadline or timeout headers. |
| Result serialization | Apache Arrow | Results are serialized in the Apache Arrow columnar format for high-performance processing. |
| Concurrent connections | No hard limit | Bounded in practice by available Coordinator resources and load-balancer configuration. |
| TLS | Required | All protocols require TLS in production deployments. |
Large Result Sets

While the maximum message size is 1 GB, you should use pagination or streaming for large result sets. Streaming is available on gRPC and gRPC-Web protocols. For the HTTP protocol, use cursor-based pagination.
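Cursor-based pagination typically loops until the server stops returning a cursor. A self-contained sketch with a stubbed fetch function -- the `cursor` and `next_cursor` field names are assumptions about the HTTP API's response shape:

```python
def fetch_page(cursor=None):
    # Placeholder for an HTTP POST to the query endpoint with a cursor
    # parameter; stubbed with two static pages so the loop is runnable.
    pages = {
        None: {"rows": [1, 2], "next_cursor": "abc"},
        "abc": {"rows": [3], "next_cursor": None},
    }
    return pages[cursor]

def fetch_all():
    # Accumulate rows page by page until no cursor is returned.
    rows, cursor = [], None
    while True:
        page = fetch_page(cursor)
        rows.extend(page["rows"])
        cursor = page["next_cursor"]
        if cursor is None:
            return rows

print(fetch_all())  # [1, 2, 3]
```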

Protocol Comparison

| Feature | gRPC | gRPC-Web | HTTP | MCP |
|---------|------|----------|------|-----|
| Serialization | Protocol Buffers | Protocol Buffers | JSON | JSON |
| Streaming | Bidirectional | Server-side only | No | No |
| Browser support | No | Yes | Yes | Via agent frameworks |
| Type safety | Strong (protobuf) | Strong (protobuf) | Weak (JSON) | Weak (JSON) |
| Latency | Lowest | Low | Moderate | Moderate |
| Payload size | Smallest | Small | Largest | Largest |
| Best for | Services | Web apps | Scripts / integrations | AI agents |

When to Use Each Protocol

  • Building a web application? Use gRPC-Web through the Client Library. You get near-native performance, type safety, and automatic authentication handling.
  • Building a backend service? Use gRPC for the best performance and strongest typing. Use HTTP if your language or framework lacks gRPC support.
  • Writing a script or one-off integration? Use HTTP. It requires no code generation, and you can test endpoints with curl.
  • Building an AI agent? Use MCP. It provides structured tool definitions and integrates with the platform's AI orchestration layer.

RPC Method Distribution

| Service | Protocol | Method Count |
|---------|----------|--------------|
| Coordinator | gRPC / gRPC-Web / HTTP / MCP | 81 methods |
| Edge | gRPC (TLS) | 3 methods (Query, GetSchema, Ping) |

The 81 Coordinator methods span all platform operations including authentication, workspace management, connector configuration, dataset operations, policy management, data app administration, agent orchestration, and workflow execution.

Next Steps

  • Architecture -- Review how the protocols fit into the three-service architecture.
  • Request Lifecycle -- Understand the full query flow from authentication through result aggregation.