Multi-Tenant Isolation
Datafi is a multi-tenant platform that serves multiple organizations from shared infrastructure. Tenant isolation ensures that one tenant's data, metadata, queries, and AI context are never accessible to another tenant. Isolation is enforced at every layer of the stack.
Isolation Architecture
JWT Tenant Claims
Every JWT issued by your identity provider must include a tenant_id claim. Datafi extracts this claim during authentication and uses it to scope all subsequent operations.
{
  "iss": "https://your-idp.example.com/",
  "sub": "user_abc123",
  "aud": "https://api.datafi.io",
  "exp": 1700000000,
  "tenant_id": "tenant_acme_corp",
  "roles": ["editor"]
}
| Claim | Purpose |
|---|---|
| tenant_id | Identifies the tenant for all downstream operations |
| sub | Identifies the user within the tenant |
| roles | Determines the user's RBAC permissions within the tenant |
If a JWT is missing the tenant_id claim, the request is rejected with a 401 Unauthorized response. Datafi never falls back to a default tenant.
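The no-fallback rule can be illustrated with a minimal sketch. It assumes the JWT signature, issuer, audience, and expiry have already been verified and the claims decoded into a mapping. The names `resolve_tenant` and `MissingTenantClaim` are hypothetical and are not part of the Datafi API.

```python
from typing import Any, Mapping


class MissingTenantClaim(Exception):
    """Raised when verified claims carry no usable tenant_id (hypothetical name)."""

    status_code = 401  # corresponds to the 401 Unauthorized response


def resolve_tenant(claims: Mapping[str, Any]) -> str:
    """Return the tenant_id from already-verified JWT claims.

    There is deliberately no default value: a missing or empty claim
    is treated as an error rather than mapped to a fallback tenant.
    """
    tenant_id = claims.get("tenant_id")
    if not isinstance(tenant_id, str) or not tenant_id.strip():
        raise MissingTenantClaim("JWT is missing the tenant_id claim")
    return tenant_id
```

A caller would resolve the tenant once per request and pass the result to every downstream operation, rather than re-reading the claim in each layer.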
Catalog Isolation
Each tenant has a logically isolated catalog that contains their schemas, datasets, connections, agents, and policies. A user in Tenant A cannot see, query, or reference any catalog object belonging to Tenant B.
How Catalog Isolation Works
The catalog service adds a tenant_id filter to every query against the metadata store. The filter is applied at the service layer, so it cannot be bypassed by crafting custom API requests.
-- Internal catalog query (simplified)
SELECT schema_name, table_name, column_name
FROM catalog.schemas
WHERE tenant_id = $1 -- Always scoped to the authenticated tenant
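The scoping pattern in the simplified query above can be tried locally. The sketch below uses Python's built-in sqlite3 module and an in-memory table named `catalog_schemas`; the table name, the sample rows, and the `list_columns` helper are assumptions made for illustration and do not describe Datafi's actual metadata store.

```python
import sqlite3


def list_columns(conn: sqlite3.Connection, tenant_id: str) -> list[tuple[str, str, str]]:
    """Return catalog rows for one tenant.

    The tenant_id is bound as a query parameter by the service itself;
    the caller cannot omit or alter the WHERE clause.
    """
    cursor = conn.execute(
        "SELECT schema_name, table_name, column_name "
        "FROM catalog_schemas WHERE tenant_id = ? "
        "ORDER BY schema_name, table_name, column_name",
        (tenant_id,),
    )
    return cursor.fetchall()


# Illustrative data: two tenants sharing one metadata table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalog_schemas ("
    "tenant_id TEXT, schema_name TEXT, table_name TEXT, column_name TEXT)"
)
conn.executemany(
    "INSERT INTO catalog_schemas VALUES (?, ?, ?, ?)",
    [
        ("tenant_a", "sales", "orders", "order_id"),
        ("tenant_b", "hr", "employees", "salary"),
    ],
)

print(list_columns(conn, "tenant_a"))  # [('sales', 'orders', 'order_id')]
```

Binding the tenant as a parameter, rather than concatenating it into the SQL string, also keeps the filter itself safe from injection.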
What Is Isolated
| Resource | Isolated | Description |
|---|---|---|
| Schemas | Yes | Database schemas and table definitions |
| Datasets | Yes | Virtual datasets and their configurations |
| Connections | Yes | Data source connection strings and credentials |
| Agents | Yes | AI agent definitions, prompts, and configurations |
| Policies | Yes | Access policies, RLS rules, and governance settings |
| Audit logs | Yes | Query history and access logs |
| Users | Yes | User profiles and role assignments |
Cross-Tenant Detection and Rejection
Datafi actively monitors for cross-tenant access attempts. If a request attempts to access resources belonging to a different tenant, it is immediately rejected and logged.
Detection Mechanisms
- Claim mismatch detection -- if the tenant_id in the JWT does not match the tenant context of the requested resource, the request is rejected.
- Catalog boundary enforcement -- the catalog service validates that every referenced object belongs to the requesting tenant.
- Query scope validation -- before execution, the query engine verifies that all data sources are registered under the requesting tenant.
Cross-tenant attempts return a 404 Not Found rather than a 403 Forbidden. This prevents an attacker from confirming the existence of resources in other tenants.
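One way to get the 404 behavior is to send a missing resource and another tenant's resource down the same code path. The following is a minimal sketch under that assumption; `Resource`, `NotFound`, and `get_resource` are hypothetical names, and the in-memory dictionary stands in for whatever store actually holds the resources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    resource_id: str
    tenant_id: str


class NotFound(Exception):
    status_code = 404  # corresponds to 404 Not Found


def get_resource(store: dict[str, Resource], resource_id: str, tenant_id: str) -> Resource:
    """Look up a resource on behalf of the requesting tenant."""
    resource = store.get(resource_id)
    # A nonexistent resource and a resource owned by another tenant are
    # handled identically, so the error does not reveal which case occurred.
    if resource is None or resource.tenant_id != tenant_id:
        raise NotFound(f"resource {resource_id!r} not found")
    return resource
```

Because both cases raise the same exception with the same message format, a caller probing identifiers learns nothing about other tenants' resources. A real service would also record the mismatch as a security event before responding.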
Alerting
Cross-tenant access attempts are flagged as security events. You can configure alerts for these events through the Datafi admin console.
alerts:
  cross_tenant_attempt:
    enabled: true
    threshold: 1
    notification:
      - type: email
        recipients: ["[email protected]"]
      - type: webhook
        url: "https://alerts.example.com/datafi"
Tenant-Scoped AI Context
When AI agents operate within Datafi, their context -- including conversation history, tool results, and intermediate data -- is scoped to the tenant that owns the agent.
What Is Scoped
| AI Context Element | Scoped to Tenant | Description |
|---|---|---|
| Agent definitions | Yes | The agent's system prompt, tools, and configuration |
| Conversation history | Yes | Past interactions between users and the agent |
| Tool execution results | Yes | Data returned by tools during agent execution |
| Vector embeddings | Yes | Document embeddings stored for retrieval |
| Generated SQL | Yes | Queries generated by the agent |
Isolation in Practice
When an agent in Tenant A generates a SQL query, the query is:
- Validated against Tenant A's catalog to ensure all referenced tables exist.
- Checked against Tenant A's access policies to ensure the invoking user has permission.
- Executed against Tenant A's registered data sources only.
The agent has no mechanism to reference or discover data sources, schemas, or objects from any other tenant.
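The first two checks in the list above can be sketched as a gate that runs before execution. This is an illustration only: it assumes the table names referenced by the generated SQL have already been extracted by a SQL parser, and it models the tenant catalog and the user's permissions as plain sets. `QueryRejected` and `check_agent_query` are hypothetical names.

```python
from typing import Iterable


class QueryRejected(Exception):
    """Raised when an agent-generated query fails a pre-execution check."""


def check_agent_query(
    referenced_tables: Iterable[str],
    tenant_catalog: set[str],
    user_permitted: set[str],
) -> list[str]:
    """Return the tables cleared for execution, or raise QueryRejected."""
    tables = list(referenced_tables)

    # Check 1: every table must exist in the owning tenant's catalog.
    for table in tables:
        if table not in tenant_catalog:
            raise QueryRejected(f"unknown table: {table}")

    # Check 2: the invoking user must be permitted to read every table.
    for table in tables:
        if table not in user_permitted:
            raise QueryRejected(f"access denied: {table}")

    return tables
```

Because the catalog passed in contains only the owning tenant's objects, a table from any other tenant fails the first check in the same way as a table that does not exist.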
Infrastructure Isolation
While Datafi uses shared compute infrastructure, the following measures ensure that tenant data does not leak at the infrastructure level:
| Layer | Isolation Mechanism |
|---|---|
| Compute | Tenant-scoped request contexts; no shared in-memory state between tenants |
| Storage | Tenant-prefixed keys in object storage; IAM-scoped access |
| Cache | Tenant-prefixed cache keys; separate TTLs per tenant |
| Networking | No direct network paths between tenant workloads |
| Logging | Tenant ID appended to all log entries; log access scoped by tenant |
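Tenant-prefixed cache keys, as listed in the Cache row, can be sketched in a few lines. The key format (`tenant_id:key`), the `cache_key` helper, and the `TenantCache` class are assumptions for illustration; the actual key layout used by Datafi is not documented here.

```python
from typing import Any


def cache_key(tenant_id: str, key: str) -> str:
    """Build a cache key that is namespaced by tenant."""
    # Rejecting the separator inside tenant_id prevents two different
    # (tenant, key) pairs from collapsing into the same string.
    if not tenant_id or ":" in tenant_id:
        raise ValueError("tenant_id must be non-empty and must not contain ':'")
    return f"{tenant_id}:{key}"


class TenantCache:
    """In-memory cache in which every read and write requires a tenant_id."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def set(self, tenant_id: str, key: str, value: Any) -> None:
        self._data[cache_key(tenant_id, key)] = value

    def get(self, tenant_id: str, key: str, default: Any = None) -> Any:
        return self._data.get(cache_key(tenant_id, key), default)
```

With this shape, two tenants that cache a value under the same logical key never read each other's entry, because the tenant is part of the stored key.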
Best Practices
- Always include tenant_id in JWT claims. Configure your identity provider to emit this claim for every token.
- Monitor cross-tenant alerts. Investigate every cross-tenant detection event -- even a single occurrence may indicate a misconfiguration or attack.
- Use separate connections per tenant. Avoid sharing database credentials across tenants to provide an additional layer of isolation at the data source level.
- Audit tenant membership. Periodically review which users belong to each tenant and remove stale accounts.