# Multi-Agent Coordination
When a single agent is not enough, you can coordinate multiple agents to tackle complex data tasks. Datafi provides coordination patterns that let agents communicate, share state, and work together -- either as peers or in a hierarchy. You configure coordination at the workflow level, and the platform handles message routing, state synchronization, and lifecycle management.
## Coordination Patterns
Datafi supports four coordination patterns. You can use them individually or combine them within a single workflow.
| Pattern | Description | Communication | Best For |
|---|---|---|---|
| Event-driven | Agents react to platform events. An event triggers one or more agents without direct coupling between them. | Publish/subscribe via event bus. | Decoupled pipelines, reactive architectures, monitoring. |
| Message-passing | Agents send and receive typed messages directly. One agent's output becomes another agent's input. | Point-to-point or broadcast messages. | Sequential handoffs, data enrichment chains, review workflows. |
| Shared state | Agents read from and write to a shared state store. Each agent contributes partial results to a common data structure. | Read/write to shared key-value store. | Collaborative analysis, aggregation from multiple sources, consensus-building. |
| Hierarchical | A supervisor agent delegates tasks to worker agents, collects results, and makes decisions. | Parent-child task delegation. | Complex orchestration, divide-and-conquer, multi-stage pipelines. |
### Event-Driven Pattern
Agents subscribe to event types. When an event is published (by the platform, a workflow, or another agent), all subscribed agents are triggered independently.
```yaml
coordination:
  pattern: event-driven
  events:
    - type: data.loaded
      filter: "source == 'sales_warehouse'"
      agents:
        - data-quality-checker
        - schema-drift-detector
    - type: anomaly.detected
      filter: "severity >= 'high'"
      agents:
        - incident-reporter
        - auto-remediation-agent
```
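The mechanics of this pattern can be sketched in a few lines of Python (a minimal illustration, not the Datafi runtime; the agent bodies are placeholders):

```python
# Minimal event-bus sketch: agents subscribe to an event type with an
# optional filter predicate; a published event triggers every matching
# subscriber independently, with no coupling between subscribers.
class EventBus:
    def __init__(self):
        self.subscriptions = []  # list of (event_type, filter_fn, agent_fn)

    def subscribe(self, event_type, agent_fn, filter_fn=None):
        self.subscriptions.append((event_type, filter_fn, agent_fn))

    def publish(self, event_type, payload):
        triggered = []
        for etype, filter_fn, agent_fn in self.subscriptions:
            if etype == event_type and (filter_fn is None or filter_fn(payload)):
                agent_fn(payload)                 # each agent runs on its own
                triggered.append(agent_fn.__name__)
        return triggered

def data_quality_checker(event):
    pass  # placeholder for the real agent

def schema_drift_detector(event):
    pass  # placeholder for the real agent

bus = EventBus()
for agent in (data_quality_checker, schema_drift_detector):
    bus.subscribe("data.loaded", agent,
                  filter_fn=lambda e: e["source"] == "sales_warehouse")

ran = bus.publish("data.loaded", {"source": "sales_warehouse"})
skipped = bus.publish("data.loaded", {"source": "hr_system"})  # filtered out
```

Because subscribers never reference each other, adding a third agent to the `data.loaded` event requires no change to the existing two.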
### Message-Passing Pattern
Agents exchange structured messages. The sender specifies the recipient and message schema; the recipient processes the message and optionally responds.
```yaml
coordination:
  pattern: message-passing
  flow:
    - from: data-collector
      to: data-enricher
      message:
        schema: raw_records
    - from: data-enricher
      to: report-generator
      message:
        schema: enriched_records
    - from: report-generator
      to: email-distributor
      message:
        schema: formatted_report
```
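The handoff semantics can be sketched as follows (illustrative only; the agent bodies are placeholders and the email-distributor step is omitted):

```python
# Each agent returns a message that becomes the next agent's input,
# in the order declared in the flow.
def data_collector(_msg):
    return {"schema": "raw_records", "rows": [{"id": 1}, {"id": 2}]}

def data_enricher(msg):
    rows = [dict(r, region="NA") for r in msg["rows"]]  # add derived fields
    return {"schema": "enriched_records", "rows": rows}

def report_generator(msg):
    return {"schema": "formatted_report", "row_count": len(msg["rows"])}

def run_flow(flow, message=None):
    for agent in flow:
        message = agent(message)  # sender's output is the recipient's input
    return message

report = run_flow([data_collector, data_enricher, report_generator])
```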
### Shared State Pattern
Agents read from and write to a shared state store during execution. The store is scoped to the workflow run and supports concurrent access with conflict resolution.
```yaml
coordination:
  pattern: shared-state
  state_store:
    type: key-value
    conflict_resolution: last-write-wins
  agents:
    - name: revenue-analyzer
      writes: ["revenue_by_region", "revenue_trends"]
    - name: cost-analyzer
      writes: ["cost_by_region", "cost_trends"]
    - name: profitability-summarizer
      reads: ["revenue_by_region", "cost_by_region", "revenue_trends", "cost_trends"]
      writes: ["profitability_report"]
```
### Hierarchical Pattern
A supervisor agent breaks a complex task into subtasks, delegates them to worker agents, collects results, and synthesizes a final output.
```yaml
coordination:
  pattern: hierarchical
  supervisor: executive-analyst
  workers:
    - name: sales-analyst
      task: "Analyze Q3 sales performance"
    - name: marketing-analyst
      task: "Analyze Q3 campaign effectiveness"
    - name: ops-analyst
      task: "Analyze Q3 operational efficiency"
  aggregation:
    strategy: supervisor-synthesis
    timeout_seconds: 300
```
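The delegate-collect-synthesize loop can be sketched as follows (the worker names and tasks mirror the config above; the analysis bodies are placeholders):

```python
# The supervisor fans tasks out to workers, collects each result, and
# then runs a synthesis step over the combined findings.
def sales_analyst(task):
    return {"agent": "sales-analyst", "finding": f"done: {task}"}

def marketing_analyst(task):
    return {"agent": "marketing-analyst", "finding": f"done: {task}"}

def ops_analyst(task):
    return {"agent": "ops-analyst", "finding": f"done: {task}"}

assignments = [
    (sales_analyst, "Analyze Q3 sales performance"),
    (marketing_analyst, "Analyze Q3 campaign effectiveness"),
    (ops_analyst, "Analyze Q3 operational efficiency"),
]

def supervise(assignments):
    results = [worker(task) for worker, task in assignments]  # delegate subtasks
    # synthesis step: the supervisor combines the workers' findings
    return {"summary": [r["finding"] for r in results], "workers": len(results)}

report = supervise(assignments)
```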
## Agent Versioning
Every agent in Datafi is versioned using semantic versioning (major.minor.patch). Versioning lets you evolve agents safely without disrupting running workflows.
| Action | Behavior |
|---|---|
| Publish a new version | The previous version remains available. Existing workflows continue using the version they reference. |
| Pin a version | Workflows and triggers reference a specific version (e.g., `revenue-analyst@1.2.0`). |
| Use latest | Reference `revenue-analyst@latest` to always use the most recently published version. |
| Deprecate a version | Mark a version as deprecated. Existing references still work, but new workflows cannot select it. |
| Rollback | Revert to a previous version by updating the workflow reference. |
```yaml
agents:
  - name: revenue-analyst
    version: "1.2.0"   # pinned version
  - name: cost-analyzer
    version: "latest"  # always uses newest
```
## A/B Testing
Datafi supports A/B testing for agents, allowing you to compare two versions of an agent side by side in production. Traffic is split between versions based on a configurable ratio, and results are tracked independently for each version.
### Setting Up an A/B Test
1. Navigate to AI > Agent Catalog and select the agent you want to test.
2. Click A/B Test and select the two versions to compare.
3. Configure the traffic split (e.g., 80/20, 50/50).
4. Define comparison metrics (success rate, execution time, token usage, output quality).
5. Set a test duration or sample size threshold.
6. Start the test.
### Monitoring A/B Results
| Metric | Description (tracked separately for each version) |
|---|---|
| Success rate | Percentage of successful runs. |
| Average duration | Mean execution time. |
| Token usage | Average tokens per run. |
| Quality score | Based on user feedback (thumbs up/down). |
```yaml
ab_test:
  agent: revenue-analyst
  versions:
    a: "1.2.0"
    b: "1.3.0-beta"
  traffic_split:
    a: 80
    b: 20
  metrics:
    - success_rate
    - avg_duration_ms
    - avg_token_usage
    - quality_score
  duration_days: 14
```
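One common way to implement such a split is to hash a stable identifier into a bucket (a sketch under that assumption; Datafi's actual assignment scheme is not documented here):

```python
# Deterministic 80/20 split: hashing the run id means the same run
# always lands on the same version, even across retries.
import hashlib
from collections import Counter

def assign_version(run_id, split_a=80):
    bucket = int(hashlib.sha256(run_id.encode()).hexdigest(), 16) % 100
    return "a" if bucket < split_a else "b"

counts = Counter(assign_version(f"run-{i}") for i in range(1000))
# counts["a"] lands close to 800 and counts["b"] close to 200
```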
The A/B testing dashboard indicates when results reach statistical significance. Avoid drawing conclusions from small sample sizes -- wait until the dashboard confirms confidence before promoting a version.
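The dashboard's exact method is not documented here, but a standard two-proportion z-test illustrates what "statistical significance" means for success rates:

```python
# |z| > 1.96 corresponds to roughly 95% confidence that the two
# versions' success rates genuinely differ.
import math

def z_score(successes_a, runs_a, successes_b, runs_b):
    p_a, p_b = successes_a / runs_a, successes_b / runs_b
    pooled = (successes_a + successes_b) / (runs_a + runs_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / runs_a + 1 / runs_b))
    return (p_a - p_b) / se

z = z_score(790, 1000, 180, 250)   # 79.0% vs 72.0% success
significant = abs(z) > 1.96
```

Note how the standard error shrinks with sample size: the same 7-point gap that is significant at these volumes would not be at a tenth of the traffic, which is why small samples are misleading.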
## Lifecycle Triggers
Agents and multi-agent workflows can be triggered automatically through multiple mechanisms.
| Trigger Type | Configuration | Description |
|---|---|---|
| Manual | None (on-demand) | Triggered by a user from the catalog, API, or workflow. |
| Polling | `interval`, `backoff` | Checks a condition at regular intervals. Supports exponential backoff to reduce load when the condition is not met. |
| Schedule | `cron`, `timezone` | Runs on a cron schedule in the specified timezone. |
| Event | `type`, `filter` | Fires when a matching platform event occurs. Supports filter expressions. |
| Webhook | `path`, `auth` | Fires when an HTTP request hits the configured webhook path. Supports API key and JWT authentication. |
```yaml
triggers:
  - type: schedule
    cron: "0 8 * * MON"
    timezone: "America/New_York"
  - type: event
    event_type: data.loaded
    filter: "source == 'sales_warehouse' && row_count > 0"
  - type: webhook
    path: /hooks/revenue-report
    auth:
      type: api_key
      header: X-API-Key
  - type: polling
    interval_seconds: 300
    backoff:
      type: exponential
      max_interval_seconds: 3600
    condition: "SELECT COUNT(*) FROM staging.pending WHERE status = 'ready'"
```
## Observability
Multi-agent coordination provides additional observability beyond single-agent monitoring:
- Coordination trace -- Visualize the full message flow between agents, including event publications, message handoffs, and state reads/writes.
- Agent dependency graph -- See which agents depend on which, based on coordination patterns.
- Bottleneck detection -- Identify agents that slow down the overall workflow due to long execution times or frequent retries.
- State inspector -- View the contents of the shared state store at any point during execution.
## Design Considerations
When designing multi-agent systems, keep these principles in mind:
- Prefer loose coupling -- Use event-driven or shared state patterns when agents do not need direct interaction. This makes it easier to add, remove, or replace agents without disrupting the system.
- Set clear boundaries -- Each agent should have a single, well-defined responsibility. Avoid creating "super agents" that do everything.
- Use guard rails consistently -- Apply resource limits and PII filtering to every agent, not just the entry point. A single unconstrained agent can compromise the entire workflow.
- Version deliberately -- Pin versions in production workflows. Use `latest` only in development and testing environments.
- Test coordination -- Test multi-agent workflows end-to-end, not just individual agents. Message schemas, state keys, and event filters can introduce subtle failures that only appear during coordination.
## Next Steps
- Agent Builder -- Create the agents that participate in multi-agent workflows.
- Workflow Builder -- Define the graph-based workflows that orchestrate coordination.
- Agent Catalog -- Browse available agents and their versions.
- AI Infrastructure Overview -- Review the platform's AI architecture and LLM provider configuration.