Agent Builder
The Agent Builder lets you create custom AI agents with precise control over what they do, how they reason, and what constraints they operate within. Every agent is defined by a declarative specification that covers four areas: Identity, Capabilities, Behavior, and Guard Rails.
Navigate to AI > Agent Builder to open the visual editor, or define agents programmatically using YAML specifications.
Agent Specification
Identity
Identity defines who the agent is and what it aims to accomplish.
| Field | Description | Example |
|---|---|---|
| Title | Display name shown in the UI. | Revenue Analyst |
| Name | Unique agent identifier, auto-generated from title. | revenue-analyst |
| Version | Semantic version string. | 1.2.0 |
| Description | Summary of the agent's purpose. | Analyzes quarterly revenue trends and generates executive summaries. |
| Author | Creator of the agent. | data-team |
| Tags | Labels for categorization and discovery. | ["analytics", "revenue", "reporting"] |
| Icon | Visual identifier -- choose from Material Symbols icons or upload a custom logo image. | Material icon: analytics |
| Goals | List of objectives the agent pursues. | ["Compute quarterly revenue by region", "Identify top growth segments"] |
| Success Criteria | Measurable conditions for a successful run. | ["Output contains revenue_by_region table", "Report generated in under 60s"] |
```yaml
identity:
  title: Revenue Analyst
  name: revenue-analyst
  version: 1.2.0
  description: Analyzes quarterly revenue trends and generates executive summaries.
  author: data-team
  tags:
    - analytics
    - revenue
    - reporting
  icon:
    type: material-symbol
    value: analytics
  goals:
    - Compute quarterly revenue by region
    - Identify top growth segments
    - Generate executive summary report
  success_criteria:
    - Output contains revenue_by_region table
    - Report generated in under 60 seconds
```
Capabilities
Capabilities define the tools, data sources, and output formats available to the agent.
Tools
| Tool | Description | Input | Output |
|---|---|---|---|
| query | Execute PRQL or SQL queries against connected data sources. | PRQL/SQL string, datasource ID | Table results |
| search | Semantic search across data catalog metadata. | Search query string | Matching tables, columns, descriptions |
| llm | Call the tenant's configured LLM for reasoning or generation. | Prompt string, parameters | Text response |
| vision | Analyze images using GPT-4 Vision. | Image URL or base64 | Text description, structured data |
| web_search | Search the web for external information. | Search query | Search results with URLs and snippets |
| web_fetch | Fetch and parse content from a URL. | URL | Page content as text or markdown |
| vision_extraction | Extract structured data from documents using GPT-4 Vision. | Document file, extraction schema | Structured JSON |
| http_api | Make HTTP requests to external APIs. | Method, URL, headers, body | HTTP response |
| email | Send email notifications or retrieve email content. | Recipients, subject, body | Delivery status |
| ftp | Upload or download files from FTP/SFTP servers. | Host, path, credentials | File content or transfer status |
| csv | Parse or generate CSV files. | File path or data array | Parsed rows or file path |
| json | Parse, transform, or generate JSON documents. | JSON string or object | Transformed JSON |
| markdown | Generate formatted markdown documents. | Content structure | Markdown string |
| markdown_table_formatter | Format data as HTML or markdown tables with column labels, sorting, and value formatting. | Data array, columns, format options | Formatted table string |
| regression | Perform statistical linear regression analysis on grouped data. | Data array, x/y columns, confidence level | Slope, intercept, R², confidence intervals |
| csv_writer | Generate CSV files from structured data. | Data array, column definitions | CSV file content |
| csv_formatter | Format and transform CSV data with column mapping. | CSV data, formatting rules | Formatted CSV |
| summarize | Summarize text content using LLM with configurable length. | Text content, target length | Summarized text |
| array | Perform map, filter, reduce, and sort operations on arrays using JQ expressions. | Array data, operation, expression | Transformed array |
Data Sources
You specify which connected data sources the agent can access. The agent can only query data sources listed in its specification, and all queries are further restricted by ABAC policies.
Output Formats
Define the formats the agent produces: text, json, markdown, table, or any combination.
```yaml
capabilities:
  tools:
    - query
    - llm
    - csv
    - markdown
  data_sources:
    - datasource: sales_warehouse
      permissions: read
    - datasource: crm_database
      permissions: read
  output_formats:
    - markdown
    - table
```
Behavior
Behavior controls how the agent reasons, executes, and remembers context.
Execution Mode
| Mode | Description | Best For |
|---|---|---|
| Sequential | Steps execute one after another. Each step's output is available to the next. | Linear analysis pipelines, step-by-step reasoning. |
| Parallel | Independent steps execute simultaneously. | Data gathering from multiple sources, batch processing. |
| Hybrid | Combines sequential and parallel execution. Groups of parallel steps feed into sequential stages. | Complex workflows that benefit from both patterns. |
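To make the hybrid pattern concrete, the sketch below groups two parallel data-gathering steps ahead of a sequential analysis stage. The steps, group, parallel, and sequential fields are hypothetical illustrations, not documented parts of the specification:

```yaml
# Hypothetical sketch of a hybrid execution plan.
# The steps/group/parallel/sequential fields are illustrative only;
# they are not documented fields of the agent specification.
behavior:
  execution_mode: hybrid
  steps:
    - group: gather
      parallel:               # both steps run simultaneously
        - fetch_sales_data
        - fetch_crm_data
    - group: analyze
      sequential:             # runs after the gather group completes
        - compute_revenue_by_region
        - generate_summary
```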
Reasoning Strategy
| Strategy | Description | Best For |
|---|---|---|
| Step-by-step | The agent reasons through the problem one step at a time, building on each conclusion. | Analytical tasks requiring logical progression. |
| Parallel exploration | The agent explores multiple approaches simultaneously and selects the best result. | Open-ended questions with multiple valid approaches. |
| Hypothesis-driven | The agent forms a hypothesis, tests it against data, and refines iteratively. | Investigative analysis, anomaly detection. |
| Depth-first | The agent pursues one line of inquiry deeply before considering alternatives. | Detailed root cause analysis. |
| Breadth-first | The agent surveys all options at a high level before diving into any single path. | Exploratory analysis, option comparison. |
Retry Policy
Configure how the agent handles failures. With exponential backoff, the delay doubles after each attempt, so an initial delay of 1000 ms yields waits of roughly 1, 2, and 4 seconds across three retries:
```yaml
behavior:
  execution_mode: hybrid
  reasoning_strategy: hypothesis-driven
  retry_policy:
    max_retries: 3
    backoff: exponential
    initial_delay_ms: 1000
  memory:
    type: hybrid
    short_term:
      max_turns: 20
    long_term:
      storage: tenant_memory_store
      ttl_days: 30
```
Memory
| Memory Type | Description | Persistence |
|---|---|---|
| Short-term | Context from the current execution. Includes step outputs and intermediate results. | Current run only. |
| Long-term | Persisted knowledge from previous runs. The agent can recall past findings. | Configurable TTL. |
| Hybrid | Combines short-term and long-term. The agent uses current context and historical knowledge. | Both scopes. |
Guard Rails
Guard rails constrain the agent's behavior to keep it safe, efficient, and compliant.
Resource Limits
| Resource | Description | Default | Maximum |
|---|---|---|---|
| Tokens | Total LLM tokens (input + output) per run. | 50,000 | 500,000 |
| Memory | Working memory allocation per run. | 256 MB | 2 GB |
| CPU | CPU time limit per run. | 60 s | 600 s |
| API calls | Maximum external API calls per run. | 50 | 500 |
Constraints
| Constraint | Description |
|---|---|
| PII filtering | Scrub personally identifiable information before sending data to an LLM. Configurable patterns and entity types. |
| SQL injection prevention | All generated queries pass through parameterized validation. Agents cannot execute raw, unvalidated SQL. |
| Approval requirements | Require human approval before executing specific tools (e.g., email, http_api) or when token usage exceeds a threshold. |
| Data source restrictions | Limit the agent to specific data sources, schemas, or tables. |
| Output redaction | Automatically redact sensitive values in agent output. |
```yaml
guard_rails:
  resource_limits:
    max_tokens: 100000
    max_memory_mb: 512
    max_cpu_seconds: 120
    max_api_calls: 100
  constraints:
    pii_filtering:
      enabled: true
      entities: [email, phone, ssn, credit_card]
    sql_injection_prevention: true
    approval_required:
      tools: [email, http_api]
      token_threshold: 80000
```
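The remaining two constraints, data source restrictions and output redaction, could be expressed in the same constraints block. The field names in this sketch are illustrative assumptions, not documented spec fields:

```yaml
# Illustrative sketch only -- data_source_restrictions and
# output_redaction are assumed field names, not documented ones.
guard_rails:
  constraints:
    data_source_restrictions:
      datasource: sales_warehouse
      schemas: [finance]
      tables: [orders, invoices]
    output_redaction:
      enabled: true
      patterns: ["\\d{3}-\\d{2}-\\d{4}"]   # e.g. SSN-shaped values
```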
Policy
Policy controls who can access and operate the agent. Each agent can define granular access control for four operations:
| Operation | Description |
|---|---|
| Read | View the agent specification and details. |
| Write | Edit and update the agent configuration. |
| Delete | Remove the agent from the catalog. |
| Run | Execute the agent manually or via triggers. |
```yaml
policy:
  read:
    roles: [viewer, operator, builder, admin]
  write:
    roles: [builder, admin]
  delete:
    roles: [admin]
  run:
    roles: [operator, builder, admin]
```
LLM Configuration
Each agent can specify which LLM to use, overriding the tenant default from the LLM Registry.
```yaml
llm:
  provider: openai
  model: gpt-4
  temperature: 0.3
  max_tokens: 4096
```
When no agent-level LLM is specified, the agent uses the tenant's default LLM for the text_generation capability.
Testing
The Agent Builder includes a built-in testing environment where you can validate your agent before publishing.
Test Runs
1. Click Test in the builder toolbar.
2. Provide sample input parameters.
3. The agent executes in a sandboxed environment with full logging.
4. Review each step: the reasoning, tool calls, intermediate results, and final output.
5. Inspect token usage, execution time, and resource consumption.
Assertions
Define assertions that automatically validate test run output:
```yaml
tests:
  - name: revenue_report_generates
    input:
      quarter: Q3
      year: 2025
    assertions:
      - output.format == "markdown"
      - output.contains("Revenue by Region")
      - execution_time_ms < 60000
```
Test Scenarios
Define complete test scenarios with mock data and evaluation metrics:
```yaml
tests:
  scenarios:
    - name: high_revenue_quarter
      description: Verify agent handles high-revenue data correctly
      input:
        quarter: Q3
        year: 2025
      expected_output:
        format: markdown
        contains: ["Revenue by Region"]
      assertions:
        - execution_time_ms < 60000
        - output.tables.length > 0
    - name: empty_dataset
      description: Verify agent handles empty results gracefully
      input:
        quarter: Q1
        year: 2020
      assertions:
        - output.contains("No data available")
  evaluation_metrics:
    - name: accuracy
      threshold: 0.95
    - name: latency_p99
      threshold: 30000
```
Test runs execute against your real data sources (with your access policies applied). This ensures the agent behaves correctly with production schemas and data volumes, not just mock data.
Monitoring
After publishing an agent, monitor its health and performance from AI > Agent Monitoring.
| Metric | Description |
|---|---|
| Run count | Total runs over a selected time period. |
| Success rate | Percentage of runs that completed successfully. |
| Average duration | Mean execution time across runs. |
| Token usage | Average and peak token consumption per run. |
| Error rate | Percentage of runs that failed, grouped by error type. |
| Resource utilization | CPU, memory, and API call usage relative to configured limits. |
When an agent consistently approaches its resource limits (above 80% utilization), consider increasing the limits or optimizing the agent's reasoning strategy to reduce token consumption.
Next Steps
- Workflow Builder -- Compose agents into multi-step workflows.
- Multi-Agent Coordination -- Coordinate multiple agents with shared state and event-driven patterns.
- Agent Catalog -- Publish your agent to the catalog for others to discover and use.