# AI/ML Configuration
Datafi's AI capabilities -- including natural-language Chat, autonomous Agents, and Insights -- rely on large language model (LLM) providers and supporting infrastructure that you configure at the tenant level. Navigate to Administration > AI/ML to manage these settings.
## LLM Provider Selection
Each tenant can connect to one or more LLM providers. You select the active provider per feature (Chat, Agents, Insights) so that different workloads can use different models based on cost, latency, or compliance requirements.
| Provider | Authentication | Supported Models | Notes |
|---|---|---|---|
| OpenAI | API key | GPT-4o, GPT-4o mini, o1, o3-mini | Direct API access. Best for teams already using OpenAI. |
| Azure OpenAI | Azure AD / API key | GPT-4o, GPT-4o mini (deployed endpoints) | Data stays within your Azure tenant. Required for some compliance scenarios. |
| Amazon Bedrock | IAM role / access key | Claude (Anthropic), Titan, Llama | No model hosting required. Access models through your AWS account. |
| Snowflake Cortex | Snowflake session token | Cortex LLM functions | Runs within your Snowflake environment. Data never leaves Snowflake. |
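
The credential set you enter differs by provider. As a rough sketch of what each provider type expects (the field names below are illustrative, not Datafi's actual configuration schema):

```python
# Hypothetical credential shapes per provider type. Field names are
# illustrative only; enter the equivalent values in the Providers UI.
PROVIDER_CREDENTIALS = {
    "openai": {
        "api_key": "sk-...",                  # direct API access
    },
    "azure_openai": {
        "endpoint": "https://<resource>.openai.azure.com",
        "api_key": "...",                     # or an Azure AD token
        "deployment": "gpt-4o",               # your deployed endpoint name
    },
    "amazon_bedrock": {
        "region": "us-east-1",
        "access_key_id": "...",               # or rely on an attached IAM role
        "secret_access_key": "...",
    },
    "snowflake_cortex": {
        "account": "<org>-<account>",
        "session_token": "...",
    },
}
```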
### Configuring a Provider
1. Navigate to Administration > AI/ML > Providers.
2. Click Add Provider.
3. Select the provider type from the dropdown.
4. Enter the required credentials (see the table above).
5. Click Test Connection to verify access (an out-of-band credential check is sketched after these steps).
6. Click Save.
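
If Test Connection fails for an OpenAI provider, you can verify the key directly against the OpenAI API before registering it. A minimal sketch using the official openai Python package (the key value and error handling are illustrative):

```python
# Sanity-check an OpenAI API key outside the Datafi UI.
# pip install openai
from openai import OpenAI

client = OpenAI(api_key="sk-...")  # the key you plan to register

try:
    models = client.models.list()  # cheap, read-only call
    print(f"Key accepted; {len(models.data)} models visible")
except Exception as exc:
    print(f"Key rejected or network issue: {exc}")
```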
Once a provider is configured, you assign it to specific AI features in the Feature Mapping section.
If your organization requires that data never leave a specific cloud boundary, choose Azure OpenAI (for Azure environments), Amazon Bedrock (for AWS environments), or Snowflake Cortex (for Snowflake-native processing).
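
Expressed as a lookup, that guidance reduces to a simple boundary-to-provider mapping (illustrative only; this is not a Datafi API):

```python
# Illustrative mapping from cloud boundary to the in-boundary provider.
BOUNDARY_PROVIDERS = {
    "azure": "azure_openai",          # data stays in your Azure tenant
    "aws": "amazon_bedrock",          # models accessed via your AWS account
    "snowflake": "snowflake_cortex",  # processing never leaves Snowflake
}

def provider_for_boundary(boundary: str) -> str:
    """Return the provider that keeps data inside the given cloud boundary."""
    try:
        return BOUNDARY_PROVIDERS[boundary.lower()]
    except KeyError:
        raise ValueError(f"no in-boundary provider known for {boundary!r}")
```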
## Feature Mapping
| Feature | Description | Recommended Provider |
|---|---|---|
| Chat | Natural-language questions about your data. | Any provider with a conversational model. |
| Agents | Autonomous workflows that monitor and act on data. | Providers with function-calling support. |
| Insights | Automated trend detection, anomalies, and summaries. | High-throughput providers for batch processing. |
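
Conceptually, feature mapping is a per-feature provider assignment. A hypothetical example, where each value names a provider configured in the previous section:

```python
# Hypothetical feature-to-provider assignment; provider names refer to
# entries configured under Administration > AI/ML > Providers.
FEATURE_MAPPING = {
    "chat": "azure_openai",        # conversational model, Azure-resident data
    "agents": "openai",            # function-calling support for workflows
    "insights": "amazon_bedrock",  # high throughput for batch processing
}
```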
## Vector Database Configuration
Datafi uses a vector database to store embeddings for semantic search, schema matching, and retrieval-augmented generation (RAG). Currently, Redis (with the RediSearch module) is the supported vector store.
| Parameter | Description | Default |
|---|---|---|
| Host | The hostname or IP address of your Redis instance. | localhost |
| Port | The port on which Redis is listening. | 6379 |
| Password | Optional authentication password. | -- |
| TLS Enabled | Whether the connection uses TLS encryption. | true |
| Index Prefix | A namespace prefix applied to all vector indices created by Datafi. | datafi: |
| Embedding Dimensions | The dimensionality of the embedding vectors. Must match your chosen embedding model. | 1536 |
To configure the vector database:
1. Navigate to Administration > AI/ML > Vector Database.
2. Enter your Redis connection details.
3. Click Test Connection to confirm connectivity and module availability (an equivalent check is sketched after these steps).
4. Click Save.
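
The same checks can be run outside the UI with the redis-py client. A minimal sketch, assuming network access to the instance (connection values mirror the defaults in the table above):

```python
# Verify reachability and RediSearch availability, mirroring Test Connection.
# pip install redis
import redis

r = redis.Redis(
    host="localhost",
    port=6379,
    password=None,          # set this if your instance requires auth
    ssl=True,               # TLS Enabled defaults to true
    decode_responses=True,
)

r.ping()  # raises redis.exceptions.ConnectionError if unreachable

modules = {m["name"] for m in r.module_list()}
if "search" not in modules:
    raise RuntimeError("RediSearch module is not loaded on this instance")
print("Connection OK; RediSearch module available")
```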
Changing the Embedding Dimensions value after embeddings have been generated requires a full re-indexing of all stored vectors. Plan this operation during a maintenance window.
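
The dimensionality is fixed at index creation time, which is why changing it forces a rebuild. A sketch of how such a vector index is created with redis-py (the index and field names are illustrative, not the ones Datafi creates):

```python
# A RediSearch vector index declares a fixed DIM up front.
# pip install redis
import redis
from redis.commands.search.field import VectorField

r = redis.Redis(host="localhost", port=6379, ssl=True)

r.ft("datafi:example-index").create_index(
    fields=[
        VectorField(
            "embedding",
            "HNSW",  # graph-based approximate nearest-neighbor algorithm
            {
                "TYPE": "FLOAT32",
                "DIM": 1536,  # must equal the Embedding Dimensions setting
                "DISTANCE_METRIC": "COSINE",
            },
        )
    ]
)
# Every vector stored in this index must be exactly 1536-dimensional.
# Switching to an embedding model with a different output size means
# dropping the index and re-embedding all documents.
```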
## Model Parameters
You can tune how Datafi calls LLMs by adjusting model parameters on a per-feature basis. These settings apply at request time; they do not fine-tune the underlying model.
| Parameter | Range | Default | Description |
|---|---|---|---|
| Temperature | 0.0 -- 2.0 | 0.1 | Controls response randomness. Lower values produce more deterministic outputs. |
| Max Tokens | 1 -- 128000 | 4096 | The maximum number of tokens in a single model response. |
| Top P | 0.0 -- 1.0 | 0.9 | Nucleus sampling threshold. Lower values narrow the token selection pool. |
| Frequency Penalty | -2.0 -- 2.0 | 0.0 | Reduces repetition of tokens that have already appeared. |
| System Prompt | Free text | Platform default | The system-level instruction prepended to every request. |
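
To see how these settings surface in a raw request, here is a minimal sketch of an OpenAI chat completion using the defaults from the table (the model name and prompts are illustrative):

```python
# How the table's parameters appear in a raw OpenAI request.
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        # The System Prompt is prepended to every request.
        {"role": "system", "content": "You answer questions about tenant data."},
        {"role": "user", "content": "Summarize last week's sales trend."},
    ],
    temperature=0.1,        # low randomness: near-deterministic answers
    max_tokens=4096,        # cap on the length of a single response
    top_p=0.9,              # nucleus sampling threshold
    frequency_penalty=0.0,  # no extra penalty on repeated tokens
)
print(response.choices[0].message.content)
```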
To adjust model parameters:
1. Navigate to Administration > AI/ML > Model Parameters.
2. Select the feature (Chat, Agents, or Insights) you want to configure.
3. Modify the parameter values.
4. Click Save.
Parameter changes apply to new requests only. In-flight conversations retain the parameters that were active when the session started.
## Next Steps
- User Management -- Manage users whose attributes may influence AI behavior.
- Events and Notifications -- Configure alerts for AI-related events.
- Workspace Management -- Adjust workspace-level settings.