AI/ML Configuration

Datafi's AI capabilities -- including natural-language Chat, autonomous Agents, and Insights -- rely on large language model (LLM) providers and supporting infrastructure that you configure at the tenant level. Navigate to Administration > AI/ML to manage these settings.


LLM Provider Selection

Each tenant can connect to one or more LLM providers. You select the active provider per feature (Chat, Agents, Insights) so that different workloads can use different models based on cost, latency, or compliance requirements.

| Provider | Authentication | Supported Models | Notes |
| --- | --- | --- | --- |
| OpenAI | API key | GPT-4o, GPT-4o mini, o1, o3-mini | Direct API access. Best for teams already using OpenAI. |
| Azure OpenAI | Azure AD / API key | GPT-4o, GPT-4o mini (deployed endpoints) | Data stays within your Azure tenant. Required for some compliance scenarios. |
| Amazon Bedrock | IAM role / access key | Claude (Anthropic), Titan, Llama | No model hosting required. Access models through your AWS account. |
| Snowflake Cortex | Snowflake session token | Cortex LLM functions | Runs within your Snowflake environment. Data never leaves Snowflake. |

Configuring a Provider

  1. Navigate to Administration > AI/ML > Providers.
  2. Click Add Provider.
  3. Select the provider type from the dropdown.
  4. Enter the required credentials (see the table above).
  5. Click Test Connection to verify access.
  6. Click Save.

Once a provider is configured, you assign it to specific AI features in the Feature Mapping section.

Tip: If your organization requires that data never leave a specific cloud boundary, choose Azure OpenAI (for Azure environments), Amazon Bedrock (for AWS environments), or Snowflake Cortex (for Snowflake-native processing).
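Before saving a provider, it can help to see which credential fields from the table above apply to each provider type. The sketch below is illustrative only — the field names and provider identifiers are assumptions, not Datafi's actual API:

```python
# Hypothetical mapping of provider types to the credential fields they
# need (per the table above). Names are illustrative, not Datafi internals.
REQUIRED_CREDENTIALS = {
    "openai": ["api_key"],
    "azure_openai": ["endpoint", "api_key"],  # or an Azure AD token
    "amazon_bedrock": ["region", "access_key_id", "secret_access_key"],
    "snowflake_cortex": ["account", "session_token"],
}

def missing_credentials(provider_type: str, supplied: dict) -> list[str]:
    """Return the credential fields still missing for a provider config."""
    required = REQUIRED_CREDENTIALS.get(provider_type)
    if required is None:
        raise ValueError(f"Unknown provider type: {provider_type}")
    return [field for field in required if not supplied.get(field)]

# A Bedrock config with only a region is incomplete:
print(missing_credentials("amazon_bedrock", {"region": "us-east-1"}))
# → ['access_key_id', 'secret_access_key']
```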

Feature Mapping

| Feature | Description | Recommended Provider |
| --- | --- | --- |
| Chat | Natural-language questions about your data. | Any provider with a conversational model. |
| Agents | Autonomous workflows that monitor and act on data. | Providers with function-calling support. |
| Insights | Automated trend detection, anomalies, and summaries. | High-throughput providers for batch processing. |
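Conceptually, feature mapping is a per-feature lookup from feature to configured provider. The sketch below shows that idea with made-up provider assignments; the names are assumptions for illustration, not a real tenant's configuration:

```python
# Illustrative feature-to-provider mapping, as configured under
# Administration > AI/ML > Feature Mapping. Values are examples only.
FEATURE_MAPPING = {
    "chat": "azure_openai",        # conversational model, compliance boundary
    "agents": "openai",            # function-calling support
    "insights": "amazon_bedrock",  # high-throughput batch processing
}

def provider_for(feature: str) -> str:
    """Resolve the active provider for an AI feature."""
    try:
        return FEATURE_MAPPING[feature]
    except KeyError:
        raise ValueError(f"No provider mapped for feature {feature!r}") from None
```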

Vector Database Configuration

Datafi uses a vector database to store embeddings for semantic search, schema matching, and retrieval-augmented generation (RAG). Currently, Redis (with the RediSearch module) is the only supported vector store.

| Parameter | Description | Default |
| --- | --- | --- |
| Host | The hostname or IP address of your Redis instance. | localhost |
| Port | The port on which Redis is listening. | 6379 |
| Password | Optional authentication password. | (none) |
| TLS Enabled | Whether the connection uses TLS encryption. | true |
| Index Prefix | A namespace prefix applied to all vector indices created by Datafi. | datafi: |
| Embedding Dimensions | The dimensionality of the embedding vectors. Must match your chosen embedding model. | 1536 |

To configure the vector database:

  1. Navigate to Administration > AI/ML > Vector Database.
  2. Enter your Redis connection details.
  3. Click Test Connection to confirm connectivity and module availability.
  4. Click Save.

Warning: Changing the Embedding Dimensions value after embeddings have been generated requires a full re-indexing of all stored vectors. Plan this operation during a maintenance window.


Model Parameters

You can fine-tune how Datafi interacts with LLMs by adjusting model parameters on a per-feature basis.

| Parameter | Range | Default | Description |
| --- | --- | --- | --- |
| Temperature | 0.0 to 2.0 | 0.1 | Controls response randomness. Lower values produce more deterministic outputs. |
| Max Tokens | 1 to 128000 | 4096 | The maximum number of tokens in a single model response. |
| Top P | 0.0 to 1.0 | 0.9 | Nucleus sampling threshold. Lower values narrow the token selection pool. |
| Frequency Penalty | -2.0 to 2.0 | 0.0 | Reduces repetition of tokens that have already appeared. |
| System Prompt | Free text | Platform default | The system-level instruction prepended to every request. |

To adjust model parameters:

  1. Navigate to Administration > AI/ML > Model Parameters.
  2. Select the feature (Chat, Agents, or Insights) you want to configure.
  3. Modify the parameter values.
  4. Click Save.

Note: Parameter changes apply to new requests only. In-flight conversations retain the parameters that were active when the session started.
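This snapshot behaviour can be sketched as a session copying the active parameters at start time, so later edits don't affect it. Class and field names below are illustrative, not Datafi internals:

```python
class ChatSession:
    """Illustrative session that snapshots parameters when it starts."""
    def __init__(self, current_params: dict):
        self.params = dict(current_params)  # copy, not a shared reference

tenant_params = {"temperature": 0.1, "max_tokens": 4096}
session = ChatSession(tenant_params)

tenant_params["temperature"] = 0.7       # admin changes the setting later
print(session.params["temperature"])     # → 0.1: the in-flight session keeps
                                         # the value active at session start
```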


Next Steps