Supported Data Sources

Datafi connects to a broad range of relational databases, cloud data warehouses, cloud storage services, and third-party platforms. This page provides the complete compatibility matrix.

Relational Databases and Data Warehouses

| Data Source | Authentication Methods | Driver Type | Key Features |
| --- | --- | --- | --- |
| Snowflake | JWT, OAuth | ODBC | Multi-cluster warehouses, time travel queries, semi-structured data (VARIANT). |
| PostgreSQL | Password, SSL client certificates | sqlx (native) | Full SQL support, JSONB columns, materialized views, extensions. |
| Microsoft SQL Server | Windows Authentication, SQL Authentication | ODBC | T-SQL dialect, linked servers, temporal tables. |
| MySQL | Password | Native driver | InnoDB and MyISAM support, full-text search, replication-aware connections. |
| MariaDB | Password | Native driver | MySQL-compatible with additional storage engines and features. |
| Amazon Redshift | IAM roles, Password | JDBC | Columnar storage, Spectrum for S3 queries, concurrency scaling. |
| Google BigQuery | Service Account (JSON key) | REST API | Serverless, petabyte-scale analytics, Standard SQL dialect. |
| Databricks | Personal Access Token | Spark SQL | Unity Catalog integration, Delta Lake support, SQL warehouses. |
| Azure Synapse Analytics | Azure Active Directory, SQL Authentication | ODBC | Dedicated and serverless SQL pools, PolyBase integration. |
| Oracle Database | TNS (Transparent Network Substrate) | OCI (Oracle Call Interface) | PL/SQL support, partitioning, RAC-aware connections. |
| IBM DB2 | Password | ODBC | z/OS and LUW support, federated queries, workload management. |

Driver Selection

Datafi selects the highest-performance driver available for each data source. Native drivers (such as sqlx for PostgreSQL) are preferred over ODBC when they provide better performance or feature coverage.
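Conceptually, this works like an ordered preference list per source type with ODBC as the fallback. The sketch below is purely illustrative; the names and structure are hypothetical and do not reflect Datafi's internal API.

```python
# Hypothetical sketch of per-source driver preference.
# Datafi's actual selection logic is internal and may differ.
DRIVER_PREFERENCES = {
    "postgresql": ["sqlx", "odbc"],   # native sqlx preferred over ODBC
    "mysql":      ["native", "odbc"],
    "snowflake":  ["odbc"],
    "redshift":   ["jdbc"],
}

def select_driver(source_type: str, installed: set[str]) -> str:
    """Return the highest-preference driver that is actually available."""
    for driver in DRIVER_PREFERENCES.get(source_type, ["odbc"]):
        if driver in installed:
            return driver
    raise RuntimeError(f"No usable driver installed for {source_type}")
```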

Cloud Storage

You can connect to flat files and semi-structured data stored in cloud object storage. Datafi reads these files and presents them as queryable datasets.

| Storage Provider | Supported Formats | Authentication |
| --- | --- | --- |
| Amazon S3 | CSV, JSON, Parquet | IAM Role, Access Key + Secret Key |
| Azure Blob Storage | CSV, JSON, Parquet | Shared Access Signature (SAS), Azure AD |
| Google Cloud Storage | CSV, JSON, Parquet | Service Account (JSON key) |

How it works:

  1. You configure a cloud storage connector with the bucket or container path and authentication credentials.
  2. Datafi scans the specified path and discovers available files.
  3. Files are registered as datasets with inferred schemas.
  4. Queries against these datasets are executed by the Edge node, which reads the files on demand.
Note: Cloud storage connectors are best suited for analytical workloads on static or slowly changing data. For high-frequency transactional access, use a dedicated database connector.
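To give a rough sense of what steps 2 and 3 involve, the sketch below uses pyarrow, a common open-source library that is not part of Datafi, to discover Parquet files under a path and report an inferred schema. The paths are placeholders, and this is only an illustration of the general idea, not Datafi's implementation.

```python
# Illustration of file discovery + schema inference using pyarrow.
import pyarrow.dataset as ds

# A local directory is used here; with credentials configured, an
# object-store URI such as "s3://my-bucket/events/" (placeholder) also works.
dataset = ds.dataset("data/events/", format="parquet")

print(dataset.files)    # files discovered under the path (step 2)
print(dataset.schema)   # schema inferred from those files (step 3)
```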

File-Based Sources

| Format | Description | Schema Detection |
| --- | --- | --- |
| CSV | Comma-separated values. Configurable delimiters, headers, and encoding. | Automatic type inference from sample rows. |
| JSON | JSON arrays or newline-delimited JSON (NDJSON). | Automatic schema inference from document structure. |
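"Type inference from sample rows" is the same idea as reading a limited number of rows and deriving column types from them. The snippet below shows that idea with pandas; the file name is a placeholder and this is not Datafi's inference engine.

```python
# Generic illustration of type inference from sample rows using pandas.
# Datafi's own inference may sample differently or map to different types.
import pandas as pd

sample = pd.read_csv("orders.csv", nrows=1000)   # read only a sample of rows
print(sample.dtypes)                             # inferred column types (int64, float64, object, ...)
```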

Third-Party Platform Connectors

Datafi provides connectors for third-party business platforms, allowing you to query operational data alongside your analytical databases.

| Platform | Authentication | Data Access |
| --- | --- | --- |
| Salesforce | OAuth 2.0 | Objects, reports, and SOQL queries. |
| Microsoft Dynamics 365 | Azure AD / OAuth 2.0 | Entities and custom tables via Dataverse API. |
| NetSuite | Token-Based Authentication (TBA) | SuiteQL queries, saved searches, records. |

Info: Third-party connectors expose platform data as standard datasets. You can apply the same policies, build the same data views, and use the same query interface as you would with any relational database.

Planned Connectors

The following connectors are on the roadmap. Availability dates are subject to change.

| Data Source | Status | Expected Driver |
| --- | --- | --- |
| Shopify | In development | REST / GraphQL API |
| HubSpot | Planned | REST API |
| SAP HANA | Planned | ODBC |
| Teradata | Planned | ODBC |
| Elasticsearch | Planned | REST API |
| MongoDB | Planned | Native driver |
| Cassandra | Planned | Native driver |
| ClickHouse | Planned | Native driver |

Connector Configuration Summary

Every connector requires the following baseline configuration:

| Parameter | Description | Required |
| --- | --- | --- |
| Name | A human-readable identifier for the connector. | Yes |
| Type | The data source type (e.g., postgresql, snowflake, s3). | Yes |
| Edge Server | The Edge node that will host this connection. | Yes |
| Authentication | Credentials specific to the data source type. | Yes |
| Host / Endpoint | The database hostname, IP address, or API endpoint. | Yes (except cloud storage) |
| Port | The database port. Defaults are applied per data source type. | No |
| Database / Schema | The default database and schema to connect to. | Varies by type |
| Connection Pool Size | Maximum number of concurrent connections to the data source. | No (default: 10) |
| Timeout | Connection and query timeout in seconds. | No (default: 300s) |
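Put together, a connector definition covering these parameters might look like the mapping below. The field names and values are illustrative only; they mirror the table above rather than Datafi's exact configuration schema.

```python
# Hypothetical connector definition mirroring the baseline parameters above.
# Field names, values, and structure are illustrative, not Datafi's actual schema.
connector = {
    "name": "warehouse-postgres",          # human-readable identifier
    "type": "postgresql",                  # data source type
    "edge_server": "edge-us-east-1",       # Edge node hosting the connection
    "authentication": {
        "method": "password",
        "username": "datafi_reader",
        "password": "<secret>",            # placeholder; keep real secrets in a secret store
    },
    "host": "db.internal.example.com",
    "port": 5432,                          # optional; default applied per type
    "database": "analytics",
    "schema": "public",
    "connection_pool_size": 10,            # optional; default 10
    "timeout_seconds": 300,                # optional; default 300s
}
```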

Authentication Methods Reference

| Method | Data Sources | Description |
| --- | --- | --- |
| Password | PostgreSQL, MySQL, MariaDB, MSSQL, DB2, Redshift | Standard username and password authentication. |
| JWT | Snowflake | JSON Web Token for service-to-service authentication. |
| OAuth 2.0 | Snowflake, Salesforce, Dynamics | Delegated authorization using OAuth flows. |
| SSL Client Certificates | PostgreSQL | Mutual TLS using client certificate and key. |
| Windows Authentication | MSSQL, Synapse | Integrated Windows / Kerberos authentication. |
| Azure Active Directory | Synapse, Dynamics | Azure AD tokens for Microsoft services. |
| IAM Roles | Redshift, S3 | AWS Identity and Access Management. |
| Service Account | BigQuery, GCS | Google Cloud service account JSON key file. |
| Personal Access Token | Databricks | Token-based authentication for Databricks workspaces. |
| TNS | Oracle | Oracle Net Services connection descriptor. |
| Token-Based (TBA) | NetSuite | NetSuite Token-Based Authentication. |
| SAS Token | Azure Blob | Shared Access Signature for Azure Storage. |
| Access Key | S3 | AWS access key ID and secret access key pair. |
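Outside of Datafi, two of these credential styles look roughly like the snippet below: IAM roles rely on the ambient AWS credential chain, so no secret appears in the configuration, while a Google service account is a JSON key file you hand to the client. Bucket, prefix, and file names are placeholders, and the standard AWS and Google client libraries are used here purely for illustration.

```python
# Generic illustrations of two authentication styles from the table above.
# These use the standard AWS and Google client libraries, not Datafi APIs.

# IAM role: boto3 picks up credentials from the environment (e.g. an EC2
# instance profile), so no access key is written into the configuration.
import boto3
s3 = boto3.client("s3")
objects = s3.list_objects_v2(Bucket="my-bucket", Prefix="exports/")

# Service account: the JSON key file issued for the service account is
# supplied explicitly when constructing the client.
from google.cloud import bigquery
bq = bigquery.Client.from_service_account_json("service-account-key.json")
rows = bq.query("SELECT 1").result()
```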

Next Steps

  • Multi-Protocol APIs -- Learn how to interact with your connected data sources through the available API protocols.
  • Request Lifecycle -- Understand how queries flow through the platform to your data sources.