Managing Datasets
After you connect a data source or upload a file, use the Data Catalog to inspect, organize, and maintain each dataset. This page covers the dataset details page, schema browsing, tags, regions, metadata editing, and deletion.
Dataset Details Page
Every dataset in the catalog has a dedicated details page. To open it, click the dataset name in the Data Catalog list. The details page shows:
- Connection information -- The connector type, Edge Server, database name, and connection status.
- Schema browser -- A navigable tree of tables, columns, data types, and relationships.
- Access summary -- The list of users and groups who have access, along with their roles.
- Policies -- Any row-level, column-level, or query-limit rules applied to the dataset.
- Activity -- A log of recent queries and access events.
Browsing the Schema
The schema browser on the dataset details page lets you explore the structure of your data without writing a query. You can:
- Expand tables to see their columns, data types, and constraints.
- View primary keys, foreign keys, and indexes where available.
- Search within the schema to locate a specific table or column by name.
- See sample values for each column to verify that the data looks as expected.
If you make structural changes to the underlying database -- such as adding a table or altering a column -- use the Refresh Schema action on the dataset details page to pull the latest structure into Datafi.
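To illustrate what a schema refresh has to reconcile, the sketch below compares a cached schema snapshot with the live structure and lists the differences. It is a minimal, standalone illustration and not Datafi's implementation: the function `diff_schema` and the `{table: {column: type}}` snapshot format are assumptions made for this example.

```python
def diff_schema(cached, live):
    """Compare two hypothetical {table: {column: type}} snapshots.

    Returns a list of human-readable change descriptions.
    """
    changes = []
    # Tables present only in the live database were added since the last refresh.
    for table in sorted(live.keys() - cached.keys()):
        changes.append(f"table added: {table}")
    # Tables present only in the cached snapshot were dropped.
    for table in sorted(cached.keys() - live.keys()):
        changes.append(f"table removed: {table}")
    # For tables in both snapshots, compare columns and their types.
    for table in sorted(cached.keys() & live.keys()):
        old_cols, new_cols = cached[table], live[table]
        for col in sorted(new_cols.keys() - old_cols.keys()):
            changes.append(f"column added: {table}.{col}")
        for col in sorted(old_cols.keys() - new_cols.keys()):
            changes.append(f"column removed: {table}.{col}")
        for col in sorted(old_cols.keys() & new_cols.keys()):
            if old_cols[col] != new_cols[col]:
                changes.append(
                    f"column altered: {table}.{col} "
                    f"({old_cols[col]} -> {new_cols[col]})"
                )
    return changes


# Illustrative snapshots (made-up table and column names).
cached = {"orders": {"id": "int", "total": "float"}}
live = {"orders": {"id": "int", "total": "decimal"}, "refunds": {"id": "int"}}

for change in diff_schema(cached, live):
    print(change)
```

Any non-empty result here corresponds to the situation in which you would run Refresh Schema.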
Tags
Tags are freeform labels you attach to a dataset to improve discoverability. Use them to categorize datasets by team, domain, environment, sensitivity level, or any other dimension that matters to your organization.
To add a tag:
- Open the dataset details page.
- Click Add Tag.
- Enter the tag name and save.
Examples of useful tags:
- finance, marketing, engineering -- Team or department.
- production, staging, development -- Environment.
- pii, confidential, public -- Sensitivity level.
- daily-refresh, real-time -- Update frequency.
You can filter the Data Catalog by tag to quickly find all datasets that share a label.
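As a rough model of how tag filtering behaves, the sketch below selects every dataset that carries a given label from an in-memory list. The `Dataset` class, the `filter_by_tag` function, and the dataset names are hypothetical and exist only for this example. The sketch also assumes exact, case-sensitive matching; the catalog's own matching rules may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for a catalog entry: a name plus freeform tags."""
    name: str
    tags: set[str] = field(default_factory=set)


def filter_by_tag(datasets, tag):
    """Return the datasets that carry the given tag (exact match assumed)."""
    return [d for d in datasets if tag in d.tags]


# Illustrative catalog using the example tags from this section.
catalog = [
    Dataset("orders", {"finance", "production", "pii"}),
    Dataset("web_events", {"marketing", "real-time"}),
    Dataset("ledger_staging", {"finance", "staging"}),
]

finance = filter_by_tag(catalog, "finance")
print([d.name for d in finance])  # ['orders', 'ledger_staging']
```

Because tags are freeform, agreeing on one spelling per label (for example, always `pii` rather than `PII`) keeps filters like this from silently missing datasets.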
Regions
Each dataset can be associated with a region that indicates where the data physically resides or where the Edge Server is located. Regions help users understand data residency for compliance and latency purposes.
You set the region during the connection process, and you can update it at any time from the dataset details page.
Editing Dataset Metadata
Data Owners can update the following metadata fields from the dataset details page:
- Dataset name -- Rename the dataset to something more descriptive.
- Description -- Add or update a plain-text description that explains what the dataset contains and how it should be used.
- Tags -- Add or remove tags.
- Region -- Change the associated region.
Changes to metadata take effect immediately and are visible to all users who have access to the dataset.
Deleting a Dataset
When you no longer need a dataset, you can permanently remove it from the Data Catalog.
Deleting a dataset is irreversible. All related configurations -- policies, sharing settings, tags, query history, and cached schema information -- are deleted along with the dataset. For uploaded files, the stored data is also permanently removed.
To delete a dataset:
- Open the dataset details page.
- Click the DELETE button.
- A confirmation dialog appears. Type the word DELETE in the text field to confirm your intent.
- Click Confirm.
The dataset is removed from the catalog immediately. Users who previously had access will no longer see it, and any Data Views, Data Apps, or queries that reference the deleted dataset will stop functioning.
Before deleting a dataset, verify that no active Data Views, Data Apps, or scheduled queries depend on it. Removing a dataset that is still in use will break those downstream consumers without warning.
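The pre-deletion check described above can be sketched as a lookup over an inventory of downstream consumers. This is a minimal sketch under stated assumptions: the `Consumer` record, the `find_dependents` function, and the inventory contents are all hypothetical, and you would need to assemble the inventory yourself from your Data Views, Data Apps, and scheduled queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consumer:
    """Hypothetical record of something that reads from one or more datasets."""
    kind: str                  # e.g. "Data View", "Data App", "scheduled query"
    name: str
    datasets: tuple[str, ...]  # names of the datasets it references


def find_dependents(dataset_name, consumers):
    """Return every consumer that references the given dataset."""
    return [c for c in consumers if dataset_name in c.datasets]


# Illustrative inventory (made-up names).
inventory = [
    Consumer("Data View", "monthly_revenue", ("orders", "refunds")),
    Consumer("Data App", "support_dashboard", ("tickets",)),
    Consumer("scheduled query", "nightly_export", ("orders",)),
]

blockers = find_dependents("orders", inventory)
if blockers:
    for c in blockers:
        print(f"Still in use by {c.kind}: {c.name}")
else:
    print("No known dependents; safe to delete.")
```

An empty result only means that nothing in your inventory references the dataset, so the check is as reliable as the inventory is complete.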
Next Steps
- Sharing Data -- Learn how to invite users and manage access roles.
- Connecting Datasets -- Add a new data source to replace or supplement one you removed.
- Uploading Files -- Upload a CSV or JSON file as a lightweight alternative to a full database connection.