
Datafold

Datafold is a regression testing and data diff tool for data engineers who build ETL pipelines. It diffs large datasets (including tables with billions of rows) across popular SQL databases such as PostgreSQL, Snowflake, BigQuery, and Redshift. An interactive web interface shows visual summaries and git-style side-by-side value comparisons. An API supports automation with orchestrators like Airflow, and a GitHub workflow integration runs diffs on pull requests. Datafold also supports schema diffs, cross-database comparisons, sampling for efficient analysis of very large datasets, and on-premises deployment for data privacy. It is aimed primarily at large and mid-sized companies with complex data engineering workflows that want more confidence in data quality and in regression testing of data transformations.
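To make the idea of a keyed data diff concrete, here is a minimal sketch in plain Python. It is not Datafold's API or implementation: the function `diff_rows`, the sample rows, and the column names are all hypothetical, and a real tool would push the comparison into the database rather than load rows into memory. The sketch only shows the three outcomes a row-level diff typically reports: rows added, rows removed, and rows whose values changed.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def diff_rows(before: List[Row], after: List[Row], key: str) -> Dict[str, Any]:
    """Compare two row sets by primary key (hypothetical helper, not Datafold's API)."""
    # Index both versions of the table by primary key.
    old = {row[key]: row for row in before}
    new = {row[key]: row for row in after}

    # Keys present on only one side are added or removed rows.
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)

    # For keys present on both sides, record (old, new) per differing column.
    changed: Dict[Any, Dict[str, tuple]] = {}
    for k in sorted(old.keys() & new.keys()):
        columns = sorted(set(old[k]) | set(new[k]))
        delta = {
            c: (old[k].get(c), new[k].get(c))
            for c in columns
            if old[k].get(c) != new[k].get(c)
        }
        if delta:
            changed[k] = delta

    return {"added": added, "removed": removed, "changed": changed}


# Made-up sample data standing in for a table before and after a pipeline change.
before = [
    {"id": 1, "amount": 10.0, "status": "paid"},
    {"id": 2, "amount": 25.5, "status": "open"},
    {"id": 3, "amount": 7.0, "status": "paid"},
]
after = [
    {"id": 1, "amount": 10.0, "status": "paid"},
    {"id": 2, "amount": 25.5, "status": "paid"},
    {"id": 4, "amount": 99.0, "status": "open"},
]

result = diff_rows(before, after, key="id")
print(result)
```

Because columns are compared over the union of both rows' keys, a column that exists on only one side shows up as a change against `None`, which is a rough stand-in for what a schema diff would surface.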

platform:web platform:aws platform:gcp platform:kubernetes pricing:freemium pricing:subscription form:web-app form:api form:saas form:on-premise feature:diffing feature:data-regression-testing feature:schema-diff feature:data-sampling feature:cross-database feature:github-integration feature:ci-cd-integration feature:api feature:on-premises target:data-engineers target:teams target:enterprises use-case:data-quality use-case:etl-testing use-case:regression-testing use-case:data-validation use-case:data-monitoring

Features

Diffing
Data Regression Testing
Schema Diff
Data Sampling
Cross-Database
GitHub Integration
CI/CD Integration
API
On-Premises


Basic Info
  • Category: Data
Availability & Pricing
  • Pricing Model: Freemium, Paid
  • Details: Subscription
