Transform Data

An Overview of Testing Options for dbt (data build tool)

Last Updated:

June 10, 2024

August 19, 2023


Noel Gomez

dbt, also known as data build tool, is a data transformation framework that leverages templated SQL to transform and test data. dbt is part of the modern data stack and helps practitioners apply software development best practices on data pipelines. Some of these best practices include code modularity, version control, and continuous testing via its built in data quality framework. In this article we will focus on how data can be tested with dbt via build in functionality and with additional dbt packages and libraries.

dbt tests in dbt Core

In dbt, there are two main categories of tests: data tests and unit tests. Data tests are meant to be executed with every pipeline run to validate the integrity of the data and can be further divided into two types: Generic tests and Singular tests.  The difference between the two types of data tests is that Generic tests are reusable functions(dbt macros) for which you do not have to write the SQL query every time, while Singular tests are custom tests you write when a generic test does not cover your use case. Regardless of the type of data test, behind the scenes the process is the same, dbt will compile code to a SQL SELECT statement and execute it against your database. If any rows are returned by the query, this indicates a failure to dbt. 

Unit tests on the other hand are tests that are meant to validate your transformation logic. They rely on predefined data for comparison to ensure your logic is returning an expected result. Unlike data tests which are meant to be run with every pipeline, unit tests are only meant to be run in the CI step when new code is introduced. Unit tests were incorporated in dbt Core as of version 1.8.

dbt core tests

dbt Core Generic tests

dbt Core ships with four generic tests:

unique: is a test to verify that every value in a column (e.g. customer_id) contains unique values. This is useful for finding records that may inadvertently be duplicated in your data.

not_null: is a test to check that the values for a given column are always present. This can help you find cases where data in a column suddenly arrives without being populated.

accepted_values: this test is used to validate whether a set of values within a column is present. For example, in a column called payment_status, there can be values like pending, failed, accepted, rejected, etc. This test is used to verify that each row within the column contains one of the different payment statuses, but no other. This is useful to detect changes in the data like when a value gets changed such as accepted being replaced with approved.

relationships: these tests check referential integrity. This type of test is useful when you have related columns (e.g. the customer identifier) in two different tables. One table serves as the “parent” and the other is the “child” table. This is common when one table has a transaction and only lists a customer_id and the other table has the details for that customer. With this test we can verify that every row in the transaction table has a corresponding record in the dimension/details table. For example, if you have orders for customer_ids 1, 2, 3 we can validate that we have information about each of these customers in the customer details table.

Using a generic test is done by adding it to the model's property (yml) file.

dbt Core Generic Tests

Generic tests can accept additional test configurations such as a where clause to apply the test on a subset of rows. This can be useful on large tables by limiting the test to recent data or excluding rows based on the value of another column. Since an error will stop a dbt build or dbt test of the project, it is also possible to assign a severity to a test and optionally a threshold where errors will be treated as warning instead of errors. Finally, since dbt will automatically generate a name for the test, it may be useful to override the auto generated test name for simplicity. Here's the same property file from above with the additional configurations defined.

dbt tests with where condition, severity, and name defined

dbt Core singular tests

Singular tests allow you to create tests when the default generic ones (or the ones in the packages discussed below) do not meet your needs. These tests are simple SQL queries that express assertions about your data. An example of this type of test can be a more complex assertion such as having sales for one product be within +/- 10% of another product. The SQL simply needs to return the rows that do not meet this condition.

dbt Core custom generic tests

In dbt, it is also possible to define your own custom generic tests. This may be useful when you find yourself creating similar Singular tests. A custom generic test is essentially the same as a dbt macro which has a least a model as a parameter, and optionally column_name, if the test will apply to a column. Once the generic test is defined, it can be applied many times just like the generic tests shipped with dbt Core. It is also possible to pass additional parameters to a custom generic test.

Custom generic test definition

dbt Core unit testing

As our data transformations become more complex, the need for testing becomes increasingly important. The concept of unit testing is already well established in software development, where tests confirm that individual units of code work as intended. Recognizing this, dbt 1.8 introduced unit testing.

Unlike the data tests we have above, which ensure that incoming data meets specific criteria and are run at every data refresh, unit tests are designed to verify that the transformation logic itself produces the expected results. In the context of dbt, unit tests validate transformation logic by comparing the test results against predefined data typically defined using seeds (CSV files) or SQL queries. Unit tests should only be executed when new data transformation code is introduced and implemented since they are designed to help catch potential issues early in the development process. It is recommended to run unit tests only during the CI step. Running them in production would be a redundant use of compute resources because the expected outcomes do not change. Unit testing is only available in 1.8 or higher, but there are community packages (dbt-unit-testing, dbt_datamocktool, dbt-unittest) that have worked to solve this problem and are worth exploring if you are not using dbt 1.8.

dbt Core freshness check

While not technically a dbt test, a freshness check validates the timeliness of source data. This is valuable because sometimes data will stop getting refreshed and the data pipelines will continue to run with a silent failure. To assure that you are alerted when a data delivery SLA is not met, simply add a freshness check to your sources.

Freshness check configuration

dbt data tests in dbt packages

In addition to the generic tests that can be found within dbt Core, there are a lot more in the dbt ecosystem. These tests are found in dbt packages. Packages are libraries of reusable SQL code created by organizations of the dbt community. We will briefly go over some of the tests that can be found in these packages.

dbt-utils generic dbt tests

The dbt-utils package, created by dbt Labs, contains generic dbt tests, SQL generators, and macros. The dbt_utils package include 16 generic tests including:

not_accepted_values: this test is the opposite of the accepted_values test and is used to check that specific values are NOT present in a particular range of rows.

equal_rowcount: this test checks that two different tables have the same number of rows. This is a useful test that can assure that a transformation step does not accidentally introduce additional rows in the target table.

fewer_rows_than: this test is used to verify that a target table contains fewer rows than a source table. For example, if you are aggregating a table, you expect that the target table will have fewer rows than the table you are aggregating. This test can help you validate this condition.

There are 17 generic dbt tests available in the dbt-utils package which can be found here.

dbt-expectations generic dbt tests

Another awesome package that can accelerate your data testing is dbt-expectations. This package is a port of the great Python library Great Expectations. For those not familiar, Great Expectations is an open-source Python library that is used for automated testing. dbt-expectations is modeled after this library and was developed by Calogica so dbt practitioners would have access to an additional set of pre-created Generic tests without adding another tool to the data platform. Tests in dbt-expectations are divided into seven categories encompassing a total of 62 generic dbt tests:

  • Table shape (15 generic dbt tests)
  • Missing values, unique values, and types (6 generic dbt tests)
  • Sets and ranges (5 generic dbt tests)
  • String matching (10 generic dbt tests)
  • Aggregate functions(17 generic dbt tests)
  • Multi-column (6 generic dbt tests)
  • Distributional functions (3 generic dbt tests)

You can find detailed information on all the dbt-expectations generics tests in their documentation.


The dbt_constraints was created by Snowflake to add primary and foreign key constraints to dbt models. When added to a dbt project, this package will automatically create unique keys for all your existing unique and dbt_utils.unique_combination_of_columns tests, foreign keys for existing relationship tests, and not null constraints for not_null tests. The package also provides three tests (primary_key, unique_key, and foreign_key) that are a bit more flexible than the standard dbt tests. These tests can be used inline, out-of-line, and can support multiple columns when used in the tests: section of a model. The primary_key test will also cause a not null constraint to be created on each column.

elementary dbt-data-reliabilit generic dbt tests

The elementary tool (more info below) has 10 generic dbt tests that can be leveraged to look for schema changes, validate JSON schemas and look for anomalies in source freshness among other things..

dbt-fihr generic dbt tests

The HL7® FHIR® (Fast Healthcare Interoperability Resources) is a standard that defines how healthcare information can be exchanged between different systems. This package contains 20 generic dbt tests to validate FHIR data types.

fhir-dbt-analytics generic dbt tests

The fhir-dbt-analytics package is maintained by Google and contains tests to check the quality of clinical data. The metrics might count the number of FHIR resources to compare to expected counts or check references between FHIR resources.

dbt testing during development

While the tests above run against production data and are run even when none of the dbt code has changed, there are some tests that should be applied during development. This will improve a project's long term maintainability, assure project governance, and validate transformation logic in isolation of production data.


dbt-checkpoint is a library that can be leveraged during the development and release life-cycle to assure a level of governance of the dbt project. Typical validations include assuring that dbt models and/or their columns have descriptions and that all the columns in a dbt model (sql) are present in a property file (yml).


This dbt-meta-testing package contains macros to assert test and documentation coverage leveraging a configuration defined in the dbt_project.yml configuration settings.

dbt-unit-testing (dbt 1.8 has built in unit testing)

While dbt tests are great to test with "real" data, sometimes you may want to test the logic of a transformation with "fake" data. This type of test is called a unit test. The dbt-unit-testing package has all you need to do proper dbt unit testing. (side note, the dbt Core team has announced the unit testing will be part of a future release of dbt although it may not be exactly as done using this package).


dbt_datamocktool can be used to create mock CSV seeds to stand in for the sources and refs that your models use and test that the model produces the expected output as compared with another CSV seed.

dbt-unittest (dbt 1.8 has built in unit testing)

The dbt-unittest is a dbt package to enhance dbt package development by providing unit testing macros.

Reporting results of dbt tests

By default, dbt will not store the results of a dbt test execution. There is a configuration that can be set for the dbt project or at the specific model level which will have dbt store the failures of the test in a table in the data warehouse. While this is a good start, these test results get overridden each time dbt tests are run. To overcome this deficiency, tools have been developed in the community that store results longitudinally and even provide dashboards of test results.


Elementary is an open source data observability tool for dbt. It simplifies the capture of dbt test results over time, enables testing without having to manually add tests to all your dbt model columns, and has a user interface for viewing test results as well as dbt lineage. Elementary can also send Slack notifications.

dbt Data Quality package

This dbt Data Quality package is a Snowflake only package that helps users access and report on the outputs from dbt source freshness and dbt test results.


The dbt-tools package makes it simple to store and visualize dbt test results in a BI dashboard.


re_data is an open-source data reliability framework for modern data stack.


Getting started with dbt testing is simple thanks to the predefined generic dbt tests found within dbt Core and the additional generic tests found in dbt-utils and dbt-expectations. In addition to these juggernauts of the dbt community other organizations in the dbt community have contributed a additional generic tests, tools to improve dbt development, libraries that can help with validation and governance before releasing code to production and tools that can improve data quality observability.

Noel Gomez

Datacoves Co-founder | 15+ Data Platform Expert.
Solving enterprise data challenges quickly with dbt & Airflow.

Looking for an enterprise data platform?

Datacoves offers managed dbt core and Airflow and can be deployed in your private cloud.


Table of Contents

dbt Cloud vs dbt Core: Get Our Free Ebook

Considering dbt and whether you should use open source dbt Core or pay for a managed solution? Our eBook offers an in-depth look at dbt Core, dbt Cloud, and Managed dbt Core options. Gain valuable insights into each alternative and how each solution fits best with your organization.
From small companies to large enterprise environments, this guide is your key to understanding the dbt landscape, pricing, and total cost of ownership.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.