Ultimate dbt Jinja Functions Cheat Sheet
Last Updated:
October 4, 2023
October 4, 2023
Jinja is the game changing feature of dbt Core that allows us to create dynamic SQL code. In addition to the standard Jinja library, dbt Core includes additional functions and variables to make working with dbt even more powerful out of the box.
See our original post, The Ultimate dbt Jinja Cheat Sheet, to get started with Jinja fundamentals like syntax, variable assignment, looping and more. Then dive into the information below which covers Jinja additions added by dbt Core.
This cheatsheet references the extra functions, macros, filters, variables and context methods that are specific to dbt Core.
Enjoy!
dbt Core: Functions
These pre-defined dbt Jinja functions are essential to the dbt workflow by allowing you to reference models and sources, write documentation, print, write to the log and more.
Functions | |
---|---|
ref(project_or_package, model_name, version=latest_version) | This function allows you to reference models within or accross dbt projects. The version and project_or_package parameters are optional.
|
source(source_name, table_name ) | This function allows you to create a dependency between the current model and a source.
|
log(msg, info=False) | This function writes a message to the log and/or stdout. To write to the both the log file and stdout set the info arg to True. If set to False, the function will only write to the log file.
|
print("string to print") | Use this function when you wish to print messages to the log and stdout. This is the same as log() with info=True.
|
doc('doc_block_name') | The doc function allows you reference {% docs blocks %} from markdown (.md) files in the description field of property files(.yml). It is important to note that the doc() function is how you reference the contents of a {% docs block %}.
The {% docs block %} can be used to override preset overviews.
|
return(data) | Use this function to return data to the caller. This function will preserve the data type: dict, list, int etc.
|
env_var('env_var_name', 'default_value') | env_var is used to incorporate environment variables anywhere you can use Jinja code.
Use the optional second argument to set a default and avoid compilation errors.
|
var('variable_name') | Use when configuring packages in multiple environments or defining values across multiple models in a package. For variables that rarely change, define them in the dbt_project.yml file. Variables that change frequently can be defined through the CLI.
|
dbt Core: Macros
These macros are provided in dbt Core to improve the dbt workflow.
Macros | |
---|---|
run_query() |
This macro allows you to run SQL queries and store their results. It returns a table object with the query result.
|
debug() | Call this macro within a macro to open up a python debugger. The DBT_MACRO_DEBUGGING environment variable must be set. The debug macro is only available in dbt Core/Datacoves. Only use {{ debug }} in a development environment.
|
dbt Core: Filters
These dbt Jinja filters are used to modify data types.
Filters | |
---|---|
as_bool |
This Jinja filter will coerce the output into a boolean value. If the conversion can’t happen then it will return an error.
|
as_number |
This Jinja filter will coerce the output into an numeric value. If the conversion can’t happen then it will return an error.
|
as_native |
This Jinja filter will return the native python type (set, list, tuple, dict, etc). Best for iterables, use as_bool or as_number for boolean or numeric values.
|
dbt Core: Project context variables
These dbt core "variables" such as config, target, source, and others contain values that provide contextual information about your dbt project and configuration settings. They are typically used for accessing and customizing the behavior of your dbt models based on project-specific and environment-specific information.
Project Context Variables | |
---|---|
adapters |
dbt uses the adapter to communicate with the database. Setup correctly using the Database specific adapter i.e Snowflake, RedShift. The adapter has methods that will be translated into SQL statements specific to your database. |
builtins | The builtins variable is a dictionary containing keys for dbt context methods ref, source, and config.
|
config |
The config variable allows you to get configuration set for dbt models and set constraints that assure certain configurations must exist for a given model:
config.get('<config_key>', default='value'): Fetches a configuration named <config_key> for example "materialization". - If no default is provided or the configuration item is not defined, it this will return None
config.require("<config_key>"): Strictly requires a key named <config_key> is defined in the configuration.
- Throws error if not set.
config.get and config.require are commonly seen in the context of custom materializations, however, they can be used in other macros, if those macros are used within a model, seed, or snapshot context and they have the relevant configurations set.
|
dbt_version |
The dbt_version variable is helpful for debugging by returning the installed dbt version. This is the dbt version running not what you define in your project. e.g. If you make a project with dbt 1.3 and run it on another machine with dbt 1.6, this will say 1.6 |
execute |
The execute variable is set to True when dbt runs SQL against the databse such as when executing dbt run. When running dbt commands such as dbt compile where dbt parses your project but no SQL is run against the database execute is set to False. This variable is helpful when your Jinja is relying on a result from the database. Wrap the jinja in an if statement.
|
flags | This variable holds the value of the flags passed in the command line such as FULL_REFRESH, STORE_FAILURES, and WHICH(compile, run, build, run-operation, test, and show). It allows you to set logic based on run modes and run hooks based on current commands/type.
|
graph |
The graph variable is dictionary that hold the information about the nodes(Models, Sources, Tests, Snapshots) in your dbt project. |
model |
This is the graph object of the current model. This object allows you to access the contents of a model, model structure and JSON schema, config settings, and the path to the model. |
modules | This variable contains Python Modules for working with data including:datetime, pytz, re, and itertools.
|
project_name |
This variable returns the name for the root-level project that is being run. |
target |
This variable contains information about your dbt profile target, such as your warehouse connection information. Use the dot notation to access more such as: target.name or target.schema or target.database, ect |
dbt Core: Run context variables
These special variables provide information about the current context in which your dbt code is running, such as the model, schema, or project name.
Run Context Variablels | |
---|---|
database_schemas |
Only available in the context for on-run-end. This variable allows you to reference the databases and schemas. Useful if using multiple different databases |
invocation_id | This function outputs a UUID every time you run or compile your dbt project. It is useful for auditing. You may access it in the query-comment, info dictionary in events and logs, and in the metadata dictionary in dbt artifacts. |
results |
Only available in the context for on-run-end. This variable contains a list of Results Objects. Allows access to the information populated in run results JSON artifact. |
run_started_at |
This variable outputs a timestamp for the start time of a dbt run and defaults to UTC. It Is a python datetime object. Use standard strftime formatting.
|
schemas |
Only available in the context for on-run-end. This variable allows you to reference a list of schemas for models built during a dbt run. Useful for granting privileges. |
selected_resources |
This variable allows you to access a list of selected nodes from the current dbt command. The items in the list depend on the parameters of —select, —exclude,—selector. |
this |
{{ this }} is the database representation of the current model. Use the dot notation to access more properties such as: {{ this.database }} and {{ this.schema }}. |
thread_id | The thread_id is a unique identifier assigned to the Python threads that are actively executing tasks, such as running code nodes in dbt. It typically takes the form of names like "Thread-1," "Thread-2," and so on, distinguishing different threads. |
dbt Core: Context methods
These methods allow you to retrieve information about models, columns, or other elements of your project.
Context Methods | |
---|---|
set(value, default) | Allows you to use the python method set(), which converts an iterable into a unique set of values. This is NOT the same as the jinja expression set which is used to assign a value to a variable. This will return none if the value entered is not an iterable. In the example below both the python set() method and the jinja set expression are used to remove a duplicate element in the list.
|
set_strict(value) | Same as the set method above however it will raise a TypeError if the entered value is not a valid iterable.
|
exceptions | Is used to raise errors and warnings in a dbt run:
raise_compiler_error will raise an error, print out the message. Model will FAIL.
warn will raise a compiler warning and print out the set method. Model will still PASS.
|
fromjson(string, default) | Is used to deserialize a JSON string into a python dict or list. |
fromyaml(string, default) | Is used to deserialize a YAML string into a python dict or list. |
tojson(value, default) | Serializes a Python dict or list to a JSON string. |
toyaml(value, default ) | Serializes a Python dict or list to a YAML string. |
local_md5 |
This variable locally creates an MD5 hash of the given string.
|
zip(*args, default) | This method is allows you to combine any number of iterables.
|
zip_strict(value) | Same as the zip method but will raise a TypeError if one of the given values are not a valid iterable.
|
Please contact us with any errors or suggestions.
Author:
Mayra Peña