dbt-external-tables 0.2.0

@jtcohen6 released this 07 May 20:10
b6de385

This is a minor release that significantly changes the behavior of the stage_external_sources macro to improve its flexibility and performance. Several of the features described below reflect breaking changes compared with the 0.1.x versions of this package.

Features

  • On Snowflake, add support for staging external sources via snowpipes.
    • If external.snowpipe is configured, the stage_external_sources macro will create an empty table with all specified columns plus metadata fields, run a full historical backfill via a copy statement, and finally create a pipe that wraps the same copy statement.
    • If no columns are specified, the pipe will instead pull all data into a single variant column. (This behavior does not work for CSV files.)
  • During standard runs, the stage_external_sources macro will attempt to "partially refresh" external assets. It will not drop, replace, or change any existing external tables or pipe targets. To fully rebuild all external assets, supply the CLI variable ext_full_refresh: true.
  • The stage_external_sources macro now accepts a select argument to stage only specific nodes, using similar syntax to dbt source snapshot-freshness.
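
The staging behavior described above can be sketched as a small decision function. This is only an illustrative model in Python, not the package's Jinja implementation: the ExternalSource class, its exists and snowpipe fields, the plan_steps function, and the step labels are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExternalSource:
    name: str
    exists: bool            # hypothetical: does the target object already exist?
    snowpipe: bool = False  # hypothetical: is external.snowpipe configured?


def plan_steps(source: ExternalSource, full_refresh: bool = False) -> List[str]:
    """Return the ordered steps for one source (labels are illustrative only)."""
    if source.exists and not full_refresh:
        # Standard run: never drop or replace what is already there.
        return ["refresh if needed"]
    if source.snowpipe:
        # Missing or fully rebuilt snowpipe source: three steps, in order.
        return ["create empty table", "backfill via copy", "create pipe"]
    return ["create external table"]


event_pipe = ExternalSource("event_pipe", exists=True, snowpipe=True)
print(plan_steps(event_pipe))                     # standard (partial) run
print(plan_steps(event_pipe, full_refresh=True))  # models ext_full_refresh: true
```

In this model, an existing asset is only ever refreshed during a standard run, and the full create sequence is reserved for missing assets or runs where a full refresh is requested.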

Breaking changes

  • Properties of the external source config that previously accepted string or boolean values, such as auto_refresh, now expect boolean values. This lets the refresh_external_table macro infer whether an external table should be refreshed or ignored during partial refresh runs.
  • The refresh_external_table macro now returns a Jinja list [] instead of a string.
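
One way to see why true booleans matter: in Python, as in Jinja, any non-empty string is truthy, so the string "false" behaves like true in a condition. The sketch below illustrates this together with the list-valued return. It is a hypothetical Python stand-in, not the macro's actual code: the refresh_statements function, its parameters, and the table name are assumptions made for illustration.

```python
from typing import List


def refresh_statements(table_name: str, auto_refresh) -> List[str]:
    """Hypothetical stand-in for a refresh macro: returns a list of SQL statements."""
    if auto_refresh:
        # Assumed meaning: the warehouse refreshes the table itself, nothing to run.
        return []
    return [f"alter external table {table_name} refresh"]


print(refresh_statements("raw.events", True))     # empty list
print(refresh_statements("raw.events", False))    # one refresh statement
print(refresh_statements("raw.events", "false"))  # empty list: "false" is truthy
```

Returning a list also means a caller can loop over zero or more statements in the same way, rather than special-casing an empty string.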

Quality of life

  • Improved info logging of the DDL/DML that the stage_external_sources macro is running. This is intended to provide more visibility into multistep operations, such as staging snowpipes.
  • Add a sample analysis that, when copied to the root project's analysis folder, will print out a "dry run" version of all the SQL that stage_external_sources would run as an operation.
  • Add GitHub templates and codeowner