resolve conflict

gastlich · Dec 6, 2022 · 518161a · 518161a
2 parents 6786792 + 33e0a59
commit 518161a
Show file tree

Hide file tree

Showing 10 changed files with 78 additions and 15 deletions.
diff --git a/README.md b/README.md
@@ -486,7 +486,12 @@ or any other nested information.
 
 ### Testing
 #### Missing Primary Key Tests
-`fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) lists every model that does not meet the minimum testing requirement of testing primary keys. Any models that does not have both a `not_null` and `unique` test configured will be highlighted in this model. 
+`fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) lists every model that does not meet the minimum testing requirement of testing primary keys. Any model that does not have either
+
+1. a `not_null` test and a `unique` test applied to a single column OR 
+2. a `dbt_utils.unique_combination_of_columns` test applied to a set of columns 
+
+will be flagged by this model. 
 
 <details>
 <summary><b>Reason to Flag</b></summary>
@@ -496,15 +501,16 @@ Tests are assertions you make about your models and other resources in your dbt
 <details>
 <summary><b>How to Remediate</b></summary>
 
-Apply a [uniqueness test](https://docs.getdbt.com/reference/resource-properties/tests#unique) and a [not null test](https://docs.getdbt.com/reference/resource-properties/tests#not_null) to the column that represents the grain of your model in its schema entry. For models that are unique across a combination of columns, we recommend adding a surrogate key column to your model, then applying these tests to that new model. See the [`surrogate_key`](https://github.com/dbt-labs/dbt-utils#surrogate_key-source) macro from dbt_utils for more info!
+Apply a [uniqueness test](https://docs.getdbt.com/reference/resource-properties/tests#unique) and a [not null test](https://docs.getdbt.com/reference/resource-properties/tests#not_null) to the column that represents the grain of your model in its schema entry. For models that are unique across a combination of columns, we recommend adding a surrogate key column to your model, then applying these tests to that new model. See the [`surrogate_key`](https://github.com/dbt-labs/dbt-utils#surrogate_key-source) macro from dbt_utils for more info! Alternatively, you can use the [`dbt_utils.unique_combination_of_columns`](<https://github.com/dbt-labs/dbt-utils#unique_combination_of_columns-source>) test from `dbt_utils`. Check out the [overriding variables section](#overriding-variables) to read more about configuring other primary key tests for your project!
 
 Additional tests can be configured by applying a [generic test](https://docs.getdbt.com/docs/building-a-dbt-project/tests#generic-tests) in the model's `.yml` entry or by creating a [singular test](https://docs.getdbt.com/docs/building-a-dbt-project/tests#singular-tests) 
-in the `tests` directory of you project. 
+in the `tests` directory of you project.
 </details>
 
 #### Test Coverage
 `fct_test_coverage` ([source](models/marts/tests/fct_test_coverage.sql)) contains metrics pertaining to project-wide test coverage.
 Specifically, this models measures:
+
 1. `test_coverage_pct`: the percentage of your models that have minimum 1 test applied.
 2. `test_to_model_ratio`: the ratio of the number of tests in your dbt project to the number of models in your dbt project
 3. `< model_type >_test_coverage_pct`: the percentage of each of your model types that have minimum 1 test applied.
@@ -892,21 +898,39 @@ models:
 Currently, this package uses different variables to adapt the models to your objectives and naming conventions. They can all be updated directly in `dbt_project.yml`
 
 <details>
-<summary><b>Coverage Variables</b></summary>
+<summary><b>Testing and Documentation Variables</b></summary>
 
 | variable    | description | default     |
 | ----------- | ----------- | ----------- |
 | `test_coverage_pct` | the minimum acceptable test coverage percentage | 100% |
 | `documentation_coverage_pct` | the minimum acceptable documentation coverage percentage | 100% |
+| `primary_key_test_macros` | the set(s) of dbt tests used to check validity of a primary key | [["dbt.test_unique", "dbt.test_not_null"], ["dbt_utils.test_unique_combination_of_columns"]] |
+
+**Usage notes for `primary_key_test_macros:`**
+
+The `primary_key_test_macros` variable determines how the `fct_missing_primary_key_tests` ([source](models/marts/tests/fct_missing_primary_key_tests.sql)) model evaluates whether the models in your project are properly tested for their grain. This variable is a list and each entry **must be a list of test names in `project_name.test_macro_name` format**.
+
+For each entry in the parent list, the logic in `int_model_test_summary` will evaluate whether each model has all of the tests in that entry applied. If a model meets the criteria of any of the entries in the parent list, it will be considered a pass. The default behavior for this package will check for whether each model has either:
+
+1. __Both__ the `not_null` and `unique` tests applied to a single column OR
+2. The `dbt_utils.unique_combination_of_columns` applied to the model.
+
+Each set of test(s) that define a primary key requirement must be grouped together in a sub-list to ensure they are evaluated together (e.g. [`dbt.test_unique`, `dbt.test_not_null`] ).
+
+*While it's not explicitly tested in this package, we strongly encourage adding a `not_null` test on each of the columns listed in the `dbt_utils.unique_combination_of_columns` tests.*
+
 
 ```yml
 # dbt_project.yml
 # set your test and doc coverage to 75% instead
+# use the dbt_constraints.test_primary_key test to check for validity of your primary keys
 
 vars:
   dbt_project_evaluator:
     documentation_coverage_target: 75
     test_coverage_target: 75
+    primary_key_test_macros: [["dbt_constraints.test_primary_key"]]
+
 ```
 </details>
 

diff --git a/dbt_project.yml b/dbt_project.yml
@@ -57,6 +57,8 @@ vars:
   documentation_coverage_target: 100
   test_coverage_target: 100
 
+  primary_key_test_macros: [["dbt.test_unique", "dbt.test_not_null"], ["dbt_utils.test_unique_combination_of_columns"]]
+
   # -- DAG variables --
   models_fanout_threshold: 3
 

diff --git a/integration_tests/models/staging/source_1/schema.yml b/integration_tests/models/staging/source_1/schema.yml
@@ -8,6 +8,12 @@ models:
         description: hocus pocus
         tests:
           - unique
+  - name: stg_model_3
+    tests:
+      - dbt_utils.unique_combination_of_columns:
+          combination_of_columns:
+            - id
+            - color
   - name: stg_model_2
     columns:
       - name: id 

diff --git a/integration_tests/models/staging/source_1/stg_model_3.sql b/integration_tests/models/staging/source_1/stg_model_3.sql
@@ -1,2 +1,4 @@
 -- depends on: {{ source('source_2', 'table_3') }}
-select 1 as id
+select 1 as id, 'blue' as color
+union all 
+select 1 as id, 'red' as color
diff --git a/integration_tests/seeds/tests/test_fct_missing_primary_key_tests.csv b/integration_tests/seeds/tests/test_fct_missing_primary_key_tests.csv
@@ -8,5 +8,4 @@ report_1,FALSE,0
 report_2,FALSE,0
 report_3,FALSE,0
 stg_model_1,FALSE,1
-stg_model_3,FALSE,0
 stg_model_5,FALSE,0
diff --git a/integration_tests/seeds/tests/test_fct_test_coverage.csv b/integration_tests/seeds/tests/test_fct_test_coverage.csv
@@ -1,2 +1,2 @@
 total_models,total_tests,tested_models,test_coverage_pct,staging_test_coverage_pct,intermediate_test_coverage_pct,marts_test_coverage_pct,other_test_coverage_pct,test_to_model_ratio
-14,9,4,28.57,60.00,50.00,0.00,0.00,0.6429
+14,10,5,35.71,80.00,50.00,0.00,0.00,0.7143
diff --git a/macros/unpack/get_nodes.sql b/macros/unpack/get_nodes.sql
@@ -9,7 +9,6 @@
     {%- set values = [] -%}
 
     {%- for node in nodes_list -%}
-
         {%- set values_line  = 
             [
                 wrap_string_with_quotes(node.unique_id),
@@ -25,7 +24,8 @@
                 wrap_string_with_quotes(node.alias),
                 "cast(" ~ dbt_project_evaluator.is_not_empty_string(node.description) | trim ~ " as boolean)",
                 "''" if not node.column_name else wrap_string_with_quotes(dbt.escape_single_quotes(node.column_name)),
-                wrap_string_with_quotes(node.meta | tojson)
+                wrap_string_with_quotes(node.meta | tojson),
+                wrap_string_with_quotes(node.depends_on.macros | tojson)
             ]
         %}
 
@@ -51,7 +51,8 @@
               'alias',
               ('is_described', 'boolean'),
               'column_name',
-              'meta'
+              'meta',
+              'macro_dependencies'
             ]
          )
     ) }}

diff --git a/models/marts/core/int_all_graph_resources.sql b/models/marts/core/int_all_graph_resources.sql
@@ -1,4 +1,13 @@
 -- one row for each resource in the graph
+
+{# flatten the sets of permissable primary key test sets to one level for later iteration #}
+{%- set test_macro_list = [] %}
+{%- for test_set in var('primary_key_test_macros') -%}
+      {%- for test in test_set %}
+        {%- do test_macro_list.append(test) -%}
+      {%- endfor %}
+{%- endfor -%}
+
 with unioned as (
 
     {{ dbt_utils.union_relations([
@@ -56,8 +65,9 @@ joined as (
         end as model_type_folder,
         {{ dbt.position('naming_convention_folders.folder_name_value','unioned_with_calc.directory_path') }} as position_folder,  
         nullif(unioned_with_calc.column_name, '') as column_name,
-        unioned_with_calc.resource_name like 'unique%' and unioned_with_calc.resource_type = 'test' as is_not_null_test,
-        unioned_with_calc.resource_name like 'not_null%' and unioned_with_calc.resource_type = 'test' as is_unique_test,
+        {% for test in test_macro_list %}
+        unioned_with_calc.macro_dependencies like '%macro.{{ test }}%' and unioned_with_calc.resource_type = 'test' as is_{{ test.split('.')[1] }},  
+        {% endfor %}
         unioned_with_calc.is_enabled, 
         unioned_with_calc.materialized, 
         unioned_with_calc.on_schema_change, 
@@ -72,6 +82,7 @@ joined as (
         unioned_with_calc.owner_name,
         unioned_with_calc.owner_email,
         unioned_with_calc.meta,
+        unioned_with_calc.macro_dependencies,
         unioned_with_calc.metric_type, 
         unioned_with_calc.model, 
         unioned_with_calc.label, 

diff --git a/models/marts/tests/intermediate/int_model_test_summary.sql b/models/marts/tests/intermediate/int_model_test_summary.sql
@@ -13,7 +13,15 @@ count_column_tests as (
     select 
         relationships.direct_parent_id, 
         all_graph_resources.column_name,
-        count(distinct case when all_graph_resources.is_unique_test or all_graph_resources.is_not_null_test then relationships.resource_id else null end) primary_key_tests_count,
+        {%- for test_set in var('primary_key_test_macros') %}
+            {%- set outer_loop = loop -%}
+        count(distinct case when 
+                {%- for test in test_set %} 
+                all_graph_resources.is_{{ test.split('.')[1] }} {%- if not loop.last %} or {% endif %} 
+                {%- endfor %}
+            then relationships.resource_id else null end
+        ) as primary_key_method_{{ outer_loop.index }}_count,
+        {%- endfor %}
         count(distinct relationships.resource_id) as tests_count
     from all_graph_resources
     left join relationships
@@ -27,7 +35,17 @@ agg_test_relationships as (
 
     select 
         direct_parent_id, 
-        sum(case when primary_key_tests_count = 2 then 1 else 0 end) >= 1 as is_primary_key_tested,
+        sum(case 
+                when (
+                    {%- for test_set in var('primary_key_test_macros') %}
+                        {%- set compare_value = test_set | length %}
+                    primary_key_method_{{ loop.index }}_count = {{ compare_value}}
+                        {%- if not loop.last %} or {% endif %}
+                    {%- endfor %} 
+                ) then 1 
+                else 0 
+            end
+        ) >= 1 as is_primary_key_tested,
         sum(tests_count) as number_of_tests_on_model
     from count_column_tests
     group by 1

diff --git a/packages.yml b/packages.yml
@@ -1,3 +1,3 @@
 packages:
   - package: dbt-labs/dbt_utils
-    version: [">1.0.0", "<2.0.0"]
+    version: [">1.0.0", "<2.0.0"]