Feature/add transformation runs #141

Merged
merged 63 commits into from
Jan 22, 2025

Contributor

@fivetran-reneeli fivetran-reneeli commented Dec 17, 2024

PR Overview

This PR will address the following Issue/Feature: internal ticket

This PR will result in the following new package version: v1.11.0

Schema Changes: Adding the Transformation Runs Table

  • We have added the transformation_runs source table. Not all customers have this table, particularly those who are not using Fivetran Transformations. A new variable, fivetran_platform_using_transformations, therefore checks automatically whether the table exists in your schema, and transformation run data is populated only if it does. If the table doesn't exist, the staging stg_fivetran_platform__transformation_runs model will persist as an empty model and the respective downstream fields will be null.

  • If the transformation_runs source table exists in your schema, fivetran_platform_using_transformations will be set to True and the following updates apply:

    • Added a new staging stg_fivetran_platform__transformation_runs model.
      • We have also added the get_transformation_runs_columns() macro to ensure all required columns are present.
    • Added the following fields to the fivetran_platform__usage_mar_destination_history end model for each destination and month:
      • paid_model_runs
      • free_model_runs
      • total_model_runs
    • If you would like to override these updates, you can manually disable the fivetran_platform_using_transformations variable by setting it to False in your dbt_project.yml.
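
The enablement behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, with sqlite3 standing in for the warehouse; the helper name `using_transformations` and its `override` parameter are hypothetical. The package itself exposes this through the fivetran_platform_using_transformations dbt variable, and its actual implementation is not shown in this PR description.

```python
import sqlite3
from typing import Optional


def using_transformations(conn: sqlite3.Connection, override: Optional[bool] = None) -> bool:
    """Decide whether transformation run models should be populated.

    Hypothetical helper: an explicit override (comparable to setting the
    variable in the project file) wins; otherwise fall back to checking
    whether a transformation_runs table exists.
    """
    if override is not None:
        return override
    row = conn.execute(
        "select 1 from sqlite_master where type = 'table' and name = ?",
        ("transformation_runs",),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
table_missing = using_transformations(conn)  # no table yet
conn.execute("create table transformation_runs (destination_id text, model_runs integer)")
table_present = using_transformations(conn)  # table now exists
forced_off = using_transformations(conn, override=False)  # manual disable
print(table_missing, table_present, forced_off)
```

The point of the sketch is the precedence: a manual setting is respected first, and the table check only applies when nothing has been set explicitly.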

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run --full-refresh && dbt test
  • dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned
  • All necessary documentation and version upgrades have been applied
  • docs were regenerated (unless this PR does not include any code or yml updates)
  • BuildKite integration tests are passing
  • Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

hex notebook

If you had to summarize this PR in an emoji, which would it be?

💃

Contributor

@fivetran-jamie fivetran-jamie left a comment

This looks pretty good!! Made some minor suggestions and I have a couple of larger questions/requests:

  1. Can we confirm with eng whether or not customers who do not use transformations at all will have the transformation_runs table in their schemas? Wondering if we need to include a variable
  2. Can you move the Hex validations to a new integrity data validation test here?

Contributor Author

fivetran-reneeli commented Jan 4, 2025

Thanks @fivetran-jamie for the review. This is ready for another look, pending the Databricks tests passing. I also made the new validation test comparing run counts, and I asked eng about the table; that's a good point.

Contributor

@fivetran-jamie fivetran-jamie left a comment

looking great, couple of minor doc-related comments

Contributor

@fivetran-jamie fivetran-jamie left a comment

LGTM!

Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-reneeli a few questions and change requests before approval.

Comment on lines 24 to 25
sum(case when free_type = 'PAID' then model_runs else 0 end) as paid_model_runs,
sum(case when free_type != 'PAID' then model_runs else 0 end) as free_model_runs,
Contributor

We've seen in the past that the casing of strings in the Fivetran Platform data can change, and it could change again in the future. Can we make this more future-proof by simply switching to a lowercase comparison?

Suggested change
- sum(case when free_type = 'PAID' then model_runs else 0 end) as paid_model_runs,
- sum(case when free_type != 'PAID' then model_runs else 0 end) as free_model_runs,
+ sum(case when lower(free_type) = 'paid' then model_runs else 0 end) as paid_model_runs,
+ sum(case when lower(free_type) != 'paid' then model_runs else 0 end) as free_model_runs,

Contributor

Hey I saw this was added but don't we do upper(free_type) as free_type in the staging model?

Contributor Author

@fivetran-reneeli fivetran-reneeli Jan 22, 2025

Oh, you are 100% correct, thanks for noticing that. I think it makes more sense to handle casing in staging, but I just realized that everywhere else we apply lower(), albeit in the transforms. I'll apply lower() in the staging model and remove it from the transforms.

cc @fivetran-joemarkiewicz
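
For readers following this thread, the effect of a case-insensitive comparison can be sketched as below. This is an illustration only: it uses Python's sqlite3 rather than dbt, the table name `runs` and the sample values ('FREE', 'Free', mixed-case 'paid') are hypothetical, the monthly grain is omitted, and defining total_model_runs as a plain sum is an assumption. The same result holds whether lower() is applied in staging or in the transform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table runs (destination_id text, free_type text, model_runs integer)")
conn.executemany(
    "insert into runs values (?, ?, ?)",
    [
        ("dest_a", "PAID", 10),
        ("dest_a", "paid", 5),  # casing differs from the first row
        ("dest_a", "FREE", 3),
        ("dest_b", "Free", 7),
    ],
)

query = """
select
    destination_id,
    sum(case when lower(free_type) = 'paid' then model_runs else 0 end) as paid_model_runs,
    sum(case when lower(free_type) != 'paid' then model_runs else 0 end) as free_model_runs,
    sum(model_runs) as total_model_runs
from runs
group by destination_id
order by destination_id
"""
results = conn.execute(query).fetchall()
print(results)  # [('dest_a', 15, 3, 18), ('dest_b', 0, 7, 7)]
```

One thing worth keeping in mind with this pattern: a row whose free_type is null satisfies neither comparison, so it would count toward the total but toward neither the paid nor the free bucket.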

Comment on lines 48 to 50
{% if target.type not in ('sqlserver') %}
limit 0
{% endif %}
Contributor

Do we need to account for Redshift by using limit 1, as we've recently uncovered with the union data practices?

cc: @fivetran-jamie

Contributor

Ah, that's a great point! I think we should, as we've learned that Redshift won't respect the data type casts otherwise. @fivetran-reneeli can you add a condition to limit 1 instead of 0 for Redshift targets?

Contributor Author

Yes! I believe I'll need to add it to usage_cost and credits_used as well though. I'll add that in and note it in the CHANGELOG.
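
The per-target choice discussed in this thread could look something like the sketch below. It is written in Python with sqlite3 purely for illustration; the helper name `empty_model_limit`, the placeholder column names, and the "snowflake" target used in the demo are assumptions, and the real models express this in Jinja rather than Python.

```python
import sqlite3


def empty_model_limit(target_type: str) -> str:
    """Hypothetical helper mirroring the review discussion.

    sqlserver: no limit clause is emitted (the reviewed snippet skips it);
    redshift: limit 1, as requested in the thread;
    everything else: limit 0.
    """
    if target_type == "sqlserver":
        return ""
    if target_type == "redshift":
        return "limit 1"
    return "limit 0"


# With limit 0, the placeholder query returns no rows but still exposes
# its column names, so downstream models can keep selecting from it.
conn = sqlite3.connect(":memory:")
cursor = conn.execute(
    "select cast(null as text) as destination_id, "
    "cast(null as integer) as model_runs " + empty_model_limit("snowflake")
)
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

With limit 1 the same placeholder query would return a single row of nulls instead of no rows, which is the trade-off the thread accepts for Redshift so that the casts are respected.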

Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Two updates before release. Once those are updated this will be good to go!

@fivetran-reneeli fivetran-reneeli merged commit 13705d8 into main Jan 22, 2025
11 checks passed
@fivetran-reneeli fivetran-reneeli deleted the feature/add_transformation_runs branch January 22, 2025 23:13
3 participants