Select modified source models only #256

Open
OGrohmann opened this issue Feb 22, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@OGrohmann

Describe the feature

I would like to integrate the stage_external_sources macro into our dbt CI job so that only new and modified sources are updated, similar to state-based selection in the build command (e.g. dbt build --select state:modified).

Describe alternatives you've considered

Currently we refresh all sources as the first step of the CI job, which is very inefficient.

Additional context

This request refers to the select argument of the stage_external_sources macro.
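
For illustration, here is a minimal sketch of how a CI step might derive that select argument from a state comparison. It assumes that `dbt ls --resource-type source --select state:modified --state <path>` prints one `source:<package>.<source_name>.<table_name>` line per changed source, and that the select argument accepts space-separated `source_name.table_name` entries. The helper names and sample values are hypothetical, and none of this is an existing feature of the package.

```python
import shlex


def to_select_arg(ls_lines):
    """Turn `dbt ls` selector lines into a value for the macro's select argument.

    Assumed input format (not confirmed in this thread):
        source:<package>.<source_name>.<table_name>
    """
    selected = []
    for line in ls_lines:
        line = line.strip()
        if not line.startswith("source:"):
            continue  # skip log lines and non-source nodes
        parts = line[len("source:"):].split(".")
        if len(parts) < 3:
            continue  # unexpected shape: ignore rather than guess
        entry = ".".join(parts[-2:])  # keep <source_name>.<table_name>
        if entry not in selected:
            selected.append(entry)
    return " ".join(selected)


def run_operation_command(select_arg):
    """Build the run-operation call, or return None when nothing changed."""
    if not select_arg:
        return None
    return [
        "dbt", "run-operation", "stage_external_sources",
        "--args", f"select: {select_arg}",
    ]


if __name__ == "__main__":
    # Hypothetical output of:
    #   dbt ls --resource-type source --select state:modified --state <path>
    sample = [
        "source:my_project.snowplow.event",
        "source:my_project.logs.requests",
    ]
    cmd = run_operation_command(to_select_arg(sample))
    print(shlex.join(cmd))
```

Returning None for an empty selection lets the CI step skip staging entirely when no source changed, instead of falling back to a full refresh.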

Who will this benefit?

Our CI job for integration testing would run much faster.

@OGrohmann added the enhancement and triage labels on Feb 22, 2024
@dataders
Collaborator

totally get it @OGrohmann! it should just know, right?

I'd be curious to know:

  • how many external tables you have in your project?
  • how often new external tables are added?
  • how often external table definitions change?
  • How long does it take for all the external sources to be staged?

That said, as @jtcohen6 responded in #263, we plan for external tables to live within Core and have access to state-based selectors like the one you propose.

Meanwhile, we'd love your eyes on dbt-labs/dbt-adapters#92 as we think about a real future for external tables.

@dataders removed the triage label on Mar 14, 2024
@OGrohmann
Author

Hi @dataders,
sorry for the late response! Here are the answers to your questions:

  • how many external tables you have in your project?
    Currently 548 objects (Snowflake external tables & Snowpipes counted together)
  • how often new external tables are added?
    Regularly for every new request. Maybe 10-20 per month.
  • how often external table definitions change?
    Not frequently.
  • How long does it take for all the external sources to be staged?
    Normal run without full refresh takes around 25-30 minutes
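
As a rough back-of-the-envelope check on those figures, the sketch below estimates the staging time if only changed objects were processed. It assumes staging time scales linearly with the number of objects, which nothing in this thread confirms; the function name is hypothetical.

```python
def estimated_minutes(changed, total=548, full_run_minutes=(25, 30)):
    """Scale the reported full-run duration down to `changed` objects.

    Assumes every object costs about the same time to stage (an assumption).
    Returns a (low, high) tuple in minutes.
    """
    low, high = full_run_minutes
    return (low * changed / total, high * changed / total)


if __name__ == "__main__":
    # 20 objects is the upper end of the "10-20 per month" figure above.
    low, high = estimated_minutes(20)
    print(f"{low:.1f} to {high:.1f} minutes")
```

Under that assumption, even a full month's worth of new objects would stage in roughly one minute rather than 25-30.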

As a workaround, we considered excluding the macro from our CI and CD jobs and manually updating the affected sources up front. The problem is that this cannot be done before the changes are merged to the main branch, and the CI/CD jobs are triggered immediately after the pull request is created or completed.
