Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(persist_docs): write dbt model and project metadata to properties #449

Merged
merged 11 commits into from
Oct 13, 2023
Merged

feat(persist_docs): write dbt model and project metadata to properties #449

merged 11 commits into from
Oct 13, 2023

Conversation

roslovets
Copy link
Contributor

Description

Hi team, I suggest improvement which should significantly improve data observability from my point of view.

If persist_docs is enabled we write additional metadata to Glue Table Properties:

  • all fields from model.config.meta (i.e. owner, unique_key, other custom fields)
  • model.config.materialized value (to mark incremental tables)
  • dbt project name and version (to identify tables depoyed via dbt and find outdated tables by project version)

This improvement allows to treat Glue Catalog as a storage of tabel metadata and build interactive application around it without need to access dbt manifest.

Models used to test - Optional

Tested on view, table, incremental table

Checklist

  • You followed contributing section
  • You kept your Pull Request small and focused on a single feature or bug fix.
  • You added unit testing when necessary
  • You added functional testing when necessary

@roslovets roslovets changed the title feat[persist_docs]: write dbt model and project metadata to properties feat(persist_docs): write dbt model and project metadata to properties Oct 10, 2023
Copy link
Contributor

@Jrmyy Jrmyy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the codes looks good, but it seems like a lot of new metadata, would it be possible to think of a way to activate or not this feature or to be at least able to customize which keys we want to push ?

Also please provide some unit tests :)

dbt/adapters/athena/impl.py Outdated Show resolved Hide resolved
dbt/adapters/athena/impl.py Outdated Show resolved Hide resolved
@roslovets
Copy link
Contributor Author

Overall the codes looks good, but it seems like a lot of new metadata, would it be possible to think of a way to activate or not this feature or to be at least able to customize which keys we want to push ?

Also please provide some unit tests :)

Well, I thought about it... Then I decided that Athena and Glue store lots of trash in Table Properties you will never use. Why should we be humble then?
In general I can add global switch for metadata persisting. But isn't it too complicated for such simple feature?
I will add some tests, thanks.

@nicor88
Copy link
Contributor

nicor88 commented Oct 10, 2023

In general I can add global switch for metadata persisting. But isn't it too complicated for such simple feature?

I believe that is not necessary for this case.
I'm wondering if we know if there are some sort of limitations in terms of the amount of properties that we can set and if there is a max length in terms of string size for each property value.

@Jrmyy
Copy link
Contributor

Jrmyy commented Oct 10, 2023

Ok ok let's go for this then :)

there is a max length in terms of string size for each property value.

I think there is something like 255 characters

@roslovets
Copy link
Contributor Author

In general I can add global switch for metadata persisting. But isn't it too complicated for such simple feature?

I believe that is not necessary for this case.
I'm wondering if we know if there are some sort of limitations in terms of the amount of properties that we can see and if there is a max length in terms of string size for each property value.

It's a very good point.
I've found limitations here: https://docs.aws.amazon.com/glue/latest/webapi/API_Table.html

I'm going to truncate long values and omit long keys.

Jrmyy
Jrmyy previously approved these changes Oct 10, 2023
Copy link
Contributor

@nicor88 nicor88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 @Jrmyy do you want to do another round of checks?

@Jrmyy Jrmyy self-requested a review October 13, 2023 06:39
@Jrmyy Jrmyy merged commit 79abea6 into dbt-labs:main Oct 13, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants