[ADAP-518] Convert information to a dict #496

Open
3 tasks done
Tracked by #495
Fokko opened this issue May 8, 2023 · 5 comments · May be fixed by dbt-labs/dbt-spark#752
Labels
pkg:dbt-spark Issue affects dbt-spark type:enhancement New feature request

Comments

@Fokko
Contributor

Fokko commented May 8, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt-spark functionality, rather than a Big Idea better suited to a discussion

Describe the feature

In preparation for having three-part identifiers catalog.schema.table (#495), I would like to change the information attribute on the SparkRelation into a dict:

https://github.com/dbt-labs/dbt-spark/blob/cb41ab049481bc458871d5c37fad47e59d6b759c/dbt/adapters/spark/relation.py#L36-L37

Describe alternatives you've considered

The current approach, which relies on regexes to extract useful information from a big blob of text, is hard to maintain. I also noticed that the types are currently missing:

image
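
To make the contrast concrete, here is a minimal sketch of the two approaches. It is not the dbt-spark implementation: the sample text, the field names in it, and the helper names `owner_from_blob` and `information_to_dict` are all assumptions made for illustration, and real Spark output may be laid out differently.

```python
import re
from typing import Dict, Optional

# Hypothetical sample of the "information" text blob (assumed layout).
SAMPLE_INFORMATION = """\
Database: analytics
Table: orders
Owner: etl_user
Type: MANAGED
Provider: delta
Statistics: 1024 bytes, 12 rows
"""

# Regex approach: every field needs its own pattern, and each lookup
# scans the whole blob again.
OWNER_REGEX = re.compile(r"^Owner: (.*)$", re.MULTILINE)


def owner_from_blob(information: str) -> Optional[str]:
    """Extract the owner with a dedicated regex (illustrative only)."""
    match = OWNER_REGEX.search(information)
    return match.group(1) if match else None


def information_to_dict(information: str) -> Dict[str, str]:
    """Parse "Key: value" lines once into a dict.

    Lines without a colon are skipped; only the first colon on a line
    separates the key from the value.
    """
    parsed: Dict[str, str] = {}
    for line in information.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            parsed[key.strip()] = value.strip()
    return parsed


info = information_to_dict(SAMPLE_INFORMATION)
print(owner_from_blob(SAMPLE_INFORMATION))
print(info.get("Owner"))
print(info.get("Provider"))
```

With a dict, adding a new field is a key lookup rather than a new regex, and a missing field is simply `None` from `.get()`.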

Who will this benefit?

Mostly the developers, because the current code is hard to maintain and hard to extend.

Are you interested in contributing this feature?

Yes!

Anything else?

I wanted to add the database to the configuration. In Spark, this is called a catalog: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.Catalog.html

Since version 3.0, Spark can discover tables/views from multiple catalogs, such as a Hive or Glue catalog. I would love to add this, but this refactor needs to be done first, and I also want to keep the PRs concise.

@Fokko Fokko added type:enhancement New feature request triage:product In Product's queue labels May 8, 2023
@github-actions github-actions bot changed the title Convert information to a dict [ADAP-518] Convert information to a dict May 8, 2023
@Fokko Fokko linked a pull request May 8, 2023 that will close this issue
6 tasks
@dbeatty10
Contributor

Thanks for kicking this off @Fokko 🏆

I started a tasklist to track each of the refactor(s) + feature implementation(s) needed for three-part identifiers:

As you create more issues for this, just let me know and we'll add them to that tasklist.

@dbeatty10 dbeatty10 removed the triage:product In Product's queue label May 8, 2023
@Fokko
Contributor Author

Fokko commented May 9, 2023

@dbeatty10 Thanks! Much appreciated

Contributor

github-actions bot commented Nov 6, 2023

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale Mark an issue or PR as stale, to be closed label Nov 6, 2023
Contributor

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 13, 2023
@dbeatty10
Contributor

@Fokko I just noticed that this and #495 were closed as stale, so I'm re-opening them now.

@dbeatty10 dbeatty10 reopened this May 6, 2024
@dbeatty10 dbeatty10 removed the Stale Mark an issue or PR as stale, to be closed label May 6, 2024
@mikealfare mikealfare added the pkg:dbt-spark Issue affects dbt-spark label Jan 13, 2025
@mikealfare mikealfare transferred this issue from dbt-labs/dbt-spark Jan 13, 2025
mikealfare pushed a commit that referenced this issue Jan 14, 2025
* Support all types of data_type using time ingestion partitioning

* rework bq_create_table_as & fix partitions

* touchups after verifying no bug

* change case of test field because the parse routine now sanitizes the config val

---------

Co-authored-by: Mila Page <[email protected]>
Co-authored-by: Mila Page <[email protected]>