-
Notifications
You must be signed in to change notification settings - Fork 903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add table mappings to compression settings #6990
base: main
Are you sure you want to change the base?
Conversation
a48c167
to
f2652ad
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6990 +/- ##
==========================================
+ Coverage 80.06% 81.26% +1.19%
==========================================
Files 190 240 +50
Lines 37181 44729 +7548
Branches 9450 11163 +1713
==========================================
+ Hits 29770 36348 +6578
- Misses 2997 3991 +994
+ Partials 4414 4390 -24 ☔ View full report in Codecov by Sentry. |
f2652ad
to
b9df9ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this PR might need a bit more work. If we're going to store info about compressed and uncompressed assignments in the compression_settings table, we should take it out of the chunk catalog table in this same update. Storing the same thing in two places can lead to confusion and mistakes. We want to keep things tidy and have one source of truth for this information.
While I agree we should get rid of the metadata in the chunk catalog table, I think we should do this in increments. That's a pretty heavy lift and has other consequences on dependent tables, like compression_chunk_size, etc. This change does not mean we will store the same information in multiple places apart from an extra ID. I think we can live with this in the short term until the other changes are merged too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For posterity, lets make a concrete plan for the rest of the refactoring that should follow this PR for clarity and prioritization purposes.
Removing the compressed_chunk_id chunk metadata should be followed up and would clean up the catalog and our code nicely.
ada1790
to
dc8a969
Compare
Here's the plan: https://github.com/timescale/eng-database/issues/618
100%. Already working on this, but it is a much bigger change so probably requires multiple follow up PRs. Next step is to remove unnecessary fields in compression_chunk_size, for example. |
@svenklemm I agree with single source of truth. However, we already have places were we duplicate information (chunk Ids), like compression_chunk_size, for example. The intention is to address this over multiple refactoring PRs, including getting rid of compressed chunks in the "chunk" table. But his is further down the line and this is just the first step. I do think it is important to make these changes incrementally. Doing everything in one PR will lead to massive changes and it will be very difficult to review. |
051b5a4
to
a0eff1d
Compare
892ea73
to
5ed75cf
Compare
Refactor the compression settings metadata table to include a mapping from a chunk's relid to its compressed chunk's relid. This simplifies the code that looks up compression metadata as it no longer requires first looking up the corresponding compressed chunk in the "chunk" metadata table. Instead, given the non-compressed chunk's relid, the corresponding compressed chunk's relid can easily be looked up via the compression settings. The refactoring is a step towards removing a chunk's compression table from the chunk metadata. In other words, the "compressed chunk" will no longer be a chunk, just a relation associated with the regular chunk via compression settings. However, this requires further refactoring that is left for follow-up changes.
5ed75cf
to
3bff242
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good step in the direction of avoiding unnecessary lookup of catalog data.
From some of the changes made here I see that we still have a lot of unnecessary indirection causing us to add extraneous updates of catalog data, e.g., attribute names, so we are far from done.
Refactor the compression settings metadata table to include a mapping from a chunk's relid to its compressed chunk's relid.
This simplifies the code that looks up compression metadata as it no longer requires first looking up the corresponding compressed chunk in the "chunk" metadata table. Instead, given the non-compressed chunk's relid, the corresponding compressed chunk's relid can easily be looked up via the compression settings.
The refactoring is a step towards removing a chunk's compression table from the chunk metadata. In other words, the "compressed chunk" will no longer be a chunk, just a relation associated with the regular chunk via compression settings. However, this requires further refactoring that is left for follow-up changes.
Disable-check: force-changelog-file