Add table mappings to compression settings #6990

erimatnor · 2024-06-03T11:17:22Z

Refactor the compression settings metadata table to include a mapping from a chunk's relid to its compressed chunk's relid.

This simplifies the code that looks up compression metadata as it no longer requires first looking up the corresponding compressed chunk in the "chunk" metadata table. Instead, given the non-compressed chunk's relid, the corresponding compressed chunk's relid can easily be looked up via the compression settings.

The refactoring is a step towards removing a chunk's compression table from the chunk metadata. In other words, the "compressed chunk" will no longer be a chunk, just a relation associated with the regular chunk via compression settings. However, this requires further refactoring that is left for follow-up changes.

Disable-check: force-changelog-file

codecov · 2024-06-04T09:24:11Z

Codecov Report

Attention: Patch coverage is 98.26087% with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.26%. Comparing base (59f50f2) to head (3bff242).
Report is 719 commits behind head on main.

Files with missing lines	Patch %	Lines
tsl/src/compression/compression_dml.c	94.44%	0 Missing and 1 partial ⚠️
tsl/src/compression/create.c	85.71%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6990      +/-   ##
==========================================
+ Coverage   80.06%   81.26%   +1.19%     
==========================================
  Files         190      240      +50     
  Lines       37181    44729    +7548     
  Branches     9450    11163    +1713     
==========================================
+ Hits        29770    36348    +6578     
- Misses       2997     3991     +994     
+ Partials     4414     4390      -24

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

svenklemm

It looks like this PR might need a bit more work. If we're going to store info about compressed and uncompressed assignments in the compression_settings table, we should take it out of the chunk catalog table in this same update. Storing the same thing in two places can lead to confusion and mistakes. We want to keep things tidy and have one source of truth for this information.

erimatnor · 2024-06-07T16:07:48Z

It looks like this PR might need a bit more work. If we're going to store info about compressed and uncompressed assignments in the compression_settings table, we should take it out of the chunk catalog table in this same update. Storing the same thing in two places can lead to confusion and mistakes. We want to keep things tidy and have one source of truth for this information.

While I agree we should get rid of the metadata in the chunk catalog table, I think we should do this in increments. That's a pretty heavy lift and has other consequences on dependent tables, like compression_chunk_size, etc.

This change does not mean we will store the same information in multiple places apart from an extra ID. I think we can live with this in the short term until the other changes are merged too.

antekresic

For posterity, lets make a concrete plan for the rest of the refactoring that should follow this PR for clarity and prioritization purposes.

Removing the compressed_chunk_id chunk metadata should be followed up and would clean up the catalog and our code nicely.

src/process_utility.c

src/ts_catalog/compression_settings.c

tsl/src/compression/compression.c

erimatnor · 2025-01-29T10:18:48Z

For posterity, lets make a concrete plan for the rest of the refactoring that should follow this PR for clarity and prioritization purposes.

Here's the plan: https://github.com/timescale/eng-database/issues/618

Removing the compressed_chunk_id chunk metadata should be followed up and would clean up the catalog and our code nicely.

100%. Already working on this, but it is a much bigger change so probably requires multiple follow up PRs. Next step is to remove unnecessary fields in compression_chunk_size, for example.

erimatnor · 2025-01-29T10:23:11Z

Storing the same thing in two places can lead to confusion and mistakes. We want to keep things tidy and have one source of truth for this information.

@svenklemm I agree with single source of truth. However, we already have places were we duplicate information (chunk Ids), like compression_chunk_size, for example.

The intention is to address this over multiple refactoring PRs, including getting rid of compressed chunks in the "chunk" table. But his is further down the line and this is just the first step. I do think it is important to make these changes incrementally. Doing everything in one PR will lead to massive changes and it will be very difficult to review.

Refactor the compression settings metadata table to include a mapping from a chunk's relid to its compressed chunk's relid. This simplifies the code that looks up compression metadata as it no longer requires first looking up the corresponding compressed chunk in the "chunk" metadata table. Instead, given the non-compressed chunk's relid, the corresponding compressed chunk's relid can easily be looked up via the compression settings. The refactoring is a step towards removing a chunk's compression table from the chunk metadata. In other words, the "compressed chunk" will no longer be a chunk, just a relation associated with the regular chunk via compression settings. However, this requires further refactoring that is left for follow-up changes.

mkindahl

This is a good step in the direction of avoiding unnecessary lookup of catalog data.

From some of the changes made here I see that we still have a lot of unnecessary indirection causing us to add extraneous updates of catalog data, e.g., attribute names, so we are far from done.

sql/pre_install/tables.sql

src/chunk.c

src/ts_catalog/compression_settings.c

erimatnor added compression tech-debt Needs refactoring and improvement tasks related to the source code and its architecture. labels Jun 3, 2024

erimatnor force-pushed the refactor-compression-settings branch 7 times, most recently from a48c167 to f2652ad Compare June 4, 2024 09:22

erimatnor force-pushed the refactor-compression-settings branch from f2652ad to b9df9ea Compare June 4, 2024 10:21

erimatnor marked this pull request as ready for review June 4, 2024 11:57

erimatnor requested review from svenklemm and mkindahl June 4, 2024 11:57

fabriziomello assigned erimatnor Jun 5, 2024

svenklemm requested changes Jun 6, 2024

View reviewed changes

erimatnor requested a review from antekresic June 7, 2024 16:08

antekresic reviewed Jun 11, 2024

View reviewed changes

src/process_utility.c Outdated Show resolved Hide resolved

src/ts_catalog/compression_settings.c Show resolved Hide resolved

src/ts_catalog/compression_settings.c Show resolved Hide resolved

tsl/src/compression/compression.c Outdated Show resolved Hide resolved

mkindahl requested review from antekresic and svenklemm and removed request for mkindahl November 20, 2024 07:40

erimatnor force-pushed the refactor-compression-settings branch 3 times, most recently from ada1790 to dc8a969 Compare January 29, 2025 10:16

erimatnor force-pushed the refactor-compression-settings branch 3 times, most recently from 051b5a4 to a0eff1d Compare January 29, 2025 17:01

erimatnor force-pushed the refactor-compression-settings branch 3 times, most recently from 892ea73 to 5ed75cf Compare January 30, 2025 08:14

erimatnor force-pushed the refactor-compression-settings branch from 5ed75cf to 3bff242 Compare January 30, 2025 08:17

mkindahl approved these changes Jan 30, 2025

View reviewed changes

sql/pre_install/tables.sql Show resolved Hide resolved

src/chunk.c Show resolved Hide resolved

src/ts_catalog/compression_settings.c Show resolved Hide resolved

src/ts_catalog/compression_settings.c Show resolved Hide resolved

src/ts_catalog/compression_settings.c Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add table mappings to compression settings #6990

Add table mappings to compression settings #6990

erimatnor commented Jun 3, 2024 •

edited

Loading

codecov bot commented Jun 4, 2024 •

edited

Loading

svenklemm left a comment

erimatnor commented Jun 7, 2024

antekresic left a comment

erimatnor commented Jan 29, 2025

erimatnor commented Jan 29, 2025

mkindahl left a comment

Add table mappings to compression settings #6990

Are you sure you want to change the base?

Add table mappings to compression settings #6990

Conversation

erimatnor commented Jun 3, 2024 • edited Loading

codecov bot commented Jun 4, 2024 • edited Loading

Codecov Report

svenklemm left a comment

Choose a reason for hiding this comment

erimatnor commented Jun 7, 2024

antekresic left a comment

Choose a reason for hiding this comment

erimatnor commented Jan 29, 2025

erimatnor commented Jan 29, 2025

mkindahl left a comment

Choose a reason for hiding this comment

erimatnor commented Jun 3, 2024 •

edited

Loading

codecov bot commented Jun 4, 2024 •

edited

Loading