CNDB-11613 SAI compressed indexes #1474
base: main
Conversation
Force-pushed b6c7de2 to cc3605e
Force-pushed 6969bb1 to 5415fd8
What was ported:

- current compaction throughput measurement by CompactionManager
- exposing current compaction throughput in StorageService and CompactionMetrics
- nodetool getcompactionthroughput, including tests

Not ported:

- changes to `nodetool compactionstats`, because that would also require porting the tests which are currently missing in CC, and porting those tests turned out to be a complex task without porting the other changes in the CompactionManager API
- code for getting / setting compaction throughput as a double
This commit introduces a new AdaptiveCompressor class. AdaptiveCompressor uses ZStandard compression with a dynamic compression level based on the current write load. AdaptiveCompressor's goal is to provide write performance similar to LZ4Compressor for write-heavy workloads, but a significantly better compression ratio for databases with a moderate amount of writes or on systems with a lot of spare CPU power.

If the memtable flush queue builds up and compression turns out to be a significant bottleneck, the compression level used for flushing is decreased to gain speed. Similarly, when pending compaction tasks build up, the compression level used for compaction is decreased.

In order to enable adaptive compression:

- set the `-Dcassandra.default_sstable_compression=adaptive` JVM option to automatically select `AdaptiveCompressor` as the main compressor for flushes and new tables, if not overridden by specific options in cassandra.yaml or the table schema
- set `flush_compression: adaptive` in cassandra.yaml to enable it for flushing
- set `AdaptiveCompressor` in the table options to enable it for compaction

Caution: this feature is not turned on by default because it may impact read speed negatively in some rare cases.

Fixes riptano/cndb#11532
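As a hedged sketch of the third option above, enabling `AdaptiveCompressor` through the table schema might look like the following (the keyspace and table names are illustrative, and any additional compression parameters are omitted):

```cql
-- Illustrative only: keyspace/table names are hypothetical.
-- Selects AdaptiveCompressor for this table's sstables, per the
-- commit message above.
ALTER TABLE ks.tab
  WITH compression = { 'class' : 'AdaptiveCompressor' };
```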
Reduces some overhead of setting up / tearing down those contexts that happened inside the calls to Zstd.compress / Zstd.decompress. Makes a difference with very small chunks. Additionally, added some compression/decompression rate metrics.
Index compression options have been split into `key_compression` and `value_compression`, so you can write:

```cql
CREATE INDEX ON tab(v)
   WITH key_compression = { 'class': 'LZ4Compressor' }
    AND value_compression = { 'class': 'LZ4Compressor' };
```
Force-pushed 5415fd8 to 18e8b04
Force-pushed 18e8b04 to 71a3a1e
❌ Build ds-cassandra-pr-gate/PR-1474 rejected by Butler: 1 new test failure in 8 builds; 239 known test failures.
```java
@@ -347,6 +347,9 @@ public enum CassandraRelevantProperties

    /** Watcher used when opening sstables to discover extra components, eg. archive component */
    CUSTOM_SSTABLE_WATCHER("cassandra.custom_sstable_watcher"),

    /** When enabled, a user can set compression options in the index schema */
    INDEX_COMPRESSION("cassandra.index.compression_enabled", "false"),
```
Since this is a DS-only property, should we use a different prefix, as in `ds.index.compression_enabled`, so it's easier for us to identify these properties?
Also, the property could be named `INDEX_COMPRESSION_ENABLED`, or perhaps `USE_INDEX_COMPRESSION`, so the name suggests that it's a boolean property.
```java
public static boolean shouldUseAdaptiveCompressionByDefault()
{
    return System.getProperty("cassandra.default_sstable_compression", "fast").equals("adaptive");
}
```
This would probably be better in `CassandraRelevantProperties`.
```java
 * Builds a `WITH option1 = ... AND option2 = ... AND option3 = ... clause
 * @param builder a receiver to receive a builder allowing to add each option
```
Suggested change:

```diff
- * Builds a `WITH option1 = ... AND option2 = ... AND option3 = ... clause
- * @param builder a receiver to receive a builder allowing to add each option
+ * Builds a {@code WITH option1 = ... AND option2 = ... AND option3 = ...} clause.
+ *
+ * @param builder a consumer to receive a builder allowing to add each option
```
```java
public static class OptionsBuilder
{
    private CqlBuilder builder;
```
Can be `final`.
```java
public static class OptionsBuilder
{
    private CqlBuilder builder;
    boolean empty = true;
```
Can be `private`.
```java
 * May not modify this object.
 * Should return null if the request cannot be satisfied.
 */
default ICompressor forUse(Uses use)
```
Nit: add `@Nullable`.
```java
    .row(table.name, index.name)
    .add("kind", index.kind.toString())
    .add("options", index.options);

if (CassandraRelevantProperties.INDEX_COMPRESSION.getBoolean())
```
Maybe we should add a note here, or in `CassandraRelevantProperties.INDEX_COMPRESSION`, about how enabling index compression can be problematic for downgrades?
```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
```
We need the header with `Copyright DataStax, Inc.` instead of the ASF header.
```java
}

@Test
public void testKeyCompression()
```
It would be great to have a test where we create an index with a certain key compression and then a second index with a different key compression.
```java
@@ -194,7 +210,10 @@ public Keyspaces apply(Keyspaces schema)
        throw ire("Index %s is a duplicate of existing index %s", index.name, equalIndex.name);
    }

    TableMetadata newTable = table.withSwapped(table.indexes.with(index));
    // All indexes on one table must use the same key_compression.
    // The newly created index forces key_compression on the previous indexes.
```
Do we want to emit a client warning about this?
What is the issue
SAI indexes sometimes consume too much storage space.

What does this PR fix and why was it fixed
This PR allows compressing both the per-sstable and per-index components of SAI. Use the `index_compression` table param to control compression of the per-sstable components. Use the `compression` property on the index to control compression of the per-index components.

Checklist before you submit for review
- Use `NoSpamLogger` for log lines that may appear frequently in the logs
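As a hedged sketch of the two knobs described in this PR, usage might look like the following. The keyspace, table, index, and column names are illustrative, and note that the exact option names evolved during review (a later commit splits the index options into `key_compression` and `value_compression`):

```cql
-- Per-sstable SAI components: controlled by the index_compression table param.
ALTER TABLE ks.tab
  WITH index_compression = { 'class' : 'LZ4Compressor' };

-- Per-index SAI components: controlled by the compression property on the index.
CREATE INDEX idx_v ON ks.tab (v)
  WITH compression = { 'class' : 'LZ4Compressor' };
```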