previous eq deletes handling on new write #12280
Comments
Maybe we could have rewrite_position_delete_files but for eq deletes.
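For context, a minimal sketch of how the existing position-delete rewrite procedure is invoked from a Spark job today; the eq-delete variant appears only as a hypothetical, commented-out call (it does not exist in Iceberg), and the catalog/table names are placeholders:

```java
import org.apache.spark.sql.SparkSession;

public class RewriteDeletesSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder().appName("rewrite-deletes").getOrCreate();

    // Existing procedure: compacts the table's position delete files.
    spark.sql("CALL my_catalog.system.rewrite_position_delete_files(table => 'db.events')");

    // Hypothetical eq-delete analogue discussed in this issue (NOT a real procedure today):
    // spark.sql("CALL my_catalog.system.rewrite_equality_delete_files(table => 'db.events')");

    spark.stop();
  }
}
```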
Sounds fair. If eq deletes are partition scoped, maybe we need to stack them either per write or as part of an async process. Let me think more about this approach.
Currently we use StarRocks, which plans the scan as a join against the eq deletes, so for our 246 eq delete rows the bound stats cover all of the data files. I don't know whether this could be improved with a bloom filter in canContainEqDeletesForFile.
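To make the bloom-filter idea concrete, here is a small self-contained sketch (using Guava's BloomFilter); the helper methods are illustrative assumptions and not Iceberg's internal canContainEqDeletesForFile:

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class EqDeletePruningSketch {

  // Today's style of check: if the key ranges overlap, the data file *might* contain deleted keys.
  static boolean boundsOverlap(String dataLower, String dataUpper,
                               String deleteLower, String deleteUpper) {
    return dataLower.compareTo(deleteUpper) <= 0 && deleteLower.compareTo(dataUpper) <= 0;
  }

  // Possible refinement: a bloom filter over the data file's key column; a negative
  // answer for every deleted key proves the data file needs no eq-delete re-check.
  static boolean mightContainAnyDeletedKey(BloomFilter<String> dataFileKeys,
                                           List<String> deletedKeys) {
    return deletedKeys.stream().anyMatch(dataFileKeys::mightContain);
  }

  public static void main(String[] args) {
    BloomFilter<String> dataFileKeys =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);
    dataFileKeys.put("1b4e28ba-2fa1-11d2-883f-0016d3cca427");

    List<String> deletedKeys = List.of("9c5b94b1-35ad-49bb-b118-8e8fc24abf80");

    // With random UUID keys, a data file's min/max bounds span nearly the whole key space,
    // so the bounds check alone rarely prunes anything; the bloom filter can.
    System.out.println(boundsOverlap("00000000", "ffffffff", "9c5b94b1", "9c5b94b1")); // true
    System.out.println(mightContainAnyDeletedKey(dataFileKeys, deletedKeys)); // almost surely false
  }
}
```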
I see. If it is written this way, i.e. as a join, each eq delete would be scanned only once, right (the same as what Impala does)? Is there a configuration to read multiple eq deletes in a single execution task (essentially packing them)? There will always be an issue with parallelism if we try to rewrite eq deletes. Consider that the problem is even worse in Spark, as an eq delete can get scanned multiple times for a single file, so we need some strategies around how to distribute these tasks.
In our scenario each commit adds one eq delete file, every 5 minutes, 12 times an hour. I think we can trade off a larger number of delete files for more granular bounds, so that we reduce the number of data files and rows we need to recheck for deletes; that is what a theoretical eq_delete_rewrite procedure should do.
This would help Spark too, since it would reduce the number of file references. Our current situation is:
Delete Statistics:
Pos Delete Statistics:
Frankly, I would love to see any improvement that reduces "Records with eq deletes total".
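One way to gather numbers like the statistics above is Iceberg's delete_files metadata table; a minimal sketch, with catalog and table names as placeholders:

```java
import org.apache.spark.sql.SparkSession;

public class DeleteStatsSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder().appName("delete-stats").getOrCreate();

    // content = 1 -> position deletes, content = 2 -> equality deletes
    spark.sql(
            "SELECT content, COUNT(*) AS delete_files, SUM(record_count) AS delete_records "
                + "FROM my_catalog.db.events.delete_files GROUP BY content")
        .show();

    spark.stop();
  }
}
```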
Be careful about rewriting equality deletes into new equality deletes: an equality delete removes every occurrence of the matching row from previous commits.
If we compact the equality deletes, then we need to decide when the compacted deletes should be applied. If we apply them at Commit 6, we lose PK1; if we apply them at Commit 2, then we will have a duplicated PK2.
Converting equality deletes to positional deletes with file granularity (Spark-like), or to DVs (Impala-like), could help reduce the number of files that different readers have to read.
I was thinking about bloom filters some more: we can quickly determine which files really need to be examined, then emit position deletes for them.
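A rough sketch of that flow (pre-filter candidate data files, scan only those once, emit position deletes for matching rows); the types here are illustrative placeholders, not Iceberg classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Placeholder types for illustration only.
record FileRow(long position, String key) {}
record CandidateFile(String path, List<FileRow> rows) {}
record PositionDelete(String filePath, long position) {}

public class EqToPosDeleteSketch {

  // For each candidate data file (pre-filtered by bounds and/or a bloom filter over the
  // deleted keys), scan its rows once and emit position deletes for matching keys.
  static List<PositionDelete> convert(List<CandidateFile> candidates, Set<String> deletedKeys) {
    List<PositionDelete> result = new ArrayList<>();
    for (CandidateFile file : candidates) {
      for (FileRow row : file.rows()) {
        if (deletedKeys.contains(row.key())) {
          result.add(new PositionDelete(file.path(), row.position()));
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    CandidateFile f = new CandidateFile("s3://bucket/data/00000.parquet",
        List.of(new FileRow(0, "pk-a"), new FileRow(1, "pk-b")));
    System.out.println(convert(List.of(f), Set.of("pk-b")));
    // -> [PositionDelete[filePath=s3://bucket/data/00000.parquet, position=1]]
  }
}
```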
Feature Request / Improvement
We do ingestion from Debezium to Iceberg via https://github.com/databricks/iceberg-kafka-connect/
Basically it uses the Flink delta writer.
Each batch of data writes a small number of eq deletes for updates to data from previous commits.
Most of the database primary keys are UUIDs, so even a handful of eq delete rows covers a large portion of the data files (via the lower/upper bounds check), forcing a costly check at query time.
We do run a periodic compaction process, but it is inefficient, since it forces us to rewrite practically the whole table, which becomes "dirty" again within the 5-minute commit interval.
We thought about writing multiple eq delete files, to make the bounds more granular and to emulate a poor man's bloom filter.
But that again adds many ranges and only postpones the issue: the table would be dirty in, say, 30 minutes instead of 5.
If, however, a new writer could read the previous handful of eq deletes, maybe it could combine them with the new ones, so that the number of range buckets would stay roughly constant.
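A minimal sketch of the requested write-time behavior, at the level of key sets only; the method names are placeholders (not the Iceberg writer API), and it deliberately ignores the sequence-number concerns raised in the comments above:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CombinedEqDeleteWriteSketch {

  // Stub for "read the keys of the previous, still-live eq delete files"; a real writer
  // would scan those delete files via the table's FileIO.
  static Set<String> readPreviousEqDeleteKeys(List<String> previousEqDeletePaths) {
    return new HashSet<>();
  }

  // Union old and new deleted keys so the commit can replace N small eq delete files with
  // one combined file, keeping the number of key-range buckets roughly constant.
  // NOTE: applying a combined delete too late can drop rows re-inserted in between,
  // which is exactly the caution raised earlier in this thread.
  static Set<String> combine(Set<String> previousKeys, Set<String> newKeys) {
    Set<String> combined = new HashSet<>(previousKeys);
    combined.addAll(newKeys);
    return combined;
  }

  public static void main(String[] args) {
    Set<String> previous =
        readPreviousEqDeleteKeys(List.of("s3://bucket/deletes/eq-00001.parquet"));
    Set<String> combined = combine(previous, Set.of("9c5b94b1-35ad-49bb-b118-8e8fc24abf80"));
    System.out.println(combined.size()); // the writer would emit this set as one eq delete file
  }
}
```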
Query engine
None
Willingness to contribute