oraclelogminer: added in support for LOB replication in the LogToKV layer #1103
base: oracle-source-0826
Previously, LOB data would cause a crash because of antlr parsing oddities. The logic here aims to do a few things:
1. Update the insert and update queries that have EMPTY_CLOB or EMPTY_BLOB function calls to insert empty strings instead
2. Convert the hextoraw(...) data to actual bytea strings that can be interpreted in a CRDB query
3. Determine which items in the KV map for set and where clauses need to be updated with these parsing rules
Resolves: CC-31048
Release Note: None
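Steps 1 and 2 of the description can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `emptyLobPattern`, `hexToRawPattern`, `replaceEmptyLobs`, and `hexToRawToBytea` are hypothetical names, the regular expressions are guesses at what the redo SQL looks like, and the output format assumes the target accepts a Postgres-style hex-format bytea string (`\x` followed by hex digits).

```go
package main

import (
	"encoding/hex"
	"fmt"
	"regexp"
	"strings"
)

// emptyLobPattern is a hypothetical stand-in for the PR's emptyLobRegex,
// whose definition is not shown in the diff.
var emptyLobPattern = regexp.MustCompile(`(?i)\bEMPTY_[BC]LOB\s*\(\s*\)`)

// hexToRawPattern matches HEXTORAW('<hex digits>') and captures the digits.
var hexToRawPattern = regexp.MustCompile(`(?i)\bHEXTORAW\s*\(\s*'([0-9a-f]*)'\s*\)`)

// replaceEmptyLobs swaps EMPTY_CLOB() / EMPTY_BLOB() calls for an empty
// string literal so the statement no longer contains the function call.
func replaceEmptyLobs(stmt string) string {
	return emptyLobPattern.ReplaceAllString(stmt, "''")
}

// hexToRawToBytea rewrites every HEXTORAW('...') call as a quoted
// hex-format bytea string ('\x...'). Calls whose digits are not valid
// hex (for example an odd number of digits) are left untouched and the
// first such problem is returned as an error.
func hexToRawToBytea(stmt string) (string, error) {
	var firstErr error
	out := hexToRawPattern.ReplaceAllStringFunc(stmt, func(m string) string {
		digits := hexToRawPattern.FindStringSubmatch(m)[1]
		if _, err := hex.DecodeString(digits); err != nil {
			if firstErr == nil {
				firstErr = fmt.Errorf("invalid hex in %q: %w", m, err)
			}
			return m
		}
		return `'\x` + strings.ToLower(digits) + `'`
	})
	return out, firstErr
}

func main() {
	stmt := `insert into "T" ("BLOB_COL","RAW_COL") values (EMPTY_BLOB(),HEXTORAW('52415731'));`
	rewritten, err := hexToRawToBytea(replaceEmptyLobs(stmt))
	fmt.Println(rewritten, err)
}
```

Doing both rewrites on the raw statement text before it reaches the parser keeps the grammar untouched, at the cost of also rewriting any string literal that happens to contain the same text.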
func replaceEmptyLobsWithEmptyString(input string) string {
	return emptyLobRegex.ReplaceAllString(input, "''")
}

// LogToKV parse a sql stmt log to a SetKV struct, where the key is the column to be rewritten, the value
// is the value to override / insert. The true extraction logic can be found in the functions related
// to oracleparser.MockListener.
// Examples can be found in TestLogToKV().
func LogToKV(log string) (oracleparser.SetAndWhereKVStructs, error) {
TODO: take out all the verbose logging
input: `insert into "C##MYADMIN"."LOB_TABLE"
("x","y","BLOB_COL","RAW_COL","LONG_RAW_COL","CLOB_COL","NCLOB_COL")
values
('9','10115','',HEXTORAW('52415731'),NULL,'','');`,
For both insert and update, we need to test against the latest LOB_TABLE and see why RAW is not being updated properly in the first INSERT.
@@ -32,6 +33,66 @@ type logToKVTestCase struct {

func TestLogToKV(t *testing.T) {
	for i, tc := range []logToKVTestCase{
		{
			input: `insert into "C##MYADMIN"."LOB_TABLE"
Need to figure out cases where there is a mix of LOB and non-LOB cols, but the majority are non-LOB.
@ZhouXing19 I also realized that I just based this off your latest PR branch, but that has the WIP for the new parsing scheme, huh? I'm seeing certain bugs here, which makes sense since it's still a WIP. By the way, this change works fine with the version of the oracle-source-0826 branch from about two months ago. But we'll need to reevaluate this once the new method is in. Right now it's not properly replicating data over.
So I dug into this behavior a bit more closely and noticed a bug with how we are doing updates for the LOB case. As previously mentioned for LOBs:
What we actually see here however is that the update seems to clobber what happened in the insert. So from the above, we fully expect that every field will be filled out like how it is in the source:
However, if you look closer here, you'll see on the target:
In this case, every non-PK column is set to NULL. This seems to imply that the update behavior is off here. Instead of hydrating with the existing data in the target table, it just upserts NULL. Looking closer at
The question here is: how can we differentiate between a value that is truly NULL and one that is simply not set, where the intention is to update only the listed columns and not touch anything else?

After digging into a comment that @ZhouXing19 left about how we need to put the PK from the "WhereKV" into the "SetKV", I realized that a similar principle applies to the updates that come in from the LogMiner redo log: all the column values for that row are fully defined in the WHERE clause. The one caveat is that we want to exclude ROWID, because a target CRDB table will most likely not have a ROWID column. So the fix is to copy all the other keys from the WhereKV into the SetKV, though we still need to think about cases where this breaks down. (First cut but not good enough, look below for final code here)
Good News
So this new methodology seems to work fine. We should put everything but ROWID into the SetKV. There is one caveat: we can't let a key from the WHERE clause take precedence over the same key in the SET clause. So if the SetKV already has the key, we ignore that key from the WhereKV. With that, I think the behavior should be fine. The precedence order when values are reified is: Set val > Where val > default val. Final code that works in:
46735f6 to a7fa4e3