Running PR to keep track of custom changes to support YB #127

Open
wants to merge 55 commits into base: 2.5.2.Final

Conversation

vaibhav-yb
Collaborator

No description provided.

vaibhav-yb and others added 20 commits March 18, 2024 15:26
Initial changes required for the Debezium Connector for Postgres to work with YugabyteDB source.
…st YugabyteDB (#105)

This PR includes the changes required for the tests so that they can
work against YugabyteDB.

YugabyteDB issue: yugabyte/yugabyte-db#21394
Modified Dockerfile to package custom log4j.properties so that the log
files can be rolled over when their size exceeds 100MB.

Also changed the Kafka Connect JDBC jar being used - this new jar has a
custom change to log every sink record going to the target database.
Changes in this PR:
1. Modification of the Dockerfile to include the Aiven transforms at the
time of Docker image compilation
a. Aiven source:
https://github.com/Aiven-Open/transforms-for-apache-kafka-connect
…BC driver (#107)

## Problem

The Debezium connector for Postgres uses a single-host model where the
JDBC driver connects to a PG instance and continues execution. However,
when we move to YugabyteDB, where we have a multi-node deployment, the
current model can fail if the node it is connected to goes down.

## Solution

To address that, this PR replaces the Postgres JDBC driver with the
[YugabyteDB smart driver](https://github.com/yugabyte/pgjdbc), which allows
us to specify multiple hosts in the JDBC URL so that the connector does not
fail or run into any fatal error when a node goes down, preserving the
high availability aspect of YugabyteDB.

Changes in this PR include:
1. Changing the version in `pom.xml` from `2.5.2.Final` to
`2.5.2.ybpg.20241-SNAPSHOT`
a. This is done to ensure that upon image compilation, the changed code
from the Debezium codebase is picked up.
2. Replacing all packages from `org.postgresql.*` with `com.yugabyte.*`
to comply with the new JDBC driver.
3. Masking the validator method in debezium-core which disallowed
characters like `:` (colon) in the configuration property
`database.hostname`. A sketch of a multi-host connection follows.
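
For illustration, a minimal sketch of what a multi-host connection through the
smart driver could look like; the host names, port, credentials, and the
`load-balance` property are placeholders, not something this PR prescribes:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class MultiHostConnectionSketch {
    public static void main(String[] args) throws SQLException {
        // Multiple hosts in a single JDBC URL; the smart driver can connect to
        // another listed node if the one it is talking to goes down.
        String url = "jdbc:yugabytedb://node1:5433,node2:5433,node3:5433/yugabyte"
                + "?load-balance=true";   // optional smart-driver load balancing
        try (Connection conn = DriverManager.getConnection(url, "yugabyte", "yugabyte")) {
            System.out.println("Connected via: " + conn.getMetaData().getURL());
        }
    }
}
```
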
**Summary**
This PR adds support for a consistent snapshot in the case of an existing
slot.

In this case, the consistent_point hybrid time is determined from the
pg_replication_slots view, specifically from the yb_restart_commit_ht
column.

There is an assumption here that this slot has not been used for
streaming till this point. If this holds, then the history retention
barrier will be in place as of the consistent snapshot time
(consistent_point). The snapshot query will be run as of the
consistent_point and subsequent streaming will start from the
consistent_point of the slot.
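
A rough sketch of how the consistent_point could be read from the slot; the
slot name, connection URL, and credentials are placeholders, and the
connector's actual lookup may differ:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ConsistentPointSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:yugabytedb://node1:5433/yugabyte";   // placeholder
        try (Connection conn = DriverManager.getConnection(url, "yugabyte", "yugabyte");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT yb_restart_commit_ht FROM pg_replication_slots WHERE slot_name = ?")) {
            ps.setString(1, "existing_slot");   // placeholder slot name
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    // HybridTime used as the consistent_point for the snapshot query.
                    long consistentPoint = rs.getLong("yb_restart_commit_ht");
                    System.out.println("consistent_point HybridTime: " + consistentPoint);
                }
            }
        }
    }
}
```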

**Test Plan**
Added new test: `mvn -Dtest=PostgresConnectorIT#initialSnapshotWithExistingSlot test`
**Changes:**
1. Providing the JMX Exporter jar via KAFKA_OPTS so that it is passed on to
the Java options.
2. Modifying `metrics.yaml` to include the correct regex to be scraped as
per the Postgres connector.
…peruser (#115)

**Summary**
This PR adds support for a non-superuser to be configured as the
connector user (database.user).

Such a user is required to have the privileges listed in
https://debezium.io/documentation/reference/2.5/connectors/postgresql.html#postgresql-permissions

Specifically, the changes in this revision relate to how the
consistent_point is specified to the YugabyteDB server in order
to execute a consistent snapshot.

**Test Plan**
Added new test: `mvn -Dtest=PostgresConnectorIT#nonSuperUserSnapshotAndStreaming test`
…nectorTask (#114)

This PR adds a higher-level retry whenever there is a failure while
starting a PostgresConnectorTask (see the retry sketch after this list);
the failures can include, but are not limited to, the following:
1. Failure of creating JDBC connection
2. Failure to execute query
3. Tserver/master restart
4. Node restart
5. Connection failure
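
Retry sketch referenced above - a simplified illustration of a bounded
higher-level retry around task start; the method name, retry count, and delay
are hypothetical and not the connector's actual API:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class TaskStartRetrySketch {

    // Retry a task-start action a bounded number of times before giving up.
    static <T> T startWithRetries(Callable<T> startTask, int maxRetries, long waitMs) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                return startTask.call();
            } catch (Exception e) {   // JDBC failures, query failures, node/tserver/master restarts, ...
                last = e;
                System.out.printf("Start attempt %d/%d failed: %s%n", attempt, maxRetries, e.getMessage());
                Thread.sleep(waitMs);
            }
        }
        throw last;                   // retries exhausted
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulate a start that fails twice before succeeding.
        String result = startWithRetries(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new RuntimeException("simulated connection failure");
            }
            return "task started";
        }, 5, 100);
        System.out.println(result);
    }
}
```
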
…streaming (#116)

## Problem

PG connector does not wait for acknowledgement of the snapshot completion
offset before transitioning to streaming. This can lead to an issue if
there is a connector restart in the streaming phase and it goes for a
snapshot on restart. In the streaming phase, as soon as the first GetChanges
call is made on the server, the retention barriers are lifted and so the
server can no longer serve the snapshot records on a restart. Therefore
it is important that the connector waits for acknowledgement of the snapshot
completion offset before it actually transitions to streaming.

## Solution

This PR introduces a waiting mechanism for acknowledgement of snapshot
completion offset before transitioning to streaming.

We have introduced a custom heartbeat implementation that will dispatch a
heartbeat when the forced heartbeat method is called but will dispatch
nothing when a normal heartbeat method is called.

With this PR, the connector will dispatch heartbeats while waiting for the
snapshot completion offset, i.e. during the transition phase. For these
heartbeat calls, there is no need to set `heartbeat.interval.ms`
since we are making forced heartbeat calls which do not rely on this
config. Note, this heartbeat call is only required to support
applications using the Debezium engine/embedded engine. It is not required
when the connector is run with Kafka Connect.
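
A minimal sketch of the idea, using a hypothetical heartbeat interface rather
than Debezium's actual `Heartbeat` API: normal heartbeats become no-ops while
forced heartbeats are dispatched:

```java
import java.util.function.Consumer;

// Hypothetical, simplified heartbeat contract (not Debezium's real API).
interface HeartbeatSketch {
    void heartbeat(Consumer<String> dispatcher);   // periodic heartbeat
    void forcedBeat(Consumer<String> dispatcher);  // explicitly forced heartbeat
}

// Dispatches only forced heartbeats; normal heartbeats are suppressed, so
// heartbeat.interval.ms does not need to be configured for this use case.
class YbForcedOnlyHeartbeat implements HeartbeatSketch {
    @Override
    public void heartbeat(Consumer<String> dispatcher) {
        // Intentionally a no-op.
    }

    @Override
    public void forcedBeat(Consumer<String> dispatcher) {
        // Used while waiting for the snapshot-completion offset to be acknowledged.
        dispatcher.accept("forced heartbeat");
    }
}

class HeartbeatSketchDemo {
    public static void main(String[] args) {
        HeartbeatSketch hb = new YbForcedOnlyHeartbeat();
        hb.heartbeat(System.out::println);   // prints nothing
        hb.forcedBeat(System.out::println);  // prints "forced heartbeat"
    }
}
```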

### Test Plan
Manually deployed connector in a docker container and tested two
scenarios: 0 snapshot records & non-zero snapshot records. Unit tests
corresponding to these scenarios will be added in a separate PR.
#119)

**Summary**
This PR adds support for the INITIAL_ONLY snapshot mode for Yugabyte.

In the case of YugabyteDB as well, the snapshot is consumed by executing a
snapshot query (a SELECT statement). To ensure that the streaming phase
continues exactly from where the snapshot left off, this snapshot query is
executed as of a specific database state. In YB, this database state is
represented by a value of HybridTime. Changes due to transactions with
commit_time strictly greater than this snapshot HybridTime will be
consumed during the streaming phase.

This value for HybridTime is the value of the "yb_restart_commit_ht"
column of the pg_replication_slots view of the associated slot. Thus, in
the case of Yugabyte, even for the INITIAL_ONLY snapshot mode, a slot
needs to be created if one does not exist.

With this approach, a connector can be deployed in INITIAL_ONLY mode to
consume the initial snapshot. This can be followed by the deployment of
another connector in NEVER mode. This connector will continue the
streaming from exactly where the snapshot left off.
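
A hedged sketch of that two-phase deployment expressed as embedded-engine-style
properties; the hosts and slot name are placeholders, and only the
`snapshot.mode` values mirror the description above:

```java
import java.util.Properties;

public class TwoPhaseDeploymentSketch {
    public static void main(String[] args) {
        // Phase 1: consume only the initial snapshot.
        Properties snapshotPhase = new Properties();
        snapshotPhase.setProperty("snapshot.mode", "initial_only");
        snapshotPhase.setProperty("slot.name", "yb_slot");              // created if it does not exist
        snapshotPhase.setProperty("database.hostname", "node1:5433,node2:5433");

        // Phase 2: a second connector streams from where the snapshot left off.
        Properties streamingPhase = new Properties();
        streamingPhase.setProperty("snapshot.mode", "never");
        streamingPhase.setProperty("slot.name", "yb_slot");             // same slot carries the consistent_point
        streamingPhase.setProperty("database.hostname", "node1:5433,node2:5433");

        System.out.println(snapshotPhase);
        System.out.println(streamingPhase);
    }
}
```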

**Test Plan**
1. Added new test - `mvn -Dtest=PostgresConnectorIT#snapshotInitialOnlyFollowedByNever test`
2. Enabled existing test - `mvn -Dtest=PostgresConnectorIT#shouldNotProduceEventsWithInitialOnlySnapshot test`
3. Enabled existing test - `mvn -Dtest=PostgresConnectorIT#shouldPerformSnapshotOnceForInitialOnlySnapshotMode test`
… image (#118)

This PR adds the dependencies for the `AvroConverter` to function in the
Kafka Connect environment. The dependencies will only be added at the
time of building the docker image.
This PR adds a log which will print the IP of the node every time a
connection is created.
Retry in case of failures while a task is restarting. Right now, any kind
of failure will lead to the task throwing a RetriableException, causing a
task restart.
…te source (#120)

**Summary**
This PR enables 30/33 tests in IncrementalSnapshotIT for Yugabyte source

The tests that are excluded are 

1. updates
2. updatesLargeChunk
3. updatesWithRestart

**Test Plan**
`mvn -Dtest=IncrementalSnapshotIT test`
This PR comments out the part of init_database (the startup script used
during tests) where some extensions are installed - it takes more than
2 minutes at this stage and, since it is not needed for the tests we run,
it can be skipped.
Throw a retriable exception for all exceptions. In the future, we will need
to throw a runtime exception for wrong configurations.

Hi @vaibhav-yb, thanks for your contribution. Please prefix the commit message(s) with the DBZ-xxx JIRA issue key.

…r image (#129)

This PR only changes the link in the `Dockerfile` to fetch the latest
custom sink connector jar from GitHub.

According to PR yugabyte/kafka-connect-jdbc#3,
changes include the following:
1. Addition of 3 new configuration properties
    * `log.table.balance`:
i. Default is `false`; when set to `true`, the sink connector will
execute a query to get the table balance from the target table.
ii. Note that this is only applicable for consistency-related tests
where the given query is applicable - it will fail if set in any other
tests.
    * `expected.total.balance`
i. Default is `1000000` (1M), which can be changed to whatever value we
expect the total balance in the target table to be.
    * `tables.for.balance`
i. This takes a comma-separated string of all the table names from which
the sink connector is supposed to extract balances.
ii. This property is only valid when `log.table.balance` is set to `true`.
iii. There is no default for this property, so if `log.table.balance` is
set to `true` and `tables.for.balance` is not specified, then we will
throw a `RuntimeException`.
2. Log additions to aid debugging. (A configuration sketch for these
properties follows.)
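
A sketch of how these sink properties could be set together; the table names
and balance value are placeholders and assume the consistency-test setup
described above:

```java
import java.util.Properties;

public class BalanceCheckConfigSketch {
    public static void main(String[] args) {
        Properties sinkProps = new Properties();
        sinkProps.setProperty("log.table.balance", "true");          // default is false
        sinkProps.setProperty("expected.total.balance", "1000000");  // default is 1000000 (1M)
        // Required when log.table.balance=true; otherwise a RuntimeException is thrown.
        sinkProps.setProperty("tables.for.balance", "accounts,savings");
        System.out.println(sinkProps);
    }
}
```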

…is set (#131)

## Problem
PG connector filters out records based on its starting point (WAL
position), which in turn depends on the offset received from Kafka. So,
in case the starting point corresponds to a record in the middle of a
transaction, the PG connector will filter out the records of that
transaction with LSN < starting point.

This creates a problem when the downstream pipeline expects consistency of
data. Filtering of records leads to the PG connector shipping transactions
with missing records. When such a transaction is applied on the sink,
consistency breaks.

## Solution
When the 'provide.transaction.metadata' connector configuration is set, the
PG connector ships the transaction boundary records BEGIN/COMMIT. Based on
these boundary records, the sink connector writes data maintaining
consistency. Therefore, when this connector config is set, we will
disable filtering records based on the WAL resume position (see the sketch
below).
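
A minimal sketch of the relevant configuration flag; this mirrors the
description above and is not a complete connector config:

```java
import java.util.Properties;

public class TransactionMetadataConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Emit BEGIN/COMMIT boundary records; with this change, WAL-resume-position
        // filtering is disabled whenever this flag is set.
        props.setProperty("provide.transaction.metadata", "true");
        System.out.println(props);
    }
}
```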

## Testing
Manual testing - Ran the connector with this fix in our QA runs where
the issue was first discovered. All 10 runs triggered passed.
Unit testing - Cannot reproduce the above-mentioned scenario in a unit
test.

This PR adds a configuration, `yb.consistent.snapshot`, to let the user
disable consistent snapshot; it is enabled by default. Setting consistent
snapshot means setting/establishing the boundary between snapshot records
(records that existed at the time of stream creation) and streaming records.

If `yb.consistent.snapshot` is disabled, i.e. set to `false` (see the sketch after this list):
- We will not be setting a boundary for snapshot
- If the connector restarts after taking a snapshot but before
acknowledging the streaming LSN, the snapshot will be taken again. This
can result in some records being received both during the snapshot phase
and the streaming phase.
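
Sketch referenced above - a minimal illustration of disabling the consistent
snapshot boundary; this is not a complete connector configuration:

```java
import java.util.Properties;

public class ConsistentSnapshotToggleSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Default is true; setting it to false skips establishing the
        // snapshot/streaming boundary described above.
        props.setProperty("yb.consistent.snapshot", "false");
        System.out.println(props);
    }
}
```
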
## Problem

The Postgres JDBC driver jar currently being used has a `CRITICAL`
vulnerability, `CVE-2024-1597`, as identified by `trivy` in version
`42.6.0`.

## Solution

For the fix, this PR upgrades the jar to driver version `42.6.1`, which
does not have the vulnerability.

This PR adds a GitHub action to the repository which can be run manually
to publish a Docker image to Quay and create a GitHub release draft with
a fat jar.

This PR fixes the path for the connector jar file to pick up the correct
artefact to be packaged in the Docker image.


This PR adds the following tests:
1. Streaming for a table with composite PK
2. Streaming with different replica identities with both plugins i.e.
`yboutput` and `pgoutput`
3. Streaming changes for a table with foreign key

## Problem

Suppose we have a column with a non-basic value like `timestamp` - the
code for converting the values here is the following:
```java
if (connectorConfig.plugin().isYBOutput()) {
    if (value != null && !UnchangedToastedReplicationMessageColumn.isUnchangedToastedValue(value)) {
        value = converter.convert(((Object[]) value)[0]);
        Struct cell = new Struct(fields[i].schema());
        cell.put("value", value);
        cell.put("set", true);
        result.put(fields[i], cell);
    } else {
        result.put(fields[i], null);
    }
} else {
    result.put(fields[i], value);
}
```

The `else` block above is missing a line which should apply a converter
to the value.

## Solution

This PR modifies the `PostgresSchema#getTableSchemaBuilder` method to
return the default `TableSchemaBuilder` when the plugin is NOT
`yboutput`.

Additionally, this PR fixes the `else` block in `PGTableSchemaBuilder`
with the required conversion and adds tests verifying the same for
`pgoutput` by adding the following logic:

```java
...
else {
  value = converter.convert(value);
  result.put(fields[i], value);
}
...
```

This PR adds the logic to throw an exception if a connector is deployed
with `decimal.handling.mode=PRECISE`.

This PR adds a test which works with both `pgoutput` and `yboutput` and
verifies that if we update a column of any type, we get the relevant
change event.

This PR introduces the following change:
* The version of the pom in the jar file will now be the same as the one
specified in the input form

This PR changes the following:
1. Adds a plugin to the pom.xml file to package the connector according
to Confluent specifications
2. Adds a GitHub pipeline to automate the above packaging process

## Problem

For very large tables, the default `SELECT *` query can take a very long
time to complete, leading to longer snapshot times.

## Solution

This PR aims to implement snapshotting the table in parallel using the
built-in function `yb_hash_code` to only run the query for a given hash
range. The following 2 configuration properties are introduced with this
PR:
1. A new `snapshot.mode` called `parallel` - this will behave exactly
like `initial_only` but we will have the ability to launch multiple
tasks.
2. `primary.key.hash.columns` - this config takes a comma-separated list
of the columns forming the hash component of the table's primary key.

> **Note:** When `snapshot.mode` is set to `parallel`, we will not
support providing regex in the property `table.include.list` and the
user will need to specify the full name of the table in the property.
Additionally, we will only allow one table in the `table.include.list`
if `snapshot.mode` is `parallel`.
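
A sketch of how per-task hash-range snapshot queries could look, assuming
`yb_hash_code` values fall in [0, 65535] and the range is split evenly across
tasks; the table name and hash column are placeholders and the connector's
actual query generation may differ:

```java
public class ParallelSnapshotQuerySketch {
    public static void main(String[] args) {
        int totalHashRange = 65536;   // yb_hash_code values fall in [0, 65535]
        int taskCount = 4;
        int step = totalHashRange / taskCount;

        for (int taskId = 0; taskId < taskCount; taskId++) {
            int lower = taskId * step;
            int upper = (taskId == taskCount - 1) ? totalHashRange : lower + step;
            // "id" stands in for the columns listed in primary.key.hash.columns.
            String query = String.format(
                "SELECT * FROM accounts WHERE yb_hash_code(id) >= %d AND yb_hash_code(id) < %d",
                lower, upper);
            System.out.println("task_" + taskId + ": " + query);
        }
    }
}
```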

…on (#163)

## Problem

With the introduction of the parallel snapshot model, we can have
multiple tasks when the snapshot mode is set to `parallel`. This
introduces a problem at the underlying layer when the connector stores
the sourceInfo for its partitions i.e. `PostgresPartition` objects in
Kafka.

The `PostgresPartition` is identified by a map which has a structure
`{"server", topicPrefix}` - currently this is the same for all the
`PostgresPartition` objects which are created by the tasks when
`snapshot.mode` is `parallel` and hence they all end up referring to the
same source partition in the Kafka topic. Subsequently, the following
happens (assume that we have 2 tasks, i.e. task_0 and task_1):
1. One task (task_0) completes the snapshot while the other is yet to
start.
a. After completion, `task_0` updates the `sourceInfo` saying that its
snapshot is completed.
2. When task_1 starts up, it reads the same `sourceInfo` object and
concludes that the snapshot is completed so it skips its snapshot.

The above situation will cause a data loss since task_1 will never
actually take a snapshot.

## Solution

This PR implements a short-term solution where we simply add the task ID
to the partition so that each `PostgresPartition` can identify a
source partition uniquely; the identifying map will now become
`{"server", topicPrefix_taskId}`.

**Note:**
This solution is a quick fix for the problem, given that the number of
tasks in the connector remains the same.

This partially fixes yugabyte/yugabyte-db#24555
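
A small sketch of the change to the identifying map; the topic prefix is a
placeholder and the real `PostgresPartition` code is more involved:

```java
import java.util.Collections;
import java.util.Map;

public class SourcePartitionSketch {
    // Before: every task used {"server": topicPrefix} and collided in the offsets topic.
    static Map<String, String> sharedPartition(String topicPrefix) {
        return Collections.singletonMap("server", topicPrefix);
    }

    // After: the task ID is appended so each task owns a distinct source partition.
    static Map<String, String> perTaskPartition(String topicPrefix, int taskId) {
        return Collections.singletonMap("server", topicPrefix + "_" + taskId);
    }

    public static void main(String[] args) {
        System.out.println(sharedPartition("dbserver1"));
        System.out.println(perTaskPartition("dbserver1", 0));
        System.out.println(perTaskPartition("dbserver1", 1));
    }
}
```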

…query (#164)

This PR makes the changes to set the transaction isolation level to
`SERIALIZABLE, READ ONLY, DEFERRABLE` before executing the snapshot read
query.
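
A minimal JDBC sketch of the intended sequence, with a placeholder URL and
snapshot query; the connector sets this on its snapshot connection rather
than in standalone code like this:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SnapshotIsolationSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:yugabytedb://node1:5433/yugabyte";   // placeholder
        try (Connection conn = DriverManager.getConnection(url, "yugabyte", "yugabyte");
             Statement stmt = conn.createStatement()) {
            conn.setAutoCommit(false);
            // Set the transaction characteristics before running the snapshot read.
            stmt.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE");
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM accounts")) {   // placeholder snapshot query
                while (rs.next()) {
                    // process snapshot rows
                }
            }
            conn.commit();
        }
    }
}
```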

## Problem

With the current retry model, it was noticed that the connector ended up
retrying infinitely whenever an exception was thrown. This could lead to
a false impression that the connector is still running while the retries
keep failing.

## Solution

This PR addresses the issue by adding a check to the task layer: if the
retry count reaches the maximum value, the connector will exit and the
task will reach a failed state - this will help the end user know the
status of the task and act accordingly.

This PR also redefines the following properties and changes their
default values:
1. `errors.max.retries` - new default value is 60
2. `retriable.restart.connector.wait.ms` - new default value is 30000
(30s)

With the above change, the complete retry duration with the above
default configuration will now be 30 minutes. This effectively means
that if the connector's task fails after exhausting all the retries, then
it will go into a `FAILED` state.

For example, if the connector needs to retry for a total of 30 minutes,
we can handle it in 2 ways:
1. **By fixing the number of retries:** Let's say we want the number of
retries to be fixed at 15, so we can configure our retry delay
accordingly, i.e. `30 / 15 = 2 minutes = 120 s = 120000 ms`, so the
configuration will now add:

```json
"retriable.restart.connector.wait.ms":"120000",
"errors.max.retries":"15"
```

2. **By fixing the retry delay:** If we want to have a retry delay of a
minute, then we will configure the number of retries accordingly, i.e. `30
/ 1 = 30 retries`, so the configuration will now be:

```json
"retriable.restart.connector.wait.ms":"60000",
"errors.max.retries":"30"
```

This PR fixes a bug which was causing one fewer retry than configured; for
example, if `errors.max.retries` was set to 5, the connector was only
retrying 4 times.

Additionally, this PR adds a test to verify the logic.

…d structure (#168)

This PR adds a transformer `PGCompatible` with full path
`io.debezium.connector.postgresql.transforms.yugabytedb`. This will be
helpful in converting the structure of the emitted events to match the
one emitted by standard Debezium connectors.

**Example:**
Consider the following schema for a table `test`: `(id INT PRIMARY KEY,
name TEXT, age INT)`
- If a record is inserted having values `(1, 'John Doe', 25)` then after
using the above transformer, the `payload` of the record would look
like:
```json
"payload": {
  "id": 1,
  "name": "John Doe",
  "age": 25
}
```
- If the same record is now updated and age is changed to 30 i.e.
`UPDATE test SET age = 30 WHERE id = 1;` then the `payload` would look
like:

```json
"payload": {
  "id": 1,
  "name": null,
  "age": 30
}
```

> **NOTE:** The above example assumes that the replica identity of the
table is `CHANGE`, which is why the `UPDATE` event will not contain the
value for the fields which were not updated. For more information on
replica identity, see [YugabyteDB
docs](https://docs.yugabyte.com/preview/explore/change-data-capture/using-logical-replication/key-concepts/#replica-identity).
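
A sketch of how the transformer might be enabled; the fully qualified class
name is assumed from the package path mentioned above and the transform alias
is a placeholder:

```java
import java.util.Properties;

public class PGCompatibleTransformSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("transforms", "pgcompatible");   // placeholder alias
        // Class name assumed from the package path mentioned above.
        props.setProperty("transforms.pgcompatible.type",
                "io.debezium.connector.postgresql.transforms.yugabytedb.PGCompatible");
        System.out.println(props);
    }
}
```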
