Looks like special characters are getting corrupted #2151

Open
BO-AndrejZ opened this issue Jan 17, 2025 · 7 comments

Comments

@BO-AndrejZ

BO-AndrejZ commented Jan 17, 2025

I am running Maxwell in a Docker container to read data from the MySQL binlog and write it to RabbitMQ.
Image: zendesk/maxwell:v1.41.2

For instance, I update the field with a MySQL client like this:
update auftrag set Gewerk = '21 Lüftung' where nr = 2024170075;

Source DB settings (from the mysql client):

show variables like "character_set%";

character_set_client	utf8mb4
character_set_connection	utf8mb4
character_set_database	utf8
character_set_filesystem	binary
character_set_results	utf8mb4
character_set_server	utf8
character_set_system	utf8

The log entry from Maxwell is as follows:
2025-01-16 11:59:39 DEBUG RabbitmqProducer - -> routing key:db.auftrag, partition:{"database":"db","table":"auftrag","type":"update","ts":1737028779,"xid":153331256,"commit":true,"data":{"Nr":2024170075","GewerkNr":"21","Gewerk":"21 Lüftung"}}

It looks like the UTF-8 strings are getting corrupted for some reason (this looks like ASCII in the log), and this is also what gets pushed to RabbitMQ.
I tried setting the JDBC options with the environment variable REPLICATION_JDBC_OPTIONS:useUnicode=true&characterEncoding=UTF-8, but it had no effect.

Table Definition:

CREATE TABLE `auftrag` (
  `Nr` int(10) NOT NULL DEFAULT '0',
  `GewerkNr` varchar(10) COLLATE utf8_german2_ci DEFAULT NULL,
  `Gewerk` varchar(50) COLLATE utf8_german2_ci DEFAULT NULL,
  ... 
  PRIMARY KEY (`Nr`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_german2_ci;
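
A side note on the definition above: MySQL collation names begin with the name of their character set, so `utf8_german2_ci` implies that both columns store `utf8`. A minimal Python sketch that makes the expected column charset explicit (the helper name `charset_of` is hypothetical and only for illustration):

```python
def charset_of(collation: str) -> str:
    """Return the character set implied by a MySQL collation name."""
    # MySQL collation names start with the charset name,
    # e.g. utf8_german2_ci -> utf8, latin1_swedish_ci -> latin1.
    return collation.split("_", 1)[0]


# Column collations copied from the CREATE TABLE statement above.
columns = {"GewerkNr": "utf8_german2_ci", "Gewerk": "utf8_german2_ci"}

expected = {name: charset_of(coll) for name, coll in columns.items()}
print(expected)
```

Any tool that reads these columns would therefore be expected to treat their bytes as `utf8`.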

My next attempt will be to add the JDBC options to config.properties.

Any idea what else I can set to resolve this?

@osheroff
Collaborator

please dump the schema for the table in question. jdbc options aren't important here.

@BO-AndrejZ
Author

BO-AndrejZ commented Jan 18, 2025

@osheroff Thank you for the response. I added the table definition. The value of the DB field looks fine if I do a SELECT.
The MySQL client settings:
characterEncoding:UTF-8

@osheroff
Collaborator

what version of mysql?

@BO-AndrejZ
Author

5.7.37-enterprise-commercial-advanced-log

@osheroff
Collaborator

ok, I can't reproduce this trivially. what's the output of this?

select * from maxwell.tables where name='auftrag';

and this?

select * from maxwell.columns where name='Gewerk';

@BO-AndrejZ
Author

@osheroff, thank you. I think I now see what the problem is:

select * from maxwell.tables where name='auftrag';
-- Charset: latin1
select * from maxwell.columns where table_id = 426 and name='Gewerk';
-- Charset: latin1

I will let Maxwell read the schema again, and I think that should fix it.
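
For anyone hitting the same symptom, here is a minimal Python sketch of why a latin1 charset in the stored schema would produce exactly the string seen in the log. It assumes, based on this thread rather than on a reading of Maxwell's code, that the column bytes are decoded with the charset recorded in `maxwell.columns`:

```python
# The value written by the UPDATE statement.
original = "21 Lüftung"

# The column is utf8, so 'ü' is stored as the two bytes 0xC3 0xBC.
raw_bytes = original.encode("utf-8")

# Decoding those bytes as latin1 turns each byte into its own character.
garbled = raw_bytes.decode("latin-1")
print(garbled)

# The damage is reversible as long as no bytes were lost on the way.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

The first print reproduces the garbled `Gewerk` value from the log entry, which is consistent with the column being read as latin1 instead of utf8.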

@osheroff
Collaborator

yeah. I wish i knew why / how it ended up this way, but you're right about the fix
