Incorrect writing script for Bodo #5

maharajbrahma · 2023-06-26T04:58:19Z

Hello,

I noticed that in the paper "BiLEX-Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation", the script listed for the Bodo language in the Appendix F table is incorrect. Bodo uses the Devanagari script (Deva), not Bengali (Beng). This can be confirmed from the GATITOS dataset.

Thanks.

icaswell · 2023-09-26T01:10:28Z

[resolved offline; copying answer here]

While the more common script for written Bodo is Devanagari, the web-crawl data that happened to be included in this project was in the Bengali script. That is why it is explicitly labeled "brx-Beng" rather than "brx" (which would mean Bodo in its default script, i.e. Devanagari). The GATITOS data is correctly in Devanagari. However, at the time this pre-print was written, GATITOS data existed for only 26 languages, not including Bodo, so the Bodo GATITOS data is not included in the paper.
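To make the labeling convention above concrete, here is a minimal Python sketch (not code from the paper or the dataset). It assumes tags of the form language, optionally followed by a four-letter script subtag, and resolves a bare "brx" to Devanagari through a hypothetical `DEFAULT_SCRIPT` table. The helper names `resolve_script` and `dominant_script` are illustrative only; the second one checks which script a text sample is mostly written in by counting code points in the Unicode Devanagari (U+0900 to U+097F) and Bengali (U+0980 to U+09FF) blocks.

```python
# Hypothetical default-script table for illustration; only the Bodo
# entry reflects what is stated in this thread.
DEFAULT_SCRIPT = {"brx": "Deva"}

# Unicode block ranges for the two scripts discussed here.
SCRIPT_RANGES = {"Deva": (0x0900, 0x097F), "Beng": (0x0980, 0x09FF)}


def resolve_script(tag):
    """Split a tag such as 'brx-Beng' into (language, script).

    If the tag carries no script subtag, fall back to the assumed
    default script for that language (None if unknown).
    """
    parts = tag.split("-")
    lang = parts[0].lower()
    # A four-letter alphabetic second subtag is treated as a script code.
    if len(parts) > 1 and len(parts[1]) == 4 and parts[1].isalpha():
        return lang, parts[1].title()
    return lang, DEFAULT_SCRIPT.get(lang)


def dominant_script(text):
    """Return 'Deva' or 'Beng', whichever has more characters in text.

    Returns None when the text contains characters from neither block.
    """
    counts = {name: 0 for name in SCRIPT_RANGES}
    for ch in text:
        cp = ord(ch)
        for name, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[name] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


print(resolve_script("brx-Beng"))
print(resolve_script("brx"))
print(dominant_script("बड़ो"))
```

A check like `dominant_script(sample) == resolve_script(tag)[1]` is one simple way to confirm that a data file labeled "brx-Beng" really contains Bengali-script text, under the assumptions stated above.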
