Unable to search or download #13

Open
sinaahmadi opened this issue May 16, 2024 · 2 comments

Comments

@sinaahmadi

Hi,

The new website is sleek! However, it seems to have some glitches with searching and downloading. I have noticed this particularly for languages whose codes include the script name, such as "Central Kurdish" or "Kurdish (Arabic)".

When I try to download NLLB for that language (here: https://opus.nlpl.eu/NLLB/en&ku-Arab/v1/NLLB), the search returns nothing. If I first search for a pair that works on NLLB, such as Tamil-English (ta-eng), I can then search for the other language code, but the download links still point to the previous pair. Ultimately, I get this error at https://opus.nlpl.eu/sample/ku-Arab&/NLLB&v1/sample: "We're sorry, no samples for Kurdish (Arabic) (ku-Arab) - in the [NLLB](https://opus.nlpl.eu/NLLB/ku-Arab&/v1/NLLB) dataset, version v1 were found."

Thanks for your help.

@jorgtied
Member

jorgtied commented Jul 19, 2024

We are looking into this. It seems to be a problem with the OPUS-API: the language pair does not show up for some reason. The issue might be related to the way the pair is specified in the metadata (it says ku_Arab-en instead of en-ku_Arab; in OPUS the language pair is typically specified by alphabetically sorted language IDs).
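As a rough illustration of the sorting convention mentioned above, here is a minimal Python sketch. It is not the OPUS-API code: the function name `canonical_pair` is hypothetical, and it assumes that "alphabetically sorted" means a plain string sort of the two language IDs.

```python
def canonical_pair(lang_a: str, lang_b: str) -> str:
    """Join two language IDs in sorted order, separated by a hyphen.

    Hypothetical helper for illustration only. Assumes a plain string
    sort is what "alphabetically sorted language IDs" means.
    """
    first, second = sorted((lang_a, lang_b))
    return f"{first}-{second}"


# Both orderings of the pair from this issue give the same canonical form.
print(canonical_pair("ku_Arab", "en"))  # en-ku_Arab
print(canonical_pair("en", "ku_Arab"))  # en-ku_Arab
```

Taking the two IDs as separate arguments avoids having to split a combined string, which would be ambiguous for IDs that themselves contain a hyphen, such as ku-Arab.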

In the meantime, you could download the data from the links on the legacy NLLB OPUS site: https://opus.nlpl.eu/legacy/NLLB.php

@sinaahmadi
Author

Thanks.
I have also contacted you many times about adding a few parallel corpora for Kurdish. Would you be able to add them to OPUS, please? https://github.com/KurdishBLARK/InterdialectCorpus/tree/master

Thanks.
