Add WV-MOS #22

Open
lars76 opened this issue Jul 28, 2024 · 1 comment

lars76 commented Jul 28, 2024

Add WV-MOS from https://arxiv.org/pdf/2203.13086. The code is here: https://github.com/AndreevP/wvmos/tree/main

Also relevant: https://www.arxiv.org/pdf/2407.12707. On “TTS Arena”, UTMOSv1 correlates only weakly with the leaderboard, while WV-MOS does much better. I haven't tested UTMOSv2 and WV-MOS myself yet, but in my experiments UTMOSv1 does not always give a correct assessment of voice quality.
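
For anyone who wants to check this kind of claim on their own data, here is a minimal sketch of a rank correlation (Spearman) between a predictor's per-system scores and a leaderboard ranking. It is not taken from the linked paper or from the wvmos repository; the function names are hypothetical and the numbers in the usage lines are placeholder values, not measurements.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values only: one predicted MOS and one leaderboard
# score per TTS system, in the same order.
predicted_mos = [3.9, 4.2, 3.1, 2.8]
leaderboard_score = [1100, 1250, 1180, 990]
rho = spearman(predicted_mos, leaderboard_score)
```

A value of `rho` near 1 would mean the predictor orders the systems the same way the leaderboard does; a value near 0 would mean the ordering is mostly unrelated.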

@Satellite30

I have already used these models in my work, which involves cleaning audio data in different languages. WV-MOS predicts scores such as -0.1 or 0.123, so its output can be negative. When I listened to these clips, many of them were noisy or had excessive noise reduction applied. If I step through the scores at intervals of 0.1 (e.g. 2.0, 2.1, 2.2), the differences between clips are hard to hear. If you compare clips at 2.0 and 3.0 instead, the difference is much easier to hear. Besides, in multilingual audio clips, some languages consistently get lower scores than American English, such as Chinese, Cantonese, and even British English.
However, UTMOSv1 seems to behave somewhat differently from WV-MOS: its scores seem to separate clips more clearly. Of course, its scores also differ across languages. I haven't checked whether UTMOSv1 can output negative numbers.
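
A rough sketch of one way to act on the two observations above when filtering data: group scores into coarse buckets (width 1.0 instead of 0.1) and compare each clip against the other clips of the same language rather than against a single global threshold. This is an assumption about how one might do it, not the workflow described in this comment; `coarse_bucket` and `normalize_per_language` are hypothetical helpers, and the records in the usage lines are placeholder values.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def coarse_bucket(score, width=1.0):
    """Lower edge of the bucket containing `score` (handles negative scores)."""
    return math.floor(score / width) * width


def normalize_per_language(records):
    """Attach a per-language z-score to each (language, score) pair.

    Languages whose scores are all identical get a z-score of 0.0.
    """
    by_lang = defaultdict(list)
    for lang, score in records:
        by_lang[lang].append(score)
    stats = {lang: (mean(s), pstdev(s)) for lang, s in by_lang.items()}

    normalized = []
    for lang, score in records:
        mu, sigma = stats[lang]
        z = (score - mu) / sigma if sigma > 0 else 0.0
        normalized.append((lang, score, z))
    return normalized


# Placeholder records: (language tag, predicted MOS).
records = [("en-US", 4.0), ("en-US", 2.0), ("zh", 3.0), ("zh", 1.0)]
scored = normalize_per_language(records)

# Keep clips that are at least average for their own language.
kept = [(lang, score) for lang, score, z in scored if z >= 0.0]
```

With per-language normalization, a language that is scored systematically lower is not discarded wholesale; whether that is desirable depends on whether the lower scores reflect real quality differences or predictor bias.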
