Wav2vec-U 2 is missing a kmeans model #5088

Open
Mattias421 opened this issue Apr 21, 2023 · 3 comments

@Mattias421

🐛 Bug

https://github.com/facebookresearch/fairseq/blob/main/examples/wav2vec/unsupervised/scripts/prepare_audio_v2.sh

I believe prepare_audio_v2.sh is missing a step that trains the k-means model, e.g.:

python $FAIRSEQ_ROOT/examples/hubert/simple_kmeans/learn_kmeans.py \
   $tgt_dir/mfcc $train_split 1 $tgt_dir/mfcc/cls$dim 64 --percent -1 

This would go between the dump_mfcc and dump_km_label steps.
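
For anyone patching the script locally, here is a minimal sketch of one way to guard that step, so the k-means model is trained only when it is not already on disk. The helper name ensure_kmeans_model is hypothetical and not part of fairseq; the commented usage reuses the variables and arguments from the command above and assumes they are set as in prepare_audio_v2.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of fairseq): run the given training command
# only if the k-means model file is missing or empty.
ensure_kmeans_model() {
  local km_path="$1"
  shift
  if [ -s "$km_path" ]; then
    echo "k-means model found at $km_path, skipping training"
    return 0
  fi
  echo "k-means model missing at $km_path, training"
  "$@"  # remaining arguments are the training command and its arguments
}

# Assumed placement: after the MFCC features are dumped and before the
# k-means labels are dumped. Variables are those used in the command above.
#
# ensure_kmeans_model "$tgt_dir/mfcc/cls$dim" \
#   python "$FAIRSEQ_ROOT/examples/hubert/simple_kmeans/learn_kmeans.py" \
#   "$tgt_dir/mfcc" "$train_split" 1 "$tgt_dir/mfcc/cls$dim" 64 --percent -1
```

The guard only checks that the file exists and is non-empty; it does not verify that the model was trained on the current features, so delete the file if the MFCC dump changes.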

@XR1988

XR1988 commented Dec 19, 2024

Thanks for your work. What's the current status? I'm not getting good results; my UER is stuck around 90.
I've cloned this repo (it might have environment problems now): https://github.com/oneapi-src/ai-transcribe
Others cloned it with a virtual environment: https://github.com/voidful/wav2vec-u-exp
I'm having trouble with this: #5572
@Mattias421

@Mattias421
Author

Hi XR1988, I haven't worked on this since last year, but from what I remember I got OK results on small datasets. I would try training on full LibriSpeech first to confirm that your environment is working, and then move on from there.

@XR1988

XR1988 commented Dec 20, 2024

Thank you so much, Mattias421.

I'm having the same problem with both versions, U-1.0 and U-2.0. Even after training for over 50,000 steps, the UER and WER remain stubbornly around 90, and the outputs on the validation set keep getting shorter.

[image attachment]

If you are working on the 'inter' project, it might be that the 'inter' component is no longer supported. I've been cloning everything directly onto a cloud server, but that might not be necessary. It's likely only there to support CPU-only training (in practice, you can use both CPU and GPU by changing the configuration settings).

I've cloned this repo (it might have environment problems now): https://github.com/oneapi-src/ai-transcribe

I'm having trouble with this: #5572
Useful: #3581
