-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metrify RawNet3/Resemblyzer as Keywords & Update READMEs #85
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice PR description
deg_dir, ref_dir, dump_dir | ||
) | ||
result[metric] = str(similarity_score) | ||
if metric in ["fad", "rawnet3_similarity"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is "fad" here? It seems that the original code does not contain the conditional judegment about fad. Is there a bug in the old code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. When modifying the input-selection part in the old code, the FAD part was mis-deleted. It should be here, as shown in the original commit, at line 64: 9682d0c#diff-4fa833e1c8dd8d05d182f8262a2cc5f727dc72a364db06f8acc5536eff3e6506
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Owing to the firewalled status of huggingface.co on the Aliyun server, it went undetected because prior to testing calc_metrics.py, all parts concerning FAD had to be commented out, or else the script couldn't correctly initialize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VocodexElysium Please review this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VocodexElysium Please review this.
Basically, if your internet environment does not support getting access to Hugging Face in the terminal, importing FAD will cause errors since it is trying to connect with Hugging Face. You can avoid this by setting the correct VPN environment or just downloading the necessary things yourself and then adjusting the FAD code to import the model on your computer rather than downloading from Hugging Face (I think MingXuan did this successfully). So I don't think removing FAD-related code is necessary since it only involves the internet environment and is solvable.
New commit: fixed a found typo in Amphion/egs/metrics/README.md: Aduio -> Audio |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@Merakist Calculating FAD needs a connection to Hugging Face. I think you need to specify that and also attach the methodology of downloading and specifying the folder to the pretrained model for calculating FAD when the internet cannot guarantee a connection to Hugging Face in the README.md. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Merakist @VocodexElysium Please cooperate to refine the FAD-related doc to make it friendly and easy-to-understand for users
Updated README. Proposed a solution for FAD to load local models when |
Here are some advice:
|
@VocodexElysium Roger. The new commit condensed the title from |
I think you forgot to create folders for bert, roberta, and bart in the Amphion/pretrained directory. Please add them up. |
@VocodexElysium I have updated the README.md under |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Able to merge
Description
This pull request includes modifications to further integrate Resemblyzer into Amphion's evaluation process, which includes:
Modifying the cosine similarity calculation method: Previously, the
resemblyzer_similarity.py
used thescipy.spatial.distance.cosine
method for calculating cosine similarity. This update replaces it with PyTorch'storch.nn.functional.cosine_similarity
method, which is already used by other scripts to simplify the codebase and ensure uniformity.Metrifying RawNet3/Resemblyzer as keywords: Altered the
--metrics
argument handling fromspeaker_similarity
torawnet3_similarity
andresemblyzer_similarity
as distinct options to streamline the evaluation process by removing the need for user input during script execution.Testing
Changes
Amphion/evaluation/metrics/similarity/resemblyzer_similarity.py:
scipy.spatial.distance.cosine
totorch.nn.functional.cosine_similarity
method.Amphion/bins/calc_metrics.py:
speaker_similarity
to two distinct options:rawnet3_similarity
andresemblyzer_similarity
.Amphion/egs/metrics/README.md:
speaker_similarity
keyword withrawnet3_similarity
andresemblyzer_similarity
, and fixed misspelling.Amphion/README.md:
Usage
When calculating speaker similarity with
Amphion/egs/metrics/run.sh
using the command--metrics
, the user shall select the desired model (RawNet3/Resemblyzer) for calculation with the corresponding keyword (rawnet3_similarity
/resemblyzer_similarity
).