AST for multi-label audio tagging? #142

Antoine101 · 2024-12-16T16:12:23Z


Hi,

I am trying to describe acoustic scenes from audio samples by listing all the sources present in the sounds from a set list of learnt labels.

I have read your paper and used your model, mainly through the Hugging Face hub, for single-label classification.

Does it also work for multi-label classification (one audio sample = possibly multiple labels)?

Here you say that this checkpoint can classify an audio clip into one of the AudioSet classes.

In your paper, however, you mention results obtained on the FSD50K dataset, which is a multi-label dataset (correct me if I'm wrong).

I have come across the LwLRAP metric, which seems to be suited to multi-label tasks. Did you use this metric specifically when fine-tuning your model on FSD50K? Or did you tweak FSD50K to turn it into a single-label dataset?
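For reference, this is how I understand LwLRAP (label-weighted label-ranking average precision): for every positive label of every clip, take the precision at that label's rank, then average over all positive labels. Below is my own dependency-free sketch, not code from the AST repository or the FSD50K baseline; the function name `lwlrap` and the toy arrays are made up for illustration.

```python
def lwlrap(y_true, y_score):
    """Label-weighted label-ranking average precision (illustrative sketch).

    y_true:  list of multi-hot rows (0/1), one row per clip.
    y_score: list of score rows, higher = more confident.
    """
    precisions = []
    for truth, scores in zip(y_true, y_score):
        # Class indices sorted by decreasing score (ties keep index order).
        order = sorted(range(len(scores)), key=lambda c: -scores[c])
        hits = 0
        for rank, c in enumerate(order, start=1):
            if truth[c]:
                hits += 1
                # Precision at the rank of this positive label.
                precisions.append(hits / rank)
    # Every positive label counts once, so clips with more labels weigh more.
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy example: 2 clips, 3 classes.
y_true = [[1, 0, 1], [0, 1, 0]]
y_score = [[0.9, 0.8, 0.1], [0.2, 0.3, 0.9]]
score = lwlrap(y_true, y_score)
```

If I am not mistaken, this should match scikit-learn's `label_ranking_average_precision_score` when each clip is weighted by its number of positive labels.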

And finally, would it be possible to fine-tune AST on my multi-label downstream task from the Hugging Face checkpoint? Does it only require the appropriate label arrays and metric, or is there more to it?
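To make the question concrete, here is what I assume changes compared to the single-label case: targets become multi-hot vectors, the loss becomes a per-class sigmoid plus binary cross-entropy instead of softmax plus cross-entropy, and predictions are obtained by thresholding each class independently. This is only a plain-Python sketch of that assumption; the label names, logits, and the 0.5 threshold are hypothetical, and nothing here is taken from the AST code.

```python
import math

LABELS = ["speech", "dog_bark", "car_engine", "rain"]  # hypothetical label set


def multi_hot(present, labels=LABELS):
    """Encode the sources present in one clip as a 0/1 vector over `labels`."""
    present = set(present)
    return [1.0 if name in present else 0.0 for name in labels]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over classes, each class scored independently.

    Uses the numerically stable form max(z, 0) - z*t + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def predict(logits, threshold=0.5, labels=LABELS):
    """Keep every label whose independent probability reaches the threshold."""
    return [name for name, z in zip(labels, logits) if sigmoid(z) >= threshold]


target = multi_hot(["speech", "rain"])   # one clip with two sources
logits = [2.0, -3.0, -1.0, 0.5]          # made-up model outputs for that clip
loss = bce_with_logits(logits, target)
tags = predict(logits)
```

Is swapping the targets and loss this way enough, or does the classification head of the checkpoint also need to be changed?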

Thanks a lot in advance.

Antoine
