The validator validate_cut_set checks that a CutSet contains no duplicated ids. If I understand correctly, this is because CutSet maintains a dict-like interface. My question is this: what should I do to duplicate some of the samples (cuts)? Would it work if I simply commented out the whole validate_cut_set function and duplicated the corresponding lines in cutset.jsonl?
In my experimental setup, the amount of data per speaker varies over a wide range. To achieve a minimum level of balance, the data from speakers with less data needs to be duplicated. Is there a more idiomatic lhotse way to achieve that?
That used to be a constraint but at some point we dropped it. I may have missed that validation still checks for this. If you could make a PR to remove this check it would be greatly appreciated. Thanks.
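Independently of that check, one way to duplicate cuts without touching the validator is to give every copy a fresh id. Below is a minimal sketch that works on the raw manifest records (one JSON object per line of cutset.jsonl) using only the standard library. It assumes each record has an "id" field and that the speaker label sits on the first entry of "supervisions"; the helper names speaker_of and oversample_by_speaker and the "_dupN" suffix are hypothetical, not part of lhotse.

```python
import json
from collections import defaultdict
from itertools import cycle, islice


def speaker_of(cut: dict) -> str:
    # Assumption: the speaker label is stored on the first supervision.
    return cut["supervisions"][0]["speaker"]


def oversample_by_speaker(cuts: list, min_cuts: int) -> list:
    """Repeat cuts of under-represented speakers until each has min_cuts."""
    by_speaker = defaultdict(list)
    for cut in cuts:
        by_speaker[speaker_of(cut)].append(cut)

    balanced = list(cuts)  # keep every original cut, in order
    for speaker_cuts in by_speaker.values():
        missing = max(0, min_cuts - len(speaker_cuts))
        # Cycle through the speaker's cuts so copies are spread evenly.
        for n, cut in enumerate(islice(cycle(speaker_cuts), missing)):
            # Shallow copy with a new id, so no two records share an id.
            balanced.append({**cut, "id": f"{cut['id']}_dup{n}"})
    return balanced


# Stand-in for the lines of a cutset.jsonl file.
lines = [
    '{"id": "a1", "supervisions": [{"speaker": "A"}]}',
    '{"id": "a2", "supervisions": [{"speaker": "A"}]}',
    '{"id": "a3", "supervisions": [{"speaker": "A"}]}',
    '{"id": "b1", "supervisions": [{"speaker": "B"}]}',
]
cuts = [json.loads(line) for line in lines]
balanced = oversample_by_speaker(cuts, min_cuts=3)
balanced_lines = [json.dumps(cut) for cut in balanced]
```

The copies still point at the same audio, so this only changes how often a speaker is seen, not the amount of distinct data. The sketch also assumes no existing id already ends in a "_dupN" suffix.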
t13m added a commit to t13m/lhotse that referenced this issue on Feb 22, 2025