- cleanlab: finding label errors in datasets and learning with noisy labels. https://pypi.org/project/cleanlab/ (code: https://github.com/cgnorthcutt/cleanlab)
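A minimal usage sketch, assuming cleanlab >= 2.0 (where `cleanlab.filter.find_label_issues` exists) and scikit-learn for out-of-sample predicted probabilities; the toy data and the logistic-regression model are illustrative, not part of the library's docs.

```python
# Flag likely label errors with cleanlab, using out-of-sample
# predicted probabilities from cross-validation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from cleanlab.filter import find_label_issues

X = np.random.randn(1000, 20)           # toy features (illustrative)
labels = np.random.randint(0, 3, 1000)  # possibly noisy labels

# Out-of-sample class probabilities, required by cleanlab.
pred_probs = cross_val_predict(
    LogisticRegression(max_iter=1000), X, labels,
    cv=5, method="predict_proba",
)

# Indices of examples whose given label looks wrong, most suspicious first.
issues = find_label_issues(
    labels=labels, pred_probs=pred_probs,
    return_indices_ranked_by="self_confidence",
)
print(issues[:10])
```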
- Data Noising as Smoothing in Neural Network Language Models. ICLR 2017
Noise Contrastive Estimation
- Noise-contrastive estimation: A new estimation principle for unnormalized statistical models (Gutmann & Hyvärinen, AISTATS 2010)
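To make the principle concrete, here is a toy PyTorch sketch of one common NCE formulation (k noise samples per data point): fit the parameters and the normalizing constant of an unnormalized 1-D Gaussian by logistic discrimination against standard-normal noise. All names (`log_model`, `log_noise`, `k`) and the toy setup are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

# Unnormalized model: log p_m(x) = -0.5 * tau * (x - mu)^2 + c,
# where c is a *learned* log normalizing constant (the point of NCE).
mu = torch.zeros(1, requires_grad=True)
log_tau = torch.zeros(1, requires_grad=True)
c = torch.zeros(1, requires_grad=True)

def log_model(x):
    return -0.5 * log_tau.exp() * (x - mu) ** 2 + c

def log_noise(x):  # known noise density: standard normal
    return -0.5 * x ** 2 - 0.5 * torch.log(torch.tensor(2 * torch.pi))

k = 10                                  # noise samples per data point
x_data = 2.0 + 0.5 * torch.randn(512)   # "true" data: N(2, 0.5^2)
opt = torch.optim.Adam([mu, log_tau, c], lr=0.05)

for step in range(500):
    x_noise = torch.randn(512 * k)      # fresh noise each step
    log_k = torch.log(torch.tensor(float(k)))
    # Classifier logit G(u) = log p_m(u) - log(k * p_n(u)):
    # the log-odds that u came from the data rather than the noise.
    g_data = log_model(x_data) - log_noise(x_data) - log_k
    g_noise = log_model(x_noise) - log_noise(x_noise) - log_k
    # NCE loss: data labeled 1, noise labeled 0.
    loss = -(F.logsigmoid(g_data).mean() + k * F.logsigmoid(-g_noise).mean())
    opt.zero_grad(); loss.backward(); opt.step()

print(mu.item(), log_tau.exp().rsqrt().item())  # should approach 2.0 and 0.5
```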
- Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks (Rao et al., CIKM 2016). Three negative-sampling strategies:
- Random Sampling. We randomly select a number of negative samples for each positive answer.
- Max Sampling. We select the most difficult negative samples. In each epoch, we compute the similarities between all (p+, p−) pairs with the model trained in the previous epoch, then pick the negative answers with the highest similarity to the positive answer.
- Mix Sampling. We take advantage of both random sampling and max sampling by selecting half of the samples with each strategy, as sketched below.
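A minimal sketch of the three strategies. Cosine similarity over embeddings stands in for the paper's model score, and all names and shapes (`sample_negatives`, `pos_emb`, `neg_embs`) are illustrative assumptions, not the authors' code.

```python
import torch

def sample_negatives(pos_emb, neg_embs, n, strategy="mix"):
    """pos_emb: (d,) embedding of the positive answer;
    neg_embs: (N, d) embeddings of the candidate negatives;
    returns indices of n selected negatives."""
    if strategy == "random":
        # Random sampling: uniform choice from the negative pool.
        return torch.randperm(neg_embs.size(0))[:n]
    # Similarity of every negative to the positive, under the
    # (stand-in) previous-epoch model.
    sims = torch.nn.functional.cosine_similarity(
        neg_embs, pos_emb.unsqueeze(0), dim=1)
    if strategy == "max":
        # Max sampling: hardest negatives = highest similarity.
        return sims.topk(n).indices
    if strategy == "mix":
        # Mix sampling: half hardest, half random
        # (possible duplicates between halves ignored for brevity).
        half = n // 2
        hard = sims.topk(half).indices
        rand = torch.randperm(neg_embs.size(0))[: n - half]
        return torch.cat([hard, rand])
    raise ValueError(strategy)

# Usage: pick 8 negatives for one positive answer.
pos = torch.randn(64)
pool = torch.randn(1000, 64)
idx = sample_negatives(pos, pool, 8, strategy="mix")
```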