Sunday, April 4, 2021

Machine Learning: Cleanlab

 https://github.com/cgnorthcutt/cleanlab


cleanlab is a Python package for machine learning with noisy labels. It cleans labels and supports finding, quantifying, and learning with label errors in datasets.

cleanlab is powered by confident learning (CL), described in the paper "Confident Learning: Estimating Uncertainty in Dataset Labels" and an accompanying blog post.
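To make the idea concrete, here is a minimal sketch of the thresholding step behind confident learning, written in plain Python. It is a simplified illustration under stated assumptions, not cleanlab's implementation or API: the function name `find_label_issues_sketch` and the toy data are hypothetical, and the predicted probabilities are assumed to come from a model evaluated out-of-sample (for example, via cross-validation). An example is flagged when its most probable "confident" class differs from its given label, where a class counts as confident if its probability meets that class's average self-confidence.

```python
def find_label_issues_sketch(labels, pred_probs):
    """Return indices of examples whose given label looks wrong.

    Hypothetical helper for illustration only; not part of cleanlab.
    labels:     list of integer class labels (possibly noisy)
    pred_probs: list of per-class probability lists, one per example
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: the mean predicted probability of class k
    # over the examples that are labeled k.
    thresholds = []
    for k in range(n_classes):
        probs_k = [p[k] for p, y in zip(pred_probs, labels) if y == k]
        # A class with no labeled examples can never be "confident".
        thresholds.append(sum(probs_k) / len(probs_k) if probs_k else float("inf"))

    issues = []
    for i, (p, y) in enumerate(zip(pred_probs, labels)):
        # Classes whose probability meets that class's threshold.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        if not confident:
            continue  # no confident class, so make no claim about this example
        best = max(confident, key=lambda k: p[k])
        if best != y:
            issues.append(i)
    return issues


# Toy data: example 2 is labeled 0 but the model strongly predicts class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.15, 0.85],
    [0.10, 0.90],
    [0.30, 0.70],
]
print(find_label_issues_sketch(labels, pred_probs))  # prints [2]
```

Using per-class thresholds rather than a single global cutoff is what lets this kind of check tolerate classes on which the model is systematically more or less confident.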


  • News! (Mar 2021) cleanlab supports an ICLR workshop paper (Northcutt, Athalye, & Mueller, 2021) by finding label errors across 10 common benchmark datasets (ImageNet, CIFAR-10, CIFAR-100, Caltech-256, Quickdraw, MNIST, Amazon Reviews, IMDB, 20 News Groups, AudioSet). Along with the paper, the authors launched labelerrors.com, where you can view the label errors in these datasets.
  • News! (Dec 2020) cleanlab supports a NeurIPS workshop paper (Northcutt, Athalye, & Lin, 2020).
  • News! (Dec 2020) cleanlab supports PU learning.
  • News! (Jan 2020) cleanlab achieves state-of-the-art on CIFAR-10 for learning with noisy labels. Code to reproduce the results is here: examples/cifar10. This is a great place for newcomers to see how to use cleanlab on real datasets. The required data is available in the confidentlearning-reproduce repo; cleanlab v0.1.0 reproduces the results in the CL paper.
  • News! (Feb 2020) cleanlab now natively supports Mac, Linux, and Windows.
  • News! (Feb 2020) cleanlab now supports Co-Teaching (Han et al., 2018).

 
