v0.2.8 Happy Holidays!
This release contains the following:
- A full-fledged ML data quality improvement workflow using Lance that surfaces model performance insights, detects mislabels, and supports active learning. It also demonstrates an experimental integration with Label Studio.
- Critical bug fix for reading and writing dictionary columns
- Imagenet dataset converter
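For readers new to the ideas behind the data quality workflow, here is a minimal, library-agnostic sketch of mislabel detection and uncertainty-based active learning. It is not the code from the notebook and does not use the Lance API. The function names, the confidence threshold, and the toy probabilities are all hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def flag_possible_mislabels(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.5,
) -> List[int]:
    """Return indices of rows where the model's top prediction disagrees
    with the given label and its confidence is at least `threshold`.

    Hypothetical helper: a simple confidence heuristic, not Lance's method.
    """
    flagged = []
    for i, (row, label) in enumerate(zip(probs, labels)):
        # Index of the class with the highest predicted probability.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != label and row[predicted] >= threshold:
            flagged.append(i)
    return flagged


def least_confident(probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Uncertainty sampling: indices of the k rows whose top-class
    probability is lowest, i.e. where the model is least sure."""
    order = sorted(range(len(probs)), key=lambda i: max(probs[i]))
    return order[:k]


# Toy per-class probabilities for four examples (made-up values).
probs = [
    [0.90, 0.05, 0.05],  # confidently class 0
    [0.10, 0.80, 0.10],  # confidently class 1
    [0.40, 0.35, 0.25],  # uncertain
    [0.05, 0.05, 0.90],  # confidently class 2
]
labels = [0, 2, 0, 2]  # the second label disagrees with the model

suspects = flag_possible_mislabels(probs, labels, threshold=0.7)
to_label = least_confident(probs, k=1)
print("possible mislabels:", suspects)
print("send for labeling:", to_label)
```

In a real workflow the flagged rows would go to a human reviewer, for example through a labeling tool, rather than being relabeled automatically.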
What's Changed
- [BUG] Fix reading version aux data reading and writing by @eddyxu in #384
- [Benchmark] upload scripts for coco / imagenet benchmark dataset by @eddyxu in #385
- Closes #387 by @changhiskhan in #388
- Data quality notebook and associated code by @changhiskhan in #389
- [DUCKDB] Do not build PyTorch by default by @eddyxu in #392
- brew pin python by @changhiskhan in #391
- fix off by one error using negative indices for diff'ing by @changhiskhan in #383
- Fix GHA for duckdb extension by @changhiskhan in #394
- [DUCKDB] Add a Derivative macro by @eddyxu in #393
- [Benchmark] Create imagenet from raw dataset by @eddyxu in #386
- Various fixes for imagenet and fmt changes by @changhiskhan in #396
Full Changelog: v0.2.7...v0.2.8