Zero-Shot Auto-Labeling: The End of Annotation for Computer Vision with Jason Corso
EPISODE 735 | JUNE 10, 2025
About this Episode
Today, we're joined by Jason Corso, co-founder of Voxel51 and professor at the University of Michigan, to explore automated labeling in computer vision. Jason introduces FiftyOne, an open-source platform for visualizing datasets, analyzing models, and improving data quality. We focus on Voxel51’s recent research report, “Zero-shot auto-labeling rivals human performance,” which demonstrates how zero-shot auto-labeling with foundation models can yield significant cost and time savings compared to traditional human annotation. Jason explains how auto-labels, despite being "noisier" at lower confidence thresholds, can lead to better downstream model performance. We also cover Voxel51's "verified auto-labeling" approach, which uses a "stoplight" QA workflow (green, yellow, red light) to minimize human review. Finally, we discuss the challenges of handling decision boundary uncertainty and out-of-domain classes, the differences between synthetic data generation in vision and language domains, and the potential of agentic labeling.
About the Guest
Jason Corso
Voxel51; University of Michigan
Resources
- Zero-shot auto-labeling rivals human performance
- Auto-Labeling Data for Object Detection
- Voxel51 Research Reveals Auto-Labeling Achieves up to 95% of Human-Level Performance While Cutting Costs by 100,000x
- Class-wise Autoencoders Measure Classification Difficulty And Detect Label Mistakes
- Class-wise Autoencoders Measure Classification Difficulty and Detect Label Mistakes (GitHub)
- Voxel51
- Voxel51 FiftyOne (GitHub)
- FiftyOne documentation
- YOLOE: Real-Time Seeing Anything
- YOLO-World Model
- Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
- Annotation is dead
- Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
