Data Labeling

Most machine learning in use today is supervised learning, which requires labeled training data. Sometimes these labels can be extracted from existing data, but often the labels needed to support ML applications must be created manually. Labeling is often treated as a separate step that happens before the machine learning process itself, but this is changing for two reasons. First, collecting labeled data is very expensive, and integrating labeling into the ML process helps ensure that only the amount and type of data needed to meet the project’s goal is labeled. Second, and perhaps more interesting, techniques like active learning and semi-supervised learning are maturing. These techniques place labeling squarely in the ML loop and use ML itself to determine which data to label. In either case, efficient labeling requires tooling customized to the data and labels for a specific problem.
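To make the idea of using ML to decide which data to label more concrete, here is a minimal sketch of one common active learning strategy, uncertainty sampling. It assumes a model has already produced class probabilities for a pool of unlabeled examples. The function names (`entropy`, `select_for_labeling`), the `budget` parameter, and the sample probabilities are all hypothetical and chosen for illustration; they do not come from any of the tools listed below.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution (higher means less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return the indices of the `budget` examples the model is least sure about.

    predictions: list of per-example class probability lists (hypothetical model output).
    budget: how many examples we can afford to send to human labelers.
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:budget]


# Illustrative model outputs for four unlabeled examples (made-up numbers).
predictions = [
    [0.98, 0.02],  # confident
    [0.55, 0.45],  # uncertain
    [0.80, 0.20],  # fairly confident
    [0.50, 0.50],  # maximally uncertain
]

to_label = select_for_labeling(predictions, budget=2)
print(to_label)  # [3, 1]
```

In a full loop, the selected examples would be labeled by people, added to the training set, and the model retrained before the next selection round. Entropy is only one possible uncertainty measure; margin-based or committee-based scores are common alternatives.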

Snorkel Flow
A radically faster approach to building and deploying AI applications
Scale AI
Our API provides access to human-powered data for hundreds of use cases
Find the smart data inside your big data
Google Vertex AI
Fully managed, end-to-end platform for data science and machine learning
Azure Machine Learning
Enterprise-grade machine learning service to build and deploy models faster
Amazon SageMaker
Machine learning for every developer and data scientist