Most machine learning in use today is supervised learning, which requires labeled training data. Sometimes these labels can be extracted from existing data, but often the labels needed to support ML applications must be created manually. Labeling is often treated as a separate step that happens before the machine learning process itself, but this is changing for two main reasons. First, collecting labeled data is expensive, and integrating labeling into the ML process helps ensure that only the amount and type of data needed to meet the project’s goal is labeled. Second, and perhaps more interesting, techniques like active learning and semi-supervised learning are maturing. These techniques place labeling squarely in the ML loop and use ML itself to help determine which data to label. In any case, efficient labeling requires tooling customized to the data and labels for a specific problem.
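To make the idea of using ML to decide which data to label more concrete, here is a minimal sketch of uncertainty sampling, one common active learning strategy. It is an illustration under stated assumptions, not a prescribed method: the function names, the item ids, and the predicted probabilities are all hypothetical, and a real project would take the probabilities from its own model.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` most uncertain unlabeled items.

    `predictions` maps an item id to the model's predicted class
    probabilities for that item. Higher entropy means the model is
    less sure, so a human label is likely to be more informative.
    """
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled items (made-up numbers).
predictions = {
    "doc_a": [0.98, 0.02],
    "doc_b": [0.55, 0.45],
    "doc_c": [0.80, 0.20],
    "doc_d": [0.50, 0.50],
}

# With a labeling budget of two, pick the two items the model is least sure about.
to_label = select_for_labeling(predictions, budget=2)
print(to_label)  # ['doc_d', 'doc_b']
```

In a full loop, the selected items would be sent to annotators, the new labels added to the training set, the model retrained, and the selection repeated until the project's goal is met or the labeling budget runs out.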