Debugging traditional software is a well-understood process with well-known tools and workflows; the industry collectively has decades of experience debugging code. Machine learning is different. The computers are effectively programming themselves with data, producing probabilistic logic that is not testable in the same way as code. Without good debugging tools, the time and cost of training ML models stay high: teams take too long either to reach the level of accuracy they need or to identify the issues behind inaccurate predictions.
An ML system consists of datasets, model architecture, model weights, algorithm parameters, and more. Models can perform poorly for many reasons: features that lack predictive power; hyperparameter values that are suboptimal; data that contains errors or anomalies; buggy feature engineering code; and many others. A further complication is the time it takes to run a single experiment by training a model and verifying the results. Longer iteration cycles and a larger space of possible errors make debugging ML models a fundamentally different challenge.
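Some of these failure modes can be caught before a training run even starts. As a rough illustration (a minimal sketch, not from any particular tool; the function name and checks are assumptions), a pre-training data audit might flag NaNs, infinities, and constant columns that can silently degrade a model:

```python
import numpy as np

def basic_data_checks(X: np.ndarray) -> list:
    """Flag common data problems before spending time on a training run."""
    issues = []
    if np.isnan(X).any():
        issues.append("NaN values present")
    if np.isinf(X).any():
        issues.append("infinite values present")
    # A constant column has zero variance and no predictive power.
    constant_cols = np.where(X.std(axis=0) == 0)[0]
    if constant_cols.size:
        issues.append(f"constant columns: {constant_cols.tolist()}")
    return issues

# Example: column 0 has a missing value, column 1 is constant.
X = np.array([[1.0, 5.0], [2.0, 5.0], [np.nan, 5.0]])
print(basic_data_checks(X))
```

Checks like these are cheap relative to a training run, which is exactly why catching data errors early matters when each experiment is expensive.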
ML debugging is effectively a new discipline that attempts to test ML models, probe their responses and decision boundaries, and vet them for accuracy, fairness, security, and other risk factors.
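To make "probing decision boundaries" concrete, here is one simple way to do it (a hypothetical sketch; the helper name, the stand-in model, and the sweep parameters are all assumptions): hold an input fixed, sweep one feature across a range, and record where the predicted class flips.

```python
import numpy as np

def find_boundary(model, x, feature, lo, hi, steps=200):
    """Sweep one feature of input x across [lo, hi] and return the
    intervals where the model's predicted class flips -- a crude
    one-dimensional probe of the decision boundary."""
    grid = np.linspace(lo, hi, steps)
    preds = []
    for v in grid:
        probe = x.copy()
        probe[feature] = v
        preds.append(model(probe))
    preds = np.array(preds)
    flips = np.where(preds[1:] != preds[:-1])[0]
    return [(grid[i], grid[i + 1]) for i in flips]

# Stand-in model: predict class 1 when 2*x0 + x1 > 3.
model = lambda x: int(2 * x[0] + x[1] > 3)
x = np.array([0.0, 1.0])
# With x1 fixed at 1.0, the boundary in x0 sits at 2*x0 + 1 = 3, i.e. x0 = 1.
print(find_boundary(model, x, feature=0, lo=0.0, hi=2.0))
```

A probe like this treats the model as a black box, so the same idea applies whether the model is a linear classifier or a deep network; unexpected extra flips in a sweep are a useful signal that the boundary is more irregular than intended.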
What is now starting to emerge is a class of tools purpose-built for the needs of ML practitioners, letting them do the following: