ML Pipelines or Workflows

As an organization’s use of data science and machine learning matures, business and technical stakeholders alike benefit from a unified ML workflow, with a common framework for working with the organization’s data, experiments, models, and tools. The benefits of a common platform apply at every stage of that workflow. A unified view of data helps data scientists find the data they need to build models more quickly. A unified view of experiments helps individual users and teams identify more quickly what is working, and helps managers understand how resources are being allocated. A unified view of deployed models helps operations and DevOps teams monitor performance across a wide fleet of services. Finally, a unified view of infrastructure helps data scientists more readily access the resources they need for training.

With a unified approach to the machine learning workflow, it becomes much easier to facilitate and manage cross-team collaboration, promote the reuse of existing resources, and take advantage of shared skills. It also enables individual team members to more quickly become productive when transitioning to new projects, driving increased efficiency and better overall outcomes for data science and ML/AI projects.

Although features vary by vendor, most workflow systems should:

  • Assist with maintaining parity between development and production environments;
  • Provide version control of assets and artifacts;
  • Automate different steps of the ML workflow;
  • Allow for the easy creation, visualization, and management of the workflows; and
  • Record all actions and inputs in the system, such as training data, platform configurations, and model parameters, to support auditability and reporting.
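
To make the versioning, automation, and record-keeping capabilities listed above more concrete, the following is a minimal sketch in Python. It is not any vendor's API: the `Pipeline` class, the `fingerprint` helper, and the `clean` and `train` steps are all hypothetical names introduced here for illustration. The sketch assumes that each step is a plain function, and that a content hash of a step's input and output is an acceptable stand-in for a version identifier.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


def fingerprint(obj: Any) -> str:
    """Return a short, stable content hash used here as a version identifier."""
    payload = json.dumps(obj, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class Pipeline:
    """Hypothetical minimal workflow: ordered steps plus an audit log."""
    steps: List[tuple] = field(default_factory=list)
    audit_log: List[Dict[str, Any]] = field(default_factory=list)

    def add_step(self, name, func, **params):
        # Register a step together with the parameters it will run with.
        self.steps.append((name, func, params))
        return self

    def run(self, data):
        # Run every step in order, recording what went in and what came out.
        for name, func, params in self.steps:
            input_version = fingerprint(data)
            data = func(data, **params)
            self.audit_log.append({
                "step": name,
                "params": params,
                "input_version": input_version,
                "output_version": fingerprint(data),
            })
        return data


def clean(rows, min_value):
    # Illustrative data-preparation step: drop values below a threshold.
    return [r for r in rows if r >= min_value]


def train(rows, scale):
    # Illustrative "training" step: a scaled mean stands in for a model.
    return {"weight": scale * sum(rows) / len(rows)}


pipeline = (
    Pipeline()
    .add_step("clean", clean, min_value=0)
    .add_step("train", train, scale=2.0)
)
model = pipeline.run([3, -1, 5, 4])

for entry in pipeline.audit_log:
    print(entry)
```

Each audit entry ties a step's parameters to the versions of the data it consumed and produced, so a given model can be traced back to its inputs. A production workflow system would typically persist this log and the artifacts themselves rather than keeping them in memory.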
