Model Portability and Compression

Organizations sometimes find that a model must be reduced in size before it can run successfully in its target production environment. Compression and quantization techniques vary widely: removing data from the training sets, removing layers of the neural network, or lowering the floating-point precision used for calculations (for example, from roughly 15 decimal digits of precision to 3). Regardless of the technique used, the goal is the same: reduce the model's size, and therefore its storage, compute, and power costs, while minimizing the loss of accuracy.
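The precision trade-off mentioned above can be sketched with a toy example. This is not any particular framework's quantization API, just an illustration using Python's standard library: a 64-bit float carries roughly 15 decimal digits, while a 16-bit float carries roughly 3, so casting a weight down to half precision cuts its storage by 4x at the cost of a small rounding error.

```python
import struct

# A hypothetical trained weight, stored at full double precision
# (~15 decimal digits).
w = 0.123456789012345

# Pack the same value as a 64-bit and as a 16-bit float to compare
# the storage cost of each representation.
as_fp64 = struct.pack("d", w)  # 8 bytes
as_fp16 = struct.pack("e", w)  # 2 bytes, ~3 decimal digits of precision

# Round-trip through half precision to measure the accuracy loss.
w_quantized = struct.unpack("e", as_fp16)[0]
error = abs(w - w_quantized)

print(len(as_fp64), len(as_fp16))  # storage: 8 bytes vs. 2 bytes
print(error)                       # small, nonzero rounding error
```

Real quantization pipelines apply the same idea across millions of weights at once, and often go further, for example to 8-bit integers, but the storage-versus-accuracy trade-off is the same as in this sketch.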
