ML Mashups: Creativity in the Past, Present, and Future of Machine Learning

It’s common to think of creativity as conceiving new ideas from nothing, but some of the best examples of creativity come from bringing together previously unconnected ideas. Peanut butter cups provide one tasty example (sorry, I’m writing this at lunchtime), but so do many TikTok videos, which combine published music with user-created videos to create the next viral wonder.

My recent conversation with Paperspace’s Dillon Erb got me thinking about creativity and machine learning in the context of the “mashup economy.” Remember that term?

The term mashup originated in the music industry when disc jockeys (DJ Earworm, anyone?) would combine two or more songs to create a new musical experience. In software, mashups integrate content and function from multiple sources and combine them into a single new service, usually for a new and specific purpose.

Mashups became a popular way of delivering new web applications about 15 years ago and helped define the interactivity and flexibility of Web 2.0. The invention and dissemination of open REST APIs and other technologies made it easier for developers to remix the work of other creators. Mashups helped create the Internet we know and love, where end-users contribute to and interact with the web’s ongoing creation.

Today, we’re seeing similar things happen in the world of machine learning. Dillon called this idea Compositional ML, and it’s certainly been cool to see innovators create new mashups using OPMs (other people’s models). The prevalence of open publishing, public datasets, pre-trained models, open-source code, and readily accessible tools and services is allowing people to take ML models off the shelf, combine them, train or tune them on new data, and remix them to create a totally new invention.

One cool example discussed in the interview is PixRay, one of many interesting “model mashups” that have emerged recently from the AI art scene. PixRay combines OpenAI’s open-source CLIP model with VQGAN and several other ideas and models to generate impressive pixel art images from text prompts.
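To make the "mashup" pattern a bit more concrete, here is a minimal toy sketch of the general idea behind text-guided generation: one component proposes outputs and a second, independent component scores how well each output matches a prompt. This is not PixRay's code and it does not call CLIP or VQGAN. The names `toy_generator`, `toy_scorer`, and `guided_search` are hypothetical stand-ins, and the simple hill-climbing loop is an assumption made purely for illustration.

```python
import math
import random


def toy_generator(latent):
    """Hypothetical stand-in for a generator model: maps a 2-D latent to a 4-D 'image'."""
    a, b = latent
    return [a, b, a + b, a - b]


def toy_scorer(image, prompt_embedding):
    """Hypothetical stand-in for a text-image scorer: cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(image, prompt_embedding))
    norm = math.sqrt(sum(x * x for x in image)) * math.sqrt(
        sum(y * y for y in prompt_embedding)
    )
    return dot / norm if norm else 0.0


def guided_search(prompt_embedding, steps=500, step_size=0.1, seed=0):
    """Hill-climb the latent so the generator's output scores higher against the prompt."""
    rng = random.Random(seed)
    latent = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    initial = best = toy_scorer(toy_generator(latent), prompt_embedding)
    for _ in range(steps):
        # Propose a small random change and keep it only if the scorer prefers it.
        candidate = [v + rng.gauss(0, step_size) for v in latent]
        score = toy_scorer(toy_generator(candidate), prompt_embedding)
        if score > best:
            latent, best = candidate, score
    return latent, initial, best


if __name__ == "__main__":
    # A made-up "prompt embedding" that the toy generator is able to reach.
    latent, initial, best = guided_search([1.0, 0.0, 1.0, 1.0])
    print(f"score before: {initial:.3f}, after: {best:.3f}")
```

The point of the sketch is the division of labor: neither piece knows anything about the other, and either could be swapped for a different off-the-shelf model without changing the loop that ties them together.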

I’m excited about mashups both as an accelerator for continued innovation in ML and as a path to making the technology more accessible to a broader base of users.

Let me know the most interesting model mashups you’ve seen in the comments below!

P.S. If you liked this post, we send thoughts like this out in our weekly newsletter!