The Anatomy of a Production-Scale Continuously-Training Machine Learning Platform

Denis Baylor (Google Inc.)
Eric Breck (Google Inc.)
Heng-Tze Cheng (Google Inc.)
Noah Fiedel (Google Inc.)
Chuan Yu Foo (Google Inc.)
Zakaria Haque (Google Inc.)
Salem Haykal (Google Inc.)
Mustafa Ispir (Google Inc.)
Vihan Jain (Google Inc.)
Levent Koc (Google Inc.)
Chiu Yuen Koo (Google Inc.)
Lukasz Lew (Google Inc.)
Clemens Mewald (Google Inc.)
Akshay Modi (Google Inc.)
Neoklis Polyzotis (Google Inc.)
Sukriti Ramesh (Google Inc.)
Sudip Roy (Google Inc.)
Steven Whang (Google Inc.)
Martin Wicke (Google Inc.)
Jarek Wilkiewicz (Google Inc.)
Xin Zhang (Google Inc.)
Martin Zinkevich (Google Inc.)

Creating and maintaining a platform for reliably producing and deploying machine learning models requires careful orchestration of many components: a learner for generating models based on training data, modules for analyzing and validating both data and models, and finally infrastructure for serving models in production. This becomes particularly challenging when data changes over time and fresh models need to be produced continuously. Unfortunately, such orchestration is often done ad hoc using glue code and custom scripts developed by individual teams for specific use cases, leading to duplicated effort and fragile systems with high technical debt. We present the anatomy of a general-purpose machine learning platform and one implementation of such a platform at Google. By integrating the aforementioned components into one platform, we were able to standardize the components, simplify the platform configuration, and reduce the time to production from the order of months to weeks, while providing platform stability that minimizes service disruptions. We present the case study of one deployment of the platform in the Google Play app store, where the machine learning models are refreshed continuously as new data arrive. Deploying the platform led to reduced custom code, faster experiment cycles, and a 2% increase in app installs resulting from improved data and model analysis.
