ML Ops: Key to Accelerated Business Outcomes and Increased Last Mile Adoption
You have just completed a machine learning pilot, tackling a challenging business problem that has stymied your executives for years. As the plaudits roll in, you start thinking about how to move your project from the pilot stage to a full-blown application that will deliver business value to your stakeholders.
Chances are, you reach out to your IT team to help scale your pilot, and are subjected to a barrage of questions about scalability, system performance, integrations, and access privileges: topics you didn't think much about when you set up your pilot. All the while, your stakeholders are pressuring you to go to full scale so they can start seeing a return on their investment.
We at Tredence understand these challenges and can help you overcome them and take your machine learning to the next level. This article is the first in a series on our MLOps capabilities and accelerators. Here we discuss the common challenges in operationalizing machine learning and demystify the Ops component that trips up so many teams. We will then outline our offerings in this space.
Why is machine learning at scale such a big challenge?
Let us start by looking back to the world of software engineering, where large-scale enterprise applications follow a DevOps methodology to drive continuous integration (CI) and continuous deployment (CD) and ensure high software quality. The DevOps process deals with versioned code being pushed through the software engineering value cycle.
Now let's come back to the world of machine learning. One or more data scientists on your team have built a set of ML models, which have been tested in a controlled environment. The models must now be deployed in a production environment, and then continuously monitored, managed and improved.
This is where MLOps comes in. Quoting from Wikipedia, "MLOps (a compound of 'machine learning' and 'operations') is a practice for collaboration and communication between data scientists and operations professionals to help manage production ML (or deep learning) lifecycle. Similar to the DevOps or DataOps approaches, MLOps looks to increase automation and improve the quality of production ML while also focusing on business and regulatory requirements."
MLOps is not just about the moving of versioned code.
It is the moving of versioned code, data and models.
But how good are we at versioning all of these? Data scientists focus on building a best-in-class model, not on documenting their code and model parameters in a standard manner. Inconsistent documentation standards and incomplete information on code and models, especially legacy code that the business now requires to be integrated with new ML models, add up to inconsistent production acceptance criteria. This is the first big challenge in setting up your ML models for scale.
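To make the idea of versioning code, data, and models concrete, here is a minimal, illustrative sketch. It ties a deployment to the exact versions of its three artifacts by recording a content hash of each in a manifest. This is an assumption-laden toy, not a production tool; dedicated systems such as Git, DVC, and MLflow handle this at enterprise scale, and the file names used here are purely hypothetical.

```python
import hashlib
import json
from pathlib import Path


def artifact_version(path: str) -> str:
    """Return a short content hash that uniquely identifies this artifact's version."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return digest[:12]


def snapshot(code_path: str, data_path: str, model_path: str) -> str:
    """Record the exact code, data, and model versions backing one deployment.

    If any artifact changes, its hash changes, so the manifest pinpoints
    exactly which of the three moving parts differs between two releases.
    """
    manifest = {
        "code": artifact_version(code_path),
        "data": artifact_version(data_path),
        "model": artifact_version(model_path),
    }
    return json.dumps(manifest, sort_keys=True)
```

The point of the sketch is the triple: a DevOps pipeline typically versions only the first entry, while an MLOps pipeline must track all three together, because a retrained model or a refreshed training dataset changes production behavior even when not a single line of code has changed.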
Because ML models are created in controlled environments, little thought is given to the computational requirements of deploying them at production scale. The resulting underestimation translates into unexpectedly high compute costs, drastically reducing the ROI of the project.