Developing an effective MLOps strategy can lead to better AI outcomes

Machine Learning (ML) has reached a critical inflection point in 2021. Through the pandemic, we have seen a massive acceleration of digitisation, including not only a willingness, but an imperative to improve automation capabilities.

Despite this, we still see a gap in the use of AI; there is a dissonance between the uptake of tools like Alexa, Siri and autonomous cars, and how quickly we are making AI a mainstream business function.

There are a couple of reasons for this. One is that many businesses simply do not properly understand AI and the applications for it, and therefore have no strategy in place to leverage it.

The other is that businesses are invested in the potential for AI but have trouble moving beyond a proof of concept. Getting AI concepts into production is a real challenge because ML models represent both code and data, the various modes of ML are still not completely understood across organisations, and governance structures are still being defined.

The result is that companies end up hiring a group of data scientists who become stuck in a silo, unable to deploy their concepts or, at best, able to do so only at a glacial pace because of unresolved operational questions.

We frequently talk with customers who have made huge investments in expensive data science teams that do amazing work, but then can't find ways to get models into production. These teams can easily take more than nine months to ship a model to production.

The latter problem has led to the rise of what is known as MLOps (Machine Learning Operations), a collaboration practice between data scientists, software engineers and operations engineers to streamline repeatable machine learning end to end, including deployment to production environments.

As with DevOps for software, the critical benefit of MLOps is that it automates the pipeline from idea to production to get ML models out faster. A successful MLOps strategy can also lead to more efficient, productive, accurate, and trusted models by improving testing and reducing the impact of human bias and error.

Understanding the MLOps lifecycle

Similar to DevOps, MLOps is an ‘infinite loop’ process from idea to production, but it introduces two new elements: model engineering and model deployment. This makes it more complex than DevOps for conventional software development. For example, continuous integration (CI) applies not only to testing and validating code and components, but also to data, schemas, and models.

Continuous deployment (CD) refers to the whole system, including deploying another ML-provided service, not just a single software package or service. Finally, continuous training (CT) is unique to ML models and refers to retraining and serving models.
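To make the distinction concrete, here is a minimal Python sketch of what CI checks on data and models, and a CT retraining trigger, might look like. It is an illustration only: the schema, function names and thresholds are all hypothetical and do not come from any particular MLOps tool.

```python
# Hypothetical schema for illustration: column name -> expected Python type.
EXPECTED_SCHEMA = {"age": int, "income": float, "label": int}


def validate_schema(rows, schema=EXPECTED_SCHEMA):
    """CI-style data check: return a list of problems found in the rows."""
    problems = []
    for i, row in enumerate(rows):
        missing = set(schema) - set(row)
        if missing:
            problems.append(f"row {i}: missing {sorted(missing)}")
            continue
        for col, typ in schema.items():
            if not isinstance(row[col], typ):
                problems.append(f"row {i}: {col} is not {typ.__name__}")
    return problems


def model_passes_gate(accuracy, baseline, tolerance=0.01):
    """CI-style model check: a candidate model may not fall more than
    `tolerance` below the baseline accuracy (assumed threshold)."""
    return accuracy >= baseline - tolerance


def needs_retraining(live_accuracy, deployed_accuracy, max_drop=0.05):
    """CT-style trigger: flag retraining when live accuracy has dropped
    by more than `max_drop` since deployment (assumed threshold)."""
    return (deployed_accuracy - live_accuracy) > max_drop
```

In practice, checks like these would run automatically on every change to code, data or model, alongside the usual software tests, rather than being called by hand.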

Find a single source of truth

The other challenge is that there is not yet a single tool that does end-to-end MLOps, but rather a variety of tools that each do a certain part of MLOps really well.

Many of the data science teams we hear from struggle to integrate an endless list of machine learning technologies, from open source projects to expensive ML platforms, into their company's software development lifecycles. This is the same struggle we saw early on in the DevOps space: an explosion of tools and no good way to wrangle them all together.

To manage these complexities, it is important to establish a single source of truth for MLOps that unites people and tools so that ML applications can be deployed more simply, quickly and cheaply.

Ultimately, an effective MLOps workflow unites app developers and data scientists into the same, automated system — one that continually audits and manages model interpretability over time. The level of automation in these processes not only signals organisational maturity, but also success, as the more models produced, the greater the likelihood that ML concepts will reach production.

Additionally, companies that are able to share their knowledge and successes in an open source environment allow other businesses to improve processes and advance their ML maturity faster.

The most successful customers we talk to are data science teams who have built their work on top of their organisation’s existing DevOps platforms, with continuous integration and deployment and live data feeds.

To avoid being caught in a web of complexity and to bring data science teams out of the shadows, an integrated MLOps strategy is essential so the full potential of ML opportunities for business growth can be realised.

Monmayuri Ray is Engineering Manager (Applied Machine Learning and Anti-Abuse) at GitLab