
MLOPS CONCEPT

Problem Background of MLOps

Machine learning applications have become the de facto solution to many problems in our daily lives.
Building these applications is a little different from building standard software for several reasons.
Some of these differences include:

● You need to constantly monitor the performance of models because they may degrade over time
(drifting model performance); a minimal drift-check sketch follows this list.
● Bringing a machine learning model to production involves many people, including data scientists,
data engineers, software engineers, and business people.
● Deploying and operating machine learning models at scale is a big challenge.
● Team skills: in an ML project, the team usually includes data scientists or ML researchers who focus
on exploratory data analysis, data preparation, model training, model evaluation, model validation, and
model serving. These members might not be experienced software engineers who can build
production-class services.
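The monitoring point above can be made concrete with a small sketch: the function below compares the
training-time distribution of one feature against recent production values using a two-sample
Kolmogorov-Smirnov test (scipy.stats.ks_2samp). The feature values and the p-value threshold are
illustrative assumptions, not part of the original material.

# Illustrative sketch: flag drift when the live distribution of a feature
# differs significantly from its training distribution.
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(train_values, live_values, p_threshold=0.05):
    statistic, p_value = ks_2samp(train_values, live_values)
    return p_value < p_threshold

# Hypothetical example: recent traffic has a shifted mean for this feature.
rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)
live_feature = rng.normal(loc=0.4, scale=1.0, size=1000)
print("Drift detected:", detect_drift(train_feature, live_feature))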
Problem Background of MLOps
Data scientists can implement and train an ML model with good predictive performance on an offline holdout dataset,
given relevant training data for their use case. However, the real challenge isn't building an ML model; the
challenge is building an integrated ML system and continuously operating it in production.

A NeurIPS paper on hidden technical debt in ML systems shows that developing models is just a very small part of the
whole process.

As shown in the following diagram, only a small fraction of a real-world ML system is composed of the ML code. The required surrounding
elements are vast and complex.
Problem Background of MLOps

According to analysts, most organizations fail to successfully turn AI-based applications, which were tested on
sample or historical data, into interactive applications that work with real-world, large-scale data.

Organizations tend to put too much emphasis on creating ML models and placing them behind some API endpoint.
This emphasis overlooks the bigger challenges, such as accessing and preparing data in production, integrating the
models with online business applications, monitoring and governing the model's performance, and delivering
continuous improvements.
One of the key challenges is that the data science team often works in a silo and uses manual development
processes, whose outputs then need to be manually converted into production-ready ML pipelines. This requires
separate teams of ML engineers, data engineers, DevOps engineers, and developers to invest additional time and
resources, often much more.

[Diagram: hand-off between the Development Team and the ML Operations Team. The Development Team handles
Data Acquisition, Data Annotation, ML Model Training, and ML Model Building; its output is manually converted
and passed to the ML Operations Team, which handles Packaging, ML Model Validation, Data Validation,
Compilation, Release, Deployment, and Monitoring.]

• The ML scientist does research locally (proof of concept).
• The ML scientist is not a software engineer: the focus is more on math and statistics, so the code is
not well factored and is prone to error.
• The ML code is passed to the engineering/operations team, leaving only about 50% of the time for R&D
and slowing down model updates.
• Different languages, packages, toolsets, and dependencies are used across teams; the code may lack
scalability and sometimes needs to be converted, and others must answer questions about parameters,
packages, logical flow, and classes, leading to deteriorating results.
Problem Background of MLOps
We will have to shift our perspective, not to depreciate scientific principles but to make them more
easily accessible, reproducible, collaborative, and most importantly to increase the speed at which
machine learning capabilities can be released.

The research-oriented data science approach that is currently dominant can no longer prevail. Data
science MUST adopt agile software development practices with micro-services, continuous integration
(CI), continuous delivery (CD), code versioning (Git), and data/configuration/metadata versioning.

ML systems cannot be built in an ad hoc manner, isolated from other IT initiatives like DataOps and
DevOps. They also cannot be built without adopting and applying sound software engineering
practices.
Why use MLOps?
The goal of MLOps is to reduce technical friction to get the model from an idea into production in the shortest
possible time to market with as little risk as possible.

In most real-world applications, business and regulatory requirements can change rapidly, requiring a more frequent
release cycle. On top of that, the underlying data changes constantly, so models need to be retrained, or even whole
pipelines rebuilt, to tackle feature drift. This is where MLOps comes in, combining operational know-how with
machine learning and data science knowledge.

Developing code or models is just the first step. The biggest effort goes into making each element production-ready,
including data collection, preparation, training, serving and monitoring, and enabling each element to run repeatedly with
minimal user intervention.
Machine learning operations (MLOps) is the practice of efficiently developing, testing, deploying, and
maintaining machine learning (ML) applications in production.

MLOps automates and monitors the entire machine learning lifecycle and enables seamless collaboration
across teams, resulting in faster time to production and reproducible results.
What problems does MLOps solve?

MLOps is not about running notebooks in production environments and is not about placing an ML model
behind an API endpoint. MLOps is about building an automated ML production environment from data
collection and preparation to model deployment and monitoring.

Business benefits:
● Deliver business value from data and AI/ML faster
● Increase team’s productivity and eliminate silos
● Provide reliable and reproducible results
● Observe, explain, and improve model behavior and accuracy
How is MLOps different from DevOps and DataOps?

[Diagram: MLOps sits at the intersection of MLDev, DevOps, and DataOps.]

- DevOps is a set of practices that ensures the success of a software project and the delivery of high-quality
software. It provides benefits such as shorter development cycles, higher deployment velocity, and continuous,
dependable releases (CI/CD).

- DataOps involves a set of practices that ensure high-quality data is available for analysis and for training
machine learning models.

In the machine learning space, a model's performance can degrade over time as a result of changes in data or errors
during deployment. MLOps is therefore also concerned with ensuring that an ML model performs optimally at any
point in time, and with ensuring that the model is retrained in the event of poor performance.
MLOps Stages

1. Framing business objectives
2. Searching for relevant data
3. Preparing and processing data (DataOps / data engineering)
4. Developing and training the machine learning model
5. Building and automating the machine learning pipeline
6. Deploying the model
7. Monitoring the model

These stages are supported by three automation practices:

● Validating code, data, data schemas, and models → Continuous Integration
● Delivering ML pipelines → Continuous Delivery
● Retraining the model with updated data → Continuous Training
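A minimal sketch of how these stages can be expressed as discrete, repeatable steps that a pipeline
orchestrator could automate. The dataset, model choice, and accuracy threshold are assumptions made for
illustration; the point is only that each stage becomes a callable unit with an explicit validation gate
before deployment.

# Illustrative sketch: the MLOps stages as small, repeatable pipeline steps.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def prepare_data():
    # Preparing and processing data
    X, y = load_breast_cancer(return_X_y=True)
    return train_test_split(X, y, test_size=0.2, random_state=42)

def train_model(X_train, y_train):
    # Developing and training the machine learning model
    return LogisticRegression(max_iter=5000).fit(X_train, y_train)

def evaluate_model(model, X_test, y_test, threshold=0.9):
    # Model validation gate before deployment
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return accuracy, accuracy >= threshold

def run_pipeline():
    X_train, X_test, y_train, y_test = prepare_data()
    model = train_model(X_train, y_train)
    accuracy, approved = evaluate_model(model, X_test, y_test)
    print(f"accuracy={accuracy:.3f}, approved for deployment={approved}")
    return model if approved else None

if __name__ == "__main__":
    run_pipeline()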
MLOps Components and Characteristics
● Data versioning: the entire dataset used to create a certain model can also be versioned, which ensures
reproducibility in the process of creating models.
● ML metadata store: tracks metadata of model training, for example the model name, parameters, training
data, test data, and metric results (a minimal tracking sketch follows this list).
● Model versioning: versioning models is important because it enables switching between models in real time.
● Model registry: a registry for storing already trained ML models.
● Model serving: uses CD tools for deploying pipelines and models to the target environment.
● Model monitoring: once machine learning models have been deployed, they have to be monitored for model
drift and production skew.
● Model retraining: machine learning models can be retrained for two main reasons: to improve the
performance of the model and when new training data becomes available.
● Feature store: stores the features used in training a machine learning model so that they can be reused
consistently between training and serving.
● CI/CD: continuous integration and continuous deployment in machine learning ensure that high-quality
models are created and deployed often.
● ML pipeline orchestrator: automates the steps of the ML experiments.
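Several of these components (the metadata store, model versioning, and the model registry) can be
illustrated with a short sketch using MLflow, one common but here merely assumed tool choice; the
experiment name, parameters, and dataset are illustrative, not part of the original material.

# Illustrative sketch: tracking training metadata and registering a model with MLflow.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

mlflow.set_experiment("demo-classifier")               # ML metadata store
with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 4}
    model = RandomForestClassifier(**params).fit(X_train, y_train)

    mlflow.log_params(params)                           # parameters
    accuracy = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", accuracy)             # metric results
    mlflow.sklearn.log_model(model, "model",
                             registered_model_name="demo-classifier")  # model registry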
MLOps Maturity

There are three levels of automation, starting from the initial level with manual model
training and deployment, up to running both ML and CI/CD pipelines automatically.

1. Manual process (level 0)
2. ML pipeline automation (level 1)
3. CI/CD pipeline automation (level 2)
Level 0
Manual process. This is a typical data science process, performed at the beginning of implementing ML. This level
has an experimental and iterative nature. Every step in each pipeline, such as data preparation and validation, model
training and testing, is executed manually. The common way of working is to use Rapid Application Development (RAD)
tools, such as Jupyter Notebooks. A minimal sketch of such a manual workflow follows the list of characteristics below.

Characteristics:

● Manual, script-driven, and interactive process
● Disconnection between ML and operations
● Infrequent release iterations
● No CI / no CD
● Deployment refers only to the prediction service
● Lack of active performance monitoring
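As a hedged illustration of what Level 0 looks like in practice, the following is the kind of one-off,
manual training script (often a notebook cell) described above; the dataset, model, and file name are
assumptions made for the sketch.

# Illustrative sketch: a typical Level 0 workflow, where every step is run by hand
# and the trained model is handed over as a file.
import pickle
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Manual data preparation
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Manual training and evaluation
model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# Manual hand-off: the serialized model is passed on for deployment
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)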
Level 1
ML pipeline automation. The next level adds automatic execution of model training: we introduce continuous
training of the model. Whenever new data is available, the process of model retraining is triggered. This level of
automation also includes data and model validation steps.
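A minimal sketch of the continuous-training idea at this level: a trigger function decides whether the
automated training pipeline should run when new data arrives or live quality degrades. The thresholds and
the retrain callback are illustrative assumptions.

# Illustrative sketch: a simple continuous-training trigger for Level 1.
from typing import Callable

def should_retrain(new_rows, live_accuracy, min_new_rows=10000, min_accuracy=0.90):
    # Retrain when enough new data has arrived or live quality has dropped.
    return new_rows >= min_new_rows or live_accuracy < min_accuracy

def continuous_training_step(new_rows, live_accuracy, retrain: Callable[[], None]):
    if should_retrain(new_rows, live_accuracy):
        retrain()  # re-run the automated training pipeline
    else:
        print("No retraining needed.")

# Hypothetical example: live accuracy dropped below the threshold.
continuous_training_step(new_rows=2500, live_accuracy=0.87,
                         retrain=lambda: print("Triggering training pipeline..."))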
Level 2
CI/CD pipeline automation. In the final stage, we introduce a CI/CD system to perform fast and reliable ML model
deployments in production. The core difference from the previous level is that we now automatically build, test, and
deploy the data, the ML model, and the ML training pipeline components.
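To make the CI part concrete, one hedged option is a model quality gate that the CI system runs before any
deployment: the test trains a candidate model and fails the build when the agreed threshold is not met.
The dataset, model, and threshold are assumptions made for illustration.

# Illustrative sketch: a pytest-style quality gate a CI pipeline could run
# before deploying a new model (invoked with pytest).
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def test_candidate_model_meets_quality_gate():
    X, y = load_breast_cancer(return_X_y=True)
    model = LogisticRegression(max_iter=5000)
    mean_accuracy = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    # Fail the CI build if the candidate is below the agreed threshold.
    assert mean_accuracy >= 0.90, f"Model accuracy too low: {mean_accuracy:.3f}"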
Solution: Target Area Development

Vertex AI
Sources:
● Dataset Ops: https://docs.google.com/presentation/d/1OqzCG9Cabx8s4AFUMHCAjvqz5HAXZsm7U0qqOFDu2yA/edit#slide=id.g10c352faa72_0_100
● Training Ops: https://docs.google.com/presentation/d/1mM815WZRfRyl3fDEcsufgAh49y8BhinpPZTpMisCw-k/edit#slide=id.g120d349fd8e_1_63
● Monitoring: https://docs.google.com/presentation/d/1KljgGSqiepvkMZWCYtm7RejuBPgXt47VCA6hcC6_5K0/edit#slide=id.g105f3c1a80c_0_124
● https://ml-ops.org/content/mlops-principles#iterative-incremental-process-in-mlops
● https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
● https://www.freecodecamp.org/news/what-is-mlops-machine-learning-operations-explained/
● https://learn.layer.ai/what-is-mlops/#DevOps_vs_MLOps_vs_Data_Ops
