
Tasks in MLops

Prof. Shichkina Yulia

Head of Department "Artificial Intelligence in Medicine and Physiology",

Saint Petersburg, 2023


Cognitive architecture for co-evolutionary hybrid intelligence

Kirill Krinkin, Yulia Shichkina. Cognitive Architecture for Co-Evolutionary Hybrid Intelligence. Presented at AGI-2022.
CHI example
Pipeline concept

A pipeline is a description of a machine learning (ML) workflow, including all of the components in the workflow and how the components relate to each other in the form of a graph.
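
The definition above can be illustrated with a minimal sketch in plain Python. It is not Kubeflow code: the component functions (`preprocess`, `transform`, `train`), the `PIPELINE` dictionary, and `run_pipeline` are hypothetical names chosen for illustration. The sketch assumes Python 3.9+ for the standard-library `graphlib` module and only shows the idea of a workflow described as a graph and executed in dependency order.

```python
from graphlib import TopologicalSorter

# Hypothetical components: plain functions standing in for containerized steps.
def preprocess(raw):
    # Drop missing values.
    return [x for x in raw if x is not None]

def transform(clean):
    # Scale every value by 2.
    return [x * 2 for x in clean]

def train(features):
    # Stand-in for model training: the "model" is just the mean of the features.
    return sum(features) / len(features)

# Pipeline description: component name -> (function, names of upstream components).
PIPELINE = {
    "preprocess": (preprocess, []),
    "transform": (transform, ["preprocess"]),
    "train": (train, ["transform"]),
}

def run_pipeline(pipeline, pipeline_input):
    # graphlib expects a mapping: node -> set of predecessor nodes.
    graph = {name: set(deps) for name, (_, deps) in pipeline.items()}
    outputs = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        func, deps = pipeline[name]
        # Components without upstream dependencies receive the pipeline input.
        args = [outputs[d] for d in deps] if deps else [pipeline_input]
        outputs[name] = func(*args)
    return outputs

results = run_pipeline(PIPELINE, [1, None, 2, 3])
print(results["train"])  # 4.0
```

In a real pipeline each function would run in its own container; here the dictionary of outputs plays the role of the data passed between components.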
Important:
• data can be passed between the components
• components can execute multiple times in loops
• components can execute conditionally after resolving an if/else

Pipeline configuration includes:
• definition of the inputs
• the inputs and outputs of each component

A pipeline component is a self-contained set of code that performs one step in the ML workflow (pipeline), such as:
• data preprocessing
• data transformation
• model training
• …

Run a pipeline:
• system launches one or more Kubernetes Pods
• Pods start Docker containers
• containers in turn start programs

Passing data between components:

Small data:
Component 1 (component Docker image): Output data -> serialized
Component 2 (component Docker image): Input data -> deserialized

Big data:
Component 1 (component Docker image): Output data -> file (‘OutputPath’)
Data file
Component 2 (component Docker image): file (‘InputPath’) -> Input data
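
The two data-passing modes on this slide can be sketched in plain Python. This is an illustration under assumptions, not Kubeflow code: the function names are hypothetical, JSON is assumed as the serialization format for small data, and ordinary string paths stand in for the file locations that the slide labels ‘OutputPath’ and ‘InputPath’.

```python
import json
import os
import tempfile

# Small data: the producer's output is serialized to a string,
# and the consumer deserializes it again.
def produce_small():
    return json.dumps({"accuracy": 0.9, "epochs": 3})

def consume_small(serialized):
    params = json.loads(serialized)
    return params["epochs"]

# Big data: the producer writes to a file path it is given,
# and the consumer reads from that same path.
def produce_big(output_path):
    with open(output_path, "w", encoding="utf-8") as f:
        for i in range(1000):
            f.write(f"{i}\n")

def consume_big(input_path):
    with open(input_path, encoding="utf-8") as f:
        return sum(int(line) for line in f)

epochs = consume_small(produce_small())

# A temporary directory stands in for the storage shared between containers.
with tempfile.TemporaryDirectory() as workdir:
    data_file = os.path.join(workdir, "data.txt")
    produce_big(data_file)
    total = consume_big(data_file)

print(epochs, total)  # 3 499500
```

The design point is that the consumer of big data never receives the values themselves, only the location of the file that holds them.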
Pipeline examples

Sources:
https://qiita.com/oguogura/items/32fcaaa7ece2ab868e81
https://cloud.google.com/blog/products/ai-machine-learning/getting-started-kubeflow-pipelines
Factors impacting the structure of the Pipeline

» Components can be executed in parallel or sequentially
» Components can have different execution times
» Components can be dependent or independent
» Components with the same or different sets of operations, and with the same or different input data, can be executed in parallel
» Components can be executed depending on a certain condition
» There are restrictions on the use of the conditional operator for components
» Components can be combined into groups
» Components can have different run times
» Data can be transferred in different ways (transfer via external storage, as an artifact, as a command line argument)

These and other conditions affect the component structure and the structure of the pipelines.

[Figure: graph of "Code" blocks annotated with the questions "In parallel?", "Combine?", "Make a group?", "Conditional execution is possible?"]
Information graph
Information graph (algorithm graph): a directed graph consisting of vertices corresponding to algorithm operations and directed arcs corresponding to data transfer between them (results of some operations are passed as arguments to other operations). It should not be confused with the program control graph or a flowchart.

P1(p1,…,pk)
P2(p1,…,pk)

Example:
p1 = code execution time
p2 = frequency of code changes
p3 = necessity of execution for different data

[Figure: graph of "Code" blocks annotated with the questions "In parallel?", "Combine?", "Make a group?", "Conditional execution is possible?"]
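
One use of an information graph is to see which operations may run in parallel. The sketch below is a minimal illustration, assuming the graph is acyclic and stored as a dictionary from each operation to the operations whose results it consumes. The graph contents and the name `parallel_levels` are hypothetical. Operations placed in the same level have no data dependency on each other.

```python
from collections import defaultdict

# Hypothetical information graph: operation -> operations whose results it consumes.
INFO_GRAPH = {
    "load": [],
    "clean": ["load"],
    "features_a": ["clean"],
    "features_b": ["clean"],
    "train": ["features_a", "features_b"],
}

def parallel_levels(graph):
    """Group operations into levels; assumes the graph has no cycles."""
    level = {}

    def depth(node):
        # Level = 1 + the deepest predecessor; sources get level 0.
        if node not in level:
            level[node] = 1 + max((depth(p) for p in graph[node]), default=-1)
        return level[node]

    for node in graph:
        depth(node)

    tiers = defaultdict(list)
    for node, d in level.items():
        tiers[d].append(node)
    # Sort names inside each level so the result is deterministic.
    return [sorted(tiers[d]) for d in sorted(tiers)]

levels = parallel_levels(INFO_GRAPH)
print(levels)
# [['load'], ['clean'], ['features_a', 'features_b'], ['train']]
```

Here `features_a` and `features_b` share a level, so they are candidates for parallel execution, while the single-operation levels must run one after another.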
Tasks
Lightweight containers

Tasks from simple to complex

1. Recommendations for optimizing a Kubernetes graph-based pipeline and user-entered parameters
2. Rebuild Python code of the Jupyter Pipeline based on the Kubernetes graph and user input parameters
3. Automatic adaptation of Python code from Jupyter to the Pipeline

Step 0:
1. Install Kubeflow
2. Use simple examples to learn combinations of component execution:
   1. Independent
   2. Sequential
   3. Parallel
   4. Grouped
   5. Conditional
3. Use simple examples to learn how to transfer data

Step 1:
1. Explore built-in optimization techniques
2. Explore existing approaches

Step 2 and following steps:
1. Consolidating sequential containers
2. Organizing parallel containers
3. Organizing group containers
4. ……
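
The step "Consolidating sequential containers" can be illustrated with a small graph transformation. The sketch below shows one possible approach, not a method prescribed by the slides: it merges every strictly linear chain of components, where a component has exactly one successor and that successor has exactly one predecessor, into a single group. The function name `consolidate_chains` and the example graph are hypothetical.

```python
def consolidate_chains(preds):
    """Merge linear chains of a DAG into groups.

    preds maps each node to the list of its predecessors.
    Returns a list of groups, each a list of nodes in execution order.
    """
    # Derive the successor lists from the predecessor lists.
    succs = {node: [] for node in preds}
    for node, parents in preds.items():
        for parent in parents:
            succs[parent].append(node)

    def is_chain_link(u, v):
        # u -> v can be merged if v is u's only successor
        # and u is v's only predecessor.
        return len(succs[u]) == 1 and len(preds[v]) == 1

    groups = []
    for node, parents in preds.items():
        # Skip nodes that get absorbed into a group started upstream.
        if len(parents) == 1 and is_chain_link(parents[0], node):
            continue
        group = [node]
        # Extend the group while the tail has a mergeable successor.
        while len(succs[group[-1]]) == 1 and is_chain_link(group[-1], succs[group[-1]][0]):
            group.append(succs[group[-1]][0])
        groups.append(group)
    return groups

# Hypothetical pipeline: a linear prefix, a parallel section, a linear suffix.
PREDS = {
    "load": [],
    "clean": ["load"],
    "split": ["clean"],
    "a": ["split"],
    "b": ["split"],
    "train": ["a", "b"],
    "evaluate": ["train"],
}

groups = consolidate_chains(PREDS)
print(groups)
# [['load', 'clean', 'split'], ['a'], ['b'], ['train', 'evaluate']]
```

Seven components collapse into four groups, while the parallel branches `a` and `b` stay separate. Whether merging is worthwhile in practice would also depend on parameters such as execution time and frequency of code changes, which this sketch ignores.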
Install Kubeflow
Kubeflow is an open-source platform for machine learning and MLOps on Kubernetes, introduced by Google.

Designed for:
• data exploration,
• feature construction,
• feature transformation,
• model training,
• model evaluation,
• model fine-tuning,
• model delivery
• model versioning

Minimum installation kit:
• Kubernetes cluster or Kubernetes cluster emulation
   • Minikube: allows working with containers locally
   • Microk8s
   • Kind
   • K3s
   ✓ WSL (Windows Subsystem for Linux)
   ✓ Docker Desktop for Windows
      An application for working with and sharing containers and microservices. It provides a simple graphical user interface for managing containers, applications and images directly from the user's computer.
   ✓ Hypervisor
      A program that provides simultaneous, parallel execution of multiple operating systems on the same host computer. The hypervisor also provides isolation of operating systems from each other, protection and security, resource sharing between different running operating systems, and resource management.
      ➢ Hyperkit
      ➢ VirtualBox
      ➢ kvm2 on Linux
      ➢ Hyper-V on Windows
• kubectl: command line tool for cluster management
• Kubeflow
• Jupyter

The installation is explained well in this video:
https://www.youtube.com/watch?v=LSvvIt2m1Jo
The documentation referenced in the video is here:
https://www.kubeflow.org/docs/components/pipelines/v1/installation/localcluster-deployment/
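
Before following the installation guide, it can help to check which command-line tools from the kit are already available. The sketch below is a convenience script under assumptions: it assumes the executables are named `docker`, `minikube` and `kubectl` and only checks that they are on the `PATH`. It does not verify versions or that a cluster is running.

```python
import shutil

# Assumed executable names for the tools in the installation kit.
REQUIRED_TOOLS = ["docker", "minikube", "kubectl"]

def missing_tools(tools, which=shutil.which):
    # shutil.which returns None when an executable is not found on PATH.
    # The `which` parameter exists only to make the function easy to test.
    return [tool for tool in tools if which(tool) is None]

missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools are available.")
```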
Shichkina Yulia

Contacts:
shichkina@co-evolution.ai
89819636645 - WhatsApp
