REPORT_Hazem_CHAABI_rev
Option :
SYSTIC
Topic :
Done by :
Hazem CHAABI
Supervisors :
Ms. Rim Barrak – Sup’Com
Ms. Helene Coullon – IMT-Atlantique
M. Daniel Balouek-Thomert – University of Utah
Contents
List of Figures
List of Tables
General Introduction 2
Chapter 2 Background 7
2.1 Introduction 7
2.2 Edge-to-Cloud Continuum 7
2.3 Urgent Computing 8
2.4 Dynamic Reconfiguration 8
2.5 Use case: Canny Edge Detection 8
2.6 Software Product Line 10
2.7 Feature Model 10
2.8 Monolith VS Micro-services 12
2.9 Software Environment 13
2.10 Conclusion 14
General Conclusion 33
Bibliography 34
List of Figures
5.1 Before and after applying the Canny Edge Detection algorithm on an image 23
5.2 The CED workflow micro-services architecture 23
5.3 The feature model of the CED micro-services 24
5.4 Box plot of the execution time of CED in Edge and Cloud paradigms 26
5.5 Scatter plot showing the computing time in each computing paradigm, colored by the smoothening filter used 26
5.6 Train/test split 27
5.7 Predicting the computing paradigm parameter 28
5.8 Predicting the operator parameter 28
5.9 Predicting the smoothening filter parameter 29
5.10 Sequence diagram of the dynamic deployment scenario 30
5.11 XGBoost accuracy on the operator, smoothening filter and computing paradigm parameters 31
List of Tables
Abbreviations List
• G5k: Grid’5000
• K8s: Kubernetes
General Introduction
In this internship, we consider a new breed of urgent smart services that leverage the IoT-to-Cloud Continuum together with recent advances in Artificial Intelligence and Big Data Analytics. First, these services and applications need large computing power to perform well, while usually operating under restrictions on moving data from the edge of the network to the Cloud [2]. Second, these services and applications require system support to program reactions that arise at run-time, particularly when the capacities and capabilities of the target infrastructure are unknown at design time [3].
• In the first chapter, we describe the general frame of the project by introducing the host organisation and the project framework.
• In the fourth chapter, we present our scientific contribution to the addressed problem.
• In the fifth and last chapter, we detail the technical implementation of our contribution along with an evaluation of the solution.
Chapter 1
General Presentation
1.1 Introduction
This end-of-studies project was a collaboration between IMT-Atlantique Nantes and the University of Utah. This chapter is composed of three sections: the first is a presentation of the host organisations in which I did my internship; in the second section, we present the need for our project and specify the goals of our work.
In this internship, I was a member of the LS2N lab, a joint research unit that pools the digital research strengths of three higher education institutions (University of Nantes,
The Scientific Computing and Imaging (SCI) Institute at the University of Utah is an internationally recognized leader in visualization, scientific computing, and image analysis applied to a broad range of domains. The SCI Institute brings together faculty in bioengineering, computer science, mathematics, and electrical engineering in applying advanced computing technologies to challenges in a variety of domains, including biology and medicine. The SCI Institute includes 19 faculty members and over 200 other scientists, administrative support staff, and graduate and undergraduate students.
The overarching goals of the SCI Institute’s scientific computing research are to cre-
ate new techniques, tools, and systems, by which scientists may solve problems
affecting various aspects of human life.
Upon detecting an urgent alert, in most cases we need to run further complex computations (e.g., executing a machine learning model) in order to confirm that it is not a false positive. As the IoT devices that detected the alert are generally not powerful enough to run these computations, dynamic reconfiguration can be the solution, as it allows us to move from one configuration to another while the application is running.
1.4 Conclusion
This chapter was dedicated to describing the host organisations and the project framework. In the next chapter, we present a literature review to establish the theoretical framework of the presented work.
Chapter 2
Background
2.1 Introduction
This chapter outlines the relevant fundamentals required to implement the proposed work. We first define the Edge-to-Cloud Continuum, Urgent Computing and Dynamic Reconfiguration, which are necessary concepts for understanding our work. After that, we present our use-case application, which can be viewed as a software product line. Then, we introduce feature models and present the benefits of micro-service architectures. Finally, we detail the software environment used to develop our solution.
Source: [6]
CED technique. It is the most commonly used and popular edge detection tool. Canny is more effective at edge extraction than the other methods currently available and gives good results; the Canny operator can handle a variety of edge image data and effectively removes noise [7].
CED can be viewed as a chain of features executed in sequence, in which every feature has a predecessor that produces the output necessary for that feature's execution.
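The chain structure described above can be sketched in Python. This is a minimal, illustrative pipeline (Gaussian smoothing, then Sobel gradients, then thresholding) rather than the full Canny algorithm with non-maximum suppression and hysteresis; all function and parameter names here are our own.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Square Gaussian smoothing kernel, normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def convolve2d(img, kernel):
    """Naive 2-D sliding-window filtering with edge padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (padded[i:i + kh, j:j + kw] * kernel).sum()
    return out

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def ced_chain(img, high=0.3):
    """Each stage consumes its predecessor's output:
    smoothening filter -> gradient operator -> threshold."""
    smoothed = convolve2d(img, gaussian_kernel())   # smoothening filter stage
    gx = convolve2d(smoothed, SOBEL_X)              # operator stage (Sobel here)
    gy = convolve2d(smoothed, SOBEL_Y)
    magnitude = np.hypot(gx, gy)
    if magnitude.max() > 0:
        magnitude = magnitude / magnitude.max()
    return magnitude >= high                        # boolean edge map
```

Each stage corresponds to one variability point of the CED workflow: the smoothening filter and the gradient operator can be swapped without touching the rest of the chain.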
Figure 2.2 describes a simplified feature model exemplified by mobile phone production. The model depicts how features are used to specify and develop software for mobile phones. The program loaded on a phone depends on the features that the phone supports. According to the model, all phones must include support for calls and for displaying information on either a basic, coloured or high-resolution screen. Moreover, the software for mobile phones may optionally include support for GPS and for multimedia tools such as a video camera and an MP3 player.
Source: [8]
We group together as baseline feature models those that provide the following relationships between the features:
• Mandatory. A child feature has a mandatory relationship with its parent when the child is included in all products in which its parent feature is present. For example, each mobile phone must provide call support.
• Optional. A child feature has an optional relation to its parent when the child can
be optionally included in all of the products in which the parent feature appears.
In the example, mobile phones may have GPS support as an optional feature.
• Or. A set of child features has an or-relationship with its parent when one or more of them can be included in the products in which the parent feature is found. In Figure 2.2, when Media is selected, Camera, MP3 or both can be included.
A feature model may also contain cross-tree constraints among features. These constraints are usually of the form "A requires B" or "A excludes B".
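The relationships above can be made concrete with a small configuration checker. This is a hypothetical encoding: the feature names and the two cross-tree constraints (Camera requires a high-resolution screen, GPS excludes the basic screen) follow the classic mobile-phone example from the literature and are not necessarily taken verbatim from Figure 2.2.

```python
# Hypothetical encoding of a mobile-phone feature model (assumed names).
MANDATORY = {"Calls", "Screen"}
XOR_GROUPS = {"Screen": {"Basic", "Colour", "HighRes"}}  # exactly one screen type
OR_GROUPS = {"Media": {"Camera", "MP3"}}                 # if Media: at least one child
REQUIRES = {"Camera": {"HighRes"}}                       # cross-tree: A requires B
EXCLUDES = {("GPS", "Basic")}                            # cross-tree: A excludes B

def is_valid(config):
    """Check a configuration (set of selected feature names) against the model."""
    if not MANDATORY <= config:
        return False
    for parent, children in XOR_GROUPS.items():
        if parent in config and len(children & config) != 1:
            return False
    for parent, children in OR_GROUPS.items():
        if parent in config and not (children & config):
            return False
    for feature, needed in REQUIRES.items():
        if feature in config and not (needed <= config):
            return False
    for a, b in EXCLUDES:
        if a in config and b in config:
            return False
    return True
```

For instance, `{"Calls", "Screen", "HighRes", "Media", "Camera"}` satisfies every relationship, while adding a camera to a basic-screen phone violates the requires constraint.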
• Programming languages:
• Programming tools:
• DevOps tools:
2.10 Conclusion
In this chapter, several basic concepts necessary for the implementation of the project
have been outlined.
Chapter 3
Related Work
3.1 Introduction
In this chapter, we state the related work in the field of variability management as well as the solutions proposed in the literature for reconfiguring applications.
For instance, Naily et al. [10] introduce a framework to engineer connected micro-services as a software product line. They propose the ABS micro-services framework for developing software based on micro-services with the Software Product Line Engineering (SPLE) concept, aiming to reduce the effort of adapting to changes in requirements by using the rigorous variability management approach offered by SPLE.
Meanwhile, in [11], Acher mentions that, according to a survey, feature models are the most used variability management approach in industry. Later, in his HDR, he introduced FAMILIAR (FeAture Model scrIpt Language for manIpulation and Automatic Reasoning), a domain-specific language with textual syntax that permits operations on multiple feature models and their configurations.
3.3 Re-Configuration
Previous studies in the literature have investigated the performance of different AI methods for automatically re-configuring applications, motivated by the high number of parameters that applications have nowadays.
In [11], Acher introduced his approach to constrain variability models. As the example in Figure 3.1 shows, he first gathers a training dataset using different configurations of the variability model. After that, he labels the dataset with Boolean values (accepted/not accepted). Then, the labeled data is passed to a machine learning decision tree model, which learns the patterns for classifying the images as accepted or not and thereby derives new constraints. Combining the variability model with the new constraints gives a new model that only produces the accepted images.
Source: [11]
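The label-then-learn loop described above can be caricatured in a few lines. This toy sketch is our own drastic simplification, not Acher's actual decision-tree approach: it only mines trivial single-feature exclusion constraints from a labeled set of configurations.

```python
def mine_constraints(configs, accepted):
    """Toy constraint mining: propose excluding any feature that only ever
    appears in rejected configurations.

    configs  -- list of sets of selected feature names
    accepted -- parallel list of Booleans (accepted / not accepted labels)
    """
    all_features = set().union(*configs) if configs else set()
    constraints = []
    for feature in sorted(all_features):
        # Labels of every configuration that contains this feature.
        labels = [ok for cfg, ok in zip(configs, accepted) if feature in cfg]
        if labels and not any(labels):  # present => always rejected
            constraints.append(f"not {feature}")
    return constraints
```

A real learner would of course capture interactions between several features; the point here is only the overall flow: sample configurations, label them, learn, and feed the learned rules back into the variability model.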
In [12], the authors present a system that uses machine learning classifiers to ensure high accuracy in offloading (choosing which computing paradigm to run on). The proposed solution is based on a contextual database for training and testing classification algorithms.
3.4 Conclusion
In this chapter, we presented how variability management is modeled; in our case, we combine feature models with workflows because we are programming an urgent application. We also reviewed the methods used in the literature to configure applications.
Chapter 4
Scientific Contribution
4.1 Introduction
In this chapter, we give an overview of our main contribution: combining feature models with workflows, taking re-configuration decisions using machine learning, and using those decisions to dynamically re-configure our application.
several different experiments and collect and save the parameters used in the experiments along with some other features (e.g., execution time). For example, in our use case, we collected the variability parameters of the application (operators and smoothening filters with their internal parameters) along with the execution time, the quality of the output (good, medium or bad) and the computing paradigm ("Edge" or "Cloud"). Note that the bigger the collected dataset, the better the decision model will be.
After that, we need to prepare the data for training. Data preprocessing differs from one use case to another, but it is generally one or a combination of these techniques:
• Data cleaning
• Dimensionality reduction
• Feature engineering
• Data transformation
• Data balancing
• Data sampling
The last step of data preprocessing is sampling: we divide the collected dataset into at least two parts, train data, which is fed to the machine learning decision model so that it can learn the data's patterns, and test data, which is used to validate the performance of the trained model.
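The sampling step described above can be sketched without any library: shuffle the collected experiment rows, then cut them at the chosen ratio. The function name and signature are ours, for illustration only.

```python
import random

def split_dataset(rows, test_ratio=0.3, seed=42):
    """Shuffle the collected experiment rows, then cut them into a train part
    and a test part. A fixed seed keeps the split reproducible."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]
```

Shuffling before cutting matters: experiment rows are often collected in order (e.g., all Edge runs first), and a straight cut would put entire classes in only one of the two parts.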
The next step is to build a decision model. First, we have to choose a metric for our model to optimize (e.g., accuracy, F1 score, recall); the metric must be adequate for the type of dataset, the distribution of the classes and the use case. After that, we need to pick a machine learning model. This part is challenging due to the huge number of algorithms that can be used (e.g., decision trees, random forests, neural networks). A practical solution is to train as many models as possible, evaluate their performance on the test data, rank them by the chosen metric, and pick the most efficient one or combine the top candidates for better generalization.
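The train-many-then-rank procedure can be sketched generically. The names and the toy baseline classifier below are our own; in practice the candidates would be scikit-learn or XGBoost estimators, which expose the same `fit`/`predict` interface assumed here.

```python
def accuracy(y_true, y_pred):
    """Fraction of correct predictions (one possible metric among others)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

class MajorityClassifier:
    """Toy baseline candidate: always predicts the most frequent training label."""
    def fit(self, X, y):
        y = list(y)
        self.label = max(set(y), key=y.count)
    def predict(self, X):
        return [self.label] * len(X)

def select_decision_model(candidates, X_train, y_train, X_test, y_test,
                          metric=accuracy):
    """Fit every candidate model, score it on the held-out test split with the
    chosen metric, and return the best candidate's name plus all scores."""
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = metric(y_test, model.predict(X_test))
    best = max(scores, key=scores.get)
    return best, scores
```

Keeping the metric as a parameter makes it easy to re-rank the same candidates by F1 score or recall when the classes are imbalanced.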
4.5 Conclusion
In this chapter, we detailed our contribution. First, we presented why we combine feature models with workflows to represent urgent applications, then why we need machine learning to take the re-configuration decision. Finally, we specified our approach to re-configuring applications with the output of the decision model.
Chapter 5
5.1 Introduction
This chapter first presents the working environment. Then, we implement our use case. Following that, we detail the process of building the dataset, from gathering to analysing and preprocessing it. Then, we present the steps to train our decision model. After that, we go through how we made our solution dynamically re-configurable. Finally, we evaluate our work.
Model: HP Pavilion
Processor: Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz
RAM: 16 GB
Data Storage: 512 GB SSD
Operating System (OS): Windows 11
a scalable and flexible testbed for experimental research in all areas of computing, with
a focus on parallel and distributed computing, including the Cloud, Big Data and AI.
Key features:
• designed to support Open Science and reproducible research, with full traceability
of infrastructure and software changes on the testbed.
Figure 5.1: Before and after applying Canny Edge Detection algorithm on an image.
After building the micro-services, it was time to containerize them so that we could transfer them between environments without being affected by environment-specific setups, and also for K8s orchestration later on.
• Execute all parameter combinations of the CED program on each of the reserved machines.
• Collect the execution time and the quality of the output of each run.
Figure 5.4: Box plot of the execution time of CED in Edge and Cloud paradigms.
Figure 5.5: Scatter plot showing the computing time in each of the computing
paradigms colored by the used smoothening filter.
encoding technique called label encoding. This approach is very simple: it converts each distinct value in a column to a number.
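Label encoding fits in a few lines of plain Python; scikit-learn's `LabelEncoder` does the same job, but the hand-rolled version below makes the mapping explicit (the function name is ours).

```python
def label_encode(column):
    """Replace each distinct value in a column by a small integer, assigned in
    order of first appearance, and return the encoded column plus the mapping."""
    mapping = {}
    encoded = [mapping.setdefault(value, len(mapping)) for value in column]
    return encoded, mapping
```

Keeping the returned mapping is important: the same mapping must be reused at prediction time, and it lets us translate the decision model's integer outputs back into parameter names.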
Before moving to model training, we have to split our dataset into train and test sets (the train/test split is a model validation procedure that allows evaluating the performance of a model on new data). We split our data using the train_test_split function from the Python library scikit-learn, keeping 70% as train data and leaving 30% for testing, as Figure 5.6 shows.
Figures 5.8 and 5.9 show a performance comparison between the classification models
while predicting the operator parameter and the smoothening filter parameter respectively.
As shown in Figures 5.7, 5.8 and 5.9, XGBoost is the best-performing model overall, so based on this experimental result we decided to save it for the three prediction cases and make it the decision-making model for the next steps.
• creating a requirements.txt file that contains all the dependencies and their versions
that are needed to execute the micro-service.
• creating a Dockerfile containing all the instructions needed to build the image.
After that, we create a Docker Compose file in which we specify, for each micro-service, the ports to be used along with the corresponding image, so that the services can communicate with each other.
which are needed to create the containers and expose the ports specified in our Docker Compose file. After that, we run the skaffold dev command to deploy our containers; Skaffold then checks every second for changes in the files required to build the images. Upon detecting a modification, Skaffold automatically deletes the container that needs to be changed and replaces it with a new one that includes the required modifications.
As shown in Figure 5.10, a request is first made to the decision model to predict the optimal configuration. If the model output differs from the deployed configuration, we update the configuration of the micro-service(s) that need to be changed. Skaffold then detects the modifications and automatically replaces the deployed containers.
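The compare-and-update step of this scenario can be sketched as a diff between the predicted and the deployed configuration. The function and key names below are ours for illustration; in the real deployment, writing the updated configuration to the watched files is what triggers Skaffold's redeployment.

```python
def reconfigure(decision_model, deployed_config, request):
    """Query the decision model for the optimal configuration, compute which
    parameters differ from the deployed one, and apply only those changes.

    decision_model  -- callable mapping a request (desired deadline, quality,
                       paradigm) to a configuration dict (assumed interface)
    deployed_config -- dict of currently deployed parameter values (mutated)
    request         -- the user's requirements for this urgent run
    """
    predicted = decision_model(request)
    changes = {k: v for k, v in predicted.items()
               if deployed_config.get(k) != v}
    deployed_config.update(changes)  # in reality: rewrite the watched files
    return changes
```

Returning only the changed parameters matters: an unchanged micro-service keeps running, so only the services whose parameters actually differ get torn down and redeployed.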
5.5 Evaluation
The common evaluation for this type of problem is to assess the performance of the decision model [15].
The chosen decision model (XGBoost) gave us a minimum accuracy of 65% when predicting the operator and smoothening filter parameters, and an accuracy of 85% when predicting the computing paradigm. Note that in urgent applications, the minimum allowed confidence (accuracy in our case) is 60%.
Figure 5.11: XGBoost accuracy on the operators, smoothening filter and computing
paradigm parameters.
Table 5.4: Comparison between the output of the decision and the expected output.
As Table 5.4 shows, the output of the decision model sometimes differs from what we expect. However, the quality of the output image matches what we requested, and the requested execution time is respected in most cases.
5.6 Conclusion
During this chapter, we successfully implemented a dynamically re-configurable application that takes an image as input and outputs the edges found inside it. For the reconfiguration decisions, we built a model that takes the desired execution time of the application (a deadline), the desired quality of the output image and the computing paradigm, and predicts the optimal configurations. Those predicted configurations are detected and deployed automatically without any manual intervention.
General Conclusion
During this project, we first combined feature models with workflows to model urgent applications. Then, we built a dataset containing different execution scenarios. Next, with the collected data, we built a machine learning model to predict the optimal configurations to apply during a re-configuration. Finally, we used the predicted configurations to re-configure the deployed urgent application.
We have achieved:
However, this work can still be improved. For example, we could explore other decision models (e.g., reinforcement learning). We could also build a language to model feature models and workflows.
Bibliography
[1] Daniel Balouek-Thomert, Ivan Rodero, and Manish Parashar. Harnessing the computing
continuum for urgent science. ACM SIGMETRICS Performance Evaluation Review, 2020.
[2] Kevin Fauvel, Daniel Balouek-Thomert, Diego Melgar, Pedro Silva, Anthony Simonet, Gabriel Antoniu, Alexandru Costan, Véronique Masson, Manish Parashar, Ivan Rodero, and Alexandre Termier. A distributed multi-sensor machine learning approach to earthquake early warning. Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
[3] Eduard Gibert Renart, Daniel Balouek-Thomert, and Manish Parashar. An edge-based
framework for enabling data-driven pipelines for iot systems. 2019.
[4] Maverick Chardet, Hélène Coullon, and Simon Robillard. Toward safe and efficient reconfiguration with concerto. Science of Computer Programming, 2021.
[5] Emile Cadorel, Hélène Coullon, and Jean-Marc Menaud. Online multi-user workflow scheduling algorithm for fairness and energy optimization. 2020.
[7] Mohd Ansari, Diksha Kurchaniya, and Manish Dixit. A comprehensive analysis of image edge detection techniques. International Journal of Multimedia and Ubiquitous Engineering, 12:1–12, November 2017. doi: 10.14257/ijmue.2017.12.11.01.
[8] Paolo Arcaini, Angelo Gargantini, and Paolo Vavassori. Generating tests for detecting
faults in feature models. 2015 IEEE 8th International Conference on Software Testing,
Verification and Validation, ICST 2015 - Proceedings, 2015.
[9] André Carrusca, Maria Cecília Gomes, and João Leitão. Microservices Management on Cloud/Edge Environments. Springer International Publishing, 2020.
[10] Moh Naily, Maya Setyautami, Radu Muschevici, and Ade Azurat. A Framework for Modelling Variable Microservices as Software Product Lines. 2018. doi: 10.1007/978-3-319-74781-1_18.
[11] Mathieu Acher. Modelling, Reverse Engineering, and Learning Software Variability. HDR thesis, Université de Rennes 1, 2021.
[12] Warley Junior, Eduardo Oliveira, Albertinin Santos, and Kelvin Dias. A context-sensitive offloading system using machine-learning classification algorithms for mobile cloud environment. Future Generation Computer Systems, 2019.
[15] Hongzhi Guo, Jiajia Liu, and Jianfeng Lv. Toward intelligent task offloading at the edge.
IEEE Network, 2020.