
Here are the signatures of the supervisors:

Company’s Supervisor Signature

ESPRIT’s Supervisor Signature

Acknowledgement

I dedicate all this success to the people that have been there for me since day one.

To my parents Abderrahman and Hayet

Even the most elegant of language and expressions cannot adequately convey my
unfathomable love for you or my sincere gratitude for all of your efforts. You’ve given
me a feeling of accountability, optimism, and self-assurance in the face of challenges in life.

Your advice has always guided my steps towards success. Your endless patience, your
understanding and your encouragement have been the essential support that you have
always known how to give me.

I owe you what I am today and what I will be tomorrow. I promise you that I will always
do my best to remain your pride and never disappoint you. May God, the Almighty,
preserve you, grant you health and happiness, and protect you from all harm.

To my sisters Mouna, Rim and Abir

Thank you for always being by my side; your presence, your devoted love, your
tenderness, your precious advice and your constant support have always led me to success
and happiness.

To my Friends

Thank you for your advice and for all the good times we had together. I hope our
friendship will last forever.

Thanks

At the end of this work, all the people who helped build this project, whether directly
or indirectly, deserve my sincere gratitude.

To start with, I cannot express how much I thank my parents, who have been with me
since day one and who deserve much more.

I thank my professional supervisor, Mr. Slim MAALOUL, for his encouragement,
advice and constructive criticism, which fueled my motivation and thinking.

I would also like to thank my academic supervisor, Mrs. Wided MATHLOUTHI, for walking
me through the steps and evaluations of this project.

I would like to express my gratitude to the team of SBS, to the people who left the
team and to those who are still with us. Thank you for your unconditional support,
your relevant comments and your advice; thank you for bringing motivation to
the work environment and contributing to the smooth running of my internship.

Thank you also to the professors of the Private School of Engineering and Technology
(ESPRIT), who provided me with the tools necessary for the success of my university
studies.

I will not miss the opportunity to warmly thank the president and the members of the
jury for having granted me the honor of judging my work.

Contents

Acknowledgement iii

Thanks iv

Table of abbreviations and acronyms x

General Introduction 1

1 Project Presentation 3
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Host Company . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Smart Business Solution . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.2 Areas of activity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.3 Client Presentation COGEPHA . . . . . . . . . . . . . . . . . . . 4
1.3 Project Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3.1 Problematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.2 Proposed Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 State of the art . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4.2 Business Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4.3 Data Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4.4 Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2 Planning and definition of needs 9


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Work Methodologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.1 Methodologies of a BI project . . . . . . . . . . . . . . . . . . . . . 9
2.2.1.1 Agile BI Methodology . . . . . . . . . . . . . . . . . . . . 9
2.2.1.2 GIMSI Methodology . . . . . . . . . . . . . . . . . . . 10
2.2.1.3 Choice of Methodology . . . . . . . . . . . . . . . . . . 13
2.2.2 BI Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.3 Life cycle of Ralph Kimball's approach . . . . . . . . . . . . . . . . . 15
2.2.4 Choice of method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3 Project Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Definition of Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.1 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.2 Functional Requirements . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.3 Non-Functional Requirements . . . . . . . . . . . . . . . . . . . . . 19

2.5 Functional Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5.1 Identification of actors . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5.2 Technical Environment . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3 Multidimensional Modeling 24
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Identifying dimensions and fact tables . . . . . . . . . . . . . . . . . . . . . 24
3.2.1 Indicators and measures . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2.2 Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2.3 Fact Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Multidimensional Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3.1 Data warehouse models . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3.2 Choice of model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3.3 Data Marts Conception . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

4 Implementation and visualization of data 33


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2 Design and development of the ETL Process . . . . . . . . . . . . . . . . . 33
4.2.1 Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.2 Transformation and loading . . . . . . . . . . . . . . . . . . . . . . 34
4.2.3 SSIS Component Behavior . . . . . . . . . . . . . . . . . . . . . . . 34
4.2.4 Loading data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 Analysis of prepared data . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.3.1 Utility of the OLAP cube . . . . . . . . . . . . . . . . . . . . . . . 39
4.3.2 Implementation of OLAP cubes . . . . . . . . . . . . . . . . . . . . 39
4.4 SSIS deployment and job planning . . . . . . . . . . . . . . . . . . . . . . 40
4.4.1 SSIS Package Deployment . . . . . . . . . . . . . . . . . . . . . . . 40
4.4.2 Job creation and planning . . . . . . . . . . . . . . . . . . . . . . . 41
4.5 Data Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.5.1 Creation of dashboards . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.5.2 Deployment of dashboards . . . . . . . . . . . . . . . . . . . . . . . 46
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

5 Predictive analysis 48
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.2 Data Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.3 Key Influencers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.4 Decomposition Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.5 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

General Conclusion 53
Bibliography and webography 55

List of Figures

1.1 Logo of Smart Business Solution SBS . . . . . . . . . . . . . . . . . . . . . 3


1.2 Logo of COGEPHA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Business Intelligence Architecture . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 ETL Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1 Inmon Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13


2.2 Kimball Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Life cycle of a business intelligence project according to Ralph Kimball . . 15
2.4 Project Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.5 Solution’s technical architecture . . . . . . . . . . . . . . . . . . . . . . . 21
2.6 SQL Server/SSMS Logos . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.7 SSIS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.8 Power BI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.1 Fact Tables Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30


3.2 Purchases Data Mart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3 Purchases Orders Data Mart . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.4 Shipments Data Mart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.5 Stock Data Mart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.1 Staging Area Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34


4.2 Staging Area Package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.3 Dimension Item Implementation . . . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Example of Historization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.5 Data Warehouse Implementation . . . . . . . . . . . . . . . . . . . . . . . 37
4.6 Fact Purchases Implementation . . . . . . . . . . . . . . . . . . . . . . . . 38
4.7 Fact Purchases Orders Implementation . . . . . . . . . . . . . . . . . . . . 38
4.8 Fact Stock Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.9 Fact Shipments Implementation . . . . . . . . . . . . . . . . . . . . . . . . 39
4.10 Cubes and Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.11 Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.12 SSIS Catalog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.13 History of Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.14 Home . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.15 Stock Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.16 Stock Evolution Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.17 Purchases Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.18 NotPaid DAX Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.19 Purchases Daily Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.20 AsPlanned DAX Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

4.21 Shipments Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.22 Web Version over the Power Bi Service . . . . . . . . . . . . . . . . . . . . 47
4.23 Mobile Version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

5.1 Purchases Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49


5.2 Shipments Key Influencers . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.3 Purchases Key Influencers . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.4 Shipments Decomposition Tree . . . . . . . . . . . . . . . . . . . . . . . . 51
5.5 Purchases Decomposition Tree . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.6 Vendors Clusters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

List of Tables

2.1 Comparative table of the different approaches to designing a DWH. . . . . 14

3.1 Dimensions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 Fact Tables for the Stock . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Fact Tables for the Purchases . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.4 Fact Tables for the Shipment . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.5 Comparison of star model and snowflake model. . . . . . . . . . . . . . . . 29

Table of abbreviations and acronyms

IT Information Technology
BI Business Intelligence
DWH/DW Data WareHouse
ERP Enterprise Resource Planning
DM Data Mart
AI Artificial Intelligence
STG Staging Area
ETL Extract, Transform, Load
SSIS SQL Server Integration Services
OLAP Online Analytical Processing
SSAS SQL Server Analysis Services
DB Database
MSBI Microsoft Business Intelligence
SSMS SQL Server Management Studio
IDE Integrated Development Environment
MDX Multidimensional Expressions
DAX Data Analysis Expressions
CRM Customer relationship management
SME Small and medium-sized enterprises
KPI Key Performance Indicator

General Introduction

Competition between companies that provide services and products is intensifying and
they are looking for business opportunities to ensure their ability to grow and engage with
their customers to capture market share. Over time, the world has come across a huge
amount of data that keeps growing every day and conveys a lot of relevant information
that can influence business strategy and economics. Following this phenomenon, business
intelligence emerged to extract, clean, consolidate and analyze data so that managers can
fully comprehend their actual situation, make the most of their assets, and stay informed
in order to make the best judgments possible.

BI has changed a lot since its introduction, both in the methods used and in the solutions
considered. It is important to note that BI encompasses an entire system: the data
management infrastructure (data warehouse, ETL), reporting tools, business analytics,
data visualization, and so on. The purpose of this set of tools is to monitor the company's
activity, stimulate innovation, anticipate and adapt to future market developments, and
increase efficiency in all areas of activity.

There is now a close relationship between IT investment and value creation: companies
try to create value by implementing information systems, making decisions with them, and
continuing to invest in the latest technologies to improve and unlock performance, with
both tangible and intangible benefits. Between the demand for equipment and information
systems, the maintenance of internal and external networks, the purchase of software and
the development of new applications, the growth of these investments has been exponential.

My host company, SBS, is primarily responsible for supporting the digital transformation
of its clients. The client company produces a large amount of data each day without
gaining any tactical or strategic advantage from it. A decision support system would allow
it to view information in real time and to use multiple metrics to assess the health and
progress of the business over time. To meet these needs, the company expressed the need
to set up a decision support solution covering all major operations and benefiting the
decision-making chain.

It is in the context of this graduation project, named "LOGISTICS ACTIVITY
DASHBOARD", that we decided to set up a decision-making system for the host organization
"Smart Business Solution" in order to make good use of its data and strengthen the
decision-making process.

To further explain this project, I will present the five chapters below:

- The first chapter "Project Presentation": this is an introductory chapter that presents
the host organization, the problematic, the proposed solution and the state of the art.
Finally, we end the chapter with a presentation and an explanation of the work methodology
applied.

- The second chapter "Planning and definition of needs": as its name suggests, this
chapter aims to plan the project's various stages, then determine the end users' demands
before defining the technological environment.

- The third chapter "Multidimensional modeling": this chapter is devoted to the data
modeling phase, the selection of indicators and the identification of tables of facts and
dimensions.

- The fourth chapter “Implementation and visualization of data”: This chapter focuses
on the design and development of the data preparation domain as well as the specification
and development of user applications and dashboards.

- The fifth chapter "Predictive analysis": this chapter focuses on the predictive analysis
axis and presents the future vision of the module.

In our final section, we’ll give a final assessment of the work done, discuss the outcomes,
and outline the project’s potential future directions.

Chapter 1

Project Presentation

1.1 Introduction
This chapter is dedicated to introducing the host company, the project's context, the
problematic and the proposed solution. The first section introduces the general context
of the project: we begin by presenting the host company and its different areas of activity.
In the second section, we study the existing system, including the difficulties and the
problems to fix, in order to formulate the solution that needs to be built. In the last
section, we elaborate on the process of the project with respect to the state of the art of
Business Intelligence.

1.2 Host Company


In this section we present our host company, Smart Business Solution, within which
this end-of-studies project was carried out.

1.2.1 Smart Business Solution


The project was carried out within the Microsoft partner company SBS (Smart Business
Solution), which specializes in the implementation and installation of integrated ERP
management solutions and business management solutions.

Figure 1.1: Logo of Smart Business Solution SBS

The company provides installation, implementation and integration of business
management solutions to help growing companies meet their operational, financial,
regulatory and technical challenges. It follows a proven process of discovery, analysis,
design and delivery across a range of ERP, CRM, Business Intelligence and Office
Productivity technology solutions to create, deploy and support innovative systems in its
clients' data centers or in the cloud.

1.2.2 Areas of activity
The main activities of SBS are: Integration of ERP business management solutions
and outsourcing.

• The integration of management solutions is offered according to the needs of the
companies and the systems they already have in place: ERP Microsoft Dynamics NAV
for SMEs, installed on local servers, and ERP Microsoft Dynamics 365 Business Central
for web versions hosted in the cloud. The integration of business management solutions
includes: organization and process consulting, assistance with the implementation of the
ERP according to a certified methodology, and any specific developments during the
integration of the ERP.

• Business Intelligence reports and dashboards that allow clients to monitor their
different activities (sales, purchases, stock...) and better understand their data coming
from the ERP.

• Outsourcing allows companies to make their IT investments more efficient while
ensuring the reliability of information, data security and the availability of the existing
systems (hardware and software). It consists of: systems engineering, consulting, hardware
and software assistance, and management of the various software packages marketed.

1.2.3 Client Presentation COGEPHA


Comptoir Général Pharmaceutique COGEPHA [1] is one of the key links in the
Tunisian pharmaceutical industry. Its profession: wholesale distributor. The company acts
as an intermediary between the medication manufacturers, from which it obtains its
supplies, on one hand, and the pharmacies and private clinics, which it supplies
(respectively 80% and 20% of its customers), on the other.

Figure 1.2: Logo of COGEPHA

1.3 Project Context


In order to complete my studies as an IT engineer specialized in Business Intelligence
within the École Privée d'ingénieurs Supérieurs et de Technologie "ESPRIT", and in
order to improve my knowledge and practical skills, I was asked to implement a graduation
project named "Logistics Activity Dashboard" within SBS. This project will give the
client company COGEPHA a comprehensive and in-depth understanding of the data
relating to its operations, which will be a significant benefit when making decisions.

1.3.1 Problematic
Given the specificity of pharmaceutical products, particularly their limited validity period,
the company wishes to maintain an optimized stock value while guaranteeing availability
for sale. This balance depends on several key factors, including 'the rate of satisfaction of
supplier orders' as well as 'average delivery times'. In order to meet the company's needs,
we built an interactive dashboard that focuses on monitoring the stock value and the
shipments activity. Through the dynamic features of Power BI, we were able to load all
the data and navigate through time filters to track the stock value over time. This was
needed because the company's old reporting process, done through Excel, is static and
needs to be updated monthly and archived.

1.3.2 Proposed Solution


This project's objective is to create a decision-making system that provides
decision-makers with the means to make strategic and operational decisions and to choose
the most appropriate solutions.
Indeed, as BI developers, it is our responsibility to give our clients the reports that
produce the knowledge and information they need to properly follow the logistics and
stock activity, to follow the evolution of product values and costs, and to have clear
visibility over quantities received and quantities shipped.
Our solution allows users to save time while searching for information, ensures
trustworthy access to pertinent information, and enables data analysis from a variety of
perspectives in order to make the right decisions and enhance the business process. In our
case it will provide a global vision of the shipments and of the suppliers' efficiency, and
keep the data tracked and organized.
During this project we are going to design an automated pipeline to collect the data
coming from SQL Server, process it and then integrate it into an analysis-friendly data
warehouse. In our case, in order to fix all the metrics and transform the data related to
stock, shipments and purchases, we will implement an optimized ETL which, as opposed
to the traditional one, starts with a staging area serving as an operational database. The
latter ensures that the quality of the incoming data is checked without interfering with
the original data source, in preparation for loading it into the data warehouse
automatically, each and every day. The integrated data warehouse, once it is constructed,
will serve as the repository from which we will retrieve the information required to create
our dashboards.
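To make this staging-area idea more concrete, the following minimal T-SQL sketch shows the kind of daily refresh we describe. The schema, table and column names (STG.Stg_PurchaseDetails, SourceERP.dbo.PurchaseDetails, ...) are illustrative assumptions, not the actual COGEPHA schema; in practice these steps are orchestrated by the SSIS packages.

-- Hypothetical daily staging refresh; all object names are illustrative.
-- 1) Refresh the staging copy without touching the operational source.
TRUNCATE TABLE STG.Stg_PurchaseDetails;

INSERT INTO STG.Stg_PurchaseDetails (PurchaseDetailId, OrderDate, VendorNo, Amount)
SELECT PurchaseDetailId, OrderDate, VendorNo, Amount
FROM   SourceERP.dbo.PurchaseDetails;        -- extraction from the ERP database

-- 2) Quality checks run against the staging copy, never against the source.
SELECT COUNT(*) AS RowsToReject
FROM   STG.Stg_PurchaseDetails
WHERE  VendorNo IS NULL OR Amount < 0;

-- 3) Only rows that pass the checks are then loaded into the data warehouse
--    (dimension lookups and fact inserts are handled by the SSIS packages).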

1.4 State of the art


1.4.1 Introduction
It is crucial to establish some theoretical and technological terms when we start this
project because they may subsequently affect how well we grasp the project as a whole
and how we create our solution.

1.4.2 Business Intelligence
Business intelligence (BI) is the collective name for a group of technologies that aid
in decision-making and give a broad picture of a company's numerous operations. By
providing essential data at the right time and through the right channel, the resulting BI
information system acts as the primary tool for making decisions and enables goals to
be monitored and the means of accomplishing them to be adjusted. BI can be approached
with very different techniques from one firm to another because it is a field that is still
in full development. Nonetheless, it always has the same objective: to help decision-makers
evaluate the performance of their business. The four steps of a BI project are collection,
modeling, restitution, and analysis. Figure 1.3 depicts these stages, which are further
described below[2].

Figure 1.3: Business Intelligence Architecture

The Collecting and Integration phase: The collection process involves finding,
choosing, and extracting transactional data from our source. An ETL tool then loads the
data into a data warehouse after making the necessary modifications to structure it in a
unified, standardized, and usable manner. It is necessary at this stage to focus on the end
user's specific needs in order to meet all their expectations.

We use the Microsoft SSIS data integration technology in our project.

The ETL procedure is described in more depth in the following graphic (figure 1.4):

These actions are performed periodically. In our project it will be launched on a daily
basis at midnight when there is no data traffic.

The Modeling and Analysis phase: This phase is all about storing data in an
appropriate format for the needed analysis. It calls for the design of a data warehouse
that enables the simplification of the data model and the organization of the information
delivery.

Figure 1.4: ETL Process

At this level, the data is fed into the data warehouse from the production databases
using the ETL process.
We present the data along a number of analytical axes throughout the decision analysis
so that we can readily extract measures and compare them in various ways. This is
accomplished using multidimensional databases (cubes) built with OLAP technology.
The OLAP system makes it possible to perform aggregations on the measures and rapid
analysis of the data; in our case this system is developed in SQL Server Analysis Services
(SSAS).

The Restitution phase: The task of this final step, also known as reporting, is to
present the value-added information in the most readable way possible in order to support
the decision-making process.
During this stage, we displayed the data using dashboards, which are straightforward,
understandable, and dynamic.
We have decided to also carry out a predictive analysis in order to enrich our analysis
and enhance the decision maker’s understanding.

1.4.3 Data Modeling


The data warehouse is a technical infrastructure capable of integrating, organizing,
storing, and archiving, in an intelligible manner, data produced within an information
system or imported from outside it (rented or purchased), from which end users draw
relevant information using restitution and analysis tools (OLAP, data mining)[4].

Data Marts

The data mart is a collection of focused, sorted, aggregated, and organized data
arranged to satisfy specific business goals. SQL queries from one or more relational
databases are used to generate the data store, which is then saved in memory under the
control of a database management system.
Data warehouse

A data warehouse is a database created specifically to store all the data and information
needed to enhance decision-making. The data warehouse is only to be used for this
purpose. ETL (Extract, Transform, Load) tools in particular are responsible for supplying
it with data from the production databases.

The characteristics of a Datawarehouse are:

— Subject-oriented: The data is arranged according to themes at the center of
the data warehouse; the data about purchases, for example, will be repatriated from the
various OLTP production databases and grouped together.

— Integrated: The data is gathered from various sources that each use a different
format. Before making them available for use, they are integrated.

— Non-volatile: Data does not vanish or alter over time or with processing (Read-
Only).

— Historized: Non-volatile data is also timestamped. As a result, we are able
to see how a particular value has changed over time. The level of detail that is stored
depends on the type of data; not all data needs to be archived.
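As a small illustration of this historization principle, the sketch below keeps every version of an item's unit cost with validity dates instead of overwriting it, so the warehouse can answer "what was the value on a given date". The table and column names are simplified assumptions, not the actual implementation.

-- Illustrative historized dimension (assumed, simplified schema).
CREATE TABLE DWH.Dim_Item_History (
    ItemSK    INT IDENTITY(1,1) PRIMARY KEY,  -- surrogate key
    ItemNo    NVARCHAR(20)  NOT NULL,         -- business key from the ERP
    UnitCost  DECIMAL(18,3) NOT NULL,
    ValidFrom DATE          NOT NULL,
    ValidTo   DATE          NULL               -- NULL = current version
);

-- The value of an attribute "as of" a given date is never lost:
SELECT ItemNo, UnitCost
FROM   DWH.Dim_Item_History
WHERE  ItemNo = 'MED-001'
  AND  '2022-03-15' >= ValidFrom
  AND ('2022-03-15' < ValidTo OR ValidTo IS NULL);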

1.4.4 Measures
Fact tables are tables that contain records, each of which relates to a well-defined
business operation.

Each table is represented by:

— Navigation Keys: they allow access to one or more axes of analysis. Technically,
these are foreign keys coming from the dimensions related to the fact table.

— Facts or Measures: These are the calculable numerical values, on which we can
perform aggregations and from which we can calculate performance indicators. We can
cite three types of measures:

An additive measure is the most flexible and the most used, since it allows the values of
the measure to be aggregated along all the dimensions.

A semi-additive measure can be aggregated along only some dimensions.

A non-additive measure does not allow aggregation to be applied along any dimension.
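To make the distinction concrete, here is a hedged T-SQL sketch (table and column names are assumed flattened views, not the real model): a purchase amount is additive and can be summed across any dimension, while a stock level is only semi-additive, so along the time dimension we take the closing value of each period rather than a sum.

-- Additive: purchase amounts can be summed across any dimension (vendor, year, item...).
SELECT VendorNo, YEAR(PostingDate) AS PurchaseYear, SUM(PurchaseAmount) AS TotalPurchased
FROM   dbo.PurchasesFlat            -- assumed flattened view of the purchases fact
GROUP BY VendorNo, YEAR(PostingDate);

-- Semi-additive: the stock value sums across locations or items,
-- but across time we keep the closing value of each month instead of a sum.
WITH RankedStock AS (
    SELECT LocationCode, StockDate, StockValue,
           ROW_NUMBER() OVER (PARTITION BY LocationCode, YEAR(StockDate), MONTH(StockDate)
                              ORDER BY StockDate DESC) AS rn
    FROM dbo.StockFlat              -- assumed flattened view of the stock fact
)
SELECT LocationCode, YEAR(StockDate) AS Yr, MONTH(StockDate) AS Mo,
       StockValue AS ClosingStockValue
FROM   RankedStock
WHERE  rn = 1;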

1.5 Conclusion
In this first chapter we began by presenting the host company and the context of the
project, the problems encountered and the solutions proposed, and we ended by stating
the concept of business intelligence and the phases of its different segments.

Chapter 2

Planning and definition of needs

2.1 Introduction
In this second chapter, we will first outline the work methodologies that are pertinent to
the solution planning and decision support proposals made in the first chapter. Then, we
will list the functional and non-functional needs, and we will conduct a functional study.
In the end, we shall outline the solution's technical environment.

2.2 Work Methodologies


A decision-making project requires an in-depth understanding of customer needs and
a thorough examination of how the organizational units operate, in order to determine
the most effective metrics and indicators to support strategic and tactical decisions. To
begin this type of project, we must first select the appropriate approach, comprehend the
various project life cycle phases, and limit delivery times while meeting the client's needs.

2.2.1 Methodologies of a BI project


A BI project aimed at management dashboards is particularly delicate: it is not a
project where we simply assemble ready-made bricks, because it concerns the core of the
company's offering. It is important to apply this problem-specific decision support for all
managers, because that is really what Business Intelligence is.
Like any IT project, a good project management approach is usually highly
recommended. However, a BI project is not a typical IT project: it depends on accuracy
and requires a tailor-made methodology for its success, based on functional and
operational aspects with a strong involvement of decision-makers. Several methodologies
can be considered, and choosing the right one depends on what we want to deliver in the
end.

2.2.1.1 Agile BI Methodology


Agile business intelligence[5] moves away from the traditional engineering model:
analysis, design, construction, testing and implementation. By adopting the agile
approach, organizations see a faster return on investment and are able to adapt quickly
to changing business needs. But to do that, you have to put in place the ideal framework
for implementing and managing BI. There are different versions, but the underlying
methodology is the same. Below are the main steps for successfully adopting an agile BI
approach.
• The concept: develop a loose vision of BI.
• The initiation: define the different priorities.
- Identify the main needs and requirements of the business; this includes understanding
the questions to be answered through the BI system.
- Discover the available data sources.
- Understand the channels for disseminating information: reports, dashboards.
- Prioritize key business requirements and needs, taking into account time and budget
constraints.
- Choose the Business Intelligence software to use.
• Construction iteration: provide a functional system that meets the evolving
needs of stakeholders.
• The transition: move the previous construction iteration into production.
- Release the pilot project to a small subgroup.
• The production: support all elements from the construction iterations and the
transition to production.
- Operate and support the system, dashboards and reports.
- Identify flaws and improvements.

2.2.1.2 GIMSI Methodology


GIMSI[6] is a methodology for creating performance scorecards and collaborative
performance dashboards as decision support systems, specifically for management.
Designed in 10 steps that follow one another, it is part of a modern management method
that favors cooperation and the sharing of knowledge. GIMSI focuses on the essential
question: how are decisions really made? This method consists of 4 main phases divided
into 10 steps:

Identification

To situate oneself correctly in the context of the company, the GIMSI method proposes
a first phase which makes it possible to describe the company's internal and external
environments:

• Environment of the company :


Analysis of the economic environment and business strategy in order to define the area
and the impact of the project.

This is the step that identifies the company strategically and competitively in terms of:

- Market
- Company resources and policies
- Business Strategy

• Identification of the company :

Structural analysis of the company to identify the processes, activities and concerned
persons.

This is the step where the company’s structure is determined:

- Define the processes it is targeting.


- List the pertinent actions taken by these processes.
- List the participants.

Design

This is the second stage of the GIMSI approach, which catalogues every component
involved in the dashboards.

• Description of the objectives :

Selection of tactical objectives for each team, and of the associated indicators.

In order to be able to make decisions based on the research done in the earlier phases,
this step helps you to outline the company’s objectives.
Each objective needs to meet the requirements listed below:

- Measurable: It needs to be expressed in a form that can be measured
- Accessible: It must be attainable with the means available
- Realistic: The way of reaching it must be reasonable
- Constructive: It must support the overarching goals

• Selection of the indicators :

Choice of indicators according to the objectives.

This step consists of identifying quantifiable performance indicators, monitoring and
measuring the company's goals, enhancing the analysis, and visualizing the data so that
it can be understood at a glance. The indicators are classified by business process and
built according to their respective objectives.
Each indicator must adhere to particular standards, such as:

- Real time: It needs to be updated often in order to allow for decision-making at any
moment.
- Measurable: It must be able to measure one or more constructible objectives
- Ergonomic design for the dashboard

• Collection of information :

Identifying the necessary information to design the indicators.

This is the phase in which the data needed to construct the indicators can be collected
according to the objectives established in the previous phase. Each piece of data must
meet certain criteria:

- Accessible: It needs to be physically accessible
- Logically available: It needs cleaning and inspection

• Construction of the dashboard:
Definition of each team’s dashboard.

This is the stage that allows us to manage the objectives that we set ourselves in the
previous step. Each of them must contain the following elements:

- A timeframe or date
- Measures
- Visuals
- Responses to the concerns and aims
• The Dashboard system :
Design of the dashboard system, control of the global consistency.

This is the step that explores the interactions between the dashboards and allows an
overall consistency of decision-making systems. In fact, this harmony ensures that:

- Strong link between decision-makers and dashboards: the information processed by


dashboards must meet the needs of decision-makers in order to make the best decisions.
- Joint decision making: Decision makers share structured and analyzed information
for better understanding and greater visibility.

Implementation

This is the third phase of the GIMSI method to implement the solution.
• BI Tools:
This is the step that allows you to study the technological needs and analyze the market
offer, in order to choose the most appropriate tool for the company’s objectives and the
most adapted to its current problems.
• Integration and Deployment :
This is the step that allows the implementation of the technologies chosen during the
previous step in the company as well as the integration of the solution with the existing
one.

Continuous Improvement

The audit constitutes the fourth phase of the GIMSI method, which watches over
the system. This is the step that allows the permanent monitoring of the solution. It
guarantees not only the sustainability of our system, but also its performance according
to the needs and objectives set.

2.2.1.3 Choice of Methodology
After having studied the two methodologies that can be used for the implementation
of our solution, we see that the GIMSI method offers several decision points, which makes
it unique for the design of a management system based on dashboards and encourages
communication and knowledge sharing amongst decision-makers.

2.2.2 BI Approach
Designing a data warehouse (DWH or DW) is an intricate and crucial step in the
creation of a decision support system. Bill Inmon and Ralph Kimball, both fathers of the
data warehousing process (the design and development process of the DWH), offer two
different approaches to modeling a data warehouse. The two approaches must be compared
in order to determine which one best meets our needs.

Bill Inmon’s Approach (Top-Down)

Bill Inmon considers that the DWH and the DMs are physically separate. The DW has
its own physical existence, oriented to storage, traceability and scalability, and each DM
also has its own physical existence: the data marts leverage the DW and offer a restitution
oriented to performance, according to the needs expressed. Figure 2.1 illustrates Bill
Inmon's approach to designing a data warehouse.

Figure 2.1: Inmon Approach

Approach Ralph Kimball (Bottom-Up)

Ralph Kimball views the DW very differently. According to him, a DW can be viewed
as a coherent collection of DMs linked together by coherently shared dimensions. Ralph
Kimball's strategy relies on quick, decision-oriented deliveries that meet the needs
expressed. Figure 2.2 shows the DW design following the approach of Ralph Kimball.

Figure 2.2: Kimball Approach

Comparison of Modeling Approaches

Based on the presentation of these two approaches, we will compare their main
characteristics, such as the process, the schematization, the data structure and the nature
of the result. Table 2.1 presents a comparative study of the DWH design approaches. [7]

Criterion | Ralph Kimball | Bill Inmon
Process | Bottom-Up | Top-Down
Principles | Model the data marts, then integrate them to form a data warehouse | Create a centralized data warehouse in which the data will be consolidated
Schematization | Star | Snowflake
Data structure | Business process oriented: KPIs, dashboards | Data oriented
End-user accessibility | Strong | Weak
Persistence of source data | Stable | Changing
Data warehouse delivery | Fast | Slow

Table 2.1: Comparative table of the different approaches to designing a DWH.

During the realization of the current project, and according to the customer's needs,
from the reduction of development costs to the delivery of a decision support system
producing a series of tactical actions, while delivering artefacts as quickly as possible, we
adopt the bottom-up approach to meet all needs.

2.2.3 Life cycle of Ralph Kimball's approach
To better adapt the project to the changing and sometimes volatile needs of our
customers, Ralph Kimball proposed the Business Intelligence project lifecycle, which
adapts to the frequent changes that may occur from one project to another. The diagram
below shows Ralph Kimball's lifecycle as well as the vertical and horizontal dependencies
among its various nodes.

Figure 2.3: Life cycle of a business intelligence project according to Ralph Kimball

In the next part, we will describe these nodes as follows:

Project planning

- It has an interacting relationship with the notion of needs, as the arrow suggests.
In fact, it emphasizes these requirements from the perspective of available resources and
level of expertise, which is directly related to work assignment, duration, and sequencing.

Definition of needs

- The user must be heavily involved and his wants must be understood in order to
properly build the data warehouse.

- Determining the important elements that will enable the business to accurately specify
its requirements, which will be utilized to model the data warehouse.

- This activity serves as the foundation for the other concurrent activities including
technology, data, and user applications.

Definition of the technical architecture

- A description of the technical architecture that will be used.

- A description of the problem and of how the technologies are integrated to solve it.

- The needs, the current technical environment, and the anticipated strategic technical
directions are taken into account.

Selection and installation of technologies

- Choosing the right tools in accordance with the technical architecture research (Ex:
data preparation and access tools)

Dimensional modeling

- Once the requirements have been established, modeling and the creation of a
dimensional model (including fact tables and dimensions) can proceed.

- Determining metrics and hierarchies

Design of the physical data model

- Identification of the physical structures required for the implementation of the logical
data in the database. (Ex: identification of keys, constraints, types . . . )

- Performance-management techniques like partitioning

- Security

Design and development of the data staging area

- The development of the ETL process: data extraction, cleaning, consolidation, and
loading

User Application Specification

- Building data visualization models

User application development

- The integrity of a shared understanding between the developers and the end user is
ensured by the establishment of dashboards based on the preceding stage.

Deployment

- It is the place where the many stages of development merge.

- A request for a change or correction is considered.

- Forecasting end-user training

Maintenance and Growth

- User monitoring

- A guarantee that the data warehouse will run continuously and effectively (performance
optimization)

- Data storage

- Establishing a communication procedure

Project management

- Verify the lifecycle is proceeding as planned.

- Track the project’s progress.

- Control adjustments

- Creation of a communication strategy

2.2.4 Choice of method


In order to set up the project, we must be able to meet the different needs of each level
of representation of our clients. To do this, we must adopt a management mode that
allows us to adapt the project to the constantly changing needs of end users. Since we are
carrying out a project without a predefined scope, we choose Ralph Kimball's life cycle,
which offers an iterative approach that allows backtracking to meet new requirements or
changes, and also offers incremental advancement through the entire process, repeated for
each new data mart requested by the user. The method also allows goals to be identified
through close collaboration and communication with the users.

2.3 Project Planning


The planning stage of a project is crucial to its execution, as it enables accurate
monitoring of work progress. This project's planning process is divided into three phases:
task determination, task planning, and task duration. The project is divided into several
missions, running from February 1 to the end of August. In this section, we rely on the
Gantt chart for the development of the solution. The diagram below allows us to visually
represent the various tasks; tasks are dated in order to reduce the effects of any delays.

Figure 2.4: Project Plan

2.4 Definition of Requirements


Requirements definition is the second step in the development phase of Ralph Kimball's
life cycle for a decision support solution. This specification phase consists of identifying
the needs, specifically the functional and non-functional demands, as well as the definition
of the end users and their roles.

2.4.1 Objectives
The logistical objectives of the solution convert its aim into intentions for action. To
establish the company's demands, various logistics objectives frequently need to be
clarified, and corporate requirements frequently dictate how those objectives will be met.
The solution should enable analysis and monitoring in order to:

- Maintain an optimized stock value

- Ensure the products' availability for sale

- Track suppliers' performance

- Minimize purchase costs

- Minimize shipment delays

- Track the quantities received and shipped.

Now that the business needs have been defined, we can proceed to the definition of
the functional and non-functional needs of the solution.

2.4.2 Functional Requirements


The solution's functional requirements convert the objectives into intentions for action.
To specify the business needs, it is frequently necessary to refine several functional
requirements, and corporate requirements frequently dictate how their objectives will be
met. The following features are included in our solution so that users can gain from them:

- Data integration: The process of creating the solution starts with data integration:
cleaning, transforming and loading the data into a business intelligence database. Hence,
in order to feed the dimensional model, we rely on the ETL procedure.

- Preparation of data and creation of the dashboard: To meet the customer's
request, we must prepare the data to be stored in the decisional database. Aggregations,
metrics, performance indicators, and data formatting are all included in the preparation.
The dashboards are created once the data has been organized in an actionable and
prepared format.

This stage produces a set of dashboards that offer global visibility over the project, the
staff, and the business operations, to aid in decision-making.

Track Purchases

- Monitoring of purchase costs
- Track the evolution of purchase amounts
- Monitoring of the most purchased quantities
- Follow-up of the distribution of purchases per supplier class
- Track the satisfaction of our needs from suppliers

Track Stock

- Ensure continuous optimization


- Balance in availability of products
- Keep a reasonable stock value

Track Shipments

- Minimize delays
- Satisfy customers' needs
- Explain the shipments that exceeded the expected delivery time

2.4.3 Non-Functional Requirements


- Response time and optimization: Our client's data is very voluminous; during
this project we had to analyze several tables that are the subject of our solution. Such a
large volume of data necessitates a significant effort to optimize the response time, by
removing irrelevant data and conducting transformations. The significance of the response
time depends, among other things, on the volume of data.

- Reliability: The data must be relevant and of good quality.

- Ergonomics: The user application needs to have a user-friendly, practical, and
simple interface. The dashboards' order must also be given consideration.

- Flexibility: The solution must be scalable to allow other modules to be added or
modified.

2.5 Functional Study
2.5.1 Identification of actors
In order to identify the actors, we began by conducting a thorough analysis of the
database, extracting the data that we may use to increase visibility and accelerate the
decision-making process. Once the data had been explored, we studied COGEPHA's
business processes in order to improve our comprehension of the data source and to tailor
the solution to the needs of the targeted end users. We can distinguish two categories of
users who play different roles:
The system administrator: This is a collaborator whose mission is to:

- Manage the dashboards


- Control data
- Update data

End users: The following people are the app’s direct users:

- The general manager


- The audit and management control director
- The finances director

Those who make decisions will all have access to dashboards. The administrator’s
responsibilities include managing the dashboards, updating the data, ensuring the appro-
priate management and operation of the ETL (Extraction, Transformation, and Loading)
component, and modifying the indicators.

2.5.2 Technical Environment


At this point in the project, after determining the application's final actors and deriving
the functional and non-functional demands, we define the architecture and the technical
environment.

Technical architecture

In order to accomplish our goals, we will establish a technical architecture that specifies
the steps that data must go through in order to change states, beginning with data
gathering, passing through modeling, up to the visualization stage and finally the
deployment. We based our work on the technical architecture below in order to create
unified graphic charters:

A decision-making information system is made up of three main parts:

Figure 2.5: Solution's technical architecture

— The first part is the ETL zone, which stands for extract, transform, and load.
Developers must have exclusive access to this zone, where data processing is done. Under
no circumstances should an end user be able to access it, because the data is not yet
necessarily in a consistent state.

— The data storage space is the second part: historically we had a data warehouse
here based on OLAP technology. Now we can imagine other types of data storage such
as in-memory storage or distributed data systems (HDFS, . . .). However, regardless of
the format or storage medium chosen, dimensional modeling is an excellent approach to
logically structure data and provide access to it.[8]

— The data restitution section, which includes all the tools that produce reports or
dashboards, is the final part. This area can also provide data at an atomic level for
machine learning tasks.

Technical environment

Now and following the definition of the architecture, we present the tools that will be
used to carry out this project.

Microsoft SQL Server

SQL Server Management Studio (SSMS) is an integrated environment for administering
SQL infrastructures; it is used to access, configure, manage, monitor, develop, and
maintain all SQL Server components (fig. 2.6), Azure SQL Database and SQL Data
Warehouse.

SSMS is a single, all-inclusive solution that combines a large selection of graphical
tools with a variety of powerful script editors to give database administrators and
developers of all skill levels access to SQL Server.

Microsoft has also built in backward compatibility for older versions of SQL Server,
enabling older SQL Server instances to be connected to from a newer version of SSMS.

Figure 2.6: SQL Server/SSMS Logos

SQL Server Integration Services

SSIS[9] is a tool for extracting, transforming and loading data, in short what is called
an ETL. We extract data from a source, apply transformations if necessary, and then
load this data into MS SQL Server or other destinations.

Figure 2.7: SSIS

To work with SSIS you need Microsoft Visual Studio. The design environment for
an SSIS package is therefore Visual Studio with, if possible, access to your data server,
at least to check that the import was successful (in addition to the progress logs available
with your SSIS package).

SQL Server Analysis Services

Analysis Services[10] is an analytical data engine used in decision support and business
analytics. It provides enterprise-grade semantic data models for business reports and
client applications such as Power BI, Excel, Reporting Services reports, and other data
visualization tools.

Power BI

Microsoft offers a tool for data analysis called Power BI[11]. With a user interface
that is easy enough for end users, it enables the production of interactive data
visualizations (dashboards).

Figure 2.8: Power BI

2.6 Conclusion
This chapter was dedicated both to project planning and to defining the overall vision
of the solution, including the specification of the functional and non-functional demands,
in order to better identify the expected objectives. We also identified the technologies and
the technical requirements of the solution, so that we can proceed to the modeling work
of the following chapter.

Chapter 3

Multidimensional Modeling

3.1 Introduction
Ralph Kimball's method is continued in this chapter by detailing the design of our
proposed solution. In fact, during this stage of the project, after determining the functional
and non-functional requirements and identifying the application's final actors, we continue
by modeling and designing our dimensions and our fact tables.

3.2 Identifying dimensions and fact tables


During this phase of the project, we will present, after having installed the necessary
tools, a dimensional model by specifying the tables of facts and dimensions. We will also
specify metrics and hierarchies in accordance with the agreed methodology.

3.2.1 Indicators and measures


An Indicator is a specific observable and measurable quantity that can be used to show
the changes achieved or the progress made by a program towards the achievement of a
specific effect.

They can also be KPIs (Key Performance Indicators): quantitative measures that
allow you to follow the progress of your company or organization in relation to your key
business objectives.

Below we will list all of the indicators, KPIs and measures used in our project:

- Quantity Running Total: calculates the quantities of products in the stock while showing
their evolution over time.

- Stock Value Running Total: calculates the value of the stock while showing its evolution
over time.

- Exceeded Percentage: percentage of the shipments that exceeded the delivery time.

- Avg Time To Prepare: average time needed to prepare a shipment to be delivered.

- Time To Prepare: time to prepare the shipment for each delivery.

- Exceeded Time: a boolean measure that indicates whether or not a shipment exceeded
its delivery time.

- Active Vendors: count of active vendors that delivered purchases during the current
month.

- Not Paid: the amount of purchases that are still not paid.

- Cost of Delay: cost of the unpaid amounts in terms of the declined orders.

- Purchases Amount: the sum of the amounts of purchase invoices.

- Availability: indicates the availability of a vendor in terms of the ordered quantity
versus the received quantity.

- Ordered Quantity: sum of the quantities in purchase orders.

- Received Quantity: sum of the quantities in purchase invoices (purchased).

- As Planned: a boolean measure that indicates whether a shipment respected the planned
delivery date predicted by the audit manager.

- Planned Percentage: percentage of the shipments that respected the planned shipment
date.

- Shipments Costs: cost of the products shipped.

- Balance of Trade: difference between the costs of shipments and the costs of purchases.

- Coverage Rate: the ratio of the stock value covering the needs of the customers.

- Overbought: percentage of the extra quantity purchased relative to the sold quantity.

- Rotation Rate: describes the time required for the company to pay, be paid and renew
its inventory.

- Stock Value: indicates the value of the products in the stock over time.

- VendorDisAmount: sum of the discount amounts of the vendors.
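The two running-total indicators above can be computed directly with window functions. The sketch below is only an illustration over an assumed, simplified Fact Stock layout; in the actual solution these indicators are implemented in the cube and the Power BI model.

-- Illustrative computation of the running-total indicators (assumed column names).
SELECT ItemNo,
       EntryDate,
       SUM(Quantity)   OVER (PARTITION BY ItemNo ORDER BY EntryDate
                             ROWS UNBOUNDED PRECEDING) AS QuantityRunningTotal,
       SUM(StockValue) OVER (PARTITION BY ItemNo ORDER BY EntryDate
                             ROWS UNBOUNDED PRECEDING) AS StockValueRunningTotal
FROM   DWH.Fact_Stock;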

3.2.2 Dimensions
A dimension is a table that contains the axes of analysis (the dimensions) according
to which we want to study observable data (the facts) which, subjected to a
multidimensional analysis, give users the information necessary for decision-making.

To guarantee data filtering and labeling, a dimension's attributes must be qualitative and descriptive. There are various types of dimensions, among which we list the following:

• Conformed dimension: a dimension whose attributes are a subset of the attributes of another table. This type is used when the dimensions represent different levels of granularity.

• Shared dimension: The dimension that various data marts employ.

• Slowly changing dimension: The dimension that can undergo member descrip-
tion changes over time.

• Time dimension: Central because most of the facts correspond to business events
of the company.

Let us now present the dimensions of our project, after having briefly discussed the various types of dimensions. The table below (tab. 3.1) lists the attributes, a description and the type of each dimension.

Dimension | Attributes | Description | Type
Date | Id, FullDate, Month, Year... | Contains information about time frames | Shared + Time
Customer | CustomerId, Name, Address, Phone... | Contains all of COGEPHA's customers and their information | Shared
Location | LocationId, Code, Name... | Contains all the locations of COGEPHA stores | Shared
Item | ItemId, Description, Family, Supplier... | Contains all the products of COGEPHA and their suppliers | Shared + Slowly Changing
Vendor | VendorID, Name, Address, Location Code... | Contains all the suppliers of COGEPHA | Shared
PurchaseDetails | PurchaseDetailId, Order Date, Due Date, Vendor No... | Contains details of all the purchases of COGEPHA | Not Shared
PurchaseOrdersDetails | PurchaseOrderDetailId, Order Date, Document Type, Status, Vendor No... | Contains details of all the ordered purchases of COGEPHA and their status | Not Shared
Item Entries | ItemEntriesId, ItemNo, Type, Source, Location... | Contains details of all the item entries to the stock of COGEPHA, including sales and purchases | Not Shared
Shipments Details | ShipmentDetailId, Customer No, City, Creation Time, Shipment Date... | Contains details of all the shipments of COGEPHA | Not Shared

Table 3.1: Dimensions.

3.2.3 Fact Tables


A fact table is a collection of structured data made up of dimension-type fields (the context) and measure-type fields (the facts). It usually relates to a single business process.

Fact tables contain two different sorts of columns: foreign keys to the dimension tables and numeric values that represent the measures.

The logistics chain, and in particular our project, encompasses three business processes: the stock, the purchases and the shipments, which are presented below.

• Topic 1: Track the stock evolution in terms of value and quantity, to ensure that it satisfies the needs while keeping an optimized value.

The following table (tab. 3.2) details the measures needed for this fact table:

Fact Table | Dimensions | Indicators and Measures
Fact Stock | Item, ItemEntries, Customer, Vendor, Date, Location | Quantity Running Total, Stock Value Running Total, Stock Value, Rotation Rate, Coverage Rate, Balance of Trade

Table 3.2: Fact Tables for the Stock

• Topic 2: Track the purchases in terms of amounts and quantities, and track vendor performance.

For this process we will need the two fact tables listed below:

27
Fact Table | Dimensions | Indicators and Measures
Fact Purchase | Item, PurchaseDetails, Vendor, Date, Location | Active Vendors, VendorDisAmount, Not Paid, Received Quantity, Overbought, Purchases Amount
Fact PurchaseOrders | Item, PurchaseOrdersDetails, Vendor, Date, Location | Availability, Ordered Quantity

Table 3.3: Fact Tables for the Purchases

• Topic 3: Track the shipments in terms of delays and quantities delivered, to satisfy our customers.

For this process we will need the fact table listed below:

Fact Table | Dimensions | Indicators and Measures
Fact Shipments | Item, ShipmentsDetails, Customer, Vendor, Date, Location | Exceeded Percentage, Avg Time To Prepare, Time To Prepare, Exceeded Time, As Planned, Planned Percentage, Shipments Costs

Table 3.4: Fact Table for the Shipments

3.3 Multidimensional Modeling


3.3.1 Data warehouse models
There are several types of data warehouse models which are: [12]

• Star model: This type of model, initiated by Ralph Kimball, is represented by a central fact table around which the dimensions gravitate to analyze the facts contained therein. Each dimension is described by a single table whose attributes can represent all possible granularities.

• Snowflake model: In a snowflake schema, initiated by Inmon, the fact table is also
at the heart of the model. The dimensions gravitate around the central table but the
difference lies in a greater hierarchy of these dimensions. These are linked by a succes-
sion of relationships down to the finest granularity, which is directly related to the fact
table. Technically, this scheme avoids information redundancy but requires joins when
aggregating these dimensions via a succession of foreign keys.

• Constellation model: A series of star and/or snowflake schemas makes up this model, in which the fact tables share certain dimension tables.

28
3.3.2 Choice of model
The final design will incline toward the constellation of facts. Moreover, based on the
dimensions’ chosen granularity, we will find star or snowflake models. This is why we will
contrast these two models in the table below (tab.3.5) to determine which is best for us..

Criterion | Star Model | Snowflake Model
Query complexity | Low | High
Joins used | Easy to manage | Difficult; redundancy of joins to manage
Time consumed during query execution | Takes less time | Takes more time due to excessive use of joins
Simplicity | Simple | Complex

Table 3.5: Comparison of star model and snowflake model.

We first concentrate on the business process while determining the level of granularity
existing in the fact table in order to select the model that works best for us. Second, the
model’s simplicity, adaptability, and performance in terms of response time all played a
role in our decision. In light of this, we selected the star model that is displayed below.

3.3.3 Data Marts Conception


We will begin with the design of our data marts because we will be using the Ralph
Kimball methodology. The model below (fig.3.1) displays the data warehouse’s four fact
tables.

The first "FactStock" essentially focuses on the variation of the stock in terms of value,
quantity and availability. This table allows you to track missing items, overbought items
and overall inventory status.

Our “FactPurchases” fact table will track the costs of all our purchases, most active
vendors and overall expenses of the purchase cycle.

In the third fact table, "FactPurchasesOrders", we look at the different vendors and orders while focusing on the declined orders to assess suppliers' performance.

Lastly, "FactShipments" tracks deliveries to our clients, focusing mainly on times and delays, while assessing the costs of shipments.

All of these fact tables, which are the centers of our data marts, build the data warehouse. Each data mart follows the star model, and together they constitute the constellation model.

Below we find the model for each data mart in detail:

29
Figure 3.1: Fact Tables Model

3.4 Conclusion
Throughout this chapter, we have identified the different sectors of activity, the mea-
sures and the axes of analysis. Indeed, we presented the dimensions and the fact tables,
after that, we discussed multidimensional datamart models before putting out the physical
data warehouse model.The development of the data stage will be covered in the following
chapters, the specification and the development of the user application and we will end
by exposing the deployment model of the solution.

30
Figure 3.2: Purchases Data Mart

Figure 3.3: Purchases Orders Data Mart

31
Figure 3.4: Shipments Data Mart

Figure 3.5: Stock Data Mart

32
Chapter 4

Implementation and visualization of data

4.1 Introduction
In this chapter and following the modeling, we will move on to the next technical
stage of our project.We’ll concentrate on creating the components of the data preparation
section and using these data to make decisions.At last, the development of data in the
form of a dashboard that will give the decision maker a better view of operations while
adopting the Kimball approach to project completion decision-making.

4.2 Design and development of the ETL Process


There will be two stages to this phase. The data must first be extracted from our
source server before being transformed and loaded into the data warehouse.

4.2.1 Extraction
We currently pull the various required data from the transactional databases and store
them in the Staging Area (STG). To create a new foundation from which to perform our
modifications, we will build a duplicate of the tables from the source database specifically
dedicated to our company.

We used the Visual Studio IDE to create a solution named after our project. This solution encompasses one project per phase, named STG, DWH and SSAS.

The first project is the STG. We create a package for each extracted table; each package contains a source component and a destination component. In between, we use a Slowly Changing Dimension (SCD) component. An SCD is a dimension that stores and manages both current and historical data over time in a data warehouse, and handling it is considered one of the most critical ETL tasks, since it tracks the history of dimension records. Its use here is to update old rows and add new ones to the table without having to truncate it (fig. 4.1).

33
Figure 4.1: Staging Area Task

We then create the package which fills the tables, as shown in the figure below (fig. 4.2): the packages placed in the sequence container are all executed within a single run.

Figure 4.2: Staging Area Package

4.2.2 Transformation and loading


At this stage we have taken into consideration the behavior of SSIS components since
we want to reduce the amount of time it takes to transform data.[13]

4.2.3 SSIS Component Behavior


We will present in this part the different types of SSIS components so that we can
choose the most suitable ones.

• Blocking components: these are the greediest components in terms of resources; they must read all the rows arriving on their input before sending them to their output flow. Examples: Aggregate, Sort, Row Sampling.

• Semi-blocking components: these transformations temporarily block the data flow. This is the case, for example, of the Merge Join, which sends data to its output stream once all the rows with the same join key have arrived at its input. Examples: Merge, Union All.

• Non-blocking components: this type of component retains data neither partially (semi-blocking) nor totally (blocking). They are therefore more efficient in terms of execution time. Examples: Derived Column, Lookup, OLEDB Command.

4.2.4 Loading data


At the STG level we were able to process the different tables and form a clear view of which ones will be the sources of our dimensions and our fact tables, thus creating our Logistics Activity Data Warehouse.

We now approach the loading step to concretize the theoretical components, after having defined the appropriate dimensions in the modeling phase.

For some dimensions, we performed historization to keep track of changes, as in the case of the Item dimension shown in the figure below (fig 4.3).

Figure 4.3: Dimension Item Implementation

35
For a business need, we chose to keep the change history of the last direct cost of items; for the other attributes, such as the name or the description, the record is simply updated when a change occurs. We therefore start from our two sources, STG and DWH, in order to compare the new data with those which already exist in our destination (DWH).

We used the Union All component to combine all the outcomes. Three cases are possible:

• Insertion: when the item is new.

• Update: a change in an attribute whose history we do not need to keep simply modifies the existing record (change of name, description, etc.).

• Historization: in this case the item already exists in the DWH, but one of the attributes we need to track, "Last Direct Cost", has changed. We therefore add a new row for the same item, keeping the same business key but an incremented surrogate key; this row becomes the current item with the new Last Direct Cost, while the old record is kept with its end date set and the old Last Direct Cost. An example is shown in the figure below (fig 4.4).

Figure 4.4: Example of Historization

For all the dimensions, we used the 'Script' component to create the auto-incremented surrogate key (technical key). We also used the 'Derived Column' component to replace null and empty values with "UNKNOWN" so that we can filter the data better.

The fact tables will be loaded at the final transformation stage already identified dur-
ing the fact table identification phase. To fully understand this step, we will go through
each fact table.

To prepare the data, we created several tasks using different data transformation com-
ponents to build our operational data marts and we will now discuss some of them.

Purchases Data Mart

Most of the data for the Purchases Data Mart comes from the Purchases Invoice Line table. In our case, we are primarily interested in the amounts and costs, so we pull the data from this table and perform the purchase amount aggregation so that we can add it as a measure and insert it into the Fact Purchase table (fig 4.6).

Purchases Orders Data Mart

Most of the data for the Purchases Orders Data Mart comes from the Purchases Line table. In our case, we are primarily interested in the orders not received, so we pull the data to compute the measures needed on declined orders, including their amounts and quantities, and insert them into the Fact PurchaseOrders table (fig 4.7).

Shipments Data Mart

Shipments data mainly come from the Sales Shipments Line table, so we integrate the data into the Fact Shipments table so that we can calculate and focus on the quantities shipped and their costs (fig 4.9).

Stock Data Mart

Stock data mainly come from the Value Entry table, so we will integrate the data
into the Fact Stock table so that we can calculate the value of the stock and the needed
measures.(fig 4.8)

We performed lookups against the dimensions at the source component level instead of joins, as this performs better in terms of execution time. As a result, we were able to retrieve every match needed for our fact tables. We then added, through the 'Derived Column' component, some of the measures that do not require complex calculations, such as sums and multiplications; these measures are computed from the data extracted by the source. Finally, we reach the last stage, data loading, which writes the data into the correct table of the database. The data warehouse's "Master" package is depicted in the figure below: the fact tables are executed after the sequential execution of the dimensions (fig. 4.5).

Figure 4.5: Data Warehouse Implementation

37
Figure 4.6: Fact Purchases Implementation

Figure 4.7: Fact Purchases Orders Implementation

Figure 4.8: Fact Stock Implementation

We also used 'Conditional Split' to split the [Source No] column of the Value Entry table from STG, which contains both vendor and customer IDs, into two new columns, VendorID and CustomerID, in FactStock.

38
Figure 4.9: Fact Shipments Implementation

4.3 Analysis of prepared data


The SSAS multidimensional solution analyzes data across multiple dimensions using cube structures. It includes an OLAP query and calculation engine designed to balance performance with scalable data requirements.

This engine is an industry-leading OLAP server that integrates effectively with a variety of BI products. It enables end users to analyze data across several dimensions, giving them the knowledge they require for better decision-making.

The complex and powerful query language used by OLAP cubes is called MDX; it produces multidimensional reports made up of one or more two-dimensional arrays.

4.3.1 Utility of the OLAP cube


A multidimensional model is primarily created to achieve fast query performance against dimensional data. [14]

A multidimensional model consists of cubes and dimensions that can be extended and
annotated to allow complicated query techniques, speed up response times, and provide a
single data source for reporting. Another advantage offered by Analysis Services multidi-
mensional databases is integration with commonly used BI reporting tools such as Excel,
PowerBI, as well as custom applications and third-party solutions.

4.3.2 Implementation of OLAP cubes


In our project, we created calculated indicators that become SSAS measures within the OLAP cubes. These calculated measures are aggregates defined with MDX expressions rather than base measures derived directly from columns.

The figure below (fig.4.10) provides the dimensions and cubes that will be used in our
filters. Only the data required for the analysis has been loaded.

39
Figure 4.10: Cubes and Dimensions

We created the aggregation measures that need to be displayed in our reports. We were thus able to visualize, through a simple drag and drop, the results of our measures along several axes. For example, the measure StockValue presents the value of the current stock by multiplying "Unit Cost" from the Item dimension by "Quantity" from the ItemEntries dimension.
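
As a rough sketch of this logic only (the cube itself defines it as a calculated measure, which is not reproduced here), an equivalent DAX formulation over the same tables might look as follows, assuming a relationship from ItemEntries to Item and the column names used below.

StockValue :=
SUMX (
    ItemEntries,                                          -- iterate over the stock entry rows
    ItemEntries[Quantity] * RELATED ( Item[Unit Cost] )   -- quantity multiplied by the related item's unit cost
)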

Figure 4.11: Measures

The creation of these dynamic measures facilitates the building of dashboards and the access to data without the need for additional calculations.

4.4 SSIS deployment and job planning


Now that the development of our SSIS packages is complete, we can proceed to deploy
to the SQL server. There we can also schedule and run the package. In the following we
will present the steps followed for the deployment.

4.4.1 SSIS Package Deployment


The first step is to use Visual Studio to deploy an entire SSIS project. So we right-click
on the project and select Deploy.

40
For this step we must choose the destination (Our SQL server) and we must have the
SSIS catalog in which we are going to store the project. The deployment result is shown
in the figure below.

4.4.2 Job creation and planning


To run the package, we only have to locate it in the catalog folder, right-click it and select Execute.

Figure 4.12: SSIS Catalog

After package verification we schedule these packages so that our ETL runs at a specific time (typically overnight). We therefore create, in SQL Server Agent, a new job that takes care of the execution of our packages. In the example below we created the data warehouse job. One or more jobs can be defined to run the package at predefined times, with several types of scheduling available: daily, weekly and monthly. The schedule is set according to the type of business and the frequency of data change. In our case, we scheduled the execution daily at 02:00 for the STG and at 05:00 for the DW.

In the figure below we have launched our Job which has been executed successfully.

41
Figure 4.13: History of Execution

4.5 Data Visualization


4.5.1 Creation of dashboards
After having studied the existing situation, we need to interact with the users; this is accomplished through the creation of the user application, once the technologies to be used have been identified and the relevant data model created.
This stage entails the modeling of dashboards, reports and performance indicators (KPIs) adapted to the users.

We will use the Power BI tool for the creation of the various dashboards, all the in-
dicators and graphic components of which will be automatically linked to the filters to
guarantee the dynamic aspect.

Home Screen

To ensure increased use of our application, we have designed a home interface (fig.4.14)
which provides the user with visibility of all the reports covered by our solution. Indeed,
this page contains the name of the project and each theme is represented by a button
that will redirect us to the selected page.

42
Figure 4.14: Home

Stock Overview

The stock overview below (fig.4.15) provides a general idea of the stock through the indicators and percentages needed to monitor its status since 2018 (the monitoring start date is fixed by the customer). We can then view:

- The stock value by vendor Class, the percentage of each class in terms of value, the
coverage both in numbers and days and rotation rate.

- Coverage by location

- The distribution of quantity purchased by Vendors over a period of time divided into
months.

Figure 4.15: Stock Dashboard

43
The stock evolution dashboard (fig.4.16) shows the evolution of the stock value and quantity through time, with filters on vendor class and item description.

Figure 4.16: Stock Evolution Dashboard

Purchases Overview

The Purchases overview below (fig.4.17) provides a general idea of the purchases cycle.
We can see:

- The amounts of purchases by vendor class, divided into months.

- Ordered Quantity VS Received Quantity and the availability for each vendor class.

- The amounts for each class, the unpaid amount and the respective cost of delay for each class.

The daily purchases dashboard (fig.4.19) shows the daily amounts for any month chosen by the user.

For some measures, we had to use DAX [15], mainly focusing on time intelligence functions, so that we can filter through time and calculate complicated measures.

For example, figure 4.18 shows the NotPaid measure, which calculates the amount of purchases not yet paid. It uses CALCULATE, which evaluates an expression under the filters supplied; in the first filter we compare today's date to the due date by computing the difference in days with the DATEDIFF function.
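
Since the exact expression is only visible in the figure, the snippet below is merely a sketch of the pattern just described; the table and column names and the comparison rule are placeholders, and any additional filters of the original measure are omitted.

NotPaid :=
CALCULATE (
    SUM ( FactPurchase[Amount] ),                                     -- amount of the purchase invoices
    FILTER (
        PurchaseDetails,
        DATEDIFF ( TODAY (), PurchaseDetails[Due Date], DAY ) >= 0    -- due date compared to today (placeholder rule)
    )
)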

44
Figure 4.17: Purchases Dashboard

Figure 4.18: NotPaid DAX Measure

Figure 4.19: Purchases Daily Dashboard

Shipments Overview

The shipments overview below (fig.4.21) provides a general idea of the shipments, including quantities and costs by location. We can see:

- The most active cities.

- Counts and percentages of shipments that exceeded the delivery time, distributed by location.

- The average time to prepare shipments (a sketch of this measure is given after this list).

- The percentage of shipments that follow the planned delivery date.
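
As a hedged sketch of the average preparation time, assuming the Creation Time and Shipment Date attributes of ShipmentsDetails and a granularity in days:

Avg Time To Prepare :=
AVERAGEX (
    ShipmentsDetails,                                                                    -- one row per shipment
    DATEDIFF ( ShipmentsDetails[Creation Time], ShipmentsDetails[Shipment Date], DAY )   -- days between creation and shipment
)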

Another example of a DAX measure is AsPlanned (fig.4.20), which uses an IF condition on an expression; in our case it compares the planned shipment date with the actual shipment date and, depending on the result, returns either YES or NO.
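
As an illustrative sketch only (the actual definition appears in fig.4.20), such a measure could be written along these lines; the column name [Planned Shipment Date] is an assumption made for the example.

AsPlanned :=
IF (
    SELECTEDVALUE ( ShipmentsDetails[Shipment Date] )
        <= SELECTEDVALUE ( ShipmentsDetails[Planned Shipment Date] ),   -- shipped on or before the planned date
    "YES",
    "NO"
)

Evaluated per shipment row in a detail visual, this returns YES or NO; the Planned Percentage indicator can then be derived by counting the shipments flagged YES and dividing by the total number of shipments.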

Figure 4.20: AsPlanned DAX Measure

Figure 4.21: Shipments Dashboard

4.5.2 Deployment of dashboards


After creating our dashboards with Power BI Desktop, we deployed them on the Microsoft Power BI online service. We first created a folder for our client, after managing access, and then we uploaded our dashboards.

These reports can be distributed and consumed on the web and on mobile devices in
order to meet various business needs, as shown in the figures below.

46
Figure 4.22: Web Version over the Power Bi Service

Figure 4.23: Mobile Version

4.6 Conclusion
In this chapter, we have shown how the solution tracks the different stages of the logistics cycle by setting up dynamic graphical user interfaces that give managers a broad yet granular perspective. These dashboards will greatly facilitate strategic decision-making.

47
Chapter 5

Predictive analysis

5.1 Introduction
The numerous dashboards allowed us to extract reliable data, which gave us clear visibility and let us follow the evolution of the logistics cycle. However, since the volume of data keeps increasing, we need to further enrich our client's knowledge and to optimize the company's strategic and operational decisions.

Today, the main concern for a business manager is to transform potential opportunities into projects, hence the need to take a broader view of the opportunity and of its success rate; it is a question of predicting its future status.

5.2 Data Prediction


Predictive analytics involves using data, statistical algorithms and machine learning techniques to predict likely future outcomes based on historical data. The goal is to extrapolate from past events in order to better predict future ones.

In our project, we predict purchase amounts in order to identify a trend.

This can improve the business process by identifying certain risks or alerting us to potential problems in the business structure.

48
Figure 5.1: Purchases Prediction

5.3 Key Influencers


The key influencers visual helps you understand the factors affecting a metric you’re
interested in. It analyzes your data, ranks the factors that are important and displays
them as key influencers. For example, suppose you want to determine what influences
purchases amounts to increase. One of the factors is the class of medicines or the vendors
class.

Figure 5.2: Shipments Key Influencers

49
Figure 5.3: Purchases Key Influencers

5.4 Decomposition Trees


The decomposition tree visual in Power BI lets you visualize data across multiple di-
mensions. It automatically aggregates data and allows you to explore your dimensions in
any order.

Since this is also an AI (artificial intelligence) visualization, you can ask the program
to find the next dimension to explore based on certain criteria. This tool is valuable for
exploration and conducting root cause analysis.

In our case, the shipments are analyzed by City, Customer, Month and Year, and Item, tracking the costs.

The purchases are likewise analyzed by Vendor Class, Vendor Name, Month and Year, and Item, tracking the amounts.

This provides an overview of how the money is spent and who receives the most.

50
Figure 5.4: Shipments Decomposition Tree

Figure 5.5: Purchases Decomposition Tree

51
5.5 Clustering
Clustering is an unsupervised machine learning algorithm that looks for patterns in data by dividing it into clusters. These clusters are created such that the points are homogeneous within a cluster and heterogeneous across clusters. Clustering is commonly used in market segmentation and in several areas of marketing analytics.

In the figure below (fig 5.6), we divide our vendors into 3 clusters according to the amounts and quantities purchased from them.

In the top right, we can see a smart AI description of the clusters.

Figure 5.6: Vendors Clusters

5.6 Conclusion
This chapter has allowed us to outline the various stages of the project’s execution,
from the models’ creation and preparation to the development of the dashboards and,
finally, to the implementation of the predictive model and various analytical techniques.s

52
General Conclusion

This report, which summarizes an enlightening internship, ends with a reminder that
the goal of our project is to create and deploy a decision-making solution dedicated to the
Logistics department.

To accomplish this, a working technique customized for the business was selected.

We first began with a strategic analysis of the project. In this study, we identified the needs, the objectives and the boundaries of our project in order to have a clear understanding of what lay ahead and to be able to work with realistic, well-defined short-, medium- and long-term goals. Then, we established the technical framework of the project and, as a result, chose the most suitable tools for the job.

We started implementing our method, which is broken down into several parts, after
preparing our work area.

The first phase was essential: it consists in understanding the data in order to move forward and to have a clearer idea of the indicators to be measured. The most important stage follows, namely the extraction, transformation and loading of the data into the DW using Microsoft technologies (SSIS, SSAS). Finally, we reach the restitution phase, where we chose to use Power BI in order to generate interactive, clear and detailed dashboards ensuring access to pertinent and well-organized data. To advance our understanding, we developed a web application that predicts the success rate of an opportunity; this application helps our decision makers unveil the future of a commercial proposal and thus influence its status.

This project gave us the chance to put our academic knowledge into practice, to ex-
perience working life, and to learn about the business world. So, we had the opportunity
to apply the entire BI process, to wear the hat of a BI developer, and to deal with the
inherent challenges such as task distribution and time and effort management. We now know
how to effectively defend our work, persuade others, and communicate the concepts of
"intelligence" and interactivity to our clients.

Given the complexity of the links between the tables in the database, the biggest challenges encountered during the months of work were at the level of interpreting the data. Nevertheless, our approach remains broadly applicable and flexible.

By way of perspectives and avenues for development, we plan, on the one hand, to enhance the dashboard module so that it can keep up with the business field's rapid evolution and, on the other hand, to continue analyzing the data and to implement other methods of understanding it.

To conclude, I would like to express my satisfaction at having had the opportunity to work in a stimulating setting, with favorable material conditions, and as part of a strong team that encouraged me and helped me succeed in my project.

54
Bibliography

[1] COGEPHA website. https://www.cogepha.com/en/

[2] BI Architecture. https://www.domo.com/learn/article/what-is-business-intelligence-architecture

[3] ETL. https://www.geeksforgeeks.org/etl-process-in-data-warehouse/

[4] Data Mart vs Data Warehouse. https://www.snowflake.com/guides/difference-between-data-warehouse-and-data-mart

[5] GIMSI Method. www.executive-dashboard.org/performance-indicators/methodology-gimsi

[6] Agile BI. https://www.datapine.com/blog/agile-business-intelligence-analytics/

[7] Inmon or Kimball. https://www.computerweekly.com/tip/Inmon-or-Kimball-Which-approach-is-suitable-for-your-data-warehouse

[8] DW and STG. https://social.technet.microsoft.com/wiki/contents/articles/40301.dwarehouse-stg-ods.aspx

[9] SSIS. https://learn.microsoft.com/en-us/sql/integration-services/sql-server-integration-services?view=sql-server-ver16

[10] SSAS. https://learn.microsoft.com/en-us/analysis-services/ssas-overview?view=asallproducts-allversions

[11] Power BI. https://powerbi.microsoft.com/en-us/

[12] DW Models. http://infogoal.com/datawarehousing/data_models_for_data_warehbusiness_intelligence.htm

[13] SSIS Transformations. https://www.tutorialgateway.org/ssis-transformations

[14] OLAP Cubes. https://www.guru99.com/online-analytical-processing.html

[15] DAX. https://learn.microsoft.com/en-us/dax/

55
