Data Science Project Report
GENERAL INTRODUCTION
As students, we all strive to be well-prepared for the workforce and ready to tackle any
challenges that may arise in our future careers. However, the traditional college education system
may not always fully prepare us for the specific demands of our chosen field. This is where our
data science project comes in.
Our project aims to bridge the gap between classroom learning and real-world experience
by collecting data from engineers already working in the IT field through surveys. By gathering
information about their everyday tasks, job responsibilities, and the subjects studied at college,
we can gain valuable insights into what skills and knowledge are necessary for success in the
industry. This data will provide a road map for students, highlighting the subjects that are most
relevant to their future careers, and will help them to make the most of their college education.
One of the key benefits of our project is that it will allow students to better understand the
skills that are in high demand in the industry and to make more informed decisions about their
education and career paths. It will also help to ensure that the IT engineering curricula at
colleges and universities are up to date with the latest industry trends and technologies, enabling
students to be better prepared for the workforce.
Furthermore, this project can also help to improve the educational system by identifying
areas where the curriculum needs to be updated and by providing students with access to
real-world examples of the skills and knowledge that they need to develop in order to be successful.
In summary, our project is designed to help students be better prepared for the workforce by
providing them with valuable insights into the skills and knowledge that are in high demand in
the IT industry. By collecting data from engineers already working in the field and analyzing
it, we can help to enhance the educational system and prepare students for the real-world
challenges that they will face in their future careers.
The remainder of this report follows the methodology that we adopted, the IBM Master Plan.
The first chapter presents its first two steps, business understanding and analytic approach.
Its first section studies the existing solutions, describes the proposed solution and its goals,
and sets out the project requirements. A later section covers the business and data science
objectives, the SMART strategy, and the KPIs. A general conclusion summarizes the work done
and presents possible perspectives.
CHAPTER I
Introduction
This chapter covers the first two steps of the IBM Master Plan, business understanding and
analytic approach. It presents the problem statement, a study of the existing situation, and
the objectives and requirements associated with the project. It also outlines the business and
data science objectives, the SMART strategy, and the KPIs.
1 Business understanding
The first step of the IBM Master Plan is Business Understanding. The goal of this step is to
gain a comprehensive understanding of the current business environment, including objectives
and potential risk factors, and to align the project goals with the overall business strategy.
A project case is an essential element of any project; it captures the reasons for initiating
the project and the approach taken to address them. Therefore, this sub-chapter defines the
problem statement and our proposed solution.
1.2 Problem statement
In this sub-section, we discuss the limitations of the traditional curriculum that our project
has the potential to overcome.
• Schedule Rigidity : The traditional curriculum often has a rigid schedule that does not
allow for flexibility in case of absence or individual differences in learning pace.
• Increased Costs : The traditional curriculum often involves high costs, not only in terms
of money but also in terms of time and resources, which can be a hindrance for students.
• Limited Didactic Material : The availability of didactic material can be limited, and
what is available can often be expensive, which can create difficulties for students and
their families.
• Outdated Teaching Methods : The traditional curriculum may not be up-to-date with
the latest teaching methods and technologies, making it less effective and less engaging
for students.
• Inflexible Assessment Methods : The traditional curriculum often employs inflexible
assessment methods that may not accurately reflect the individual strengths and weaknesses
of each student.
• Insufficient Career Preparation : The traditional curriculum may not provide adequate
preparation for students entering the workforce, leaving them ill-equipped to succeed in
their chosen careers.
• Lack of Relevance : The traditional curriculum may not be relevant to the real-world
challenges and opportunities faced by students, making it less engaging and less effective.
Several platforms already aim to predict a suitable curriculum. They collect and analyze data
on student preferences, learning styles, performance, and other relevant factors in order to
suggest the courses or study materials that would be most beneficial for individual students.
The most popular are Knewton and Carnegie Learning. Both provide personalized
learning experiences for students. They use AI algorithms to predict a student’s curriculum and
adapt the learning experience to their individual needs and pace.
Table I.1 below presents the advantages and disadvantages of Knewton.
Knewton
Advantages
• Designed to provide a customized learning experience for each student based on their
strengths, weaknesses, and learning style.
• Data-driven: The platform uses data to track student progress and provide insights into
areas where they need additional support.
• Adaptive learning: The platform adjusts the difficulty of questions based on a student’s
progress, making it a more efficient way for students to learn and retain information.
• Interactive content (videos, animations, simulations).
Disadvantages
• Limited subject areas: Although Knewton offers a range of subjects, it may not have
resources available for every subject or level.
• Reliance on technology: Knewton’s personalized learning experience is heavily reliant on
technology and internet access, which can be a challenge for some students.
• Limited integration: Knewton’s platform may not integrate well with existing learning
management systems and tools used by schools and universities.
• Expensive.
Table I.2 below presents the advantages and disadvantages of Carnegie Learning.
Carnegie Learning
Advantages
• Research-based: The company’s approach to education is based on research and data,
ensuring that its solutions are evidence-based and effective.
• Student-centered learning: Carnegie Learning’s solutions are designed to be
student-centered, with a focus on active learning and engagement.
• Collaborative learning: The company’s technology and solutions encourage students to work
together and engage in collaborative learning, promoting a more inclusive and supportive
learning environment.
• Emphasis on math.
Disadvantages
• Limited subject areas: Carnegie Learning only focuses on math education, which means that
it may not be suitable for students who need resources in other subjects.
• Technological challenges: The solutions require technology and internet access.
• Steep learning curve: The company’s technology and solutions may have a steep learning
curve for some students and educators, requiring time and training to fully understand and
use effectively.
• Expensive.
Our project not only aims to address the current limitations of the traditional curriculum, but
also provides additional value by incorporating key features that enhance its usefulness and
impact.
• Web application : A front end that lets students access the platform, sign in,
and specify their preferences within their desired field. This feature gives students
insight into which specialty they should pursue. The front end will serve as a valuable
tool for the secondary target audience, providing them with a personalized and
interactive experience.
• Mobile compatibility : A mobile app that allows students to access their learning
materials and progress on the go.
• Career preparation: Providing students with resources and tools to help them prepare
for their future careers.
• Multilingual support: Providing support for multiple languages to cater to a global
audience.
• Extensibility: The project also prioritizes extensibility by being designed in a way that
allows it to be trained on future data. This will enable the model to continually adapt and
evolve as the needs of the alumni students and the industry change over time.
• Smart agent integration: A chatbot can greatly enhance the student’s learning experience.
It would use natural language processing and machine learning algorithms to assist
students in their learning journey.
• Social learning: A social learning environment that allows students to collaborate and
share their knowledge with each other.
In order to overcome the challenges presented by the traditional way of education at ESPRIT,
we propose an AI-based platform that we call PRODIGY PATH.
Our platform provides a flexible, competency-based curriculum that is tailored to the
individual needs and goals of each student. By combining advanced AI algorithms with the
inputs from ESPRIT alumni, our platform aims to generate the most efficient curriculum for
education at ESPRIT.
Our solution aims to provide ESPRIT students with the skills and competencies that are in
high demand in the workplace. Its key features will be:
• Hands-on experience : Our platform provides students with opportunities for hands-on
experience, including internships, practicums, and real-world projects.
• Flexible scheduling : Our platform provides students with a flexible schedule, allowing
them to learn at their own pace and on their own time.
In this section, we analyze in depth our two types of requirements, which are:
• Functional requirements: define the basic system behavior, that is, the different functions
intended for the users of the application.
• Non-functional requirements: specify how well the system should perform in its internal
functioning.
• Skills Deficiency Analysis : Identify skill deficiencies among alumni based
on the data collected and analyzed (comparison between the studied curriculum and the
job requirements reported by alumni).
• Reporting : Generate reports and visualizations that effectively communicate the
results of the analysis and recommendations to stakeholders.
• Collaboration : The ability for team members to cooperatively access and work on
the platform.
• Student engagement : The platform should ensure that each student’s specialty choice
is assigned based on their preferences within the IT field.
• Content management : The platform should allow educators to create, upload, and
manage a variety of educational content, e.g., modules and subjects.
• Used technologies :
• Scalability : ability to scale the solution as the volume and complexity of data increase.
• Availability : ensuring that the solution is accessible and available when needed, with
minimal downtime.
• Reliability : ensuring that the solution produces consistent and accurate results, and that
it operates as expected.
• Security : protection of sensitive data, ensuring data privacy and confidentiality, and
guarding against unauthorized access.
• Usability : designing the solution to be easy to use and understand, with a clear and
intuitive user interface.
• Interoperability : ensuring that the solution integrates with existing systems and
technologies, and that it can exchange data with other systems.
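The Skills Deficiency Analysis requirement listed among the functional requirements above compares the studied curriculum with the job requirements reported by alumni. A minimal sketch of that comparison, assuming each survey response has already been reduced to a list of skill names, might look like the following. The function name `skills_gap` and the sample data are hypothetical and are not part of the project itself.

```python
from collections import Counter


def skills_gap(curriculum_skills, alumni_job_skills):
    """Rank skills that alumni need at work but that the curriculum does not cover.

    curriculum_skills: iterable of skill names taught in the curriculum.
    alumni_job_skills: list of skill lists, one list per survey respondent.
    Returns (skill, share_of_respondents) pairs, most frequently reported first.
    """
    # Normalize names so that "Python" and " python " count as the same skill.
    curriculum = {s.strip().lower() for s in curriculum_skills}

    counts = Counter()
    for skills in alumni_job_skills:
        # A set ensures each skill is counted at most once per respondent.
        counts.update({s.strip().lower() for s in skills})

    respondents = len(alumni_job_skills)
    gap = {
        skill: count / respondents
        for skill, count in counts.items()
        if skill not in curriculum
    }
    # Sort by descending share, then alphabetically for a stable order.
    return sorted(gap.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Illustrative data only, not taken from the survey.
    taught = ["Python", "SQL"]
    reported = [["python", "Docker"], ["docker", "Kubernetes"], ["SQL", "Docker"]]
    for skill, share in skills_gap(taught, reported):
        print(f"{skill}: reported by {share:.0%} of respondents")
```

Ranking by the share of respondents rather than by raw counts keeps the result comparable when the number of survey responses changes between collection rounds.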
In this sub-section, we discuss the constraints that might influence the development process
of our project and that must be taken into account in order to mitigate the associated
risks.
• Data Availability : The availability of data from alumni may be limited, making
it difficult to fully capture the scope of skills gaps among this group.
• Data Quality : The quality of the data collected from alumni may be questionable,
making it challenging to rely on the data for analysis.
• Privacy Concerns : Protecting the confidentiality of the data collected from alumni
may be a challenge, especially if the data contains sensitive information.
• Computational power : Analyzing large amounts of data may require a significant amount
of computational power, which may be a constraint on the project.
• Budget : The budget allocated for the project may be limited, making it difficult to acquire
the necessary resources and tools needed to carry out the analysis.
• Timeframe : The timeframe for completing the project may be limited, making it
challenging to carry out the necessary data collection, analysis, and reporting activities.
2 Analytic approach
The analytic approach is a critical component of any data science project, as it determines
the methods and tools used to solve the business problem. The goal of this section is to provide a
clear and concise overview of the analytic approach that will be used to address the
competency-based curriculum problem faced by our school.
2.1 Business objectives
• For ESPRIT : As our main client, ESPRIT aims through this project to improve the quality
of its education, which leads to:
– Marketplace domination : ESPRIT will gain a stronger competitive advantage over other
universities through the quality of its students.
– Better reputation : ESPRIT will have a strong brand image that is recognized and
trusted by IT companies around the world.
– Better economic growth for the university: ESPRIT will attract more students, which
will generate more revenue and support its financial growth and stability.
– More room for research and development: ESPRIT will be one of the pillars that
contribute to the improvement of education and scientific research.
– Improved teacher efficiency: Teachers will be able to tailor their instruction and
focus on areas where their students need the most help.
– Opportunities for networking: By working with other students, faculty, and staff on
the project, students can expand their network and make valuable connections
for the future.
• For the country : This project helps the country advance the United Nations
Sustainable Development Goals (SDGs) :
Data science objectives play a critical role in ensuring that a project meets its business
objectives. In the context of the university alumni recommender system project, specifying the
data science objectives is essential in order to develop an effective solution that meets the needs
of future students.
The primary objective of the project is to use the available data to develop a recommender
system that can provide customized course recommendations to each student. To achieve this
objective, it is necessary to select the appropriate data science techniques and tools that can
effectively process the large amount of data contained in the alumni dataset.
The selection of data science techniques and tools will depend on the specific requirements
of the project. For instance, content-based filtering and collaborative filtering are two
commonly used techniques in recommender systems, and these could be used to develop the alumni
recommender system. It is also crucial to use machine learning algorithms to process the data
and generate accurate recommendations.
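To make the two techniques named above concrete, the following is a minimal sketch and not the project's actual implementation. It assumes that courses are described by skill tags, that the student has a profile of interests over the same tags, and that alumni have rated the usefulness of courses on a scale from 0 to 1. All function names, the data layout, and the weighting parameter `alpha` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_scores(student_profile, course_features):
    """Content-based filtering: compare each course's skill tags with the student's interests."""
    return {
        course: cosine(student_profile, features)
        for course, features in course_features.items()
    }


def collaborative_scores(student_ratings, alumni_ratings):
    """User-based collaborative filtering: similarity-weighted mean of alumni ratings."""
    totals, weights = {}, {}
    for ratings in alumni_ratings.values():
        similarity = cosine(student_ratings, ratings)
        if similarity <= 0:
            continue  # ignore alumni with no overlap with the student
        for course, rating in ratings.items():
            if course in student_ratings:
                continue  # only score courses the student has not rated yet
            totals[course] = totals.get(course, 0.0) + similarity * rating
            weights[course] = weights.get(course, 0.0) + similarity
    return {course: totals[course] / weights[course] for course in totals}


def hybrid_scores(content, collaborative, alpha=0.5):
    """Blend both score sets; alpha is the weight given to the content-based part."""
    courses = content.keys() | collaborative.keys()
    return {
        course: alpha * content.get(course, 0.0)
        + (1 - alpha) * collaborative.get(course, 0.0)
        for course in courses
    }
```

A weighted blend is only one way to combine the two approaches. Its main advantage here is that a course with no alumni ratings yet can still be recommended through its content score.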
Furthermore, the choice of tools will be influenced by factors such as the size of the data,
the computing resources available, and the required processing speed. For instance, if the data
is too large to process on a single machine, distributed computing techniques could be utilized
to improve the processing speed.
The data science objectives of the university alumni recommender system project could
include:
• Data Collection : Collecting and organizing the alumni data, which includes their names,
graduation year, current and past companies, technical and soft skills, and daily work
activities, through form submissions which we consider to be E-interviews.
• Data Pre-processing : Cleaning and preprocessing the data to handle any inaccuracies,
outliers, or missing values, and to ensure that it is in a format suitable for analysis. This
step is necessary because the data is collected through form submissions, which are prone
to errors.
• Model Development : Developing and training machine learning models to generate
recommendations based on the alumni data. In our case, we will be building a recommender
system for a curriculum using a combination of content-based filtering and collaborative
filtering methods.
• Model Evaluation : Evaluating the performance of the models, determining the
best-performing model, and refining the models as necessary. For example, we can use
Precision to measure the percentage of recommended courses that are relevant to the student,
and Recall to measure the percentage of relevant courses that are recommended.
These metrics can be used to evaluate the overall quality of the recommendations
generated by the system.
• Model Maintenance : Monitoring the model’s performance and making any necessary
updates or changes to ensure that it continues to provide accurate recommendations over
time.
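The Precision and Recall metrics mentioned under Model Evaluation can be computed over the top k recommendations. The sketch below assumes that the system returns a ranked list of course names without duplicates and that the set of courses actually relevant to the student is known, for example from alumni feedback. The function name and the sample data are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Compute precision and recall over the first k recommended courses.

    recommended: ranked list of course names, best first, without duplicates.
    relevant: collection of courses known to be relevant to the student.
    """
    top_k = recommended[:k]
    relevant = set(relevant)
    hits = sum(1 for course in top_k if course in relevant)

    # Precision: share of the recommended courses that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: share of the relevant courses that were recommended.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Illustrative data only.
    ranked = ["ml", "cloud", "networks", "databases"]
    useful = {"ml", "databases", "security"}
    precision, recall = precision_recall_at_k(ranked, useful, k=3)
    print(f"precision@3 = {precision:.2f}, recall@3 = {recall:.2f}")
```

In this example, one of the three recommended courses is relevant and one of the three relevant courses is recommended, so both metrics equal one third.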
SMART is an acronym that stands for specific, measurable, achievable, relevant and
time-bound. Each element of the SMART framework works together to create a goal that is carefully
planned, clear and trackable. Below, we present the SMART strategy for our project :
• Specific :
The goal of the project is to identify the skills and knowledge needed by engineers in the
workforce and use this information to update the curriculum for students in engineering
programs.
• Measurable :
The success of the project will be measured by the degree to which the updated curriculum
aligns with the skills and knowledge required by engineers in the workforce, as determined
through surveys and interviews with engineers and employers.
• Achievable :
The project will be completed by conducting surveys and interviews with a representative
sample of engineers and employers in various industries, and using the data collected
to update the engineering curriculum. A team of data scientists, educators, and subject
matter experts will be assembled to complete the project.
• Relevant :
The updated engineering curriculum will better prepare students for success in the
workforce and help bridge the skills gap between education and industry.
• Time-bound :
The project will be completed within 13 weeks, including 3 weeks for understanding the
business, 6 weeks for data collection and analysis, 1 week for data modeling and 1 week for
creating an attractive dashboard. Intermediate deadlines will be established for completion
of specific project milestones, such as data collection and curriculum development.
2.4 KPIs
A Key Performance Indicator (KPI) is a metric used to evaluate the success of a business
process or an aspect of a business. KPIs help organizations track their progress and make
data-driven decisions. In our project, the KPIs could include metrics such as:
• The cost savings achieved from training investments based on the data analysis.
• The effectiveness of the training programs in filling the identified skills gaps.
• Curriculum conformity with the description of the profession given by the relevant
governing body.
Conclusion
In conclusion, this chapter has discussed the elements that are integral to the success of
our project. In particular, we have highlighted the problems associated with the existing system
and reviewed the project objectives and requirements. Understanding these elements is
essential to achieving our data science objectives and to making well-founded, data-driven
decisions in the following phases. A clearer view of the project’s goals and implementation
strategies should, in turn, lead to a better final solution.
GENERAL CONCLUSION
In today’s rapidly changing and competitive world, it is crucial for organizations and
individuals to continually enhance their skills and knowledge. A competency-based curriculum,
which focuses on what learners are expected to do rather than just what they are expected to
know, can provide a solution to this need. The implementation of a competency-based
curriculum can help identify skills gaps and plan for future training investments, ultimately
saving time and resources.
Our project aims to empower learners to take control of their own learning paths by utilizing
data science to establish their skill sets and provide personalized recommendations. The use of
technology, such as social learning environments and emotion detection in form submissions,
helps to create a more immersive and interactive experience for students. The integration of a
conversational AI model, such as GPT-3, allows for an even more personalized experience as
students can receive instant and relevant answers to their questions.
Throughout the development and implementation of this project, it is essential to monitor its
success and effectiveness. This can be done through key performance indicators (KPIs), such as
student satisfaction and engagement, as well as the improvement of their skills and knowledge.
By continuously measuring and analyzing these KPIs, we can ensure that the project is meeting
its goals and making a positive impact on student learning.
In conclusion, the integration of a competency-based curriculum and data science has the
potential to revolutionize the way we approach learning and skill development. By utilizing
technology to personalize the learning experience and constantly monitoring its success, we
can empower learners to take control of their own paths and equip them with the skills and
knowledge necessary for success in today’s ever-changing world.