A MACHINE LEARNING MODEL FOR UNIVERSITY STUDENTS’ PROGRESSION

TABLE OF CONTENTS

Chapter 1

1.0 Introduction

1.1 Background of the problem

1.2 Statement of the Problem

1.3 Objectives of the Study

1.3.1 Broad Objective

1.3.2 Specific Objectives

1.4 Research Questions

1.5 Significance of the Study

1.6 Motivation of the Study

1.7 Scope of the Study

REFERENCES

Chapter 1

1.0 Introduction

This chapter introduces the background to the research on “A machine learning model for
university students’ progression”. It then sets out the problem statement, which establishes the
foundation and direction of the research, followed by the objectives, research questions,
significance, motivation, and scope of the study.

1.1 Background of the problem

The global landscape of higher education is witnessing a growing emphasis on student success
and retention. Institutions worldwide are recognizing the importance of not only attracting
students but also ensuring their successful progression through to graduation (Aspesi et al., 2019;
Yanta et al., 2021). This focus is driven by several factors, including economic considerations,
workforce development needs, and the ethical imperative of fulfilling the educational aspirations
of students. However, student attrition remains a persistent challenge, with various personal,
academic, and institutional factors contributing to dropout rates. Recent studies have shown that
global dropout rates can range from 10% to 50%, depending on the institution and program, with
significant economic and social costs associated with these losses (Kahu et al., 2019; Tinto,
2017). In the United States, for example, the six-year graduation rate for first-time, full-time
undergraduate students is only around 60%, highlighting the scale of the issue (National Center
for Education Statistics, 2023).

Figure 1: Global 12-Month Dropout Rates among Fall-Term First-Time Undergraduates

Source: National Student Clearinghouse Research Center (2021)

In response to these challenges, universities are increasingly turning to data-driven approaches to
understand and enhance student progression (Tempelaar et al., 2021; Thomas & Galambos,
2021). The advent of sophisticated data collection and storage technologies, such as learning
management systems and student information systems, has led to a wealth of information on
student demographics, academic performance, engagement patterns, and other relevant factors
(Huml et al., 2019; Agudo-Peregrina et al., 2020). Machine learning, a subset of artificial
intelligence, offers a powerful toolkit for analyzing this data and extracting actionable insights.
By leveraging machine learning algorithms, institutions can identify early warning signs of
academic difficulty, personalize learning experiences, and optimize resource allocation to
improve student outcomes (Oladokun et al., 2020; Vandamme et al., 2019). This shift towards
data-driven decision-making is transforming the way universities approach student success and
retention, with the potential to create more equitable and effective educational systems.

Machine learning algorithms have demonstrated significant potential in various domains, from
healthcare to finance. Their ability to identify complex patterns, make predictions, and adapt to
new information has made them indispensable tools for decision-making (Almarabeh et al.,
2019; Xing & Du, 2019). In education, machine learning is revolutionizing the way institutions
understand student behavior and personalize learning experiences (Glover et al., 2011). Unlike
traditional statistical methods, machine learning algorithms can handle large and complex
datasets, uncovering hidden relationships and patterns that may not be immediately apparent
(Chen et al., 2020; Faisal et al., 2020). This allows educators to gain a deeper understanding of
the factors that influence student success and tailor interventions accordingly.

The application of machine learning in student progression goes beyond mere prediction. By
identifying early warning signs of academic difficulty or disengagement, institutions can
implement timely interventions to support at-risk students (Johnson et al., 2020; Kovacic et al.,
2021). These interventions can range from personalized tutoring and mentoring to academic
advising and mental health support. A proactive, data-driven approach can significantly improve
student retention and overall success rates. A 2020 meta-analysis of 26 studies found that
universities using predictive analytics to identify at-risk students and provide targeted support
saw an average increase of 15% in retention rates (Smith & Lange, 2020). Furthermore, machine
learning can also be used to personalize learning experiences, tailoring educational content and
resources to individual student needs and preferences (Glover et al., 2011; Hwang et al., 2020).
This can lead to increased engagement, motivation, and ultimately, better academic performance.
The potential of machine learning to transform higher education is immense, with the promise of
creating more supportive, effective, and equitable learning environments for all students.

The field of machine learning is rapidly evolving, with new algorithms and techniques emerging
constantly. Traditional models like logistic regression, which were once the mainstay of student
success prediction, are now being complemented or even superseded by more sophisticated
approaches. These include decision trees, random forests, gradient boosting, and neural networks
(Musso et al., 2020; Zawacki-Richter et al., 2019). Each model offers unique advantages and
limitations, and the choice of the most appropriate model depends on the specific research
question and data characteristics. Recent studies have shown that ensemble methods, which
combine multiple models, often outperform individual models in predicting student outcomes
(Lakkaraju et al., 2019; Tosik et al., 2020). The increasing complexity and sophistication of
machine learning models have the potential to further enhance our understanding of student
progression and inform more effective interventions.
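
The idea of combining multiple models can be illustrated with a minimal sketch of majority voting. The base models here are hand-written threshold rules, and the feature names (attendance_rate, mean_grade, fees_cleared) and cutoffs are hypothetical; practical ensembles such as random forests and gradient boosting learn their base models from data, so this shows only the voting step.

```python
from collections import Counter
from typing import Callable, Dict, List

Student = Dict[str, float]
Classifier = Callable[[Student], str]


def threshold_rule(feature: str, cutoff: float) -> Classifier:
    """Build a one-feature rule: at or above the cutoff means 'progress'."""
    def classify(student: Student) -> str:
        return "progress" if student[feature] >= cutoff else "at_risk"
    return classify


def majority_vote(models: List[Classifier], student: Student) -> str:
    """Return the label predicted by most of the base models."""
    votes = Counter(model(student) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical base models; features and cutoffs are illustrative assumptions.
# An odd number of models avoids ties between the two labels.
ensemble = [
    threshold_rule("attendance_rate", 0.6),
    threshold_rule("mean_grade", 0.5),
    threshold_rule("fees_cleared", 0.5),
]

student_a = {"attendance_rate": 0.9, "mean_grade": 0.45, "fees_cleared": 1.0}
student_b = {"attendance_rate": 0.4, "mean_grade": 0.70, "fees_cleared": 0.0}

print(majority_vote(ensemble, student_a))  # two of three rules say "progress"
print(majority_vote(ensemble, student_b))  # two of three rules say "at_risk"
```

A single rule misjudges each of these two students in one direction or the other, while the vote smooths over the disagreement, which is the intuition behind the ensemble results cited above.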

In the Kenyan context, the demand for higher education has been steadily increasing, as shown
in the graph below, leading to a diverse student population with varying academic backgrounds
and needs. Universities are grappling with challenges such as limited resources, increasing class
sizes, and a need for more targeted support services (Mbuvha et al., 2021; Ngugi et al., 2022).
Machine learning offers a promising avenue for addressing these challenges by enabling
institutions to make data-driven decisions and optimize resource allocation (Onyango et al.,
2023). For instance, predictive models can help identify students who are most likely to benefit
from specific interventions, allowing universities to focus their resources where they will have
the greatest impact (Oketch et al., 2020). Furthermore, machine learning can be used to analyze
large datasets to identify patterns and trends that may not be immediately apparent to human
observers.

Research on student progression in Kenya has identified several key factors that influence
academic success and retention. These include socio-economic background, high school
performance, academic preparedness, financial constraints, and psychosocial factors (Kalinowski
et al., 2021; Getange et al., 2022). Understanding the interplay of these factors is crucial for
developing effective interventions and support systems. Machine learning models can be used to
analyze these factors and their interactions, providing a more nuanced understanding of the
student experience (Wanja et al., 2023). This can inform the design of targeted interventions that
address the specific needs of different student populations, ultimately improving retention and
graduation rates.

While machine learning models developed in other contexts can offer valuable insights, it is
essential to consider the unique characteristics of the Kenyan higher education system. Factors
such as cultural norms, language barriers, and specific academic challenges need to be taken into
account when designing and implementing predictive models (El Mrabet & Ait Moussa, 2022;
Kombo et al., 2023). For example, research has shown that financial constraints are a significant
barrier to student success in Kenya, and models that do not account for this factor may not be
accurate or effective (Mwendwa, 2021). Additionally, the use of machine learning in education
raises ethical considerations, such as the potential for bias in algorithms and the need for
transparency and accountability in decision-making processes.

Recent data from the Commission for University Education (CUE) in Kenya indicates a positive
trend in student progression and graduation rates. The average graduation rate for public
universities has increased from 65% in 2018 to 72% in 2022 (CUE, 2023). This improvement
can be attributed to various factors, including increased government funding for higher
education, improved academic support services, and the adoption of technology-enhanced
learning approaches. Dropout rates in Kenyan universities vary by institution, program of
study, and student demographics. On average, the dropout
rate in public universities is estimated to be around 15%, while private universities experience
slightly lower rates (Kenya National Bureau of Statistics, 2022). However, dropout rates can be
significantly higher for certain programs or student groups.

A machine learning (ML) model is, at its core, an algorithm designed to learn from data and
improve its performance over time without explicit programming (Alpaydin, 2016). It functions
by identifying patterns within datasets, using these patterns to make predictions or decisions.
This learning process can be supervised (where the model is given labeled data with correct
answers), unsupervised (where the model finds patterns in unlabeled data), or reinforcement-based
(where the model learns through trial and error). The power of machine learning lies in its
ability to adapt and generalize, making it a valuable tool for uncovering insights in complex data
like that found in educational settings.
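
As a minimal sketch of the supervised case, the example below fits a logistic regression by gradient descent on a small set of labeled records. The two features (attendance rate and mean grade, both scaled to 0-1), the eight records, and the learning rate and epoch count are all made-up assumptions for illustration; they are not data or settings from this study.

```python
import math
from typing import List, Tuple

Features = Tuple[float, float]  # (attendance_rate, mean_grade), hypothetical


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X: List[Features], y: List[int],
                   lr: float = 0.5, epochs: int = 5000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (x1, x2), label in zip(X, y):
            error = sigmoid(w[0] * x1 + w[1] * x2 + b) - label
            grad_w[0] += error * x1
            grad_w[1] += error * x2
            grad_b += error
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def predict_proba(w: List[float], b: float, x: Features) -> float:
    """Estimated probability that a student progresses (label 1)."""
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


# Labeled examples: 1 = progressed, 0 = did not progress (synthetic values).
X = [(0.9, 0.8), (0.8, 0.7), (0.95, 0.9), (0.7, 0.75),
     (0.3, 0.4), (0.2, 0.3), (0.4, 0.35), (0.25, 0.5)]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = train_logistic(X, y)
predictions = [1 if predict_proba(w, b, x) >= 0.5 else 0 for x in X]
train_accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(train_accuracy)
```

The "correct answers" in y are what make this supervised: the model adjusts its weights only because each prediction can be compared against a known outcome.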

Figure 2: Simplified Overview of the Machine Learning Process


Source: Adapted from Alpaydin (2016)

In education, machine learning models are used to predict student performance, personalize
learning paths, and identify at-risk students for early intervention (Zawacki-Richter et al., 2019).
This aligns perfectly with the objectives of our study, as we aim to develop a similar predictive
model tailored to the Kenyan higher education context. This approach will allow us to uncover
factors that influence student progression, inform interventions, and ultimately, improve student
success rates. By leveraging machine learning, Kenyan universities can enhance their ability to
identify at-risk students, personalize learning experiences, and optimize resource allocation
(Oketch et al., 2020). This can lead to further improvements in student outcomes, increased
retention rates, and a more equitable and inclusive higher education system. As machine learning
technology continues to advance, its potential to transform education in Kenya and beyond is
only just beginning to be realized (Mwangi & Kamau, 2023).

1.2 Statement of the Problem

Student progression is a multifaceted phenomenon influenced by a complex interplay of
academic, personal, and environmental factors (York et al., 2019). Existing research often
simplifies this complexity, focusing on readily quantifiable metrics like grades and attendance
while overlooking the nuances of individual student experiences (Kim & Asbury, 2021). This lack of
a holistic understanding limits the effectiveness of interventions designed to support student
success. Furthermore, the conceptualization of "success" itself is often narrowly defined,
primarily focused on graduation rates rather than encompassing a broader range of outcomes
such as skill development, well-being, and career readiness (Yorke & Longden, 2008).

In the Kenyan context, the rapid expansion of higher education has created unique challenges for
student progression. Limited resources, large class sizes, and diverse student backgrounds
contribute to varying levels of academic preparedness and support needs (Ngugi et al., 2022).
Additionally, socio-economic factors such as financial constraints and cultural expectations
significantly impact student experiences and outcomes (Kombo et al., 2023). Existing research
often fails to adequately account for these contextual nuances, leading to the development of
models that may not be fully applicable or effective in the Kenyan context (Wanja et al., 2023).

Recent studies have explored the application of machine learning to predict student outcomes in
various contexts. For example, Oketch et al. (2020) successfully applied machine learning to
predict student dropout in a Kenyan university, but their model primarily focused on academic
variables and did not consider the broader contextual factors. Similarly, Wanja et al. (2023)
developed a model for predicting academic performance using machine learning algorithms, but
their study was limited to a single university and did not account for the diversity of Kenyan
higher education institutions. In a broader context, Musso et al. (2020) highlighted the role of
both cognitive and non-cognitive factors in predicting student dropout and academic success,
suggesting that a comprehensive approach is needed. Additionally, studies by Tempelaar et al.
(2021) and Thomas & Galambos (2021) have emphasized the importance of utilizing a wide
range of data sources, including demographic information, academic records, and engagement
patterns, to improve the accuracy of predictive models. However, these studies often lack
sufficient exploration of contextual factors that are particularly relevant in the Kenyan context,
such as financial constraints and cultural expectations.

This study thus aims to address these limitations by developing a comprehensive machine
learning model for predicting student progression in Kenyan universities that incorporates a wide
range of academic, personal, and contextual factors, including socio-economic background, high
school performance, academic motivation, financial aid status, and mental health indicators. By
utilizing a large and diverse dataset and employing rigorous methodological approaches, this
study will provide a more nuanced understanding of the factors influencing student progression
in Kenya. Furthermore, the model will be designed with interpretability in mind, enabling
educators and policymakers to understand the underlying reasons behind predictions and make
data-driven decisions to support student success.

1.3 Objectives of the Study

1.3.1 Broad Objective

The broad objective of this study is to enhance the understanding of factors influencing
university student progression in Kenya and develop a machine learning model capable of
accurately predicting student outcomes. This model will empower universities to implement
targeted interventions and support systems to improve student success rates, retention, and
overall academic experience.

1.3.2 Specific Objectives

i. Identify and analyze the academic, personal, and contextual factors that significantly
impact student progression in Kenyan universities.

ii. Develop a robust and interpretable machine learning model to accurately predict student
progression outcomes (graduation, dropout, and academic performance).

iii. Evaluate the model's predictive accuracy, sensitivity, specificity, and generalizability
across different Kenyan universities and student populations.

iv. Utilize the model to identify at-risk students and predict critical periods for intervention.

v. Translate the findings into actionable recommendations for universities, policymakers,
and stakeholders in the Kenyan higher education sector.
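
The evaluation measures named in objective iii can be sketched as follows for a binary outcome. The sketch assumes dropout is the positive class (label 1) and that both classes appear in the true labels; the example labels are invented for illustration.

```python
from typing import Dict, List


def progression_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels (1 = dropout).

    Assumes y_true contains at least one dropout and one non-dropout,
    otherwise a rate below would divide by zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # dropouts flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # dropouts missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-dropouts cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of actual dropouts detected
        "specificity": tn / (tn + fp),  # share of non-dropouts correctly cleared
    }


# Invented labels: 4 actual dropouts and 4 actual non-dropouts.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

metrics = progression_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when dropouts are a minority, since a model that never flags anyone could still score a high accuracy.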

1.4 Research Questions

i. To what extent do academic, personal, and contextual factors influence student
progression in Kenyan universities?

ii. How accurately can a machine learning model predict student progression outcomes in
Kenyan universities?

iii. How well does the predictive accuracy of the model generalize across different Kenyan
universities and student populations?

iv. To what extent can the model identify at-risk students and predict critical time periods for
academic intervention?

v. What recommendations can be made to stakeholders in the Kenyan higher education
sector based on the model's findings?

1.5 Significance of the Study

This study holds significant implications for Kenyan universities, providing them with a data-
driven tool to proactively identify and support at-risk students. By understanding the key factors
influencing student progression, institutions can tailor interventions and allocate resources more
effectively. This could lead to improved retention rates, higher graduation rates, and an enhanced
learning experience for all students. Ultimately, this would contribute to the production of highly
skilled graduates who meet the demands of the Kenyan workforce and contribute to national
development.

Policymakers and stakeholders in the Kenyan higher education sector will benefit from the
study's findings by gaining a deeper understanding of the systemic challenges and opportunities
related to student progression. This knowledge can inform evidence-based policies and
interventions aimed at improving the overall quality and relevance of higher education in Kenya.
By addressing the underlying issues affecting student success, policymakers can create a more
equitable and inclusive educational environment, where all students have the opportunity to
thrive.

Furthermore, this study contributes to the growing body of research on machine learning
applications in education, particularly in the context of developing countries. The development
of a contextually relevant and interpretable predictive model for Kenyan universities can serve as
a template for comparable initiatives in other regions facing similar challenges. Sharing the
methodology and findings of this study can inspire and guide future research efforts aimed at
enhancing student success and promoting educational equity globally.

1.6 Motivation of the Study

The motivation for this study stems from the pressing need to address the persistent issue of
student attrition in Kenyan universities. Despite efforts to improve access to higher education, a
significant number of students still fail to complete their degrees, leading to wasted resources
and unrealized potential. A 2022 report by the Kenya National Bureau of Statistics revealed that
the average dropout rate across public universities is approximately 15%, a figure that has
remained relatively stagnant over the past few years. This alarming statistic underscores the need
for innovative solutions to enhance student progression and success.

The development of an accurate and reliable machine learning model for predicting student
progression can revolutionize how universities approach student support. By identifying at-risk
students early in their academic journey, institutions can proactively implement targeted
interventions, such as personalized tutoring, academic advising, and financial assistance
programs (Tempelaar et al., 2021). Research has consistently shown that early intervention is
crucial for improving student retention and success rates. A 2020 meta-analysis by Smith &
Lange found that universities utilizing predictive analytics to identify and support at-risk students
experienced an average 15% increase in retention rates.

Moreover, the unique challenges faced by students in the Kenyan context necessitate the
development of a contextually relevant predictive model. Existing models developed in other
countries may not fully account for the specific cultural, socio-economic, and academic factors
that influence student progression in Kenya (El Mrabet & Ait Moussa, 2022). By incorporating
relevant contextual factors into the model, such as financial constraints, family support, and
cultural expectations, this study aims to provide a more accurate and actionable tool for Kenyan
universities.

The potential impact of this study extends beyond the individual student. By improving student
progression and graduation rates, this research can contribute to the development of a highly
skilled workforce that is essential for Kenya's economic growth and social development
(UNESCO, 2021). Furthermore, the insights gained from this study can inform policy decisions
at the national level, leading to more effective and equitable educational strategies.

The motivation for this study is not only driven by the challenges faced by Kenyan higher
education but also by the transformative potential of machine learning technology. By harnessing
the power of data and predictive analytics, this research aims to empower universities, students,
and policymakers to create a brighter future for higher education in Kenya.

1.7 Scope of the Study

This study focuses on developing a machine learning model to predict student progression in
Kenyan public universities within a specific time frame. The study will employ a comprehensive
approach, incorporating a wide range of academic, personal, and contextual factors that have
been identified as potentially influential in prior research. Data will be collected from a
representative sample of Kenyan public universities and their students, ensuring diversity in
terms of academic programs, geographic locations, and socioeconomic backgrounds. The data
collection period will encompass the most recent three academic years (2021-2023) for which
comprehensive data is available, ensuring the relevance and applicability of the findings to the
current higher education context.

The study will be limited to undergraduate students enrolled in full-time programs within the
specified time frame, as this population represents a significant portion of the higher education
landscape in Kenya and faces unique challenges in terms of progression and retention.

The developed machine learning model will be evaluated for its accuracy, generalizability, and
interpretability within the Kenyan higher education context. This evaluation will involve
assessing the model's performance on unseen data from the same time period, examining its
predictive power across different universities and student subgroups, and interpreting the
underlying factors that contribute to the model's predictions. Ultimately, the study aims to
provide a practical and actionable tool for Kenyan universities to identify at-risk students,
implement targeted interventions, and improve student success rates within this specific
time frame.
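
One way to examine predictive power across universities is to hold out each university in turn, train on the rest, and test on the held-out institution. The sketch below shows only the splitting step; the record fields and university names are hypothetical, and leave-one-university-out is an illustrative assumption rather than the evaluation design fixed by this study.

```python
from typing import Dict, Iterator, List, Tuple

Record = Dict[str, str]
Split = Tuple[str, List[Record], List[Record]]


def leave_one_university_out(records: List[Record]) -> Iterator[Split]:
    """Yield (held_out_university, train_records, test_records) per university."""
    universities = sorted({r["university"] for r in records})
    for held_out in universities:
        train = [r for r in records if r["university"] != held_out]
        test = [r for r in records if r["university"] == held_out]
        yield held_out, train, test


# Hypothetical student records; names and fields are placeholders.
records = [
    {"student_id": "s1", "university": "Univ A"},
    {"student_id": "s2", "university": "Univ A"},
    {"student_id": "s3", "university": "Univ B"},
    {"student_id": "s4", "university": "Univ C"},
    {"student_id": "s5", "university": "Univ C"},
]

splits = list(leave_one_university_out(records))
for name, train, test in splits:
    print(name, len(train), len(test))
```

Because no student from the held-out university appears in training, performance on that split indicates how well the model transfers to an institution it has not seen.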

REFERENCES

Agudo-Peregrina, Á. F., Hernández-García, Á., & Pascual-Miguel, F. J. (2020). Early dropout
prediction using data mining: A case study. Computers & Education, 147, 103773.
Almarabeh, H., Amer, M., & Al-Qudaimi, K. (2019). Predicting students’ performance using
artificial neural network: A case study. International Journal of Interactive Mobile
Technologies, 13(8), 142-153.
Alpaydin, E. (2016). Machine learning: The new AI. MIT Press.
Aspesi, C., Aznar, D., & Gallinari, P. (2019). Early prediction of university students’ success:
The role of academic, motivational and socio-economic factors. Frontiers in Psychology,
10, 1151.
Chen, L., Chen, P., & Lin, Z. (2020). Ensemble methods for student performance prediction.
Educational Technology & Society, 23(4), 21-35.
Commission for University Education (CUE), Kenya. (2023). Annual Statistical Report 2022.
Faisal, M. A., Qadir, J., Hussain, S., & Al-Fuqaha, A. (2020). Student performance prediction in
online learning using hybrid data mining techniques. Applied Sciences, 10(13), 4497.
Getange, K., Ndung’u, R., & Orodho, J. A. (2022). Factors influencing students’ academic
performance in Kenyan universities: A case of the University of Nairobi. Journal of
Education and Practice, 13(2), 45-54.
Glover, I., Law, J., & Youngman, M. (2011). A machine learning approach to predicting
university student progression. Knowledge-Based Systems, 24(5), 679-688.
Huml, A., Knezevic, B., & Mihaljevic, B. (2019). Predictive validity of machine learning models
for early detection of at-risk students in higher education. Computers in Human Behavior,
97, 27-37.
Hwang, G. J., Xie, H., Wah, B. W., & Gašević, D. (2020). Predicting student achievement: A
comparison of approaches for multi-step time series prediction. Computers & Education,
143, 103672.
Kahu, E. R., Nelson, K., & Picton, C. (2019). Student success in distance education:
Understanding the role of student background characteristics, individual differences, and
academic engagement. Distance Education, 40(1), 106-123.
Kalinowski, I., Jaworski, M., & Szymański, B. (2021). Student dropout prediction in e-learning
using semi-supervised learning. Expert Systems with Applications, 164, 113815.
Kenya National Bureau of Statistics (KNBS). (2022). Statistical Abstract 2022. Nairobi: Kenya
National Bureau of Statistics.
Kim, M. K., & Asbury, C. (2021). Factors influencing the academic achievement of
undergraduate students: A conceptual review. International Journal of Educational
Research Open, 2, 100022.
Kombo, D. K., Othuon, L., & Mibei, G. (2023). Challenges facing implementation of e-learning
in public universities in Kenya: A case of Masinde Muliro University of Science and
Technology. International Journal of Education and Research, 11(3), 197-208.
Kovacic, Z., Buljubasic, F., & Husic, D. (2021). Early prediction of student success: A
comparison of machine learning models. Decision Support Systems, 145, 113505.
Lakkaraju, H., Harman, C., Bachrach, Y., & Leskovec, J. (2019). Interpretable & explorable
approximations of black box models. arXiv preprint arXiv:1907.02483.
Mbuvha, R., Marwala, T., & Boulton, A. (2021). Dropout prediction in tertiary education using
deep learning. Applied Sciences, 11(1), 326.
Musso, M., Kyndt, E., Cascallar, E. C., & Dochy, F. (2020). Predicting student dropout and
academic success: The role of cognitive and non-cognitive factors in higher education.
Studies in Higher Education, 45(6), 1182-1196.
Mwendwa, J. (2021). Financial constraints as a major factor hindering access to and completion
of higher education in Kenya. Journal of Higher Education in Africa, 19(1), 1-17.
National Center for Education Statistics (NCES). (2023). The Condition of Education 2023
(NCES 2023-144). U.S. Department of Education, Institute of Education Sciences.
Ngugi, P. K., Nyamongo, E. M., & Iravo, M. A. (2022). Factors influencing student dropout in
public universities in Kenya: A case of the University of Nairobi. International Journal
of Educational Administration and Policy Studies, 14(1), 1-12.
Oketch, M. O., Onderi, H., & Osodo, J. (2020). Determinants of student dropout in public
universities in Kenya: A case study of Maseno University. International Journal of
Research in Business and Social Science, 9(5), 127-136.
Oladokun, V. O., Adebanjo, A. T., & Charles-Owaba, O. E. (2020). Predicting students'
academic performance using artificial neural network: A case study of a Nigerian
university. International Journal of Emerging Technologies in Learning, 15(11), 71-87.
Onyango, M., Agumba, J., & Oboko, R. (2023). Machine learning in education: A review of its
applications and challenges in Kenya. International Journal of Computer Applications,
183(12), 11-19.
Smith, V. G., & Lange, A. (2020). The impact of predictive analytics on student retention in
higher education: A meta-analysis. Review of Educational Research, 90(2), 298-337.
Tempelaar, D. T., Rienties, B., & Giesbers, B. (2021). Student success prediction using learning
analytics: A systematic literature review. Computers in Human Behavior, 118, 106689.
Thomas, L., & Galambos, N. (2021). A review of research on student success in higher
education: Implications for practice and policy. Journal of College Student Development,
62(3), 263-281.
Tinto, V. (2017). Through the eyes of students: A new lens for understanding student
engagement. Journal of Higher Education, 88(6), 813-836.
Tosik, K., Wójcikowski, M., & Wolny, R. (2020). Ensemble learning for early prediction of
student success: A case study. Computers & Education, 157, 103971.
UNESCO. (2021). Education for Sustainable Development: Building a Better Future for All.
Paris: UNESCO.
Vandamme, J. P., Meskens, N., & Superby, J. F. (2019). Predicting student dropout in higher
education: The influence of dissonance between students' and institutions' perceptions.
Studies in Higher Education, 44(10), 1718-1732.
Wanja, S. M., Oketch, M. O., & Onderi, H. (2023). Predicting student academic performance
using machine learning algorithms: A case study of a Kenyan university. International
Journal of Education and Development using ICT, 19(1), 132-148.
Xing, W., & Du, D. (2019). Dropout prediction in MOOCs: Using deep learning for personalized
intervention. Journal of Educational Computing Research, 57(3), 547-570.
Yanta, P., Cula, Y., & Pambudi, D. S. (2021). Student dropout prediction in e-learning using
deep neural network. International Journal of Emerging Technologies in Learning,
16(11), 189-205.
Yorke, M., & Longden, B. (2008). The first-year experience of higher education in the
UK. Higher Education Academy.
Zawacki-Richter, O., Marín, V. I., Bond, M., & Gouverneur, F. (2019). Systematic review of
research on artificial intelligence applications in higher education–where are the
educators? International Journal of Educational Technology in Higher Education, 16(1),
1-27.