Data Mining Project
SUCCESSFUL STUDENT
PERFORMANCE
Luis Pineda
Student Success is Not Genetic
❏ Social circumstances
❏ Lack of money for external help
❏ Problems at home
❏ Little/no help from family members
❏ Few family role models
❏ External responsibilities
- The aim of this project is:
- To find the most significant attribute(s) related to a student’s academic performance in math.
- To dispel any prior notions of what makes a student successful.
- To identify harmful traits and practices in the context of academia.
- Importance:
- With a solid understanding of this phenomenon, we could then turn our focus to troubled students and better provide the help and guidance they need.
- Predict troubled students before they fail. → Save resources, provide better education.
USING DATA MINING TO PREDICT SECONDARY SCHOOL
STUDENT PERFORMANCE: Paulo Cortez and Alice Silva
● Aimed to examine the relationships between student debt, mental health and academic
performance.
● Students' perceptions of their own levels of debt rather than level of debt per se relates to
performance. Students who worry about money have higher debts and perform less well than
their peers in degree examinations.
● Students from lower socioeconomic backgrounds and postgraduate students had higher
debts. There was no direct correlation between debt, class ranking or General Health
Questionnaire (GHQ) score; however, a subgroup of 125 students (37.7%), who said that
worrying about money affected their studies, did have higher debt and were ranked lower in
their classes.
● http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2929.2006.02448.x/full
Stressful life events and health-related quality of life
in college students: Damush, Teresa M.; Hays, Ron D.; DiMatteo, M. Robin
➢ Data was collected from 350 West Coast University Students:
○ 49.1% freshmen, 35.4% sophomores, 12.3% juniors, and 3.1% seniors
○ 50.0% were Caucasian, 36.8% Asian, 9.7% Hispanic, 1.4% African American, and 3.1% reported Other ethnicity.
○ 57.2% female
➢ Goal was to find a correlation between stressful life events and health-related quality of life (HRQOL):
○ Zero-order product moment correlations to evaluate associations between stressful life events experienced in the recent past and HRQOL measure (Seen in Table 2).
➢ Distressful events were found to have the largest impact
○ Illness, sexuality-related events, and deviance events
➢ Some measures were strongly intercorrelated. For example:
○ Respondents who reported greater anxiety, bodily pain, or depression also reported less sense of belonging, less positive affect, and poorer social functioning.
Gender, ethnicity, and social cognitive factors predicting the
academic achievement of students in engineering. Hackett, Gail; Betz, Nancy E.;
Casas, J. Manuel; Rocha-Singh, Indra A.
33 Attributes:
- School, Sex, Age, Address, Family Size, Parent’s Cohabitation Status, Mother’s Education, Father’s Education, Mother’s Job, Father’s Job, Reason for Choosing School, Student’s Guardian, Travel Time from School, Study Time Per Week, Number of past class failures, Extra School Support, Family Education Support, Extra Paid Courses, Extra Curricular Activities, Attended Nursery School, Wants Higher Education, Access to Internet, In a Romantic Relationship, Quality of Family Relationships, Free Time After School, Goes Out with Friends, Alcohol Consumption During Workday, Weekend Alcohol Consumption, Health Status, Absences
Data Description
● Collection of student & demographic data of 395 Portuguese students for Math class.
○ Acquired from school reports and questionnaires.
○ Clean data, no missing values. No data cleaning required.
- 208 of the participants were female (52.7%)
- 187 of the participants were male (47.3%)
● Predictor value of final grade: highly correlated with 1st and 2nd period grades.
○ G1 & G2 ignored for prediction
● Data is mostly:
○ Binary: “this” or “that”/ “yes” or “no”
○ Numeric: on a scale from 1-10, 2-5, etc.
● 649 Instances
● Multivariate
● Some correlation was found between passing or failing a class and: Mother/Father’s education level, Past failures, and Amount of time spent going out, though not very strong.
○ Failures had a correlation of -.338, the highest of all attributes
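The slides do not say how the -.338 correlation for failures was computed. One common choice is the Pearson product-moment correlation between an attribute and a 0/1 pass flag. The following is a minimal sketch under that assumption; the pass mark of 10 and the toy records are hypothetical and are not taken from the data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


PASS_MARK = 10  # assumption: hypothetical pass threshold, not stated in the slides


def to_pass_flag(final_grade, pass_mark=PASS_MARK):
    """Turn a numeric final grade into 1 (pass) or 0 (fail)."""
    return 1 if final_grade >= pass_mark else 0


# Hypothetical toy records: (number of past failures, final grade).
toy = [(0, 15), (0, 12), (1, 9), (3, 5), (0, 11), (2, 8), (0, 14), (1, 10)]
failures = [f for f, _ in toy]
passed = [to_pass_flag(g) for _, g in toy]

r = pearson(failures, passed)
print(f"correlation between past failures and passing: {r:.3f}")
```

A negative value here means that more past failures go together with a lower chance of passing, which is the direction reported for the real data.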
Depth  P   C   Training  Testing
12     20  10  75.4%     71.8%
Balanced
- Balanced data: 130 cases for passing and failing
Depth  P   C   Training  Testing
5      16  8   75.6%     63.1%
10     30  15  70.4%     52%
12     20  10  72.1%     61.1%
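The slides do not describe how the balanced data set was built. A simple way to get equal class sizes is random undersampling of the larger class, sketched below. The function name `undersample`, the record layout, and the 265/130 split of the toy records are assumptions made for illustration only.

```python
import random


def undersample(records, label_of, seed=0):
    """Return a class-balanced copy of `records`.

    Every class is cut down to the size of the smallest class by
    random sampling without replacement.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_label = {}
    for rec in records:
        by_label.setdefault(label_of(rec), []).append(rec)
    if not by_label:
        return []
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)  # avoid all rows of one class coming first
    return balanced


# Hypothetical records: 395 students, of whom 130 are assumed to fail.
records = [{"id": i, "passed": i >= 130} for i in range(395)]
balanced = undersample(records, lambda rec: rec["passed"])
print(len(balanced), "records after balancing")
```

Undersampling discards data from the larger class, which is one possible reason a balanced model can score differently on training and testing sets than an unbalanced one.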
Results:
➢ Most significant attributes:
○ Failures
○ Absences
○ Free time
○ Family support
➢ Model accuracy varied, though it was never consistently above 70%
○ Expected. Source study used 1st and 2nd Semester grades, which obviously resulted in higher accuracies
➢ Expected. Failures and absences were the highest contributors to a pass or fail grade.
➢ Surprisingly, on the balanced data, whether a student considered their mother or father their guardian also had some impact, with mothers having a higher passing score
➢ On unbalanced data, access to internet also had some impact, though it was more or less 50/50
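A decision tree ranks attributes by how cleanly a split on each one separates passing from failing students. The slides do not name the tool or the split criterion that was used, so the sketch below assumes Gini impurity, a common choice, and scores each attribute by its best single threshold split. All rows and labels are hypothetical toy values.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Find the split `value <= t` with the largest impurity reduction.

    Returns (threshold, gain); threshold is None if no split helps.
    """
    parent = gini(labels)
    n = len(labels)
    best_t, best_gain = None, 0.0
    # The largest value is skipped: it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


def rank_attributes(rows, labels):
    """Rank attribute names by the gain of their best single split."""
    scored = []
    for name in rows[0]:
        threshold, gain = best_threshold([row[name] for row in rows], labels)
        scored.append((name, threshold, gain))
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical toy rows, not taken from the real data set.
rows = [
    {"failures": 0, "absences": 2, "freetime": 3},
    {"failures": 0, "absences": 4, "freetime": 2},
    {"failures": 0, "absences": 0, "freetime": 4},
    {"failures": 0, "absences": 12, "freetime": 3},
    {"failures": 1, "absences": 10, "freetime": 4},
    {"failures": 2, "absences": 16, "freetime": 2},
    {"failures": 1, "absences": 3, "freetime": 3},
    {"failures": 3, "absences": 14, "freetime": 5},
]
labels = ["pass"] * 4 + ["fail"] * 4

ranking = rank_attributes(rows, labels)
for name, threshold, gain in ranking:
    print(f"{name}: split at <= {threshold}, gain {gain:.3f}")
```

A full tree repeats this search inside each branch until a depth limit or minimum node size is reached; the sketch only covers the first split.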
K-Nearest Neighbor Analysis
No Feature
3 72.9% 27.1% 3 65.8% 34.2%
1 67.8% 32.2%
1 73.9% 26.1%
3 71.2% 28.8%
3 72.7% 27.3%
6 69.6% 30.4%
6 68.9% 31.1%
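For readers unfamiliar with the method, a K-nearest neighbor classifier labels a student by majority vote among the K most similar students in the training data. The sketch below is a minimal illustration using Euclidean distance and the K values 1, 3, and 6 that appear in the results. The two features and every data point are hypothetical, and the slides do not state which distance measure or tool was used.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # On a tied vote, most_common keeps first-seen order, so the label
    # of the closest neighbor wins.
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_x)


# Hypothetical toy data: (past failures, absences) -> outcome.
train_x = [(0, 2), (0, 4), (0, 0), (1, 3), (2, 16), (3, 14), (1, 10), (2, 12)]
train_y = ["pass", "pass", "pass", "pass", "fail", "fail", "fail", "fail"]
test_x = [(0, 3), (2, 15), (0, 1), (1, 11)]
test_y = ["pass", "fail", "pass", "fail"]

for k in (1, 3, 6):
    acc = accuracy(train_x, train_y, test_x, test_y, k)
    print(f"K={k}: accuracy {acc:.1%}, error {1 - acc:.1%}")
```

Because distance is dominated by attributes with large numeric ranges, features are usually rescaled to a common range before running K-nearest neighbor on real data.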
Demo & Parental Influences Results
● The balanced decision tree was more successful for Demographics.
● K-nearest neighbor with K = 3 and no feature selection yielded the highest results for Demographics.
● The balanced decision tree was also more successful for Parental Influence.
● K-nearest neighbor with K = 1 and feature selection yielded the highest results for Parental Influence.
● These two test cases proved largely irrelevant when testing for a pass or fail on the final grade (G3).
Results Interpretation:
➢ It is harder to classify failing students
➢ Students who have failed one or more classes should be given attention, as
they are more likely to fail a course
➢ Students who have multiple absences are also more likely to fail a course
➢ While going out and enjoying one’s free time could seem dangerous for
struggling students, it is good to encourage some respite from stressful
schoolwork.
➢ Students whose mothers and fathers were both educated and employed had higher rates of passing.