
Tardies vs. GPA Regression Analysis


BAN 602 – Dr. Curtis Price
Fall 2021

Part 1: Introduction
The intent of this study is to analyze the impact of class tardies in a given week on a student’s grade
point average (GPA). The dependent variable is student GPA and the independent variable is the number
of class tardies. This direction makes sense because the number of tardies a student accumulates each
week is more likely to affect GPA than GPA is to affect the number of tardies.

Since a student’s GPA is calculated from the grades they receive in class, and earning good grades is
more difficult when a student is tardy (late) to class, a negative correlation between these two
variables is likely. That is, as a student’s tardies in a given week increase, we expect to see a lower
GPA.

Part 2: The Data


Data was collected on the campus of Victory College Prep. Specifically, I asked students as they entered
the building each morning if I could ask them a few questions. If they agreed, I asked a series of
questions, including their current GPA and the number of class tardies they had logged the previous
week. On this campus, student grades are updated weekly and students are informed of their academic
data during the first period of every week, which is why students can state this data with some degree
of confidence. Additionally, this school is a K-12 charter school campus with roughly 1000 students,
which provides a wide range of ages and data to draw on.

A total of 30 observations were collected. The definition of each variable and its descriptive
statistics are in the table below.

Variable Name   Description                                           Average   Max   Min
GPA             Grade point average on a 4.0 scale                    2.58      4.0   0.78
Class Tardies   Number of tardies accrued by student the prior week   4.33      17    0

Part 3: Regression Results


In this analysis, we explore the effect of class tardiness on GPA. GPA is the dependent variable and the
number of tardies to class is the independent variable. To understand the impact that tardies have on
GPA, we estimate the following simple linear regression equation:

GPA = β0 + β1(Tardies) + u
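
As a minimal sketch of how the two coefficients in an equation of this form could be estimated, the
ordinary least squares formulas for a single regressor fit in a few lines of Python. The function name
`ols_fit` and the sample values are hypothetical and chosen only for illustration; they are not the 30
survey observations, and this is not necessarily the software used for the analysis.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Slope: co-variation of x and y divided by the variation in x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation, so the slope cannot be estimated")
    b1 = sxy / sxx
    # Intercept: chosen so the fitted line passes through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical illustration only: made-up (tardies, GPA) pairs, not the survey data.
tardies = [0, 2, 4, 6, 8]
gpa = [3.5, 3.0, 2.5, 2.0, 1.5]
b0, b1 = ols_fit(tardies, gpa)
print(f"Estimated line: GPA = {b0:.2f} + ({b1:.2f}) * Tardies")
```

The check on `sxx` reflects the requirement that the independent variable must vary across
observations; without variation in tardies, the slope is not defined.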

The regression results are shown below:



Below is a scatter plot with the fitted linear trend line. The trend line is the result of a Simple Linear
Regression and it conforms to the estimate in the table above.

The estimate from the Simple Linear Regression model is shown in the equation below. For this model,
we put a hat (^) on the dependent variable to remind us that this is an estimate from data:

\widehat{GPA} = 3.17 - 0.14(Tardies)

The interpretation of the intercept term is that if a student had zero tardies, the student’s predicted
GPA would be 3.17. This amount seems reasonable given our data, but we should be cautious, since it is
unlikely that a student’s GPA is determined entirely by tardies to class.

The estimated impact of tardies on GPA is negative, indicating that the more tardies a student
accumulates in a given week, the lower the student’s GPA tends to be. The coefficient on the independent
variable is -0.14, indicating that each additional tardy is associated with a predicted GPA that is 0.14
points lower. This number is negative, which is what we expected and stated in the introduction. As a
student accumulates more tardies, they lose learning time and miss potential “top of class”
assignments. It is difficult to tell whether the size of the estimate is reasonable, but its sign falls
in line with the anticipated negative correlation.
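
To make the interpretation of the coefficients concrete, the short sketch below evaluates the fitted
equation at the minimum, average, and maximum number of tardies reported in the descriptive statistics.
The function name `predict_gpa` is a hypothetical helper, and the coefficients are the rounded values
from the fitted equation, so the predictions are approximate.

```python
# Coefficients as reported in the fitted equation (rounded to two decimals).
INTERCEPT = 3.17
SLOPE = -0.14

def predict_gpa(tardies):
    """Predicted GPA for a given number of weekly tardies."""
    return INTERCEPT + SLOPE * tardies

# Evaluate at the sample minimum, average, and maximum of tardies.
for t in (0, 4.33, 17):
    print(f"tardies = {t:>5}: predicted GPA = {predict_gpa(t):.2f}")
```

At the average of 4.33 tardies, the predicted GPA is about 2.56, close to the sample average GPA of
2.58. This is expected, because a least-squares line passes through the point of means; the small gap
is plausibly due to rounding of the coefficients.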

Part 4: The sufficient conditions for good estimates


Linear in parameters

Given the data and the scatter plot, there seems to be a clear correlation between tardies and GPA. The
assumption of a linear relationship between the dependent and independent variables does not seem
unreasonable, and the linear approximation is practical in this case.

Random sampling

The data was not collected randomly, which is a problem if we want to make inferences from these
results about a broader population. The fact that this is a convenience sample should give us pause
before inferring that the results are typical. First, the data was collected at a single K-12 charter
school, so it is unclear whether results would be similar across grade levels or for students who do
not attend a charter school. Second, having students state their own data rather than having it
verified by an unbiased source could skew or bias the data. Ideally, students would present proof of
their GPA and tardy data to make the data more accurate and reliable.

Variation in the independent variable(s)

Although 30 data points were collected, it is concerning that we do not see a broader spread of GPA
values, specifically between 0.0 and 2.0: only 8 of the surveyed students fell in this range.
Additionally, few students reported more than 8 tardies in a given week; in fact, only 4 of the
surveyed students did so. We would need to analyze schoolwide GPA and tardy data to know whether the
averages from our sample are reliable and representative.

Zero mean of the error term conditional on the independent variable(s)

The error term likely contains many factors that affect GPA but are not tardies to class. For example,
grade level, difficulty of classes, and particular teachers’ grading scales would certainly affect GPA
but are excluded from the data.
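
One way to see why this matters is the standard omitted-variable bias result, sketched here under the
assumption (not part of the original analysis) that a single omitted factor Z, such as course
difficulty, also belongs in the model. The symbols Z, β2, δ0, δ1, e, and v are introduced only for
illustration.

```latex
\begin{align*}
\text{Model including } Z:\quad & GPA = \beta_0 + \beta_1\,Tardies + \beta_2\,Z + e \\
\text{Relation of } Z \text{ to tardies}:\quad & Z = \delta_0 + \delta_1\,Tardies + v,
  \qquad \delta_1 = \frac{\operatorname{Cov}(Tardies, Z)}{\operatorname{Var}(Tardies)} \\
\text{Slope when } Z \text{ is omitted}:\quad & \operatorname{plim}\,\hat{\beta}_1 = \beta_1 + \beta_2\,\delta_1
\end{align*}
```

Under these assumptions, the estimated slope recovers β1 only if Z has no effect on GPA (β2 = 0) or Z
is unrelated to tardies (δ1 = 0); otherwise part of the estimated -0.14 would reflect the omitted
factor rather than tardies.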

Overall, it is hard to take the estimate seriously. We would need more data about the students in this
sample and would also need confidence that the independent variable is not correlated with factors that
have been excluded from the analysis. While this estimate does tell us something about the relationship
between GPA and tardies, it is likely that tardies alone are not driving this relationship. To
understand this relationship better, we should include more data and detail in our analysis. Student
data such as grade level and course load difficulty would make sense to include in future analysis.
