into treatment programs, such as summer bridge programs, is based on self-selection. Self-selection
makes it very difficult to estimate the true treatment effect because the selection process itself
often introduces a source of bias.
Design/methodology/approach – By using propensity scores, the authors can match students
who participated in the summer bridge program with equivalent students who did not participate
in the summer bridge program. By matching students in the treatment group to equivalent students
who do not participate in the treatment, the authors can obtain an unbiased estimate of the treatment
effect. The authors also describe a method to conduct a sensitivity analysis to estimate the amount
of hidden bias generated from unobserved factors that would be needed to alter the inferences made
from a propensity score matching analysis.
Findings – Findings suggest there is no significant difference in the pass rates of the subsequent
intermediate algebra course for students who participated in the summer bridge program when
compared to matched students who did not participate in the summer bridge program. Thus, students
who participated in the summer bridge program fared no better or worse than similar
students who did not participate in the program. These findings also appear to be robust to hidden bias.
Originality/value – This study describes a unique way to estimate the causal effect of participating
in a treatment program when there is self-selection into the treatment program.
Keywords Elementary algebra, Intermediate algebra, Mathematics education,
Propensity score analysis, Sensitivity analysis, Rosenbaum bounds, Summer bridge programmes
Paper type Research paper
Introduction
Developmental mathematics programs have been a controversial issue in higher
education for many years (Adelman, 1998; Boylan et al., 1994; Boylan and Saxon, 1998).
The need to evaluate developmental mathematics programs is widespread, with most
institutions doing some type of analysis on their effectiveness (Altieri, 1990; Umoh
et al., 1994; Waycaster, 2001, 2004). Early intervention programs for developmental
mathematics are often used to bridge the gap between high school and college-level
mathematics courses (Boylan, 1999; Boylan et al., 1992). One such early
intervention program is the summer bridge program (Garcia, 1991; Strayhorn, 2011).
Summer bridge programs are usually offered to students in the summer before they
enroll in their first semester. The objectives of such bridge programs are to increase
students’ precollege mathematics knowledge through targeted instruction in content areas
of need (Edgecombe, 2011), and sometimes also include assistance with study skills and
a general acclimation to campus and college life. These programs can help students avoid
sitting through an entire semester of remedial coursework when in reality they may only
need to review a portion of the content found in the course (Boylan and Saxon, 1999).

Journal of Applied Research in Higher Education, Vol. 7 No. 2, 2015, pp. 331-345.
© Emerald Group Publishing Limited, 2050-7003. DOI 10.1108/JARHE-01-2014-0010
Many community colleges and universities offer bridge programs and have
researched their effectiveness. For instance, the National Center for Postsecondary
Research (Barnett et al., 2012) published a study on eight summer bridge programs in
Texas. The study found that in the first year and a half after the summer bridge
programs, students that completed the programs passed their first college-level math
course at higher rates when compared to students who did not participate in the
programs. In addition, the percentage of students that passed their first college-level
mathematics course after participating in a summer bridge program was higher than
non-participating students and remained statistically significant for the first four
semesters following participation in the program.
A summer bridge program in mathematics, reading, and writing conducted in 2009
at Elgin Community College in Illinois found that 70 percent of students who
participated in the program then placed into a college-level course. In the following
year, the percentage of bridge students that placed into a college-level mathematics
course increased to 87 percent. They also found that 82 percent of the successful
summer bridge students then earned the grade of C or better in the next subject-related
college-level course that fall as compared to 76 percent of students who did not
participate in the program (Douglas and Schaid, 2010).
These are just a few examples of studies suggesting that summer bridge programs
can be an effective way to get students ready for college-level courses and their success
can be attributed to many different factors. One such factor is the use of a variety of
instructional methods such as hands-on and visual approaches to learning (Boylan and
Saxon, 1999). Bonham and Boylan (2011) also suggest utilizing technology to deliver a
variety of instructional methods, especially with developmental mathematics programs
where students will use technology to identify strengths and weaknesses in their
content knowledge.
Although summer bridge programs appear to be effective in getting students
prepared for college-level courses, most of the research on the effectiveness of these
programs relies on observational data and this poses a problem for estimating the true
effect of such an intervention program. Many of the aforementioned studies used simple
descriptive techniques to compare students who participated in the bridge program to
those students who did not. As with many observational studies in mathematics
education, when participation in a treatment program is based on selection, this can be a
concern because selection effects can bias the estimate of a treatment effect (Graham,
2010; Graham and Kurlaender, 2011).
Summer bridge programs are typically designed for students that demonstrate
weaker skills and most programs invite, rather than mandate, students to participate
in the program. Thus, some invited students may choose to participate while others do
not. This makes estimating the true program effect very difficult because the reasons
students self-select into such a program vary and are based on many different factors
that may or may not also have an effect on student performance in and after the
program (Shadish et al., 2002). For example, students with the weakest mathematical
skills may more frequently elect to participate in a summer bridge program because
they (and/or their parents) are more likely to believe they need the remediation and
hope the program will help them. It may also be that more females will select to
participate in a bridge program. Thus, the only way to estimate the true effectiveness of
such programs would be to conduct a random assignment (Shadish et al., 2002).
A random assignment would require assigning students to participate in the
summer bridge program based only on chance (Shadish et al., 2002). By randomly
assigning students to participate in the summer bridge program (the treatment group)
versus not participating in the summer bridge program (the control group), the two
groups would be established as equivalent except for the treatment assignment.
If both groups were equivalent at the onset of the study, then a simple comparison of the
success rates between the two groups would be an unbiased estimate of the treatment
effect. However, with summer bridge programs, assigning students to participate
by virtue of random assignment is not practical. Thus, we are forced to deal with
self-selection, which is one of the key issues in evaluating summer bridge programs and
can make the estimate of the program effect biased (Schneider et al., 2007).
To address selection effects with observational data, statistical techniques such as
propensity scores (Graham, 2010; Graham and Kurlaender, 2011; Guo and Fraser, 2010;
Morgan and Winship, 2007; Rubin, 1997) can be used to obtain an unbiased estimate
of a treatment effect. Propensity score analysis is a way to match treatment and control
participants and thus obtain an unbiased estimate of the treatment effect (Guo and
cohort in this study, there were six problem sets, one for each of the six units in the
elementary algebra curriculum, and students had access to them prior to and
throughout the Summer Institute. The problem sets had to be completed in one sitting
but could be attempted (with the same question types but with different questions)
as often as students chose. Students received immediate feedback after each attempt in
a review that showed each question again along with the student answer and the
correct answer, and those attempt reviews could be examined again at any time. On the
last day of the program, students took the exemption examination. The examination
is based on the database of questions used for the practice problem sets and is
administered and proctored on campus. Scores of 70 percent or better earn exemption
from elementary algebra and provide placement into the next course, intermediate
algebra. Although intermediate algebra does not count toward the general education
requirements for the university, it does earn credit toward graduation.
Data
The data consist of n = 506 full-time, first-time students, all of whom took
intermediate algebra in their first semester in the fall of 2010. One group consisted of
students who initially placed into, and enrolled directly in, the intermediate algebra
course; the other group was the treatment group. The treatment group consisted of
the 18 students who originally placed into elementary algebra and then
successfully completed the bridge program during the summer of 2010, which elevated
their placement to intermediate algebra, and who also enrolled in intermediate algebra in
the fall of 2010. About half of the students that successfully completed the bridge program
did not enroll in intermediate algebra in the fall of 2010 when the data were collected
despite their eligibility[1]. Descriptive statistics for both the treatment and control groups
can be found in Table I. Variables collected for this study are described as follows:
INSTITUTE – this is a binary variable that represents whether or not a student
participated in the Summer Institute (the value “1” was assigned if the student participated
in the Summer Institute, and “0” was assigned if the student did not participate in the
Summer Institute).
PASS – this is a binary variable that represents whether or not a student passed the
intermediate algebra course with the grade of a C- or better on their first attempt[2]
(the value “1” was assigned if the student passed intermediate algebra with the grade of
C- or better on their first attempt, and “0” if the student did not earn at least a C- on their
first attempt taking the course).
FEMALE – this is a binary variable that represents a student’s gender (the value “1”
was assigned if the student identified as female and “0” was assigned if the student
identified as male).
SATMATH – this is a continuous score received on the mathematics portion of the
SAT examination (scores range from 280 to 540).
SATVERBAL – this is a continuous score received on the verbal portion of the SAT
examination (scores range from 290 to 690).
SATWRITING – this is a continuous score received on the written portion of the
SAT examination (scores range from 320 to 710).
Analysis
The effect of participating in the Summer Institute on whether or not a student passed
intermediate algebra on their first try was initially estimated by considering the
percentage of students who passed intermediate algebra based on whether or not they
participated in the Summer Institute. The numbers of students who passed and failed
intermediate algebra based on seminar participation is presented in Table II.
Approximately 74.0 percent of students who did not participate in the Summer
Institute passed the intermediate algebra course as compared to only 50.0 percent of
students who participated in the Summer Institute (Fisher’s exact test, p < 0.05). This
finding would suggest that the Summer Institute did not help students succeed in
intermediate algebra, and in fact, the students who participated in the Summer Institute
fared worse than did non-participating students, as a significantly smaller percentage
of students who participated in the Summer Institute passed intermediate algebra as
compared to those students who did not participate.
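Fisher’s exact test for a 2 × 2 table such as Table II can be sketched in a few lines of Python using the hypergeometric distribution. This is a generic illustration, not the software used by the authors:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table."""
    n, row1, col1 = a + b + c + d, a + b, a + c

    def p_table(x):  # probability that cell (1,1) equals x
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = p_table(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))
```

For example, feeding in the nearly balanced matched counts reported later in Table V (10, 8, 9, 9) returns a p-value well above 0.05, consistent with the non-significant difference reported for the matched samples.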
However, the question arises as to whether the larger percentage of non-
participants who passed intermediate algebra occurred because the program was
ineffective, or because those students who participated in the Summer Institute
(the treatment group) were somehow systematically different from those students who
mathematics scores.
Propensity scores can be used to establish equivalent treatment and control groups
by creating a model that predicts treatment status using relevant pre-treatment
characteristics (Graham, 2010; Graham and Kurlaender, 2011; Guo and Fraser, 2010;
Lunceford and Davidian, 2004). The propensity score model used in this study is
a logistic regression analysis that predicts participation in the treatment program
based on gender, SAT mathematics score, SAT verbal score, and SAT writing score
as is described in Equation (1). This model estimates the conditional probability of
participating in the Summer Institute (treatment program) based on the aforementioned
collection of relevant pre-treatment variables. The results of this analysis are presented
in Table III:
$$\mathrm{prob}(\text{treatment} \mid \text{pre-treatment variables}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 \mathrm{FEMALE} + \beta_2 \mathrm{SATMATH} + \beta_3 \mathrm{SATVERBAL} + \beta_4 \mathrm{SATWRITING})}} \quad (1)$$
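As an illustrative sketch (not code from the study), Equation (1) can be evaluated directly using the fitted coefficients reported in Table III; the function and variable names below are my own:

```python
import math

# Fitted coefficients from the selection model in Table III (illustrative use only)
B0, B_FEMALE, B_SATMATH, B_SATVERBAL, B_SATWRITING = 5.631, 0.247, -0.043, 0.019, 0.003

def propensity_score(female, sat_math, sat_verbal, sat_writing):
    """Estimated conditional probability of Summer Institute participation,
    i.e. the propensity score from the logistic model in Equation (1)."""
    xb = (B0 + B_FEMALE * female + B_SATMATH * sat_math
          + B_SATVERBAL * sat_verbal + B_SATWRITING * sat_writing)
    return 1.0 / (1.0 + math.exp(-xb))
```

Consistent with the negative SAT mathematics coefficient, a student with a higher mathematics score receives a lower estimated probability of participating in the Summer Institute.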
There are many different ways to match equivalent observations in the treatment and
control groups with propensity scores using techniques such as Mahalanobis metric
matching, k-nearest neighbor matching, and caliper matching, to name a few (Guo and
Fraser, 2010). We decided on one-to-one nearest neighbor matching, as this technique
matches each individual participant in the treatment group to the individual member of
the control group with the closest propensity score. We initially decided on sampling
with replacement because there is reason to believe that the propensity score
distributions were likely to be different between Summer Institute participants and
non-participants, probably because far fewer students participated in the Summer
Institute than did not (Smith and Todd, 2005; Caliendo and Kopeinig, 2005).

Table III. Parameter estimates, standard errors, test statistics, and p-values for the
logistic regression model of selection that predicts Summer Institute participation
based on gender, SAT mathematics score, SAT verbal score, and SAT writing score

Variable        Estimated parameter      SE     Test statistic    p-Value
Female                 0.247           0.579          0.43          0.669
SAT math              −0.043           0.008         −5.29          0.000
SAT verbal             0.019           0.006          2.98          0.003
SAT writing            0.003           0.006          0.46          0.645
Constant               5.631           2.421          2.33          0.020
Note: n = 502
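A minimal sketch of one-to-one nearest neighbor matching with replacement, assuming propensity scores have already been estimated; the identifiers and data structures are hypothetical:

```python
def nearest_neighbor_match(treated, controls):
    """One-to-one nearest neighbor matching with replacement: pair each
    treated unit with the control unit whose estimated propensity score
    is closest. Both arguments map unit id -> propensity score."""
    return {t_id: min(controls, key=lambda c_id: abs(controls[c_id] - t_score))
            for t_id, t_score in treated.items()}
```

Because matching is done with replacement, the same control unit may serve as the match for more than one treated unit.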
Table IV shows the summary statistics for the participants and non-participants
who are matched with replacement (i.e. those non-participants who have similar
estimated propensity scores). Notice that in the matched-with-replacement sample
there is the same percentage of females in the treatment and control groups, and the
mean SAT mathematics scores show much less of a difference.
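Balance comparisons like this one are often summarized with standardized mean differences. A generic sketch (not the authors’ code):

```python
from statistics import mean, stdev

def standardized_mean_diff(treated_vals, control_vals):
    """Standardized mean difference for one covariate: the difference in group
    means divided by the pooled standard deviation. Values near zero after
    matching indicate that the covariate is well balanced."""
    pooled_sd = ((stdev(treated_vals) ** 2 + stdev(control_vals) ** 2) / 2) ** 0.5
    return (mean(treated_vals) - mean(control_vals)) / pooled_sd
```

Computing this for each covariate before and after matching gives a quick check that the matched control group resembles the treatment group.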
Propensity scores can be used to create matched treatment and control groups and
Table V. Number of students who passed and failed intermediate algebra for matched
(with replacement) students

                    Pass
Summer Institute    No    Yes    Total
No                  10      8       18
Yes                  9      9       18
Total               19     17       36
Note: n = 36
No significant difference in the pass rates for the matched samples was found, as 44.4 percent
of the non-participants passed intermediate algebra on their first try as compared to 50.0
percent of Summer Institute participants (Fisher’s exact test, p > 0.05). This contradicts
the previous finding with the non-matched sample, which suggested that a lower percentage of
students in the treatment group passed intermediate algebra when compared to those
students in the control group.
We can also consider matching without replacement; these summary statistics
are presented in Table VI. Notice that in the sample that was matched without
replacement there are some similarities as well as some differences between the
treatment and control groups. For instance, 61 percent of the treatment group is female
as compared to 72 percent of the control group. The mean SAT mathematics score for
the treatment group is 426.11 as compared to 419.44 for the control group. The mean
SAT verbal score for the treatment group is 505.56 as compared to 490.56 for the
control group. The mean SAT writing score for the treatment group is 485.56 as
Limitations
Perhaps the biggest limitation of propensity scores is that the method only matches on
observed variables and not unobserved variables (Graham, 2010; Graham and
Kurlaender, 2011; Guo and Fraser, 2010; Lunceford and Davidian, 2004; Morgan and
Winship, 2007; Rosenbaum and Rubin, 1983; Rubin, 1997, 2006; Schneider et al., 2007).
In other words, the assignment to the treatment is assumed to be “strongly ignorable”
(Rosenbaum and Rubin, 1983). The true value of propensity scores relies on correctly
modeling the selection process (Weiss, 1998). If the selection model does not adequately
describe the selection process because there are unobserved factors related to both the
selection process and the outcome, then propensity score techniques do not offer much
more in terms of reducing bias than do standard inferential techniques. Thus, hidden
bias can exist if there are unobserved factors that impact both the treatment and the
outcome measure simultaneously (Rosenbaum, 2002).

Table VII. Number of students who passed and failed intermediate algebra for matched
(without replacement) students

                    Pass
Summer Institute    No    Yes    Total
No                   7     11       18
Yes                  9      9       18
Total               16     20       36
Note: n = 36
Although there are no direct tests that can be done to detect hidden bias,
a sensitivity analysis that relies on a bounding approach can be used to determine
how strong an unobserved variable needs to be to alter the inferences made with
a propensity score analysis (Becker and Caliendo, 2007; Rosenbaum, 2002). Rosenbaum
the treatment to differ by a factor of 2.5, then we may see a significant difference in the
pass rates between the treatment and control groups. In other words, if students are 2.5
times more likely to participate in the treatment group and these students also have
a higher probability of passing intermediate algebra (i.e. a positive treatment effect),
then the estimate of the treatment effect would be considered an overestimate of the
true but unknown treatment effect, suggesting that there could be hidden bias. As for
underestimating the treatment effect, notice that when Γ = 6.00 the p-value for the
treatment effect is now 0.0492. This suggests that if unobserved factors caused the odds
ratio of selection into the treatment to differ by a factor of 6, then we may see a significant
difference in the pass rates between the treatment and control groups. Thus, if students
are six times more likely to participate in the treatment group and these students also
have a higher probability of not passing intermediate algebra (i.e. a negative treatment
effect), then the estimate of the treatment effect would be considered an underestimate of
the true treatment effect, again suggesting that there could be hidden bias. However, for
this study, we are not necessarily concerned with overestimating the treatment effect, since
this is an example where the treatment effect at Γ = 1 is non-significant and the
underestimation bounds provide the magnitude of hidden bias that would be needed for
the treatment effect to become significant (Becker and Caliendo, 2007).
So essentially, the closer Γ is to 1 with a changing significance level, the more
this suggests that the treatment effect may be overestimated or underestimated due
to unobserved factors, and thus the estimate of the treatment effect is not robust to
hidden bias.
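For matched pairs with a binary outcome, this bounding approach can be sketched as follows: under hidden bias of magnitude Γ, the number of discordant pairs favoring the treated unit is bounded by a Binomial(n, Γ/(1+Γ)) distribution. The code below is a generic sketch of that idea, not the authors’ implementation:

```python
from math import comb

def rosenbaum_upper_pvalue(n_discordant, n_treated_favored, gamma):
    """Upper bound on the one-sided sign-test p-value for matched pairs
    under hidden bias of magnitude gamma (Rosenbaum bounds). At gamma = 1
    this reduces to the usual sign test with success probability 1/2."""
    p_plus = gamma / (1.0 + gamma)  # worst-case chance a discordant pair favors treatment
    return sum(comb(n_discordant, x) * p_plus ** x
               * (1 - p_plus) ** (n_discordant - x)
               for x in range(n_treated_favored, n_discordant + 1))
```

Increasing Γ widens the bound; the value of Γ at which the bounded p-value crosses the significance threshold is the magnitude of hidden bias discussed above.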
Although there are no formal methods to decide the appropriate critical value for Γ,
various rules of thumb suggest that values of Γ which are less than 2.0 that also alter the
conclusion regarding the significance of the estimate of the treatment effect should be of
concern. This is because, if Γ is less than 2.0, an unobserved factor that makes one student
less than twice as likely as another to participate in the treatment program would be
enough to alter the conclusions. Since the values
for Γ that we found are greater than 2, this leads us to believe that the estimate of the
treatment effect is robust to hidden bias. This helps strengthen the inference of no
significant difference in the pass rates between those students who participate in the
Summer Institute versus matched students who did not participate in the program.
Even though the main goal of this study is to illustrate how propensity scores can
be used to evaluate the effectiveness of a treatment program, another limitation of this
study is the small sample size of the treatment group and subsequent matched
comparison group. One of the concerns with our study is that even though more than
40 students initially participated in the Summer Institute, and more than 90 percent of
these students passed the elementary algebra exemption examination, only 18 of those
successful students went on to take the intermediate algebra course in the subsequent
fall semester. Most of the other students who passed the Summer Institute did not
take intermediate algebra during the course of time when the data were collected.
Furthermore, some of the students who passed the exemption examination earned
a high enough score on a computerized placement examination to be able to take
college-level mathematics courses above the level of intermediate algebra. Clearly,
a larger sample size would have allowed for a more powerful study and thus have made
for a more robust evaluation (Weiss, 1998).
Discussion
As we have shown, propensity score matching has many advantages over more
traditional methods that consider unmatched samples (Zanutto, 2006). Even though
propensity score matching does rely on the correct functional form of the model between
the participation in the treatment program and the relevant variables, sensitivity
analyses can be done to see if the estimate of the treatment effect is robust to hidden bias.
As Lunceford and Davidian (2004) suggest, including additional
variables in the propensity score model can increase the precision of the estimate of the
treatment effect. We also found that changes in unobserved characteristics did not
appear to alter the estimate of the treatment effect, thus giving more confidence in the
propensity score analysis.
Perhaps the greatest advantage of using propensity score matching is that it is easy
to describe findings to an audience with little or no statistical background who can
appreciate the importance of comparing groups that are equivalent. Comparing
two groups that are similar gives more confidence in the estimates of the true
treatment effect.
Given that the Summer Institute was a pilot program, further analysis of subsequent
cohorts will surely yield a more extensive analysis. However, with our analysis we
found that students who participated in the pilot summer bridge program fared no
better or worse when compared to students who were matched based on gender,
mathematics, verbal, and writing skills. Furthermore, this finding appears to be robust
to hidden bias. In some of the developmental education literature there is a
misunderstanding that students who participate in developmental mathematics
programs should perform better in mathematics than non-developmental students
(Goudas and Boylan, 2012). As Goudas and Boylan (2012) state, the purpose of
developmental education is, “[…] that remedial students should perform equally to
non-remedial students and only in gatekeeper courses […]” (p. 4). Given that the
References
Aakvik, A. (2001), “Bounding a matching estimator: the case of a Norwegian training program”,
Corresponding author
Dr Sally A. Lesik can be contacted at: lesiks@ccsu.edu