ABSTRACT. Students in the USA have fallen near the bottom in international
competitions and tests in mathematics and science. It is thought that extrinsic factors
such as family, community, and schools might be more influential on science interest
than intrinsic attitudes. However, there are relatively few valid and reliable measures of
intrinsic factors such as interest in science. With this lack of intrinsic measures, it
is difficult to determine the impact of extrinsic factors on the intrinsic construct. A fuller
picture of the factors affecting intrinsic constructs such as science interest will allow
interventions to become more refined and targeted. Several studies suggest that student
interest in science affects the likelihood of the student pursuing advanced courses in
science. The goal of this paper is to establish the validity and reliability of the Science
Interest Survey and to determine whether the survey meets the formal requirements of
measurement as defined by the Rasch model. Results using both item response theory
(IRT) and classical test theory (CTT) analyses suggest that the Science Interest Survey is
an adequate measure of the unidimensional construct known as science interest. Results
further suggest that the Science Interest Survey is a valid and reliable measure for
assessing science interest levels.
INTRODUCTION
Many studies have looked at one or two specific factors and tried to find a
relationship between one factor (e.g. family, peers, and school resources) and
student attitudes and/or achievement toward science (Ainley et al., 1999).
Yet, no study to date has examined the factors side by side to understand the
role that each factor plays. This is due, in part, to the lack of a reliable
instrument to measure these relationships as they relate to interest (Churchill,
1979). Taking into consideration intrinsic and extrinsic factors such as
school, family, and peers, this instrument was designed to measure the
factors that contribute to students' further pursuit of science and the factors
that contribute to student interest. Ultimately, the Science Interest Survey
(SIS) measures the extrinsic factors, across three domains of interest, that
exert the most influence on students' pursuit of science. Understanding the role that extrinsic factors
play has the potential to inform and assist in the reform of science education,
increase STEM workforce development, and make the USA once again a
leader in science and technology.
THEORETICAL FRAMEWORK
learning. The model is divided into three major contexts within which such
learning occurs: personal (motivation and expectations, interest, prior
knowledge and experience, and choice and control), sociocultural (within-group
sociocultural mediation, facilitated mediation by others, and culture),
and physical (advance preparation, setting, design, and subsequent reinforc-
ing events and experiences).
Summary
Attitudes are intrinsic in nature, but research by Catsambis (1995) has
shown that extrinsic factors such as families, communities, and the school
environment may be contributing to low levels of science
achievement.1 In the Handbook of Research on Science Teaching and Learning,
Simpson, Koballa, Oliver & Crawley (1994, p. 211) state, “the key to
successes in education often depends on how a student feels toward
home, self and school.” The role of family and community has long been
shown to be an influential factor in the success of all students, particularly
minority students. Children who are encouraged by their parents to take
advanced science and math courses and are advised of the importance
of science and math education perform better on science and math tests.
If this is the case, then it is alarming to hear that as students get older,
parents become less involved in their child’s education (Johnston &
Viadero, 2000).
METHODS
Rasch Modeling
Rasch measurement provides a theoretical model for constructing an
equal-interval measure such as the Science Interest Survey. Rasch-constructed
measures are used in the educational, medical, and psychological fields of
study, primarily for the evaluation of validity and the development of
instruments. The model is probabilistic in nature and is based upon logits
(Rasch, 1960). This probabilistic model allows for an adequate measure of
those items that are less likely to be endorsed. Individuals with a
greater endorsement level are more likely to show an increase in science
interest. Consequently, when a high-measuring subject does not endorse
items that are ranked lower in the partial credit model, those endorsements
are considered unexpected.
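The probabilistic logic above can be sketched concretely. Below is a minimal illustration of the dichotomous Rasch model in Python; note that the SIS itself uses a polytomous partial credit formulation, and the person measures and item difficulties here are hypothetical values for illustration, not SIS calibrations:

```python
import math

def rasch_probability(theta, delta):
    """Probability that a person with measure theta (in logits)
    endorses a dichotomous item with difficulty delta (in logits)."""
    return math.exp(theta - delta) / (1 + math.exp(theta - delta))

# When the person measure equals the item difficulty, endorsement is 50/50
p_equal = rasch_probability(0.0, 0.0)

# Higher-measure persons are more likely to endorse the same item;
# failing to endorse an "easy" item is therefore an unexpected response
p_high = rasch_probability(2.0, 0.0)
p_low = rasch_probability(-2.0, 0.0)
print(p_equal, p_high, p_low)  # 0.5, ~0.88, ~0.12
```

The logit metric makes the difference theta minus delta the only quantity that matters, which is what gives Rasch measures their equal-interval character.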
CONSTRUCTING AND VALIDATING THE SCIENCE INTEREST SURVEY 649
The use of the Rasch model provides for the construction of a linear
measure from ordinal observations and provides for the observation and
quantification of the response categories within the survey (Linacre,
1999). The construction of the linear measure from the ordinal data is
accomplished through the transformation of raw scores to a common
metric of logits (Linacre, 2002). The ordering of the item and response
measure creates an additive relationship allowing for the development of
probabilistic models (Betemps, Smith & Baker, 2003). The probabilistic
models allow for statistical comparisons of the expected responses to the
actual responses within the model. From this comparison of responses, it
is possible to provide an indication of fit to the model. Instruments
developed using Rasch analysis contain items that remain fixed allowing
for the calibration across differing samples. The objective of this study is
to determine SIS item fit against the Rasch model. Comparison of model
fit provides a linear-equal-interval measure for science interest. This study
will examine the quality of the rating scale, assess item quality in
defining science interest dimensions, describe how well the items
represent the interest range, and evaluate item function with regard to
the subjects.
The use of classical test theory (CTT) and Rasch measurement (RM) as
a mixed evaluation approach yields convergent outcomes regarding the
Science Interest Survey. CTT has two conceptual limitations that are
addressed when using RM: the first is the ordering of items along a
continuum within a unidimensional construct, and the second is the
ability to create an additive scale, owing to the fixed nature of the RM
characteristics (Prieto, Alonso & Lamarca, 2003).
RM also permits alternative scaling investigations together with a review
of the underlying structure of the measure. Using CTT and RM in
comparison as a confirmatory analysis method strengthens the outcomes
of the analysis and provides a more robust picture of the mechanics of
the measure. Thus, this paper presents the parallel reduction described
by Rust & Golombok (2009).
Instrument
The original Science Interest Survey contains 21 items with five response
categories describing respondents' levels of interest in science. Table 1
shows each item and its associated subscale. Reverse-scored items are
marked as Reversed. Response categories run from 1 through 5 on an
ordinal, Likert-like scale: 1—strongly disagree, 2—disagree, 3—do not
know, 4—agree, and 5—strongly agree.
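Before analysis, reverse-designated items on a 1–5 scale are typically rescored so that all items point in the same direction. A minimal sketch, assuming the reverse-designated item numbers shown later in Table 7 (items 2, 4, 7, 12, and 19):

```python
REVERSED_ITEMS = {2, 4, 7, 12, 19}  # reverse-designated items (per Table 7)

def rescore(item, response):
    """Flip a 1-5 Likert response for reverse-designated items;
    responses on all other items pass through unchanged."""
    if response is None:                 # missing response stays missing
        return None
    return 6 - response if item in REVERSED_ITEMS else response

print(rescore(4, 5))  # reversed item: 5 (strongly agree) -> 1
print(rescore(1, 5))  # non-reversed item: unchanged -> 5
```

The `6 - response` flip preserves the ordinal structure of the scale while aligning the direction of endorsement across all items.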
650 LAMB ET AL.
TABLE 1
Item and subscale assignments
Participants
Test data were obtained from 528 students in classes from randomly
selected teachers in grades 5 through 12. Schools selected for the
study are located in several states within the continental USA.
Student data were obtained primarily from Southeastern and
Midwestern states. Twenty-eight students (5.30%) were from the
elementary level, 150 students (28.41%) were from the middle school
level, and 350 students (66.29%) were from the high school level.
Ages ranged from 9 to 18: 2.46% were between the ages of 9 and
11, 44.59% were aged 12 to 14, and the remaining 51.04% were 15 to
18 years old. Table 2 shows the racial distribution of the study
participants.
The research design is a non-randomized, intact-group, posttest-only
design. The most serious threats to internal validity within this design
are potential selection bias within the grouping and the lack of
quantification of changes in interest level. This effect was mitigated
through the selection of a sufficiently large pool of intact classes
(n = 18, β = 0.78). Table 3 shows the experimental design.
Statistical Analysis
Analysis of the measure was accomplished using WINSTEPS (Linacre,
2002) and JMP Statistical Discovery Software. Stability of the measure to
the 99% (±1/2 logit) confidence interval is provided at the 150-subject
threshold (Linacre, 1994). Item misfit indicates a lack of relationship
between an item and the other items in the scale. The lack of fit in this
context can be interpreted as statistical interference and a resulting lesser
TABLE 2
Racial distribution of study participants
TABLE 3
Experimental design
RESULTS
Descriptive Statistics
The data from this sample suggest that the Rasch model describes the
internal structure of the measure and associated items. Review of Table 4
indicates that all items were answered, with relatively few missing
responses. The largest number of missing responses was for item 14
(14 responses, or 2.65% of the 528 respondents).
TABLE 4
Response frequency and percent by item
Item  No response (0)  Strongly disagree (1)  Disagree (2)  Do not know (3)  Agree (4)  Strongly agree (5)
1 1 19 71 102 240 95
2 0 34 328 0 138 27
3 3 8 21 27 234 235
4 3 138 229 72 63 23
5 2 12 34 95 265 120
6 3 53 133 128 156 55
7 0 16 49 172 166 119
8 4 10 31 53 287 143
9 8 40 161 214 73 32
10 1 45 116 214 103 49
11 4 39 90 110 225 60
12 0 53 116 221 109 29
13 9 26 64 167 204 58
14 14 100 185 104 88 37
15 3 45 75 60 245 100
16 8 55 222 102 92 49
17 9 33 76 121 226 63
18 4 90 125 188 95 26
19 0 42 113 121 194 50
20 6 14 10 90 199 209
21 4 42 82 88 225 87
Total endorsements  86 914 2,331 2,449 3,627 1,666
The least endorsed category is the category strongly disagree (8.25%); the
most frequently endorsed category is agree (32.76%).
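The category percentages quoted above follow directly from the column totals in Table 4 and can be verified mechanically:

```python
# Column totals from the bottom row of Table 4
totals = {
    "no response": 86, "strongly disagree": 914, "disagree": 2331,
    "do not know": 2449, "agree": 3627, "strongly agree": 1666,
}

grand_total = sum(totals.values())  # 11,073 recorded responses
percent = {cat: round(100 * n / grand_total, 2) for cat, n in totals.items()}

print(percent["strongly disagree"])  # -> 8.25
print(percent["agree"])              # -> 32.76
```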
Table 5 shows the mean, standard deviation, maximum, and minimum
for each of the subscales in the measure. The subscale with the largest
mean is the T subscale (3.947), and the subscale with the lowest mean is
the I subscale (2.948).
Instrument Reliability
Internal Reliability Statistics. Review of internal reliability for each
subscale indicates an adequate level of internal reliability. While specific
subscales may show a slightly low internal reliability, the overall internal
reliability of the measure is adequate when reviewed (α = 0.72).
Removal of the third response category “do not know” slightly
increased the person separation index from 8.73 to 8.75. The separation
coefficient is analogous to the Fisher discrimination ratio (Wright, 1996).
A separation ratio of 8.75 indicates that eight distinct levels of ability, or
strata, can be discerned by the measure within the test sample. This increase
in item separation results from the increased discrimination between the
item choices. Converting this separation to a KR-20 or alpha coefficient
places the value at roughly 0.97, which
is considerably higher than the Cronbach’s alpha calculations shown in
Table 6. This discrepancy in reliability can be attributed to the non-linear
nature of the transformation of the data associated with the Rasch model
(Fisher, 1992).
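The conversion from a separation coefficient to an alpha-type reliability follows the standard Rasch relation R = G² / (1 + G²) (Wright, 1996); a short sketch, noting that the precise rounding reported in the text may differ:

```python
def separation_to_reliability(G):
    """Convert a Rasch separation coefficient G into a reliability
    coefficient via the standard relation R = G^2 / (1 + G^2)."""
    return G ** 2 / (1 + G ** 2)

# The 8.75 separation reported above maps into the very high
# alpha range discussed in the text
print(round(separation_to_reliability(8.75), 3))  # -> 0.987
print(round(separation_to_reliability(2.0), 3))   # -> 0.8
```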
Measure Construct
Factor Analysis. The rotated factor matrix shown in Table 8 indicates that
five factors account for 89.9% of the total observed variance. Analysis of factor
TABLE 5
Subscale means and standard deviations
TABLE 6
Cronbach’s alpha for each subscale
T subscale 0.70
S subscale 0.60
P subscale 0.60
I subscale 0.50
F subscale 0.70
By item coefficient alpha full survey 0.72
exist, and the Rasch analysis confirms that the five factors tie together to
identify one commonality, theta, which is the construct called science
interest.
A principal component analysis using a promax-rotated solution of
the residuals is used to examine the dimensionality of the Science
Interest Survey. The residual loading factor is the remainder after the
underlying trait has been removed. Only items with a loading factor
greater than ±0.30 are recommended for use (Lamoureux, Pallant,
Pesudovs, Hassell & Keefe, 2006). Using the ±0.30 criterion
threshold, the removal of item 14 and item 18 is indicated. The first
factor explained 40.6 units of variance. Results suggest that there is a
unidimensional structure associated with the measure. Table 7 shows
the factor loading for each of the items.
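The ±0.30 criterion can be applied mechanically to the first-factor residual loadings reported in Table 7; a quick sketch reproducing the flagged items:

```python
# First-factor residual loadings copied from Table 7
loadings = {
    16: 0.76, 9: 0.66, 4: 0.63, 7: 0.55, 2: 0.55, 12: 0.55, 19: 0.55,
    14: 0.09, 21: -0.67, 15: -0.56, 11: -0.54, 5: -0.54, 13: -0.48,
    6: -0.48, 10: -0.43, 8: -0.40, 17: -0.38, 3: -0.37, 20: -0.33,
    1: -0.32, 18: -0.07,
}

# Items whose absolute loading fails to exceed 0.30 are flagged for removal
flagged = sorted(item for item, load in loadings.items() if abs(load) <= 0.30)
print(flagged)  # -> [14, 18]
```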
Figure 1 shows the fraction of the total variance in the data that is
accounted by each of the principal components. The plot shows the
TABLE 7
Factor 1 from principal component analysis of standardized residuals
16 0.76
9 0.66
4 (reverse) 0.63
7 (reverse) 0.55
2 (reverse) 0.55
12 (reverse) 0.55
19 (reverse) 0.55
14 0.09
21 −0.67
15 −0.56
11 −0.54
5 −0.54
13 −0.48
6 −0.48
10 −0.43
8 −0.40
17 −0.38
3 −0.37
20 −0.33
1 −0.32
18 −0.07
Figure 1. Scree plot showing the resulting eigenvalues for the Science Interest Survey items
TABLE 8
Rotated promax factor analysis loading
item 7, and item 9. Language used in survey items, for example item 9, is
somewhat subjective. However, item responses are psychometrically sound
showing proper infit and outfit statistics (infit 1.10, outfit 1.11 for item 9,
which is considered productive for the measure per Linacre, 1997). Mean
infit and outfit statistics for the total measure are 1.00 and 1.01, respectively,
and show proper functioning of the measure. Chi-square results for model fit
suggest that the SIS measure conforms to the Rasch model as there is no
significant difference between the observed and expected item response
outcomes (p = 0.063).
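Infit and outfit are mean-square summaries of the residuals between observed and Rasch-expected responses; values near 1.0, as reported above, indicate productive fit. A simplified dichotomous sketch follows; the responses and measures here are hypothetical, and WINSTEPS computes the polytomous equivalents for the SIS data:

```python
import math

def rasch_p(theta, delta):
    """Expected endorsement probability under the dichotomous Rasch model."""
    return 1 / (1 + math.exp(-(theta - delta)))

def fit_statistics(responses):
    """responses: (observed 0/1, person measure, item difficulty) triples.
    Outfit is the unweighted mean of squared standardized residuals;
    infit weights each squared residual by its model variance."""
    z2, variances = [], []
    for x, theta, delta in responses:
        p = rasch_p(theta, delta)
        v = p * (1 - p)              # model variance of the response
        z2.append((x - p) ** 2 / v)  # squared standardized residual
        variances.append(v)
    outfit = sum(z2) / len(z2)
    infit = sum(z * v for z, v in zip(z2, variances)) / sum(variances)
    return outfit, infit

data = [(1, 0.5, 0.0), (0, -0.5, 0.0), (1, 1.0, 0.5), (0, 0.0, 1.0)]
print(fit_statistics(data))
```

Because infit down-weights responses far from a person's measure, it is less sensitive than outfit to isolated unexpected responses on off-target items.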
Figure 2 also illustrates the item order and calibration of the measure of
the science interest variable. The scale uses a range 0–100 and is a
Construct Validity
A construct is defined as a postulated attribute of a person that is
assumed to be reflected in measure performance (Cronbach & Meehl,
1955; Embretson, 1983). This measured construct is defined as a
unidimensional trait denoted by the term Θ (theta). Construct validity is
defined as the degree to which a scale measures the theoretical
psychological construct (Θ) it is proposed to measure. When a
test measures a trait that is difficult to define, such as an affective test
measuring the construct of science interest, multiple expert reviewers may
rate each item's relevance to the construct. Table 9 shows the
independent relevance rating of each reviewer for the items contained on
the Science Interest Survey.
Analysis of reviewer agreement shows that 56.25% of items have
strong relevance to science interest as rated by expert
reviewers. This percentage corresponds to a construct validity
TABLE 9
Relevance rating for each item on the Science Interest Survey
DISCUSSION
The primary purpose of this study was to design and validate a new
measure of science interest. Secondarily, the study examines the
underlying factors that make up the construct of science interest. It was
hypothesized that data obtained using the Science Interest Survey
adequately fit the Rasch model. A secondary hypothesis is that the factors
of family influence, peer influence, teacher influence, informal science
experience, and classroom science experience can be assessed and used
to measure science interest. Confirmation of these hypotheses would
result in a psychometrically sound measure of science interest based on
extrinsic factors.
Solutions for the promax rotated factors analysis show five obliquely
rotated factors (research question 1). This solution reveals a factor
structure which is considered simple with five linearly dependent factors
(Thurstone, 1947). The five suggested factors from the classical test
theory analysis are family support, teacher support, peer support,
informal science experience, and classroom science experience. Each of
these factors results from extrinsically measured items loading on the
latent trait of science interest. The goal of the Rasch analysis is to
establish the validity and reliability of the Science Interest Survey
each of the items indicates that the items show the same residual
correlation on differing subscales; this also supports the premise of
unidimensionality. Concerns about the resulting similarity of residual
scores are not salient because the scores fall below the root ≥1 criterion.
Eigenvalues falling below the root ≥1 criterion are not considered
important because the variance each standardized variable contributes to
a principal component extraction equals 1. While the use of inverse items
may supply subconscious cues as to socially desirable responses, other
items within the subscale help to assure consistency of answers, and
Rasch analysis indicates that the items are functioning appropriately
within the measure. The person item map of the Rasch scaled
Science Interest Survey shows good targeting of the scale, with no
floor or ceiling effect. The adequate targeting of the survey to subject
participation suggests the ability of the respondents to assess their
level of understanding items. The person item map also shows
several items representing the same level of difficulty along the
ability continuum perhaps suggesting that the items could be
removed. However, the Science Interest Survey is a short survey
and the maintenance of the additional items is unlikely to create
undue burden. The additional items may also allow for slightly better
targeting.
The content of the survey suggests the latent trait being sampled.
Externalized factors, associated with the actions others take to increase
subjects' science interest levels, dominate the content of the survey. This
leads to a global construct of environmental factors (situational interest)
that influence science interest. In addition, confirmation of outcomes
is established through 61.9% of students assessing the survey as correctly
reflecting their interest level. This level of agreement from the subjects
with the survey outcomes adds an additional layer of reliability and validity.
This study demonstrates that the application of the Rasch model supports
the 19-item, four-response scale, Science Interest Survey as a valid scale
for assessing science interest levels. A raw score to Rasch person measure
conversion allows researchers to use the Science Interest Survey without
resorting to Rasch analysis.
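Such a conversion is typically distributed as a calibration-derived lookup table. A minimal sketch of how a raw-score-to-measure table would be used follows; the anchor values below are hypothetical placeholders, not the published SIS conversion (with 19 items on a four-category scale, raw scores range from 19 to 76):

```python
# Hypothetical raw-score -> Rasch measure (logits) anchor points; a
# published conversion table from the calibration would replace these.
CONVERSION = {19: -3.2, 40: -0.8, 57: 0.0, 60: 0.4, 76: 3.1}

def raw_to_measure(raw):
    """Linearly interpolate a Rasch measure between tabled anchor points,
    clamping raw scores outside the table to the floor/ceiling measure."""
    keys = sorted(CONVERSION)
    if raw <= keys[0]:
        return CONVERSION[keys[0]]
    if raw >= keys[-1]:
        return CONVERSION[keys[-1]]
    for lo, hi in zip(keys, keys[1:]):
        if lo <= raw <= hi:
            frac = (raw - lo) / (hi - lo)
            return CONVERSION[lo] + frac * (CONVERSION[hi] - CONVERSION[lo])

print(raw_to_measure(57))  # tabled anchor point -> 0.0
print(raw_to_measure(19))  # floor of the scale -> -3.2
```

A practitioner could apply such a table directly to summed responses, which is the clinical-style usage described in note 2 below.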
CONCLUSION
NOTES
1. The authors of this study acknowledge that extrinsic factors such as ethnicity or
gender can play a role as covariates in response outcomes in an interest survey, and thus,
further investigations of these extrinsic factors as subsets of the general population are
warranted and are addressed in follow-up studies.
2. The authors agree that there is a high correlation between raw scores and Rasch
ability estimates; the usefulness of the SIS conversion equation lies in the ability of
researchers to convert raw scores to Rasch measures without running a Rasch analysis
each time the instrument is used. The use of conversion equations for field instruments is
common practice in the clinical arena, where practitioners do not necessarily have the
time or expertise to complete fit analyses each time the instrument is used.
REFERENCES
Ainley, M., Hidi, S. & Berndorff, D. (1999). Situational and individual interest in
cognitive and affective aspects of learning. Paper presented at the American Educa-
tional Research Association Meetings. Montreal, Quebec, Canada
Ainley, M., Hidi, S. & Berndorff, D. (2002). Interest, learning and the psychological
process that mediate their relationship. Journal of Educational Psychology, 94(3), 545–
561.
Ainley, M. (2006). Connection with learning: Motivation, affect and cognition of interest.
Educational Psychology Review, 18(4), 391–405.
Anderson, D., Lucas, K. B. & Ginns, I. S. (2003). Theoretical perspectives on
learning in an informal setting. Journal of Research in Science Teaching, 40, 177–
199.
Annetta, L., Minogue, M., Holmes, S. & Cheng, M. (2009). Investigating the impact of
video games on high school students' engagement and learning about genetics.
Computers & Education, 53(1), 74–85.
Atwater, M. M., Wiggins, J. & Gardner, C. M. (1995). A study of urban middle school
students with high and low attitudes toward science. Journal of Research in Science
Teaching, 32, 665–677.
Bandura, A. (1997). Self-efficacy: The exercises of control. San Francisco: W.H. Freeman
and Company.
CONSTRUCTING AND VALIDATING THE SCIENCE INTEREST SURVEY 665
Barton, A., Tan, E. & Rivet, A. (2010). Creating spaces for engaging school science
among urban middle school girls. American Educational Research Journal, 45(1), 68–
103.
Betemps, E., Smith, R. & Baker, D. (2003). Measurement precision of the clinician
administered PTSD scale: A Rasch model analysis. Journal of Applied Measurement, 4
(1), 59–69.
Blatchford, P., Baines, E., Rubie-Davis, C., Bassett, P. & Chowne, A. (2006). The effect
of a new approach to group work on pupil–pupil and teacher–pupil interactions. Journal
of Educational Psychology, 98(4), 750–765.
Boyd, D., Grossman, P., Lankford, H., Loeb, S. & Michelli, N. (2006). Complex by
design. Journal of Teacher Education, 57(2), 155–166.
Business Roundtable (2005). Tapping America’s potential: The education for innovation
initiative. Retrieved December 15, 2006, from http://www.itic.org/archives/TAP%
20Statement.pdf
Bulunuz, M. & Jarret, O. (2010). Developing an interest in science: Background
experience of preservice elementary teachers. International Journal of Environmental &
Science Education, 5(1), 65–84.
Catsambis, S. (1995). Gender, race, ethnicity, and science education in the middle grades.
Journal of Research in Science Teaching, 32, 243–257.
Churchill, G. (1979). A paradigm for developing better measures of marketing constructs.
Journal of Marketing Research, 16(1), 64–73.
Christidou, V. (2011). Interest, attitudes and images related to science: Combining
students’ voices with voices of school, science teachers and popular science. Interna-
tional Journal of Environmental and Science Education, 6(2), 141–159.
Cronbach, L. & Meehl, P. (1955). Construct validity in psychological tests. Psychological
Bulletin, 52(4), 281–302.
Cooney, S. (2001). Closing gaps in middle grades. ERIC Document Reproduction Service
No. ED 479 781.
Edgar, K. & Fox, N. (2006). Temperamental contributions to children’s performance in an
emotion-word processing task: A behavioral and electrophysiological study. Brain and
Cognition, 65(1), 22–35.
Embretson, S. (1983). Construct validity: Construct representation versus nomothetic
span. Psychological Bulletin, 93(1), 179–197.
Engineering Workforce Commission (2008) Engineering and technology degrees, 2007
(Washington, DC). www.nsf.gov/statistics/wmpd/pdf/tabc-8.pdf
Falk, J. H. & Dierking, L. D. (2000). Learning from Museums. Walnut Creek, CA:
AltaMira Press. 272 pp.
Fabrigar, L., MacCallum, R., Wegener, D. & Strahan, E. (1999). Evaluating the use of
exploratory factor analysis in psychological research. Psychological Methods, 4(3),
272–299.
Fisher, J. (1992). Changing AIDS-risk behavior. Psychological Bulletin, 111(3), 455–474.
Fredricks, J., Alfeld, C. & Eccles, J. (2010). Developing and fostering passion in academic
and nonacademic domains. Gifted Child Quarterly, 54(1), 18–30.
George, R. & Kaplan, D. (1997). A structural model of parent and teacher influences on
science attitudes of eighth graders: Evidence from NELS: 88. Science Education, 82,
93–109.
Gonzales, P., Williams, T., Jocelyn, L., Roey, S., Kastberg, D. & Brenwald, S. (2008).
Highlights from TIMSS 2007: Mathematics and science achievement of U.S. fourth- and
Tai, R. H., Liu, C. Q., Maltese, A. V. & Fan, X. (2006). Planning early for careers in
science. Science, 312(5777), 1143–1144.
Talton, E. L. & Simpson, R. D. (1985). Relationships between peer and individual
attitudes toward science among adolescent students. Science Education, 69, 19–24.
Talton, E. L. & Simpson, R. D. (1986). Relationships of attitudes toward self, family, and
school with attitude toward science among adolescents. Science Education, 70, 365–
374.
Thurstone, L. L. (1947). Multiple factor analysis. Chicago: University of Chicago Press.
Wright, B. D. (1996). Reliability and separation. Rasch Measurement Transactions, 9(4),
472.
Yager, R. E. & Penick, J. E. (1986). Perceptions of four age groups toward science
classes, teachers, and the value of science. Science Education, 70, 355–363.
Zacharia, Z. & Barton, A. (2004). Urban middle-school students’ attitudes toward a
defined science. Science Education, 88, 197–222.
Jeannette Meldrum
North Carolina State University
Raleigh, NC, USA