Computer-Based Reasoned Multiple-Choice Test Instrument to Identify Critical Thinking Skills on Impulse and Momentum
Racy Religia1, a), Annisaa’ Mardiani1, b), Habibah K Baihaqi1, c) and Supahar2, d)

1Master Program of Physics Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Yogyakarta, Yogyakarta, Indonesia
2Department of Physics Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Yogyakarta, Yogyakarta, Indonesia

a)Corresponding author: racyreligia.2020@student.uny.ac.id
b)annisaamardiani1997@gmail.com
c)habibah0001pasca.2020@student.uny.ac.id
d)supahar@uny.ac.id

Abstract. This research aimed to determine the eligibility of a test instrument based on validation results and empirical evidence. The test instrument consists of 15 items developed from critical thinking indicators. In total, 44 students were involved as the empirical research subjects, while the instrument was validated by a teacher as the expert and two peers as practitioners. The data collected from the empirical test were analyzed using the Quest programme to obtain the reliability, difficulty level, validity, and fit of the items to the model. The results show that all items were valid according to the three raters, with some revisions. Based on the empirical test, the INFIT MNSQ was 0.98±0.27, so some items fit the model while others did not. Furthermore, the mean item difficulty was 0.00±0.53, indicating a good difficulty index. Meanwhile, the reliability of the 15-item test was poor, with an estimate of 0.00; this estimate reflects the reliability of the students' results rather than of the items themselves. In terms of item fit, 6 of the 15 items did not fit the model and are considered not valid, while the remaining 9 items are valid. Hence, the items need to be revised in terms of material, construction, and language. The results show that test items need to be tried out with students before being implemented in learning, and that a good test item satisfies the criteria of validity, reliability, and difficulty. Test items that match these criteria are ready to be used in learning, so they can stimulate and improve students' critical thinking skills, especially on impulse and momentum material. Teachers can thus recognize the importance of determining the criteria for a test instrument, since a good test instrument can identify specific indicators.

INTRODUCTION

In the 21st century, there are many challenges in the educational field that make it more complex and require students to prepare certain skills to face them. Education is not limited to providing students with knowledge and the simple thinking skills given in the status quo; it must also prepare them to develop the essential skills demanded by the 21st century. According to PISA (the Programme for International Student Assessment), the abilities that need to be considered are the skills of learning and innovation [1,2]. This research focuses on the critical thinking aspect of students, because it is an important factor supporting the improvement of students' thinking. Students need critical thinking skills to analyze and solve problems in real life [3]. Critical thinking is based on cognitive strategies together with the methods used to solve problems.
Critical thinking can be defined as the way a person uses data and evidence in decision-making related to what is believed and done with a clear purpose [4,5]. Even though developing students' critical thinking skills is a main purpose of science education, in some subjects such as physics it has not been properly addressed.
A researcher argued that critical thinking refers to one's ability to solve daily life problems, but found that educators lacked experience, as seen from their ability to conduct critical thinking skill tests on specific sub-chapters of a material [6]. A good learning process helps students' critical thinking skills to grow, and there are several factors that can affect students' physics learning outcomes.
In several previous studies, students assessed that the learning process in physics, especially on the concepts of momentum and impulse at school, was not good enough [7]. According to research by experts, momentum and impulse are part of abstract mechanics that are difficult for students to understand [8]. Students therefore need to think more critically in order to understand impulse and momentum and to solve related problems.
Some attempts need to be made to develop students' critical thinking skills in learning activities. In other research, students should be given chances to gain problem-solving skills in order to increase their critical thinking [9]. Problem solving and critical thinking skills are closely related. Assessment and learning are said to be meaningful only if they go beyond memorization, because it is in learning and assessment that students have the opportunity to develop critical thinking ability [10]. The assessment component of the learning process can also be used to improve and train students to think critically, and test items must be well constructed and appropriate in order to do so. This research focuses on developing question items that can be used to improve and identify students' critical thinking. Teachers are expected to be able to present items of good quality that can improve students' critical thinking.
One framework lists 5 critical thinking skill indicators, namely (1) identifying problems; (2) designing experiments; (3) collecting tools; (4) hypothesizing; and (5) specifying the tools to use [11]. Critical thinking can also be categorized into 5 indicators: (1) giving a simple explanation; (2) constructing basic skills; (3) inferring; (4) giving further explanation; and (5) controlling tactics and strategies [12]. Based on these, this research determined the critical thinking skills in 4 categories: (1) interpreting; (2) analyzing; (3) inferring; and (4) explaining.
Making maximal use of technology in learning depends on the role of the teacher. Several researchers suggest developing a learning environment that uses technology to create a dynamic setting for teachers and students, encouraging students to use technology actively so as to be more innovative, and building on teachers' understanding of TPACK [13]. TPACK basically covers the mastery of technology, pedagogy, and content, which are integrated in learning so that learning can be realized successfully. Teachers must be creative in designing more specific learning, teaching materials, and methods by developing them according to the TPACK framework [14]. Teaching activities are carried out with good planning by the teacher. TPACK has now been implemented in different settings, such as developing activity-based teaching and technology, or assessing teachers' knowledge and experience of technology integration [15-17]. This study focuses on assessment of knowledge integrated with technology in Google apps.
The use of technology in learning has a good impact, and maximizing it can make things easier for teachers and students. Easily accessible Google apps can be used in learning, and based on students' opinions, the use of Google Apps is effective and efficient [18]. Google Forms can make it easier for the teacher to assess and analyze student learning outcomes; the responses to the students' answers can be presented in a linked spreadsheet. Google Apps is easy to use for most users, especially in education, and is a useful, user-friendly tool for teachers with experience using Google Drive, Google Forms, and other applications [19]. Geographic location, platform dependencies, and compatibility issues are no longer limiting factors; user data show that most Google applications can be used in several browsers and on several platforms wherever the user is located, provided that internet access is stable [20]. Almost all teachers and students have Gmail accounts that can be connected to Google apps, so the researchers used Google Forms as a medium to assess student learning outcomes.
Learning that aims to grow critical thinking skills usually uses multiple-choice tests. Multiple choice is commonly used to determine students' outcomes, but ordinary multiple-choice tests are limited to measuring the low-level thinking skills of students and cannot be used to measure students' higher-order thinking skills in physics [21]. Multiple choice is usually used to measure students' remembering, understanding, and application skills. Hence, teachers should find better-qualified assessment tools to measure students' learning outcomes [22]. In this study, the test instrument uses a multiple-choice format with two parts in each item: the first part contains the main question, and the second part contains the reasoning behind the answer to the first part, so students must make two choices for each item number. The reasoned answers in the second part help the teacher analyze students' critical thinking skills in more detail. The test instrument can be administered in Google Forms, which can be connected to Google Drive and a spreadsheet. A good test instrument satisfies several criteria, such as validity, reliability, difficulty index, and fit of the items to the model.
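For context, the physics content the reasoned items address can be summarized by the impulse-momentum theorem; the following worked example is illustrative only and is not an item from the instrument:

J = \int F\,dt = \Delta p = m\,\Delta v .

For instance, a 0.5 kg ball struck from rest to 10 m/s during a 0.02 s contact receives an impulse J = (0.5 kg)(10 m/s) = 5 N s, so the average force is F = J/\Delta t = 5/0.02 = 250 N. The reasoning tier of such an item could ask students to justify why a longer contact time reduces the force for the same change in momentum.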
Based on what has been explained, from critical thinking as one of the 21st century skills to the use of technology as a platform for test items, this research develops test items in reasoned multiple-choice form to determine students' critical thinking skills, implemented on a computer-based platform.

METHODS
This research used a quantitative approach with research and development (R&D) methods. The purpose of the study is to produce a product and then to test it in order to determine the product's effectiveness. The following are the steps carried out by the researchers; first, the figure below shows an example of a test item as it appears in Google Forms.

FIGURE 1. Reasoned multiple-choice test item as presented in Google Forms

Instrument Validity Test

The instrument validity test was based on the judgment of validators consisting of a teacher as an expert and two peer reviewers, who assessed the critical thinking items. The validity of the test uses Aiken's V formula applied to the raters' scores, complemented later by the students' answers. Expert validation covers several categories: content, construction, and language. Content validity is needed so that the developed items are in line with the content being taught. The construction of the items is also considered in the analysis; aspects such as the clarity of the item stem are checked so that an item does not carry multiple meanings, and any pictures or diagrams presented must function properly and be understandable to students. The language of the items must be clear and must not use vocabulary that makes the items difficult for students to understand. After receiving the results from the three raters, several items needed to be improved or further developed.
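As an illustration of this step, Aiken's V for a single item can be computed directly from the three raters' 1-4 scores. The sketch below is a minimal example, assuming the standard Aiken's V formula V = sum(r_i - l) / [n(c - l)], where l is the lowest and c the highest possible rating; the function name and example scores are hypothetical, not taken from the study.

def aikens_v(ratings, lowest=1, highest=4):
    """Aiken's V for one item from a list of rater scores on a 1-4 scale."""
    n = len(ratings)
    s = sum(r - lowest for r in ratings)       # s_i = r_i - l
    return s / (n * (highest - lowest))        # V = sum(s_i) / (n * (c - l))

# Hypothetical example: three raters score an item 4, 3, and 3
v = aikens_v([4, 3, 3])
print(round(v, 2))   # 0.78 -> medium validity under the 0.4-0.8 cut-off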
Empirical Test

The empirical test was conducted after the instrument had been validated and revised. Purposive sampling was used to choose subjects who met the criterion of having already received critical thinking-oriented learning. In detail, the sample consisted of 44 students from the science concentration classes X SCIENCE E and X SCIENCE F in the 2020-2021 academic year. The test results were analyzed with the Quest application using a polytomous category model.

Data Interpretation
The validation results from the raters were analyzed using the Aiken's V formula and transformed into 3 categories: an item is categorized as having high validity if it scores between 0.8 and 1.0, medium validity if it scores between 0.4 and 0.8, and low validity if it scores under 0.4.
The test instrument consists of 15 items developed from critical thinking indicators. The empirical evidence was analyzed using the Quest programme with Item Response Theory. An item passes the goodness of fit, and can be said to fit the model, if its INFIT Mean of Square (INFIT MNSQ) lies between 0.77 and 1.30, or if its INFIT t value lies within ±2.0 at a probability of 0.5, judged against the mean INFIT MNSQ and its standard deviation. Using the Hambleton and Swaminathan criteria, an item is labeled difficult if its difficulty index is greater than +2.0, while an easy item has a difficulty index below -2.0 [23,24].

RESULTS AND DISCUSSION

The test instrument is administered in Google Forms, and the results of the test are presented in a spreadsheet connected to the same email account, as shown in fig. 2. The results then need to be analyzed with the Quest application to determine the validity, reliability, and other characteristics.

FIGURE 2. Students' test responses in the spreadsheet linked to Google Forms
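As an illustration of this workflow (not part of the study itself), the linked response spreadsheet could be read into an analysis script with the third-party gspread library; the spreadsheet title, credential file, and column names below are assumptions.

import gspread  # third-party Google Sheets client

# Placeholder service-account credentials and spreadsheet title
gc = gspread.service_account(filename="credentials.json")
sheet = gc.open("Impulse-Momentum Test (Responses)").sheet1

# Each row is one student's Google Form submission
responses = sheet.get_all_records()
for row in responses:
    # Column names depend on how the form was built; these are assumed
    print(row.get("Timestamp"), row.get("Name"))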

The critical thinking instrument was constructed based on the critical thinking indicators. The distribution of the critical thinking indicators across the items is as follows.
TABLE 1. Distribution of Critical Thinking Indicators Towards the Total Items
Aspect Total Items Item Number
Interpreting 3 1, 2, 14
Analyzing 4 3, 5, 8, 13
Inferring 5 6, 10, 12, 4, 7
Explaining 3 15, 9, 11
The test instrument in reasoned multiple-choice format is intended to measure critical thinking skills and needs to be developed, validated, and characterized through its item characteristics. This stage is called design because it maps and distributes the questions over the sub-materials, forming the instrument blueprint [25].

Instrument Validation

Validation sheets were filled in by a teacher as an expert and 2 practitioners or peer reviewers, with one sheet for each item. The validators assigned each item a score of 1, 2, 3, or 4. The results of the item validation are shown in table 2.

TABLE 2. Item Validation Results

Item Number V Validity Level

1 0.9 High
2 0.9 High
3 0.8 Medium
4 0.9 High
5 0.8 Medium
6 0.9 High
7 0.8 Medium
8 0.8 Medium
9 0.8 Medium
10 0.9 High
11 0.8 Medium
12 0.9 High
13 0.9 High
14 0.9 High
15 0.8 Medium

It can be seen that the items that received high validity were items 1, 2, 4, 6, 10, 12, 13, and 14, while items 3, 5, 7, 8, 9, 11, and 15 were categorized at the medium validity level. Some revisions were suggested by the raters to prepare the items prior to the empirical test. The mean V score across the items was 0.8, so it can be concluded that the items were valid with a high validity score. Based on the validators' assessment, some questions needed minor or major revisions, but the test instrument can be categorized as good and can be used [25,26].

Empirical Test Results Analysis


Empirical testing was carried out after the instrument had been revised following validation by the expert and practitioners. The 44 students of class X were the subjects of this empirical test, and their answers were analyzed with the Quest application using the item response theory approach. There were 9 of the 15 instrument items that fit the Rasch model approach because they met the INFIT MNSQ criterion, with an overall INFIT MNSQ of 0.98 ± 0.27, as can be seen in table 3.

TABLE 3. QUEST analysis results


Parameter Item Estimation Case Estimation
INFIT MNSQ 0.98 ± 0.27 0.96 ± 0.32
OUTFIT MNSQ 1.12 ± 0.54 1.12 ± 0.75
Mean of item difficulties 0.00 ± 0.53
Reliability estimation 0.00

The INFIT MNSQ shows the compatibility of the item characteristics with the test item model. Based on the table, the overall INFIT MNSQ was 0.98±0.27, with the individual item values ranging between 0.71 and 1.25. Thus, against the acceptance range of 0.77-1.30, some items fit the model and others do not. Clarity about content validation is essential in instrument development studies, and here 9 items are called valid while the other 6 are not valid [27]. Table 3 also shows that the mean difficulty level of the instrument was 0.00±0.53, within the range of -2 to +2, meaning that the instrument's difficulty index is good [28].
In terms of reliability, the test scored 0.00 in the Quest programme, indicating that the results were unreliable; a good reliability score is higher than 0.6. This value should be read as the reliability of the students' test results rather than of the items themselves, and the unreliable result might be caused by the change of sample. The fit of each item can be seen in fig. 3.

FIGURE 3. Item fit results from QUEST

It can be seen from fig. 3 that 6 items fell outside the range of 0.77-1.30, so those items cannot be categorized as fitting the model, while the other 9 items are considered fit because their scores lie within the 0.77-1.30 range [24]. Items that did not fit the model might be affected by the material structure, construction, and language, which need to be revised. Various factors affect the characteristics of a test instrument, and there are still many shortcomings in constructing these test items; it is hoped that further research can develop this test better [27].
FIGURE 4. Person-item map from QUEST

Fig. 4 shows the mapping of students' skills against the items, with each X symbol representing a student. It can be seen that there were some items for which no student could reach the maximum score (4). For example, item 12.4 appears to be the most difficult item because no student could answer it and obtain the maximum score, and the same happened for some other items. Students were not familiar with the form of the test items, which had some impact on their thinking: because each item must be answered in two steps, some students could not answer the questions well. This also causes the main questions and the reasons to have an unequal distribution, since students could answer the main question well but had difficulty answering the reasoning part; the reason for the answer is the part that makes the item difficult for students. The average ability of the students is lower than the difficulty level of the questions [28]. Therefore, the items have been reviewed in terms of material, construction, and language and can be revised. From fig. 4, the categorization of critical thinking ability can also be analyzed, as shown in the table below.
TABLE 4. Critical Thinking Ability Categories
Number of Students Ability Value Interpretation
0 2.0 – 3.0 Very High
0 1.0 – 2.0 High
40 -1.0 – 1.0 Average
4 -2.0 – (-1.0) Low
0 -3.0 – (-2.0) Very Low
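The categories in Table 4 follow directly from the ability estimates; the sketch below is a minimal illustration using the bin edges above, with hypothetical input values rather than the actual Quest output.

def ability_category(theta):
    """Map a Quest/Rasch ability estimate to the Table 4 category."""
    if theta >= 2.0:
        return "Very High"
    if theta >= 1.0:
        return "High"
    if theta >= -1.0:
        return "Average"
    if theta >= -2.0:
        return "Low"
    return "Very Low"

# Hypothetical ability estimates for four students
print([ability_category(t) for t in (0.3, -0.5, -1.2, 0.8)])
# ['Average', 'Average', 'Low', 'Average']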

Given the importance of developing critical thinking skills, this research is expected to have an impact on students and teachers. Critical thinking is an objective analysis of facts in order to form a judgment. This test can stimulate and improve students' critical thinking skills, especially on impulse and momentum material [29], which students regard as abstract. For teachers, the instrument criteria indicate when a test instrument is ready to be used, namely after it has gone through the determination of those criteria [30].

CONCLUSION

The reasoned multiple-choice test was developed, validated by 3 raters, and revised. Afterwards, the items were tested empirically to find the item characteristics; the results show that some items are compatible with the model and others are not, with an INFIT MNSQ of 0.98±0.27, and the mean difficulty of 0.00±0.53 indicates that the difficulty index of the items is good. However, the reliability score is 0.00, or not reliable, because it reflects the reliability of the students' results. Moreover, 6 of the 15 items did not fit the model while the other 9 items did, so it can be concluded that the items need to be reviewed in terms of material, construction, and language and then revised.
The development of the test instrument in this study still has many weaknesses and needs to be developed in further research. However, this study can give advantages to students and teachers. Test items that match the criteria are ready to be used in learning, so they can stimulate and improve students' critical thinking skills, especially on impulse and momentum material. Teachers can recognize the importance of determining the criteria for a test instrument, since a good test instrument can identify specific indicators.

REFERENCES

1. A. Yilmaz, Particip. Educ. Res., 8 (2), 163–199 (2021).


2. C. Wardani and B. Jatmiko, Int. J. Act. Learn., 6 (1), 17–26 (2021).
3. E. S. Handayani, Yuberti, A. Saregar, and Y. Wildaniati, "Development of STEM-integrated physics e-module
to train critical thinking skills: The perspective of preservice teachers", in "Young Scholar Symposium on
Science Education and Environment (YSSSEE)" Journal of Physics Conference Series 1796, edited by
[Sudarman] (IOP Publishing, Bristol, United Kingdom,2021). pp [012100].
4. C. Walsh, K. N. Quinn, C. Wieman, and N. G. Holmes, Phys. Rev. Phys. Educ. Res., 15 (1), 10135 (2019).
5. A. M. Ilmi and W. Sunarno, “Development of TPACK based-physics learning media to improve HOTS and
scientific attitude”, in " The 5th International Seminar on Science Education" Journal of Physics Conference
Series 1440, edited by [Antuni Wiyarsi] (IOP Publishing, Bristol, United Kingdom,2021). pp [012049].
6. D. T. Tiruneh, M. De Cock, A. G. Weldeslassie, J. Elen, and R. Janssen, Int. J. Sci. Math. Educ., 15 (4), 663–
682 (2017).
7. V. Serevina and K. Luthfi, "Development of discovery learning-based on online learning tools on momentum
and impulse", in "3rd International Conference on Research and Learning of Physics (ICRLP) 2020" Journal
of Physics Conference Series 1876, edited by [Ramli] (IOP Publishing, Bristol, United Kingdom,2021). pp
[012076].
8. H. Putranta and Supahar, "Synthesis of the Cognitive Aspects’ Science Literacy and Higher Order Thinking Skills (HOTS) in Chapter Momentum and Impulse", in "ICRIEMS 6" Journal of Physics Conference Series 1397, edited by [Restu Widiatmono] (IOP Publishing, Bristol, United Kingdom, 2019). pp [012014]
9. E. F. Eldy and F. Sulaiman, Int. J. Humanit. Soc. Sci. Invent., 2 (3), 18–25 (2013).
10. F. S. Putri, E. Istiyono, and E. Nurcahyanto, UPEJ Unnes Phys. Educ. J., 5 (2), 76–84 (2016).
11. B. Miri, B.-C. David, and Z. Uri, Res. Sci. Educ., 37 (4), 353–369 (2007).
12. B. Stein, A. Haynes, M. Redding, T. Ennis, and M. Cecil, Innovations in e-learning, instruction technology,
assessment, and engineering education, 79–82 (2007).
13. A. M. Ilmi, S. Sukarmin, and W. Sunarno, "Development of TPACK based-physics learning media using
macro VBA to enhance critical thinking skills", in "ICMScE 2019" Journal of Physics Conference Series 1521
edited by [Galuh Yuliani] (IOP Publishing, Bristol, United Kingdom,2020). pp [22052].
14. D. Oktasari and Z. R. Putri, "Framework TPACK using Quick Response (QR) code to promote ICT literacy
students in learning physics", in "ICMSE 2019", Journal of Physics Conference Series 1567 edited by [Sutikno]
(IOP Publishing, Bristol, United Kingdom,2020). pp [32078].
15. D. D. Agyei and J. Voogt, Australas. J. Educ. Technol., 28 (4), 547-564 (2012).
16. S. Pamuk, M. Ergun, R. Cakir, H. B. Yilmaz, and C. Ayas, Educ. Inf. Technol., 20 (2), 241–263 (2015).
17. C.-C. Tsai and C. S. Chai, Australas. J. Educ. Technol., 28 (6), 1057-1060 (2012).
18. S. Widodo, "Implementing google apps for education as learning management system in math education", in
"ICMScE" Journal of Physics Conference Series 895 edited by [Liliasari] (IOP Publishing, Bristol, United
Kingdom,2017). pp [12053].
19. L. J. Awuah, J. Instr. Res., 4 (1), 12–22 (2015).
20. M. E. Brown and D. L. Hocutt, J. Usability Stud., 10 (4), 160-181 (2015).
21. E. Istiyono, D. Mardapi, and S. Suparno, J. Penelit. dan Eval. Pendidik., 18 (1), 1–12 (2014).
22. D. S. Asysyifa, I. Wilujeng, and H. Kuswanto, Int. J. Educ. Res. Rev., 4 (2), 245–253 (2019).
23. A. Z. Khairani and H. Shamsuddin, Assess. Learn. within beyond Classr., 1 (1), 417–426 (2016).
24. F. Reffiane, Sudarmin, Wiyanto, and S. Saptono, Pegem Egit. ve Ogr. Derg., 11 (4), 1–8 (2021).
25. N. N. Resta, A. Halim, Mustafa, and I. Huda, "Development of e-learning-based three-tier diagnostics test on
the basic physics course", in "AICMSTE 2019", Journal of Physics Conference Series 1460 edited by [Musri
Musman] (IOP Publishing, Bristol, United Kingdom,2020). pp [12131]
26. T. Sugiarti, I. Kaniawati, and L. Aviyanti, "Development of Assessment Instrument of Critical Thinking in
Physics at Senior High School", in "MSCEIS" Journal of Physics Conference Series 812 edited by [Topik
Hidayat] (IOP Publishing, Bristol, United Kingdom,2017). pp [12018].
27. S. Ramadhan, D. Mardapi, Z. K. Prasetyo, and H. B. Utomo, Eur. J. Educ. Res., 8 (3), 743–751 (2019).
28. A. Prasetya, U. Rosidin, and K. Herlina, "Development of Instrument Assessment for Learning the Polytomous
Response Models to Train Higher Order Thinking Skills (HOTS)", in "YSSTEE2018" Journal of Physics
Conference Series 1155 edited by [Elin Palm] (IOP Publishing, Bristol, United Kingdom,2019). pp [12032].
29. P. Soros, K. Ponkham, and S. Ekkapim, "The results of STEM education methods for enhancing critical
thinking and problem solving skill in physics the 10th grade level", in "International Conference for Science
Educators and Teachers (ISET) 2017" AIP Conference Proceedings 1923 edited by [Chokchai Yuenyong]
(AIP Publishing, New York, USA, 2018). pp [30045]
30. F. Mabruroh and A. Suhandi, "Construction Of Critical Thinking Skills Test Instrument Related The Concept
On Sound Wave", in "MSCEIS" Journal of Physics Conference Series 812 edited by [Topik Hidayat] (IOP
Publishing, Bristol, United Kingdom,2017). pp [12056].
