
INSTITUT PENDIDIKAN GURU

KAMPUS DATO’ RAZALI ISMAIL

21030 KUALA NERUS, TERENGGANU

PROGRAM IJAZAH SARJANA MUDA PERGURUAN (PISMP)

JUNE 2020 INTAKE
SEMESTER 3, YEAR 2 (FEB–JUNE 2022 SESSION)

ASSIGNMENT 1 – ACADEMIC WRITING

STUDENT NAME : NUR SYAHANIS ADUNI BINTI MOHD RAFI

INDEX NUMBER : 2020092340327
IDENTITY CARD NO. : 010102-03-0526
GROUP/UNIT : TESL 3
COURSE CODE AND NAME : EDUP3063 ASSESSMENT IN EDUCATION
LECTURER’S NAME : MADAM NIK AZLEENA BINTI NIK ISMAIL
SUBMISSION DATE : 26 APRIL 2022
TEL. NO. : 016-5040527

Examiner’s/Lecturer’s Comments/Feedback :

Examiner’s/Lecturer’s Signature : Date :

Student Declaration
I confirm that I have reviewed and understood the feedback given by the lecturer.

Student’s Signature : Caanis Date :


1.0 Introduction

Assessment is a central component of the teaching and learning process. It is defined as
the systematic collection and analysis of information to improve student learning (Stassen et
al., 2004). In this task, an online assessment is used to measure students' underlying
capabilities and to gather essential data about their skills. The assessment comprises about
20 multiple-choice questions that primarily aim to assess the students' knowledge and
literacy level in the English subject. Twelve Year 5 students sat the online assessment, and
their responses were collected and analyzed to evaluate the effectiveness of the constructed
items.

2.0 Item Analysis

Item analysis is a statistical analysis of students' responses to a test. Collecting
and summarizing student responses yields quantifiable, objective information that can be
used to determine the quality of test items and improve the assessment's efficiency (Sharma,
2021). Item analysis also evaluates the performance of individual items with respect to
some external criterion or to the remaining test items (Thompson & Levitov, 1985). In
general, item analysis includes the difficulty index, the discrimination index, and distractor
analysis.

2.1 Difficulty Index Analysis


The Difficulty Index is used to determine the difficulty level of test items. This metric
requires teachers to compute the proportion of students who answered the test item
correctly (Aris, 2020). The difficulty index ranges from 0 to 1, with higher values indicating
easier questions and lower values indicating more difficult ones. According to
Bloom et al. (1971), a decent distribution of test results can be attained if the difficulty
index, p, is between 0.20 and 0.80, with a mean between 0.50 and 0.60. The table below
shows how many students chose each answer option for Questions 1 to 10.
Question A B C D
1 5 5* 2 0
2 1 0 11* 0
3 0 0 12* 0
4 9* 0 3 0
5 2 3* 6 1
6 0 0 7* 5
7 0 12* 0 0
8 1 7* 0 4
9 0 0 12* 0
10 10* 2 0 0
(*) denotes correct answer.

Difficulty Index, p = (number of students who chose the correct answer) / (total number of students, n)

Item    Calculation    p
1       5 / 12         0.42
2       11 / 12        0.92
3       12 / 12        1.00
4       9 / 12         0.75
5       3 / 12         0.25
6       7 / 12         0.58
7       12 / 12        1.00
8       7 / 12         0.58
9       12 / 12        1.00
10      10 / 12        0.83

From these calculations, we can interpret that question items 2, 3, 7 and 9 fall within the difficulty index range of 0.9 to 1.0, indicating that these four questions were the easiest to answer, since almost all the students answered them correctly. Items 2, 3, 7 and 9 should therefore be carefully analyzed and probably deleted or revised. Even if such questions are kept in the test, they serve little purpose: an item that nearly every student answers correctly cannot separate strong performers from weak ones, and a very easy item that nevertheless shows a high discrimination index points to a faulty item or an incorrect key.

Moving on to question item 1, its difficulty index of 0.42 makes it moderately difficult, while item 5 (p = 0.25) was the most difficult item on the test. Question items 6 and 8 (p = 0.58) were moderately easy, while items 4 and 10 (p = 0.75 and 0.83) were easy. Therefore, based on this item evaluation, the questions whose difficulty indices fall between 0.25 and 0.83 should be retained, which is broadly consistent with the 0.20 to 0.80 range recommended by Bloom et al. (1971). Moderately easy or moderately difficult items have the greatest discriminative ability, making them the best test items for differentiating between poor and good performers.
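As an illustration, the same difficulty indices can be reproduced programmatically. The following minimal Python sketch (not part of the original assessment) uses the correct-answer counts transcribed from Table 1 and applies the 0.20 to 0.80 retention guideline attributed to Bloom et al. (1971):

```python
# Difficulty index sketch: p = (number correct) / (total students).
# Correct-answer counts are transcribed from Table 1.
correct_counts = {1: 5, 2: 11, 3: 12, 4: 9, 5: 3,
                  6: 7, 7: 12, 8: 7, 9: 12, 10: 10}
N = 12  # total number of students tested

for item, correct in correct_counts.items():
    p = correct / N
    # Bloom et al. (1971): keep items with p roughly between 0.20 and 0.80
    verdict = "keep" if 0.20 <= p <= 0.80 else "review"
    print(f"Item {item:2d}: p = {p:.2f} ({verdict})")
```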
2.2 Discrimination Index Analysis

The Discrimination Index, on the other hand, relates to how well an item
differentiates between high and low scorers. In other words, we should expect
high-performing students to pick the correct answer to each question more
frequently than low-performing students. Table 2 shows the outcomes of the 10 questions
on the online assessment.

Student                 Total Score (%)   Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10
Ainul Mardhiyyah        100               1   1   1   1   1   1   1   1   1   1
Nur Syarah Afrina       100               1   1   1   1   1   1   1   1   1   1
Nor Qaseh Ariana        80                1   1   1   1   0   1   1   0   1   1
Nurul Alia Amirah       80                0   1   1   1   0   1   1   1   1   1
Wan Irdina Insyirah     80                1   1   1   0   0   1   1   1   1   1
Nur Adawiyah Balqis     70                0   1   1   1   0   1   1   1   1   0
Nur Qaseh Rania         70                0   1   1   0   0   1   1   1   1   1
Qurratu Aini Syafiah    70                1   1   1   1   1   0   1   0   1   0
Muhammad Aqil           60                0   1   1   1   0   0   1   0   1   1
Muhammad Syakil         60                0   1   1   0   0   0   1   1   1   1
Nur Hafizatul Solehah   60                0   1   1   1   0   0   1   0   1   1
Nur Alya Batrisya       50                0   0   1   1   0   0   1   0   1   1

“1” indicates the answer was correct; “0” indicates it was incorrect.
Discrimination Index, d = (R_A − R_B) / (½ × (n_A + n_B))

where R_A is the number of correct answers in the upper group, R_B is the number of correct answers in the lower group, and n_A and n_B are the group sizes. Here the upper and lower groups are the four highest-scoring and four lowest-scoring students, so the denominator is ½ × 8 = 4.

Item    R_A    R_B    Calculation         d
1       3      0      (3 − 0) / (½ × 8)   0.75
2       4      3      (4 − 3) / (½ × 8)   0.25
3       4      4      (4 − 4) / (½ × 8)   0.00
4       4      3      (4 − 3) / (½ × 8)   0.25
5       2      0      (2 − 0) / (½ × 8)   0.50
6       4      0      (4 − 0) / (½ × 8)   1.00
7       4      4      (4 − 4) / (½ × 8)   0.00
8       3      1      (3 − 1) / (½ × 8)   0.50
9       4      4      (4 − 4) / (½ × 8)   0.00
10      4      4      (4 − 4) / (½ × 8)   0.00

From the discrimination index calculation of each item, we can conclude that question items 1, 5, 6 and 8 have a good discrimination index of 0.5 or above, while items 2 and 4 (d = 0.25) discriminate positively but only marginally. These items could be used as ranking questions to help separate the stronger candidates from the weaker ones. All six items recorded positive discrimination indices, showing that the majority of the students in the upper group answered them correctly. This is notable because, as Crocker and Algina (1986) indicate, for an item to be good, more of the knowledgeable students than the poor students should get that particular item right.

Additionally, question items 3, 7, 9 and 10 have a discrimination index of zero. Zero discrimination typically means an item is too easy or too hard, so that every student in both groups either got the item right or missed it, or that the item is ambiguous. Further analysis is therefore needed to identify the issues with these questions and avoid erroneous items in the future.
2.3 Distractor Analysis

The idea of upper and lower groups also applies when assessing the distractors, although
the analysis and expectations differ slightly from standard item discrimination. Instead
of expecting a larger value, we should logically expect more students from the lower group
to choose the distractors. Nonetheless, each distractor can be given its own discrimination
value in order to assess how the distractors work and, ultimately, to refine the effectiveness
of the test item itself. Generally, three features show that a distractor is problematic or
ineffective in a test item:

i. More students in the upper group chose the distractor than students in the
lower group.
ii. Upper- and lower-group students are equally likely to choose the correct
response, so the item does not separate them.
iii. No student chooses the option at all.

Item 1 Option A B* C D

Upper Group 1 3 0 0
Lower Group 3 0 1 0
Interpretation: The correct answer is (B), yet most of the students chose
(A). According to the distractor analysis, this item is mis-keyed rather than
merely having an implausible distractor, and (D) is a non-functional
distractor, since none of the students chose it.

Item 2 Option A B C* D

Upper Group 0 0 4 0
Lower Group 1 0 3 0
Interpretation: According to the distractor analysis, this item is acceptable
for the question bank, as more upper-group than lower-group students
chose the correct answer.
Item 3 Option A B C* D

Upper Group 0 0 4 0
Lower Group 0 0 4 0
Interpretation: This item needs major revision or rewriting, since all of its
distractors are non-functional. The item is extremely easy and shows zero
discrimination. Such items should be removed from the question bank, and
dropping them from the exam is justified.

Item 4 Option A* B C D

Upper Group 4 0 0 0
Lower Group 3 0 1 0
Interpretation: According to the distractor analysis, this item is acceptable
for the question bank, as more upper-group than lower-group students
chose the correct answer.

Item 5 Option A B* C D

Upper Group 0 2 2 0
Lower Group 1 0 2 1
Interpretation: The distractor analysis shows that all the distractors are
functional, so this item has acceptable indices and can be kept in the
question bank for further use, although the distractors could still be refined
to improve their efficiency.

Item 6 Option A B C* D

Upper Group 0 0 4 0
Lower Group 0 0 0 4
Interpretation: According to the distractor analysis, this item is acceptable
for the question bank, as more upper-group than lower-group students
chose the correct answer.
Item 7 Option A B* C D

Upper Group 0 4 0 0
Lower Group 0 4 0 0
Interpretation: This item needs major revision or rewriting, since all of its
distractors are non-functional. The item is extremely easy and shows zero
discrimination. Such items should be removed from the question bank, and
dropping them from the exam is justified.

Item 8 Option A B* C D

Upper Group 0 3 0 1
Lower Group 1 1 0 2
Interpretation: According to the distractor analysis, this item is acceptable
for the question bank, as more upper-group than lower-group students
chose the correct answer.

Item 9 Option A B C* D

Upper Group 0 0 4 0
Lower Group 0 0 4 0
Interpretation: This item needs major revision or rewriting, since all of its
distractors are non-functional. The item is extremely easy and shows zero
discrimination. Such items should be removed from the question bank, and
dropping them from the exam is justified.

Item 10 Option A* B C D

Upper Group 4 0 0 0
Lower Group 4 0 0 0
Interpretation: This item needs major revision or rewriting, since all of its
distractors are non-functional. The item is extremely easy and shows zero
discrimination. Such items should be removed from the question bank, and
dropping them from the exam is justified.
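The option tallies above also lend themselves to a simple automated check. The sketch below uses Item 1's counts as an example; the dictionary layout is an illustrative choice, not from the original data. It flags non-functional distractors (features ii and iii above) and distractors that attract more upper-group than lower-group students (feature i):

```python
# Distractor analysis sketch: tally each option separately for the upper
# and lower groups, then flag problematic distractors. Counts are from
# the Item 1 table above; "key" marks the correct option.
item1 = {
    "key": "B",
    "upper": {"A": 1, "B": 3, "C": 0, "D": 0},
    "lower": {"A": 3, "B": 0, "C": 1, "D": 0},
}

for option in "ABCD":
    if option == item1["key"]:
        continue  # the key is judged by the discrimination index instead
    hi, lo = item1["upper"][option], item1["lower"][option]
    if hi + lo == 0:
        print(f"Option {option}: non-functional (no one chose it)")
    elif hi > lo:
        print(f"Option {option}: suspect (upper group preferred a distractor)")
    else:
        print(f"Option {option}: working as intended")
```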
3.0 Strategies in controlling the validity and reliability of online assessments

Due to the COVID-19 pandemic in 2020, many schools made a quick switch to online
instruction and assessment. Online assessment provides the extra benefit of giving
massive cohorts of students immediate and detailed feedback on their individual
performance, allowing students to focus on their weaker areas for remediation. This has
ramifications for both teaching and assessment efficiency, as the default is frequently to
measure fundamental recall knowledge in a multiple-choice exam. Teachers must therefore
analyze the validity and reliability of their assessments to ensure that the authenticity and
structure of each assessment fit the test design.

One of the most compelling pedagogical justifications for online assessment is the ability
to offer accurate and timely feedback that adheres to good feedback practice. While
grades and marks are vital aspects of feedback, on their own they do not improve learning
and can even impair it and diminish motivation. Online assessment, on the other hand,
allows for the provision of not just a mark but also specific feedback on the correct
responses and the reasoning behind them. Research suggests that formative multiple-choice
exercises and assessments can be an efficient and successful online approach to
encouraging student learning and autonomy when aligned with the principles of good
feedback.

Aside from that, online assessment, often known as e-assessment, is possibly the most
challenging aspect of online primary education. Combined with a reduction in resources,
the move online increased the burden on students, teachers, and assessors. To support
this shift toward high-throughput marking, some types of tests can be constructed to be
marked automatically, providing students with quick, precise feedback on their responses,
including a brief explanation of the correct and incorrect answer options. While designing
and writing online assessments that develop or measure various cognitive levels is difficult
and time-consuming, automated marking and feedback for some question types, such as
multiple choice, true/false, matching, and short response, increase efficiency for both
students and teachers.

Furthermore, there are several tangible disadvantages of online assessment that may
affect specific students more than others. For example, online assessment has the potential
to exacerbate educational and social disadvantage, because access to, and literacy in,
digital technologies varies with age, gender, and socioeconomic factors. Online assessments
also increase the opportunity to cheat, which distorts students' assessment results. Since
students are given ample time to complete the test, and can do so in a variety of settings,
they are more prone to cheating in order to obtain high grades. In certain circumstances,
their parents or family members complete the assessments on their behalf. This situation
has made it difficult for teachers to measure pupils' learning and academic achievement
levels efficiently.

To maintain the integrity of the assessment, assignments should yield unambiguous proof
that the work, whatever its type, was produced by the candidate. Combating the many
sorts of cheating in the online world requires a range of measures. Steps such as verifying
the test taker, using plagiarism-detection software, and supervised monitoring of test
conditions can directly prevent cheating, while other measures, such as using authentic
assessments, can limit both the opportunity and the motive to cheat.

4.0 Positive impact of online assessment practices on teachers and students

Online assessment has brought about a shift in the way traditional tests are conducted.
Its benefits are numerous, particularly in light of the continuing pandemic affecting the
worldwide education environment, and they extend to both students and the teacher
administering the assessment. Thanks to technological advancements, nature no longer
has to suffer the weight of human insensitivity: the negative environmental impact of
ruthlessly chopping down trees for paper is obvious. Using an online assessment system,
on the other hand, ensures that institutions and organizations can go paperless by neither
printing exam papers nor keeping paper records of candidates.

One of the numerous benefits of online assessment practices is that they allow
teachers to scale up their evaluation process without any problems. Online assessment is
also gaining ground among students, owing to the accessibility and ubiquity of access to
education. Students' comfort level with online evaluation tools is increasing by the day, and
the more familiar they are with such interfaces and procedures, the more efficiently they
will use the system.

When the human, logistical, and administrative expenses associated with traditional test
settings are considered, it is reasonable to identify an online assessment system as the most
cost-effective approach for conducting examinations at scale. There is no need, for example,
for pupils to gather in vast classrooms to take the test. The flexibility of time and
location is very appealing to students. After all, online assessments require neither the
rental of a classroom nor the engagement of an invigilator for manual surveillance.
5.0 Raw Data

The raw data are the Science and Mathematics test scores of fifteen students (A to O), as used in the calculations below:

Student        A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
Science        80   75   60   90   64   78   80   98   75   76   78   99   92   85   20
Mathematics    80   85   80   72   85   77   83   88   76   75   37   80   78   75   72

6.0 Statistical Calculation

From the data above, the mean, standard deviation, and z-scores are calculated below.

6.1 Mean
The mean is the average of the numbers. It is easy to calculate: add up all the
numbers in the data set, then divide by how many numbers there are. In short, the
mean is the sum divided by the count (Math Is Fun, 2021).

Subject        Calculation

Science        x̄ = Σx / N
                 = (80 + 75 + 60 + 90 + 64 + 78 + 80 + 98 + 75 + 76 + 78 + 99 + 92 + 85 + 20) / 15
                 = 1150 / 15
                 = 76.67

Mathematics    x̄ = Σx / N
                 = (80 + 85 + 80 + 72 + 85 + 77 + 83 + 88 + 76 + 75 + 37 + 80 + 78 + 75 + 72) / 15
                 = 1143 / 15
                 = 76.20
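As a quick check, the same means can be verified with a few lines of Python; the score lists below are transcribed from the raw data above:

```python
# Mean sketch: the sum divided by the count, matching the calculation above.
science = [80, 75, 60, 90, 64, 78, 80, 98, 75, 76, 78, 99, 92, 85, 20]
mathematics = [80, 85, 80, 72, 85, 77, 83, 88, 76, 75, 37, 80, 78, 75, 72]

for name, scores in [("Science", science), ("Mathematics", mathematics)]:
    mean = sum(scores) / len(scores)
    print(f"{name}: mean = {mean:.2f}")  # 76.67 and 76.20
```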

6.2 Standard Deviation


In descriptive statistics, the standard deviation measures the degree of dispersion,
or scatter, of the data points relative to their mean (CueMath, n.d.). It tells how the
values are spread across the data sample and measures the variation of the data
points from the mean. The standard deviation of a sample, statistical population,
random variable, data set, or probability distribution is the square root of its variance.

Subject: Science (x̄ = 76.67)

Student   (x − x̄)²
A         (80 − 76.67)² = 11.0889
B         (75 − 76.67)² = 2.7889
C         (60 − 76.67)² = 277.8889
D         (90 − 76.67)² = 177.6889
E         (64 − 76.67)² = 160.5289
F         (78 − 76.67)² = 1.7689
G         (80 − 76.67)² = 11.0889
H         (98 − 76.67)² = 454.9689
I         (75 − 76.67)² = 2.7889
J         (76 − 76.67)² = 0.4489
K         (78 − 76.67)² = 1.7689
L         (99 − 76.67)² = 498.6289
M         (92 − 76.67)² = 235.0089
N         (85 − 76.67)² = 69.3889
O         (20 − 76.67)² = 3211.4889

σ = √( Σ(x − x̄)² / N )
  = √( 5117.3335 / 15 )
  = √341.1556
  = 18.47

Subject: Mathematics (x̄ = 76.20)

Student   (x − x̄)²
A         (80 − 76.2)² = 14.44
B         (85 − 76.2)² = 77.44
C         (80 − 76.2)² = 14.44
D         (72 − 76.2)² = 17.64
E         (85 − 76.2)² = 77.44
F         (77 − 76.2)² = 0.64
G         (83 − 76.2)² = 46.24
H         (88 − 76.2)² = 139.24
I         (76 − 76.2)² = 0.04
J         (75 − 76.2)² = 1.44
K         (37 − 76.2)² = 1536.64
L         (80 − 76.2)² = 14.44
M         (78 − 76.2)² = 3.24
N         (75 − 76.2)² = 1.44
O         (72 − 76.2)² = 17.64

σ = √( Σ(x − x̄)² / N )
  = √( 1962.40 / 15 )
  = √130.8267
  = 11.44
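Note that the formula divides by N rather than N − 1, i.e. it treats the fifteen scores as a whole population rather than a sample. Python's statistics.pstdev implements exactly this population standard deviation, so the results above can be verified as follows:

```python
# Population standard deviation sketch, matching the formula above
# (division by N, not N - 1). statistics.pstdev implements this.
import statistics

science = [80, 75, 60, 90, 64, 78, 80, 98, 75, 76, 78, 99, 92, 85, 20]
mathematics = [80, 85, 80, 72, 85, 77, 83, 88, 76, 75, 37, 80, 78, 75, 72]

print(statistics.pstdev(science))      # ~18.47
print(statistics.pstdev(mathematics))  # ~11.44
```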

6.3 Z-Score
A z-score (also called a standard score) gives an idea of how far a data point is
from the mean. More technically, it measures how many standard deviations a raw
score is below or above the population mean. Simply put, z-scores are a way to
compare results to a “normal” population. Results from tests or surveys have
thousands of possible values and units, and those results can often seem meaningless.
A z-score, however, provides the information needed to compare two or more
variables.

Subject        Student      Calculation

Science        Student A    z = (x − x̄) / σ
                              = (80 − 76.67) / 18.47
                              = 0.18

               Student B    z = (x − x̄) / σ
                              = (75 − 76.67) / 18.47
                              = −0.09

Mathematics    Student A    z = (x − x̄) / σ
                              = (80 − 76.2) / 11.44
                              = 0.33

               Student B    z = (x − x̄) / σ
                              = (85 − 76.2) / 11.44
                              = 0.77
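A small helper function reproduces all four z-scores from the means and standard deviations derived above (the function name is an illustrative choice):

```python
# Z-score sketch: z = (x - mean) / sigma, using the values derived above.
def z_score(raw: float, mean: float, sigma: float) -> float:
    return (raw - mean) / sigma

print(round(z_score(80, 76.67, 18.47), 2))  # Science, Student A ->  0.18
print(round(z_score(75, 76.67, 18.47), 2))  # Science, Student B -> -0.09
print(round(z_score(80, 76.20, 11.44), 2))  # Maths,   Student A ->  0.33
print(round(z_score(85, 76.20, 11.44), 2))  # Maths,   Student B ->  0.77
```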
7.0 Comparison of the achievements of students A and B in Science and
Mathematics

Students' academic achievement can be influenced by a variety of factors, including
nutritional status, demographics, and socioeconomic status (Anuar et al., 2005). No nation
can afford to waste its most valuable national resource: its people's intellectual capacity.
The z-score analysis in this study demonstrates that when suitable evaluation and teaching
methods are used in the classroom, students' academic performance in Science and
Mathematics can improve effectively.

For the Science subject, student A's z-score is positive, indicating that his/her raw score
is higher than the mean (Saul, 2019). A score above the mean shows that student A
achieved good academic performance. Meanwhile, student B's z-score is negative, indicating
that his/her raw score is below the mean (Saul, 2019). This means student B performed
poorly and needs to study and revise the subject more in order to obtain higher marks and
better academic performance in the next examination.

For the Mathematics subject, the z-scores of both student A and student B are positive.
This shows that both students scored above the mean. However, student B achieved better
academic performance, since his/her z-score is larger than student A's. In general, the
higher the z-score, the better the student's academic performance.

8.0 Conclusion

In a nutshell, item analysis allows teachers to exert additional quality control over their
tests. Well-defined learning objectives and well-crafted items get us started, but item
analyses provide feedback on how effective we were as teachers (Gott, n.d.). When
examining the features of each question, we will frequently uncover how they may or may
not have assessed the intended learning outcome, which is a validity issue. Moreover, when
we modify items to remedy these issues, the item analysis has helped us improve the likely
validity of the test the next time we administer it.
9.0 References
Anuar, M., et al. (2005). Effects of nutritional status on academic performance of Malaysian
primary school children. Asia Pac J Public Health, 81–87.

Aris, S. (2020). Item analysis of English summative test: EFL teacher-made test.
Indonesian EFL Research and Practices, 35–54.

CueMath. (n.d.). Standard deviation. Retrieved from CueMath:
https://www.cuemath.com/data/standard-deviation/

Gott, F. (n.d.). Item analysis of classroom tests: Aims and simplified procedures. Retrieved
from Udel.edu: http://www1.udel.edu/educ/gottfredson/451/unit9-guidance.htm#:~:text=Item%20analyses%20are%20intended%20to,test%20by%20improving%20its%20reliability.

Math Is Fun. (2021). How to find the mean. Retrieved from MathIsFun.com:
https://www.mathsisfun.com/mean.html

Saul, M. (2019). Z-score: Definition, calculation, and interpretation. Retrieved from
SimplyPsychology: https://www.simplypsychology.org/z-score.html#:~:text=A%20positive%20z%2Dscore%20indicates,is%20below%20the%20mean%20average.

Sharma, L. R. (2021). Analysis of difficulty index, discrimination index and distractor
efficiency of multiple choice questions of speech sounds of English. International
Research Journal of MMC, 15–28.

Stassen, M. L., et al. (2004). Program-based review and assessment: Tools and techniques
for program improvement. Office of Academic Planning & Assessment, University of
Massachusetts Amherst.

Thompson, B., & Levitov, J. E. (1985). Using microcomputers to score and evaluate items.
Retrieved from ERIC: eric.ed.gov/?id=EJ320128
