




NIM : 1401065052



A. The Background of the Study

In learning English, there are four skills that students must master: listening, speaking, reading, and writing. For each of these four skills, teachers need to carry out an evaluation to measure students' achievement. Evaluation is a way to improve the teaching and learning process. In addition, evaluation can be used to find out how far the learning objectives have been achieved by the students. According to Stephen in Christine, “Evaluation refers to a process of acquiring, considering, and judging information related to teaching and learning.”1 This means that evaluation is a way to determine how successful the students have been in the learning process.

In general, an evaluation involves three steps: collecting data, analyzing data, and interpreting data. In education, evaluation is used to decide how far students understand the learning materials, to judge the effectiveness of teaching methods, and to help teachers get to know each individual student.

1 Christine C. 2007. Evaluating Teacher Effectiveness in ESL/EFL Contexts. United States of America: The University of Michigan Press. P. 107.

One of the steps of evaluation is collecting data. To obtain information about the students, a teacher needs an instrument. There are two kinds of instruments for collecting data: tests and non-tests. A test is commonly used to obtain specific information about a skill or knowledge, while a non-test is used to obtain specific information about the characteristics of something. The results from these instruments can then be used as a consideration in making a decision.

A test that is administered to students should have the characteristics of a good test: validity, reliability, and practicality. According to Heaton, there are four criteria of a good test: validity, reliability, discrimination, and administration.2 Besides these characteristics, another thing must be considered: the quality of the items, that is, whether each item is good or not. To know the quality of a test, teachers need to carry out an item analysis.

Algina in Sumarsono says that the teacher should pay attention to three things in analyzing the test items: facility value, discriminating power, and the effectiveness of the distractors.3

2 J.B. Heaton. 1988. Writing English Language Tests. USA: Longman Inc. P. 159.

3 Sigit Sumarsono. 2012. Metode Riset Pendidikan Bahasa. Jakarta: UHAMKA Press. P. 89.
SMAN 51 Jakarta has just carried out the summative tests for the first semester. Based on the writer's experience during observation at the school, he learned that the summative tests held in the second semester, including the English test, had never been analyzed by the teachers, so the quality of the items is not known. Out of curiosity about the quality of test items that have never been analyzed by the teachers, the writer chose SMAN 51 Jakarta as the research site.

Based on the facts above, the writer is interested in conducting research on the analysis of the test items with the title: " AN




B. Identification of the Problem

From the explanation above, the writer wants to know: "Did the English Summative Test for the Eleventh Grade Students in the First Semester at SMAN 51 Jakarta in the 2016/2017 Academic Year have good items or not?"


C. Purpose of the Study

The purpose of this study is to determine whether the English Summative Test for the Eleventh Grade Students in the First Semester at SMAN 51 Jakarta in the 2016/2017 Academic Year had good items or not.
D. Limitation of the Study

Based on the problem above, the writer limits the study to the item analysis of the English summative test at SMAN 51 Jakarta. The writer focuses only on the facility value, the discriminating power, and the effectiveness of the distractors.



A. The meaning of the Evaluation

Evaluation is defined as an assessment process for making decisions based on a set of measurement results and focused on the goals that have been formulated. Stufflebeam said, “evaluation is the process of delineating, obtaining, reporting, and applying descriptive and judgmental information about some object’s merit and worth in order to guide decision making.”4 This means that evaluation is a process of obtaining and describing useful information for making decisions.

Airasian stated, "Evaluation is the process of making judgment about what is good or desirable."5 This means that evaluation is the process of deciding whether something is good enough to be used.

Stufflebeam and Airasian define evaluation somewhat differently. Stufflebeam said that evaluation is a process of describing, obtaining, and providing useful information in order to make a decision, while Airasian argued that evaluation is a process of judging whether something is good or bad.

Gronlund said, "Evaluation is a systematic process of collecting, analyzing and interpreting information to determine how far the learning goals have been achieved by students."6 In other words, evaluation is a process for finding out how far the students have achieved the learning goals.

Stufflebeam and Airasian stated that evaluation is the process of making a decision, whereas in education Gronlund said that evaluation is a systematic process for determining the level of learning that students have achieved by collecting, analyzing, and interpreting information.

4 Daniel L. Stufflebeam. 2002. Evaluation Models. New York: Kluwer Academic Publishers. P. 280.

5 Peter W. Airasian. 1996. Assessment in the Classroom. New York: McGraw-Hill, Inc. P. 4.


From the experts' definitions above, it can be concluded that evaluation is a process of making a decision through steps such as obtaining, analyzing, and interpreting information. In education, it is applied to find out how far the learning goals have been achieved by students.


B. The meaning of the Test

The term test comes from the Old French word "testum", which means a plate used for separating out precious metals. In Bahasa Indonesia, the term is translated as a test or an experiment.

More specifically, in education a test is used to determine the level of students' ability in learning. A test is an instrument for teachers to obtain information about students' knowledge and skills in


Allen Phillips said, “A test is commonly defined as a tool or instrument of measurement that is used to obtain data about a specific trait or characteristic of an individual or group.”7 In this statement, Phillips describes a test as a measurement tool used to obtain data about a specific trait or characteristic of a person or group.

6 Gronlund et al. 2009. Measurement and Assessment in Teaching, Tenth Edition. New York: Pearson Education, Inc. P. 98.

Gronlund stated, “Test is an instrument or systematic procedure for measuring a sample of behavior by posing a set of questions in a uniform manner.”8 This means that a test is a means of measuring a sample of someone's behavior by giving them a set of tasks.

Phillips and Gronlund both describe a test as an instrument for measuring the traits or behavior of an individual or group. Moreover, Gronlund's statement explains that a test measures a sample of behavior by posing a set of questions.

Hopkins and Antes stated, “Test is an instrument, a device, or procedure that proposes a sequence of task to which a student is to respond the result of which are used as measures of a specified trait.”9 They argued that a test is a tool, or a set of tasks assigned to students, whose responses become a measure of specific information about their knowledge and skills.

From the experts' definitions above, it can be concluded that a test is an instrument of measurement consisting of a set of tasks assigned to students whose responses become the measure of their specific knowledge or skill.

7 Allen D. Phillips. 1979. Measurement and Evaluation in Physical Education. Canada: John Wiley & Sons, Inc. P. 43.

8 Gronlund et al. 2009. Measurement and Assessment in Teaching, Tenth Edition. New York: Pearson Education, Inc. P. 28.

9 Hopkins & Antes. 1990. Classroom Measurement and Evaluation, Third Edition. Itasca: F. E. Peacock Publishers, Inc. P. 130.

C. Kinds of the Test

The writer wants to know whether the quality of the English summative test at SMAN 51 Jakarta is good or bad. Before analyzing the quality of a test, the writer will describe the kinds of language tests and their functions.

Heaton classified tests into four kinds: achievement tests, proficiency tests, aptitude tests, and diagnostic tests. An achievement test relates to what students are presumed to have learnt from a classroom lesson, a unit, or even a curriculum. A proficiency test is intended to examine students' global language competence; it is usually not related to a classroom lesson or curriculum. An aptitude test measures a student’s probable performance in a foreign language which he or she has not started to learn. The last is the diagnostic test, which is intended to determine the type of difficulty that students encounter in a certain subject, so that the teacher can note the errors and plan appropriate remedial teaching.10

Achievement tests have two branches based on their function: formative and summative tests. Sudijono explained the two branches as follows:

a. Formative Test

A formative test is a test of learning outcomes that aims to find out how far learners have been "formed" (in accordance with the learning objectives) after they have followed the learning process for a certain period of time.

10 J.B. Heaton. 1988. Writing English Language Tests. USA: Longman Inc. P. 171.

b. Summative Test

A summative test is a test of learning outcomes which is held after completing a set of teaching program units.11 This means that a summative test is a test that is administered at the end of a semester in school


D. The Characteristics of a Good Test

In carrying out an evaluation, the instrument or test that is used should be good in order to obtain accurate data; a good test is needed to make the right decision based on accurate data. The common characteristics of a good test are:

a. Validity

Harmer said, “A test is valid if it tests what it is supposed to test.”12 This means that validity is the fit between the measuring instrument and what it is intended to measure. For example, when a teacher wants to know about students’ grammar skill, he must use a grammar test as an instrument.

11 Anas Sudijono. 2006. Pengantar Evaluasi Pendidikan. Jakarta: PT Raja Grafindo Persada. P. 71-72.

12 Jeremy Harmer. The Practice of English Language Teaching, Fifth Edition. P. 381.


b. Reliability

Jack C. Richards stated that “reliability in testing is a measure of the degree to which a test gives consistent results. A test is said to be reliable if it gives the same results when it is given on different occasions or when it is used by different people.”13 This means that a test is reliable, or can be trusted, when it gives consistent results on different occasions and with different people.

c. Practicality
Harris stated, “practicality means the test is easily for administration, scoring, and easy interpretation.”14 Thus, a test is said to be practical if it is easy to administer, score, and interpret.

E. The meaning of the Item Analysis

Besides the three common characteristics of a good test above, another thing should be considered: test item analysis. Item analysis is a process conducted in accordance with certain procedures and steps. Its purpose is to identify which items are effective or of good quality, so that it can be decided whether the items can be used for testing or not. Item analysis can also be used to find out which items are weak, so that they can be revised or even rejected.

13 Jack C. Richards. 1992. Dictionary of Language Teaching and Applied Linguistics. Longman Group UK Limited. P. 465.

14 David P. Harris. 1969. Testing English as a Second Language. New York: McGraw-Hill Publishing Company Ltd. P. 14.

Ahmann mentioned, "The item analysis is reexamining each test item to discover its strength and flaws."15 This means that item analysis is an activity of re-examining each item in the test to find out its strengths and flaws.

Brown stated, “Item analysis is the systematic evaluation of the effectiveness of the individual items on a test.”16 This means that item analysis systematically evaluates the effectiveness of each item on a test.

Algina in Sumarsono said that the teacher should pay attention to three things in analyzing the test items: facility value, discriminating power, and the effectiveness of the distractors.17

From the definitions above, the writer concludes that item analysis is a process of re-examining every item in the test to discover its quality. In addition, with item analysis the teacher can categorize each test item as accepted, to be revised, or rejected. This is decided by reviewing three aspects of the item: facility value, discriminating power, and the distractors.

15 J. Stanley Ahmann. 1967. Evaluating Pupil Growth. Boston: Allyn and Bacon, Inc. P. 184.

16 James Dean Brown. 1996. Testing in Language Programs. Prentice-Hall, Inc. P. 50.

17 Sigit Sumarsono. 2012. Metode Riset Pendidikan Bahasa. Jakarta: UHAMKA Press. P. 89.
a. Facility Value

Whether an item has good quality or not can first be known from its facility value. Items can be said to be good if they are neither too difficult nor too easy; in other words, the facility value of the item is moderate or medium.

Sumarsono said, “The common ratio of difficult, medium, and easy items is 1 : 2 : 1. Expressed as percentages, a good set of questions is composed of 25% easy items, 50% medium items, and the remaining 25% difficult items.”18 This means that a good set of questions does not consist only of easy items or only of difficult ones. Rather, it should contain easy, medium, and difficult items in the proportion Sumarsono describes.

Heaton stated, “… is simply to ascertain the percent of the sample who answered each items correctly, multiple choice item are easy if they are correctly answered by at least 92% of the sample group.”19

To apply this, Heaton gives the following formula for calculating the facility value:

$$FV = \frac{R}{N}$$

FV : Facility Value.

R : The number of correct answers.

N : The number of students taking the test.20

18 Sigit Sumarsono. 2012. Metode Riset Pendidikan Bahasa. Jakarta: UHAMKA Press. P. 91.

19 J.B. Heaton. 1988. Writing English Language Tests. USA: Longman Inc. P

In his book, Sumarsono divides items into five categories based on their facility value.

Classification of items by facility value according to Sumarsono:

0.00 – 0.20 : The items are very difficult

0.21 – 0.40 : The items are difficult

0.41 – 0.60 : The items are medium

0.61 – 0.80 : The items are easy

0.81 – 1.00 : The items are very easy21

After setting out the steps for analyzing the facility value above, the writer will analyze each item of the English summative test at SMAN 51 Jakarta to find out whether the test has a proportional facility value or not. After carrying out the analysis, the writer will sort the items into the three levels of facility value and report the result in order to explain the item quality in terms of facility value to readers.
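The facility value calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `facility_values` and the 0/1 score matrix are hypothetical and are not data from this study.

```python
from typing import List


def facility_values(responses: List[List[int]]) -> List[float]:
    """Return FV = R / N for every item.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    Every row is assumed to have the same number of items.
    """
    n_students = len(responses)  # N: the number of students taking the test
    if n_students == 0:
        raise ValueError("at least one student is required")
    n_items = len(responses[0])
    values = []
    for i in range(n_items):
        correct = sum(row[i] for row in responses)  # R: correct answers on item i
        values.append(correct / n_students)
    return values


# Hypothetical scored answers: 4 students, 3 items.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
fv = facility_values(scores)
print(fv)
```

Each value in `fv` is the proportion of students who answered that item correctly, so it always lies between 0 and 1.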

b. Discriminating Power

Discriminating power indicates the degree to which an item separates the students who performed well from those who performed poorly. These two groups are sometimes referred to as the high and low scorers or the upper- and lower-proficiency students.22

20 Ibid. P. 179.

21 Sigit Sumarsono. 2012. Metode Riset Pendidikan Bahasa. Jakarta: UHAMKA Press. P. 91.

Heaton gave a formula for calculating the discriminating power of an item:

$$DP = \frac{\text{Correct } U - \text{Correct } L}{N}$$

DP : Discriminating Power.

U : Number of students in the upper group who answered correctly.

L : Number of students in the lower group who answered correctly.

N : Number of candidates in one group.23

In his book, Prof. Dr. Suharsimi Arikunto classifies discriminating power as follows:

D = 0.00 – 0.20 = poor

D = 0.21 – 0.40 = satisfactory

D = 0.41 – 0.70 = good

D = 0.71 – 1.00 = excellent24
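As a minimal sketch, the discriminating power formula and Arikunto's bands can be combined as follows. The function names are hypothetical. The "negative" label for values below zero, and the treatment of values that fall between two listed bands (for example 0.205), are assumptions made only for this illustration.

```python
def discriminating_power(correct_upper: int, correct_lower: int, group_size: int) -> float:
    """DP = (Correct U - Correct L) / N, where N is the size of one group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (correct_upper - correct_lower) / group_size


def classify_dp(dp: float) -> str:
    """Map a DP value onto the bands quoted from Arikunto."""
    if dp < 0:
        return "negative"  # assumption: the quoted bands start at 0.00
    if dp <= 0.20:
        return "poor"
    if dp <= 0.40:
        return "satisfactory"
    if dp <= 0.70:
        return "good"
    return "excellent"


# Hypothetical item: 8 of 10 upper-group students and 3 of 10 lower-group
# students answered correctly.
dp = discriminating_power(8, 3, 10)
print(dp, classify_dp(dp))
```

A negative value means that more lower-group than upper-group students answered the item correctly, which usually signals a problem with the item or its answer key.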

To interpret the discriminating power index of an item, according to Sudijono it can be classified as follows:

Index of Discriminating Power    Classification    Interpretation
Negative                         Bad               It has bad discriminating power
0.00 < D < 0.20                  Poor              It has weak/poor discriminating power
0.20 < D < 0.40                  Satisfactory      It has satisfactory discriminating power
0.40 < D < 0.70                  Good              It has good discriminating power
0.70 < D < 1.00                  Excellent         It has excellent discriminating power

22 James Dean Brown. 1996. Testing in Language Programs. Prentice-Hall, Inc. P. 67.

23 J.B. Heaton. 1988. Writing English Language Tests. USA: Longman Inc. P. 180.

24 Suharsimi Arikunto. 2003. Manajemen Pendidikan. Jakarta: Rineka Cipta. P

After analyzing the facility value, the writer proceeds to the discriminating power. In this second step, the writer will analyze how well each item separates the upper and lower students using the procedure above, and will include the result in the analysis report in order to improve the items.

c. Distractor

The last thing that needs to be observed in item analysis is the distractors that the item has. Sumarsono stated, “A distractor is an alternative provided by the test writer as a wrong answer. Distractors are provided to test whether students really know which answer is correct and which is wrong.”25 In other words, distractors are the incorrect options that the test maker includes to check whether students can really tell the correct answer from the wrong ones.


25 Sigit Sumarsono. 2012. Metode Riset Pendidikan Bahasa. Jakarta: UHAMKA Press. P. 91.

The third analysis concerns the effectiveness of the distractors. In this step, the writer will analyze the distractors in each item in order to know whether the items are good or bad, and will then record the effectiveness of the distractors in a table.
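The text does not state a numerical criterion for distractor effectiveness, so the following Python sketch only tallies how often each option was chosen by the upper and lower groups for one item. The function name `option_counts`, the sample answers, and the rule of flagging distractors that nobody chose are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def option_counts(upper: List[str], lower: List[str], options: str = "ABCDE") -> Dict[str, Dict[str, int]]:
    """Count how many students in each group chose each option of one item."""
    up = Counter(upper)
    low = Counter(lower)
    # Counter returns 0 for options that were never chosen.
    return {opt: {"upper": up[opt], "lower": low[opt]} for opt in options}


# Hypothetical answers to a single four-option item whose key is "B".
upper_answers = ["B", "B", "B", "A", "B"]
lower_answers = ["A", "C", "B", "A", "D"]
answer_key = "B"

table = option_counts(upper_answers, lower_answers, options="ABCD")

# Assumed screening rule: a distractor that no student chose is not working.
unused = [
    opt for opt, count in table.items()
    if opt != answer_key and count["upper"] + count["lower"] == 0
]
print(table)
print(unused)
```

A table of this kind makes it easy to see, option by option, whether the wrong answers attract mainly lower-group students.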

As the final result, the writer will prepare an item analysis report that focuses on facility value, discriminating power, and the effectiveness of the distractors, and will give the report to the teacher in order to improve the quality of the test items the teacher writes in the future.


A. The Location and Time of The Research

This research is conducted with the eleventh grade second semester students of SMAN 51 East Jakarta, at Jl. Raya Condet, East Jakarta, in June 2017.

B. The Method of The Research

The writer uses both quantitative and qualitative analysis. Quantitative analysis is used to analyze the difficulty level and discriminating power scores, in order to determine whether each test item is good or not. Meanwhile, qualitative analysis is used to analyze further the causes of the weaknesses of the items.

C. The Instrument of The Study

The research instruments are the students’ answer sheets and the English Summative Test. The students’ answer sheets are collected and then analyzed.

D. The Technique of Data Analysis

The data analysis consists of two components: the difficulty level and the discriminating power.

1. The Difficulty Level

The writer uses the formula for the difficulty level adopted from Heaton:

$$FV = \frac{R}{N}$$

FV : Facility Value.

R : The number of correct answers.

N : The number of students taking the test.

2. The Discriminating Power

The writer also adopts the following formula from Heaton:

$$DP = \frac{\text{Correct } U - \text{Correct } L}{N}$$

DP : Discriminating Power.

U : Number of students in the upper group who answered correctly.

L : Number of students in the lower group who answered correctly.

N : Number of candidates in one group.
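As a worked illustration of the two formulas, consider made-up numbers that are not data from this study. Suppose 40 students take the test and 24 of them answer an item correctly. Suppose also that the upper and lower groups each contain 11 students (about 27% of 40), of whom 9 and 4 respectively answer the item correctly. Note that N is the whole class in the first formula and the size of one group in the second.

```latex
FV = \frac{R}{N} = \frac{24}{40} = 0.60
\qquad
DP = \frac{\text{Correct } U - \text{Correct } L}{N} = \frac{9 - 4}{11} \approx 0.45
```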

E. The Procedure of the Research

The procedure of this research is as follows:

a. Asking permission from the Head of the Study Programme of English Education and the advisors at the University of Muhammadiyah Prof. Dr. Hamka to do this research.

b. Asking the campus for a research letter to do the research in the place selected.

c. Asking permission from the school principal and the English teacher of SMAN 51 East Jakarta to do this research.

d. Asking permission to borrow the English summative test sheet, the students’ answer sheets, and the answer key from the English teacher.

e. Copying the English summative test sheet, the students’ answer sheets, and the answer key from the English teacher.

f. Checking the answer key against the English summative test.

g. Arranging the students’ answer sheets by rank of test score, from the highest score to the lowest.

h. Recapitulating the students’ scores and arranging them into two groups: the upper group and the lower group.

i. Taking the answer sheets of the top 27% as the upper group and the bottom 27% as the lower group.

j. Counting the students who answered each item correctly in each group (upper and lower) and entering the counts in the item analysis tabulation format.

k. Calculating the difficulty level and the discriminating power using the formulas and criteria provided.

l. Deciding which items should be maintained, revised, or discarded.

m. Making alternative revisions.
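Steps g to k above can be sketched as a short Python routine. This is a hedged illustration, not the writer's actual tooling: the function name `item_analysis`, the representation of each answer sheet as a string of chosen options, and the sample sheets are all hypothetical. The facility value is computed here over all students, following the definition of N given with the formula.

```python
from typing import Dict, List


def item_analysis(sheets: List[str], answer_key: str, fraction: float = 0.27) -> List[Dict[str, float]]:
    """Compute FV and DP per item from raw answer sheets.

    sheets: one string of chosen options per student, same length as answer_key.
    """
    if not sheets:
        raise ValueError("at least one answer sheet is required")
    # Score every sheet: 1 for each item answered correctly, else 0.
    scored = [[int(a == k) for a, k in zip(sheet, answer_key)] for sheet in sheets]
    # Step g: rank sheets from the highest total score to the lowest.
    ranked = sorted(scored, key=sum, reverse=True)
    # Steps h and i: take the top 27% and the bottom 27% of the ranked sheets.
    group_size = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:group_size], ranked[-group_size:]
    results = []
    for i in range(len(answer_key)):
        correct_upper = sum(row[i] for row in upper)   # step j
        correct_lower = sum(row[i] for row in lower)   # step j
        correct_all = sum(row[i] for row in scored)
        results.append({
            "item": i + 1,
            "FV": correct_all / len(scored),                    # step k
            "DP": (correct_upper - correct_lower) / group_size  # step k
        })
    return results


# Hypothetical data: 10 students, 4 items.
sheets = ["ABDA", "ABCD", "BCDA", "ABCA", "ACCA",
          "ABCD", "CCDA", "BBCA", "ACDA", "ABCD"]
results = item_analysis(sheets, answer_key="ABCD")
for row in results:
    print(row)
```

With very small classes the upper and lower groups can overlap, so the sketch is only meaningful when there are clearly more students than twice the group size.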
