
Subject: Inquiries, Investigations and Immersion

Module 1: The Research Instruments and Analysis of Data


Teacher: HJ Gerangue

Research Instruments

Research instruments are defined as tools used by researchers to collect, measure, and analyze variables in their
research study. These instruments are carefully designed to elicit, from the population or a sample of the population
of interest, the data needed to answer the research questions.
As a researcher, you have to select or develop research instruments that measure constructs such as intelligence,
achievement, personality, motivation, attitudes, aptitudes, interests, and self-esteem. One or more of these constructs
serve as the independent and dependent variables of your research study.
Generally, there are three types of research instruments: cognitive, aptitude, and affective. Cognitive research
instruments measure intellectual processes such as problem solving, analyzing, and reasoning. Examples of cognitive
research instruments are achievement tests and conceptual understanding tests. Aptitude research instruments measure
mental ability. They are usually used for predicting the future performance of the subject or population of interest. Examples
are process skills tests, critical thinking tests, verbal reasoning tests, numerical reasoning tests, and mechanical reasoning
tests. Affective research instruments assess the subjects’ feelings, attitudes, beliefs, interests, personality, and values.
This type of research instrument is usually expressed through the Likert scale, semantic differential scale, Thurstone scale,
Guttman scale, and rating scale. Examples of affective research instruments are behavior rating scales, product rating
scales, attitude rating scales, and interest inventory scales.
Any research instrument must be valid, reliable, and objective. Objectivity is the extent of agreement among
scorers; thus, an objective research instrument involves no subjective judgment in scoring. Validity, on the other hand,
is the extent to which an instrument measures what it intends to measure. Reliability is the extent to which the
instrument measures accurately and consistently.
The instruments commonly used in quantitative research include tests, questionnaires, and observation
instruments, each discussed below.

Tests

A test is defined as “a set of standard stimuli presented to individuals in order to elicit responses from which a
numerical score can be assigned” (Ary et al., 2014). The stimuli can be presented in written form, in an oral presentation,
or in a performance setting. Tests are used to measure both cognitive and noncognitive variables.
Tests should be valid, objective, and reliable so that the derived scores serve as indicators of the variable of interest in your
research study. Multiple-choice and true-false tests are objective tests because the scoring is done by comparing
the students’ or subjects’ answers with a scoring key; hence, the scorer or researcher does not need to make any
decision (see the short sketch below). On the other hand, essay tests are less objective.
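
To make this concrete, here is a minimal Python sketch of key-based objective scoring; it is only an illustration, and the answer key and student responses below are made up.

```python
# A minimal sketch of objective scoring: each response is compared with
# the scoring key, so no scorer judgment is involved. The key and the
# student's responses are made up for illustration.
KEY = ["B", "D", "A", "A", "C"]        # answer key for a 5-item test
student = ["B", "D", "C", "A", "C"]    # one student's responses

score = sum(1 for given, correct in zip(student, KEY) if given == correct)
print(f"Score: {score} out of {len(KEY)}")  # prints "Score: 4 out of 5"
```

Because the comparison is mechanical, any scorer (or a program) produces the same score, which is what makes the test objective.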

Types of Tests

1. Achievement Tests
   a. Standardized Tests
   b. Researcher-made Tests
   c. Performance Tests
2. Aptitude Tests
   a. Individual Aptitude Tests
   b. Group Tests of Aptitude

1. Achievement tests. These are used most often in education to measure what students have learned. They measure
students’ mastery and proficiency in different areas of knowledge through a standard set of questions that require the
completion of cognitive tasks.
a. Standardized tests. These are published tests developed as a result of careful planning and skillful preparation by
experts. They cover broad academic objectives common to school systems. These tests have known comparative
norms, established validity and reliability, and prescribed directions for administering and scoring. Examples of
standardized tests in the Philippines are licensure examinations for professionals, the National Achievement Test,
Technical and Vocational National Competency Tests or Skills Tests, and the Career Service Examination. College
admission tests are also examples of standardized tests.
b. Researcher-made tests. Sometimes you have to construct your own tests. In doing so, you can tailor your test so that it
contains the specific competencies that you need to measure to answer your specific objectives or specific
research questions. In constructing your own test, you have to establish its validity and reliability by administering
the test to a small group of nonrespondents (for pretesting). If your target respondents are Grade 8 students, then
you may choose a small group of Grade 9 or 10 students to whom the researcher-made test will be administered.
The results obtained from this small group of students can provide information about the validity and
reliability of your constructed test and can help you identify issues or problems present in your test. This way,
you can make the necessary revisions to the items of your test before administering it to your actual respondents.
c. Performance tests. These are used to measure what an individual can do. In this type of test, the researcher observes and
assesses the ability of an individual to perform a certain task. The individual who takes the performance test may be
required to carry out a process or produce a product. The process or product is then evaluated using predetermined criteria
or rubrics.

2. Aptitude tests. These measure a person’s ability to recognize relationships, solve problems, and apply knowledge in various
situations. They attempt to measure an individual’s general ability or potential for learning a specific concept or skill.
Educational research often uses aptitude tests because much of the research in education involves the respondents’
aptitude as one of the variables of interest. Educators have found that aptitude tests may help them predict
school success. There are aptitude tests intended for individuals and for groups.
a. Individual aptitude tests. This type is individually administered to the respondents of a research study. Examples of
individual aptitude tests are the Stanford-Binet Intelligence Scale and the Wechsler tests. These tests are
administered by a trained psychometrician to one person at a time, making them expensive and time-consuming
when applied to a group of individuals.
b. Group tests of aptitude. The first known group test of mental ability was developed in 1917 by Arthur Otis. One such
test, the Army Alpha Intelligence Test, was used to measure the mental ability of 1.5 million
military recruits during World War I.

Questionnaires

A questionnaire is a research instrument that contains a set of questions on a topic or a group of topics designed to obtain
statistically useful information from the respondents of your research study. When constructed properly and
administered responsibly, questionnaires are vital tools for collecting a wide range of information from a large sample of
the population of interest.
There are several requirements that a questionnaire should meet for it to be considered well written, according to
Roopa and Rani (2012). First, the wording of the questionnaire should be easily understood. Second, each question
should call for only one answer at a given instance; if it is a yes-or-no question, it should be constructed so that it is
answerable only by yes or no. Third, the researcher should endeavor to write the questionnaire in a way that elicits the
best possible answer from the respondent. Fourth, the researcher should strive to write the questionnaire in a way that
all possible responses to a query can be extracted from the respondents. Fifth, the questionnaire should have response
options that are mutually exclusive. Sixth, the researcher must ensure that there is variability in the possible responses.
Last, the questionnaire must be written in a manner that lessens social desirability bias; this means that the respondents
will not give answers merely to present themselves in a favorable position.
Many individuals use the terms survey and questionnaire interchangeably. A questionnaire refers to the set of
questions, whereas a survey refers to the set of questions together with the process of collecting and analyzing the
responses to those questions. In other words, a questionnaire describes content, whereas a survey is a broader term
that describes content, process, and method.

Types of Survey Questions

1. Contingency questions
2. Matrix questions
3. Close-ended questions
   a. Dichotomous questions
   b. Multiple-choice questions
   c. Scaled questions
4. Open-ended questions

1. Contingency questions. These are questions that are answered only if the respondents give a particular response to a
previous question.
Example:
Have you traveled abroad in the past 12 months? Check one.
_____Yes _____No
If yes, how many times? Check one.
_____ Once _____ Thrice
_____ Twice _____ Four times and above
2. Matrix questions. A matrix question is a group of multiple-choice questions that is often displayed in tabular form using
rows and columns. The rows usually contain the questions, and the columns contain the set of predefined answer choices
that apply to each question in the row. Often, the choices are in the form of a scale.
Example:
How was your experience with the following services/amenities of the hotel? Put a check mark in the
box that corresponds to your answer.

             Excellent   Very Satisfactory   Satisfactory   Fair   Poor
Staff           [ ]             [ ]               [ ]        [ ]    [ ]
Internet        [ ]             [ ]               [ ]        [ ]    [ ]
Room            [ ]             [ ]               [ ]        [ ]    [ ]
Pool            [ ]             [ ]               [ ]        [ ]    [ ]
3. Close-ended questions. The respondents are asked to choose their answers to a series of questions from a distinct set
of predefined responses. Close-ended questions can be grouped into dichotomous questions, multiple-choice questions, and
scaled questions.
a. Dichotomous questions. In this type of close-ended question, the respondents are asked to choose one of the
two possible responses indicated in the questionnaire. These include yes/no questions, true/false questions, and
agree/disagree questions.
Example:
Type of school:
___Public ____Private

The school policy is effective.
___ Yes ___ No

I love mathematics.
___ Agree ___ Disagree

b. Multiple-choice questions. The respondents have several options to choose from. Multiple-choice questions can be
categorized as rating scale, checklist type, and rank order.
i. Rating scale multiple-choice questions. These require the respondents to choose a numeric value that
corresponds to their response to the question. The number of scale points varies.
Example:
How would you rate your flight experience, with 1 being the lowest? Encircle your rating.
1 2 3 4 5

How useful is the product? Check one.


______ Very useful _____ Slightly useful
______ Useful _____ Not useful
______ Moderately useful

ii. Checklist type multiple choice questions. The respondents can choose one or more answers from the
choices provided.
Example:
Which of the following learning competencies are important in performing your job? Check all
that apply.
( ) Collaboration ( ) Creativity
( ) Communication ( ) Critical thinking

iii. Rank order multiple-choice questions. The respondents order the provided options based on their
preference, usually from the most preferred to the least preferred.
Example:
Rank the following learning competencies based on their extent of use in your job, with 1 being
the most frequently used.
_____ Collaboration ____ Creativity
_____ Communication ____ Critical thinking

c. Scaled questions. A scale is a set of categories or numeric values assigned to individuals, objects, or behaviors for
the purpose of measuring variables. Scaled questions are questions whose response options consist of
gradations or numeric values that indicate various degrees of a given characteristic. Scaled questions are used to
measure attitudes. Examples of these scales are the Likert scale and the semantic differential scale. (A short
scoring sketch follows this list of survey question types.)
i. Likert scale. This is commonly used in research studies that employ questionnaires to measure social
attitudes. A Likert scale measures the respondents’ attitude toward a topic, indicating whether they strongly agree,
agree, are undecided, disagree, or strongly disagree.
Example:
Mathematics is fun and exciting.
____(5) Strongly agree ____(2) Disagree
____(4) Agree ____(1) Strongly disagree
____(3) Neutral

ii. Semantic differential scale. Also known as the bipolar adjective scale, the semantic differential scale
presents respondents with a list of adjective pairs that have opposite meanings. This type of scaled
question is appropriate for research studies that endeavor to measure the psychological meaning of an
object (or an abstract concept) to an individual.
Example:
Rate the school on the following dimensions. Put a check mark on the space that corresponds to
your answer.
Safe    ____ : ____ : ____ : ____ : ____  Dangerous
Dirty   ____ : ____ : ____ : ____ : ____  Clean
Quiet   ____ : ____ : ____ : ____ : ____  Noisy
Strong  ____ : ____ : ____ : ____ : ____  Weak

4. Open-ended questions. These do not have categories or predefined options from which the respondents choose their
responses. In open-ended questions, the respondents answer in their own words without being constrained by a fixed set
of possible responses.
Example:
What do you think is the biggest challenge for educational leaders today?
The biggest challenge is _______________ because ______________
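
As a rough illustration of how responses to scaled questions are turned into numbers for analysis, here is a minimal Python sketch assuming the 5-to-1 point values from the Likert example above; the respondents and their answers are made up.

```python
# A minimal sketch of scoring Likert-scale responses, assuming the
# 5-to-1 point values from the example above. Respondents and items
# are made up for illustration.
LIKERT_POINTS = {
    "Strongly agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly disagree": 1,
}

# Each inner list holds one respondent's answers to three attitude items.
responses = [
    ["Agree", "Strongly agree", "Neutral"],
    ["Disagree", "Neutral", "Agree"],
]

for i, answers in enumerate(responses, start=1):
    scores = [LIKERT_POINTS[a] for a in answers]
    mean = sum(scores) / len(scores)
    print(f"Respondent {i}: item scores {scores}, mean attitude score {mean:.2f}")
```

Each respondent’s mean score can then be summarized across the sample or related to other variables in the study.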

Direct Observation

Direct observation is used to measure the extent to which a particular behavior or characteristic is present. In
this case, the observer acts like a camera or a recording device, keeping a record of the occurrence of the behavior or
characteristic under investigation. Direct observation is used in educational research, specifically in studies concerned
with classroom behavior involving teachers and/or students. Direct observation has many other applications. For
example, in the field of marketing, when a new product is launched, direct observation can be used to find out the
behavior or reaction of people toward the new product.
When using direct observation as a method to measure the variable under investigation, researchers must specify
the behavior involved and devise a systematic procedure for identifying, categorizing, and recording the behavior in a
natural or contrived environment. Generally, using any real-world tool requires the application of proper skills,
procedures, and strategies to derive accurate results.
For quantitative direct observation to be effective, the following preliminary steps must be carefully considered and
implemented. First, the researcher must identify the aspect or aspects of behavior to be observed.
Second, the researcher must provide a clear definition of the behavior in each category. Third, the researcher must
devise a systematic procedure for quantifying the behavior appropriately. Fourth, the researcher must provide a
systematic procedure for recording the selected behavior. Last, the researcher must orient and train the people involved
in the observation process.
In direct observation, you use any of the following devices for recording observations:
1. Checklists. A checklist is the simplest device for direct observation. It contains the list of behaviors to be
observed. During the observation, the observer checks whether each behavior is present or absent. For example, if
you want to observe the motivation techniques used by teachers in the classroom, an observer would check
items such as “Uses verbal reinforcement,” “Uses praise,” or “Provides an awards and rewards system.”
2. Rating scales. A rating scale is a device for direct observation in which the observer evaluates the gradation of an
observed behavior or activity. Typically, rating scales involve three to five points or categories. For example, if you
want to determine the extent of practice of a particular motivation technique, then your scale with the
corresponding points can be as follows: (5) always practiced, (4) frequently practiced, (3) sometimes practiced,
(2) seldom practiced, and (1) not practiced. Rating scales can be completed after an observation period.
3. Coding systems. When it is essential for the observer to categorize and count the frequency of specific predetermined
behaviors as they occur, coding systems are appropriate to use. With this direct observation device, the
observer not only determines whether a behavior occurred but also uses codes to record what actually
occurred. (A short tallying sketch follows this list.)
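
The following minimal Python sketch illustrates the idea of a coding system; the behavior codes and the recorded sequence are hypothetical and merely echo the checklist items mentioned above.

```python
# A minimal sketch of a coding system for direct observation.
# The codes and the recorded sequence are hypothetical.
from collections import Counter

# Predetermined behavior codes (echoing the checklist items above).
CODES = {
    "VR": "Uses verbal reinforcement",
    "PR": "Uses praise",
    "AW": "Provides awards and rewards",
}

# Codes written down by the observer, in the order the behaviors occurred.
observed = ["VR", "PR", "VR", "AW", "VR", "PR"]

frequency = Counter(observed)
for code, label in CODES.items():
    print(f"{label} ({code}): {frequency[code]} occurrence(s)")
```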

Validity and Reliability of Research Instruments

Validation is the process of collecting evidence to support a particular score-based interpretation, establishing that
the inferences made on the basis of the test results are appropriate. According to the Standards for Educational and
Psychological Testing, there are three categories of evidence that can be used to determine the validity of a test instrument.
1. Evidence based on test content. This refers to the contents of the test and their relationship to the variable that the test
intends to measure; it is also known as content validity. A particular test instrument must contain items that
represent the skills and knowledge associated with a topic or a set of topics. So, the researcher must consider
the appropriateness of the test’s contents and the adequacy of the test items in relation to the competencies being
measured. For example, suppose you want to measure the ability of students to solve word problems involving fractions.
In this example, giving only two items is not enough to obtain results that can help you characterize the students’
ability to solve problems involving fractions.
2. Evidence based on relation to a criterion. This is the extent to which test scores are related to one or more outcome
criteria. In this type of validity evidence, the focus is on the criterion, which is used as a basis for making
inferences about the test scores. For example, suppose you have developed a test instrument for Grade 8 mathematics and
you need to gather evidence that your test really measures students’ abilities, knowledge, and skills in Grade 8
mathematics. To do this, you administer your test along with a well-known and previously validated test
instrument in Grade 8 mathematics to a group of students. A substantial correlation between the students’ scores on
the two tests indicates that the new test is valid. This is known as concurrent validity (a short computation sketch
follows this list). Another way to collect criterion-related validity evidence is to determine the predictive validity of a
new test instrument. In this case, the students’ scores on the new Grade 8 mathematics test are correlated with their
grades in Grade 8 mathematics in the future. If a substantial relationship exists, then the new test demonstrates
predictive validity; that is, the new test can be used later to predict the performance of students in Grade 8 mathematics.
3. Construct-related evidence of validity. This refers to the degree to which the test instrument measures a psychological
variable, such as intelligence, motivation, or anxiety, based on a theory. In this type of validity evidence, you
have to establish that the items in your test instrument really measure the psychological variable of interest in
your research study, based on the definitions and characterizations provided by relevant theories and
previous related research.
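
To illustrate the concurrent-validity check described in item 2, here is a minimal Python sketch; the scores below are invented, and the Pearson correlation coefficient is computed directly from its textbook definition.

```python
# A minimal sketch of gathering concurrent-validity evidence: correlate
# students' scores on a new test with their scores on an established,
# previously validated test. All scores below are invented.
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

new_test = [35, 42, 28, 45, 38, 31, 44, 27]     # scores on the new Grade 8 test
validated = [33, 45, 25, 47, 36, 30, 46, 29]    # scores on the validated test

print(f"Concurrent validity coefficient: r = {pearson_r(new_test, validated):.2f}")
```

A coefficient close to 1 would count as a substantial correlation and thus as evidence of concurrent validity; a coefficient near 0 would not.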

A test is said to be reliable if the scores obtained by an individual remain the same when the test is
administered repeatedly. Generally, establishing the reliability of a test instrument is easier than establishing its validity.
The reliability of a test instrument can be measured through its standard error of measurement or reliability
coefficient.
The reliability coefficient of a test instrument can be obtained in any of the following ways:
1. Coefficients derived from correlating individuals’ scores on the same test administered on different occasions. An
example is the test-retest reliability coefficient. Here, the test instrument is administered to a group of individuals on two
occasions, and then their scores are correlated.
2. Coefficients derived from correlating individuals’ scores on different test instruments with equivalent items. An
example is the equivalent-forms reliability coefficient. In this situation, two equivalent forms of the test instrument are
administered to a group of individuals, and then their scores are correlated. The two equivalent forms of the test can be
administered on the same occasion or on different occasions; the resulting reliability coefficients are called the
coefficient of equivalence and the coefficient of stability and equivalence, respectively.
3. Coefficients based on the relationships among scores derived from individual test items or from subsets of items within a
test. Here, the researcher administers the test instrument only once. This is also known as internal consistency. Examples
are split-half reliability, homogeneity measures, and the Kuder-Richardson procedures. (A short computation sketch
follows this list.)
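
To make the first and third kinds of coefficients concrete, here is a minimal Python sketch; the scores and item responses are made up, the test-retest coefficient is computed as a Pearson correlation, and KR-20 (one of the Kuder-Richardson procedures named above) is used for internal consistency.

```python
# A minimal sketch of two reliability estimates. All data are made up;
# in the KR-20 example, 1 = correct and 0 = incorrect for each item.
import math

def pearson_r(x, y):
    """Test-retest reliability: correlate scores from two administrations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def kr20(item_matrix):
    """Kuder-Richardson formula 20: internal consistency of right/wrong items."""
    n = len(item_matrix)        # number of examinees
    k = len(item_matrix[0])     # number of items
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion correct on item j
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)

# Test-retest: the same six examinees take the test on two occasions.
first = [14, 18, 11, 20, 16, 13]
second = [15, 17, 12, 19, 16, 12]
print(f"Test-retest coefficient: {pearson_r(first, second):.2f}")

# Internal consistency: six examinees answering five right/wrong items.
items = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0],
]
print(f"KR-20 coefficient: {kr20(items):.2f}")
```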
