
I: Introduction to Psychological Assessment

Overview of Psychological Assessment


The word test causes some anxiety in people. Be it a math test, a driving test, or a university entrance test, it makes many feel some dread. The prospect of endless review nights and the possibility of failing are some of the reasons people do not perceive the word test in a positive light. However, we cannot deny that tests are important tools in society: they serve as a means to select people qualified for a job or to measure whether a student has the skills to advance to the next level. Imagine if people were selected for jobs without some sort of test or measurement; it would be chaos!
In this module, the importance of psychological tests and assessments will be the main focus. The basic concepts and principles of assessment will be explored to help you better understand the nature of psychological tests and assessments.

Basic Concepts
A test is one of the many tools used in the field of psychology. It can be either a device or a technique that allows behavior to be quantified or predicted. Meanwhile, the term testing refers to the process of utilizing a test, such as administering it. Putting a modifier before the word test changes its meaning. The bare term testing, though, has become largely obsolete in practice, and practitioners prefer the term assessment.
When we use the term psychological testing, this refers to the systematic procedure of gathering samples of behavior in relation to cognitive or affective functioning (Urbina, 2004). The data collected from this process are used as a basis to establish the standards of the test or tool.
Psychological testing pertains to the use of tests to evaluate an individual. On the other hand, psychological assessment integrates the data collected from various assessment tools in order to arrive at an evaluation. Earlier it was mentioned that a test helps to quantify behavior, and quantification implies measurement: numbers are assigned to objects, in this case behaviors and personality traits, according to a certain set of rules (Christensen, 1991).
Psychological testing and assessment allow behavior to be scaled, something once thought impossible, since these processes employ tests that give us an idea of, say, a person's IQ or level of aggression. This makes the data objective, leading to a less biased judgment of an individual, that is, an evaluation. Evaluation is more extensive than measurement, as inferences are drawn from various assessment procedures.
For a better understanding of the aforementioned concepts, please refer to the table below (adapted from Cohen & Swerdlik, 2010).

Table 1. Major Differences between Testing and Assessment

Note: Adapted from Psychological Testing and Assessment: An Introduction to Tests and Measurement, 7th Edition, by Cohen & Swerdlik. Copyright ©2009 by The McGraw-Hill Companies, Inc.

Brief History of Testing and Assessment


Testing was established earlier in the East than in the West. As early as 2200 B.C., the Chinese were using tests as a means of selecting government officials who would serve the throne and the nation. This civil service exam was deemed a sophisticated method for its time because of its continuous improvement and development (Kaplan & Saccuzzo, 2015). The West likely adopted this selection process, and it became the norm for obtaining a job or serving in the military. From there, tests and testing started to make their mark on the world.

In the 1800s, the field of psychology was blooming, but some concepts were still vague. It was in 1838 that Jean Esquirol differentiated mental illness from mental retardation, but a test to measure the latter came only in 1905, when Alfred Binet and his partner, Theodore Simon, developed the first intelligence test on commission from the French government. Theories, empirical studies, and measurements were still being developed at that time.
In 1904, Charles Spearman postulated that intelligence is made up of a single g factor (a general factor) and a number of s factors (specific factors). In the same year, Karl Pearson contributed correlation measures, which are an immense help in testing.

In 1916, the Binet-Simon test came to the United States and was renamed the Stanford-Binet Intelligence Test after Lewis Terman and his team at Stanford University revised and reworked it. The following year, the Army Tests were developed under the lead of Robert Yerkes. He and his team made the Army Alpha, a verbal test for native English-speaking recruits, and the Army Beta, a non-verbal test for immigrant recruits, for the First World War. Building on this selection process, Robert Woodworth developed the Personal Data Sheet in 1918, drawing on soldiers' answers to questionnaires (Cohen & Swerdlik, 2017).
Later on, he made the Woodworth Psychoneurotic Inventory for civilian test takers, considered by some the first personality test (Santos & Pastor, 2009). The Rorschach Inkblot Test came in 1920, developed by Hermann Rorschach, a Swiss psychiatrist. It was left without a clear scoring system, as he passed away before completing it; still, this did not become a hindrance, and the inkblot test remains one of the most popular and widely researched projective tests. Aside from this, Henry Murray and Christiana Morgan developed another projective technique, the Thematic Apperception Test.

As psychological tests began to gain popularity, many tests were published, and in 1921 the Psychological Corporation, the first test publisher, was founded by Cattell, Thorndike, and Woodworth (Santos & Pastor, 2009).
Some five years later, the SAT, or Scholastic Aptitude Test, was made and published by the College Entrance Examination Board. This marked the development of tests exclusive to the educational setting. One year later, the Vocational Interest Blank was published in its first edition. Over the years, development in the field of testing continued; theories and ideas were researched and written up, and these informed test development. A few notable developments were: the publication of Lauretta Bender's Bender Visual Motor Gestalt Test in 1938, which can help detect organic problems in a person; the Wechsler-Bellevue Intelligence Scale by David Wechsler in 1939, later revised into the Wechsler Adult Intelligence Scale and, in 1949, the Wechsler Intelligence Scale for Children; the development and publication of the Minnesota Multiphasic Personality Inventory in 1942; and the introduction of the coefficient alpha to measure the internal consistency, or reliability, of tests and other assessment methods.
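The coefficient alpha mentioned above has a simple computational form: it compares the sum of the individual item variances with the variance of the total scores. The sketch below is a minimal illustration; the item-score matrix is invented, and a real analysis would normally use a statistics package rather than hand-rolled code.

```python
# Minimal sketch of coefficient (Cronbach's) alpha.
# Rows = examinees, columns = items; the data are invented for illustration.

def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(scores):
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

data = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(data), 2))  # high alpha: the items vary together
```

Values near 1 suggest the items consistently measure the same construct; psychometric texts often cite 0.70 or higher as acceptable, though the appropriate threshold depends on the test's purpose.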


The history of tests and testing is a story of discovery and improvement. Building on the ideas and theories of earlier theorists and developers, new tests and statistical measures continue to be created to improve the quality of the tests society uses.

Goals of Psychological Tests and Assessment


Tests and assessments are uniform procedures through which data are obtained from individuals or groups. For example, suppose you are experiencing chills at night and a fluctuating body temperature. Worried that these are not typical flu symptoms, you consult a doctor, who orders a laboratory test for dengue.
Similar to a medical doctor who uses laboratory results (e.g., blood samples) to confirm a diagnosis, a psychologist uses a battery of psychological tests to answer the referral question (e.g., using the ASI-5 to measure the severity of drug abuse). This brings us to a practical question: what are the processes of psychological assessment?

Assessment Process
As a general practice, the process of assessment starts with a referral for assessment from a source such as a teacher, guidance counselor, social worker, judge, or human resource recruiter. Referral questions guide the assessor on what needs to be checked in the assessee. Some examples of referral questions are: the mental age of a child, the capability of an employee to handle a managerial position, grounds for an annulment case, and whether a person has a substance use disorder.

The assessor then conducts a formal assessment to clarify or rule out the reason for the referral. Hence, he/she chooses assessment tools suitable for the assessee's situation. In sensitive cases, such as a referral for an annulment, the tool selection process may need to be informed by some research in preparation for the assessment. After selecting appropriate instruments or procedures, the formal assessment begins. Afterwards, the assessor writes a report of the findings designed to answer the referral question. Further feedback sessions with the assessee and/or interested third parties (such as the assessee's parents and the referring professional) may also be scheduled.

Evaluating Psychological Tests


Before using a psychological test, an experienced clinician will first read about and understand the theoretical orientation behind the test, the appropriateness of the standardization sample, and whether the test has adequate reliability and validity.

1. Theoretical Orientation
As cited by Groth-Marnatt in 2010, Haynes, Richard, and Kubany (1995) emphasized that clinicians should study the construct that a test is supposed to measure and how the test approaches that construct. Usually, this information is easily found in the test manual. Careful examination of the individual items will help the clinician understand and obtain meaningful information about the construct being measured. An example of a psychological test with a strong theoretical orientation is the Revised NEO PI-R, based on the Five-Factor Theory of McCrae and Costa.

2. Practical Considerations
Before using a test, a number of practical considerations about the context and manner of its use should be examined. First is the appropriateness of the test to the examinee's educational level (e.g., reading skills). Imagine administering an IQ test that requires at least a high-school reading level to a group of examinees who cannot read (a real risk in prison settings)! The examinee must be able to read, comprehend, and respond appropriately to the test; otherwise, the results will be useless.
Second, the length of the test should be considered, too, as it may cause boredom, fatigue, and
frustration on the part of the examinees, which in turn will affect the quality of data gathered.
Administering short forms of the test may reduce these problems, provided these forms have been
properly developed and are treated with appropriate caution (Groth-Marnatt, 2010).
Lastly, a clinician should be honest about how knowledgeable and competent he/she is in administering and interpreting the instrument. If further training is necessary, the clinician should find a way to obtain it before using the test.

Why Do Psychological Testing and Assessment?


The idea of individual differences is not new; it was remarked on even by ancient philosophers. At that time, however, there were no means to measure such differences objectively, so development was stunted until practitioners in the field produced new ideas that further developed the concept of individual differences. The existence of tests and other assessment tools has helped immensely in measuring aspects of personality that were once thought impossible to measure.

These tools are developed so that they measure what they are supposed to measure (validity) and give consistent results (reliability).

Tests are used in various settings, and these tools proved to be very useful in many ways:

Clinical Setting
It is in the clinical setting that tests, psychological tests in particular, are most often heard of. This is not surprising, as tests and other assessment procedures, such as the clinical diagnostic interview and behavioral assessment, are used to help clinicians (psychologists and psychometricians) come up with a proper diagnosis. These detect intellectual disabilities as well as emotional and behavioral instability, which are important facets of one's personality. Aside from diagnosis, these tools enable the identification of suitable interventions for the client, such as counseling, psychotherapy, or behavior therapy.

Industrial Setting
In an industrial setting, these assessment devices make the selection process easier. Tests aid HR practitioners in finding the right person for the right job, and they also allow the promotion and training of employees to be handled efficiently and objectively. Because tests measure the skills and capabilities of employees, performance evaluation becomes less biased and more objective. The results from these tests can inform the development and planning of a good training program and show whether such programs are effective.



School Setting
Meanwhile, in the school setting, psychoeducational tests are often used. Tests in this setting are not limited to achievement tests but extend to aptitude and intelligence tests, whose results can help with career and vocational counseling for students. There are also personality tests that school psychologists can administer to help diagnose students' learning difficulties or adjustment problems at school. Psychoeducational tools allow us to understand students' behavior and what can hinder them from learning effectively.

Who is involved in the Testing Process?


In the testing industry, several people are involved in the process of testing. Without them, the industry would not thrive and would remain stagnant. These are the people who develop, review, publish, and control tests, among other roles. They also help ensure that tests are well utilized and are not abused or misused.

Test authors and developers


They are the ones who develop new test materials based on psychological theories and phenomena. One of their means of promoting these materials is through studies and research, which they publish in journals and other databases relevant to the field of psychology and testing.

Test reviewers
These are people in the same or related fields who would evaluate the developed tests based on the
tool’s theoretical, empirical and psychometric merits.

Test Users
Their role is to select a test that is appropriate for the purpose of testing. This can be a challenge, since thousands of tests are published annually. These people can also serve as test administrators, scorers, and interpreters, depending on their qualifications and training.
As test administrators, they are required to know the process of giving the test, whether to a group or individually. As test scorers, they have to take the raw scores from the tests and transform these into interpretable scores through objective procedures and evaluative judgment. The scores then have to be interpreted in a manner that is understandable to other professionals and disciplines. The interpretation also needs to be clear, informative, and based on the test results so that it can be used for decision-making.

Test takers
They are the ones for whom tests are made, in order to measure specific facets of their personality.

Types of Psychological Tests and Assessment


In the previous topic, the uses of tests in various settings were explored. Because these uses vary, different types of tests and assessment procedures were developed for specific purposes and settings.
1. Interview
This is more than just conversing, as it has an objective: to gather relevant information for a given purpose. There are several types of interviews, though all share a question-and-answer format. In the clinical setting, structured, semi-structured, and unstructured clinical interviews are all used. The structured interview has guide questions that are asked in order; it can thus be conducted quickly and is easy to quantify, but it lacks detail and is not as flexible as the other two. The semi-structured interview is flexible, though it still adheres to guide questions; the interviewer can stray from them whenever it seems appropriate. The unstructured interview mostly contains open-ended questions, which can generate more qualitative data, but it can be time-consuming.
In all of these types, the interviewer does not just take down what is said; non-verbal behaviors are also noted. For some practitioners, much can be gleaned from gestures and behavior during the interview, and these are as helpful as the interviewee's answers.

2. Tests
As mentioned earlier, these are tools devised to measure certain variables, and their types vary depending on where they are used. Tests in education are sometimes referred to as psychoeducational tests, which measure not achievement in class but rather individual skills; aptitude and intelligence tests, for instance, determine the intellectual functioning of a person. Human resource practitioners likewise use various tests, measuring both skills and personality. In the clinical setting, we see more focused tests: structured and projective personality tests. Both measure personality traits, which aid in diagnosis and clinical evaluation.

3. Behavioral Observation
This is another assessment tool, one that can substitute when other tools cannot be used. It uses naturalistic observation, interviews, and rating scales to better understand how and why a person behaves as he or she does. It is very useful in industrial settings when there is a need to choose an employee with certain abilities required to perform a job (Cohen & Swerdlik, 2017).

4. Other Tools
Aside from the aforementioned tools, several more can be utilized in assessment procedures; among them is case history data. Case studies were prevalent during the development of psychology and were often the only source of information for the practitioner. In the modern period, however, case history data are used together with other tools to understand the factors that contributed to a person's past as well as present functioning.



Ethical Considerations
With the widespread development of psychological tests, especially in times of war, tests were prone to abuse and exploitation. At some points in history, many delved into the test development industry even when they were not in the practice of psychology. These and other issues in psychological practice were addressed by the American Psychological Association (APA) in the Ethical Principles of Psychologists and Code of Conduct (APA, 1992).
There are six principles to which psychologists and behavioral science practitioners should adhere, summarized as follows:

Principle A: Competence
In psychological testing and assessment, we have two kinds of practitioners: the psychometrician and the psychologist. They are expected to fulfill their roles as professionals and to work within their training and education, knowing their professional limitations and boundaries.
Principle B: Integrity
Practitioners strive to make objective assessments and evaluations based on the data they have gathered and analyzed. Unbiased evaluation reports are the end goal; thus, they use tools with high reliability and validity, as these allow them to achieve it.
Principle C: Professional and Scientific Responsibility
In the practice of psychology, it is not surprising to collaborate with professionals in other fields, among them doctors, lawyers, teachers, and social workers. Professionals know the limits of their job responsibilities, and in order to help the client in their care, they must work with others in the client's best interest. Aside from this, they should know their tools well, so as to be confident of their reliability and validity for the population with which they will be used.

Principle D: Respect for People's Rights and Dignity


Confidentiality plays a big part in assessment practice. Practitioners often come across sensitive and private details about the test taker, which cannot be divulged to others. Moreover, labeling should be avoided in practice, as it can affect the person's image and pave the way to stigma, unless it is a clinical diagnosis; even then, the practitioner needs to discuss the results of the assessment in the most prudent way possible.

Principle E: Concern for others' welfare


Being in a helping profession, the practitioner's role is to do no harm. To that end, practitioners strive to understand the rights and welfare of their clients before subjecting them to any assessment.
Principle F: Social Responsibility
The practitioner's work is not limited to his or her patients or clients but extends to society as well. Practitioners use their knowledge of psychology to improve human welfare and lessen human suffering. In doing so, they avoid misusing their tools and knowledge, and they comply with the law.

The APA ethics code was amended in 2010 and 2016 in order to address changes and developments in practice as well as in society.

References and Supplementary Materials


Books and Journals
1. Cohen, R. J., & Swerdlik, M. E. (2018). Psychological testing and assessment: An introduction to
tests and measurement. New York, NY: McGraw-Hill Education.
2. Kaplan, R. M., & Saccuzzo, D. P. (2015). Psychological assessment and theory: Creating and using psychological tests. Singapore: Cengage Learning Asia Pte Ltd.
3. Santos, Z. C., & Pastor, G. N. (2009). Psychological Measurement and Evaluation. Manila: REX
Bookstore.
4. Urbina, S. (2004). Essentials of psychological testing. New Jersey: John Wiley & Sons, Inc.
5. Apruebo, R. A. (2010). Psychological Testing Volume 1 (1st ed.). Quezon City: Central Book Supply.
6. Groth-Marnatt, G., & Wright, A. J. (2010). Handbook of Psychological Assessment (6th ed.).
Online Supplementary Reading Materials
1. Mcleod, S. (2014). Structured and Unstructured Interviews | Simply Psychology. Retrieved August 13, 2018, from https://www.simplypsychology.org/interviews.html
2. Connecticut Parent Advocacy Center. (n.d.). Retrieved August 13, 2018, from
http://www.cpacinc.org/materials-publications/evaluation/functional-behavioral-assessment/
3. Ethical Principles of Psychologists and Code of Conduct (1992). (n.d.). Retrieved August 13, 2018,
from http://www.apa.org/ethics/code/code-1992.aspx
4. Nelson, R. O., & Hayes, S. C. (1979). Retrieved August 13, 2018, from
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1311472/?page=1
5. Lanyon, R. I., Almer, E. R., & Curran, P. J. (n.d.). Use of biographical and case history data in the
assessment of malingering during examination for disability. Retrieved August 13, 2018, from
https://www.ncbi.nlm.nih.gov/pubmed/8054679
6. Smith, D. (2003, January). The First Code. Retrieved August 13, 2018, from
http://www.apa.org/monitor/jan03/firstcode.aspx

7. Vallente, K. K. (2014, March 13). Behavioral assessment - Clinical Psychology. Retrieved from
https://www.slideshare.net/keziahkeilavallente/behavioral-assessment-clinical-psychology
8. What you need to know about the new code. (n.d.). Retrieved August 13, 2018, from
http://www.apa.org/monitor/jan03/newcode.aspx
II: Nature and Uses of Psychological Test

Overview
In the first module, we learned the basic concepts and principles of psychological tests and assessments. In the field of psychological assessment, a test is one of the many tools used in psychology: a device or method that allows behavior to be quantified or predicted.
Module 1 also discussed the major differences between testing and assessment. The main objective of testing is to obtain a numerical estimate of an ability or attribute (e.g., kindness, industriousness). Assessment, in contrast, is typically performed to answer a referral question (Does the patient have a substance use disorder?), solve a problem (e.g., an intervention for speech delay), or arrive at a decision through the use of evaluation tools (e.g., insanity pleas in court).
Likewise, we can now recount the brief history of testing and assessment, which shows that psychological tests have evolved in a complicated environment in which hostile and friendly forces have produced a balance characterized by innovation and a continuous quest for better methods (Kaplan & Saccuzzo, 2013).
Now that we know the basic concepts and principles of psychological testing and assessment, let's study the nature and uses of psychological tests.

Defining features of a Test


As cited by Apruebo in 2010, Gregory (1996) stated that a test should have the following defining
characteristics:

1. Standard Procedure
An important characteristic and requirement of any psychological test is a uniform administration procedure.

2. Behavior Sample
Apruebo (2010) explained that a psychological test is a limited sample of behavior. This means that, in a short period of time, a psychological test enables a clinician to gather data about a person's behavior. This sample of behavior allows the clinician to make inferences and interpretations about the total domain of relevant behavior. For example, an intelligence test such as the Purdue Non-Language Test helps the clinician determine the intellectual functioning of his/her client.

3. Scores/Categories
Another important defining feature of a test is that it yields scores or categories. A psychological test should provide one or more scores, or indicate that a person belongs to one category and not another. In simpler terms, a psychological test expresses performance as numbers or classifications.

4. Norms or Standards
Another essential feature of a psychological test is that it possesses norms or standards. Kaplan & Saccuzzo (2016) defined norms as the performances of defined groups on a particular test. That is, norms consist of a summary of test results from a large and representative group of subjects, and a test score is interpreted by comparing it with the scores obtained by others on the same test.
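To make that comparison concrete, here is a minimal sketch of placing a raw score against a norm group. The norm-group mean and standard deviation, and the deviation-IQ-style scale (mean 100, SD 15), are invented for illustration.

```python
# Hedged sketch: converting a raw score to a standard score relative to a
# norm group. All numbers below are invented for illustration.

def standard_score(raw, norm_mean, norm_sd, scale_mean=100, scale_sd=15):
    """Express a raw score on a familiar scale via its z-score in the norm group."""
    z = (raw - norm_mean) / norm_sd
    return scale_mean + scale_sd * z

# Suppose the norm group averaged 42 points (SD = 6) on the same test.
print(standard_score(51, 42, 6))  # 1.5 SDs above the norm mean -> 122.5
print(standard_score(42, 42, 6))  # exactly at the norm mean -> 100.0
```

The raw score of 51 means little on its own; relative to the norm group, it is 1.5 standard deviations above average, which the rescaling expresses on a more interpretable scale.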

5. Prediction of Non-test Behavior


The result of a psychological test should help the clinician predict behaviors (e.g., aggression) by interpreting the score obtained by the examinee.
For example, a person who scores high on a hypothetical Acting-out Frustration Test is predicted to express his/her anger impulsively, whether physically or verbally, whenever he/she feels frustrated because perceived goals seem unattainable.

Characteristics of a Good Test


Tests, as discussed in the previous module, are very important tools in the discipline of psychology. Since they make psychologists' work possible, a good test is a must if practitioners are to deliver the service expected of them. But what makes a test good? How does a practitioner in the field of psychology determine whether a test is good for the purpose at hand? For a test to be considered an adequate device, it should be psychometrically sound. The psychometric soundness of a test pertains to the technical qualities or specifications of a test or an assessment tool (Cohen & Swerdlik, 2017).

Standardized
A test is standardized when it follows a uniform process of administration, scoring, and interpretation, as well as established norms. Standardization ensures that the same procedure is applied to everyone who takes the test, thus lessening bias as well as mistakes that can affect the results.
This process covers how the test is to be conducted: the materials used in testing, the time limit, the directions and instructions given, and other details that directly affect the testing process (Santos & Pastor, 2009). All of these factors should be exactly the same for every test taker.
Another aspect of a standardized test is its norms. Norms help the test user make sense of the results, since psychological tests cannot simply be interpreted as pass or fail. Interpreting a psychological test instead requires comparing an individual's results to the scores of other test takers on the same test. With standardized norms, the results can be easily understood: how does this individual differ on this measure compared to other people?

Reliability
A good measuring tool measures a factor consistently on different occasions. A test is said to be reliable when a person obtains consistent scores on the same test taken at separate times. Suppose you wanted to know how good a student's math skills are, so you gave him a 20-item test covering different areas of mathematics on three occasions: on the first trial, the student scored 15; on the second, 7; and on the last, 10. Given these scores, did the test show reliability? No: the scores deviate in a random fashion. A test needs to measure its target facet accurately in order to be useful, and one indication of reliability is that it yields consistent measurements at different times.
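One common way of quantifying this consistency is test-retest reliability: give the same test to a group of examinees twice and correlate the two sets of scores (the Pearson correlation noted in Module I's history section). The scores below are invented for five hypothetical examinees.

```python
# Illustrative sketch of test-retest reliability via Pearson's r.
# All scores are invented; a reliable test yields r close to 1.

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

first_administration = [15, 12, 18, 9, 14]    # scores at time 1
second_administration = [14, 13, 17, 10, 15]  # same examinees at time 2
print(round(pearson_r(first_administration, second_administration), 2))
```

Here the two administrations rank the examinees almost identically, so r is high; erratic scores like those in the scenario above would, across a group, drive the correlation down.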

Validity
Reliability is not enough to say that a test is good; a test needs to be valid as well. This property of psychometric soundness reflects how well a tool assesses the factor it claims to measure. Let's say you are holding a test that you are told will measure knowledge of history. Knowing this, you will expect questions or items relating to events that happened in the past. Upon perusal, however, you see items about grammar rules and verb tenses. Do these items make the test valid?
For a test to be considered valid, it has to measure the purpose for which it is intended. Who, then, deems a test valid? In the earlier example of the history test, the test has to be scrutinized by experts in the field of history, since they are knowledgeable about past events. These experts can also help assess whether the test covers enough areas to measure historical knowledge adequately (Cohen & Swerdlik, 2017).

Scorability and Interpretability


A test will not mean anything if it cannot be scored and interpreted. Scoring should be simple, so the test user has little difficulty obtaining the scores and analyzing the results. If scoring is done well, it is not difficult to make sense of the results: a clear score allows better analysis, since it shows where an individual's performance lies. Some tests come with scoring manuals, stencils, or computer programs that score the test automatically.
Understanding the scores paves the way for better interpretation. It is not enough to get the numerical data that represent an individual's performance on the test; the data should be interpreted as well. Interpretation explores the numerical score, differentiating the individual's performance from that of the normative sample; it also gives a descriptive explanation of the findings, which can help greatly in decision-making.

Categories of Test
Apruebo (2010) maintained that a psychological test is typically categorized as either
a psychometric test or a projective test.
He defined the psychometric test as structured, voluntary, and objective, specifically designed
to measure intelligence, aptitude, and personality traits. Meanwhile, the projective test is the exact
opposite: it is an unstructured and subjective way of measuring covert
(non-observable) or unconscious characteristics of behavior, in which the results are discussed in a
qualitative/reflective manner.

As cited by Apruebo in 2010, Campbell discussed the categories of tests in terms of the following
three (3) dimensions:

1. Structured-Nonstructured (Psychometric-Projective Techniques)


This dimension concerns the degree of freedom the subject has in responding. In a
structured (psychometric) test, clients have only limited ways of answering because they must pick
from given choices (forced choice, i.e., true-or-false or multiple-choice tests), whereas in a non-
structured (projective) test, clients can respond freely as they like (e.g., Rorschach Inkblot Test,
Draw a Person Test).

2. Disguised-Non-Disguised (Projective-Subjective)


This dimension concerns the degree of the client's knowledge of the purpose of the test.
Apruebo (2010) emphasized that in a disguised (projective) test, the examiner interprets the test in a
way other than what the client assumed it would be when responding. As an illustration, suppose you
administered a test to a suspected manipulative client: you disguised the test
as a test of perceptual ability when, in fact, it is a test of psychopathology.
On the contrary, in a non-disguised (subjective) test, both the client and the
examiner/researcher are informed of and understand the purpose of the test. For example, an interview
or intelligence test to be used for court proceedings, such as adoption cases, has the qualities of a non-
disguised test.

3. Voluntary-Objective (Psychometric-Behavioral)


The third and last dimension pertains to voluntary and objective tests. Voluntary
tests let the client give his answers freely. As an example, a career attitude preference
questionnaire requires the client to indicate his preferences among given alternatives on a self-report
basis. In an objective test, by contrast, the client is instructed to give the correct response from the
choices given. On the other hand, Pervin (as cited by Apruebo in 2010)
recommends another way of categorizing tests into four (4) major categories by assessment
techniques and personality theories, as shown below:

Test Category: Projective
    Test Characteristics: Nonstructured, disguised
    Test Examples: TAT, DAP, HTP
    Obtained Data: Organization of conscious/unconscious motives and conflicts
    Theoretical Approach: Psychodynamic
    Illustrative Theory and Theorist: Psychoanalysis - Sigmund Freud

Test Category: Subjective
    Test Characteristics: Nonstructured or semi-structured, undisguised
    Test Examples: Interview, Rep Test, Q-Sort
    Obtained Data: Individual perceptions of self and world
    Theoretical Approach: Phenomenological
    Illustrative Theory and Theorist: Self - Rogers; Personal Construct - Kelly

Test Category: Psychometric
    Test Characteristics: Structured, voluntary
    Test Examples: 16 PF, MMPI
    Obtained Data: Personality traits
    Theoretical Approach: Trait-Type, Factor Analytic
    Illustrative Theory and Theorist: Trait - Cattell

Test Category: Objective-Behavioral
    Test Characteristics: Structured, objective
    Test Examples: Behavioral Assessment
    Obtained Data: Behaviors (responses) in specific learning situations
    Theoretical Approach: Learning
    Illustrative Theory and Theorist: Learning Theory - Skinner, Bandura, and Mischel
Adapted from Psychological Testing Volume 1 by Dr. Roxel A. Apruebo, RGC Copyright ©2010 Central Book
Supply Inc.

Testing Levels
It should be common practice that only qualified users/examiners handle the administration
and safekeeping of all test materials (including manuals, answer keys, reusable booklets, etc.). In
the hands of incompetent and unauthorized persons, tests become useless at best and, at worst,
may break a person's life. Just imagine a court scenario in which a rapist receives a scot-free verdict
because of an incompetent psychological assessment report. As we mentioned earlier, in assessment, every
word has the power to either make or break a person's life. Hence, the following are the guidelines for
administering tests according to the level of the test, as elaborated by Apruebo (2010):

Level A
The psychologist should have the following qualifications: undergraduate courses in
testing or psychometrics and adequate training in test administration. He/she can administer
paper-and-pencil tests: IQ tests, achievement tests, aptitude tests, etc.

Level B
For this level, the psychologist should at least have completed an advanced (graduate-level)
course in testing at a university, or its equivalent in training under the guidance and supervision of a
qualified clinical psychologist/psychological consultant. Under this level, the tests that can be
administered are the following: those under Level A, plus paper-and-pencil tests of personality:
Sentence Completion Tests, the Personality Assessment Inventory, the 16 Personality Factor
Questionnaire, and the Wechsler Scales.

Level C
To administer and interpret Level C tests, psychologists should have an M.A. or Ph.D. and/or
equivalent experience and training in psychodiagnostics. The tests that can be administered and
evaluated at this level are the following: Level A and Level B tests plus projective techniques such as
the Thematic Apperception Test, the Children's Apperception Test, and the Rorschach
Psychodiagnostic Test, to name a few.

Control of Psychological Tests


Just like any other tool, psychological tests cannot be used by everyone. Qualified test
users are needed to handle these tests, as they require technical knowledge of when
to use a test (selection), how to use it (administration), and how to score and interpret it.
If the test user does not have the qualifications to use the test, then the likelihood of misusing it is
higher. Given this, test developers and distributors require certification of the test user before
allowing the latter to purchase the test.
The purchase and use of psychological tests are categorized into three levels. Level A tests can
be administered and scored by people in all professions with the aid of the test manual.
Then there are Level B tests, which require a degree in psychology and behavioral statistics.
Tests at this level require a thorough understanding of statistics in order to see how different a
person's scores are from those of the rest of the sample representatives. Level C tests require more
stringent qualifications: a master's degree in psychology and supervised practice.
These tests are projective tests that demand a deeper understanding of personality theories and
of mathematics in behavioral science (Cohen & Swerdlik, 2014).
Aside from categorizing the tests, the American Educational Research Association, the American
Psychological Association, and the National Council on Measurement in Education have written the
Standards for Educational and Psychological Testing.
This document addresses issues regarding test construction and evaluation, test
administration and use, and the application of tests in special situations. The Standards were
first written in 1954 and have been revised several times, with the most recent publication in 2014.
One of the responsibilities of the test user is to secure the contents of the test. Securing the test
helps preserve its reliability and validity. However, securing the test does not mean that
the interpretation should be relayed ambiguously.
The test user needs to discuss the basic information about the test so that test takers
fully understand the test, its purpose, and the information it will yield.
Also, ensuring that the test is employed by a competent and qualified examiner, so that the scores
are properly used, is a MUST. Remember, an unqualified examiner has a high tendency to commit
errors in the selection, administration, scoring, or interpretation of psychological tests, which may
cause harm that must be avoided at all costs.

Planning and Preparing for Test Administration


Test administration is not a simple task of just distributing and collecting the test
materials. The administration of the test should be planned out carefully in order to ensure that the
psychometric properties of the test are sustained, as well as to lessen bias and use the test to
its maximum purpose.
There are many factors that need to be considered first. The procedure to be followed in
administering a test depends on the type of test that will be used: should the
administration be individual or group; timed or untimed; and should the test
focus more on the cognitive or the behavioral aspect?
Moreover, psychological factors such as preparedness, "test-wiseness,"
motivation, and the emotions of the examinees must be considered carefully, as these will surely
affect their scores. The personality, skill, and behavior of the examiner also have an effect on the
examinees' performance.
Last but not least, situational variables, such as the time and place of testing, and environmental
/physical conditions, such as illumination, temperature, noise level, and ventilation, may also affect
psychological variables like motivation, concentration, test anxiety, and the performance of examinees.
This means that the test examiner must be fully prepared before administering tests.

Before Test Administration


1. Scheduling the Test
The examiner must take into consideration the schedule and the activities that the examinees
usually engage in at the time. In short, an examiner should schedule the testing in a way that is
convenient for the examinee.
Everyone deserves an opportunity to prepare intellectually, emotionally, motivationally, and
physically for a test (Apruebo, 2010).

2. Informed Consent
Informed consent refers to the mutual agreement between a professional and a particular
person and/or his/her legal representative. Under this agreement, permission is given to administer
psychological tests to the person and to obtain other information from that person for
evaluative/diagnostic reasons.

3. Becoming Familiar with the Test


The examiner should be familiar with the test so that he/she can administer it
efficiently and avoid errors in testing.

4. Ensuring Satisfactory Testing Conditions


It is the duty of the examiner to make sure that the testing area is conducive:
seating, lighting, ventilation, temperature, noise level, and the other physical conditions of the
testing situation must be appropriate.

During Test Administration


Before the test, the test administrator needs to check the materials for the exam and make sure
that these are complete, with the necessary pages in the correct order, before distributing the test
materials - the questionnaire and the answer sheet. The test administrator needs to direct the test
takers to carefully fill in the required information on the answer sheet. Before the test proper
begins, the administrator also needs to guide the test takers on how to mark the answer sheets. Giving
examples is a great way to ensure that the directions are followed, and asking the test takers
clarificatory questions helps reveal any directions that were vague.
Read the general instructions of the test aloud to the test takers and explain the directions further
should there be any need. The time limit of the exam has to be clarified, and the test administrator needs
to keep track of it by giving time checks to the test takers from time to time.
When the time limit is up, the test administrator has to gather the test questionnaires and
answer sheets, making sure that all papers are accounted for. The answer sheets are then
prepared for scoring and interpretation.

After Test Administration


After test administration, make sure to retrieve all the data by collecting and securing all test
materials. The examinee should be reassured concerning his/her test results. In clinical testing, it is
vital to interview a parent or significant other who has accompanied the examinee, before and after the
administration of the test, for collateral information, and to inform him/her of what will be done with
the results given to the examinee. Again, examinees must be assured of the confidentiality of the test
results and interpretations, which will be endorsed to the proper persons or agency should any
further action be recommended.

References and Supplementary Materials


Books and Journals
1. Cohen, R. J., & Swerdlik, M. E. (2018). Psychological testing and assessment: An introduction
to tests and measurement. New York, NY: McGraw-Hill Education.
2. Santos, Z. C., & Pastor, G. N. (2009). Psychological measurement and evaluation.
Manila: REX Bookstore.
3. DeVon, H., Block, M., Wright, P., Ernst, D. M., Hayden, S., Lazzara, D. J., Savoy, S., &
Kostas-Polston, E. (2007). A psychometric toolbox for testing validity and reliability.
Journal of Nursing Scholarship, 39, 155-164. doi:10.1111/j.1547-5069.2007.00161.x
4. Apruebo, R. A. (2010). Psychological testing (Vol. 1, 1st ed.). Quezon City: Central Book
Supply.
5. Groth-Marnat, G., & Wright, A. J. (2010). Handbook of psychological assessment (6th ed.).
6. Kaplan, R. M., & Saccuzzo, D. P. (2013). Psychological testing: Principles, applications, and issues
(8th ed.). Wadsworth, Cengage Learning.
Online Supplementary Reading Materials
1. Elsous, A., Sari, A. A., Radwan, M., Mohsen, S., & Zaydeh, H. A. (2017, May). Retrieved August
15, 2018, from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5442278/
2. Swanson, E. (2014, June). Validity, reliability, and the questionable role of psychometrics in
plastic surgery. Retrieved August 15, 2018, from
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4174233/

III: Technical and Methodological Principles

General Considerations
The previous module discussed the characteristics a good test should have. For a test to be
helpful to a clinician, it should measure what it intends to measure as accurately as possible. This
brings us to an important question: what are the factors we need to consider before using a test to
assess someone psychologically?

Reliability
For a test to be considered suitable, it should, first, be reliable. The reliability of a test refers to the
"accuracy, precision, or consistency of a score obtained through the test" (Apruebo, 2010). Likewise,
Souza et al. (2017) mentioned that it should yield "a consistent result in time and space, or from
different observers, presenting aspects on coherence, stability, equivalence, and homogeneity." This
means that across different times, different situations, and different test takers, a reliable test will
always reproduce a stable score that measures a skill, knowledge, or domain consistently. In other
words, reliability addresses the degree to which the score obtained by a person stays the same even
if the person retakes the same test on different occasions (Groth-Marnat, 2010).
As an illustration, a K-Pop enthusiast decided to take an aptitude exam for the Korean language
without any preparation, relying on the phrases she had learned from the Korean series she binge-
watched. As a result, she utterly failed the exam. Frustrated with her score, and with a strong desire
to learn the language, she enrolled in a review center with plans to retake the exam.
After 3 months, she retook the exam and scored higher than before. The Korean aptitude
test is said to be a good test because it consistently measured her aptitude based on her understanding
of the subject. If the test were not reliable, retaking it would only yield an increase or decrease
in her score based purely on chance.
However, Kaplan (2009) explained that errors of measurement cannot be avoided, as
discrepancies between true ability and the measurement of ability are inevitable. Humans are bound to
make mistakes, and our goal is to lessen the error to "keep testing errors within reasonably accepted
limits" (Groth-Marnat, 2010). In other words, the error of measurement is an estimate of the possible
range of random changes that can be expected in a person's score.
In psychological assessment, error implies inaccuracy of measurement. Again, tests that are
"relatively free of error" (Kaplan, 2009) are considered reliable. How, then, do we know that a test is
reliable?
This is where reliability analysis enters: it examines whether the test provides a consistent
measure.

Common Ways of Estimating the Reliability of a Test


When we evaluate reliability, it is important first to identify the source of measurement
error you are trying to estimate.
1. Test-Retest Method (Coefficient of Stability)
Test-retest reliability estimates "are used to evaluate the error associated with administering
a test at two different times" (Kaplan, 2009). Kaplan further elucidates that this type of reliability
analysis is important to consider only if we need to measure "constructs," "characteristics," or "traits"
that do not change over time. An example would be a test measuring intelligence, as we consider
this trait to be a general ability. The coefficient of stability is obtained by correlating the scores
obtained on two different administrations by the same person. The degree of correlation between
these two scores shows the extent to which test scores can be generalized from one occasion to another.
If a high correlation exists between the scores, the results are less likely to be an effect of
random changes in the condition of the person or of the testing environment. Simply put, in
the actual application of the test, the examiner can confidently conclude that differences in obtained
scores reflect an actual change in the trait measured rather than random chance.
However, choose this reliability method carefully, as it is not applicable for measuring
constantly changing characteristics, for instance with projective tests such as the Draw a Person,
House-Tree-Person, and Sacks Sentence Completion tests, which tell the clinician about the client's
well-being at the present time.
Common measures for test-retest reliability are correlation, regression, and multiple regression.
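To make the coefficient of stability concrete, here is a minimal sketch in Python. The two score lists are invented data for illustration only, and the hand-rolled `pearson` helper is an assumption of this example, not part of any published instrument: it simply correlates scores from two administrations of the same test to the same examinees.

```python
# Illustrative test-retest reliability: correlate scores from two
# administrations of the same test taken by the same six examinees.
# All scores below are made-up demonstration data.
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

first_testing  = [85, 90, 78, 92, 70, 88]   # scores at time 1
second_testing = [83, 91, 80, 94, 72, 85]   # same examinees at time 2

stability = pearson(first_testing, second_testing)
print(round(stability, 2))  # a value near 1.00 indicates stable scores
```

A coefficient this close to 1.00 would suggest that rank ordering of examinees barely changed between the two occasions, which is what the test-retest method looks for.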

2. Parallel-Forms Method (Equivalence Forms Reliability)


The parallel-forms method refers to the comparison of two equivalent forms of a test that measure the same
attribute (Kaplan, 2009). The two forms contain different items selected to have the same
difficulty level. For example, suppose you have developed a Frustration-Anxiety Test and you want to
know whether all your test items measure this trait. Using this method, you create two
forms of the test (equivalent in item difficulty) and administer them to the same group of people
on two different occasions. Afterward, the equivalent-forms reliability coefficient is calculated
as the correlation between the scores obtained on the two forms of the test by the same group of test
takers.
Practically speaking, the parallel-forms method is often impractical and time-consuming: factors
such as the test takers' motivation, fatigue, and cooperation pose a challenge, as does the need to
create two forms of the test that are identical in difficulty level.

3. Split-Half Reliability Method


Groth-Marnat (2010) argued that this is the best method for determining the reliability of a trait with
a high degree of change. It is also a practical technique, since the test is administered only once and the
items are divided into halves that are scored separately (Kaplan, 2009). Usually, the test items are
divided using the odd-even method: the odd-numbered items are grouped together, while the other
group comprises the even-numbered items. Afterward, the two sets of scores are correlated.
Since the test is given only once, the split-half method yields a measure of the internal consistency of
the items. Kaplan (2009) defined internal consistency as the intercorrelation among items within
the same test. A test with good internal consistency measures a single construct, and its items should
therefore agree highly with one another.
In other words, this method shows whether all test items assess a single construct/trait. To estimate
the reliability of the full test, the Spearman-Brown formula is employed, as it estimates what the
correlation between the two halves would have been if each half had been the length of the whole test
(Kaplan, 2009).
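The odd-even split and the Spearman-Brown correction can be sketched as follows. The response matrix is invented demonstration data; the `pearson` helper is written inline so the example is self-contained.

```python
# Illustrative split-half reliability with the Spearman-Brown correction.
# Each row is one examinee's (invented) 0/1 responses to eight items.
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

responses = [
    [1, 0, 1, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
]

# Odd-even split: sum each examinee's odd-numbered and even-numbered items.
odd  = [sum(row[0::2]) for row in responses]
even = [sum(row[1::2]) for row in responses]

r_half = pearson(odd, even)
# Spearman-Brown: estimate the reliability of the full-length test.
r_full = (2 * r_half) / (1 + r_half)
print(round(r_full, 2))
```

Note that `r_full` is always at least as large as `r_half`, reflecting the principle that a longer test (other things equal) is more reliable than either of its halves.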
Aside from the split-half technique, there are other methods for calculating the internal
consistency of a test. If the items are dichotomous in nature (usually scored 0 or 1, Yes or No), one
can employ KR20, or Kuder-Richardson 20.
This technique estimates the reliability of the test in a single administration and considers all
possible ways of splitting the items. As cited by Kaplan (2009), Cronbach (1951) explained that
mathematical proofs have shown that the KR20 formula calculates the same reliability estimate that a
test would get if you took the mean of the split-half reliability estimates obtained by dividing the test
in all possible ways.
Remember, when you are performing an item analysis with items that are DICHOTOMOUS
(answerable by two options only, i.e., Yes or No, right or wrong, true or false), it is recommended to
use the KR20 formula.
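A minimal sketch of the KR-20 computation follows; the response matrix is invented, and the formula used is the standard one, KR20 = (k/(k-1)) * (1 - Σpq/σ²), where p is each item's proportion correct, q = 1 - p, and σ² is the variance of the total scores.

```python
# Illustrative KR-20 for dichotomous (0/1) items; data are invented.
def kr20(responses):
    """responses: list of examinees, each a list of 0/1 item scores."""
    k = len(responses[0])                  # number of items
    n = len(responses)                     # number of examinees
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n  # population variance
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n  # proportion passing item i
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

responses = [
    [1, 1, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1],
]
print(round(kr20(responses), 2))
```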


Another method of testing reliability for internal consistency is Cronbach's alpha. This is used to
evaluate the internal consistency of tests that are not answerable by right-or-wrong answers, such as
personality tests and attitude scales.

For example, in answering a personality inventory, you might encounter a statement such as "I would
rather read books than go out and party with people." Typical choices on such a test are: Strongly
Agree, Agree, Neutral, Disagree, and Strongly Disagree. There is no right or wrong answer; you are
simply saying where you stand on the range from agreeing to disagreeing with the statement.
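Cronbach's alpha generalizes KR-20 to polytomous items such as the Likert scale just described. Below is a minimal sketch using the standard formula, alpha = (k/(k-1)) * (1 - Σ item variances / total variance); the 1-5 ratings are invented demonstration data.

```python
# Illustrative Cronbach's alpha for Likert-type items (scored 1-5).
def cronbach_alpha(responses):
    """responses: list of examinees, each a list of item scores on the same scale."""
    k = len(responses[0])   # number of items
    n = len(responses)      # number of examinees

    def pop_var(values):
        m = sum(values) / n
        return sum((v - m) ** 2 for v in values) / n

    item_vars = sum(pop_var([row[i] for row in responses]) for i in range(k))
    total_var = pop_var([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_vars / total_var)

likert = [          # invented ratings: 5 respondents x 4 items
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 2, 3, 3],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(likert), 2))
```

An alpha near 1.00 suggests the four items move together and plausibly tap one construct; a value near 0.00 would suggest the items are unrelated.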

Kappa statistic (Inter-observer reliability)


What if a clinician with a strong behaviorist foundation uses direct observation of behavior? How do
we evaluate the reliability of a behavioral observation? Suppose you are measuring
assertiveness in a classroom setting. As a researcher, you assign some people to secretly
observe the behavior of their classmates. These observers tabulate the number of observable
responses in each "display of assertiveness" category you choose; hence, there would be one score for
every instance of "taking the lead" and "assuming responsibility." After tabulating all the observers'
scores, the kappa statistic is best used to test the reliability of such behavioral observations.
Introduced by J. Cohen in 1960, kappa indicates the actual agreement as a proportion of the potential
agreement following correction for chance agreement (Kaplan, 2009).
The kappa statistic is a measure of agreement between observers/raters and has a maximum value of 1.00.
The higher the kappa value, the higher the concordance between the raters. Values close to
or below 0.00 indicate a lack of concordance.
Hence, when there is high agreement, or concordance, between the observers/raters, we can conclude
that there is less measurement error on the raters' part, making the measure reliable.
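A minimal sketch of Cohen's kappa for two observers follows. The category labels and ratings are invented; the computation uses the standard formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the raters' marginal proportions.

```python
# Illustrative Cohen's kappa for two observers who each classified the same
# ten behavior episodes; the ratings below are invented.
from collections import Counter

def cohen_kappa(rater1, rater2):
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: sum over categories of the product of marginal proportions.
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum((c1[cat] / n) * (c2[cat] / n) for cat in set(rater1) | set(rater2))
    return (observed - expected) / (1 - expected)

rater_a = ["lead", "lead", "resp", "lead", "resp", "resp", "lead", "resp", "lead", "resp"]
rater_b = ["lead", "lead", "resp", "resp", "resp", "resp", "lead", "resp", "lead", "lead"]
print(round(cohen_kappa(rater_a, rater_b), 2))
```

The chance-correction step is what distinguishes kappa from simple percent agreement: two raters guessing at random on two equally frequent categories would already agree about half the time, and kappa discounts exactly that.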

Validity
In psychological assessment, it is important to use a test that measures what it intends to measure.
Just imagine taking your mid-term exam in Theories of Personality only to be asked trivial
questions such as "At what age did Sigmund Freud die?" or "Who coined the term 'schizophrenia'?"
"That is so unfair; the test is INVALID; it does not measure my knowledge of personality theories,"
you exclaim. A test that is valid for identifying personality traits should measure what it is intended
to measure and should also produce information useful to clinicians. Validity is the degree to which
a certain inference from a test is appropriate or meaningful. In layman's terms, a valid test measures
what it intends to measure.
Groth-Marnat (2010) explained that even though an instrument/test can be reliable without being
valid, achieving a certain level of reliability is an important requirement for validity. Souza et al.
(2017) emphasized that a test that is not reliable cannot be valid; however, a reliable test can
sometimes be invalid. Hence, high reliability does not guarantee a test's validity.

As cited by Apruebo in 2010, Nunnally and Bernstein held that validity has three (3) major meanings:
a. Construct validity - measuring psychological domains;
b. Predictive validity - establishing a statistical relationship with a particular criterion;
c. Content validity - sampling from a pool of required content.
Types of Validity Methods
According to Translation Validity (Apruebo, 2010)
Face Validity
One night, while browsing the internet, you become bored and decide to try an English proficiency
test you find online. Some of the questions go like this: "A is for Apple, C is for ___?", "How much wood
would the woodchuck chuck?", and "Nan has 5 siblings: Bab, Beb, Bib, and _____."
After item number 3, you decide to stop answering the test, feeling you have been duped and are
wasting your time, since it clearly does not look like an English proficiency test. That is a classic
example of what face validity is all about.
Face validity refers to the appearance of the test; it pertains to the perceived purpose of the test.
In other words, "Does your test look like a test?"
For example, if you think you are answering an intelligence test because the test items are composed
of abstract items, then we can say that it has face validity.
Groth-Marnat (2010) noted that it is not really a type of validity at all, as it does not offer evidence
to support conclusions drawn from test scores. However, bear in mind that it is essential to have a test
that "looks like" a valid test, as this appearance can help motivate test takers, because they can
see that the test is relevant.

Content validity
Say, for example, that you have an upcoming test in General Psychology. You have rigorously studied
your notes and book for that examination and know almost everything, only to find that the professor
has come up with trivial items that do not represent the content of the course. I know how hard
that moment is, which is why it is important for a test to have content validity.
Content validity refers to the degree to which the items of the test are a representative sample of a
universe of content (i.e., it contains all the possible content areas of a construct). In other words, it
shows whether the test comprehensively covers the construct. It also shows whether the test has been
adequately constructed and whether the item contents and the domain they represent were examined
by experts.
An example of tests with high content validity is the board licensure examinations.

According to Criterion-related Validity (Apruebo, 2010)


Criterion-related validity means that a test is evaluated for its validity against a set
of standards with which the test is compared.
Such evidence is provided by high correlations between a test and a well-defined criterion measure. A
criterion is a standard against which the test is compared.
For example, a test might be used to predict which students will graduate with honors and which ones
will stop or drop out. Academic success is the criterion, but it cannot be known at the time the students
take the test.

Predictive Validity
This type of validity measures how well a test's predictions agree with subsequent and/or future
outcomes. A classic example is the SAT in the United States: its Critical Reading Test serves as
predictive validity evidence for college admissions testing, showing whether it accurately forecasts
how well high-school students will do in their college studies. The SAT, including its quantitative and
verbal subtests, is the predictor variable, and the college grade point average (GPA) is the criterion
(Kaplan & Saccuzzo, 2015).

Concurrent validity
Say that you, as a newly hired Human Resource Specialist, are assigned to hire a chef for a Korean
eat-all-you-can buffet. You have already screened your applicants down to the three (3) with the most
impressive job experience. Since they appear to have the same qualifications, what will be your tool
for hiring the best chef among the three?
One way is to test potential employees on a sample of behaviors that represent the tasks required
of them. For example, Campion (as cited by Kaplan & Saccuzzo, 2015) found that the most
effective way to select maintenance mechanics was to obtain samples of their mechanical work.
Similarly, the best way to hire the chef is to require the applicants to create their best version of
Korean samgyupsal; the best way to showcase their skill is, of course, to cook!
The abovementioned scenario is a good instance of the use of concurrent validity. In short, concurrent
validity is the correlation between the test and a criterion when both are measured at the same point in
time.
Convergent Validity
Convergent validity is determined by significant and strong correlations between different measures
of the same construct.
For example, you decide to test your newly constructed depression questionnaire, the Light Scales,
by comparing it with Aaron Beck's Depression Inventory to see if there is a high correlation between
the two tests.
If the data you obtain show a high correlation, it suggests that the Light Scales indeed measure
depression.
Discriminant Validity
This measure refers to the extent to which a measure diverges from operationalizations of other
constructs.
This means that this validity test should yield low correlations with tests that measure constructs
different from your own.
For example, just for the sake of discussion, a test entitled Resilience Scale should not
correlate highly with Aaron Beck's Depression Inventory, because a high correlation would mean that
the Resilience Scale measures the wrong construct, namely depression.

Validity Coefficient
The relationship between a test and a criterion is usually expressed as a correlation called a validity
coefficient. This coefficient tells the extent to which the test is valid for making statements about the
criterion.

Norms
This pertains to the performance of a particular reference group to which an examinee's score can be
compared. This means a norm is a normal or average performance.
It can be expressed as the number of correct items, the time required to finish a task, the number
of errors committed, etc.
Apruebo (2010) strongly argued that raw scores are pointless until they can be evaluated in terms of
appropriate interpretative standard data or statistical techniques.
In short, a norm is a set of scores from a group of individuals to which the raw score from a
psychological test is compared to.

Usage of Norms
Psychological test manuals provide tables of norms to facilitate comparing both individuals and
groups. Several methods and techniques for deriving more meaningful norms, more specifically for
converting "raw scores" into "standard scores," have been widely adopted, because all of them reveal
the relative status of individuals within the group.

5 Basic Norming Techniques


1. Measure of Central Tendency
It is a statistical measure to determine a single score that defines the center of a distribution. The
goal of central tendency is to find the single score that is most typical or most representative of the entire
group.

1.1 Mean
Commonly known as the arithmetic average; computed by adding all the scores in the distribution
and dividing by the number of scores.

1.2 Median
The median is the score that divides a distribution exactly in half. Exactly 50% of the individuals in the
distribution have scores at or below the median. The median is equivalent to the 50th percentile.

III: Technical and Methodological Principles

1.3 Mode
In a frequency distribution, the mode is the score or category that has the greatest frequency.
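All three measures of central tendency are available in Python's standard `statistics` module. A quick sketch on a small, made-up score set:

```python
import statistics

# Hypothetical set of test scores.
scores = [10, 12, 12, 13, 15, 18, 25]

print(statistics.mean(scores))    # sum of all scores / number of scores = 15
print(statistics.median(scores))  # middle value of the ordered scores = 13
print(statistics.mode(scores))    # most frequently occurring score = 12
```

Note that the three values differ: the one extreme score (25) pulls the mean above the median, which is typical of a skewed distribution.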

2. Frequency Distribution
A frequency distribution is an organized tabulation of the number of individuals located in each
category of the scale of measurement. It takes a disorganized set of scores and places them in order from
highest to lowest, grouping together all individuals who have the same score.
Personality Traits

Anxiety Traits (ANX)      f        %
59T or less              54    51.92
60T to 69T               41    39.42
70T to 81T                9     8.65
Total                   104   100.00

An example of a frequency distribution. The table indicates that the majority of respondents'
scores (54 people, or 51.92%) fall in the bracket of 59T or less.

Adapted from Statistics for the Behavioral Sciences by Gravetter, Frederick J. & Wallnau, Larry B. Copyright
©2012 Wadsworth/Cengage Learning
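A frequency distribution like the one above can be tallied directly with `collections.Counter`. The T-scores and bracket labels below are invented for illustration; a real norming study would use the brackets defined in the test manual.

```python
from collections import Counter

# Hypothetical T-scores for a small group of examinees.
t_scores = [55, 62, 71, 48, 65, 59, 60, 74, 52, 66]

def bracket(score):
    """Assign a T-score to one of three illustrative brackets."""
    if score <= 59:
        return "59T or less"
    if score <= 69:
        return "60T to 69T"
    return "70T to 81T"

# Tally how many examinees fall in each bracket.
freq = Counter(bracket(s) for s in t_scores)
total = sum(freq.values())

for label in ("59T or less", "60T to 69T", "70T to 81T"):
    f = freq[label]
    print(f"{label:12}  f={f:2}  {100 * f / total:.2f}%")
```

The f column gives the count per bracket and the percentage column its share of the whole group, exactly as in the printed table.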

In a symmetrical distribution, it is possible to draw a vertical line through the middle so that one side
of the distribution is a mirror image of the other. In a skewed distribution, the scores tend to pile up
toward one end of the scale and taper off gradually at the other end.
The section where the scores taper off toward one end of the distribution is called the tail of the
distribution.
For example, in a very difficult exam, most scores tend to be low, with only a few individuals earning high
scores. This will produce a positively skewed distribution.
On the other hand, a very easy exam is inclined to produce a negatively skewed distribution, with most
of the students earning high scores and only a few earning low ones.

3. Use of the Normal Curve

A normal distribution (normal curve) is a bell-shaped curve that shows the probability distribution of a
continuous random variable.
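Standard scores place a raw score on the normal curve. A common sketch converts raw scores to z-scores (mean 0, SD 1) and T-scores (mean 50, SD 10); the norm-group mean and SD below are assumed values for illustration only.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the norm-group mean, in SD units."""
    return (raw - mean) / sd

def t_score(raw, mean, sd):
    """Linear transformation of z onto a scale with mean 50 and SD 10."""
    return 50 + 10 * z_score(raw, mean, sd)

# Assumed norm-group parameters, chosen only for this example.
norm_mean, norm_sd = 100, 15

print(z_score(115, norm_mean, norm_sd))  # 1.0 -> one SD above the mean
print(t_score(115, norm_mean, norm_sd))  # 60.0 on the T scale
```

Because both transformations are linear, they change the scale of the scores without changing the shape of the distribution.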

4. Percentile Rank
The percentile rank of a particular score is defined as the percentage of individuals in the
distribution with scores at or below that value. When a score is identified by its percentile rank,
the score is called a percentile. A percentile describes your exact position within the distribution.
How to interpret percentile ranks:
0-5th %ile      Compartment 1 = Fail
6-10th %ile     Compartment 2 = Low Average
11-50th %ile    Compartment 3 = Below Average
51-85th %ile    Compartment 4 = Average
86-95th %ile    Compartment 5 = High Average
96-99th %ile    Compartment 6 = Excellent
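Using the "at or below" definition above, a percentile rank can be computed directly. The norm-group scores below are invented for the sake of the sketch:

```python
def percentile_rank(score, scores):
    """Percentage of scores in the distribution at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical norm-group scores.
group_scores = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

print(percentile_rank(60, group_scores))  # 50.0 -> the 50th percentile
```

A score of 60 sits at the 50th percentile here because exactly half of the norm group scored at or below it, which is also where the median falls.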

5. Stanine System
a. Raw scores are transformed into nine groups (stanines).
b. Stanine 1 is the lowest and stanine 9 the highest.
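A sketch of the stanine conversion using the classical percentile bands (4-7-12-17-20-17-12-7-4%); these cut points are the conventional ones, not taken from any specific test manual.

```python
import bisect

# Upper cumulative-percentile bound of stanines 1 through 8
# (any percentile above 96 falls in stanine 9).
CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine(percentile):
    """Map a percentile rank (0-100) onto the nine-point stanine scale."""
    return bisect.bisect_left(CUTS, percentile) + 1

print(stanine(2))   # stanine 1 (lowest)
print(stanine(50))  # stanine 5 (middle)
print(stanine(99))  # stanine 9 (highest)
```

Because the middle bands are widest, most examinees land in stanines 4 to 6, mirroring the normal curve.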

Basic Principles of Using Norms


1. There are two approaches to norm construction:
a. Criterion-Referenced Approach: Did the person satisfy the set standard?
b. Norm-Referenced Approach: The person's performance is judged relative to others; norm dependent.
c. The norm used must be appropriate for the subject's score.
2. Always check the indicators of appropriateness:
a. Indicator 1: Nationality (origin). Local norms should be applicable to your present clients.
b. Indicator 2: Age
c. Indicator 3: Gender
i. Established gender differences: verbal ability,
numerical ability, emotional sensitivity, aggression
ii. No established gender differences:
general IQ, self-esteem
3. Norms should be constantly updated.
Each norm is generally good for only about five years, so it needs to be updated to stay in line
with, and remain a good representation of, a given population.


How are norms constructed?
1. Construct a psychological test – Is there a need for that test?
2. Pilot the test – administer it to a representative group.
3. Apply norming techniques to the obtained scores.

References and Supplementary Materials


Books and Journals
1. Cohen, R. J., & Swerdlik, M. E. (2018). Psychological testing and assessment: An introduction
to tests and measurement. New York, NY: McGraw-Hill Education.
2. Kaplan, R. M., & Saccuzzo, D. P. (2018). Psychological testing: Principles, applications, and
issues. Belmont, CA: Wadsworth Cengage Learning.
3. Apruebo, R. A. (2010). Psychological testing: Volume 1 (1st ed.). Quezon City: Central Book
Supply.
4. Groth-Marnat, G., & Wright, A. J. (2010). Handbook of psychological assessment (6th ed.).
5. Kaplan, R. M., & Saccuzzo, D. P. (2013). Psychological testing: Principles, applications, and issues
(8th ed.). Belmont, CA: Wadsworth Cengage Learning.
6. Gravetter, F. J., & Wallnau, L. B. (2012). Statistics for the behavioral sciences.
Belmont, CA: Wadsworth/Cengage Learning.
Online Supplementary Reading Materials
1. Swanson, E. (2014, June). Validity, Reliability, and the Questionable Role of Psychometrics in
Plastic Surgery. Retrieved August 15, 2018, from
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4174233/
