
INTRODUCTION

TEST – a measurement device or technique used to quantify behavior or aid in the understanding and prediction of behavior.

INSTRUMENTS – checklists, scales, surveys, and inventories used to provide information.

ITEM – a specific stimulus to which a person responds overtly; this response can be scored or evaluated. Items are the specific questions or problems that make up a test.

PSYCHOLOGICAL TEST – a set of items designed to measure characteristics of human beings that pertain to behavior (Kaplan & Saccuzzo, 2013).
Definition
(Cohen, Swerdlik, & Sturman 2013)

Psychological Assessment is the gathering and integration of psychology-related data for the purpose of making a psychological evaluation that is accomplished through the use of tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatuses and measurement procedures.

Psychological Testing is the process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior.
Assessment vs. Testing (Cohen, Swerdlik, & Sturman 2013)

Objective
Assessment: To answer a referral question, solve a problem, or arrive at a decision through the use of tools of evaluation.
Testing: To obtain some gauge, usually numerical in nature, with regard to an ability or attribute.

Process
Assessment: Typically individualized; focuses on how the individual processes rather than simply the results of that processing.
Testing: May be individual or group in nature, with little regard for the mechanics of content and processes.

Role of Evaluator
Assessment: The assessor is key to the process of selecting tests and/or other tools of evaluation as well as drawing conclusions from the entire evaluation.
Testing: The tester is not key to the process; practically speaking, one tester may be substituted for another tester without appreciably affecting the evaluation.

Skill of the Evaluator
Assessment: Typically requires an educated selection of tools of evaluation, skill in evaluation, and thoughtful organization and integration of data.
Testing: Typically requires technician-like skills in administering and scoring a test as well as in interpreting a test result.

Outcome
Assessment: Entails a logical problem-solving approach that brings to bear many sources of data designed to shed light on a referral question.
Testing: Typically yields a test score or series of test scores.
Assessment Timeline (Erford, 2013)*
Philosophies of Psychological Assessment

PSYCHOMETRIC APPROACH

American in origin

Gives numerical estimates of single aspects of performance.

It rests on E. Thorndike’s belief that “if a thing exists, it exists in some amount, and if it exists in some amount, it can be measured”.

Definite and structured

IMPRESSIONISTIC APPROACH

German in origin

Leads to a comprehensive, descriptive picture of the individual. 

It looks for significant cues to understanding an individual’s dynamics by all available means
and integrates them into a total picture

Gives minimal consideration to “how much” of some characteristic is present

It seeks “wholeness” or unity and relies on observation, descriptive data and self-report.
ASSUMPTIONS ABOUT PSYCHOLOGICAL
TESTING AND ASSESSMENT
1. Psychological traits and states exist

Trait – any distinguishable, relatively enduring way in which one individual varies from another (Guilford, 1959, p. 6)

States – also distinguish one person from another but are relatively less enduring (Chaplin et al., 1988)

2. Psychological traits and states can be quantified and measured

Test developers and researchers, much like all other people, have different ways of looking at and defining the same phenomenon

3. Test-related behavior predicts non-test-related behavior

4. Tests and other measurement techniques have strengths and weaknesses

Psychological tests are prone to confounding variables

5. Various sources of error are part of the assessment process

Error refers to the long-standing assumption that factors other than what a test attempts to measure will influence performance on the test

6. Testing and assessment can be conducted in a fair and unbiased manner

Some test-fairness problems are more political than psychometric
Types of Assessment
Formal vs. Informal Assessment

Formal assessments are preplanned, systematic attempts by the teacher to
ascertain what students have learned. The majority of assessments in
educational settings are formal. Typically, formal assessments are used in
combination with goals and objectives set forth at the beginning of a lesson
or the school year. Formal assessments are also different from informal
assessments in that test takers can prepare ahead of time for them.

Informal assessments are those that result from spontaneous, day-to-day observations of how an individual behaves and performs a particular task. An assessor conducting an informal assessment does not necessarily have a specific agenda in mind, but is likely to learn different things about the person as he or she proceeds through daily activities naturally. These types of assessments offer important insight into a person's misconceptions and abilities (or inabilities) that might not be represented accurately through formal assessments.
Paper-Pencil vs. Performance-Based Assessment. These types of assessment procedures are sub-divided into the following:

Verbal vs. Nonverbal. Verbal tests are not necessarily spoken; they may also be written. Nonverbal tests involve the ability to understand and analyze visual information.

Speed vs. Power. A speed test measures only the speed of response, while a power test is designed to measure the knowledge of the test taker, regardless of his or her speed of performance.

Individual vs. Group. An individual test can be administered to only one person at a time, while a group test can be administered to many people at once.

Objective vs. Subjective. Objective tests are psychological tests that measure an individual's characteristics in a way that is independent of rater bias or the individual's own beliefs. Objective tests tend to be more reliable and valid than subjective tests, which are evaluated by giving an opinion. Subjective tests are more challenging and expensive to prepare, administer, and evaluate correctly.

Cognitive vs. Affective. Cognitive tests attempt to measure
mental ability while affective tests are designed to assess
interests, attitudes, and personal values of an individual.
TYPES OF PSYCHOLOGICAL TESTS
Intelligence tests.

These are used to measure intelligence, or your ability to understand your
environment, interact with it and learn from it

–  Wechsler Adult Intelligence Scale (WAIS)

–  Wechsler Intelligence Scale for Children (WISC)

–  Stanford-Binet Intelligence Scale (SB)

Personality tests.

These are used to measure personality style and traits. Personality tests
are commonly used in research or to assist with clinical diagnoses.

–  Minnesota Multiphasic Personality Inventory (MMPI)

–  Thematic Apperception Test (TAT)


Attitude tests.

These tests, such as the Likert Scale or the Thurstone Scale, are used to measure how an individual feels about a particular event, place, person or object.

Achievement tests.

These are used to measure how well you understand a particular topic (e.g., mathematics achievement tests).
–  American College Test (ACT)

–  Iowa Test of Basic Skills

–  STAR Early Assessment

Aptitude tests.

These are used to measure your abilities in a specific area (e.g., clerical skills).
Neuropsychological tests.

These are used to find out how damage to your brain is affecting your ability to reason, concentrate, solve problems, or remember.
–  Barcelona Neuropsychological Test (BNT)

–  Cambridge Neuropsychological Test Automated Battery (CANTAB)

–  Cognistat (The Neurobehavioral Cognitive Status Examination)

Vocational tests.

It is the process of determining an individual's interests, abilities and
aptitudes and skills to identify vocational strengths, needs and career
potential.
–  Career Occupational Preference Survey (COPES)
Direct observation tests.

It involves the observation of people as they complete activities. This type
of assessment is usually conducted with families in a laboratory, home or
with children in a classroom.
–  The Parent-Child Interaction Assessment-II (PCIA)

–  The MacArthur Story Stem Battery (MSSB)

–  Dyadic Parent-Child Interaction Coding System-II

Biographical Information Blank



The Biographical Information Blanks or BIB is a paper-and-pencil form that
includes items that ask about detailed personal and work history. It is
PURPOSES OF PSYCHOLOGICAL
ASSESSMENT (Erford, 2013)

Screening
– A quick survey to locate individuals who may need or be eligible for special treatment

Diagnosis
– A detailed analysis of an individual's strengths and weaknesses, with the general goal of arriving at a classification decision

Treatment Planning and Goal Identification
– Determines what specific concern or problem afflicts a person, in order to define a specific plan of action and identify the specific objectives that need to be achieved

Progress / Outcome Evaluation
– Ensures that a given treatment plan is helpful to a client
Limitations of Psychological Tests

Scores can’t reveal how or why the individual obtained a certain score

Inferences rest only on a sample of behavior

Easily affected by extraneous variables

Chance error in the interpretation of individual scores
–  SEM = reasonable limits within which an obtained score may vary, given the test's reliability
–  SEdiff = difference between two scores, used in a test of significance
–  SEest = margin of error expected in an individual’s predicted criterion score

Uses of Psychological Tests in Various Settings

Educational Setting

Industrial/Business Setting

Clinical/Counseling Setting
STATISTICAL FOUNDATIONS
SCALES OF MEASUREMENT AND THEIR PROPERTIES

Type of Scale   Magnitude   Equal Intervals   Absolute 0
Nominal         No          No                No
Ordinal         Yes         No                No
Interval        Yes         Yes               No
Ratio           Yes         Yes               Yes
DESCRIBING DISTRIBUTIONS
1. Frequency Distribution – all scores are listed (in tabular or graphic form) alongside the number of times each score occurred.

2. Frequency Polygon

   1. Skewness – the nature and extent to which symmetry is absent, which indicates how the measurements in a distribution are spread
      i. Positive Skew – relatively few of the scores fall at the high end of the distribution
      ii. Negative Skew – relatively few of the scores fall at the low end of the distribution
   2. Kurtosis – the steepness (peakedness/flatness) of a distribution at its center
      i. Platykurtic – relatively flat
      ii. Leptokurtic – relatively peaked
      iii. Mesokurtic – somewhere in the middle
3. Normal Distribution – bell-shaped distribution

4. Measures of Central Tendency

1. Mean – the average of a group of scores

2. Median – the middle score that splits a distribution in half

3. Mode – the most frequently occurring score in the distribution

5. Measures of Variability

1. Range – the difference between the highest and lowest scores

2. Interquartile Deviation – a measure of variability suited to distributions that are not normal; half the distance between the upper and lower quartiles.
6. Standard Scores
A. Individual Score

Percentile Score/Rank – the percentage of the norm group that scored at or below the examinee's score.

B. Linear Standard Score – results from the conversion of the raw score into a number indicating how many standard deviation units the raw score is below or above the mean of the distribution

i. Z-Score – (Mean = 0, SD = 1)

ii. T-Score – (Mean = 50, SD = 10)

C. Normalized Standard Scores – involve “stretching” the skewed curve into the shape of a normal curve and creating a corresponding scale of standard scores

i. Stanine – score based on dividing the normal distribution into nine parts; each part describes a range of percentile scores.

ii. Sten – score based on dividing the normal distribution into ten parts; each part describes a range of percentile scores.
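
The conversions above can be sketched in a few lines of Python. This is only an illustration: the raw scores are hypothetical, the function names are invented for this example, and the group is treated as the whole norm group (so the population standard deviation is used).

```python
from statistics import mean, pstdev

def z_scores(raw):
    """Linear standard scores with mean 0 and SD 1."""
    m = mean(raw)
    sd = pstdev(raw)  # assumption: raw is the full norm group
    return [(x - m) / sd for x in raw]

def t_scores(raw):
    """Rescale z-scores to mean 50 and SD 10."""
    return [50 + 10 * z for z in z_scores(raw)]

def percentile_rank(raw, score):
    """Percentage of the norm group scoring at or below the given score."""
    at_or_below = sum(1 for x in raw if x <= score)
    return 100 * at_or_below / len(raw)

scores = [10, 12, 14, 16, 18]  # hypothetical raw scores

for raw, z, t in zip(scores, z_scores(scores), t_scores(scores)):
    print(f"raw={raw}  z={z:+.2f}  T={t:.1f}  PR={percentile_rank(scores, raw):.0f}")
```

A raw score equal to the group mean gets z = 0 and T = 50; the T-score only rescales the z-score, so the two always agree on an examinee's relative position.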
CORRELATIONAL STATISTICS
A statistic that indicates the degree of relationship between any two
sets of scores obtained from the same group of individuals. The degree of
association is computed and measured through correlation coefficient.
+/- 0.0 to 0.19 Very weak, negligible correlation
+/- 0.20 to 0.39 Weak, low correlation
+/- 0.40 to 0.59 Moderate correlation
+/- 0.60 to 0.79 Strong, high correlation
+/- 0.80 to 1.0 Very strong correlation
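
As a rough illustration of how a correlation coefficient is computed and then read against the ranges above, here is a minimal Python sketch. The two score sets are hypothetical, and `pearson_r` and `describe` are names made up for this example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score sets."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def describe(r):
    """Verbal label for |r|, following the ranges listed above."""
    size = abs(r)
    if size < 0.20:
        return "very weak, negligible"
    if size < 0.40:
        return "weak, low"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong, high"
    return "very strong"

test_a = [1, 2, 3, 4, 5]  # hypothetical scores on test A
test_b = [2, 4, 5, 4, 5]  # hypothetical scores of the same people on test B

r = pearson_r(test_a, test_b)
print(f"r = {r:.2f} ({describe(r)})")
```

The sign of r gives the direction of the relationship; only its absolute value is compared with the ranges.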
CORRELATIONAL TECHNIQUES

Interpretation scale for Pearson r, Spearman’s rho, Kendall’s tau, and the phi coefficient (absolute value of the coefficient):
-1 or +1 = perfect
.91 to .99 = very high
.71 to .90 = high
.41 to .70 = moderate
.21 to .40 = slight/weak
.00 to .20 = no correlation

Pearson r
Set of scores: 2 sets of scores from the same respondents
Type of measure: both interval or ratio (scale data)

Spearman’s rho
Set of scores: 2 sets of rankings from the same respondents; N is not more than 30
Type of measure: both ordinal

Kendall’s Tau
Set of scores: 2 sets of rankings from the same respondents; N can be more than 30. The tau coefficient is typically smaller than Spearman’s rho when both are computed on the same data.
Type of measure: both ordinal

Kendall’s W
Set of scores: more than two sets of rankings
Type of measure: all ordinal; rankings from several judges or raters
Interpretation: 0 to 1, where 0 = no agreement between judges and 1 = perfect agreement between judges

Phi Coefficient
Set of scores: 2 or more sets of frequencies
Type of measure: all nominal data

Multiple correlation
Set of scores: 3 or more sets of scores
Type of measure: all ratio or interval (scale data)
Interpretation: relationship between one variable and a combination of two other variables
Parametric Tests

Z-test is any statistical test for which the distribution of the test
statistic under the null hypothesis can be approximated by a
normal distribution. (NOTE: n ≥ 30; single parameter)
Example: Suppose that in a particular geographic region, the mean and standard
deviation of scores on a reading test are 100 points, and 12 points, respectively. Our
interest is in the scores of 55 students in a particular school who received a mean score
of 96. We can ask whether this mean score is significantly lower than the regional mean
— that is, are the students in this school comparable to a simple random sample of 55
students from the region as a whole, or are their scores surprisingly low?
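
The reading-test example can be worked through numerically. The sketch below uses only the figures given in the example (regional mean 100, SD 12, n = 55, school mean 96); the one-tailed direction and the helper `normal_cdf` are choices made for this illustration.

```python
from math import sqrt, erf

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

pop_mean, pop_sd = 100, 12   # regional mean and standard deviation
n, sample_mean = 55, 96      # the school's sample

standard_error = pop_sd / sqrt(n)             # SD of the sampling distribution of the mean
z = (sample_mean - pop_mean) / standard_error
p_one_tailed = normal_cdf(z)                  # probability of a mean this low or lower

print(f"SE = {standard_error:.2f}, z = {z:.2f}, one-tailed p = {p_one_tailed:.4f}")
```

Under these assumptions z is about -2.47 and the one-tailed p is below .01, so a mean of 96 would be surprisingly low for a simple random sample of 55 students from the region.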

T-test assesses whether the means of two relatively small, normally distributed groups are statistically different from each other.
One-way analysis of variance (ANOVA) is used to determine whether there are any
significant differences between the means of three or more independent (unrelated)
groups.
Example: a researcher wishes to know whether different pacing strategies affect the time to complete
a marathon. The researcher randomly assigns a group of volunteers to either a group that (a) starts slow
and then increases their speed, (b) starts fast and slows down or (c) runs at a steady pace throughout. The
time to complete the marathon is the outcome (dependent) variable.
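
To make the one-way ANOVA concrete, the following sketch computes the F ratio for the three pacing groups. The finishing times are invented for illustration, and `one_way_anova_f` is a hypothetical helper, not a library function; the resulting F would still have to be compared with a critical value from an F table.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for independent groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of scores around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical marathon times in minutes
slow_then_fast = [240, 250, 245]
fast_then_slow = [260, 270, 265]
steady_pace = [238, 242, 240]

f_ratio, df_b, df_w = one_way_anova_f([slow_then_fast, fast_then_slow, steady_pace])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

A large F means the group means differ by more than the spread of times within groups would lead one to expect.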

Two-way analysis of variance (ANOVA) compares the mean differences between groups
that have been split on two independent variables (called factors) on the dependent
variable.
Example: you may want to determine whether there is an interaction between physical activity level and
gender on blood cholesterol concentration in children, where physical activity (low/moderate/high) and
gender (male/female) are your independent variables, and cholesterol concentration is your dependent
variable.
Nonparametric Tests
Situations in which nonparametric tests are used:

The data involve measurements on nominal or ordinal scales. In these situations, you cannot compute the means and variances that are an essential part of the parametric tests.

The data do not satisfy the assumptions underlying
parametric tests.

The data have extremely high variance, which can
undermine the likelihood of significance for a parametric
test. In this case, the scores can be converted to
categories or ranks, and a nonparametric test can be

Chi Square statistic is used to investigate whether distributions of categorical variables
differ from one another.
 chi-square test for independence. The test is applied when you have two categorical variables from a
single population. It is used to determine whether there is a significant association between the two
variables.
Example: in an election survey, voters might be classified by gender (male or female) and voting
preference (Democrat, Republican, or Independent). We could use a chi-square test for
independence to determine whether gender is related to voting preference.

 chi-square goodness of fit test. The test is applied when you have one categorical variable from a
single population. It is used to determine whether sample data are consistent with a hypothesized
distribution.
Example: suppose a company printed baseball cards. It claimed that 30% of its cards were rookies;
60%, veterans; and 10%, All-Stars. We could gather a random sample of baseball cards and use a
chi-square goodness of fit test to see whether our sample distribution differed significantly from
the distribution claimed by the company.
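
The baseball-card example can be sketched as follows. The claimed proportions come from the example; the sample of 100 cards and its counts are hypothetical. The closed-form p-value used here holds only when there are exactly 2 degrees of freedom, as in this example; for other cases a chi-square table or a statistics package would be needed.

```python
from math import exp

claimed = {"rookies": 0.30, "veterans": 0.60, "all_stars": 0.10}
observed = {"rookies": 50, "veterans": 45, "all_stars": 5}  # hypothetical sample counts

n = sum(observed.values())

# Sum of (observed - expected)^2 / expected over the categories
chi_square = sum(
    (observed[k] - n * claimed[k]) ** 2 / (n * claimed[k]) for k in claimed
)
df = len(claimed) - 1

# With df = 2 the upper-tail probability is exp(-chi_square / 2)
assert df == 2
p_value = exp(-chi_square / 2)

print(f"chi-square({df}) = {chi_square:.2f}, p = {p_value:.5f}")
```

With these made-up counts the statistic is far above 5.99, the .05 critical value for 2 degrees of freedom, so the sample would be judged inconsistent with the company's claim.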
Binomial Test is an exact test of the statistical significance of
deviations from a theoretically expected distribution of
observations into two categories. One common use of the
binomial test is in the case where the null hypothesis is that two
categories are equally likely to occur (such as a coin toss).
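
A minimal sketch of the exact binomial test for the coin-toss case is given below. The data (15 heads in 20 tosses) are hypothetical, and the two-sided rule used here (summing all outcomes no more probable than the observed one) is one common convention, stated as an assumption.

```python
from math import comb

def binomial_two_sided_p(successes, n, p=0.5):
    """Exact two-sided p-value: total probability of outcomes
    that are no more likely than the observed count."""
    pmf = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[successes]
    # Small tolerance guards against floating-point ties
    return sum(q for q in pmf if q <= observed * (1 + 1e-9))

p_value = binomial_two_sided_p(15, 20)  # hypothetical: 15 heads in 20 tosses
print(f"two-sided p = {p_value:.4f}")
```

Because the test sums exact binomial probabilities, it does not rely on a normal or chi-square approximation and can be used with small samples.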

Sign-test is the alternative test to the Wilcoxon test for dependent
data. One requirement of the Wilcoxon test is that the data needs
to be at least interval scaled. For the sign test the data needs to be
at least ordinal scaled.
Parametric vs. Nonparametric

                         Parametric                  Non-parametric
Assumed distribution     Normal                      Any
Assumed variance         Homogeneous                 Any
Typical data             Ratio or Interval           Ordinal or Nominal
Data set relationships   Independent                 Any
Usual central measure    Mean                        Median
Benefits                 Can draw more conclusions   Simplicity; less affected by outliers

Choosing a test

                                   Parametric test                       Non-parametric test
Correlation test                   Pearson                               Spearman
Independent measures, 2 groups     Independent-measures t-test           Mann-Whitney test
Independent measures, >2 groups    One-way, independent-measures ANOVA   Kruskal-Wallis test
Repeated measures, 2 conditions    Matched-pair t-test                   Wilcoxon test
Repeated measures, >2 conditions   One-way, repeated-measures ANOVA      Friedman's test
RELIABILITY

Source of error: Time sampling
Example: Same test given at two points in time
Method: Test-retest
How assessed: Correlation between scores obtained on the two occasions

Source of error: Item sampling
Example: Different items used to assess the same attribute
Method: Alternate forms or parallel forms
How assessed: Correlation between equivalent forms of the test that have different items

Source of error: Internal consistency
Example: Consistency of items within the same test
Method and how assessed:
1. Split half – corrected correlation between two halves of the test
2. Inter-item analysis for intelligence tests – KR20
3. Inter-item analysis for personality tests – Cronbach’s alpha

Source of error: Observer differences
Example: Different observers recording
Method: Inter-rater or inter-scorer reliability
How assessed: Kappa statistic
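
As an illustration of an internal-consistency estimate, the sketch below computes Cronbach's alpha with the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The response matrix is hypothetical and `cronbach_alpha` is a name chosen for this example. KR20 is the special case of the same formula for items scored right/wrong (0 or 1).

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """responses: one list of item scores per respondent."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # regroup scores item by item
    item_variances = sum(pvariance(item) for item in items)
    total_variance = pvariance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical ratings: 4 respondents x 3 items
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [3, 3, 4],
    [5, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha rises when the items vary together, that is, when the variance of total scores is large relative to the sum of the separate item variances.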
STANDARD ERROR OF MEASUREMENT

It is an estimate of the standard deviation of a normal distribution of scores that would presumably be obtained if a
person took the test an infinite number of times.

The formula for calculating the standard error of measurement (SEM) is:

SEM = s × √(1 − r)

Where: s represents the standard deviation of the instrument

r is the reliability coefficient.

Example: Anne took the Graduate Record Examinations Aptitude Test (GRE), an instrument used in selecting and admitting students to graduate programs. The GRE gives three scores: Verbal (GRE-V), Quantitative (GRE-Q), and Analytical (GRE-A). Scores range from 200 to 800.

Anne’s score on the GRE-V is 430.

Assume that the mean is 500 and the standard deviation is 100.

The reliability coefficient for the GRE-V is .90 (Educational Testing Service, 1997).
With a standard deviation of 100 and a reliability of .90, the SEM is 100 × √(1 − .90) ≈ 32.

A psychometrician could then tell Anne that 68% of the time she could expect her GRE-V score to fall between 398 (430 - 32) and 462 (430 + 32). If we wanted to expand this interpretation, we could use two standard errors of measurement (2 x 32 = 64). In this case, we would say that 95% of the time Anne’s score would fall between 366 (430 - 64) and 494 (430 + 64). If we wanted to further increase the probability of including her true score, we would use three standard errors of measurement (3 x 32 = 96) and conclude that 99.7% of the time her score would fall between 334 (430 - 96) and 526 (430 + 96).
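
The arithmetic of the GRE-V example can be checked with a short sketch. The numbers are the ones given in the example; the function names are invented here, and the SEM is rounded to a whole score point as the slides do.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = s * sqrt(1 - r)."""
    return sd * sqrt(1 - reliability)

def score_band(observed, sem, k):
    """Band of k standard errors around an observed score."""
    return (observed - k * sem, observed + k * sem)

sem = standard_error_of_measurement(sd=100, reliability=0.90)
sem_rounded = round(sem)  # rounded to a whole score point

bands = {k: score_band(430, sem_rounded, k) for k in (1, 2, 3)}

print(f"SEM = {sem:.2f} (rounded to {sem_rounded})")
for k, (low, high) in bands.items():
    print(f"+/- {k} SEM: {low} to {high}")
```

Raising the reliability shrinks the SEM and therefore narrows every band; with r = 1.00 the SEM would be 0.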
HOW TO IMPROVE RELIABILITY

1. Quality of test items (Concise statements, homogeneous words, uniformity)

2. Adequate sampling of content domains (Comprehensiveness of items)

3. Longer assessment (More test items)

4. Developing a scoring plan (Especially for subjective tests = Rubrics)

5. Ensure Validity
VALIDITY

Face Validity – the content of the test appears to reflect the material it is supposed to measure, according to the test takers.
Example: The RPm Board Exam should reflect the materials provided in the table of specification.

Content Validity – Depends on evidence that the items on the test are representative of the content domain and that they measure the objective they are supposed to.
Example: The RPm Board Exam should comprise a wide range of subjects.

Criterion-Related Validity

Concurrent Validity – The test scores and criterion information are obtained at the same time.
Example: The scores on the Mechanical Aptitude Test correlated significantly with supervisory ratings of the worker’s performance conducted at the same time.

Predictive Validity – Test data are used to estimate criterion scores in the future; predictor and criterion scores are obtained at different times.
Example: A high score in the RPm Board Exam should predict high performance in psychometric work.

Construct Validity – The extent to which a test may be said to measure a theoretical construct or trait.
Example: A score in an IQ test should reflect one’s intelligence.

Convergent Validity – measures of constructs that theoretically should be related to each other are, in fact, observed to be related to each other.

Discriminant Validity – measures of constructs that theoretically should not be related to each other are, in fact, observed to not be related to each other.
FACTORS THAT CAN LOWER
VALIDITY
1. Unclear Directions

2. Difficult reading vocabulary

3. Ambiguity in statements

4. Inadequate time limits

5. Inappropriate level of difficulty

6. Poorly constructed test items


Test Development* and Scale Construction*

Norm-Referenced Tests (or NRTs) compare an examinee’s performance to that of other examinees

Criterion-Referenced Tests (or CRTs) differ in that each examinee’s performance is compared to a pre-
defined set of criteria or a standard.

Standardization – The process of administering a test to a representative sample of examinees for the
purpose of establishing norms.

Item Response Theory – Focuses on the range of item difficulty that helps assess an individual’s ability
level.
i. Difficulty Index – refers to the ‘difficulty level’ of each item in the given test
.00 to .20 – Very Difficult
.21 to .80 – Moderate
.81 to 1.00 – Easy
ii. Discrimination Index – indicate how adequately an item separates or discriminates
between high scores and low scores on an entire test
.00 to .20 – Can’t discriminate
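
The two indices can be illustrated with a small sketch. It assumes the common classical definitions, which the slides do not spell out: the difficulty index is the proportion of examinees answering the item correctly, and the discrimination index is the proportion correct in the top-scoring group minus the proportion correct in the bottom-scoring group (here the top and bottom 27%, a frequently used convention). All data and function names are hypothetical.

```python
def difficulty_index(item_responses):
    """Proportion of examinees who answered the item correctly (1 = correct)."""
    return sum(item_responses) / len(item_responses)

def difficulty_label(p):
    """Label following the ranges listed above."""
    if p <= 0.20:
        return "very difficult"
    if p <= 0.80:
        return "moderate"
    return "easy"

def discrimination_index(total_scores, item_responses, fraction=0.27):
    """Proportion correct in the upper group minus the lower group."""
    ranked = sorted(zip(total_scores, item_responses),
                    key=lambda pair: pair[0], reverse=True)
    size = max(1, round(fraction * len(ranked)))
    upper = [correct for _, correct in ranked[:size]]
    lower = [correct for _, correct in ranked[-size:]]
    return sum(upper) / size - sum(lower) / size

# Hypothetical data for one item: total test score and item result per examinee
totals = [95, 90, 85, 80, 70, 65, 60, 50, 40, 30]
item = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]

p = difficulty_index(item)
d = discrimination_index(totals, item)
print(f"difficulty = {p:.2f} ({difficulty_label(p)}), discrimination = {d:.2f}")
```

An item that high scorers pass and low scorers fail gets a discrimination index near 1; a value near 0 means the item does not separate the two groups.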
Psychological Report Writing*
THEORIES OF INTELLIGENCE

Francis Galton (1884) – The most intelligent persons are equipped with the best sensory abilities.

Alfred Binet (1916) – The tendency to take and maintain a definite direction or purpose; the capacity to make adaptations and strategies to achieve a desired end; and the power of autocriticism (self-criticism).

Charles Spearman (1927) – Intelligence consists of one general factor (g) plus a large number of specific factors: the Two-Factor Theory of Intelligence.

L. Thurstone (1938) – Primary mental abilities: 7 interrelated factors, namely verbal comprehension, word fluency, number facility, perceptual speed, memory, space, and reasoning.

Jean Piaget (1954) – A kind of evolving biological adaptation to the outside world.

J. P. Guilford (1967) – Structure of Intellect model. Includes 120 unique intellectual factors that are organized around 3 dimensions: mental operations, content, and products.

Cattell-Horn and Carroll – CHC model (1993): Proposed that fluid abilities (Gf) are biologically determined while crystallized abilities (Gc) are acquired skills and knowledge influenced by cultural, social, and educational experiences.

–  Short-Term Memory (Gsm): the ability to apprehend and hold information in immediate awareness and then use it within a few seconds.

–  Long-Term Storage and Retrieval (Glr): the ability to store information and fluently retrieve it later in the process of thinking.

–  Visual Processing (Gv): the ability to perceive, analyze, synthesize, and think with visual patterns, including the ability to store and recall visual representations.

–  Auditory Processing (Ga): the ability to analyze, synthesize, and discriminate auditory stimuli, including the ability to process and discriminate speech sounds that may be presented under distorted conditions.

–  Processing Speed (Gs): the ability to perform automatic cognitive tasks, particularly when measured under pressure to maintain focused attention.

Lev Vygotsky (1978) – Socio-cultural theory: a developmental theory emphasizing the role of culture and social interaction.

Alexander Luria (1966) – Information-processing view: concerned with the mechanisms by which information is processed; how information is processed, rather than what is processed.

–  Simultaneous vs. successive processing

–  Simultaneous – parallel processing

–  Successive – sequential processing

Howard Gardner (1983) – The ability to resolve genuine problems or difficulties as they are encountered. He proposed 8 frames of mind (in some circles there are 9): Spatial, Linguistic, Logical-Mathematical, Bodily-Kinesthetic, Musical, Interpersonal, Intrapersonal, and Naturalistic, with Existential as the 9th frame of mind, currently under contention.

Robert Sternberg (1986) – Triarchic Theory of Intelligence
Intelligence Testing

Issues

Is intelligence stable?

What do intelligence test scores predict?

Is intelligence hereditary?

What is the Flynn effect?

Developmental Issues

Principles in Test Construction

Age Differentiation

General Mental Ability

Gf-gc Theory

Point Scale System

Performance vs. Verbal Questions

Stanford-Binet Intelligence Scale

Factor                           Nonverbal                Verbal
FR (Fluid Reasoning)             Matrices Tasks           Analogies
KN (Knowledge)                   Absurdities              Vocabulary
QR (Quantitative Reasoning)      Quantitative Reasoning   Verbal QR
VS (Visual/Spatial Reasoning)    Form board               Positions & Dir
WM (Working Memory)              Block pattern            Sentence memory
IQ Range   Category
145-160    Very gifted or highly advanced
130-144    Gifted or very advanced
120-129    Superior
110-119    High average
90-109     Average
80-89      Low average
70-79      Borderline impaired or delayed
55-69      Mildly impaired or delayed
40-54      Moderately impaired or delayed


Wechsler Adult Intelligence Scale

Verbal Comprehension: Vocabulary, Similarities, Information

Perceptual Organization: Picture Completion, Block Design, Matrix Reasoning

Working Memory: Arithmetic, Digit Span, Letter-Number Sequencing

Processing Speed: Digit Symbol, Symbol Search
IQ Range Category
130 & above Very Superior
120 – 129 Superior
110 – 119 High Average
90 – 109 Average
80 – 89 Low Average
70 – 79 Borderline
69 & below Extremely low

Vineland Adaptive Behavior Scales, 2nd ed

Index: Subdomain
Communication: Receptive, Expressive, Written
Daily Living Skills: Personal, Domestic, Community
Socialization: Interpersonal Relationships, Play and Leisure Time, Coping Skills
Motor Skills: Fine, Gross
Maladaptive Behavior Index (Optional): Internalizing, Externalizing, Other

Kaufman Assessment Battery for Children, 2nd ed

Scale Name: Subtest
Sequential Scale: Number Recall, Word Order, Hand Movements
Simultaneous Scale: Block Counting, Conceptual Thinking, Face Recognition, Pattern Reasoning, Rover, Story Completion, Triangles, Gestalt Closure
Planning Scale: Pattern Reasoning, Story Completion
Learning Scale: Atlantis, Rebus, Atlantis Delayed, Rebus Delayed
Knowledge Scale: Expressive Vocabulary, Riddles, Verbal Knowledge

Raven's Progressive Matrices by John C. Raven (1935, 1992)

One of the best-known nonverbal group tests; it measures an individual's general intelligence for educational and industrial purposes. 60 items of increasing difficulty, applicable from age 5 to adults.

Culture Fair Intelligence Tests by R. Cattell (1959, 1973)

Paper-and-pencil procedure that covers 3 levels (ages 4-8, ages 8-12, and high school age to above-average adults)

Nonverbal measure of fluid intelligence – analytic and reasoning ability in abstract and novel situations.
Panukat ng Katalinuhang Pilipino by Aurora Palacio (1991)


A tool that is truly Filipino in nature and orientation. Age range: 16 and above; uses
Full Scale IQ

Coverage: Crystallized intelligence → Talasalitaan (Vocabulary), Kakayahan sa mga
bilang (Numerical ability), Ugnayan (Analogy) and Fluid intelligence →
Isinalarawang problema (Figural)
Differential Aptitude Test (DAT, 5th ed. by
Bennett, Seashore & Wesman, 1982, 1990)

General Cognitive Abilities: Verbal Reasoning, Numerical Reasoning
Measure the ability to learn in either an occupational or training setting,
and specifically the ability to learn from books and manuals, self
instruction, trainers, teachers, or mentors.

Perceptual Abilities: Abstract Reasoning, Mechanical Reasoning, Space Relations
Tests abilities that are important when dealing with things, rather
than people or words

Clerical and Language Skills: Spelling, Language Usage, Clerical Speed & Accuracy
Tests skills necessary to perform various types

Flanagan Industrial Tests (FIT by John C. Flanagan,
1965)
1. Arithmetic 7. Ingenuity 13. Patterns
2. Assembly 8. Inspection 14. Planning
3. Components 9. Judgment & Comprehension 15. Precision
4. Coordination 10. Mathematics & Reasoning 16. Scales
5. Electronics 11. Mechanics 17. Tables
6. Expression 12. Memory 18. Vocabulary


Philippine Aptitude Classification Test (PACT by CEM)

1. Symbol discrimination 6. Flexibility of Closure


2. Form discrimination 7. Verbal Filipino
3. Verbal English 8. Spatial Closure
4. Number Facility 9. Mechanical Reasoning
5. Induction 10. Perceptual Acuity
BarOn Emotional Quotient Inventory by Reuven Bar-On

Scales

1. Intrapersonal. Being aware of ourselves and understanding our strengths
and weaknesses; and being able to express ourselves, our feelings, and
our thoughts nondestructively.
a) Self-regard
b) Emotional self-awareness
c) Assertiveness
d) Independence
e) Self-actualization

2. Interpersonal. Being aware of others’ emotions, feelings and needs, and
being able to establish and maintain cooperative, constructive and mutually
satisfying relationships.
a) Empathy
b) Social Responsibility
c) Interpersonal Relationship

3. Stress-Management.
a) Stress Tolerance
b) Impulse Control

4. Adaptability. Managing change by realistically and flexibly coping with
the immediate situation and effectively solving problems as they arise.
a) Reality Testing
b) Flexibility
c) Problem Solving

5. General Mood. Being optimistic, positive and sufficiently self-motivated
to set and pursue our goals
a) Optimism
b) Happiness
Theories of Personality

Hippocrates’ Humoral Theory – with four personality types namely: sanguine,
choleric, melancholic and phlegmatic

Sheldon & Stevens (1942) proposed a type theory based on relationship between
body build and temperament such as endo-, meso- and ectomorph. Contemporary
research on coronary prone personality types seems related to the type theory.

John Holland’s theory of vocational personality. Occupational choice has a
great deal to do with one’s personality and self-perception of abilities.

Costa and McCrae’s Big Five model
Extraversion – Neuroticism – Openness – Agreeableness – Conscientiousness

Raymond Cattell’s Factor-Analytic Trait theory.
 Surface traits – more obvious aspects of personality
 Source traits – stable and constant sources of behavior, less visible than
surface traits and more important in accounting for behavior.

Hans Eysenck’s dimensions of personality = Psychoticism (P),
Extraversion (E), and Neuroticism (N).

Henry Murray’s Psychogenic Needs and Environmental Presses which argues that
it is the continuity of functional forms and forces manifested through
sequences of organized regnant processes and overt behaviors from birth to death

Sigmund Freud’s unconscious level based on Topographical model of Personality

Social Cognitive Theory. Reciprocal influences between people and their
circumstances, especially concerning perceptions of control

Humanistic Theory. Theories of how people develop, the structure of
personality, the nature of mental health and how to treat problems
based on writings by Rogers and Maslow
PRINCIPLES OF PERSONALITY TEST
CONSTRUCTION

Personality – Individual’s unique constellation of psychological traits and
states, including aspects of values, interests, attitudes, worldview,
acculturation, sense of personal identity, sense of humor, cognitive and
behavioral styles and related characteristics.

TRAITS vs. TYPES vs. STATES

Traits – any distinguishable, relatively enduring ways in
which one individual varies from another (Guilford, 1959)

Types – constellation of traits that is similar in patterns to
one identified category of personality within a taxonomy
of personalities.

States – refers to transitory exhibition of some personality
trait.

SELF-REPORTS vs. OTHER RATER: How valid can it be?

Response Styles: A tendency to….
Socially desirable responding: Presenting oneself in a favorable light
Acquiescence: Agree with whatever is presented
Nonacquiescence: Disagree with whatever is presented
Deviance: Make unusual or uncommon responses
Extreme: Make extreme, as opposed to middle, ratings on a rating scale
Gambling/cautiousness: Guess or not guess when in doubt
Overly positive: Claim extreme virtue through self-presentation in a
superlative manner
TECHNIQUES OF MEASURING PERSONALITY

PAPER-PENCIL TESTS – This is a kind of test requiring written answers;
objective type

INTERVIEWS – This is a technique wherein a person is asked to describe
herself/himself. Sometimes interviews are from standardized
set of questions. This is also the most commonly used method of personality
assessment

ASSESSMENTS – This technique is used to pick out psychological disorders
of a person. Examples of these assessments are MMPI,
Rorschach test and Thematic Apperception Test.

OBSERVATION – These are judgments about
personality based on our own observations.

GENERAL PRINCIPLES FOR EVALUATING MEASURES OF PERSONALITY

INTERPRETABILITY – The results must convey information about the individual
that can be interpreted reliably by various users. In other words,
personality tests that are reasonably specific in terms of what they are
trying to measure are likely to prove more useful than tests that are vague.

STABILITY – It can be defined in two different ways, both
of which are relevant for evaluating personality measures.
– First, there is the stability of scoring rules, which
affects inter-judge agreement. In general, objective
measures have simpler and significantly more stable
scoring rules than are possible with projective measures.
– The second meaning of stability is stability across
situations, which refers to both the test scores and to
the attribute being measured
NORMS, RELIABILITY AND
VALIDITY OF PERSONALITY TESTS

The scores are usually interpreted with reference to a set of norms based on the responses of selected
groups of people. Because the standardization samples are sometimes very small and unrepresentative of the
intended target population, such norms must be interpreted cautiously.

Scores and norms for some personality inventories, particularly those consisting of items having a forced-
choice format, are ipsative. When the scoring is ipsative, a person’s score on one scale is affected by his or
her scores on the remaining scales.

It is impossible to make all high scores or all low scores because the scores compensate for one another.
This creates problems in comparing the scores of different people on a particular scale or variable.
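
The compensating behavior of ipsative scores can be seen in a small sketch. It assumes a hypothetical forced-choice inventory with three scales (A, B, C) and six items, each pairing a statement from one scale with a statement from another; none of this is taken from a real instrument:

```python
# Hypothetical forced-choice items: each pits one scale against another.
SCALES = ("A", "B", "C")
ITEMS = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "B"), ("B", "C"), ("A", "C")]


def score_ipsative(choices):
    """choices[i] is 0 or 1: which statement of ITEMS[i] the person picked."""
    totals = {scale: 0 for scale in SCALES}
    for (first, second), pick in zip(ITEMS, choices):
        chosen_scale = first if pick == 0 else second
        totals[chosen_scale] += 1  # one point per item, to exactly one scale
    return totals


person_1 = score_ipsative([0, 0, 0, 0, 1, 0])
person_2 = score_ipsative([1, 1, 1, 1, 0, 1])

# Every profile sums to the number of items, so a high score on one
# scale can only be obtained at the expense of the others.
print(person_1, sum(person_1.values()))
print(person_2, sum(person_2.values()))
```

Because each person's total is fixed, a given scale score reflects the relative ordering of traits within that person, which is why comparing one scale across different people is problematic.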

The instability of personality measurements typically results in measures having lower reliabilities than
scores on tests of ability and achievement.

Personality inventories have fairly limited validities. Here, the faking and response sets contribute to the
low validities of many inventories used in clinical diagnosis and classification. Another factor affecting the
validity is the susceptibility of users to the fallacy of believing that sets of item scales with similar names

NEO Personality Inventory – Revised by Costa & McCrae, 1992

Domains: Facets (each has 6)
Neuroticism: Anxiety, Angry Hostility, Depression, Self-consciousness, Impulsiveness, Vulnerability
Extraversion: Warmth, Gregariousness, Assertiveness, Activity, Excitement seeking, Positive Emotions
Openness to Experience: Fantasy, Aesthetics, Feelings, Actions, Ideas, Values
Agreeableness: Trust, Straightforwardness, Altruism, Compliance, Modesty, Tender-mindedness
Conscientiousness: Competence, Order, Dutifulness, Achievement-striving, Self-discipline, Deliberation

16 Personality Factors Questionnaire, R. Cattell, 1946

The 16 Personality Factors
Factor Name | Interpretation of Low Score | Interpretation of High Score
Warmth | Reserved, detached, cool, impersonal | Warm, outgoing, likes people
Intelligence | Concrete thinking | Abstract thinking, bright
Emotional Stability | Emotionally less stable, changeable | Emotionally stable, calm, mature
Dominance | Submissive, conforming, mild | Dominant, assertive, competitive
Impulsivity | Serious, prudent, sober, taciturn | Enthusiastic, cheerful, heedless
Conformity | Expedient, disregards rules | Conforming, persevering, moralistic
Boldness | Shy, timid, restrained | Bold, uninhibited, spontaneous
Sensitivity | Tough-minded, self-reliant | Tender-minded, sensitive
Suspiciousness | Trusting, adaptable | Suspicious, hard to fool, opinionated
Imagination | Practical, conventional | Impractical, absent-minded, unconventional
Shrewdness | Forthright, genuine, unpretentious | Calculating, polished, socially alert
Insecurity | Confident, self-satisfied, secure | Self-blaming, worrying, troubled
Radicalism | Conservative, resisting change | Liberal, analytical, innovative
Self-sufficiency | Group-oriented, sociable | Resourceful, self-sufficient
Self-discipline | Undisciplined, impulsive | Compulsive, socially precise
Tension | Relaxed, tranquil, low drive | Frustrated, driven, tense

Four Second-Order Indices from the 16PF
Index | Low Score | High Score
Extraversion | Introversion | Extraversion
Anxiety | Low anxiety | High anxiety
Tough poise | Sensitivity, emotionalism | Tough poise
Independence | Dependence | Independence

MINNESOTA Multiphasic Personality Inventory by
Starke Hathaway & John Charnley McKinley (1940)

Hypochondriasis – patients who showed exaggerated concerns
about their physical health

Depression – clinically depressed patients; unhappy and
pessimistic about their future

Hysteria – Patients with conversion reactions

Psychopathic Deviate – Patients who had histories of
delinquency and other antisocial behavior

Masculinity-femininity

Paranoia – Patients who exhibit paranoid symptomatology such
as ideas of suspiciousness, delusions of persecution, and
delusions of grandeur

Psychasthenia – Anxious, obsessive-compulsive, guilt-ridden,
and self-doubting patients

Schizophrenia – Patients who were diagnosed as such (with
various subtypes)

Hypomania – Patients most diagnosed as manic-depressive, who
exhibited manic symptomatology such as elevated mood,

MYERS-BRIGGS TYPE INDICATOR by
Katharine C. Briggs & Isabel Briggs Myers (1962)

Attempts to classify persons according to
Carl Jung’s theory of personality types
which yields a 4-letter typological code.

It is valuable in vocational guidance,
organizational consulting, implications to
relationships, leadership and personality
functioning

The four theoretically independent polarities
when uniquely combined according to
examinees’ preferences will result in a total of
16 possible TYPES: Extraverted –
Introverted; Sensing – Intuitive; Thinking –
Feeling; Perceiving – Judging
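
The count of 16 MBTI types follows from combining one preference from each of the four polarities (2 × 2 × 2 × 2). A minimal sketch, using the conventional single-letter codes (with N for Intuitive) and the conventional letter order; the variable names are illustrative:

```python
from itertools import product

# One pair of letters per polarity: Extraverted/Introverted, Sensing/Intuitive,
# Thinking/Feeling, Judging/Perceiving.
POLARITIES = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]

# Pick one letter from each polarity to form a 4-letter typological code.
ALL_TYPES = ["".join(letters) for letters in product(*POLARITIES)]

print(len(ALL_TYPES))
print(ALL_TYPES)
```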

Bender Gestalt Visual-Motor Test by Lauretta Bender (1938)

Test takers were shown 9 cards in turn and instructed
to copy each design. Average administration time for all designs
was about five minutes. After all designs had been
copied, the examinee is instructed to draw all of the
designs from memory.

Typical errors in reproduction are rotation,
angulation, integration, perseveration, distortion of
shape and disproportion.

Brannigan and Decker (2003) added seven new items:
4 designs for ages 4 to 7 yo while 3 new items are for
ages 8 and above.

Panukat ng Pagkataong Pilipino by
Anna Daisy Carlota (1978)

CONSCIENTIOUSNESS Domain
1. Pagkaresponsable (Responsibleness)
2. Pagkamatiyaga (Patience)
3. Pagkamapagsapalaran (Risk taking)
4. Pagkamasunurin (Obedience)
5. Pagkamasikap (Achievement orientation)
6. Pagkamaayos (Orderly)

EMOTIONAL STABILITY Domain
1. Pagkamahinahon (Emotional Stability)
2. Pagkamaramdamin (Sensitiveness)
3. Pagkamasayahin (Cheerfulness)

INTELLECT/OPENNESS Domain
1. Pagkamatalino (Intelligence)
2. Pagkamalikhain (Creativity)

SURGENCY/EXTRAVERSION Domain
1. Pagkapalakaibigan (Sociability)
2. Pagkamadaldal (Social curiosity)

AGREEABLENESS Domain
1. Pagkamaalalahanin (Thoughtfulness)
2. Pagkamagalang (Respectfulness)
3. Pagkamatulungin (Helpfulness)
4. Pagkamapagkumbaba (Humility)
5. Pagkamaunawain (Capacity for understanding)
6. Pagkamatapat (Honesty)
Panukat ng Ugali at Pagkatao by Virgilio Enriquez & Ma.
Angeles Guanzon-Lapena (1975, 1983, 1989, 1997, 2001)

EXTRAVERSION/SURGENCY
1. Pagkasunud-sunuran (Conformity)
2. Ambisyon (Ambition)
3. Pagkamahiyain (Shyness/Timidity)
4. Lakas ng Loob (Guts/Daring)

AGREEABLENESS
1. Pagkamapunahin (Criticalness)
2. Pagkapalaaway (Belligerence)
3. Hirap Kausapin (Difficulty to Deal with)
4. Pagkamapagkumbaba (Humility)
5. Pagkamatulungin (Helpfulness)
6. Pagkamapagbigay (Generosity)
7. Pagkamagalang (Respectfulness)

CONSCIENTIOUSNESS
1. Pagkasalawahan (Ficklemindedness)
2. Katiyagaan (Perseverance)
3. Tigas ng Ulo (Stubbornness)
4. Pagkaresponsable (Responsibleness)
5. Pagkasigurista (Prudence)
6. Katipiran (Thriftiness)

EMOTIONAL STABILITY
1. Pagkamapagtimpi (Restraint)
2. Pagkapikon (Low Tolerance for Teasing)
3. Pagkamaramdamin (Sensitiveness)
4. Sumpong (Mood)

INTELLECT/OPENNESS TO EXPERIENCE
1. Pagkamausisa (Inquisitive)
2. Pagkamaalalahanin (Thoughtfulness)
3. Pagkamalikhain (Creativity)
Trustworthiness Scale (T Scale)
by Garcia, Hernando, Samson and Abrenica
(1995)

Designed to measure the degree to which trustworthiness
behaviors will be manifested by bank employees

50 items written in English

5-point Likert Scale

Reliability (Split-half): .95

Validity
– Convergent Validity: r=.343; p value at .05
– Discriminant Validity: r=-.147; p value at .05
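
A split-half reliability coefficient such as the one reported above is commonly obtained by correlating scores on two halves of the test and then applying the Spearman-Brown correction. The sketch below is an assumption about procedure, not a description of how the T Scale authors computed their figure: it uses an odd-even split, and the response data are made up for illustration.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_brown(r_half):
    """Step the half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """responses: one list of item scores per respondent (odd-even split)."""
    odd_item_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_item_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_item_totals, even_item_totals))


# Hypothetical 5-point Likert responses: 5 respondents x 6 items.
RESPONSES = [
    [5, 4, 5, 5, 4, 5],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 3],
    [4, 4, 4, 5, 4, 4],
    [1, 2, 1, 1, 2, 2],
]
reliability = split_half_reliability(RESPONSES)
print(round(reliability, 2))
```

Other ways of splitting the items (first half versus second half, random halves) can give different coefficients for the same data.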
Alternative Assessment

INTAKE OR INITIAL INTERVIEW

One of the commonly used assessment strategies

Used to gather information about the range and scope of the concerns,
pertinent details about the current situation, and relevant background
information to the current problems

Not always easy to conduct effectively; guidelines and training are needed
to obtain accurate and valid information from an interview

Strengths

Ability to ask diverse questions (issues, problems, personal information)

Good method for gathering initial information

Limitations

Lack of validation evidence

Influences of the interviewer’s subjectivity

Considerations on Gender and culture issues

Possible “halo effect”

CHECKLIST

1. Requires the observer to note whether a particular
characteristic is present or absent

2. Individuals are instructed to mark the words or
phrases in the list that apply to them

3. Can be filled out by the client or by an observer
(parent, teacher)

Standardized Checklists
•Mooney Problem Checklist
•The Symptoms Checklist-90 Revised
•Brief Symptoms Inventory
•The Child Behavior Checklist for 4 – 18
Rating Scales

Francis Galton – first to use rating scales
for psychological assessment in 19th century

Typically asks the observer to note
the degree to which a characteristic is
present or how often a behavior occurs

May be made either by the ratee or another rater

Extensively used in assessing behavioral
and personality characteristics

Less precise than personality inventories
and more superficial than projective
technique.

5 Types of Rating Scales

1. Numerical Scale – the person, object or event is assigned one
of several numbers corresponding to particular description of
the characteristics being rated

2. Semantic Differential Scale – a person rates a series of
concepts on several seven-point, bipolar adjectival scales

3. Graphic rating Scale – the rater puts a check mark on each
of a series of lines containing descriptive terms or phrases
pertaining to a certain characteristic or trait

4. Standard Rating Scale – the rater supplies or is supplied with
a set of standards for evaluating the persons being rated (ratees)

5. Forced-choice Scale – given a series of descriptions, the rater
is told to select the statement that is most descriptive and the
one that is least descriptive of the ratee

Errors in Rating
1. Constant Error – occurs when the assigned ratings are consistently higher
(leniency or generosity error), consistently lower (severity error), or
clustered in the average category (central tendency error)

2. Halo Effect – tendency of raters to respond on the basis of their general
impression of the ratee

3. Contrast Error – tendency to assign a higher or lower rating than justified.


Observation

Most widely used, understood and acceptable method of personality assessment

Another useful form of data collection

Gives information and insights on strengths and weaknesses

First step in designing relevant focus group questions, interview guides, and surveys (It
also provides an opportunity to observe the program setting, activities, and interactions
between staff and participants)

Advantages

Observations are relatively easy to do.

The situations in which people or animals are observed may come closer to real life
situations than is the case for most research. Therefore, observational research may be
Observation
Disadvantages

Even in laboratory observation, once the observation is set up, the observer has relatively
little control over the behavior.

The observer may not be aware of all of the factors that are affecting behavior, and may
draw incorrect conclusions.

Observational Methods

Naturalistic Observation - this method involves observing behavior as it naturally
happens. The observer tries to be unobtrusive and does not interfere in any way.

Laboratory Observation - this method involves doing observations in which you have set
up the circumstances in which the behavior will take place. If you interfere in any way to
set up the observation, it becomes a laboratory observation; thus, it does not necessarily
Mental Status Examination

Mental Status Examination (MSE) and client’s history are the
most important diagnostic tools a psychiatrist has to
obtain information to make an accurate diagnosis

A means of assessing the person's current thought processes,
emotions, and interpersonal qualities

Can also provide clues to areas that may need to be
addressed in follow-up sessions or outside referrals

Often conducted as part of the clinical interview
ETHICAL PRACTICE OF PSYCHOLOGICAL
ASSESSMENT (by Groth-Marnat, 2009)

Developing a Professional Relationship



Assessment should be conducted only in the context of a
clearly defined professional relationship.

This means that the nature, purpose, and conditions of the
relationship are discussed and agreed on.

Usually, the clinician provides relevant information, followed
by the client’s signed consent. Information conveyed to the
client usually relates to the type and length of assessment,
alternative procedures, details relating to appointments, the
nature and limits of confidentiality, financial requirements,
and additional general information that might be relevant to
the unique context of an assessment

Invasion of Privacy

The Office of Science and Technology (1967), in a report
entitled Privacy and Behavioral Research, has defined privacy
as “ the right of the individual to decide for him/herself how
much he will share with others his thoughts, feelings, and
facts of his personal life” (p. 2). This right is considered to be
“essential to insure dignity and freedom of self-
determination” (p. 2).

The invasion of privacy issue usually becomes most
controversial with personality tests because items relating to
motivational, emotional, and attitudinal traits are sometimes
disguised.

Inviolacy

Whereas concerns about invasion of privacy relate to the
discovery and misuse of information that clients would rather
keep secret, inviolacy involves the actual negative feelings
created when clients are confronted with the test or test
situation.

Inviolacy is particularly relevant when clients are asked to
discuss information they would rather not think about.

Labeling and Restriction of Freedom



A major danger is the possibility of creating a self-fulfilling
prophecy based on the expected roles associated with a
specific label.

Another negative consequence of labeling is the social stigma
attached to different disorders.

Self-acceptance of labels can likewise be detrimental. Clients
may use their labels to excuse or deny responsibility for their
behavior.
Competent Use of Assessment Instruments

To correctly administer and interpret psychological tests, an
examiner must have proper training, which generally includes
adequate graduate course work, combined with lengthy supervised
experience (Turner et al., 2001).

Examiners should also acquire a number of specific skills
(Moreland, Eyde, Robertson, Primoff, & Most, 1995; Turner et
al., 2001). These include the ability to evaluate the technical
strengths and limitations of a test, the selection of
appropriate tests, and knowledge of issues relating to the
test’s reliability and validity, and interpretation with diverse

Interpretation and Use of Test Results



Accurate interpretation means not simply using norms and cutoff
scores, but also taking into consideration unique characteristics of the
person combined with relevant aspects of the test itself. Whereas
tests themselves can be validated, the integration of information from
a test battery is far more difficult to validate

If there are significant reservations regarding the test interpretation,
this should be communicated, usually in the psychological report itself.

A further issue is that test norms and stimulus materials eventually
become outdated. As a result, interpretations based on these tests
may become inaccurate. This means that clinicians need to stay
current on emerging research and new versions of tests. A rule of
thumb is that if a clinician has not updated his or her test knowledge in
the past 10 years, he or she is probably not practicing competently.

Communicating Test Results



Psychologists should ordinarily give feedback to the client and
referral source regarding the results of assessment (Lewak &
Hogan, 2003; also see Pope, 1992 for specific guidelines and
responsibilities). This should be done using clear, everyday
language.

This involves understanding the needs and vocabulary of the
referral source, client, and other persons, such as parents or
teachers, who may be affected by the test results.

Maintenance of Test Security and Assessment Information



If test materials were widely available, it would be easy for
persons to review the tests, learn the answers, and respond
according to the impression they would like to make. Thus,
the materials would lose their validity.

This means that psychometricians should make all reasonable
efforts to ensure that test materials are secure. Specifically,
all tests should be kept locked in a secure place and no
untrained persons should be allowed to review them. Any
copyrighted material should not be duplicated.

In addition, raw data from tests should not ordinarily be
released to clients or other persons who may misinterpret
ETHICAL DECISION MAKING
(Forester-Miller & Davis, 1996; in Erford, 2013)

1. Identify the Problem. One should gather all relevant information and determine whether the problem is an
ethical, legal, practice-related, or other issue. If it is an ethical issue, continue with the process.

2. Apply the PAP CODE OF ETHICS FOR PHILIPPINE PSYCHOLOGISTS:

A. Bases for Assessment – based on substantial information and appropriate/adequate assessment
techniques and procedures. Acknowledge the limitation of our expert opinion if testing was not conducted or
test results were rather old

B. Informed Consent in Assessment – exemption:
i. when it is mandated by the law
ii. when it is implied such as in routine educational, institutional and organizational activity
iii. when the purpose of the assessment is to determine the individual’s decisional capacity

C. Assessment Tools – judiciously select and administer tests pertinent to referral and purpose of
assessment, norms are directly referable to the population of our clients
D. Obsolete and Outdated Test Results – do not base interpretations, conclusions and
recommendations on outdated test results

E. Interpreting Assessment Results – communicate results in a most understandable way and least
stigmatizing manner

F. Release of Test Data – data not to be used by other persons not involved. We do not release
raw scores, scaled scores, client’s actual test responses and notes or behaviors during exam
unless regulated by the court

G. Explaining Assessment Results – use layman’s terminologies and not technical language

H. Test Security – administration and handling of all test materials must be done by qualified
personnel.

I. Assessment by Unqualified Persons – except for training purposes with adequate supervision

3. Determine the Nature and Dimensions of the Dilemma. A psychometrician should consult the
moral principles that underlie the PAP Code of Ethics for direction, review current research, and seek
consultation to determine an appropriate course of action.

4. Generate Potential Courses of Action. A professional psychometrician should consult at least one
colleague to ensure that all potential courses of action are identified.

5. Consider the Potential Consequences of all Options and Determine a Course of Action. The impact
of potential consequences on the client and others should be considered in determining which option
is optimal for addressing the dilemma.

6. Evaluate the Selected Course of Action. This is to ensure that implementing that choice will not
create new or additional ethical dilemmas.

7. Implement the Course of Action. The psychometrician, with assistance from the
psychologist, should implement the selected course of action and follow up to ensure that the selected
action had the desired outcome.
1. Mismatched Validity
*Some tests are useful in diverse situations, but
no test works well for all tasks with all people
in all situations.
*It is important to note that as the population,
task, or circumstances change, the measures of
validity, reliability, sensitivity, etc., will also
tend to change.

*10 Fallacies in
Psychological Assessment
Pope, K.S. (2003; 2010). 10 fallacies in psychological assessment. Retrieved on June 2015
from http://kspope.com/fallacies/assessment.php
2. Biases
*Contrast bias. Involves making an evaluation based on
the standard of the preceding client (Wexley, Sanders, &
Yuki, 1973).
*Order effects. The primacy and recency effects,
collectively known as order effects, refer to the saliency
of information based on the timing and order in which
it is presented. The primacy effect can be seen
when information first presented to the assessor
influences the final judgment more than information
presented later during the session (Peters & Terborg,
1975). The reverse applies for the recency effect, where
information presented later in the session has a greater
influence on the final decision made (Morgeson &
Campion, 2010).
2. Biases
*Availability bias. Results from inaccurately basing the
frequency of events on the ease with which they can be
recalled from memory (Morgeson & Campion, 2010).
*Confirmation bias. Refers to the tendency to seek
evidence to confirm an initial preconception and ignore
any contradictory information (Dror & Fraser-McKenzie,
2008).
*Representativeness bias. Reflects the tendency of
people to judge the degree of relationship between two
things based on their similarity to each other (Morgeson
& Campion, 2010).
3. Confusing Retrospective & Predictive
Accuracy (Switching Conditional
Probabilities)
*“Affirming the consequent” logical fallacy:
People with condition X are overwhelmingly likely
to have these specific test results.
Person Y has these specific test results.
Therefore: Person Y is overwhelmingly likely to
have condition X.
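
The step from the first premise to the conclusion swaps two different conditional probabilities. Bayes' theorem (a standard result, not part of the original slide) shows how they are related; here R is shorthand for "these specific test results":

```latex
P(X \mid R) = \frac{P(R \mid X)\, P(X)}{P(R)}
```

Even when P(R | X) is close to 1, P(X | R) can be small if condition X is rare, that is, if P(X) is much smaller than P(R).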

4. Unstandardizing Standardized Tests

*Changing the instructions, or the test items
themselves, or the way items are
administered or scored

5. Ignoring the effects of Low Base Rates

* Example:
Task: To create a test that will identify crooked persons among
judicial candidates
Hypothetical Contention: Only 1 in 500 judicial candidates is
crooked
Supposed Reliability: .9
Number of Judicial Candidates: 5000
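
The arithmetic behind this example can be worked through directly. The sketch below assumes the .9 figure is read as the test's accuracy for both crooked and honest candidates (the slide labels it reliability); the function and variable names are illustrative:

```python
def screening_outcomes(n_candidates, one_in, accuracy):
    """Expected screening results when 1 in `one_in` candidates is crooked."""
    crooked = n_candidates // one_in
    honest = n_candidates - crooked
    true_positives = crooked * accuracy        # crooked and flagged
    false_positives = honest * (1 - accuracy)  # honest but flagged anyway
    flagged = true_positives + false_positives
    return true_positives, false_positives, true_positives / flagged


tp, fp, share_correct = screening_outcomes(5000, 500, 0.9)
print(round(tp), round(fp), round(share_correct, 3))
```

Under these assumptions the test flags about 9 of the 10 crooked candidates but also about 499 of the 4,990 honest ones, so fewer than 2 in 100 flagged candidates are actually crooked. That is the effect of a low base rate.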

6. Misinterpreting High Base Rates

* Example:
Out of 273 residents who sought mental health service after
an earthquake, 89% were diagnosed with PTSD related to the
earthquake. 92% of that figure belong to a specific faith.
Therefore, this faith makes a person vulnerable to PTSD. Or
more subtly, this faith might make it easier for people with
PTSD to seek mental health services.

7. Perfect Conditions Fallacy

* Various situational (examiner’s and test takers’ alike) and
environmental factors can confound the test results

8. Financial Bias

* The Specialty Guidelines for Forensic Psychologists


"Forensic psychologists do not provide professional
services to parties to a legal proceeding on the basis of
'contingent fees,' when those services involve the offering
of expert testimony to a court or administrative body, or
when they call upon the psychologist to make affirmations
or representations intended to be relied upon by third
parties."

9. Ignoring Effects of Audio-recording,
Video-recording or the Presence of
Third-party Observers

* Hawthorne Effect (also referred to as the observer
effect) is a type of reactivity in which individuals modify
or improve an aspect of their behavior in response to
their awareness of being observed.

10. Uncertain Gatekeeping
* Example:
* A 17-year-old boy comes to your office and asks for a
comprehensive psychological evaluation. He has been
experiencing some headaches, anxiety, and depression.
A high-school dropout, he has been married for a year
and has a one-year-old baby, but has left his wife and
child and returned to live with his parents. He works full
time as an auto mechanic and has insurance that covers
the testing procedures. You complete the testing.
