Glossary-Of-Terms JEREMIAH BABAO

Ability and Aptitude. 

The terms ability and aptitude are closely related and often difficult to distinguish from each other. Ability,
the mental or physical capacity to perform at a given level, is considered to be innate, therefore determined
genetically. According to psychological theory, it may be described as possession of one or more of the multiple
areas of intelligence that have been described by various theories and models. Aptitude may be described as the
proclivity to excel in the performance of specific tasks (as in, "she has a real aptitude for drawing").


Accountability

In assessment, accountability refers to holding individuals or institutions responsible for the outcomes of instruction. For
example, you might hear or read that "students are accountable for their school successes and/or failures," that
"teachers (or parents) are accountable for the performance of their students (or children)," or that "school principals
are accountable for the achievement of their schools."


Achievement

Is a measure of the quality and/or the quantity of the success one has in the mastery of knowledge, skills, or
understandings. References to academic achievement, for example, usually involve performance in such areas as
reading, mathematics, science, or social studies.

Achievement test batteries.

 Many schools test students using an array of subtests, in a number of academic content areas and at a
variety of grade levels under a single overall test name. For example, a particular "Test of Basic Skills" might
involve subtests of mathematical skills, language skills, and vocabulary.


Assessment

Involves the process of "taking stock" of, or understanding, an individual's characteristics, status, or
performance, and typically involves considering and interpreting information from several sources of data. It might
involve, for example, observations, interviews, or other kinds of information. (Compare with evaluation and measurement.)

Authentic assessment 

Refers to the evaluation of students' work on activities that students engage in that approximate realistic or
real-life tasks and performances, rather than answering traditional paper-and-pencil tests. Authentic tasks typically
require complex work, problem solving, and integration of a variety of knowledge and skills brought to bear on a
realistic task or challenge. For example, students might use grocery store ads, a shopping list, and a budget to spend
as a realistic alternative to completing a group of arithmetic "column addition" exercises on a worksheet.

Competency-based assessment. 

This phrase indicates that students will be evaluated against some specific learning, behavior, or
performance objective. This objective, and/or the level of performance that represents "competency" is clearly
established in the curriculum and represents an expected level of expertise or mastery of skills or knowledge.

Criterion-referenced testing

Refers to evaluating students against an absolute standard of achievement, rather than evaluating them in
comparison with the performance of other students. A standard of performance is set to represent a level of expertise
or mastery of skills or knowledge.

Derived scores or standard scores 

Transform raw scores (the actual number of correct responses) into values that allow us to compare one
student's performance in relation to the performance of others of the same age or grade, or to the highest possible
score on a test. Common standard scores are z-scores, T-scores, percentiles, and stanines. Derived or standard scores
are all computed by determining how far above or below the mean of all scores a student scores, and then
representing the results using a standard scale. [Editor's note: The article by Gregory Machek and Jonathan Plucker
in the December, 2003 issue of PHP included a chart illustrating many common derived or standard scores.]
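
As an illustration of "how far above or below the mean," the sketch below computes two of the standard scores named above. It assumes the conventional definitions (a z-score is the distance from the mean in standard-deviation units, and a T-score rescales z to a mean of 50 and a standard deviation of 10) and treats the list of scores as the whole comparison group. The function names and sample scores are hypothetical, not taken from any particular test.

```python
from statistics import mean, pstdev

def z_score(raw, scores):
    """Distance of a raw score from the group mean, in standard deviations."""
    # Assumption: `scores` is the entire comparison group, so the
    # population standard deviation (pstdev) is used.
    return (raw - mean(scores)) / pstdev(scores)

def t_score(raw, scores):
    """Rescale the z-score to a scale with mean 50 and standard deviation 10."""
    return 50 + 10 * z_score(raw, scores)

group = [100, 90, 80, 80, 70]  # hypothetical raw scores

for raw in group:
    print(raw, round(z_score(raw, group), 2), round(t_score(raw, group), 1))
```

A raw score exactly at the group mean yields a z-score of 0 and a T-score of 50; scores above the mean yield positive z-scores, and scores below it yield negative ones.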


Evaluation

Represents a judgment or determination of value (e.g., effective or ineffective, or below, at, or above grade
level) placed on some performance.

Formative evaluation 

Refers to any form of assessment, such as quizzes, tests, essays, projects, interviews, or presentations, in
which the goal is to give students feedback about their work while it is in progress, to help students correct errors or
missteps, or to improve the work along the way to the final product. In contrast, the goal of summative evaluation is to make a
judgment about a final product or about the quality of performance at the end of an instructional unit or course.

Grade equivalent score. 

A grade equivalent score describes a student's performance on a test in relation to a grade level and
number of months during the year of that grade. (A score of 8.2, for example, tells you that your child obtained the
same score on a test that an average student in the second month of the eighth grade would obtain.) Of course, if
your child is in the fifth grade, that's very good, but if your child is in the tenth grade, that's not so good!
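
A minimal sketch of how a score such as 8.2 can be read, assuming the digit after the decimal point is the month of the school year as described above. The function names are hypothetical.

```python
def split_grade_equivalent(score):
    """Split a grade equivalent such as 8.2 into (grade, month)."""
    tenths = round(score * 10)   # 8.2 -> 82, avoiding floating-point drift
    return divmod(tenths, 10)    # 82 -> (8, 2)

def compare_to_placement(score, current_grade):
    """Say whether the grade equivalent is above, at, or below the student's grade."""
    grade, _month = split_grade_equivalent(score)
    if grade > current_grade:
        return "above"
    if grade < current_grade:
        return "below"
    return "at"

print(split_grade_equivalent(8.2))    # (8, 2): eighth grade, second month
print(compare_to_placement(8.2, 5))   # above
print(compare_to_placement(8.2, 10))  # below
```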

High-stakes testing

Typically refers to major state or national standardized school achievement tests administered periodically
to students at various grade levels. The phrase "high stakes" is used to signify that these test results carry a great deal
of weight among school personnel, government agencies, politicians, community leaders, and the general public.
These test results often are used to make important decisions about students, teachers, and their schools, such as
graduation, grade promotions or retentions, selection for highly competitive programs or schools, or staffing and
budget decisions.


Intelligence

Over many years, the concept of intelligence has had many definitions. Intelligence has been defined, to
cite several examples, as the ability to think conceptually, to solve problems, to manipulate one's environment, or to
develop expertise. Some theorists have proposed that intelligence is mostly innate, inherited, or biologically-based,
and others have argued equally strongly that intelligence is influenced by one's environment. Issues regarding the
nature and breadth of intelligence continue to be topics of lively discussion among theorists and researchers in
several fields of study (including educational psychology, cognitive psychology, and sociology, for example).

Learning objective. 

A learning objective is a specific statement that describes what the student is to learn, to understand, or to be
able to do as a result of a lesson or a series of lessons.

Learning outcome. 

A learning outcome represents what the student actually achieved as a result of a lesson or a series of
lessons. The success of lessons may be influenced by the students' prior knowledge, their effort and attention,
teaching methods, resources, and time. Learning outcomes refer to the results of instruction, while learning
objectives refer to the intended goals and purposes of lessons.


Measurement

Is simply the process of assigning a number, or a score if you will, to some performance or product.
Examples would include grading a test or a homework assignment in terms of number or percent of correct or
incorrect responses.

Measures of central tendency 

Are quantitative (numerical) ways to describe the middle of a distribution of scores. Since most individuals
in a given population tend to exhibit middle levels of competence or presence of a characteristic, most people tend to
earn scores that are near the central portion of the normal curve (see definition, below). There are three common
measures of central tendency: mean, median, and mode. The mean refers to a numerical average of the scores. It is
obtained by adding all the scores and dividing their sum by the number of scores (e.g., scores of 100, 90, 80, 80, and 70
result in a mean of 84). The median is simply the middle score when all scores are placed in ranked order. The
median in our example would be 80 because it is the third score counted in from either direction. The mode is the
most often occurring score. In our example, the mode is 80 since it occurs more often than any other score.
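
The worked example above can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean, median, mode

scores = [100, 90, 80, 80, 70]  # the example scores from this entry

average = mean(scores)       # sum of the scores divided by how many there are
middle = median(scores)      # middle score once the scores are ranked
most_common = mode(scores)   # score that occurs most often

print(average, middle, most_common)  # 84 80 80
```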

Minimum competency 

Is a judgment of the lowest level of skill or knowledge a student must have attained to be considered
"competent" in that area. Minimum competency tests are often the focus of broad national educational efforts to
improve education. It is important to note, especially for high-ability students, that minimum competencies do not
represent an adequate standard or expectation of performance, nor do they imply proficiency in, or mastery of, the
content or skill being tested.

Normal curve ("bell curve"). 

The normal or "bell" curve is a common way of representing the distribution of scores for a particular
competence or characteristic in a large population. Since most individuals of any population would exhibit
"average" competence or presence of a characteristic, their scores appear in the middle area around the crest of the
curve. Those who exhibit exceptionally high or low competence or very great or very small presence of a
characteristic appear at either end of the curve's shape. [Editor's Note: The second part of this series, in the
December, 2003 issue of PHP, also included a diagram of the normal curve.]

Norm-referenced testing 

(Or norm-referenced assessment) refers to testing in which individuals' results are compared to some larger
group (such as a national or statewide sample of students). Usually, "norm" or "normal" groups are those in which
the students' scores are distributed in a "normal" (or "bell-shaped") pattern. In these cases, an individual's
performance is assessed in relation to where his or her score would fall under the normal curve.

Objective test items 

Require the student to select a specific response to a question that can be graded as either correct or
incorrect. They are easy to administer and score (and can often be machine-scored). Common examples of objective
test items include: true-false, multiple-choice, and matching questions.

Online assessment 

Is an assessment that is accessed on a computer via the Internet or a similar computer network. The
assessment or test is read online and the responses are given online by selecting or checking a choice by clicking the
mouse, typing a response, or perhaps even touching the computer screen with a special "pen" or speaking a response
aloud using voice recognition technology. Online assessment may also be a vehicle for submitting a portfolio of
student performances or completed assignments for the teacher to evaluate.

Percentile ranks 

Refer to an individual's standing in relation to the rest of the individuals in the norm or comparison group
(i.e., others who are taking the same test). If your child receives a percentile rank of 90, it means that your child
achieved a score equal to or better than 90 percent of the rest of the group with whom he or she is being compared.
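
A minimal sketch of this calculation, assuming the "equal to or better than" definition given above (test publishers may use slightly different formulas). The comparison-group scores and the function name are hypothetical.

```python
def percentile_rank(score, comparison_scores):
    """Percent of the comparison group whose scores are at or below `score`."""
    at_or_below = sum(1 for other in comparison_scores if other <= score)
    return 100 * at_or_below / len(comparison_scores)

# Hypothetical scores earned by the rest of the comparison group.
group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(percentile_rank(92, group))  # 80.0: equal to or better than 8 of the 10 scores
```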

Performance assessment

Refers to a system of evaluating individuals' abilities or achievements based on actual work or behavior.
Performance assessment focuses on the student's ability to apply what he or she has learned to a realistic task: a
problem or situation that might be encountered in real life.


Portfolios

Are collections of an individual's work. Some educators regard portfolio assessment as a better method of
observing and evaluating what learners truly know, understand, and can do than are tests and homework exercises,
for example. In typical classrooms that employ portfolios, students keep their work (quizzes, test papers, creative
writing, homework, book reports, project reports, art projects, etc.) in large folders, boxes, electronic files, or other
storage containers. They may keep all their work or, as is more typical and recommended as best practice, students
(on their own or with their teachers' guidance) periodically select samples of their work to illustrate their best
performances across a variety of activities. Students and teachers also may keep work samples of various degrees of
achievement to illustrate growth in ability over time or to help identify and illustrate particular weaknesses or
disabilities that require additional attention.

Power tests 

Typically have no time limits or very generous time limits so that the individual has sufficient time to
answer all questions. On a power test, the goal is to measure as much as the individual can do without the pressure
of time limits. (Compare with "speed tests.")


Profile

A student profile is often used to describe a student's characteristics and learning needs, to help guide
important educational decisions for a particular individual, or to guide individualized instructional planning. It may
contain many different kinds of data (including test scores, observations, anecdotal records, samples of student
work, or comments from cumulative records) that describe the student, the circumstances that prompted creating the
profile, questions or problems requiring resolution, and suggestions for making desired decisions.


Range

The range of scores is the difference between the highest and lowest recorded scores. If the lowest score is
28 and the highest is 98, then the range is 70.
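
The same subtraction as a short sketch. Only the lowest (28) and highest (98) scores come from the example above; the scores in between are hypothetical.

```python
scores = [28, 45, 67, 83, 98]  # hypothetical set; 28 and 98 are the extremes

score_range = max(scores) - min(scores)  # highest score minus lowest score
print(score_range)  # 70
```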


Reliability

Refers to the degree of consistency or dependability of a test. A reliable test will produce similar scores and
distributions whenever it is given to similar populations. Thus, if a student scored a 90 on an achievement test today,
then, if the test is reliable, the student's score would not differ substantially if the test were taken again another day.
Reliability may also mean that a student would earn similar scores on two different forms of a test, if tested at about
the same time.

Rubric

A rubric is a chart or plan that identifies criteria for evaluating a piece of a student's work, be it an essay
test, a paper, or some other student production. The rubric offers a description of the qualities or characteristics of
performance for several levels (such as: beginning, intermediate, or advanced, or needs improvement, adequate, or
outstanding) that the teacher or other evaluator may assign. The best rubrics offer the clearest details for each
category of evaluation so that a student's products can be evaluated consistently. Rubrics may be "analytic" or
"holistic." An analytic rubric specifies all the components of a perfect response and assigns point values to each
component. While holistic scoring also identifies a model or perfect answer, point values are not assigned. Thus,
holistic or global scoring is more subjective and may be less reliable than analytic scoring.
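
To make the analytic approach concrete, here is a rough sketch in which each component of a response carries its own maximum point value and the total is the sum of the points earned. The component names, point values, and function name are all hypothetical, chosen only for illustration.

```python
# Hypothetical components of an essay rubric and their maximum point values.
rubric = {"thesis": 4, "evidence": 4, "organization": 4, "mechanics": 4}

def score_response(points_earned, rubric):
    """Return (points earned, points possible) for one student response."""
    total = 0
    for component, maximum in rubric.items():
        earned = points_earned[component]
        # Each component is scored separately against its own maximum.
        if not 0 <= earned <= maximum:
            raise ValueError(f"{component}: {earned} is outside 0..{maximum}")
        total += earned
    return total, sum(rubric.values())

earned, possible = score_response(
    {"thesis": 3, "evidence": 4, "organization": 2, "mechanics": 4}, rubric
)
print(f"{earned} of {possible} points")  # 13 of 16 points
```

Because every evaluator scores the same named components against the same maximums, two evaluators are more likely to arrive at similar totals than with a single holistic impression.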

Speed tests 

Are tests with specific time limits. Such a test rewards individuals who can work fast to answer the test
items. Students with disabilities may be exempt from time limits set for speed tests. (Compare with power tests.)

Standardized tests 

Are instruments that are administered, scored, and interpreted in the same, pre-specified way by all users.
There are detailed instructions or rules for how a test is administered and scored. (One example of a well-known
standardized test is the Scholastic Aptitude Test or SAT.)


Standards-based

To put "standards-based" in front of such terms as instruction, assessment, testing, measurement, evaluation
and other terms typically means that whatever teachers teach and students do in class is evaluated against
specifically written and adopted standards, or goals and objectives, of achievement, usually written and adopted at
the state or national level.

Subjective tests 

Refer to the approach used to evaluate or score the student's response to a writing prompt, an open-ended
task or question, or a "free," unstructured response to a short-answer or essay question. Unlike objective tests, in
which the correct or incorrect answer selection is easily and quickly obtained, subjective assessments present a more
difficult challenge to score and require considerably more time to read and to analyze carefully and equitably.


Validity

Is a term that describes how well a test, or a test item, measures what it claims to measure, accurately
predicts a behavior, or accurately contributes to decision making about the presence or absence of a characteristic.


