Joseph McGeorge - FAA Testing Methods and Recommendations
Joseph McGeorge
University of Illinois
WRITTEN EXAMINATIONS AND PILOT COMPETENCE 2
Abstract
In aviation, student pilots must complete a list of requirements and pass several
evaluations to achieve certification. This paper isolates standardized written testing
within the pilot certification process and analyzes its role in learning among student pilots.
Empirical research on standardized testing is reviewed to assess its suitability as an
evaluation method, and qualitative changes to the system are recommended. The paper
culminates in a proposed study comparing the current system with criterion-based evaluation
methods that seem more appropriate for fostering true learning and mastery.
The modern process of pilot certification requires various steps which a student pilot
must accomplish in order to achieve a flight certificate. One of the requirements is to pass a
standardized written exam. Many written exams have merit, but this particular test has
inherent problems that I will target in this paper. First, standardized testing in a written
format may not be the most suitable method for evaluating pilots. Pilots are sometimes forced
to make time-critical decisions with life-or-death consequences; therefore, pilots must have
mastery of aeronautical knowledge so that correct judgments can be made in any situation. A
second problem concerns preparation for the written exam: all possible test questions have been
released to the public and are available through third-party vendors. Giving students the
questions and answers before they take an exam defeats the purpose of the evaluation.
If the preceding problems were not enough, students must receive a one-time endorsement
from a flight instructor to be authorized to take the written exam. The flight instructors who
provide these endorsements only need to review that a student has completed a home-study
course to make the endorsement. Instructors are left with a vague idea of how to evaluate a
student’s knowledge and the test becomes more about a passing grade than true comprehension
of all subject areas. I contend that the current method for instructing and evaluating pilot
knowledge is ineffective, and I would reason that there are better methods for encouraging
learning and mastery.
Several formats of evaluation are used in determining pilot competence, but emphasis is
given to the standardized written exam. Depending on the application, standardized achievement
testing has been shown to be useful, benign, or even harmful. Determining how tests influence
learning among pilots is crucial. Pilot evaluation is a multilayered process incorporating
several different methods; the three primary evaluations a pilot must accomplish to be
certified are a comprehensive oral exam, a flight exam, and a standardized
written exam. One might ask, why propose a study seeking deficiencies in one aspect of this
process? Even though pilots are put through rigorous training and exercises, there are still many
accidents that occur within the arena of general aviation each year. (General aviation is
considered to be flying activity other than commercial.) The 2007 Aircraft Owners and Pilots
Association (AOPA) Nall Report published figures for general aviation accidents using
raw data from the National Transportation Safety Board (NTSB). The figures for 2006
reported 1,319 general-aviation accidents in the United States, of which 73.8 percent
were pilot-related, meaning they involved improper action or inaction by the pilot
(AOPA, 2007). By contrast, only 88 accidents occurred in commercial carrier operations
over the same period, 13 of which resulted in a fatality (NTSB, 2007).
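To put these figures side by side, a back-of-the-envelope calculation using only the numbers cited above (the derived pilot-related count is computed here, not quoted from the report):

```python
# Back-of-the-envelope comparison using only the figures cited above
# (AOPA, 2007; NTSB, 2007); the pilot-related count is derived, not quoted.

ga_accidents_2006 = 1319       # total U.S. general-aviation accidents
pilot_related_share = 0.738    # fraction attributed to pilot action/inaction
carrier_accidents_2006 = 88    # commercial carrier accidents, same period

pilot_related = round(ga_accidents_2006 * pilot_related_share)
print(f"Pilot-related GA accidents: about {pilot_related}")  # -> about 973
print(f"GA accidents per carrier accident: {ga_accidents_2006 / carrier_accidents_2006:.1f}")
```

Roughly 970 accidents in a single year traced to the pilot, against fewer than 90 in all commercial operations.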
Why is commercial air travel so much safer than general aviation? Commercial
operations employ layers of expertise and content mastery in the organizational structure which
drastically improves safety and efficiency. In general aviation, the pilot is solely responsible for
the safety and events of a flight. It is my contention that many instances of pilot error are
caused by a lack of knowledge among these non-professional pilots. Furthermore, the evaluative methods
used during the certification process (the written examination in particular) may give pilots a
false sense of competence. A goal of this paper is to contribute ideas that will lead to greater
amounts of learning while a pilot is in training by devising more effective evaluation methods.
Improving mastery among pilots will increase overall expertise and reduce accident statistics.
Standardized testing is typically defended on grounds such as retention, equity, reliability,
and fairness, and general consensus holds that standardized written tests are an effective
means of comparing knowledge among students.
Haladyna (2002) writes that standardized tests can be used to “make comparisons among states,
school districts, schools, classes and students…and monitor academic achievement over time.”
Based on standardized test scores “a strength or weakness in the curriculum can be identified…
and results can be used to make changes in curriculum” (p.46). Scores can be assigned to
students which place them in percentiles of achievement based on how well they do on these
exams. The problem is that a great deal of research suggests that standardized tests may not be
as effective in determining the actual understanding of those who take them. Popham (1999)
writes that while standardized achievement tests are able to “make norm-referenced
interpretations of” student achievement, such “achievement tests should not be used to
evaluate the quality of education.” In aviation, exactly this misuse is often what takes
place with respect to the written exam.
Instructors and evaluators place emphasis on the scores that students achieve on these
exams. However, these judgments are flawed for the simple reason that students have access to
the question bank before they take the exam. The questions in the test guide booklet appear
verbatim on the written exam, so students can simply memorize the answers to
the questions they do not understand. The flaws in this method of evaluation are obvious but
emphasis is still placed on the score of the individual, even though it is a farce.
Empirical research is always a staple in guiding studies and providing evidence; however,
personal experience can go a long way in creating a compelling argument for change. On the
University of Illinois campus I teach a course for students pursuing their private pilot’s license.
Transfer students who already hold a private pilot’s license must also enroll in this course to ensure
they have adequate knowledge before proceeding in the curriculum. In one of my recent lectures
on aircraft performance, I asked my class to describe density altitude, and the answers I
received perfectly illustrated the disservice standardized written exams do to students. The
following dialogue transpired:
Me: John?
(Silence)
Me: Anybody? John, if you had to try and describe it based on what you know, what would you say?
Someone from the back: It’s the density of the air you are flying in.
Me: O.K., that has something to do with density altitude, but let me show you what this term really means.

This term should be review for students at this level, but they are still at a disadvantage
when describing it (remember, many of them are already private pilots). So how can a student
give an excellent definition of the term, but no one in the class can truly describe its
characteristics? Perhaps the students responded based on the material they use to study (i.e. the
written exam test guide). The following is one of the questions from the list of possible test questions:

[Test-guide question on pressure and density altitude omitted from this copy.]
Seeing that the correct answer is “B,” is it any surprise why the student answered as he did? In
fact, few students go beyond the knowledge of these basic definitions. Pressure altitude corrects
altitude for non-standard pressure, and density altitude corrects altitude for non-standard pressure
and temperature. The student who answered in class was very confident in his answer and is like
many of his peers who believe they have understanding of this concept. However, when pressed
for a truer description of it, they are unable to answer; much of the time, they cannot even explain its practical consequences.
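The definitional knowledge above can be made concrete with the standard flight-training rule of thumb for computing density altitude, sketched here in Python. This is the common training approximation, not the exact ICAO formula, and the airport elevation and temperature in the example are hypothetical:

```python
# A common flight-training rule of thumb for density altitude, not the
# exact ICAO formula; the elevation and temperature below are hypothetical.

def pressure_altitude(field_elev_ft, altimeter_inhg):
    """Roughly 1,000 ft of altitude per inch of mercury below standard (29.92 inHg)."""
    return field_elev_ft + (29.92 - altimeter_inhg) * 1000.0

def density_altitude(pa_ft, oat_c):
    """Add roughly 120 ft per degree Celsius above the ISA temperature."""
    isa_temp_c = 15.0 - 2.0 * (pa_ft / 1000.0)  # standard lapse: 2 C per 1,000 ft
    return pa_ft + 120.0 * (oat_c - isa_temp_c)

# A 30 C day at a 5,000 ft elevation airport with a standard altimeter setting:
pa = pressure_altitude(5000, 29.92)    # 5,000 ft
da = density_altitude(pa, oat_c=30.0)  # ISA here is 5 C, so DA = 5000 + 120 * 25
print(f"The airplane performs as if it were at {da:,.0f} ft")  # -> 8,000 ft
```

A student who can run this reasoning, rather than recite a definition, understands why a hot day at a high-elevation airport degrades takeoff performance.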
In 2002, a commercial pilot was ferrying an aircraft across country from a high altitude airport.
On the day of the flight, high temperatures and low pressures produced a high density altitude
(poor aircraft performance). The pilot started his takeoff run and after using the entire runway he
pitched up and barely cleared a perimeter fence. Shortly after becoming airborne, the aircraft
stalled and crashed over the side of a cliff. The NTSB noted in the accident report that one of the
primary causes of the accident was the pilot’s failure to plan and compensate for high density
altitude conditions (NTSB Database). It seems unthinkable that a pilot can get beyond the
commercial level without being challenged in his applicative understanding of such an important concept.
To this point, it is easy to see the persistent inadequacy of the written exam, but the
process raises questions of ethics as well. Deuink (1986) provides a succinct list of standardized
testing ethics stating that it is unethical for teachers to “tutor students on the specific subject
matter of an expected examination…examine the content of standardized tests and use that
specific content…use standardized test items on their own examinations…or try to improve pupil
performance by developing items that parallel those on standardized test” (p. 22-23). By these
standards, the FAA is simply authorizing the pilot community to be unethical with respect to the
exam. Instructors are not only telling their students to study questions like those on the written
exam; they are telling students to study the actual questions. This alone is a convincing reason to eliminate the written exam in its current form.
Standardized testing for school children is, and will likely remain, entrenched for several reasons. It is
used to rank students, provides a means of determining student progress on a year-to-year basis,
and provides a benchmark for students and schools across the country. These benefits are still
subject to the problems associated with standardized testing, but they are only mentioned so as to
draw a comparison between the goals of childhood education and the goals of airman
certification. The benefits of standardized testing during childhood education are attributed to
scalability and efficiency, serving the vast majority of children in this country. However, in
airman certification the number of individuals being served is drastically less. Therefore, the
benefits of standardized testing are negligible because pilots are not ranked among one another,
the tests are not ongoing (only one test is required per certificate), and since the questions are
available to the students before the test, there is no benchmark (or at best a false one) of
knowledge acquisition.
A highly prioritized goal of pilot training is not only learning but also retention. When
pilots are certified they no longer have the mentor/mentee relationship with a flight instructor
and must then rely on their own knowledge to make critical decisions about flying. Therefore,
the means of evaluation should be conducive to learning strategies that encourage retention.
Since the need to classify pilots by test scores is minimal, the need for norm-referenced
testing is minimal as well; the better alternative is criterion-referenced evaluation with an
emphasis on oral testing. Students must already pass a
comprehensive oral exam; however, the exam usually lasts less than two hours, which is a
questionable testing period to truly determine mastery in all aviation subject areas. Invariably,
items are not covered in depth but only require “definitional” answers that do not show mastery
as demonstrated earlier in this paper through the dialogue I had with John.
If evidence is needed to support the claim that private pilots lack mastery even after
certification, one needs to look no further than accident analysis data. On October 5, 2009 a
private pilot was involved in an accident when he encountered weather that was not appropriate
for his abilities. He made several radio calls and shortly thereafter lost control of the airplane
due to a lack of situational awareness. He crashed and was killed in the accident (NTSB
Database). A surprising element of this story is that the pilot was certified on April 13, 2008
which puts his training fairly close to the date of the accident. Clearly, this pilot was not
competent in issues of weather theory and decision making which ultimately led to his death.
Just like other pilots of his caliber, he was required to pass a written knowledge test as well as an
oral and practical exam to be certified. However, he still lacked the knowledge to make the
proper decisions related to weather. This pilot would be placed in the accident statistics
described as pilot-related. It is for case examples like this that I propose a more adequate evaluation process.
When criterion-referenced measurements are used, the evaluator is more concerned with
a student’s mastery. Airasian and Madaus (1972/1977) wrote that when using criterion-referenced
methods, “either a student is able to exhibit a particular skill, produce a specific” product,
or he is not, and that this approach “focuses attention upon a central aspect of the
teaching-learning process, namely, the criterion skills” (p. 331). In other words, the method
is entirely focused on the abilities and skills of the individual student. Criterion-referenced
measures are described as “indicating what proportion of some defined domain the examinee has
mastered” and are “most
suitable when the area to be measured is a domain that can be clearly defined, the number of
possible elements in it is within finite bounds, and a ‘sampling frame’ listing all the elements of
the domain exist or can be readily constructed” (Shaycroft, 1979, p.4). Since flying sometimes
requires moments of decision making that have life and death consequences, each domain should
be mastered to 100 percent of the standard. Morgan et al. (2004) recommend pass/fail grading when
the goal is to “determine whether or not students can operate technology or equipment to a
satisfactory standard” or “identify whether they can perform essential procedures” (p.253).
Pilot certification fits both of these descriptions, so evaluation should be conducted on a
pass/fail basis where mastery is seen as accomplishment of the desired standards. If performance
on knowledge items is not satisfactory, remediation should be given until the student is able to demonstrate mastery.
A written exam does the opposite for students, convincing them that mastery is not
required but a percentage of understanding is acceptable. The question must be asked, what does
the nation expect from its pilot force? Is making the right weather decision acceptable if made
80 percent of the time? Is knowing 85 percent of the skills needed to land an airplane
acceptable? What about knowing only 70 percent of the rules of aviation? Any reasonable
person would answer no to these questions. Most would say that a pilot should be able to
perform all of these tasks without error. When lack of mastery exists, consequences such as the
1,300 accidents that occur each year are the result. Is this byproduct of ignorance an acceptable
cost to ensure a timely certification process? Some accidents are unavoidable, but those that stem from a lack of mastery are not.
If a pilot has not mastered any portion of a domain, the incomplete education could cost
him his life (e.g. density altitude). With respect to dividing content into domains as an
evaluative guide, the FAA has already established domains in a publication called the Practical
Test Standards (PTS). These domains are used by FAA examiners to administer the final oral
exam. Because of time limitations, FAA examiners are unable to cover every subject in the
depth that is necessary. Instead of this one final oral exam, I propose a stepped process in
which students take several different oral examinations whose testing criteria are separated into related
domains. Testing would be spread over a period of time, which should increase retention and
comprehension. The PTS could be expanded to elaborate further on the content domains so that
more adequate questioning can be imposed and the testing process will be less vague. Roediger
and Karpicke (2006) found that when students were given multiple tests over a period of time on
a reading task, they were more apt to recall the information they had learned when compared
with those who were tested less. Even though the type of testing differs, the increased
frequency of testing draws on the effect Roediger and Karpicke demonstrated.
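The stepped, criterion-referenced process described above can be sketched in a few lines. This is an illustrative sketch only: the domain names are hypothetical stand-ins for the PTS areas of operation, and `examine`/`remediate` stand in for an instructor’s oral exam and the student’s restudy:

```python
# A minimal sketch of the proposed stepped evaluation. Domain names are
# hypothetical stand-ins for PTS areas of operation; "examine" and
# "remediate" stand in for an oral exam and the student's restudy.

PTS_DOMAINS = [
    "Regulations", "Weather Theory", "Aircraft Performance",
    "Navigation", "Aeromedical Factors",
]

def certify(examine, remediate):
    """Give each domain exam pass/fail; remediate and retest until mastery."""
    endorsements = {}
    for domain in PTS_DOMAINS:
        attempts = 1
        while not examine(domain):       # a fail means restudy, then retake
            remediate(domain)
            attempts += 1
        endorsements[domain] = attempts  # per-domain logbook endorsement
    return endorsements

# Simulated run: the student fails Weather Theory once, passes on the retake.
results = iter([True, False, True, True, True, True])
log = certify(lambda domain: next(results), lambda domain: None)
print(log)  # Weather Theory records 2 attempts; the rest pass on the first try
```

The key design point is that the loop cannot exit a domain without a pass: partial credit does not exist, only mastery or further remediation.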
FAA Limitations
It is important here to understand the limitations of the FAA and its evaluative abilities.
To extend oral examinations into several meetings would undoubtedly overextend the
capabilities of the designated examiners which would slow certification to a crawl. Instead, I
propose that the stepped evaluations be administered by authorized flight instructors. After each
of the oral sessions, the examining instructor would be required to make an endorsement in the
student’s logbook certifying the student’s mastery of the tested domains. But are instructors
qualified to administer these exams? Consider the following. Flight instructors already endorse
students for the practical examinations, so allowing them to administer preliminary evaluations is
appropriate. Also, flight-schools with examining authority already have instructors who are
authorized to administer comprehensive examinations. Clearly the FAA already has confidence
in instructors’ evaluative abilities. Once a student has passed all of the
preliminary evaluations, he would then meet the regulatory requirement to schedule a final oral
exam with an FAA examiner. In this way, the FAA is still able to maintain meaningful
oversight.
Study Proposal
A study must control extraneous variables in order to test the central question with little
outside interference. This study hypothesizes that students who achieve a private pilot certificate
based on the stepped method of oral evaluations will be better educated, retain more, and be
better able to apply aeronautical knowledge than those who use the traditional method of
certification. In this study, a control group would need to continue the certification process as it
stands and an experimental group would be allowed to pursue their rating using the stepped
evaluation approach as described (20 students in each group). This experiment should take place
at a flight school large enough to meet the sample size and participant requirement. Flight
schools typically have standard procedures that are generally followed on a staff-wide basis;
therefore, individual differences in instruction should be negligible between the control and
experimental groups. However, an instructor should be able to prepare his students to the extent
necessary to pass their exams. The two groups should be composed of students who only intend
to achieve a private pilot’s certificate. In the experimental group, participants would take five
domain-referenced oral exams based on PTS subject matter before the final, cumulative oral
exam.
After completion of each domain exam, a pass/fail grade will be given. If a student fails,
he would restudy until his flight instructor determines he is competent in the respective
domain and then retake the failed exam. Each domain exam will be taken in this manner, and
together these exams will replace the regular written examination. Flight instructors
who administer these exams should be selected based on experience. More experienced pilots
will be able to make professional decisions as to the competence of the participant based on the
respective domains. Through this process, it is likely that most will go on to achieve their
private pilot’s certificate; however, the study will also evaluate retention through the use of post-
tests.
Post-tests would be administered one year following the certification date. The groups
would be tested to determine which has retained a greater level of knowledge from their training.
The post-tests will be administered as a comprehensive oral exam, where selected concepts from
each respective domain will be tested to compare retention among the groups. It is important to
post-test only those participants who did not pursue further training during the one-year
period, as additional training could skew the results. If the experiment proves positive, the group subjected to more
domain-referenced oral evaluation will have built greater mastery and retained more knowledge.
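The planned comparison of post-test scores could be analyzed along these lines. A minimal sketch with invented placeholder scores (all numbers are for illustration only); a real study would use the oral post-test results and an established statistics package:

```python
# A sketch of the planned post-test comparison (20 students per group).
# All scores below are invented placeholders; a real analysis would use
# the oral post-test results and a proper statistics package.
import math
import statistics

def welch_t(a, b):
    """Welch's two-sample t statistic (robust to unequal variances)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

control      = [62, 70, 58, 66, 71, 64, 60, 68, 65, 59, 63, 67, 61, 69, 66, 64, 58, 72, 65, 61]
experimental = [74, 80, 71, 77, 83, 76, 70, 79, 75, 72, 78, 81, 73, 76, 82, 75, 71, 84, 77, 74]

t = welch_t(experimental, control)
print(f"Welch t = {t:.2f}")  # a large positive t favors the stepped-evaluation group
```

With only 20 students per group, the study should report the effect size alongside the significance test, since a small sample limits statistical power.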
The question may be asked, why not approach enhancements in learning from an
instructional perspective rather than an evaluative perspective? The problem is that flight
instruction is based on a mentoring relationship. The FAA cannot control the interaction
between every student and flight instructor. The more feasible option is to affect instruction
through evaluation. Currently, a student’s own flight instructor alone determines if a student
is ready for a certification exam. It makes more sense that other parties be involved
to expose weaknesses in a student’s knowledge. In turn, a positive feedback loop will result where
instructors will receive more information about their students’ knowledge and comprehension. If
instructors are unknowingly omitting important information, having checks and balances with
other instructors will undoubtedly expose those deficiencies. Overall, using this evaluation
process will result in more learned students and more highly effective instructors.
A full scale implementation of this type would be a large undertaking. Assigning new
responsibilities to flight instructors, adjusting paperwork, and creating new regulations are all
tasks that can take time and effort. However, depending on the success of the study, the
governing body must ask itself if a change in the certification process would save lives. If the
answer is yes, then the FAA is obligated to enact such changes in the aviation industry. Aviation
is a community and all are responsible for working toward safety. Most would welcome the
change and work diligently to see its success because the benefits clearly outweigh the costs.
Standardized written exams will continue to be the status quo unless those who see the
deficiencies fight to change the process. If the change saved even one additional life, then the
efforts were not in vain, because that life could belong to a family member or friend.
References
Airasian, P. W., & Madaus, G. F. (1972/1977). Boston, MA: Allyn and Bacon, Inc. (Reprinted from Criterion-referenced testing in the …)
AOPA Air Safety Foundation. (2007). 2007 Nall report: Accident trends and factors for 2006.
Deuink, J. W. (1986). The proper use of standardized tests. Greenville, SC: Bob Jones University Press.
Jeppesen. (2008). Private pilot FAA airman knowledge test guide. Englewood, CO: Jeppesen Sanderson.
Morgan, C., Dunn, L., Parry, S., & O’Reilly, M. (2004). The student assessment handbook. New York, NY: RoutledgeFalmer.
National Transportation Safety Board. (2010). Review of aircraft accident data: U.S. air carrier operations, calendar year 2006. Washington, DC: Author.
Popham, W. J. (1999). Why standardized tests don’t measure educational quality. Educational Leadership, 56(6), 8-15.
Roediger, H. L., & Karpicke, J. D. (2006). Test-enhanced learning: Taking memory tests improves long-term retention. Psychological Science, 17(3), 249-255.