
Measuring Learning with WBIEG Level-2 Evaluation


By Violaine Le Rouzic, Evaluation Officer, WBIEG

March 27, 2005

What is a Level-2 evaluation?
 To help refine a course by:
 Measuring how much participants learned
 Assessing what participants learned
Evaluation Design
 Test all participants’ knowledge of course
contents at the start & the end of the course
 Same difficulty on pre- & post-tests, but
different items, to avoid pre-test recall effect
Main Indicator
 Group’s Learning Gain = Post-test – Pre-test
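As an illustration of the main indicator, the sketch below computes a group's learning gain as the mean post-test score minus the mean pre-test score. The function name `learning_gain` and the scores are hypothetical; the slide does not say whether scores are raw counts or percentages, so percentages of correct answers are assumed here.

```python
from statistics import mean

def learning_gain(pre_scores, post_scores):
    """Group learning gain = mean post-test score - mean pre-test score.

    Assumption: each score is a participant's percentage of correct
    answers (0-100) on the corresponding test.
    """
    return mean(post_scores) - mean(pre_scores)

# Hypothetical scores for five participants (illustrative only).
pre = [40, 55, 35, 50, 45]
post = [70, 80, 60, 75, 65]

gain = learning_gain(pre, post)
print(f"Learning gain: {gain:.1f} percentage points")
```

Because the indicator is a group-level difference, it describes the course, not any individual participant.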
What is the WBIEG Level-2
Evaluation Toolkit?
 A set of guidelines, templates, databases,
and a macro enabling course teams to assess
their participants’ learning.

 Adapts psychometrics to the WB context to
measure learning with fair confidence
(Tradeoff between science and feasibility)

Main Toolkit features
 Measures the participant group’s learning in a course
 Practical, short, step-by-step
 Tasks divided between content experts & assistants
 Accounts for short preparation time, few test takers
 No evaluation knowledge required
 Need basic Word, Excel & Internet browsing skills
 Test form templates in over ten languages

Not: theoretical, state-of-the-art psychometrics

Not: for certification of individual participants

For whom is the Toolkit?
 World Bank course teams (for courses with
external and/or internal participants)

 World Bank managers or any WB staff who
want to compare learning outcomes data
across various criteria (division, years, etc.)

 Course teams in other organizations can use
the evaluation tools on the external WB website
(but the test items and results database is for
World Bank users only.)
When to use the Toolkit?
 Level-2 evaluation is feasible:
 Learning knowledge/skills is the main objective
 Learning objectives are clear before the course
 Every participant follows the same curriculum
 Worth investing in Level-2 evaluation:
 Course will be offered again
 Long enough (at least 1 week recommended)
 Many participants (30 or more recommended)
 Enough resources and commitment:
 Two staff weeks’ time for the evaluation
 Commitment to use the evaluation results
Toolkit’s 13 Steps
1. Plan the evaluation
For course director
 Course team resources needed:
 1 week of the content expert team (can be split)
 1 week of assistants
Less time required on subsequent L2 evaluations
 Through the course cycle
 From early design stage to course re-design
 Most time for test development before delivery
 Start at early course design stage
2. Map the test
For course director
 Build a test specification matrix to determine:
 Which content areas should be tested
 To which cognitive domain each area relates
 How many test items are needed per content area
and cognitive domain (Recommended minimum
total: 20 item pairs per test)

 Objective:
 Make the test representative of the course content
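A test specification matrix of the kind described above can be sketched as a mapping from (content area, cognitive domain) to a number of item pairs. The content areas, domains, and counts below are hypothetical placeholders; only the recommended minimum of 20 item pairs comes from the slide.

```python
# Hypothetical content areas and cognitive domains (illustrative only).
spec = {
    ("Project design", "Knowledge"): 4,
    ("Project design", "Application"): 3,
    ("Monitoring", "Knowledge"): 3,
    ("Monitoring", "Application"): 4,
    ("Reporting", "Knowledge"): 3,
    ("Reporting", "Application"): 3,
}

MIN_ITEM_PAIRS = 20  # recommended minimum total item pairs per test

def totals_by_area(spec):
    """Sum the planned item pairs for each content area."""
    totals = {}
    for (area, _domain), n_pairs in spec.items():
        totals[area] = totals.get(area, 0) + n_pairs
    return totals

total_pairs = sum(spec.values())
meets_minimum = total_pairs >= MIN_ITEM_PAIRS

print(totals_by_area(spec))
print(f"Total item pairs: {total_pairs}, meets minimum: {meets_minimum}")
```

Comparing the per-area totals with the time the course spends on each area is one way to check that the test is representative of the course content.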
3. Review past items (optional)
For content experts of World Bank only
 Consult a database with over 5,000 items used in
over 100 WB courses with Level-2 evaluations
 Search for keywords in offering titles or items
 Potential benefits
 Save time if some items fit your needs
 Identify issues to avoid from past items
 Get ideas on writing new items
 Item quality is context-specific; don’t re-use blindly!

4. Write items
For content experts
Match the content area and cognitive
domain of the test specification matrix
Test items use multiple-choice format
All items have five response options
(Last option is always “I don’t know.”)
Average difficulty level
Clearly stated
5. Pair items
For content experts
 For each item, write an equivalent item
 Same difficulty level
 Same content area
 Same cognitive domain
 Same length
 Same format
 Examples in guidelines
 Objective: Make pre- and post-test equivalent
6. Pilot tests
For content experts (with assistants)
 Have volunteers take the tests (or part of the
test) before the course. Volunteers can be:
 Other content experts (to check key)
 Alumni
 Participant look-alikes
 Non-content experts
BUT NOT the actual participants!
 Collect comments and demographics with
test responses. Test without, then with, the key.
7. Review test items

For content experts (with assistants)

Use the pilot test responses and the
Toolkit checklist to review each item for:
 content
 wording
 statistical properties, using item analysis (if any)
Finalize the items
8. Produce test forms
For assistants
 Use automated template to randomly assign
items to either pre- or post-test
 Use test form templates (customize the
templates, as needed)
 Use formatting and production guidelines

Poor test form production can ruin all results!
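The Toolkit's automated template is not described in these slides. As a rough sketch of the idea, one plausible scheme sends one item of each equivalent pair to the pre-test and the other to the post-test, chosen at random. The function `assign_pairs`, its `seed` parameter, and the item IDs are assumptions made for illustration.

```python
import random

def assign_pairs(item_pairs, seed=None):
    """Randomly send one item of each pair to the pre-test, the other to the post-test.

    item_pairs: list of (item_a, item_b) tuples of equivalent items.
    seed: optional seed so the assignment can be reproduced.
    """
    rng = random.Random(seed)
    pre_test, post_test = [], []
    for item_a, item_b in item_pairs:
        # Flip a fair coin to decide which item of the pair goes first.
        if rng.random() < 0.5:
            item_a, item_b = item_b, item_a
        pre_test.append(item_a)
        post_test.append(item_b)
    return pre_test, post_test

# Hypothetical item IDs (illustrative only).
pairs = [("Q1a", "Q1b"), ("Q2a", "Q2b"), ("Q3a", "Q3b")]
pre_test, post_test = assign_pairs(pairs, seed=2005)
print("Pre-test: ", pre_test)
print("Post-test:", post_test)
```

Assigning within pairs keeps both forms the same length and covers the same content areas, while the coin flip avoids systematically placing the easier item of a pair on one form.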
9 & 10. Collect test forms
For any organizer on site
 Collect pre-test at course start and post-test
at course end
 Have all participants answer both tests
 Have all participants write their codes on both
forms to match results by respondent
 Explain evaluation objectives & confidentiality
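Matching results by respondent code can be sketched as follows. The function `match_forms`, the codes, and the scores are hypothetical; the sketch assumes each form has already been scored and keyed by the code the participant wrote on it.

```python
def match_forms(pre_scores, post_scores):
    """Pair pre- and post-test scores by participant code.

    Returns the matched scores plus the codes that appear on only one form,
    so unmatched respondents can be reported separately.
    """
    matched = {
        code: (pre_scores[code], post_scores[code])
        for code in sorted(pre_scores.keys() & post_scores.keys())
    }
    pre_only = sorted(pre_scores.keys() - post_scores.keys())
    post_only = sorted(post_scores.keys() - pre_scores.keys())
    return matched, pre_only, post_only

# Hypothetical participant codes and scores (illustrative only).
pre_scores = {"K7": 45, "M2": 60, "Q9": 30}
post_scores = {"K7": 70, "M2": 85, "Z4": 65}

matched, pre_only, post_only = match_forms(pre_scores, post_scores)
print("Matched:", matched)
print("Pre-test only:", pre_only, "| Post-test only:", post_only)
```

Codes rather than names keep the forms confidential while still allowing each participant's two forms to be linked.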
11. Compute results

For assistants

Follow tabulation guidelines

Enter responses in tabulation template
Click macro for automatic item analysis
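The slides do not list which statistics the macro computes. As a hedged illustration of what an item analysis commonly includes, the sketch below computes two standard measures: item difficulty (proportion answering correctly) and an upper-lower discrimination index. The function names, answer key, and responses are assumptions, with option "E" standing in for "I don't know."

```python
def item_difficulty(responses, answer_key):
    """Proportion of respondents choosing the keyed option, per item."""
    n = len(responses)
    return [
        sum(r[i] == answer_key[i] for r in responses) / n
        for i in range(len(answer_key))
    ]

def item_discrimination(responses, answer_key):
    """Upper-lower index: proportion correct in the top-scoring half
    minus proportion correct in the bottom-scoring half, per item."""
    def total(r):
        return sum(a == k for a, k in zip(r, answer_key))
    ranked = sorted(responses, key=total)
    half = len(ranked) // 2
    low, high = ranked[:half], ranked[-half:]
    return [
        sum(r[i] == answer_key[i] for r in high) / half
        - sum(r[i] == answer_key[i] for r in low) / half
        for i in range(len(answer_key))
    ]

# Hypothetical key and responses: 4 respondents, 3 items (illustrative only).
answer_key = ["A", "C", "B"]
responses = [
    ["A", "C", "B"],
    ["A", "C", "D"],
    ["A", "B", "E"],
    ["B", "E", "E"],
]

difficulty = item_difficulty(responses, answer_key)
discrimination = item_discrimination(responses, answer_key)
print("Difficulty:    ", difficulty)
print("Discrimination:", discrimination)
```

A low proportion correct may point to an item that is too hard or to content the course did not teach well, and a low or negative discrimination index may point to an item that confuses high scorers.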

11A. Results example:
a course’s learning gain and post-test score
[Figure: pre- and post-test scores (matched respondents); post-test scores (all respondents)]
Learning gain: green if statistically significant; orange if not.
Compare with other courses
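The results display flags the learning gain as green when it is statistically significant and orange when it is not, but the slides do not name the test used. One common choice for matched pre- and post-test scores is a paired t-test, sketched below with the standard library only. The function `paired_t`, the scores, and the choice of a two-sided 5% level are assumptions.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """t statistic for the mean of paired differences (post - pre)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Two-sided 5% critical value of Student's t with 4 degrees of freedom
# (n - 1 for the five matched respondents below).
T_CRITICAL_DF4 = 2.776

# Hypothetical matched scores (illustrative only).
pre = [40, 55, 35, 50, 45]
post = [70, 80, 60, 75, 65]

t = paired_t(pre, post)
colour = "green" if abs(t) > T_CRITICAL_DF4 else "orange"
print(f"t = {t:.2f} -> {colour}")
```

With a different number of matched respondents the critical value changes, so in practice it would be looked up for the actual degrees of freedom or replaced by a p-value from a statistics package.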
11B. Results example:
distribution of a course’s
pre- and post-test scores
[Figure: score distributions; annotation truncated in source: “Post-test to the right of”]

11C. Results example:
post-test responses by item
[Figure: items’ text and responses]
Check: item confusing or misconception not overcome
11D. Results example:
a course item analysis
[Figure: item analysis output]
Check: item too hard or course did not teach this well
Check pre-post equivalence
Most items statistically OK
Confused high scorers
Compare reliability (WB only)

12. Send results to WBIEG

For assistants (WBI courses only)

WBIEG will:
Check quality of processing
Include test items and results in database
Report evaluation efforts and results on

13. Interpret results

For course director & content experts

Review and interpret results using:
Interpretation guidelines
Results database (WB only)
Toolkit glossary
Decide how to improve next offering
and next test

Thanks to:
 Developers: Joy Behrens, Guangbin Liu,
WBIEG staff
 Advisors: Marlaine Lockheed, William
Eckert, Sukai Prom-Jackson, Zhengfang
Shi, Gary Echternacht …

Main contact: Violaine Le Rouzic

