1. There are several types of reliability and validity that are important to consider when designing employee selection tests and processes.
2. Test-retest reliability and inter-rater reliability measure whether results are consistent over multiple test administrations or ratings.
3. Content validity and criterion-based validity are also important, with content validity demonstrating that a test samples important job tasks, and criterion-based validity showing that test performance correlates with actual job performance.
According to US law and HR practices worldwide, one
of the fundamental principles of job selection is that
any information you use to select a candidate is defined as a test.
So let’s review possible items used to select an
individual for a job. Which of the following is a test?
1. Application blank or form
2. Paper-and-pencil or computer-based test
3. Selection interview
4. Background check
Answer: if any of the above is used to make a decision about a candidate, it is legally considered a test. The short answer is: all of the above!
Types of job interviews:
Unstructured job interviews –
There are a number of flaws in unstructured job interviews.
The basic problem with these types of interviews is that they do a poor job of predicting job performance; in other words, they have poor validity. This can be due to several reasons. First, different applicants are typically asked different questions, so it is hard to make fair comparisons between them. Second, the questions are often not directly related to the job. Maybe your interviewer discovers that you are a graduate of EDHEC and so is he. The rest of the conversation may veer off course, becoming more about his or her memories of EDHEC than about your specific job qualifications. Finally, especially in the unstructured format, interviewers have a hard time agreeing amongst themselves on what separates a great answer from a not-so-great one. The accompanying picture shows how judges gave very different scores for the same performance.
In the semi-structured interview:
The employer lists job-oriented questions ahead of time. The interview is set up so that each person is asked the same questions. Candidate answers are still open-ended, not multiple choice; note that this is the difference between the semi-structured and the completely structured interview format. However, a behavioral scoring guide of illustrative answers improves the reliability of scoring. This way, managers have the same frame of reference.
Finally, although structuring your interview helps to reduce interviewing errors, it is useful to keep them in mind, since they can still occur.

First is the halo effect, where a very positive (or negative) evaluation on one dimension clouds judgment of the other dimensions. Structuring the interview and asking specific questions tends to reduce the risk of a halo effect, since you have to score each question separately. The halo effect is aggravated when you are asked to give just one overall score for an interview.

Second is the similar-to-me effect, where similarities unrelated to the job color the evaluation. If you went to the same school, play the same sports, come from the same region, or share the same religious affiliation, all of these factors can cloud your judgment. A major French university study from a few years ago showed, for instance, that when resumes sent to companies were identical except for obvious differences in religious background (signaled by name or extracurricular activity), bias entered into the evaluation of candidates. Stereotyping is a related problem.

Candidate order can also make a difference: those interviewed first and last tend to be remembered more, and an extremely good candidate may make an otherwise acceptable candidate who follows look weaker. Again, structuring the interview reduces this problem somewhat, although it does not eliminate it altogether.

Another common error is that of first impressions. While the average length of an interview is 40 minutes, 33% of 2,000 surveyed bosses indicated they know within the first 90 seconds whether they will hire a candidate. Leniency error is another problem, where everyone is rated too high (or, in the case of strictness error, too low). A related error is central tendency error, which is rating everyone as average. You can learn more about some of these errors in the reading, Optional 1.3.

With work samples, candidates are presented with situations representative of the job for which they're applying, and are evaluated on their responses. Sometimes a short training session is given to show the candidate what to do, and then an observer watches to see how quickly he or she can learn the job being performed.
The management assessment center is another type of
job simulation, which is actually a series of job simulations. For professional positions, it may be important to see how the candidate works with others in a group or how the candidate presents his or her ideas to the rest of the group. Many different tasks are typically bundled together into an all-day event to which a number of candidates are invited at the same time. Many students may find themselves in this type of environment. Observers may take notes on how each candidate acts individually and with others. There may be several exercises planned throughout the day, including a role play, an in-basket simulation, a leaderless group discussion, and a business game. Our Career Service Center helps students learn to prepare for some of these exercises since they are commonly used by employers hiring EDHEC graduates. You can read more about management assessment centers in the section on job simulations in Reading 1.1 (see especially pages 15-19).
Types of reliability:
Reliability describes the consistency of scores obtained
by the same person when tested in different ways. One common type of reliability important in employee selection is test-retest reliability. This type of reliability asks: are the results similar after retesting? Consider this example. Imagine weighing yourself on a scale. The first time you step on the scale, you see you weigh 60 kilos. A few seconds later, without having eaten anything or exercised, you step on the scale again. What would you expect? Right, you would still weigh 60 kilos. If this is what happens, you would say that the scale is a reliable way to measure your weight. This is a type of test-retest reliability.
Inter-rater reliability is another important type of reliability, especially when several individuals in the organization assess candidates. With good inter-rater reliability, different raters come up with the same rating of an individual when they use the same test. Rating an application blank may be relatively easy. Getting the same grade from a job interview, especially one that is unstructured, as we will see, is a different story. In a later module, though, we will see how to improve the inter-rater reliability of the interview by structuring the questions and creating a more uniform score sheet for interviewing candidates.
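To make this concrete, here is a minimal sketch, with invented scores, of one simple way inter-rater reliability could be estimated: correlate the ratings that two interviewers independently give to the same candidates. It is only an illustration, not part of the course readings.

```python
# Hypothetical sketch: inter-rater reliability as the correlation between
# two interviewers' independent ratings of the same six candidates.
# All scores are invented for illustration.
import numpy as np

rater_a = np.array([4, 3, 5, 2, 4, 3])  # interviewer A, 1-5 scale
rater_b = np.array([4, 2, 5, 3, 4, 3])  # interviewer B, same candidates

# Pearson correlation; values near 1.0 mean the raters largely agree.
r = np.corrcoef(rater_a, rater_b)[0, 1]
print(f"Inter-rater reliability (r) = {r:.2f}")
```

A structured interview with a shared behavioral scoring guide is one practical way to push this correlation upward, which is exactly the point made later in the course.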
A third type of reliability, which is also important in the construction of good tests, especially when multiple items are involved, is internal consistency reliability; two examples are item-to-total reliability and split-half reliability. Suppose the test is a personality test measuring self-confidence. With split-half reliability, the score on the first half of the test should correlate with that of the second half. That is, if you scored high on the first part, you should score high on the second part, and if you scored low on Part I, you should score low on Part II. The same holds for average scores on both parts. Of course, individual scores may differ, but if the test is administered to a large group, this should be the general pattern.
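Here is a minimal sketch of the split-half idea, using invented scores: correlate the two halves of the test, then apply the standard Spearman-Brown correction to estimate the reliability of the full-length test.

```python
# Hypothetical sketch: split-half reliability for a multi-item test.
# Each position is one respondent; all scores are invented for illustration.
import numpy as np

first_half_scores  = np.array([18, 25, 12, 30, 22, 15, 27])  # sum of items in the first half
second_half_scores = np.array([20, 24, 10, 29, 23, 14, 28])  # sum of items in the second half

# Correlation between the two halves.
r_half = np.corrcoef(first_half_scores, second_half_scores)[0, 1]

# Spearman-Brown correction estimates reliability of the full-length test,
# since each half is only half as long as the real test.
r_full = (2 * r_half) / (1 + r_half)

print(f"Split-half correlation: {r_half:.2f}")
print(f"Spearman-Brown corrected reliability: {r_full:.2f}")
```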
Types of validity:
Employers demonstrate the content validity of a test by
showing that the test constitutes a fair sample of the job’s content. The basic procedure here is to first carry out a job analysis. This helps you to identify job tasks that are critical to performance. Next, randomly select a sample of those tasks to test. Criterion-based validity means demonstrating that those who do well on the test will also do well on the job, and that those who do poorly on the test do poorly on the job. In psychological measurement, a predictor is the measurement (in this case, the test score) that you are trying to relate to a criterion, such as performance on the job. We will come back to this.
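As a rough illustration of the predictor/criterion relationship (the data below are invented, not from the course reading), criterion-based validity is typically summarized as the correlation between test scores collected at selection time and a later measure of job performance, such as supervisor ratings.

```python
# Hypothetical sketch: criterion-based (predictive) validity as the correlation
# between a selection-test score (predictor) and later job performance (criterion).
# All data are invented for illustration.
import numpy as np

test_scores         = np.array([55, 78, 62, 90, 70, 48, 83])            # predictor, measured at hiring
performance_ratings = np.array([3.1, 4.2, 3.4, 4.8, 3.9, 2.7, 4.5])     # criterion, e.g. ratings after 6 months

# The validity coefficient: higher values mean the test better predicts performance.
validity = np.corrcoef(test_scores, performance_ratings)[0, 1]
print(f"Criterion-based validity coefficient = {validity:.2f}")
```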