
The Determinants and Performance Effects of Supervisor Bias

Jasmijn C. Bol IESE Business School University of Navarra E-mail: jbol@iese.edu December 2006

Abstract

This paper examines the determinants and performance effects of leniency and centrality bias. An empirical analysis of a compensation plan for low-level employees with both objective and subjective performance measures leads to two key results. First, the causes of supervisor bias include employee performance, differences in organizational hierarchy, the financial position of the firm, the length of the employee-supervisor relationship, and supervisor characteristics. This indicates that supervisors take their utility into consideration when appraising employee performance. Second, supervisor bias affects future employee incentives. Contrary to previous assumptions, the results show that biased performance ratings can have both positive and negative effects on incentives. Leniency bias positively affects performance improvement, while centrality bias has a negative effect on performance.

Acknowledgements: I want to thank Stan Baiman, Antonio Dávila, Henri Dekker, Chris Ittner, Frank Moers, Fernando Peñalva, Joan Enric Ricart, Josep Maria Rosanas, Stan Veuger and seminar participants at the 2006 Global Management Accounting Research Symposium in Copenhagen, the 2006 AAA Annual Meeting in Washington D.C., and the University of Pennsylvania for their comments and suggestions.

I. INTRODUCTION

This paper provides empirical evidence on the determinants and performance effects of supervisor bias. Knowledge of supervisor bias is important for managers in charge of incentive design. Understanding the causes of supervisor bias, and the consequences of supervisor bias on future performance, can help compensation system designers make better-informed decisions about how much supervisor discretion to allow for in performance-based compensation plans.

In this paper, I focus on two well-known forms of supervisor bias: leniency bias and centrality bias (Prendergast, 1999). Leniency bias is the tendency to provide employees with inflated subjective performance ratings (Bretz, Milkovich, & Read, 1992), while centrality bias is the tendency to compress performance ratings, creating less variance in performance ratings than in actual performance. Although there is considerable empirical evidence indicating that the use of subjective performance measures leads to lenient and compressed ratings (Landy & Farr, 1980; Murphy & Cleveland, 1991; Bretz et al., 1992; Prendergast & Topel, 1993; Jawahar & Williams, 1997; Prendergast, 1999; Moers, 2005), empirical evidence on how supervisor bias actually influences employee behavior is almost entirely absent. This paper extends the current literature by focusing both on what causes a supervisor to bias performance ratings and on how biased ratings affect the incentive effects of performance-based compensation contracts.

I study a compensation plan for low-level employees of a financial service provider. The incentive system includes both objective and subjective performance measures and gives supervisors some freedom to make ex post modifications to financial rewards. I use contract design and performance data from five branch offices for 2003 and 2004. I begin my analysis by establishing that the performance ratings on the subjective dimension are subject to leniency and centrality bias.
I then examine why the performance ratings of some employees are subject to more leniency and centrality bias than those of others. Finally, I examine how these biases affect performance change between 2003 and 2004.

The results indicate that supervisors take their utility into consideration when appraising performance. I find that differences in organizational level between employee and supervisor, and the financial performance of the company, are determinants of leniency and centrality bias. This is consistent with supervisors taking information-gathering costs and attitudes towards rating accuracy into consideration when rating behavior. Supervisor characteristics, the length of the employee-supervisor relationship, and employee performance influence the extent of leniency bias, suggesting that supervisors consider the cost of communicating evaluations and how employee ratings reflect on their own management capabilities. Finally, performance ratings influence incentive provision. Leniency bias has a positive effect on performance improvement, while centrality bias has a negative effect, for both above-average and below-average performers.

The study makes several important contributions to the performance evaluation and compensation literature. This is the first paper that empirically investigates how biased performance ratings influence the effectiveness of a performance-based compensation contract. Prior studies have established the existence of centrality and leniency bias, but not the consequences of these biases on future performance. Moreover, this paper not only presents evidence indicating that the use of subjective performance measures leads to biased performance ratings; it also provides insights into the determinants of supervisor bias. Finally, this paper offers a detailed description of the salient features of an incentive system for low-level employees and of the supervisors' and employees' reactions to this system. This gives us insight into the compensation practice for low-level employees, an area in which detailed information is scarce (Indjejikian, 1999).

This paper consists of five sections. Section II reviews research on supervisor bias in compensation contracts and develops the hypotheses. The research design is presented in Section III, and the results are analyzed in Section IV. The final section summarizes the results and discusses future research possibilities.

II. HYPOTHESES

Most papers dealing with subjectivity in compensation contracts focus on the determinants of subjectivity. These studies examine the role of subjectivity in incentive systems and indicate how introducing subjectivity improves contracts by mitigating incentive distortions or reducing risk (Baker, Gibbons, & Murphy, 1994; Baiman & Rajan, 1995; Hayes & Schaefer, 2000; Ittner, Larcker, & Meyer, 2003; Gibbs, Merchant, Van der Stede, & Vargus, 2004). However, the fact that performance evaluation is subject to supervisor discretion can also give rise to a number of problems, the most prominent of which is rating inaccuracy.[1] Since the correctness of subjective performance evaluations cannot be assessed by outside parties, supervisors can take their own preferences into consideration when rating employee performance. One documented consequence of allowing supervisors to apply discretion is biased performance ratings (Landy & Farr, 1980).

A long line of research, mainly in human resources management, investigates performance rating accuracy in subjective performance appraisal and presents considerable evidence on biased performance ratings (e.g.,

[1] Inaccurate performance ratings are clearly not the only potential negative consequence of introducing subjectivity into explicit compensation contracts. Other concerns, such as unclear measurement criteria, are also likely to influence the effectiveness of the compensation contract. In this paper, I limit the discussion to examining the effects of biased performance ratings.

Murphy & Cleveland, 1991; Bretz et al., 1992; Jawahar & Williams, 1997; Moers, 2005). These papers find that supervisors tend to rate leniently and that they do not differentiate strongly between employees when evaluating performance.

The extent to which ratings are biased is not the same for different supervisors or for different employees evaluated by the same supervisor. Supervisors are predicted to vary the amount of bias applied to the performance ratings depending on how rating bias influences their own utility.[2] In determining the extent of bias, supervisors consider the potential negative consequences of communicating performance ratings. Communicating harsh but accurate ratings to employees will likely damage personal relationships and lead to discussions and criticism. By offering employees lenient and compressed ratings, supervisors can reduce the real and psychological cost of communicating evaluations (McGregor, 1957; Bernardin & Buckley, 1981).

The extent of bias that supervisors apply to performance ratings is also influenced by the time and effort needed to gather the appropriate performance information. High information-gathering costs will make supervisors less willing to invest the required time in information collection (Harris, 1994). This is predicted to lead to leniency and centrality bias, because biasing imprecise performance evaluations reduces the chance that the resulting imprecise ratings lead to painful discussions and dissatisfied employees.

Since supervisors' compensation and promotion possibilities are often linked to employee performance, supervisors are also expected to take the effects of their rating decisions on employee incentives into account (Prendergast & Topel, 1993), as well as how the performance ratings reflect on the supervisors' own management capabilities (Longenecker, Sims, & Gioia,

[2] In this paper I focus on intentional bias by supervisors. However, supervisors can also unintentionally bias performance ratings due to their cognitive limitations. See, for example, Lipe and Salterio (2000) and O'Connor, Deng and Shields (2006).

1987). By rating poorly performing employees more leniently, supervisors may try to hide departmental problems in order to come across as more capable managers.

Supervisors are also predicted to take their superiors' attitude towards rating accuracy into account. Supervisors adjust their biasing behavior based on the expected rewards for accurate ratings and/or the expected consequences of being perceived as a biased rater by their own superiors (Longenecker et al., 1987). Finally, rating behavior is influenced by the utility supervisors get out of (dis)favoring employees (Prendergast & Topel, 1996). Favoring and disfavoring certain employees leads to biased ratings, although favoritism does not necessarily lead to leniency and centrality bias.

If supervisors take their own utility into account when rating performance, I expect to find more leniency and centrality bias in situations where bias is predicted to increase rater utility. In particular, I predict that the costs of communicating evaluations, information-gathering costs, employee performance, and the attitude of the supervisors' superiors towards rating accuracy influence the extent of leniency and centrality bias present in performance ratings. This results in the following general hypothesis:

Hypothesis 1: The extent of bias applied to subjective performance ratings is influenced by the costs of communicating evaluations, information-gathering costs, employee performance, and the attitude of the supervisors' superiors towards rating accuracy.

Employee Incentives and Supervisor Bias

Performance ratings are an important element of incentive contracts because they summarize the outcome of the performance evaluation process and link performance to pay by indicating how much compensation the employee will receive. Bias in performance ratings is likely to influence the incentives created by the compensation contract. In this paper I identify

three ways in which bias in performance ratings is expected to influence employee incentives: 1) through the effect bias has on the link between pay and performance, 2) through the effect bias has on the perceived fairness of the compensation system, and 3) through the effect bias has on the performance information received by the employees.

The Link between Pay and Performance

One of the main objectives of using a performance-based compensation system is motivating employees to supply effort on the right job dimensions.[3] According to agency theory, linking pay to performance motivates employees to exert increased effort to improve performance, because increased performance results in increased pay (Holmstrom, 1979). Performance measures play a crucial role in this process because employees direct their attention to those actions that are measured. However, the incentive effect of a compensation plan does not depend solely on what is measured; it also depends on how these actions are measured. Employees will not be motivated to increase effort unless improved performance is actually expected to translate into more compensation.[4] When performance is assessed subjectively, this may not take place consistently. Bias clouds the link between pay and performance, affecting the incentive provision of the compensation plan.

The Perceived Fairness of the Compensation System

Employee incentives are not solely determined by the relation between pay and performance. The extent to which employees are motivated by a compensation system is also

[3] Another important objective of performance measurement is to differentiate between highly skilled and less skilled employees. Although the effect of wrong personnel decisions might be severe, the aim of this paper is not to investigate the effect of bias on selection issues.
[4] In expectancy theory (Vroom, 1964; Heneman & Schwab, 1972) this is referred to as the degree to which performance is instrumental for the attainment of certain outcomes, and research shows that it is important in motivating individuals.

influenced by the perceived fairness of the compensation plan (Akerlof & Yellen, 1988; Blinder & Choi, 1990; Colquitt, Conlon, Wesson, Porter, & Yee Ng, 2001). Employees not only care about how their rating compares to their performance, but also about how the received performance rating compares to their expectations and to the ratings received by others. In the organizational justice literature, two types of subjective perceptions of fairness are distinguished: the fairness of the outcome distributions, or distributive justice, and the fairness of the procedures used to determine these outcome distributions, or procedural justice (Greenberg, 1990). Inaccuracies in performance ratings caused by supervisor bias are predicted to influence the perceived fairness of the compensation system as bias changes the outcome distribution of the compensation plan.

Performance Information

Compensation systems also influence employee incentives through their effect on self-perception. Performance ratings provide employees with information about how their performance is perceived by the supervisor. This information influences employee incentives by affecting the employee's self-perceived marginal productivity of effort (Fang & Moscarini, 2002). Higher self-confidence enhances motivation by affecting the expected result of effort, which provides supervisors with clear incentives to provide employees with performance ratings that increase or maintain their self-esteem (Bénabou & Tirole, 2002; 2003).

The Performance Effects of Leniency and Centrality Bias

Supervisor bias is studied extensively in the HR literature in the hope of finding ways to reduce it, under the maintained assumption that inaccurate ratings per se are bad for compensation contracting (see Landy & Farr, 1980; Rynes, Gerhart, & Parks, 2005). In this

paper, I argue that this assumption may be incorrect. There are several ways in which bias is expected to influence the functioning of performance-based compensation contracts and they are not all negative. In the following sections I discuss how the two biases considered in this paper, leniency and centrality bias, are predicted to influence employee incentives.

Leniency Bias

Higher ratings can affect incentive provision positively by increasing congruence between performance rating expectations and received performance ratings. Individuals have a tendency to overestimate themselves[5] and therefore to rate themselves higher than their supervisors do (McFarlane Shore & Thornton, 1986; Harris & Schaubroeck, 1988). Employees who believe they have received a lower performance rating (and consequently less compensation) than they deserve are expected to lower their performance in order to restore a feeling of equity (Akerlof & Yellen, 1988; Kahn & Sherer, 1990; Colquitt et al., 2001). More lenient performance ratings may minimize these feelings of unfairness and the resulting negative effects on performance. Moreover, more lenient performance ratings can maintain or increase the employee's confidence in his ability and efficacy, which can be valuable to the firm because it increases the employee's motivation to undertake ambitious projects and persist even when faced with adversity (Bénabou & Tirole, 2002).

On the other hand, since performance ratings give employees a signal about how their performance is perceived by their supervisor, lenient ratings may lead employees to falsely conclude that they are performing their tasks as desired by their supervisor. Providing employees

[5] There exists abundant evidence in the psychology literature that most people overestimate their abilities and past achievements (e.g. Larwood & Whittaker, 1977; Arkin, Cooper, & Kolditz, 1980), that they tend to recall their successes more than their failures (e.g. Mischel, Ebbesen, & Zeiss, 1976), and that they have the tendency to be unrealistically optimistic about their future life events (e.g. Weinstein, 1980).

with this mistaken impression is expected to have a negative effect on performance because it motivates them to continue to take wrong/suboptimal actions. Hence, leniency bias is predicted to have a positive effect on effort, but this will only lead to improved performance if the behavior that is stimulated is desirable. Since I expect the first effect to be stronger, I state the hypothesis as follows:

Hypothesis 2: Leniency bias has a positive effect on the effectiveness of a compensation contract as an incentive provider.

Centrality Bias[6]

The lack of distinction between performance ratings of different employees is expected to influence employee incentives in several ways. First, above-average performers are likely to feel disenchanted when employees who perform worse are rewarded almost equally. This is expected to negatively affect above-average employees' future performance (Lazear, 1991). Moreover, compression reduces the probability that the value of a marginal performance rating increase will outweigh the cost of the extra effort needed to improve performance sufficiently to receive this marginal performance rating increase. Centrality bias is thus predicted to have a negative effect on the incentives of above-average performers. This results in the following hypothesis:

Hypothesis 3: Centrality bias has a negative effect on the effectiveness of a compensation contract as an incentive provider for above-average performers.

The situation is different for below-average performers, as compression influences their ratings in a positive way. Because of centrality bias, their performance seems similar to that of top performers. Since most employees consider their own performance to be above-average (Meyer, 1975), providing below-average employees with their true comparative position will be

[6] Although both biases will move up ratings for below-average performance, centrality bias also decreases the variance in reported employee performance.

a deflating experience for most of them, which can have a disruptive effect on their performance (Pearce & Porter, 1986; Gabris & Mitchell, 1988). Compressed ratings might therefore have a positive effect on the incentives of below-average performers as they protect them from information that will lower their self-perception and the perceived prospects from providing effort in the future (Prendergast & Topel, 1993; Fang & Moscarini, 2002). On the other hand, just as with above-average performers, the lack of variance in performance ratings might also negatively affect the motivation of below-average performers as their marginal cost to improve performance ratings may be higher than the marginal benefit they receive from the performance rating improvement. Since I cannot predict which of these arguments explains the relation between centrality bias and the performance change of below-average performers, their relative importance must be empirically determined.

III. RESEARCH DESIGN

Research Setting

In this paper I use data provided by a Dutch financial service provider (FSP). FSP is one of the main financial service providers in the Netherlands, and it has a considerable stake in the European market. It serves over nine million customers and its assets have a total value of some €500 billion (in 2005). FSP introduced a new compensation system for its branch offices in 2003. The incentive plan covers all branch office employees, except for the local management teams. The main reasons for introducing the new compensation contract are improving employee incentives and


having employees focus more on output.[7] In order to achieve these objectives, FSP designed an incentive system that links pay to performance.

Compensation Contract Design

The new compensation system consists of a fixed salary and a bonus, where the fixed salary depends on the employee's job and the periodic fixed salary increase (up to a stated maximum), and the bonus on the employee's overall performance rating. In the incentive system, performance is not only objectively measured but also subjectively assessed. More specifically, half of the overall rating is determined by an output-based performance rating and half by a competence-based performance rating.

The output-based section of the system consists of three to six subjective or objective performance measures that measure specific employee output. Examples of objective performance measures are the number of insurance policies sold, the percentage increase in portfolio growth, or the number of appointments made. Examples of subjective output performance measures are the value of the front-office support, the quality of the management information reports, or timely preparation of customer reports.

Although the intention of the new compensation system is to improve output, the designers of the system wanted to avoid a blind focus on improving measured performance. The compensation system therefore includes the subjective assessment of competencies that are considered essential to performing the employee's tasks in an efficient and correct manner. The supervisor must choose between one and four competence measures, which are subjective assessments of specific behavior that the employee must demonstrate. Hence, the system not only indicates what type of output improvement is expected, it also informs the employees how
[7] An internal study that examined the consequences of the introduction of the new compensation system finds that both supervisors and employees think that employee effort and focus on output have increased after the introduction of the system.


output improvements are expected to be accomplished. The competencies "cooperative behavior" and "customer focus" are included in all performance documents (see Figure 1).

The weight on each performance measure is determined by the number of measures chosen within each section. Each section determines half of the overall rating and the weights of the measures within each section are equal. The supervisor also sets specific targets and margins for each of the selected measures.[8][9] At the end of the year, the supervisor evaluates performance according to these measures and targets, and assigns a rating between one and five (not necessarily integers) to each measure. This leads to an overall rating which determines a) whether the employee receives the periodic fixed salary increase, and b) the bonus percentage range. The employee only receives the periodic fixed salary increase if the overall rating is above 1.5.[10]

Additional Discretion in the Compensation System

The use of subjective performance measures is not the only way in which FSP has introduced supervisor discretion into its compensation system. Supervisors also have discretion over the awarded bonus percentage. The overall rating is linked only to a bonus percentage range,[11] not to a concrete percentage and, if deemed necessary, supervisors are allowed to provide employees with a bonus percentage outside the stated range. These discretionary bonus adjustments are intended to ensure that important actions that are not foreseen ex ante can still be rewarded.
[8] Margins indicate when a performance target is not met, almost met, met, surpassed, and generously surpassed, indicated by the scores bad, regular, good, very good, and excellent, respectively.
[9] Although supervisors individually select the performance measures and set the targets, before presenting the contracts to the employees, all supervisors and management meet to discuss and compare the chosen performance measures and targets. In this way they hope to increase the consistency in the use of the system. For the same reason the designers of the system provide the supervisors with examples of performance measures for each type of function.
[10] This is only true for employees who have not yet reached the maximum of their salary scale.
[11] A bad rating (0-1.5) corresponds to a bonus of 0%, a regular rating (1.5-2.5) to 1-3%, a good rating (2.5-3.5) to 4-8%, a very good rating (3.5-4.5) to 9-12%, and an excellent rating (4.5-5) to 13-15%. See Figure 1.
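The rating mechanics described in this section can be illustrated with a minimal Python sketch. It is not FSP's implementation: the function names are hypothetical, and the treatment of ratings that fall exactly on a band boundary (upper bounds inclusive) is an assumption, motivated only by the rule that the salary increase requires a rating above 1.5.

```python
from statistics import mean

# Bonus percentage ranges per rating band, as listed in the plan description.
# ASSUMPTION: a rating exactly on a boundary belongs to the lower band.
BONUS_BANDS = [
    (1.5, "bad", (0, 0)),
    (2.5, "regular", (1, 3)),
    (3.5, "good", (4, 8)),
    (4.5, "very good", (9, 12)),
    (5.0, "excellent", (13, 15)),
]


def overall_rating(output_ratings, competence_ratings):
    """Each section counts for half; measures within a section weigh equally."""
    return 0.5 * mean(output_ratings) + 0.5 * mean(competence_ratings)


def bonus_band(rating):
    """Return the band label and the (min, max) bonus percentage range."""
    for upper, label, pct_range in BONUS_BANDS:
        if rating <= upper:
            return label, pct_range
    raise ValueError("rating must not exceed 5")


def gets_salary_increase(rating, at_scale_maximum=False):
    """Periodic increase requires a rating above 1.5 and room in the salary scale."""
    return rating > 1.5 and not at_scale_maximum


# Hypothetical employee: three output measures and two competence measures.
rating = overall_rating([3, 4, 2], [4, 3])
print(rating, bonus_band(rating), gets_salary_increase(rating))
```

The sketch returns a range rather than a single percentage because the supervisor picks the actual bonus within the range, or outside it when that is deemed necessary.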


In sum, the incentive system links pay to performance and puts part of the compensation at risk. It does not rely exclusively on objective measures, but combines objective performance measurement with subjective performance appraisal. Nevertheless, the incentive system gives the supervisor a relatively wide scope for discretion.

Introduction and Implementation of the Compensation System

FSP took the introduction of the new compensation system very seriously. All employees and supervisors received information (through presentations, folders, letters, and internet sites) on the functioning of the system, its formal steps and the timelines of its implementation. The directors of the branch offices also wrote a formal letter to all employees indicating top management's support for the new system. Supervisors were provided with workshops which explained how to select the performance measures, how to set the performance targets, and how to rate performance. Empirical evidence (e.g. Longenecker et al., 1987; Burney, Henle, & Widener, 2006) indicates that attaching importance to the implementation and use of a compensation system reduces the raters' tendency to bias performance ratings. The setting of this study is therefore biased against finding supervisor bias, and hence against finding its determinants and consequences.

Data

The analyses employ two years of proprietary archival data on the incentive plans for employees of five offices of FSP. I limit my study to employees who were employed by the bank in both 2003 and 2004 and to departments that made significant use of objective performance


measures (at least 25%). This results in 396 complete performance documents, 198 from 2003 and 198 from 2004.[12] The documents (see Figure 1) provide information about the chosen performance measures, the targets, the ratings per measure, the total rating, and the old and new total compensation (salary scale, periodic salary increase and bonus). Furthermore, the documents indicate the employee's job, department and supervisor, and an extensive description of the chosen performance measures. Based on these descriptions I classify each output-based performance measure as being either objective or subjective, while the competencies are by definition subjective. This results in two performance dimensions: objective and subjective.[13] I also obtained personal characteristics of the supervisors and employees such as gender and age, and I interviewed several employees, supervisors, and designers of the system to get a better understanding of the compensation system and the company background. Finally, I accessed an internal study that examined employee and supervisor reactions to the compensation system.

Summary statistics for the main variables used in this paper (see Tables 1, 2 and 3) indicate that supervisors in 2003, on average, provided higher ratings on the subjective than on the objective dimension. Both the mean (2.94 versus 2.80) and the median (3.00 versus 2.75) subjective rating were significantly higher in 2003 (p < 0.005 two-tailed). This relationship reversed in 2004; both the mean (3.14 versus 3.28) and the median (3.13 versus 3.33) subjective rating were significantly lower (p < 0.005 two-tailed). The descriptive statistics also show that the variance

[12] The data gathered do not cover all the employees meeting these conditions because some of the performance documents were incomplete or were never received by the human resource department. I have checked whether certain departments were underrepresented in the sample but found no such bias.
[13] For 63 percent of the employees in the sample the subjective dimension solely consists of competencies, while for the remaining employees only one or two subjective output-based measures were chosen. I examine whether the results presented in this paper are influenced by the type of subjective measures that are used (only competencies or also subjective output-based measures), by re-estimating all models for only those employees that have no subjective output-based measures and by re-estimating all models while excluding the subjective output-based measures. In both cases the results are similar to those presented.


in the objective ratings was significantly larger (p < 0.001 two-tailed) than the variance in the subjective ratings in both 2003 and 2004. As predicted, the subjective ratings seem to be more lenient (although only in 2003) and more compressed, suggesting the presence of leniency and centrality bias. In the next section, I examine the existence of leniency and centrality bias more formally.

Leniency and Centrality Bias

In order to examine the determinants and performance effects of supervisor bias, I first examine whether the performance ratings are subject to leniency and centrality bias, as predicted by previous research. To detect leniency, I test whether performance ratings on the subjective dimension are higher, on average, than performance ratings on the objective dimension, after controlling for other relevant influences. Similar to Moers (2005), a higher score on the subjective dimension is considered to be evidence consistent with more lenient ratings. Leniency bias can only be adequately captured in this way if the performance ratings on the objective dimension are unbiased or at least significantly closer to the true performance value than the subjective ratings. Another underlying assumption is that employees have similar ability levels on the two dimensions. After controlling for effort allocation effects, actual employee performance on the objective and subjective dimension (as opposed to supervisors' potentially biased assessments) is therefore assumed to be very similar. Thus, considering the assumption that the objective performance rating is a good benchmark of actual performance, there should be, on average, no difference between the ratings on the different dimensions unless the ratings are biased, after controlling for other relevant influences.


As the dependent variable, I use the employees' performance ratings on the objective and subjective dimensions (RATING). This means that each employee is included twice for each year, once with his objective rating and once with his subjective rating. The influence of discretion on the ratings is examined by including a dummy variable that equals one if the observation is a subjective rating and zero otherwise (DSubjectivity). To account for differences in contract design that may cause different effort allocations, I control for the contractual weight on the specific dimension (WEIGHT), as a higher weight is expected to lead to higher effort and consequently to higher ratings. I also include the number of performance measures (NR_PM), as a larger number of performance measures can lead to incentive dilution. To control for year effects, I include a year dummy (Y2003) that is one if the observation relates to 2003 and zero otherwise.

In order to determine whether leniency bias is specific to a certain group of employees, or is a more general consequence of using subjective performance measures, I include several additional control variables. First, supervisors are not expected to behave in identical ways when it comes to rating performance. I include supervisor dummies (DSupervisor) to control for these differences. Since rating behavior might also be dissimilar for different types of employees, I also control for three employee variables: age (AGE), gender (SEX), and number of designated contract hours per week (HOURS). Finally, the way the system is used (e.g., the type of measures that are chosen) and/or the importance attached to the system (e.g., the time that is provided for instructions and evaluations) might not be identical for different departments and offices. To control for these differences, office (DOffice) and department dummies (DDepartment) are included.[14]
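The stacked data layout described above, with each employee-year entering once per dimension, can be sketched in a few lines of Python. The record fields and the two sample records are hypothetical, and the remaining controls (NR_PM, AGE, SEX, HOURS, and the supervisor, office, and department dummies) are left out for brevity. The raw subjective-minus-objective gap computed at the end is only a descriptive check, not the regression estimate.

```python
from statistics import mean


def stack_ratings(records):
    """Turn one record per employee-year into two rows, one per dimension."""
    rows = []
    for rec in records:
        for dimension, is_subjective in (("objective", 0), ("subjective", 1)):
            rows.append({
                "employee": rec["employee"],
                "year": rec["year"],
                "RATING": rec[f"{dimension}_rating"],
                "D_SUBJECTIVITY": is_subjective,
                "WEIGHT": rec[f"{dimension}_weight"],
                "Y2003": 1 if rec["year"] == 2003 else 0,
            })
    return rows


def raw_leniency_gap(rows):
    """Mean subjective rating minus mean objective rating, without controls."""
    subjective = [r["RATING"] for r in rows if r["D_SUBJECTIVITY"] == 1]
    objective = [r["RATING"] for r in rows if r["D_SUBJECTIVITY"] == 0]
    return mean(subjective) - mean(objective)


# Illustrative, made-up records (not data from the study).
records = [
    {"employee": "A", "year": 2003, "objective_rating": 2.5,
     "subjective_rating": 3.0, "objective_weight": 0.4, "subjective_weight": 0.6},
    {"employee": "B", "year": 2003, "objective_rating": 3.5,
     "subjective_rating": 3.25, "objective_weight": 0.5, "subjective_weight": 0.5},
]
rows = stack_ratings(records)
print(len(rows), raw_leniency_gap(rows))
```

A positive gap is in line with leniency only under the assumptions stated in the text, namely that objective ratings are a reasonable benchmark and that ability is similar across the two dimensions.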

14 Employees have different tasks and duties that are logically translated into different performance measures. I partly control for these differences by adding supervisor, office, and department dummies. The chosen performance

To control for likely correlation of regression model errors for a given employee, I use a random effects model with robust standard errors. This leads to the following specification:
\[
\begin{aligned}
\mathit{RATING}_{it} = {} & \beta_0 + \beta_1 \mathit{DSubjectivity} + \beta_2 \mathit{WEIGHT}_{it} + \beta_3 \mathit{NR\_PM}_{it} + \beta_4 \mathit{Y2003}_{t} + \beta_5 \mathit{AGE}_{it} + \beta_6 \mathit{SEX}_{i} + \beta_7 \mathit{HOURS}_{it} \\
& + \sum_{j=8}^{48} \beta_j \mathit{DSupervisor}_{j} + \sum_{k=49}^{53} \beta_k \mathit{DOffice}_{k} + \sum_{l=54}^{60} \beta_l \mathit{DDepartment}_{l} + \mu_i + \varepsilon_{it}
\end{aligned}
\tag{1}
\]

where i indexes performance ratings (i = 1, ..., 792), t time, j supervisors, k offices, and l departments. I use random effects¹⁵ to obtain consistent estimates of all parameters, including coefficients of time-invariant regressors (Greene, 1993).

I investigate whether the use of subjective measures leads to centrality bias by calculating the ratio between each employee's rating on the objective (subjective) dimension and the mean rating on the objective (subjective) dimension (R_RATING = Max((rating/mean rating), (mean rating/rating))). By construction, R_RATING is at least one and grows as a rating moves away from the mean in either direction, so a negative coefficient on DSubjectivity is consistent with more compressed subjective ratings. To justify this method, I assume that the variance in performance is similar over the two dimensions and that the variance in the objective performance ratings is similar to the performance variance in the sample. All other variables are the same as in model 1, and a random effects model with robust standard errors is used, leading to the following empirical specification:
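\[
\begin{aligned}
\mathit{R\_RATING}_{it} = {} & \beta_0 + \beta_1 \mathit{DSubjectivity} + \beta_2 \mathit{WEIGHT}_{it} + \beta_3 \mathit{NR\_PM}_{it} + \beta_4 \mathit{Y2003}_{t} + \beta_5 \mathit{AGE}_{it} + \beta_6 \mathit{SEX}_{i} + \beta_7 \mathit{HOURS}_{it} \\
& + \sum_{j=8}^{48} \beta_j \mathit{DSupervisor}_{j} + \sum_{k=49}^{53} \beta_k \mathit{DOffice}_{k} + \sum_{l=54}^{60} \beta_l \mathit{DDepartment}_{l} + \mu_i + \varepsilon_{it}
\end{aligned}
\tag{2}
\]

Since the descriptive statistics show different patterns when comparing the objective and subjective ratings for the two consecutive years, I examine the existence of leniency and centrality on a yearly basis, in addition to the pooled sample.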
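The dependent variable of model 2 can be illustrated with a short Python sketch. It assumes that the mean is taken over whatever set of ratings is passed in for one dimension and that all ratings are strictly positive; the function name and the example numbers are hypothetical.

```python
from statistics import mean


def r_rating(ratings):
    """Return max(rating / mean, mean / rating) for each rating on one dimension."""
    m = mean(ratings)
    return [max(r / m, m / r) for r in ratings]


# Hypothetical ratings with the same mean (3.0) but different spread.
objective = [2.0, 3.0, 4.0]
subjective = [2.8, 3.0, 3.2]

spread_objective = mean(r_rating(objective))
spread_subjective = mean(r_rating(subjective))
```

In this made-up example the average R_RATING is lower for the subjective ratings than for the objective ones, which is the pattern a negative coefficient on DSubjectivity would pick up.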
measures per supervisor are, however, not necessarily identical. To examine the influence of performance measure differences within reference groups, I redo the analyses for only those observations where the chosen performance measures in the reference group are at least 90% the same in the two consecutive years. The results show similar inferences as those in Table 5, implying that the results are not driven by differences in performance measures within reference groups.
15 The Hausman test indicates that random effects estimation is adequate for models 1, 2, 3 and 4.

The Determinants of Bias

In this section, I examine the determinants of leniency and centrality bias. Although all the compensation contracts are subject to supervisor discretion, the extent to which ratings are biased is expected to vary with supervisor preferences. In Table 4, descriptive statistics on leniency and centrality bias and the discretionary bonus adjustment are presented. They show that the average amount of leniency bias was larger in 2003 than in 2004, while the average amount of centrality bias was very similar in the two consecutive years. Supervisors also took advantage of their ability to make discretionary adjustments, especially in 2003. None of these adjustments was negative, and the average adjustment was 2.7% (1.5%) in 2003 (2004).

I measure the extent of leniency bias in the subjective rating of a specific employee by comparing the objective rating to the subjective rating of that employee. As differences between the objective and subjective rating can be caused by differences in effort allocation on the different dimensions, I control for both the contractual weight and the number of performance measures. Moreover, the difference between the objective and subjective rating captures both leniency and compression applied to the rating (even after controlling for effort allocation effects). In order to control for the effort allocation effect, and to separate leniency bias from centrality bias, I use a residual model. I regress the ratio between subjective and objective ratings on the weights, the number of performance measures, and the centrality bias (explained below), and use the residuals of this regression as a measure of leniency bias (LEN_BIAS).¹⁶

Several variables are predicted to affect the extent of leniency bias applied by supervisors. Since supervisors prefer to avoid negative consequences related to rating performance, and painful confrontations are more likely to occur when performance is
16 A problem with this measure is that it assumes leniency bias and centrality bias to be independent. However, if I use the simple ratio between subjective and objective ratings and control for the extent of centrality bias in the regression, I get similar results.

inadequate, employee performance is predicted to have a negative effect on the extent of leniency bias. Supervisors might also rate more leniently, especially when employee performance is poor, to hide departmental problems. I capture employee performance by including the objective performance ratings (OBJ_R) (see Table 6 for variable definitions).

Confronting employees with harsh but accurate ratings might be less costly to the supervisor when the personal relationship with the employee is not that strong. I therefore predict that ratings are less lenient when the employee and supervisor have only recently started working together. Moreover, since supervisors care about the performance of their department, they are expected to reduce leniency when rating leniently might give employees the mistaken idea that they are performing their duties correctly when they are not. Since misunderstandings about how tasks should be performed are more likely to occur when the employee and supervisor have worked together for a short period of time, supervisors are also expected to be less lenient in the initial stage of the relationship to avoid stimulating dysfunctional behavior. To capture the influence of the length of the employee-supervisor relationship, I include a dummy variable that equals one if the employee joined the company within the last three years, if the employee recently changed jobs within the company, or if a new supervisor was assigned, and zero otherwise (NEW_R).

The extent of leniency bias is also influenced by the time and effort the supervisor must invest in collecting performance information. High information-gathering costs are predicted to result in less precise evaluations, which are expected to lead to more lenient ratings because leniency decreases the possibility that imprecise ratings lead to problems. I capture information-gathering costs using the supervisor's position in the organizational hierarchy. More specifically,


I proxy for information-gathering costs (INFO_C) using the difference between the organizational level of the supervisor and the employee.¹⁷

The effect of the attitude of the supervisor's superiors towards rating accuracy is analyzed by examining the financial performance of the offices, as financial conditions are expected to affect attitudes towards leniency bias. More specifically, poor financial performance is predicted to put pressure on the office to keep compensation costs down, making the superior less tolerant of lenient ratings (Longenecker et al., 1987). I measure the financial position of the offices by including their profit growth (GROWTH) and the difference between their budgeted and actual profit (BUDGET).

Supervisor and employee characteristics are also expected to influence supervisor bias as they influence the supervisor's preferences. I include supervisor gender (SUP_SEX) and employee gender (SEX) as independent variables because some evidence in the psychology literature indicates that female supervisors rate more leniently (Tsui & O'Reilly, 1989), and that female employees are rated lower than male colleagues (Rosen & Jerdee, 1974). Finally, the age difference between the supervisor and employee (DIF_AGE) is included because research on relational demography shows that individuals with comparable demographic characteristics are more likely to develop a close relationship (Tsui & O'Reilly, 1989). A closer relationship is expected to have a positive effect on leniency bias (Varma, DeNisi, & Peters, 1996), because it can increase the utility received from favoring the employee, and/or increase the pain suffered when the employee needs to be confronted.

Discretionary bonus adjustments provide supervisors with an alternative way to influence compensation. To control for a possible substitution effect, I include the number of percentage
17 The influence of the supervisor's position in the organizational hierarchy on rating behavior has been analyzed in the psychology literature, and several studies indicate little interrater agreement on performance ratings provided by supervisors from different organizational levels (Berry, Nelson, & McNally, 1966; DeCotiis & Petit, 1978).

points that the bonus is adjusted beyond the stated range (DIS_ADJ).¹⁸ Finally, I include year (Y2003) and department dummies (DDepartment) as additional controls.¹⁹,²⁰ A random effects model with robust standard errors is used to test the following empirical specification:

\[
\begin{aligned}
\mathit{LEN\_BIAS}_{it} = {} & \beta_0 + \beta_1 \mathit{OBJ\_R}_{it} + \beta_2 \mathit{NEW\_R}_{it} + \beta_3 \mathit{INFO\_C}_{it} + \beta_4 \mathit{GROWTH}_{it} + \beta_5 \mathit{BUDGET}_{it} + \beta_6 \mathit{SUP\_SEX}_{i} + \beta_7 \mathit{SEX}_{i} \\
& + \beta_8 \mathit{DIF\_AGE}_{it} + \beta_9 \mathit{DIS\_ADJ}_{it} + \beta_{10} \mathit{Y2003}_{t} + \sum_{l=11}^{17} \beta_l \mathit{DDepartment}_{l} + \mu_i + \varepsilon_{it}
\end{aligned}
\tag{3}
\]

In order to measure centrality bias, I calculate the ratio between the standard deviation of all ratings on the objective dimension and the standard deviation of all ratings on the subjective dimension for each reference group²¹ (CEN_BIAS). This ratio is the same for all employees in a reference group, and a high ratio indicates that the subjective ratings are compressed relative to the objective ratings. The variables predicted to affect the extent of centrality bias are similar to those in model 3, except for the predicted sign on office performance. Pressure to keep compensation costs down is predicted to lead to more, instead of less, compression. Furthermore, the standard deviation of all objective performance ratings per reference group (REF_SD) is included as an additional control variable, since the extent to which ratings can be compressed depends on existing performance variation in the reference group. This leads

18 Including discretionary adjustments might lead to an endogeneity problem since both the decision to bias and the decision to provide discretionary bonus adjustments are made by the supervisor at the end of the year. The results are similar when the discretionary adjustments are not included.
19 Due to high correlation between the variables measuring office performance and the office dummies, the office dummies are not included in the model.
20 The introduction of additional variables that control for the number of employees evaluated by the supervisor and contract choices made by the supervisor, such as the contractual weight placed on subjective measures, the number of subjective measures included and the consistency in the chosen subjective measures within the reference group, leads to similar results.
21 A reference group consists of all employees who have a similar function and who are evaluated by the same supervisor.

to the following specification that is tested with a random effects model at the supervisor level with robust standard errors:

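\[
\begin{aligned}
\mathit{CEN\_BIAS}_{it} = {} & \beta_0 + \beta_1 \mathit{OBJ\_R}_{it} + \beta_2 \mathit{NEW\_R}_{it} + \beta_3 \mathit{INFO\_C}_{it} + \beta_4 \mathit{GROWTH}_{it} + \beta_5 \mathit{BUDGET}_{it} + \beta_6 \mathit{SUP\_SEX}_{i} + \beta_7 \mathit{SEX}_{i} \\
& + \beta_8 \mathit{DIF\_AGE}_{it} + \beta_9 \mathit{DIS\_ADJ}_{it} + \beta_{10} \mathit{REF\_SD}_{it} + \beta_{11} \mathit{Y2003}_{t} + \sum_{l=12}^{18} \beta_l \mathit{DDepartment}_{l} + \mu_i + \varepsilon_{it}
\end{aligned}
\tag{4}
\]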
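The two bias measures used as dependent variables in models 3 and 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the data are made up, a single contractual weight (on the subjective dimension) stands in for "the weights", and the first-stage regression is a plain OLS solved through the normal equations rather than whatever estimation routine was actually used. Because CEN_BIAS is a ratio of two standard deviations over the same employees, it does not matter whether sample or population standard deviations are used.

```python
from statistics import stdev


def cen_bias(objective, subjective):
    """SD of objective ratings divided by SD of subjective ratings for one reference group.

    Raises ZeroDivisionError if all subjective ratings in the group are identical.
    """
    return stdev(objective) / stdev(subjective)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_residuals(y, regressors):
    """Residuals from an OLS regression of y on the regressors plus an intercept."""
    X = [[1.0] + list(row) for row in regressors]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]


# Hypothetical data for two reference groups (not taken from the paper).
groups = {
    "A": {"objective": [2.0, 3.0, 4.0], "subjective": [3.0, 3.5, 4.0],
          "weight": [0.4, 0.5, 0.4], "nr_pm": [4, 5, 6]},
    "B": {"objective": [1.0, 3.0, 5.0], "subjective": [3.0, 3.0, 4.5],
          "weight": [0.3, 0.5, 0.4], "nr_pm": [4, 6, 5]},
}

ratios, regressors = [], []
for g in groups.values():
    cb = cen_bias(g["objective"], g["subjective"])  # same value for the whole group
    for obj, subj, w, n in zip(g["objective"], g["subjective"], g["weight"], g["nr_pm"]):
        ratios.append(subj / obj)        # ratio of subjective to objective rating
        regressors.append([w, n, cb])    # weight, number of measures, centrality bias

# LEN_BIAS: the part of the ratio not explained by the three regressors.
len_bias = ols_residuals(ratios, regressors)
```

A positive residual marks an employee whose subjective rating is higher, relative to the objective rating, than the weight, the number of measures, and the group's compression would predict.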

Performance Effects

In this part of the analysis, I investigate how leniency and centrality bias affect the effectiveness of the compensation system as an incentive provider. I capture the differences in incentives provided by the compensation system by examining the change in employee performance following exposure to supervisor discretion. I measure the change between 2003 and 2004 as the supervisors' influence on incentive provision becomes apparent after the evaluation of 2003. Changes in objective performance (PERF_O), subjective performance (PERF_S), and total performance (PERF_T) are examined because bias is hypothesized to have an overall influence on employee incentives. Performance change is expected on both the objective and the subjective dimensions. However, since prior analyses indicate that the subjective (and therefore the total) ratings are biased, they must be interpreted with care.

Existing variation in the biases provides an opportunity to examine the influence of these biases on future performance.²² Leniency (LEN_BIAS) and centrality bias (CEN_BIAS) are measured as before. However, in order to analyze the separate effects of centrality bias on above-average and below-average performers, I split CEN_BIAS into two variables: CEN_BIASA and CEN_BIASB. CEN_BIASA (CEN_BIASB) takes the value of the ratio when the employee performs above (below) the average of his reference group on the objective dimension, and zero otherwise.

I control for several variables that might influence the change in employees' performance ratings. First, discretionary bonus adjustments are expected to have a positive effect on employee incentives. By adjusting incorrect ratings, the supervisor signals that providing employees with a fair reward is considered important. This reinforces employees' beliefs that improving performance will result in higher rewards, which is essential in motivating employees to enhance performance (Dubinsky & Levy, 1989). Moreover, discretionary adjustments are expected to have a positive effect on incentives by creating a feeling of reciprocity.²³ I control for the effect of discretionary adjustments using the DIS_ADJ variable defined earlier.

Second, discretionary adjustments are expected to influence not only the motivation of the employees who receive them, but also the motivation of those who do not but are confronted by their existence. Discretionary bonus adjustments create a lack of consistency in rewarding procedures. For some employees, actions not included in the compensation contract are taken into account, while for others they are not. This lack of consistency influences incentives by influencing employee perceptions of procedural justice within their organization (Parker & Kohlmeyer, 2005). To control for the influence of discretionary bonus adjustments on employees who do not receive them, I use a dummy variable that equals one if an adjustment has taken place within the reference group and zero otherwise (REF_DISADJ).

22 When developing hypotheses on the influence of centrality bias on incentive effectiveness, I assume that the employees had at least some knowledge of the rating distribution of their reference group. During the interviews, I asked several employees about this matter and they confirmed that they had some, though not full, knowledge of the performance ratings of their peers. Moreover, through the internal study conducted by the company (see page 15), employees confidentially provided information on the perceived fairness of the evaluation procedures and the outcome distributions of the new compensation system, which implies that they possessed knowledge of how rewards were distributed.

23 The employee receives more from the company than the required minimum, making the employee willing to provide more than the minimum effort required (Hannan, Kagel, & Moser, 2002).
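The split of CEN_BIAS into CEN_BIASA and CEN_BIASB and the construction of REF_DISADJ can be sketched in Python as follows. The function names and example values are hypothetical, and the treatment of an employee who sits exactly at the group average (both variables set to zero) is an assumption, since the text only covers performers above or below the average.

```python
from statistics import mean


def split_cen_bias(objective_ratings, cen_bias):
    """Return (CEN_BIASA, CEN_BIASB) for each employee in one reference group."""
    avg = mean(objective_ratings)
    pairs = []
    for rating in objective_ratings:
        above = cen_bias if rating > avg else 0.0  # above-average performer
        below = cen_bias if rating < avg else 0.0  # below-average performer
        pairs.append((above, below))
    return pairs


def ref_disadj(adjustments):
    """Return 1 if any discretionary adjustment occurred in the reference group, else 0."""
    return 1 if any(adj != 0 for adj in adjustments) else 0


# Hypothetical reference group with three employees and a centrality ratio of 1.8.
pairs = split_cen_bias([2.0, 3.0, 4.0], 1.8)
group_flag = ref_disadj([0.0, 0.0, 2.5])
```

Splitting the ratio this way lets the regression estimate a separate slope for compression among above-average and below-average performers while keeping a single reference-group measure.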

Third, I control for the maximum amount the performance of the employee can still improve (the difference between the maximum performance rating (5) and the obtained performance rating in 2003, MGROWTH), for changes in the contractual incentive weight on the objective dimension (DIF_WO), and for changes in the number of measures (DIF_NR). I also include several personal conditions that might make an employee less sensitive to the incentives provided by the compensation contract. First, employees who have reached the upper limit of their salary scale (MAX_SC) might be less receptive to changes in provided incentives, as performance improvement can only lead to a bonus increase, not to a periodic salary increase. I also include contract hours (HOURS) and age (AGE), as I suspect that part-time and older employees are not motivated in the same way (e.g., they are expected to be less concerned about their future career). Some employees occupied a different position and/or were appraised by a different supervisor in 2004. To ensure this does not drive the results, two dummy variables are included (DDIF_FUN & DDIF_SUP) that equal one if the employee's function and supervisor, respectively, have changed, and zero otherwise. Finally, I add office (DOffice) and department dummies (DDepartment) to control for office-specific and job-specific elements, such as local market conditions, that might influence employee performance. This leads to the following empirical specification:²⁴

\[
\begin{aligned}
\mathit{PERF}_{i} = {} & \beta_0 + \beta_1 \mathit{LEN\_BIAS}_{i} + \beta_2 \mathit{CEN\_BIASA}_{i} + \beta_3 \mathit{CEN\_BIASB}_{i} + \beta_4 \mathit{DIS\_ADJ}_{i} + \beta_5 \mathit{REF\_DISADJ}_{i} + \beta_6 \mathit{MGROWTH}_{i} \\
& + \beta_7 \mathit{DIF\_WO}_{i} + \beta_8 \mathit{DIF\_NR}_{i} + \beta_9 \mathit{DDIF\_SUP}_{i} + \beta_{10} \mathit{DDIF\_JOB}_{i} + \beta_{11} \mathit{MAX\_SC}_{i} + \beta_{12} \mathit{HOURS}_{i} + \beta_{13} \mathit{AGE}_{i} \\
& + \sum_{k=14}^{18} \beta_k \mathit{DOffice}_{k} + \sum_{l=19}^{25} \beta_l \mathit{DDepartment}_{l} + \varepsilon_{i}
\end{aligned}
\tag{5}
\]

where PERF can be PERF_T, PERF_O or PERF_S.
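The dependent variable and the growth-potential control in model 5 can be illustrated with a small Python sketch. The maximum rating of 5 comes from the text; measuring performance change as the simple difference between the 2004 and 2003 ratings is an assumption made here for illustration, and the function names are hypothetical.

```python
MAX_RATING = 5.0  # maximum performance rating stated in the text


def performance_change(rating_2003, rating_2004):
    """Change in a rating between the two years (assumed to be a simple difference)."""
    return rating_2004 - rating_2003


def mgrowth(rating_2003):
    """Remaining room for improvement: maximum rating minus the 2003 rating."""
    return MAX_RATING - rating_2003


# Hypothetical employee rated 3.5 in 2003 and 4.0 in 2004.
perf = performance_change(3.5, 4.0)
room = mgrowth(3.5)
```

Under this assumption the change can never exceed the room for improvement, which is one reason to expect a positive coefficient on MGROWTH.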

24 I have chosen to model the relationship between centrality and leniency bias and performance change as a linear relationship because I expect the effect of more leniency and centrality bias to be relatively constant over the limited range found in this setting.

IV. RESULTS

Leniency and Centrality Bias

As a first step in my analysis, I examine whether the performance ratings are, on average, subject to leniency and centrality bias (Table 5). The results show that the ratings on the subjective dimension are higher than the ratings on the objective dimension, after controlling for relevant influences, implying that subjective performance ratings are subject to leniency bias. However, the results of the annual analyses show that the subjective performance ratings are not more lenient in 2004. Subjectivity even has a marginally significant negative effect on the ratings in that year. These results indicate that supervisors in general use the leeway they have in subjective performance appraisal to bias performance ratings, but that this does not necessarily result in more lenient ratings. Consistent with earlier empirical work, I find that the ratings on the subjective dimension are more compressed than the ratings on the objective dimension. The number of performance measures has a negative effect on the ratings, suggesting that a higher number of measures leads to lower performance ratings, consistent with the dilution argument.

The Determinants of Bias

Determinants of leniency and centrality bias are investigated in Table 6. Employee performance is found to be one of the main determinants of leniency bias. The objective performance ratings negatively affect the extent of leniency bias, which indicates that supervisors rate more leniently when employee performance is low. Employee performance has no effect on centrality bias.


Ratings are found to be less lenient when the employee and supervisor only recently started to work together, indicating that supervisors are more willing to be accurate when personal ties are not that strong, and when lenient ratings are more likely to provide a wrong signal on the correctness of employee actions. Although the lack of a strong personal relationship was predicted to make supervisors more willing to differentiate between employees, ratings are found to be more, not less, compressed in the initial period of the employee-supervisor relationship. A possible explanation for this is that supervisors refrain from differentiating strongly until they have more information on employee performance. The proxy for information-gathering costs positively affects the extent of leniency and centrality bias, indicating that supervisors provide more lenient and compressed ratings when gathering information is more costly.

The evidence also indicates that the offices' financial position affects supervisor bias. As predicted, both financial performance variables have a positive effect on the extent of leniency bias and a negative effect on the extent of centrality bias. This indicates that supervisors become more lenient and compress less when the office is performing better financially. Supervisor gender is also found to influence rating behavior. Contrary to the prediction of previous studies, I find that male supervisors, not female supervisors, are more lenient. A speculative explanation for this finding is that female supervisors are aware of the stereotype and consequently exercise more caution when rating. Employee gender was not found to influence the extent of bias.²⁵ Finally, age differences between supervisors and employees negatively affect

25 Most studies that find that female employees receive lower ratings than male employees are conducted in settings where the occupation would likely be perceived as masculine (see Landy & Farr, 1980), while this setting has both masculine- and feminine-type positions. However, even after interacting the position's stereotype with employee gender, I find no evidence of employee gender affecting the extent of bias applied to the ratings.

the extent of leniency bias, suggesting that it is more costly for supervisors to rate accurately when they have a strong relationship with the employees. In sum, the results indicate that supervisors vary the amount of bias they apply to performance ratings, and that more bias is applied when biased performance ratings increase the utility the supervisor receives out of rating employee performance.

Performance Effects

The performance effects of leniency and centrality bias are examined in Table 7. The results²⁶ indicate that supervisor bias has both positive and negative effects on future performance.²⁷ First, leniency bias has a positive effect on objective and total performance changes, indicating that employees who receive more lenient performance ratings on the subjective dimension show more performance improvement on the objective and total dimensions in the following year. The results also show that centrality bias, for above- and below-average performers, has a negative effect on performance change on all dimensions, indicating that more compressed ratings have a negative influence on the performance of all employees.

Regarding the impact of discretionary bonus adjustments, I find that discretionary adjustments have a positive effect on performance change, while discretionary adjustments made in the reference group have no effect. This implies that receiving a discretionary adjustment increases future performance. An alternative explanation is that the ratings in 2004 now portray
26 The Breusch-Pagan and White's tests for heteroskedasticity indicate that the data do not suffer from severe heteroskedasticity problems, and the Variance Inflation Factors (VIF) indicate the absence of multicollinearity problems.
27 I also examine the performance effects on a more aggregate level. More specifically, I investigate whether the average performance change of all employees who were appraised by the same supervisor is larger (smaller) when that supervisor has a stronger tendency to apply leniency (centrality) bias. To test this I use model 5, but measure the performance changes and the biases at the supervisor level. The results show that the average extent of leniency (centrality) bias is positively (negatively) related to the average performance change.

the true level of overall performance. Instead of needing an ex post adjustment to give the employee the fair amount of compensation, the true value is now directly captured by the measures, resulting in a higher rating. However, even if we cannot be sure that incentives will improve after employees have received an adjustment, the results show that employees who do not receive an adjustment do not show a significant negative performance reaction, thereby taking away an important reason for not adjusting rewards.

The results also show that the employees' growth potential (MGROWTH) has a positive effect on performance change. This shows that employees with more room for improvement improve their ratings to a greater extent. Furthermore, I find that changes in weights and in the number of performance measures influence performance change. Being evaluated by a different supervisor is also found to negatively affect performance change. Finally, the results show that age has a negative effect on performance change, implying that younger employees are more sensitive to incentives.

Robustness Checks

The results show that performance change on the subjective dimension is not significantly affected by leniency bias. A plausible explanation for this finding is that the motivational effect caused by the leniency bias in 2003 is offset by the less lenient average subjective performance ratings provided in 2004. An alternative explanation is that employees who score higher on the subjective dimension than on the objective dimension (the majority of the employees) reallocate their effort to the objective dimension because improving a high rating even further requires more than a linear effort increase. I examine this possibility by including the quadratic term of the maximum growth potential (MGROWTH) to control for the possible


non-linear effort requirements. The results (not reported) show that the quadratic terms have no significant effect on performance change and all other inferences remain the same. I also investigate the influence of effort reallocation on performance change. I do this by examining the group of employees who are most likely to reallocate their effort from the subjective to the objective dimension because of non-linear effort requirements; i.e., employees scoring high on the subjective dimension (top 20%) and scoring higher on the subjective than on the objective dimension. I use model 5 and replace the leniency bias variable by a dummy variable that indicates one if the employee belongs to this group and zero otherwise. The results (not reported) show no relation between the dummy variable and performance change on the objective dimension, suggesting that the positive relationship between leniency bias and performance change on the objective dimension is not driven by the reallocation of effort from the subjective to the objective dimension. Previously, I implicitly assumed that the performance ratings are exactly proportional to the bonus percentage. The ratings are, however, linked to a bonus range, not to a concrete percentage. The actual bonus might therefore not be exactly proportional to the rating, even if no discretionary bonus adjustment has occurred. To ensure that this is not driving the results, I calculate the difference between the actual bonus percentage and the bonus percentage that would have been paid if the relation between the rating and the bonus percentage had been completely proportional. After replacing the discretionary adjustment variable (DIS_ADJ) with this newly constructed difference variable, I re-estimate specification 5. 
The results (not reported) are similar to those in Table 7.²⁸ In summary, I find strong support for the prediction that leniency and centrality bias affect the effectiveness of the compensation system as an incentive provider.
28 The results of models 3 and 4 also remain unchanged when DIS_ADJ is replaced by this new difference variable.

V. CONCLUSIONS AND FUTURE RESEARCH

Rating inaccuracy caused by supervisor bias is perceived to be one of the main problems arising from introducing subjectivity into compensation contracts. This is due to its assumed negative effect on the compensation system's ability to motivate employees and provide valuable information for personnel decisions. Our knowledge of the influence of supervisor bias on the effectiveness of compensation contracting is, however, extremely limited. Empirical studies examining the consequences of biased performance ratings are especially lacking.

This paper contributes to the literature by investigating what causes supervisors to bias performance ratings and by analyzing how biased ratings affect employee incentives. The results indicate that the use of subjective measures leads to more lenient and compressed ratings and that supervisors bias ratings to a greater extent when bias enhances the utility they derive from rating performance. The results also provide strong support for the prediction that supervisor bias not only affects current performance ratings, but also future employee incentives. I find that leniency bias has a positive effect on performance improvement, while centrality bias has a negative performance effect for both above- and below-average performers.

The empirical finding that supervisor bias can have a positive influence on employee incentives provides an explanation for earlier empirical findings that show that supervisors in general do not receive rewards for rating accurately (e.g., Napier & Latham, 1986). Companies seem to be interested in the effectiveness of performance-based compensation contracts in increasing employees' future performance, not necessarily in the accuracy of the performance appraisal as such (at least not for incentive purposes). Motivation to manage the performance appraisal process in an efficient manner seems to be provided by linking the supervisors'


compensation and promotion possibilities to their units' performance, not to the accuracy of their units' performance ratings (Harris, 1994).

The findings from this study are not without limitations. First, although there are no theoretical reasons to expect that the results would not extend to other settings, their generalizability is reduced by the reliance on data from one firm. Moreover, I was only able to obtain information for two consecutive years. More extensive time-series data from multiple organizations would remove these limitations. Second, the focus of this study has been on intentional bias. Supervisors can also unintentionally bias performance ratings due to their cognitive limitations. I was unable to distinguish the extent of distortion caused by these cognitive limitations and was therefore unable to investigate its effect.

Another opportunity for future research is to examine the effect of inaccurate ratings on other functions of performance ratings. I have limited the study to incentive provision, but performance ratings are also likely to influence training and promotion decisions, among others. Finally, this study focuses only on the effects of biased performance ratings on future performance. Subjectivity in compensation contracting is likely to give rise to additional concerns, e.g., uncertainty about performance criteria. Investigating how these other concerns, in combination with biased ratings, influence the effectiveness of compensation contracts would make an important contribution to the literature.


REFERENCES

Akerlof, G. A. & J. L. Yellen, 1988, Fairness and Unemployment, The American Economic Review, 78(2): 44-49.
Arkin, R., H. Cooper, & T. Kolditz, 1980, A Statistical Review of the Literature Concerning the Self-Serving Attribution Bias in Interpersonal Influence Situations, Journal of Personality, 48: 435-48.
Baiman, S. & M. V. Rajan, 1995, The Informational Advantages of Discretionary Bonus Schemes, The Accounting Review, 70(4): 557-79.
Baker, G. P., R. Gibbons, & K. J. Murphy, 1994, Subjective Performance Measures in Optimal Incentive Contracts, Quarterly Journal of Economics, 109(4): 1125-56.
Bénabou, R. & J. Tirole, 2003, Intrinsic and Extrinsic Motivation, Review of Economic Studies, 70(3): 489-520.
Bénabou, R. & J. Tirole, 2002, Self-Confidence and Personal Motivation, The Quarterly Journal of Economics, 117(3): 871-915.
Bernardin, J. H. & R. M. Buckley, 1981, Strategies in Rater Training, Academy of Management Review, 6(2): 205-12.
Berry, N. H., P. D. Nelson, & M. McNally, 1966, A Note on Supervisor Ratings, Personnel Psychology, 19: 423-26.
Blinder, A. S. & D. H. Choi, 1990, A Shred of Evidence on Theories of Wage Stickiness, The Quarterly Journal of Economics, 105(4): 1003-15.
Bretz, R. D., G. T. Milkovich, & W. Read, 1992, The Current State of Performance Appraisal Research and Practice: Concerns, Directions, and Implications, Journal of Management, 18(2): 321-52.
Burney, L. L., C. A. Henle, & S. K. Widener, 2006, Do Characteristics of Strategic Performance Measurement Systems Used in Incentives Enhance Organizational Fairness?, Working Paper: 1-41.
Colquitt, J. A., D. E. Conlon, M. J. Wesson, C. O. L. H. Porter, & K. Yee Ng, 2001, Justice at the Millennium: A Meta-Analytic Review of 25 Years of Organizational Justice Research, Journal of Applied Psychology, 86(3): 425-45.
DeCotiis, T. & A. Petit, 1978, The Performance Appraisal Process: A Model and Some Testable Propositions, Academy of Management Review, 3(3): 635-46.
Dubinsky, A. J. & M. Levy, 1989, Influence of Organizational Fairness on Work Outcomes of Retail Salespeople, Journal of Retailing, 65(2): 221-52.
Fang, H. & G. Moscarini, 2002, Overconfidence, Morale and Wage-Setting Policies: Mimeo, Yale University.
Gabris, G. T. & K. Mitchell, 1988, The Impact of Merit Raise Scores on Employee Attitudes: The Matthew Effect of Performance Appraisal, Public Personnel Management, 17(4): 369-89.
Gibbs, M., K. A. Merchant, W. A. Van der Stede, & M. E. Vargus, 2004, Determinants and Effects of Subjectivity in Incentives, The Accounting Review, 79(2): 409-36.
Greenberg, J., 1990, Organizational Justice: Yesterday, Today, and Tomorrow, Journal of Management, 16(2): 399-432.
Greene, W. H., 1993, Econometric Analysis, 5 ed, New Jersey: Pearson Education, Inc.
Hannan, L. R., J. H. Kagel, & D. V. Moser, 2002, Partial Gift Exchange in an Experimental Labor Market: Impact of Subject Population Differences, Productivity Differences, and Effort Requests on Behavior, Journal of Labor Economics, 20(4): 923-51.
Harris, M. M., 1994, Rater Motivation in the Performance Appraisal Context: A Theoretical Framework, Journal of Management, 20(4): 737-56.
Harris, M. M. & J. Schaubroeck, 1988, A Meta-Analysis of Self-Supervisor, Self-Peer, and Peer-Supervisor Ratings, Personnel Psychology, 41: 43-62.
Hayes, R. M. & S. Schaefer, 2000, Implicit Contracts and the Explanatory Power of Top Executive Compensation for Future Performance, Rand Journal of Economics, 31(2): 273-93.
Heneman, H. G. & D. P. Schwab, 1972, Evaluation of Research on Expectancy Theory Predictions of Employee Performance, Psychological Bulletin, 78(1): 1-9.
Holmstrom, B., 1979, Moral Hazard and Observability, Bell Journal of Economics, 10(1): 74-91.
Indjejikian, R. J., 1999, Performance Evaluation and Compensation Research: An Agency Perspective, Accounting Horizons, 13(2): 147-57.
Ittner, C. D., D. F. Larcker, & M. W. Meyer, 2003, Subjectivity and the Weighting of Performance Measures: Evidence from a Balanced Scorecard, The Accounting Review, 78(3): 725-58.
Jawahar, I. M. & C. R. Williams, 1997, Where All the Children Are above Average: The Performance Appraisal Purpose Effect, Personnel Psychology, 50: 905-26.
Kahn, L. M. & P. D. Sherer, 1990, Contingent Pay and Managerial Performance, Industrial and Labor Relations Review, 43(Special Issue): 107-20.
Landy, F. J. & J. L. Farr, 1980, Performance Rating, Psychological Bulletin, 87(1): 72-107.
Larwood, L. & W. Whittaker, 1977, Managerial Myopia: Self-Serving Biases in Organizational Planning, Journal of Applied Psychology, 62(2): 194-98.
Lazear, E. P., 1991, Labor Economics and the Psychology of Organizations, Journal of Economic Perspectives, 5(2): 89-110.
Lipe, M. G. & S. E. Salterio, 2000, The Balanced Scorecard: Judgmental Effects of Common and Unique Performance Measures, The Accounting Review, 75(3): 283-98.
Longenecker, C. O., H. P. Sims, & D. A. Gioia, 1987, Behind the Mask: The Politics of Employee Appraisal, The Academy of Management Executive, 1(3): 183-93.
McFarlane Shore, L. & G. C. Thornton, 1986, Effects of Gender on Self- and Supervisory Ratings, Academy of Management Journal, 29(1): 115-29.
McGregor, D. M., 1957, An Uneasy Look at Performance Appraisal, Harvard Business Review, 35(3): 89-94.
Meyer, H. H., 1975, The Pay-for-Performance Dilemma, Organizational Dynamics, 3: 39-50.
Mischel, W., E. B. Ebbesen, & A. M. Zeiss, 1976, Determinants of Selective Memory About the Self, Journal of Consulting and Clinical Psychology, 44(1): 92-103.
Moers, F., 2005, Discretion and Bias in Performance Evaluation: The Impact of Diversity and Subjectivity, Accounting, Organizations and Society, 30(1): 67-80.
Murphy, K. R. & J. Cleveland, 1991, Performance Appraisal: An Organizational Perspective, Boston: Allyn and Bacon.
Napier, N. K. & G. P. Latham, 1986, Outcome Expectancies of People Who Conduct Performance Appraisals, Personnel Psychology, 39(4): 827-37.
O'Connor, N., F. J. Deng, & M. D. Shields, 2006, Determinants of the Subjective Performance Measurement of Managerial Behavior, Working Paper: 1-33.
Parker, R. J. & J. M. Kohlmeyer, 2005, Organizational Justice and Turnover in Public Accounting Firms: A Research Note, Accounting, Organizations and Society, 30: 357-69.
Pearce, J. L. & L. W. Porter, 1986, Employee Responses to Formal Performance Appraisal Feedback, Journal of Applied Psychology, 71(2): 211-18.
Prendergast, C., 1999, The Provision of Incentives in Firms, Journal of Economic Literature, 37(1): 7-63.
Prendergast, C. & R. H. Topel, 1993, Discretion and Bias in Performance Evaluation, European Economic Review, 37(2-3): 355-65.
Prendergast, C. & R. H. Topel, 1996, Favoritism in Organizations, Journal of Political Economy, 104(5): 958-78.
Rosen, B. & T. H. Jerdee, 1974, Influence of Sex Role Stereotypes on Personnel Decisions, Journal of Applied Psychology, 59(1): 9-14.
Rynes, S. L., B. Gerhart, & L. Parks, 2005, Personnel Psychology: Performance Evaluation and Pay for Performance, Annual Review of Psychology, 56: 571-600.
Tsui, A. S. & C. A. O'Reilly, 1989, Beyond Simple Demographic Effects: The Importance of Relational Demography in Superior-Subordinate Dyads, The Academy of Management Journal, 32(2): 402-23.
Varma, A., A. S. DeNisi, & L. H. Peters, 1996, Interpersonal Affect and Performance Appraisals: A Field Study, Personnel Psychology, 49: 341-60.
Vroom, V. H., 1964, Work and Motivation: Wiley New York.
Weinstein, N. D., 1980, Unrealistic Optimism About Future Life Events, Journal of Personality and Social Psychology, 39(5): 806-20.


FIGURE 1 The performance measurement document

PERFORMANCE DOCUMENT
Name: / Department: / Position: / Supervisor:

Output performance measures (items 1 to 6): each evaluated as Bad, Regular, Good, Very Good, or Excellent, with space for comments.
Competence performance measures (items 1 to 6): each evaluated as Bad, Regular, Good, Very Good, or Excellent, with space for comments.

Total rating output dimension: 50%
Total rating competence dimension: 50%
Total score:

Salary Determination
| Total score | Bad | Regular | Good | Very Good | Excellent |
|---|---|---|---|---|---|
| Percentage | 0% | 1-3% | 4-8% | 9-12% | 13-15% |

Current Salary: Salary Scale / Growth % / Bonus
New Salary: Salary Scale / Growth % / Bonus
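As an illustration of how the form in Figure 1 combines its entries, the following is a minimal sketch. It assumes the 50%/50% weights printed on the form and a numeric 1 to 5 coding of the Bad to Excellent scale; the function and constant names are hypothetical, and the figure does not state how a numeric total score is mapped to a label, so that step is left out.

```python
# Percentage bands printed under "Salary Determination" in Figure 1.
PERCENTAGE_BANDS = {
    "Bad": (0, 0),
    "Regular": (1, 3),
    "Good": (4, 8),
    "Very Good": (9, 12),
    "Excellent": (13, 15),
}


def total_score(output_rating, competence_rating, output_weight=0.5):
    """Weighted total of the two dimension ratings.

    The default weight of 0.5 mirrors the 50%/50% split shown on the form;
    actual contracts in the sample use other weights as well.
    """
    return output_weight * output_rating + (1 - output_weight) * competence_rating


def percentage_band(label):
    """Return the (minimum, maximum) percentage attached to a rating label."""
    return PERCENTAGE_BANDS[label]


# Hypothetical employee: 3.0 on the output dimension, 4.0 on competence.
example_total = total_score(3.0, 4.0)
example_band = percentage_band("Good")
```

Keeping the bands in a single dictionary makes it easy to compare an awarded percentage with the top of the stated range for the label.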


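TABLE 1 Descriptive statistics: Part I

2003
| Variable | Mean | SD | Median | Range | n |
|---|---|---|---|---|---|
| Overall performance rating | 2.87 | 0.52 | 2.86 | 1.6 to 4.7 | 198 |
| Objective performance rating | 2.80 | 0.76 | 2.75 | 1.0 to 5.0 | 198 |
| Subjective performance rating | 2.94 | 0.46 | 3.00 | 1.5 to 4.3 | 198 |
| Total # of performance measures | 9.70 | 1.77 | 10.00 | 5 to 12 | 198 |
| # of objective performance measures | 4.04 | 1.35 | 4.00 | 1 to 6 | 198 |
| # of subjective performance measures | 5.66 | 1.36 | 6.00 | 2 to 9 | 198 |
| % of total rating objectively determined | 44.33 | 8.72 | 50.00 | 25 to 50 | 198 |
| % of total rating subjectively determined | 55.67 | 8.72 | 50.00 | 50 to 75 | 198 |

2004
| Variable | Mean | SD | Median | Range | n |
|---|---|---|---|---|---|
| Overall performance rating | 3.20 | 0.50 | 3.23 | 1.6 to 4.4 | 198 |
| Objective performance rating | 3.28 | 0.74 | 3.33 | 1.0 to 5.0 | 198 |
| Subjective performance rating | 3.14 | 0.45 | 3.13 | 2.1 to 4.6 | 198 |
| Total # of performance measures | 10.68 | 1.49 | 11.00 | 5 to 12 | 198 |
| # of objective performance measures | 4.48 | 1.30 | 5.00 | 2 to 6 | 198 |
| # of subjective performance measures | 6.20 | 1.06 | 6.00 | 3 to 8 | 198 |
| % of total rating objectively determined | 44.37 | 7.57 | 50.00 | 25 to 50 | 198 |
| % of total rating subjectively determined | 55.63 | 7.57 | 50.00 | 50 to 75 | 198 |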


TABLE 2 Descriptive statistics: Part II

| Variable | Mean | SD | Median | Range | n |
|---|---|---|---|---|---|
| Age | 38.39 | 8.60 | 37 | 23 to 60 | 198 |
| Supervisor age | 39.76 | 7.54 | 37 | 28 to 57 | 41 |
| Designated contract hours in 2003 | 32.68 | 6.52 | 36 | 12 to 40 | 198 |
| Designated contract hours in 2004 | 32.27 | 6.80 | 36 | 12 to 40 | 198 |
| Profit growth in 2003 (a) | 1.09 | 2.62 | 1.60 | -6.2 to 11.9 | 198 |
| Profit growth in 2004 (a) | 5.68 | 18.21 | 1.27 | -20.3 to 67.2 | 198 |
| Budgeted versus actual profit in 2003 (b) | 3.61 | 3.56 | 1.64 | 1.6 to 12.6 | 198 |
| Budgeted versus actual profit in 2004 (b) | 4.14 | 4.05 | 4.24 | -4.2 to 7.81 | 198 |

| Variable | 2003 | 2004 |
|---|---|---|
| % of employees at their salary scale maximum | 26.26 | 31.31 |
| % of female employees | 56.06 | 56.06 |
| Number of supervisors | 35 | 32 |
| % of female supervisors | 28.57 | 40.63 |

(a) The percentage difference between the office's profit this year and the office's profit last year.
(b) The percentage difference between the office's budgeted profit growth and the actual profit growth.


TABLE 3 Correlation between variables

198 observations for 2003
| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. Overall performance rating | | | | | | | | | | |
| 2. Objective performance rating | 0.90*** | | | | | | | | | |
| 3. Subjective performance rating | 0.80*** | 0.47*** | | | | | | | | |
| 4. Total # of performance m. | -0.14** | -0.11 | -0.13* | | | | | | | |
| 5. # of objective performance m. | -0.12* | -0.17** | -0.02 | 0.65*** | | | | | | |
| 6. # of subjective performance m. | -0.06 | 0.03 | -0.15** | 0.66*** | -0.15** | | | | | |
| 7. % objectively determined | -0.10 | -0.16** | 0.02 | 0.10 | 0.52*** | -0.39*** | | | | |
| 8. Employee age | -0.02 | -0.03 | -0.03 | -0.04 | -0.02 | -0.03 | -0.03 | | | |
| 9. Employee contract hours | 0.18** | 0.19*** | 0.12* | 0.30*** | 0.13* | 0.26*** | 0.04 | -0.04 | | |
| 10. Profit growth | -0.16** | -0.15** | -0.12 | -0.12* | -0.05 | -0.10 | -0.01 | 0.04 | -0.17** | |
| 11. Budgeted versus actual profit | 0.24*** | 0.12* | 0.35*** | -0.06 | 0.17** | -0.25*** | 0.09 | -0.16** | 0.02 | -0.11 |

198 observations for 2004
| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. Overall performance rating | | | | | | | | | | |
| 2. Objective performance rating | 0.88*** | | | | | | | | | |
| 3. Subjective performance rating | 0.78*** | 0.41*** | | | | | | | | |
| 4. Total # of performance m. | -0.08 | -0.08 | -0.05 | | | | | | | |
| 5. # of objective performance m. | 0.02 | -0.03 | 0.07 | 0.72*** | | | | | | |
| 6. # of subjective performance m. | -0.14** | -0.08 | -0.16** | 0.53*** | -0.21*** | | | | | |
| 7. % objectively determined | 0.18*** | 0.10 | 0.19*** | -0.05 | 0.57*** | -0.77*** | | | | |
| 8. Employee age | -0.22*** | -0.25*** | -0.09 | 0.04 | -0.02 | 0.08 | -0.06 | | | |
| 9. Employee contract hours | 0.14** | 0.15** | 0.11 | 0.10 | 0.08 | 0.04 | -0.01 | 0.00 | | |
| 10. Profit growth | -0.08 | -0.09 | -0.04 | -0.28*** | -0.19*** | -0.16** | -0.10 | -0.01 | 0.08 | |
| 11. Budgeted versus actual profit | 0.12 | 0.13* | 0.03 | 0.36*** | 0.10 | 0.39*** | -0.02 | 0.07 | 0.14* | -0.30*** |

***, **, * indicate statistical significance at respectively the 1%, 5%, and 10% level (two-tailed).


TABLE 4 Descriptive statistics: Part III

| Variable | Mean | SD | Median | Range | n |
|---|---|---|---|---|---|
| 2003 | | | | | |
| Leniency bias (a) | 0.06 | 0.32 | 0.00 | -0.44 to 1.75 | 198 |
| Centrality bias (b) | 2.18 | 1.41 | 1.92 | 0.32 to 5.95 | 198 |
| Leniency bias per supervisor (c) | 0.06 | 0.22 | 0.03 | -0.29 to 0.80 | 35 |
| Centrality bias per supervisor | 2.22 | 1.36 | 2.17 | 0.40 to 5.95 | 35 |
| 2004 | | | | | |
| Leniency bias | -0.06 | 0.27 | -0.11 | -0.50 to 1.61 | 198 |
| Centrality bias | 2.20 | 1.15 | 1.92 | 0.52 to 6.05 | 198 |
| Leniency bias per supervisor | -0.06 | 0.14 | -0.10 | -0.25 to 0.40 | 32 |
| Centrality bias per supervisor | 2.10 | 1.00 | 1.88 | 0.68 to 5.10 | 32 |

| Variable | 2003 | 2004 |
|---|---|---|
| Discretionary bonus adjustment | 35% | 19% |
| Average bonus adjustment (d) | 2.7% | 1.5% |

(a) Leniency bias is measured by using the residuals of a regression model; the overall mean is thus forced to be zero.
(b) Centrality bias is measured as the ratio between the standard deviation of all performance ratings on the objective dimension and the standard deviation of all performance ratings on the subjective dimension for each reference group.
(c) The average amount of bias applied per supervisor.
(d) The average bonus adjustment once an adjustment is made. All adjustments were of a positive nature.
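The centrality measure described in note b can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the study: the input layout, the function name `centrality_bias`, and the sample ratings are hypothetical. Because both standard deviations are taken over the same observations, the ratio is the same whether sample or population standard deviations are used.

```python
from collections import defaultdict
from statistics import stdev


def centrality_bias(rows):
    """Ratio of objective to subjective rating dispersion per reference group.

    rows: iterable of (reference_group, objective_rating, subjective_rating).
    Returns {group: sd(objective) / sd(subjective)}. Values above 1 mean the
    subjective ratings are more compressed than the objective ratings.
    A group whose subjective ratings are all identical would raise
    ZeroDivisionError and needs separate handling.
    """
    objective = defaultdict(list)
    subjective = defaultdict(list)
    for group, obj, subj in rows:
        objective[group].append(obj)
        subjective[group].append(subj)
    return {g: stdev(objective[g]) / stdev(subjective[g]) for g in objective}


# Hypothetical reference group "A" with three employees.
sample = [("A", 2.0, 2.5), ("A", 3.0, 3.0), ("A", 4.0, 3.5)]
bias_by_group = centrality_bias(sample)
```

In this made-up group the objective ratings spread twice as widely as the subjective ones, so the measure equals 2, close to the sample means reported in the table.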


TABLE 5 The impact of subjectivity on the performance rating and on its variation

| Independent variables (a) | 2003 and 2004: RATING | 2003 and 2004: R_RATING | 2003: RATING | 2003: R_RATING | 2004: RATING | 2004: R_RATING |
|---|---|---|---|---|---|---|
| DSubjectivity | 0.10** (2.11) | -0.14*** (-7.35) | 0.29*** (4.80) | -0.15*** (-5.70) | -0.09 (-1.44) | -0.11*** (-4.82) |
| WEIGHT | 0.00 (0.20) | 0.00 (1.33) | -0.00 (-0.86) | 0.01** (2.04) | 0.01 (0.88) | 0.00 (1.54) |
| NR_PM | -0.06*** (-2.76) | -0.00 (-0.25) | -0.06** (-2.31) | -0.01 (-1.27) | -0.07 (-1.19) | -0.02 (-1.32) |
| Y2003 | -0.40*** (-9.80) | 0.05*** (3.28) | | | | |
| AGE | -0.01** (-2.47) | 0.01* (1.76) | -0.01 (-1.34) | -0.00 (-0.20) | -0.01*** (-2.69) | 0.01*** (2.66) |
| SEX | -0.12 (-1.38) | 0.05* (1.81) | -0.10 (-1.14) | 0.04 (1.17) | -0.17 (-1.55) | 0.07* (1.87) |
| HOURS | 0.01 (1.43) | -0.00 (-1.14) | 0.01 (1.50) | -0.00 (-1.12) | 0.01 (1.08) | -0.00 (-1.05) |
| R2 Within | 0.22 | 0.14 | 0.09 | 0.22 | 0.05 | 0.14 |
| R2 Between | 0.41 | 0.32 | 0.58 | 0.42 | 0.37 | 0.32 |
| R2 Overall | 0.31 | 0.20 | 0.43 | 0.33 | 0.26 | 0.24 |

***, **, * indicate statistical significance at respectively the 1%, 5%, and 10% level (two-tailed). Z-values are in parentheses and n = 792 for the first two regressions and n = 396 for the last four.

| Variable | Definition |
|---|---|
| RATING | Performance rating, on either the objective or the subjective dimension |
| R_RATING | Performance ratings variation, measured as the ratio between the employee's performance rating on the objective (subjective) dimension and the mean performance rating on the objective (subjective) dimension: Max((rating/mean rating), (mean rating/rating)) |
| DSubjectivity | Dummy variable that equals 1 if the observation refers to a subjective performance rating and 0 otherwise |
| WEIGHT | Contractual incentive weight on the dimension (objective or subjective) to which the observation relates |
| NR_PM | Number of performance measures on the dimension (objective or subjective) to which the observation relates |
| Y2003 | Dummy variable that equals 1 if the observation relates to 2003 and 0 otherwise |
| AGE | Employee age |
| SEX | Employee gender (0 for female and 1 for male) |
| HOURS | Employee designated contract hours per week |

(a) An intercept, as well as supervisor, department, and office dummies, are included but not reported.
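The R_RATING definition can be illustrated with a few lines of code. This is a sketch only: the ratings are hypothetical, and the group over which the mean rating is taken is an assumption, since the definition does not state it here.

```python
def r_rating(rating, mean_rating):
    """Max(rating / mean rating, mean rating / rating).

    Equals 1 when the rating sits exactly at the mean and grows as the
    rating moves away from the mean in either direction.
    """
    return max(rating / mean_rating, mean_rating / rating)


# Hypothetical ratings on one dimension; the mean is taken over this list.
ratings = [2.0, 4.0, 5.0, 5.0]
mean_rating = sum(ratings) / len(ratings)
variation = [r_rating(r, mean_rating) for r in ratings]
```

Taking the larger of the two ratios treats deviations above and below the mean alike, so a lower average R_RATING on the subjective dimension indicates compression toward the mean.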


TABLE 6 The determinants of leniency and centrality bias

| Independent variables (a) | LEN_BIAS | CEN_BIAS |
|---|---|---|
| OBJ_R | -0.33*** (-16.07) | -0.02 (-0.36) |
| NEW_R | -0.05** (-2.12) | 0.27** (2.19) |
| INFO_C | 0.04* (1.78) | 0.59** (2.01) |
| GROWTH | 0.02** (2.08) | -0.01*** (-4.73) |
| BUDGET | 0.01*** (3.44) | -0.02** (-1.98) |
| SUP_SEX | 0.11*** (4.05) | -0.08 (-0.22) |
| SEX | 0.03 (1.26) | -0.00 (-0.02) |
| DIF_AGE | -0.01* (-1.73) | -0.00 (-0.55) |
| DIS_ADJ | 0.00 (0.44) | 0.02 (0.81) |
| Y2003 | -0.06*** (-3.22) | 0.06 (0.68) |
| REF_SD | | 2.81*** (12.56) |
| R2 Within | 0.73 | 0.46 |
| R2 Between | 0.63 | 0.25 |
| R2 Overall | 0.67 | 0.36 |

***, **, * indicate statistical significance at respectively the 1%, 5%, and 10% level (two-tailed). Z-values are in parentheses and n = 396 for both regressions.

| Variable | Definition |
|---|---|
| LEN_BIAS | Leniency bias, measured as the residuals of the regression of the ratio between the subjective performance rating and the objective performance rating on the contractual weights, the number of performance measures and the centrality bias |
| CEN_BIAS | Centrality bias, measured as the ratio between the standard deviation of all performance ratings on the objective dimension and the standard deviation of all performance ratings on the subjective dimension for each reference group |
| OBJ_R | Objective performance rating |
| NEW_R | Dummy variable that equals 1 if the employee joined the company in the last three years, if the employee recently changed jobs within the company or if a new supervisor was assigned, and 0 otherwise |
| INFO_C | Dummy variable that is 0 if the supervisor and the employee have the same organizational level and 1 if the supervisor has a higher organizational level |
| GROWTH | Percentage difference between the profit of this year and the profit of last year, calculated per office per year |
| BUDGET | Percentage difference between budgeted profit growth and actual profit growth, calculated per office per year |
| SUP_SEX | Supervisor gender (0 for female and 1 for male) |
| SEX | Employee gender (0 for female and 1 for male) |
| DIF_AGE | Absolute difference between supervisor age and employee age |
| DIS_ADJ | Difference between the rewarded bonus percentage and the maximum of the stated bonus range |
| Y2003 | Dummy variable that equals 1 if the observation relates to 2003 and 0 otherwise |
| REF_SD | Standard deviation of all objective performance ratings within the same reference group |

(a) An intercept and department dummies are included but not reported.
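The LEN_BIAS definition, residuals from regressing the subjective-to-objective rating ratio on the contractual weights, the number of performance measures and the centrality bias, can be sketched with ordinary least squares. The code below is an illustration under assumptions: it uses plain pooled OLS solved through the normal equations, which may differ from the estimator actually used, and the function names and the six observations are hypothetical.

```python
def ols_residuals(y, X):
    """Residuals from an OLS regression of y on the columns of X plus an intercept."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented normal equations [A'A | A'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(rows, y)]


def leniency_bias(subjective, objective, weight, nr_pm, cen_bias):
    """Residual of the subjective/objective rating ratio after controlling for
    contractual weight, number of measures and centrality bias."""
    ratio = [s / o for s, o in zip(subjective, objective)]
    return ols_residuals(ratio, list(zip(weight, nr_pm, cen_bias)))


# Hypothetical observations (not taken from the study's data).
subjective = [3.0, 3.5, 2.8, 3.2, 4.0, 3.1]
objective = [2.5, 3.5, 3.0, 2.8, 4.2, 3.3]
weight = [0.50, 0.75, 0.50, 0.60, 0.75, 0.50]
nr_pm = [6, 5, 7, 6, 4, 8]
cen_bias = [2.0, 1.5, 2.5, 1.0, 3.0, 1.8]
bias = leniency_bias(subjective, objective, weight, nr_pm, cen_bias)
```

Because the regression includes an intercept, the residuals sum to zero over the estimation sample, which is why the overall mean of the leniency measure is zero by construction; a positive residual marks a subjective rating that is high relative to what the controls predict.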


TABLE 7 Regression of the effects of supervisor discretion on incentive provision

| Independent variables (a) | PERF_O | PERF_S | PERF_T |
|---|---|---|---|
| LEN_BIAS | 0.73*** (3.00) | -0.10 (-0.94) | 0.22* (1.89) |
| CEN_BIASA | -0.14*** (-3.12) | -0.04* (-1.72) | -0.08*** (-3.08) |
| CEN_BIASB | -0.13*** (-3.15) | -0.06*** (-2.69) | -0.09*** (-3.69) |
| DIS_ADJ | 0.06* (1.86) | 0.07*** (4.01) | 0.07*** (3.65) |
| REF_DISADJ | 0.01 (0.11) | 0.07 (0.93) | 0.02 (0.27) |
| MGROWTH (b) | 0.27** (2.26) | 0.45*** (6.78) | 0.36*** (4.66) |
| DIF_WO | 0.01** (2.07) | 0.00 (0.67) | 0.01** (2.36) |
| DIF_NR (c) | -0.02 (-0.53) | -0.06* (-1.79) | -0.02 (-1.34) |
| DDIF_SUP | -0.18 (-1.46) | -0.13* (-1.91) | -0.16** (-2.14) |
| DDIF_JOB | 0.17 (0.85) | -0.04 (-0.32) | 0.07 (0.56) |
| MAX_SC | -0.18 (-1.42) | 0.05 (0.68) | -0.06 (-0.80) |
| HOURS | -0.01 (-0.73) | 0.00 (0.10) | -0.00 (-0.42) |
| AGE | -0.02*** (-2.64) | -0.01** (-2.27) | -0.01*** (-3.18) |
| R2-ADJ | 0.47 | 0.46 | 0.51 |

***, **, * indicate statistical significance at respectively the 1%, 5%, and 10% level (two-tailed). T-values are in parentheses and n = 198 for all three regressions.

| Variable | Definition |
|---|---|
| PERF_O | Difference between the objective performance rating of 2003 and the objective performance rating of 2004 |
| PERF_S | Difference between the subjective performance rating of 2003 and the subjective performance rating of 2004 |
| PERF_T | Difference between the total performance rating of 2003 and the total performance rating of 2004 |
| LEN_BIAS | Leniency bias, measured as the residuals of the regression of the ratio between the subjective performance rating and the objective performance rating on the contractual weights, the number of performance measures and the centrality bias |
| CEN_BIASA | Variable that equals CEN_BIAS if the observation relates to an above-average performer and 0 otherwise |
| CEN_BIASB | Variable that equals CEN_BIAS if the observation relates to a below-average performer and 0 otherwise |
| DIS_ADJ | Difference between the rewarded bonus percentage and the maximum of the stated bonus range |
| REF_DISADJ | Dummy variable that equals 1 if a bonus adjustment was made within the reference group and 0 otherwise |
| MGROWTH | Growth potential, measured as the difference between the maximum performance rating (5) and the obtained performance rating in 2003 |
| DIF_WO | Difference between the contractual incentive weight on the objective dimension in 2003 and in 2004 |
| DIF_NR | Difference between the number of performance measures included in 2003 and the number included in 2004 |
| DDIF_SUP | Dummy variable that equals 1 if the employee's supervisor is different in 2004 than in 2003 and 0 otherwise |
| DDIF_JOB | Dummy variable that equals 1 if the employee's function is different in 2004 than in 2003 and 0 otherwise |
| MAX_SC | Dummy variable that equals 1 if the employee has reached the maximum of his salary scale and 0 otherwise |
| HOURS | Employee designated contract hours per week |
| AGE | Employee age |

(a) An intercept, as well as department and office dummies, are included but not reported.
(b) For regressions with the dependent variable PERF_O, PERF_S and PERF_T I use the objective rating, the subjective rating and the total rating of 2003, respectively, to calculate MGROWTH.
(c) For regressions with the dependent variable PERF_O, PERF_S and PERF_T I use the number of objective, subjective and total performance measures, respectively, to calculate DIF_NR.
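The construction of the split centrality variables and of MGROWTH can be sketched as below. This is an illustration under assumptions: "above-average" is taken relative to a reference-group mean rating, an employee exactly at the mean is assigned to the below-average variable, and the function name and inputs are hypothetical.

```python
MAX_RATING = 5  # maximum performance rating named in the MGROWTH definition


def incentive_regressors(cen_bias, rating_2003, group_mean_2003):
    """Build CEN_BIASA, CEN_BIASB and MGROWTH for one employee.

    cen_bias: centrality bias of the employee's reference group.
    rating_2003: the employee's 2003 rating on the relevant dimension.
    group_mean_2003: assumed benchmark for "above-average" (hypothetical).
    """
    above_average = rating_2003 > group_mean_2003
    return {
        "CEN_BIASA": cen_bias if above_average else 0.0,
        "CEN_BIASB": 0.0 if above_average else cen_bias,
        "MGROWTH": MAX_RATING - rating_2003,
    }


# Hypothetical above-average and below-average performers in the same group.
high = incentive_regressors(cen_bias=2.0, rating_2003=3.5, group_mean_2003=3.0)
low = incentive_regressors(cen_bias=2.0, rating_2003=2.5, group_mean_2003=3.0)
```

Splitting CEN_BIAS this way lets the regression estimate separate coefficients for employees whose ratings are compressed downward and those whose ratings are compressed upward.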
