Application of The Cobb-Douglas Production Model To Libraries
Robert M. Hayes
2005
Overview
Production Functions in Economics
Applicable to Libraries?
Testing on Public Library Data
Optimization
Application to Academic Library Data
Production Functions - 1
In economics, a “production function” describes an empirical
relationship between specified output and inputs. A production
function can be used to represent output production for a single
firm, for an industry, or for a nation. Just to illustrate, a
production function of a wheat farm might have the form:
W=F(L,A,M,F,T,R)
The Cobb-Douglas production function has the form:
Q = a L^{b} C^{c}
where Q stands for output, L for labor, and C for capital. The
parameters a, b, and c (the latter two being the exponents) are
estimated from empirical data.
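A minimal sketch of why these parameters can be estimated by linear regression: taking logarithms turns the product into the linear form log Q = log a + b log L + c log C. The parameter values and input quantities below are hypothetical, chosen only for illustration.

```python
import math

def cobb_douglas(labor: float, capital: float, a: float, b: float, c: float) -> float:
    """Output Q = a * L**b * C**c."""
    return a * labor ** b * capital ** c

# Hypothetical parameter values (not taken from the source).
a, b, c = 2.0, 0.4, 0.6

q = cobb_douglas(10.0, 50.0, a, b, c)

# The same output computed from the log-linear form.
log_q = math.log(a) + b * math.log(10.0) + c * math.log(50.0)

# Scaling both inputs by a factor s scales output by s**(b + c).
scale = cobb_douglas(20.0, 100.0, a, b, c) / q

print(q, math.exp(log_q), scale)
```

With b + c = 1, as in the form applied to libraries later in the text, doubling both inputs doubles output.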
log(Circ/Srvst) = log(a) + (1 – b) log(Coll/Srvst)
where
“Circ” is the circulation
“Srvst” is the service staff
“Coll” is the collection size
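The ratio form can be fitted by simple least squares. The sketch below is one way to do that, not necessarily the procedure used for the results reported here. The function name `fit_ratio_form`, the use of base-10 logarithms, and the four data points are all assumptions; the data are generated from the model itself, so the fit recovers the chosen parameters exactly.

```python
import math

def fit_ratio_form(circ, srvst, coll):
    """Least-squares fit of log(Circ/Srvst) = log_a + slope * log(Coll/Srvst).

    Returns (log_a, slope, r), where slope estimates (1 - b).
    Base-10 logs are an assumption; the source does not state the base.
    """
    x = [math.log10(c / s) for c, s in zip(coll, srvst)]
    y = [math.log10(q / s) for q, s in zip(circ, srvst)]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    log_a = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return log_a, slope, r

# Hypothetical libraries, generated from Circ = 5 * Srvst**0.4 * Coll**0.6.
srvst = [5.0, 10.0, 20.0, 40.0]
coll = [20000.0, 30000.0, 90000.0, 100000.0]
circ = [5.0 * s ** 0.4 * c ** 0.6 for s, c in zip(srvst, coll)]

log_a, slope, r = fit_ratio_form(circ, srvst, coll)
print(log_a, slope, r)
```

On real data the points scatter around the line, so R falls below 1.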
Application to Public Libraries
To see whether the Cobb-Douglas production model is
applicable to public libraries, detailed data (1976) for
several states—California, Illinois, Ohio, Missouri,
Wisconsin—were used to determine the relevant
parameters.
Data for a portion of the California libraries (the 78
serving the largest populations) provided the primary
basis for exploration of the Cobb-Douglas model, while
those for the rest of the California libraries and for the
other states and national libraries served as the means for
testing and evaluating the results.
Testing on Data for California Libraries
78 largest libraries
76 largest, not including LAPL or LA County
All 173 libraries
35 of 78 largest with budgets less than $1,000,000
120 of all 173 with budgets less than $1,000,000
The Results for California Libraries
log(Circ/Srvst) = log(a) + (1 - b) log(Coll/Srvst)
                                             log a   1 - b   R
78 largest libraries                         .804    .590    .68
76 (not including LAPL or LA County)         .806    .590    .68
35 of 78 with incomes less than $1,000,000   .670    .770    .80
All 173 libraries                            .770    .592    .67
120 with incomes less than $1,000,000        .700    .654    .70
These data present a qualitatively consistent picture, showing a high correlation
between circulation per staff member and size of collection per service staff member.
Generalization to other States
Illinois Public Libraries
Ohio Public Libraries
Missouri Public Libraries
Libraries in each of these states are, on the whole, smaller than those in
California:
                          log a   1 - b   R
120 of 173 California     .709    .654    .70
454 of 567 Illinois       .633    .676    .79
230 of 251 Ohio           .617    .691    .78
122 of 121 Missouri       .670    .631    .64
In summary, the Cobb-Douglas equation appears consistently to describe
the behavior of libraries of a size determined by budget of less than $1
million, across a set of four states (California, Illinois, Ohio, and Missouri).
In each case, there is a relatively high correlation. There is close agreement
among the values for the parameters for the four regressions.
Discussion of Variance
Effect of Multi-collinearity
Effect of Demographic Factors
Effect of Multi-Collinearity - 1
The use of regression equations is an easy way to deal
with the kind of analyses involved in evaluating the Cobb-
Douglas equation. However, although easy, it is a way
fraught with pitfalls. In particular, the variables involved
are closely interrelated—multi-collinear. Both staff and
collection are highly correlated with each other and with
circulation. It is therefore easy to investigate equations
that will almost automatically result in high correlation,
but will simply reflect the self-evident correlations.
In particular, different forms of the Cobb-Douglas
equation, though arithmetically equivalent, can exhibit
radically different correlations.
Effect of Multi-Collinearity - 2
To illustrate, consider the following two equations:
log(CIRC) = log(a) + b log(SRVST) + (1 - b) log(COLL)
log(CIRC/SRVST) = log(a) + (1 - b) log(COLL/SRVST)
The correlations for these two equations, for the largest 78
California libraries and for the same values of a and (1 - b),
(viz., log(a) = .804 and 1 - b = .590) are, respectively, R = .96 and
R = .68. The reason is simple: The first equation is controlled by
the close relationship between circulation, on the one hand, and
service staff and collection size on the other; the second equation
depends upon the less clear-cut relation between the ratios.
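The effect can be illustrated on synthetic data. The sketch below does not use the California data and is not meant to reproduce the reported values. It generates hypothetical libraries whose collection size tracks staff size closely, then computes the correlation for each form of the equation with the same log(a) and b. The size range, noise levels and random seed are all assumptions.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)        # fixed seed so the run is repeatable
LOG_A, B = 0.804, 0.410       # log(a) = .804 and 1 - b = .590, as quoted in the text

level_pred, level_obs, ratio_x, ratio_y = [], [], [], []
for _ in range(500):
    log_srvst = rng.uniform(0.0, 3.0)                   # synthetic: sizes vary widely
    log_coll = log_srvst + 2.5 + rng.gauss(0.0, 0.2)    # synthetic: collection tracks staff
    log_circ = LOG_A + B * log_srvst + (1 - B) * log_coll + rng.gauss(0.0, 0.2)
    level_pred.append(LOG_A + B * log_srvst + (1 - B) * log_coll)
    level_obs.append(log_circ)
    ratio_x.append(log_coll - log_srvst)                # log(COLL/SRVST)
    ratio_y.append(log_circ - log_srvst)                # log(CIRC/SRVST)

r_level = pearson(level_pred, level_obs)   # first equation
r_ratio = pearson(ratio_x, ratio_y)        # second equation
print(f"level form R = {r_level:.2f}, ratio form R = {r_ratio:.2f}")
```

In the level form most of the correlation comes from the shared dependence on library size. Dividing through by staff removes that common factor, so the ratio form shows only the weaker relation that remains.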
Effect of Multi-Collinearity - 3
If this problem were treated as a multiple regression problem,
in which an effort were made to represent log(CIRC) as a
function of the two independent variables log(SRVST) and
log(COLL), several technical problems would arise:
1. The determinant of the matrix of inter-variable correlations would
be near zero, making it difficult to calculate the regression coefficients;
2. As a result, the computation would provide imprecise, highly
variable estimates of those coefficients; and
3. There would be large sampling variances.
The use of the ratios, CIRC/SRVST and COLL/SRVST,
significantly reduces the impact of the multi-collinearity. It
permits one to obtain consistent estimates and to avoid the
technical problems of multi-collinearity.
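For two predictors these problems can be quantified directly. The inter-variable correlation matrix is [[1, r], [r, 1]], its determinant is 1 - r^2, and the sampling variance of each coefficient is inflated by the factor 1/(1 - r^2). A minimal sketch follows; the correlation values are hypothetical, not measured from the library data.

```python
def correlation_determinant(r: float) -> float:
    """Determinant of the 2x2 correlation matrix [[1, r], [r, 1]]."""
    return 1.0 - r * r

def variance_inflation(r: float) -> float:
    """Inflation of coefficient sampling variance for two predictors."""
    return 1.0 / (1.0 - r * r)

# Hypothetical correlations between log(SRVST) and log(COLL).
for r in (0.0, 0.90, 0.99):
    print(r, correlation_determinant(r), variance_inflation(r))
```

As r approaches 1 the determinant approaches zero and the variance inflation grows without bound, which is the situation described in points 1 to 3.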
Effect of Demographic Factors - 1
The second, and more important, issue involved in evaluating
the correlations found for the Cobb-Douglas equation arises
because the library is not a market-oriented organization.
The management decision with respect to allocation of
resources between capital (i.e., collection) and service staff
therefore is likely to account for only a portion of the
variance among libraries, with at least part of the remaining
variance being determined by demographic factors.
The analysis presented of the California data considers only
those issues affected by library management decisions and
accounts for only 50% of the variance among California
libraries. It is, therefore, worthwhile to assess the effect of
some demographic factors.
Effect of Demographic Factors - 2
Consider the following log linear form which combines Cobb-Douglas with some
demographic factors:
log(y0) = \sum_{i=1}^{6} ai log(yi)
where
y0 = circ/popl,
y1 = a (srvst)^{b} (coll)^{1-b} / popl,
y2 = (average income),
y3 = (average years of education)
y4 = (number in school)/popl,
y5 = (area/popl)
y6 = (average distance)
Effect of Demographic Factors - 3
The first, y0, is the same dependent variable used before; the
second, y1, is the Cobb-Douglas formula for circulation, divided by
population; the remaining are the typical demographic variables.
The regression for California libraries on this equation, combining
Cobb-Douglas and demographic factors, was as follows:
                             ai      R2     beta
1  Cobb-Douglas              .642    .446   .606
2  Income                    .101    .119   .067
3  Education                 1.856   .050   .225
4  Percent in school         .126    .030   .236
5  Density of population     .076    .025   .473
6  Distance                  .077    .005   .241
These account for 67.5% of the variance (an R > .80).
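The 67.5% figure can be checked from the R2 column. The sketch below assumes each R2 entry is that factor's incremental contribution to explained variance; the source does not say so explicitly, but the entries do sum to the reported total.

```python
import math

# R2 entries as listed in the table, read as incremental contributions (an assumption).
r2_contributions = {
    "Cobb-Douglas": 0.446,
    "Income": 0.119,
    "Education": 0.050,
    "Percent in school": 0.030,
    "Density of population": 0.025,
    "Distance": 0.005,
}

total_r2 = sum(r2_contributions.values())
multiple_r = math.sqrt(total_r2)
demographic_share = total_r2 - r2_contributions["Cobb-Douglas"]

print(f"total R2 = {total_r2:.3f}, R = {multiple_r:.3f}, "
      f"demographic share = {demographic_share:.3f}")
```

On this reading, the Cobb-Douglas term contributes about two-thirds of the explained variance and the five demographic factors together contribute the rest.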
Optimization
Central Library
Branch Libraries
Division Between Central Library and Branches
Optimization for a Central Library
For a central library, the management decision is to maximize
X0 = a X1^{b} X2^{1-b}, subject to X1 C1 + X2 C2 = TR,
where TR is the budget available for the central library.
Using a Lagrangian multiplier, let
P = a X1^{b} X2^{1-b} - k (X1 C1 + X2 C2 - TR).
To maximize P, take partial derivatives:
dP/dX1 = a b X1^{b-1} X2^{1-b} - k C1 = 0
dP/dX2 = a (1 - b) X1^{b} X2^{-b} - k C2 = 0
dP/dk = -(X1 C1 + X2 C2 - TR) = 0
Taking the ratio of the first two equations,
C1/C2 = (b/(1 - b)) (X2/X1)
Combined with the budget constraint, this gives the following design equations:
C1 X1 = b TR
C2 X2 = (1 - b) TR
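A minimal numerical sketch of this result, reading X1 as service staff, X2 as collection size, and C1, C2 as their unit costs (an interpretation, since the slide does not define them). The optimum spends the fraction b of the budget on X1 and the fraction 1 - b on X2. All figures are hypothetical, and the grid search is only a sanity check on the closed form.

```python
def central_allocation(budget, c1, c2, b):
    """Return (X1, X2) with C1*X1 = b*TR and C2*X2 = (1 - b)*TR."""
    return b * budget / c1, (1.0 - b) * budget / c2

def output(x1, x2, a, b):
    """Cobb-Douglas output X0 = a * X1**b * X2**(1 - b)."""
    return a * x1 ** b * x2 ** (1.0 - b)

# Hypothetical parameters, budget and unit costs.
a, b = 6.4, 0.41
budget, c1, c2 = 1_000_000.0, 20_000.0, 10.0

x1, x2 = central_allocation(budget, c1, c2, b)
best = output(x1, x2, a, b)

def output_with_staff_share(share):
    """Output when the fraction `share` of the budget goes to X1."""
    return output(share * budget / c1, (1.0 - share) * budget / c2, a, b)

# No other split of the budget on a 1% grid does better.
grid_max = max(output_with_staff_share(s / 100) for s in range(1, 100))
print(x1, x2, best, grid_max)
```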
Optimization for Branch Libraries - 1
The management decisions where branches are involved, however,
are more complex than is implied in the Cobb-Douglas model taken
alone, since the effect of the inverse-distance law on utilization of a
library makes the number of branches crucial. Assuming that the
inverse distance law is applicable, take XB = (B/B0) X0.
That is, given the actual circulation, X0, for a given number of
branches, B0, the circulation for another number of branches, B,
would be proportional to B/B0.
However, a change in the number of branches would also change the
number of service staff needed, resulting in a different distribution
of resources between service staff and collection, and lead to changes
in circulation as represented in the Cobb-Douglas model.
Optimization for Branch Libraries - 2
Assume the following form for the Cobb-Douglas model:
XB = (B/B0) a (Bm)^{b} (X2)^{1-b}
where m is the minimum staffing required per branch.
We want to choose B and X2 so as to maximize XB, subject to
the boundary condition that the total resources available for
the branch library system are fixed:
BmC1 + X2C2 = TB
Using a Lagrangian multiplier, let
P = (B/B0) a (Bm)^{b} (X2)^{1-b} - y (Bm C1 + X2 C2 - TB)
= (B^{1+b}/B0) a m^{b} (X2)^{1-b} - y (Bm C1 + X2 C2 - TB)
Optimization for Branch Libraries - 3
Taking partial derivatives with respect to B, X2, and y:
dP/dB = (1 + b)(B^{b}/B0) a m^{b} (X2)^{1-b} - y (m C1) = 0
dP/dX2 = (1 - b)(B^{1+b}/B0) a m^{b} (X2)^{-b} - y (C2) = 0
dP/dy = -(Bm C1 + X2 C2 - TB) = 0
Taking ratios of the first two equations,
mC1/C2 = ((1 + b)/(1 – b))(X2/B)
From the third equation,
X2C2 = (1 – b)TB/2, BmC1 = (1 + b)TB/2
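A minimal numerical sketch of the branch result. Staffing now takes the fraction (1 + b)/2 of the branch budget and the collection the fraction (1 - b)/2. All figures are hypothetical, and the grid search only checks the closed form. In practice the number of branches would be rounded to a whole number.

```python
def branch_allocation(tb, m, c1, c2, b):
    """Return (B, X2) with B*m*C1 = (1 + b)*TB/2 and X2*C2 = (1 - b)*TB/2."""
    n_branches = (1.0 + b) * tb / (2.0 * m * c1)
    collection = (1.0 - b) * tb / (2.0 * c2)
    return n_branches, collection

def branch_output(n_branches, collection, a, b, m, b0):
    """XB = (B/B0) * a * (B*m)**b * X2**(1 - b)."""
    return (n_branches / b0) * a * (n_branches * m) ** b * collection ** (1.0 - b)

# Hypothetical parameters, budget, unit costs and baseline number of branches.
a, b, m, b0 = 6.4, 0.41, 3.0, 10.0
tb, c1, c2 = 2_000_000.0, 20_000.0, 10.0

n_branches, collection = branch_allocation(tb, m, c1, c2, b)
best = branch_output(n_branches, collection, a, b, m, b0)

def output_with_staff_share(share):
    """Output when the fraction `share` of TB goes to branch staffing."""
    return branch_output(share * tb / (m * c1), (1.0 - share) * tb / c2, a, b, m, b0)

# No other split of the budget on a 0.1% grid does better.
grid_max = max(output_with_staff_share(s / 1000) for s in range(1, 1000))
print(n_branches, collection, best, grid_max)
```

Compared with the central library case, the staffing share rises from b to (1 + b)/2 because, under the assumed inverse-distance effect, adding branches raises circulation directly as well as through staffing.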
Application to Academic Library Data
For x1, (1) Number of Reader Services staff
(2) Number of faculty
For x2, (1) Size of library collection
(2) Index of library rank (from ARL)
(3) Index of library size (from ARL)
Statistics were acquired for the ARL libraries and
institutions for academic year 1973/74 and for the total
number of publications attributable to each institution
for the period 1971/1981. Data of a similar nature were
obtained for each of the past 17 years; for each of them,
the analyses are quite consistent in the overall patterns.
Ordinary least squares regression analyses are applied
to these data for the several ARL libraries. In addition
to these regression analyses, some effort was made to
identify major differences among the libraries.
Library Productivity
Measures of Production
Measures of Labor
Measures of Capital Investment
Results
Measures of Production - 1
The characterizing function of the academic research
library is support to research of faculty and students,
primarily doctoral students. However, this is a very
difficult function to measure. Statistics for
"circulation" were not reported in ARL statistics until
1995, and even so it remains a matter of debate
whether they are an adequate measure of research use.
On the one hand, the claim has been made that
“circulation is a reasonably reliable index of all use,
including the unrecorded, consultative, or browsing use
within the library”. On the other hand, other analyses
indicate that in-house use is significantly different.
Measures of Production - 2
The Faculty. If we regard the faculty as the primary
research users, might not the number of them be a measure
of the amount of research use made of the collection?
Underlying that view are a number of assumptions (e.g., the
average use of the collection by a faculty member is not a
function of the library as such and will be uniform from
institution to institution, even if not from faculty member to
faculty member).
Ph.D. Graduates. In the same vein, the Ph.D. students also
are heavy research users of a collection. With the same kinds
of assumptions that apply to faculty, might not the number
of Ph.D. graduates be a measure of the research use?
Measure of Labor - 1
The costs involved in technical services (basically, in selecting,
acquiring, processing, and cataloging of acquired materials)
are regarded as part of the capital investment.
Doing this requires that the staff involved in technical services
be estimated, since the published data do not clearly identify
it. The basis for doing so was identical with that used for
public libraries.
In the following tabulation, the FTE estimates are based on
acquisitions (ACQ), using a standard ratio of 1.5 for volumes
per title. (Serial titles are assumed to account for about
one-fourth of the volumes acquired, based on the apparent ratio of
serial volumes to total volumes and one volume per year per
serial.)
Measure of Labor - 2
current serials.
Measures of Capital - 3
A third alternative is the ARL Library Index. Through
factor analysis, the ARL data variables were reduced
from 22 down to 10. A weight is derived for each of the
variables on the factor to which they relate.
These weights are then applied to the variables for a
given library to derive the index value for that library.
The weights for the ten variables in that principal
factor analysis for 1980/81 were as follows (the
comparable values reported for 1979/80 are quite
similar):
Measures of Capital - 4
[Figure: plot of log(Coll/Srvst), vertical axis, against log(Circ/Srvst) - 4.5, horizontal axis]
[Figure: Scattergram for (Circ + Ref)/(Srvst); vertical axis log(Coll/Srvst)]
YEAR A B R
ARL7778 -0.452 0.882 0.820
ARL7879 -0.700 0.808 0.685
THE END