Professional Documents
Culture Documents
The Significant-Digit Phenomenon
The Significant-Digit Phenomenon
The Significant-Digit Phenomenon
TheodoreP. Hill
unlike the set of even numbers,say, which has density 1/2 amongthe integersand
density0 amongthe real numbers,and in generalthere are manywaysof assigning
a numberto the set "first significantdigit = d" which are consistentwith natural
density.The explanationsmentionedabove simplysingle out particularsummation
or integrationtechniquesthat yield the "correct"Benfordfrequencies.
The second shortcomingis that, terminologynotwithstanding,the past fre-
quency-assigningfunctions leading to (1) are notprobabilities, at least not in the
classical sense. The standardmathematicaldefinitionof probabilityis a [0,1]-val-
ued function P on a domain of sets (called a sigma algebra) closed under
complementsand countableunions, which assigns 1 to the whole set and assigns
measureEn=1P(An)to the set U n=1Anif the {An}are disjoint.But the methods
above necessarilyfail to satisfythese conditions,as will, for example, any reason-
able notion of densityon the naturalnumberswhich assignsdensity0 to singletons,
for then P(Nl)= LXP({n})= O+ 1. (This is exactly the same reason for the
foundationaldifficultyin makingrigoroussense of "pick an integer at random";
e.g., see De Finetti [4] pages 86, 98-99). For the integer-basedmodels of Benford's
Law, this difficulty seems insurmountable,and for the above-mentionedreal-
numbermodels either a precise domainfor the probabilityin (1) was not specified
by Newcombet al., or when specifiedwas simplynot the appropriatecollection S.
THE GENERALSIGNIFICANT-DIGIT
LAW
GeneralSignificant-Digit
Law [8]. For all k E N, all d1 E {1,2,. . .,9} and all
j = 2,...,k,
dj s{0,1,2,...,9},
' k - ' k -1-
P n {Di = di} = log101 + E di *lOk-i . (2)
i i=l i i=l
Example.
S = {1 < D1 < 5} = {1 < D(100) < S} U {10 < D(100) < 50}
which says that graphically(as a subset of [1, b)), the same set S is
[ if b= 10
1 ba b
and is
[ F F ) ) if b= 100,
1 ba/2 bl/2 b(l+a)/2 b
Defilnition.[8] P is base-invariant
on v if
n-1
k =O
ACKN057VLEDGMENT.The author is grateful to Professors Bob Foley and Ron Fox for several useful
suggestions and Goran Hognas for pointing out the self-similarity of A.
1. Becker,P. (1982) Patternsin listingsof failure-rateand MTTFvalues and listingsof other data.
IEEE Transactions on Reliability,R-31, 132-134.
2. Benford,F. (1938)The law of anomalousnumbers.Proc. Amer. Phil. Soc. 78, 551-72.
3. Chernoff,H. (1981).How to beat the MassachusettsNumbersGame. Math. Intel.3, 166-172.
4. De Finetti,B. (1972) Probability,Inductionand Statistics.Wiley,New York.
5. Diaconis,P. and Freedman,D. (1979) On roundingpercentages.J. Amer. Stat. Assoc., 359-64.
6. Hamming,R. (1976)On the distributionof numbers.Bell Syst. Tech.J. 49, 1609-25.
7. Hill, T. (1988)Random-number guessingand the firstdigitphenomenon.PsychologicalReports62,
967-71.
8. Hill, T. (1995)Base-invarianceimpliesBenford'sLaw, Proc. Amer. Math. Soc. 123, 887-895.
9. Newcomb,S. (1881) Note on the frequencyof use of the different digits in naturalnumbers.
Amer. J. Math 4, 39-40.
10. Nigrini,M. (1992) The detection of income evasion throughan analysisof digital distributions.
Ph.D. Thesis,Departmentof Accounting,Universityof Cincinnati.
11. Raimi,R. (1976)The first digit problem.Amer. Math. Monthly83, 521-38.
12. Schatte, P. (1988) On mantissadistributionsin computingand Benford'sLaw. J. Inf. Process.
Cybern.EIK 24, 443-455.
13. Varian,H. (1972)Benford'sLaw. Amer. Statistician26, 65-66.
Schoolof Mathematics
GeorgiaInstituteof Technology
Atlanta, GA 30332
hill@math.gatech.edu