Professional Documents
Culture Documents
Benford'S Law and Arithmetic Sequences: January 2015
Benford'S Law and Arithmetic Sequences: January 2015
Benford'S Law and Arithmetic Sequences: January 2015
net/publication/312590688
CITATIONS READS
3 226
1 author:
Zoran Jasak
6 PUBLICATIONS 12 CITATIONS
SEE PROFILE
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Zoran Jasak on 22 January 2017.
ZORAN JASAK
Abstract
Benford’s law gives expected patterns of the digits in numerical data. It can be
used as a tool to detect outliers, for example, as a test for the authenticity and
reliability of transaction level accounting data. Based on Benford’s law tests for
first two digits, first three digits, last digits, last digit, last two digits have been
derived as an additional analytical tool. Benford’s law is known as a ‘first digit
law’, ‘digit analysis’ or ‘Benford-Newcomb phenomenon’. Leading first digits we
can treat as division of interval [1; 10 ) in 9 intervals; leading first two digits we
can treat as division of interval [10; 100 ) in 90 intervals. In this text, we elaborate
case when intervals are divided in arbitrarily chosen number n >1 of
subintervals. For such case analytical form for expectation and variance are
developed. Special interest is in case when n → +∞.
1. Introduction
these tables were actually dirtier than the last. From this, he concluded
that people are more likely to use numbers starting with smaller digits
than with larger ones.
1
pc = P [ D1 = c ] = log1 + . (1.1)
c
Here pc is probability for c to be the leading digit from the set A = {1, 2,
3, 4, 5, 6, 7, 8, 9}. Generally looking, set A is an arithmetic sequence
with starting member a = 1, difference d = 1 and n = 9 members.
Significands1 x having leading digit a are from interval [a; a + d ) what
we can write in general form
a + (k − 1) ⋅ d ≤ x < a + k ⋅ d, k = 1, … , 9,
d
pk = P [ a + (k − 1) ⋅ d ≤ x < a + k ⋅ d ] = log 1 + ,
(1.2)
a + (k − 1) ⋅ d
Motive for this text is to give answers on some question which arise:
is it possible to use arithmetic sequence with more than 9 members on
interval [1; 10 ) ? Is Benford’s law applicable in such situation ? What
happens if n → ∞ ?
a + kd a + (k + 1) d
> ⇔ (a + k d )2 > (a + k d )2 − d 2 ,
a + (k − 1) d a + kd
∑p
k =1
k = 1.
n n n
∑
k =1
pk = ∑
k =1
log1 +
d =
a + (k − 1) d
k =1
∑log
a +k ⋅d
a + (k − 1) d
n n
∑
k =1
pk = log ∏
k =1
a +k ⋅d
a + (k − 1) d
= log
a +n⋅d
a
=1
a +n⋅d 9a
⇒ = 10 ⇒ d = . (1.3)
a n
4 ZORAN JASAK
a1 = a, ak = a + (k − 1) ⋅ d, k > 1.
a +n⋅d ( B − 1) ⋅ a
⇒ = B ⇒ d = . (1.4)
a n
9a n 9
pk = log 1 + = log 1 +
.
(1.5)
a + (k − 1) ⋅ (9a n ) n + (k − 1) ⋅ 9
∪
m
[ ak ⋅ B m ; ak 1 ⋅ B m ) = ∪ [ ak ⋅ B m ; ( ak + d ) ⋅ B m ).
∈Z
+
m ∈Z
(1.6)
d 9
P [ x ∈ [ ak ; ak + d ) ] = log 1 + = log 1 +
.
a + (k − 1) d n + 9 ⋅ (k − 1)
BENFORD’S LAW AND ARITHMETIC SEQUENCES 5
Let make some choice of a, for example, a = 3. In this case, right end of
interval [a; B ⋅ a ) is 10 ⋅ 3 = 30; leading digits of numbers in this
interval are 3, 4, 5, 6, 7, 8, 9, 1, 2. In the another words, we again work
interval [1; B ) for first digits. In case when a = 10 we work with interval
9⋅a
ak = a + (k − 1) ⋅ d, d = ,
n
9a a n
d = ⇒ =
n d 9
9a a ⋅ (n − 9 )
⇒ a−d = a− =
n n
9a
⇒ a +n⋅d = a +n⋅ = 10a. (2.1)
n
n
E (D ) = ∑=1 ( a + (k − 1) d ) ⋅ log 1 + a + (k d− 1) ⋅ d
k
n
E (D ) = ∑ ( a − d + k ⋅ d ) ⋅ log 1 + a + (k d− 1) ⋅ d
k =1
6 ZORAN JASAK
n n
E (D ) = ∑
k =1
(a − d ) ⋅ log 1 +
d +d⋅
a + (k − 1) ⋅ d ∑ k ⋅ log 1 + a + (k d− 1) ⋅ d .
k =1
n n
E0 = ∑
k =1
(a − d ) ⋅ log 1 +
d = (a − d )
a + (k − 1) ⋅ d ∑ ⋅ log 1 + a + (k d− 1) ⋅ d
k =1
(2.1) n − 9
E0 = a − d = ⋅ a.
n
n n k
E1 =
k =1
∑
k ⋅ log 1 +
a + (k
d
− 1 ) ⋅ d
= log
∏ k =1
a +k ⋅d
a + (k − 1) ⋅ d
n
(a + n ⋅ d )n
E1 = log
n
= n ⋅ log (a + nd ) − log ∏ ( a + (k − 1) ⋅ d )
∏
k =1
( a + (k − 1) ⋅ d ) k =1
n
n
⇒ E1 = n ⋅ log (10a ) − log d ⋅
k =1
d ∏
a + (k − 1)
n
E1 = n + n log a − n log d − log ∏ da + (k − 1)
k =1
n
E1 = n + n log
a
d
− log ∏ da + (k − 1) .
k =1
(2.2)
x (n ) = x ⋅ ( x + 1 ) ⋅ … ⋅ ( x + n − 1) .
Γ (x + n )
x (n ) = .
Γ(x )
BENFORD’S LAW AND ARITHMETIC SEQUENCES 7
Since
a n 10n
x = = ⇒ x+n = ,
d 9 9
we have
10n
Γ
n 9
E1 = n + n log − log , (2.3)
9 n
Γ
9
10n
Γ
n−9 9a n 9
⇒ E (D ) = ⋅a + ⋅ n + n log − log
n n 9 n
Γ
9
n−9 9a
⇔ E (D ) = ⋅a+ ⋅ E1, (2.4)
n n
9a a ⋅ (n + 9(E1 − 1))
E (D ) = a + ⋅ (E1 − 1) = . (2.5)
n n
10n
Γ
(n − 9 )a n 9a 9
E (D ) = + 9a + 9a ⋅ log − ⋅ log
n 9 n n
Γ
9
10n
Γ
(10n − 9 )a n 9a 9
E (D ) = + 9a ⋅ log − ⋅ log
n 9 n n
Γ
9
10n
Γ
9a 10n − 9 n 9
E (D ) = ⋅ + n log − log . (2.6)
n 9 9 n
Γ
9
10n
Γ
9a 10n − 9 n 1 9 ,
E (D ) = ⋅ + n log − ⋅ ln (2.7)
n 9 9 ln 10 n
Γ
9
10n
Γ
9a 10n − 9 n 9 .
E (D ) = ⋅ + n log − log10 e ⋅ ln (2.8)
n 9 9 n
Γ
9
B⋅n
Γ
n B − 1
E1 = n + n log − log , (2.3.a)
B −1 n
Γ
B − 1
n − ( B − 1) ( B − 1) ⋅ a
E (D ) = ⋅a+ ⋅ E1, (2.4.a)
n n
B ⋅ n
Γ
(B − 1)a B ⋅ n − (B − 1) n B − 1,
E (D ) = ⋅ + n log B − log B
n B −1 B −1 n
Γ
B − 1
(2.6.a)
BENFORD’S LAW AND ARITHMETIC SEQUENCES 9
B ⋅ n
Γ
(B − 1)a B ⋅ n − (B − 1) n 1 B − 1
E (D ) = ⋅ + n log B − ⋅ ln ,
n B −1 B − 1 ln B n
Γ
B − 1
(2.7.a)
B ⋅ n
Γ
(B − 1)a B ⋅ n − (B − 1) n B − 1
E (D ) = ⋅ + n log B − log B e ⋅ ln .
n B −1 B −1 n
Γ
B − 1
(2.8.a)
n
E( D ) = 2
∑
k
(a + (k − 1) d )
=1
2
⋅ log 1 +
a + (k
d
− 1 ) d
.
First, we have
(a + (k − 1) d )2 = ((a − d ) + k d )2 = (a − d )2 + 2d(a − d )k + (k d )2 .
n
E( D2 ) = ∑ ((a − d )2 + 2d(a − d )k + (k d )2 ) ⋅ log 1 + a + (kd− 1) d ⇒
=
k 1
⇒ E ( D 2 ) = E 21 + E 22 + E 23,
where
n
E 21 = ∑1 (a − d)2 ⋅ log 1 + a + (kd− 1) d ,
k=
n
E 22 = ∑1 2d(a − d)k ⋅ log 1 + a + (kd− 1) d ,
k=
10 ZORAN JASAK
and
n
E 23 = ∑=
k 1
(k d )2 ⋅ log 1 +
a + (k
d
− 1 ) d
.
n
a ⋅ (n − 9 ) 2
E 21 = (a − d )2 ⋅ ∑=
k 1
log 1 +
d = (a − d )2 =
a + (k − 1) d
n
.
n
E 22 = 2d(a − d ) ⋅ ∑=
k 1
k ⋅ log 1 +
d = 2d(a − d ) ⋅ E1
a + (k − 1) d
9a (n − 9 ) a 18a 2 (n − 9 )
⇒ E 22 = 2 ⋅ ⋅ ⋅ E1 = ⋅ E1.
n n n2
n
E 23 = ∑ (k d)
k =1
2
⋅ log 1 +
d
a + (k − 1) d
n
E 23 = d 2 ⋅ ∑k
k =1
2
⋅ log 1 +
d
a + (k − 1) ⋅ d
n
81a 2
=
n 2 ∑ k
k =1
2
⋅ log 1 +
d
a + (k − 1) ⋅ d
n k2
81a 2
E 23 =
n2
log
k =1
∏ a +k ⋅d
a + (k − 1) ⋅ d
a + d 12 22 n2
81a 2 a + 2d a + n ⋅ d
E 23 = ⋅ log ⋅ ⋅…⋅
n2 a a+d a + (n − 1) d
BENFORD’S LAW AND ARITHMETIC SEQUENCES 11
2
81a 2 1 1 (a + n ⋅ d )n
E 23 = ⋅ log 2 ⋅ ⋅ … ⋅
n2
2 2 2 2
a1 (a + d )2 −1 (a + (n − 1) d )n − (n −1)
1 n2
81a 2 1 (a + n ⋅ d )
E 23 = ⋅ log ⋅ ⋅…⋅
n2 a 2⋅1 −1 (a + d )2⋅2 −1 (a + (n − 1) d )2⋅n −1
81a 2 a a+d a + (n − 1) d 2
E 23 = ⋅ log ⋅ ⋅…⋅ ⋅ (a + n ⋅ d )n
2 2⋅1 2⋅2 2⋅n
n a (a + d ) (a + (n − 1) d )
n
81a 2
∏
(a + (k − 1) d )
2
E 23 = ⋅ log k =1 .(a + n ⋅ d )n
n2 n
2
k =1
( a + ∏
(k − 1 ) d )k
n n
81a 2
E 23 =
n2
⋅ log
∏
k =1
(a + (k − 1) d ) + n 2 log (a + nd ) − 2 log
∏
k =1
(a + (k − 1) d )k .
81a 2
E 23 = ⋅ [E 231 + E 232 − 2 ⋅ E 233] , (2.9)
n2
where
n
E 231 = log ∏ (a + (k − 1) d) ,
k =1
E 232 = n 2 log(a + nd ) ,
and
n
E 233 = log
k =1
∏(a + (k − 1) d )k .
12 ZORAN JASAK
n n
E 231 = log
∏
k =1
(a + (k − 1) d ) = log d
n
∏
a + (k − 1)
k =1 d
10n 10n
Γ Γ
9 n 9
E 231 = n log d + log = n log a − n log + log
n 9 n
Γ Γ
9
9
n n m
E 233 = log
∏
k =1
(a + (k − 1) d )k = log
∏∏
m =1 k =1
(a + n ⋅ d − k ⋅ d ) , (2.10)
n n m
E 233 = log
∏
k =1
(a + (k − 1) d )k =
∑ ∑ log (a + n ⋅ d − k ⋅ d ).
m =1 k =1
(2.11)
n n m
E 233 = log
∏
k =1
k
(a + (k − 1) d ) = log
∏∏10a − k ⋅ 9a
m =1 k =1
n
n m
E 233 = ∑∑ log 10a − k ⋅
m =1 =1
k
9a
n
.
n m
81a 2 9a
E 23 =
2
n
n log (10a ) − E1 + n 2 log (10a ) − 2 ∑∑
m =1 k =1
log 10a − k ⋅
.
n
BENFORD’S LAW AND ARITHMETIC SEQUENCES 13
a 2 ⋅ (n − 9 )2 18a 2 (n − 9 )
E( D2 ) = + ⋅ E1
2
n n2
n m
81a 2 9a
+
n2
⋅ n(n + 1) log (10a ) − E1 − 2 ⋅
m =1 k =1
∑∑
log 10a − k ⋅
n
a 2 ⋅ (n − 9 )2 18a 2 (n − 9 ) 81a 2
E( D2 ) = + ⋅ E1 − ⋅ E1
n2 n2 n2
n m
81a 2 9a
+
n2
⋅ n(n + 1) log (10a ) − 2 ⋅
∑∑
m =1 k =1
log 10a − k ⋅
.
n
a 2 ⋅ (n − 9 )2 18a 2 (n − 9 ) 81a 2
⇒ P1 = + ⋅ E1 − ⋅ E1
n2 n2 n2
P1 =
a2
2
[(n − 9)2 + 18(n − 9)E1 − 81E1]
n
P1 =
a2
2
[(n − 9 + 9E1)2 − 81E1(E1 + 1)]
n
a2 2
P1 = (n − 9 + 9E1)2 − 81a2 E1 (E1 + 1)
n2 n
81a 2
P1 = (E (D ))2 − E1 (E1 + 1) .
n2
81a 2
E ( D 2 ) = (E ( D ))2 − E1 (E1 + 1)
n2
n m
81a 2 9a
+
n2
⋅ n(n + 1) log (10a ) − 2 ⋅
m =1k =1
∑∑
log 10a − k ⋅
n
14 ZORAN JASAK
E ( D 2 ) = (E ( D ))2
n m
81a 2 9a
+
n2
⋅ n(n + 1) log (10a ) − E1 (E1 + 1) − 2 ⋅
∑∑
m =1k =1
log 10a − k ⋅
.
n
Var (D ) = E ( D 2 ) − (E (D ))2 ⇒
n m
81a 2 9a
Var ( D ) =
n2
⋅ n(n + 1) log (10a ) − E1 (E1 + 1) − 2 ⋅
∑∑
m =1 k =1
log 10a − k ⋅
.
n
3. Conclusion
Calculation of this type can be conducted for second, third, ... digits
and groups of digits too.
References
[1] Simon Newcomb, Note on the frequency of use of different digits in natural numbers,
American Journal of Mathematics 4(1/4) (1881), 39-40.
[2] Frank A. Benford, The law of anomalous numbers, Proceedings of the American
Philosophical Society 78(4) (1938), 551-572.
16 ZORAN JASAK
[3] Arno Berger and Theodore P. Hill, A basic theory of Benford’s law, Probability
Surveys 8 (2011), 1-126; ISSN: 1549-5787, DOI: 10.1214/11-PS175.
[6] Ralph A. Raimi, The first digit problem, The American Monthly 83(7) (1976),
521-538.