Normal and Stable Approximation To Subgraph Counts in Superpositions of Bernoulli Random Graphs

Normal and stable approximation to subgraph counts
in superpositions of Bernoulli random graphs
Mindaugas Bloznelis1 , and Joona Karjalainen2 , and Lasse Leskelä2

1 Institute of Informatics, Vilnius University
Naugarduko 24, LT-03225 Vilnius, Lithuania

2 Department of Mathematics and Systems Analysis
School of Science, Aalto University

Otakaari 1, FI-02015 Espoo, Finland
arXiv:2107.02683v1 [math.PR] 6 Jul 2021
Abstract
The clustering property of a complex network signals about the abundance of small dense
subgraphs in otherwise sparse network. We establish the normal and stable approximation to
the number of small cliques, cycles and more general 2-connected subgraphs in the network
model defined by a superposition of Bernoulli random graphs that admits non-vanishing
global clustering coefficient and power law degrees.
1 Introduction
Mathematical modeling of complex networks aims at explaining and reproduction of charac-
teristic properties of large real world networks. We mention the power law degree distribution
and clustering to name a few. By clustering property we mean the tendency of nodes to clus-
ter together by forming relatively small groups with a high density of ties within a group.
Locally, in a vicinity of a vertex, clustering can be measured by the local clustering coefficient,
the probability that two randomly selected neighbors of the vertex are adjacent. Globally,
the fraction of wedges (paths of length 2) that induce triangles defines the global clustering
coefficient, which represents the probability that endpoints of a randomly selected wedge are
adjacent. Non-vanishing clustering coefficients signal about the abundance of triangles and
other small dense subgraphs in otherwise sparse networks. An interesting problem is about
tracing the relation between the clustering characteristics and the numbers of small cliques
in sparse complex networks. By sparsity we mean the scaling property, where the degree of
a randomly selected node remains (stochastically) bounded as the number of nodes of the
network increases.
In the present study we establish the normal and α-stable approximation to the distri-
bution of the number of k-cliques, k-cycles and more general 2-connected subgraphs in a
community-affiliation network model defined by superpositions of Bernoulli random graphs
[3], [22], [23].
To the best of our knowledge this is the first systematic study of an α-stable approxima-
tion to subgraph counts in a theoretical model of sparse affiliation network. It is relevant to
1
mention that in the network model considered both basic properties of complex networks,
namely, the clustering property and the power law degree distribution are essential for an
α-stable limit to show up.
Network model. Let (X, Q) be a random vector with values in {0, 1, 2, . . . } × [0, 1] and
let
G(x, p) : x ∈ {1, 2 . . . }, p ∈ [0, 1]
be a family Bernoulli random graphs independent of (X, Q). We set [x] = {1, 2, . . . , x} to
be the vertex set of G(x, p). For notational convenience we introduce the empty graph G∅
having no vertices and set G(0, p) = G∅ for any p ∈ [0, 1]. We define the mixture of Bernoulli
random graphs G(X, Q) in a natural way.
Let (X1 , Q1 ), (X2 , Q2 ), . . . be a sequence of independent copies of (X, Q). Given integers
m, n ≥ 1, let Vn,i = Vn,i (Xi ), 1 ≤ i ≤ m, be independent random subsets of [n] defined
as follows. For Xi ≤ n we select Vn,i uniformly at random from the class of subsets of [n]
of size Xi . For Xi > n we set Vn,i = [n]. We denote X̃i = |Vn,i | = Xi ∧ n. Let Gn,i ,
1 ≤ i ≤ m, be independent random graphs with vertex sets Vn,i defined as follows. We
obtain Gn,i by an one-to-one mapping of vertices of G(X̃i , Qi ) to the elements of Vn,i and
retaining the adjacency relations of G(X̃i , Qi ). We denote by En,i the edge set of Gn,i .
Finally, let G[n,m] = (V, E) be the the random graph with the vertex set V = [n] and edge
set E = En,1 ∪ · · · ∪ En,m . Therefore G[n,m] is the superposition of Gn,1 , . . . , Gn,m . We call
the contributing random graphs Gn,i layers or communities.
The random graph G[n,m] represents a null model of the community-affiliation network
studied in [22], [23]. In the range m = Θ(n) as m, n → +∞ the random graph G[n,m] depicts
basic features of real complex networks: it admits a power law degree distribution with
tunable power law exponent, nonvanishing global clustering coefficient and tunable clustering
spectrum [3]. The subsequent paper [4] establishes bidegree distribution and correlations of
degrees of adjacent vertices. The present paper continues study of the random graph G[n,m]
and focuses on the asymptotic distributions of (dense) subgraph counts.
Normal and stable approximation. Let F = (VF , EF ) be a graph with vertex set VF
and edge set EF . We denote vF = |VF | and eF = |EF |. We assume in what follows that F
is 2-connected. That is, F is connected and, moreover, it stays connected even if we remove
any of its vertices. We call F balanced if eF /vF = max{eH /vH : H ⊂ F with eH ≥ 1}. For
example the cycle Ck and clique Kk (k stands for the number of vertices) are 2-connected
and balanced. We will (often) use letter k for the number of vertices of F in what follows.
Let NF be the number of copies of F in G[n,m] and let NF be the number of copies of F
in G(X, Q). Denote σF2 = VarNF the variance of NF . We write σF2 < ∞ if the variance
is finite and σF2 = ∞ otherwise. We introduce a shorthand notation for the conditional
expectation NF∗ := E(NF |X, Q). A simple calculation shows that NC∗k = (X)k Qk /(2k) and
k
N ∗ = (X) Q(2) /k!. Here (x) = x(x − 1) · · · (x − k + 1) denotes the falling factorial.
Kk k k
The following result establishes the asymptotic normality of NF for F = Ck and F = Kk .
Theorem 1. Let k ≥ 3. Let ν > 0. Let n, m → +∞. Assume that m/n → ν. Assume that
F = Ck or F = Kk . Suppose that EX < ∞ and
0 < σF2 < ∞. (1)
For F = Ck we assume that

E X 1+r(1− 2k ) Qr < ∞,
1
1 ≤ r ≤ k − 1. (2)
For F = Kk we assume that

r̂

E X r− k(k−1) Qr̂ < ∞, 2 ≤ r ≤ k, (3)
where we denote r̂ := r−1

2 √ + 1.
Then (NF − ENF )/(σF m) is asymptotically standard normal.
2
Remark 1. For a balanced graph F the finite variance condition σF2 < ∞ is equivalent to
the second moment condition E(NF∗ )2 < ∞. In particular, we have
k
2
σC2k < ∞ ⇔ E(X k Qk )2 < ∞ and 2
σK k
< ∞ ⇔ E X k Q(2) < ∞. (4)
Let us briefly explain the result and conditions of Theorem 1. Let NF,i be the number of
copies of F in G(Xi , Qi ) and put SF = NF,1 + · · · + NF,m . The moment condition EX < ∞
and the assumption m ≈ νn imply that different layers Gn,i either do not intersect (which
is most likely) or they intersect in a single vertex. Intersections in two or more vertices are
rather unlikely. Consequently, the principal contribution to the subgraph count NF comes
from the subgraph counts NF,i of individual layers (recall that F is 2-connected). Therefore
we have NF ≈ SF . To make this approximation rigorous we introduce moment conditions
(2), (3). Finally, the asymptotic normality of NF follows from the asymptotic normality of
SF . The latter is guarateed by the second moment condition (1).
In the case where the random variable NF∗ has infinite second moment, we can obtain
α-stable limiting distribution for the subgraph counts NF . In Theorem 2 below we assume
that for some a > 0 and 0 < α < 2 we have
P{NF∗ > t} = (a + o(1))t−α as t → +∞. (5)

∗
Let NF,i = E(NF |Xi , Qi ), 1 ≤ i ≤ m, be iid copies of NF∗ and put SF∗ = NF,1
∗ ∗
+· · ·+NF,m . It
−1/α ∗
is well known (Theorem 2 in § 35 of [9]) that the distribution of m (SF − Bm ) converges
to a stable distribution, say Gα,a , which is defined by a and α. Here Bm = mENF∗ = ENF
for 1 < α < 2 and Bm ≡ 0 for 0 < α < 1. For α = 1 we have Bm = c?α,a ln m, where the
constant c?α,a > 0 depends on a and α.
Theorem 2. Let k ≥ 3. Let a, ν > 0 and 0 < α < 2. Let n, m → +∞. Assume that
m/n → ν. Assume that EX < ∞. Assume that F = Ck or F = Kk . Suppose that (5) holds.
For F = Ck we assume that

E X 1+r(1− αk ) Qr < ∞,
1
1 ≤ r ≤ k − 1. (6)
For F = Kk we assume that

2

E X r−r̂ αk(k−1) Qr̂ < ∞, 2 ≤ r ≤ k, (7)
where r̂ = r−1

2 + 1.
−1/α
Then m (NF − Bm ) converges in distribution to Gα,a .
The result of Theorem 2 is obtained using the same approximations NF ≈ SF as above.
In addition, we use the fact that for a balanced subgraph F condition (5) implies SF ≈ SF∗ ,
see Lemma 5 below. The α-stable limit is now guaranteed by condition (5), see Theorem 2
in § 35 of [9].
It is relevant to mention that the moment condition EX < ∞ together with the as-
sumption m ≈ νn imply the existence of an asymptotic degree distribution of G[n,m] as
n, m → +∞. A power law asymptotic degree distribution is obtained if we choose an appro-
priate distribution for the layer size X. Furthermore, under additional moment condition
EX 3 Q2 < ∞ the random graph G[n,m] has a non-vanishing global clustering coefficient, see
[3]. Therefore, Theorems 1, 2 establish the subgraph count asymptotic in a highly clustered
complex network.
We remark that moment conditions (3), (7) of Theorems 1, 2 can be improved. Further-
more, the results of Theorems 1 and 2 extend to more general 2-connected subgraphs. Our
general and more detailed results are presented in Section 2.
Finally, we mention an important question about the relation between the community
size X and strength Q. In Theorems 1, 2 no assumption has been made about the stochastic
3
dependence between the marginals X and Q of the bivariate random vector (X, Q) defining
the random graph G[n,m] . To simplify the model we can assume for example that X and Q are
independent, see Corollary 1 below. However for network modelling purposes various types of
dependency between X and Q are of interest. For example, a negative correlation between X
and Q would enhance the fractions of small strong communities and large weak communities,
a pattern likely to occur in real networks with overlapping communities. Assuming that Q
is proportional to a negative power of X (for example, Q = min{1, bX −β } for some β ≥ 0
and b > 0) one obtains a mathematically tractable network model admitting tunable power
law degree and bidegree distributions and rich clustering spectrum [3], [4]. A model based
approach to the community detection problem in this setup has been introduced in [22], [23].
Related work. The abundance of dense subgraphs in an otherwise sparse network is
related to the clustering property of the network. Moreover, the global and local clustering
coefficients of a network are expressed in terms of counts of triangles and wedges. Therefore,
a rigorous asymptotic analysis of clustering coefficients in large random networks reduces
to that of the triangle counts and wedge counts. In particular, the bivariate asymptotic
normality for triangle and wedge counts in a related sparse random intersection graph was
shown in [2], while [1] established respective α-stable limits. Another line of related research
pursued in [10], [15] addresses the concentration of subgraph counts in G[n,m] .
The rest of the paper is organized as follows: In Sections 2 we formulate and prove our
general results (Theorems 3, 4), derive Theorems 1 and 2, and prove Remark 1. In Section
3 we collect technical lemmas. Among them Lemmas 2, 3 and inequality (19) may be of
independent interest.
2 General results and proofs

We start the section with some notation. Then we outline the proof. Afterwards we formulate
and prove our results. Here we also prove Theorems 1, 2 and Remark 1. Auxiliary lemmas
used in the proofs are collected in Section 3.
Notation Denote for short X = (X1 , . . . , Xm ) and Q = (Q1 , . . . , Qm ). By E∗ (·) =
E(·|X, Q) and P∗ (·) = P(·|X, Q) we denote the conditional expectation and probability
given (X, Q). Let aF denote the number of copies of F in KvF . For example aKk = 1 and
aCk = (k − 1)!/2. Given F , for any positive sequences {an } and {bn } we denote an bn
(respectively an ≺ bn ) whenever for sufficiently large n we have c1 ≤ an /bn ≤ c2 (respectively
an ≤ c2 bn ) where constants 0 < c1 < c2 may only depend on vF .
Recall that NF and NF,i denote the number of copies of F in G(X, Q) and G(Xi , Qi )
respectively. Furthermore,

∗ X
NF = E(NF |X, Q) = QeF aF ,
vF
∗
NF,i ∗
= E(NF,i |Xi , Qi ), and SF = NF,1 + · · · + NF,m , SF∗ = NF,1 ∗
+ · · · + NF,m . Note that
∗ ∗ ∗ ∗
NF,i = E (NF,i ) and SF = E (SF ). Finally, let Ñi be the number of copies of F in Gn,i and
let S̃ = Ñ1 + · · · + Ñm .
We identify the indices 1 ≤ i ≤ m with colors and we color (the edges of) each Gn,i by
color i. The colored graph is denoted G?n,i . The union of colored graphs G?n,1 ∪ · · · ∪ G?n,m
defines a multigraph, denoted G?[n,m] , that admits parallel edges of different colors. Each
edge u ∼ v of G[n,m] is assigned the set of colors that correspond to parallel edges of G?[n,i]
connecting u and v.
A subgraph H ⊂ G[n,m] is called monochromatic if it is a subgraph of some Gn,i and
none of edges of H is assigned more than one color. Otherwise H is called polychromatic.
Let NF,M and NF,P stand for the number of monochromatic and polychromatic copies of F
in G[n,m] . A subgraph H ? ⊂ G?[n,m] is called monochromatic if it is a subgraph of some Gn,i .
It is called polychromatic if it contains edges of different collor. Given H ? ⊂ G?[n,m] let H0?
4
be the graph obtained from H ? by merging parallel edges and deleting their colors. We call
H0? the projection of H ? . Let NF,P
?
be the number of polychromatic copies of F in G?[n,m]
(there can be several monochromatic and/or polychromatic cliques sharing the same vertex
set).
Proof outline. In the proof we approximate NF ≈ S̃F and S̃F ≈ SF . In the case where
ENF2 < ∞ we deduce the normal approximation to the sum SF (of iid random variables) by
the standard central limit theorem. In the case where NF has infinite variance we further
approximate SF ≈ SF∗ and deduce the α-stable approximation by a generalized central limit
theorem (see Theorem 2 in § 35 of [9]).
Approximation NF ≈ S̃F . The approximation follows from the simple observation that
? ?
NF = NF,M + NF,P , NF,M ≤ S̃F ≤ NF,M + NF,P , NF,P ≤ NF,P . (8)
We only comment on the second inequality. To see why it holds true let us inspect every
copy of F in G[n,m] that belongs to two or more layers Gn,i . Let F0 ⊂ G[n,m] be such a copy.
Clearly, the number of polychromatic subgraphs F ? in G?[n,m] whose projection F0? is F0 is
?
larger than the number of monochromatic ones. Hence S̃F ≤ NF,M + NF,P . From (8) we
conclude that
?
|S̃F − NF | ≤ NF,P . (9)
In order to assess the accuracy of the approximation NF ≈ S̃F we evaluate the expected
?
value of NF,P . Let K[n] be a clique on the vertex set V = [n]. We couple G[n,m] ⊂ K[n] and
fix a subgraph F0 = (V0 , E0 ) ⊂ K[n] , with vertex set {1, . . . , vF } ⊂ V , which is a copy of F .
We have, by symmetry,
? n
ENF,P = aF hF , (10)
vF
where hF is the expected number of polychromatic subgraphs F ? ⊂ G?[n,m] whose projection
F0? is F0 . Each F ? is defined by the partition of the edge set E0 into non-empty color classes,
say B1 ∪ · · · ∪ Br = E0 , and the vector of distinct collors (i1 , . . . , ir ) ⊂ [m]r such that all the
edges in Bj are of the color ij (edges of Bj belong to G?n,ij ). Denote by B̃ = (B1 , . . . , Br )
and ĩ = (i1 , . . . , ir ) the partition and its coloring. The polychromatic subgraph F ? defined
by the pair (B̃, ĩ) is denoted F (B̃, ĩ). The probability that such a subgraph is present in
G?[n,m] is
r
Y 1
b

h(B̃, ĩ) := P F (B̃, ĩ) ∈ G∗[n,m] =

E (X̃ij )vj Qijj . (11)
j=1
(n)vj
Here bj := |Bj | and vj is the number of distinct vertices incident to edges from Bj . We have
 
X X
hF = E  I ∗
 = h(B̃, ĩ). (12)
F (B̃,ĩ)∈G[n,m]
(B̃,ĩ) (B̃,ĩ)
Here the sum runs over all polychromatic F whose projection F0? is F0 . We upperbound
?
hF in Lemmas 1, 4 below.
Approximation S̃F ≈ SF . For 1 ≤ i ≤ m we couple G(X̃i , Qi ) ⊂ G(Xi , Qi ) and Ñi ≤ Ni
so that G(X̃i , Qi ) 6= G(Xi , Qi ) and Ñi 6= Ni whenever Xi > n. For m = O(n) the event
An := {max1≤i≤m Xi > n} has probability
m
X m
P{An } ≤ P{Xi > n} ≤ EX1 I{X1 >n} = o(1). (13)
i=1
n
Hence P{S̃F 6= SF } = o(1).

Results and proofs. Recall that vF = |VF | and eF = |EF | denote the number of vertices
and edges of a graph F = (VF , EF ). Furthermore, σF2 denotes the variance of the number
NF of copies of F in G(X, Q).
5
Theorem 3. Let F be a 2-connected graph with vF ≥ 3 vertices. Let ν > 0. Let n, m → +∞.
Assume that m/n → ν.
2 0.5−vF
√ EX < ∞ and 0 < σF < ∞. Suppose that hF = o(n
(i) Suppose that ). Then
(NF − ENF )/(σF m) is asymptotically standard normal.
(ii) Suppose that

1+s 1− 2e1
E X F Qs < ∞, 1 ≤ s ≤ vF − 1. (14)
Then hF = o(n0.5−vF ). Here eF stands for the number of edges of F .

Theorem 4. Let a > 0 and 0 < α < 2. Let ν > 0. Let F be a balanced (and) 2-connected
graph with vF ≥ 3 vertices. Let n, m → +∞. Assume that m/n → ν.
1
(i) Suppose that EX < ∞. Suppose that (5) holds. Suppose that hF = o(n α −vF ). Then
(NF − Bm )/m1/α converges in distribution to Gα,a . Recall that the scalar sequence {Bm }
and the α-stable distribution Gα,a have been introduced before Theorem 2.
(ii) Suppose that

1+s 1− α 1e
E X F Qs < ∞, 1 ≤ s ≤ vF − 1. (15)
1
Then hF = o(n α −vF ). Here eF stands for the number of edges of F .
In the case where the random variables X and Q are independent the finite variance
condition σF2 < ∞ reduces to the moment condition EX 2vF < ∞. Moreover, the latter
condition implies (14). Furthermore, for EX 2vF = ∞, the regularity condition
P{X > t} = (b + o(1))t−γ as t → +∞, (16)
for some b > 0 and 0 < γ < 2vF , implies (5) with a = b(aF /vF !)γ/vF EQγeF /vF and α = γ/vF .
Note that for γ satisfying the inequalities
s vF
1+s<γ+ , for 1 ≤ s ≤ vF − 1, (17)
γ eF
condition (16) implies (15). Therefore, for independent X and Q, Theorems 3 and 4 yield
the following corollary.
Corollary 1. Let F be a 2-connected graph with vF ≥ 3 vertices. Let ν > 0. Let n, m → +∞.
Assume that m/n → ν. Suppose that X and Q are independent and P{Q √ > 0} > 0.
(i) Suppose that 0 < EX 2vF < ∞. Then (NF − ENF )/(σF m) is asymptotically
standard normal.
(ii) Suppose that for some b > 0 and 0 < γ < 2vF conditions (16), (17) hold. Then
(NF − Bm )/m1/α converges in distribution to Gα,a . Here Bm and Gα,a are the same as in
Theorem 4 and a = b(aF /vF !)γ/vF EQγeF /vF and α = γ/vF .
Proof of Theorem
√ 3. Our condition hF = o(n0.5−vF ) together with (10) imply
√ the bound
?
NF,P = oP ( m). Furthermore, invoking (9) we obtain (NF − SF ) = oP ( m). Then an
√
application of (13) shows (NF − SF ) = oP ( m). Finally, we apply the classical central
limit theorem to the
√ sum of iid random variables SF to get the asymptotic normality of
(NF − ENF )/(σF m).
The fact that moment condition (14) is sufficient for the bound hF = o(n0.5−vF ) is shown
in Lemma 1 below.
Proof of Theorem 4. Proceeding as in the proof of Theorem 3 above we obtain (NF − SF ) =

oP (m1/α ). Furthermore, by Lemma 5, the iid random variables NF,1 , NF,2 , . . . obey the
power law (32) and therefore (SF −Bm )/m1/α converges in distribution to Gα,a (see Theorem
2 in § 35 of [9]). Hence m−1/α (NF − Bm ) converges in distribution to Gα,a .
1
The fact that moment condition (15) is sufficient for the bound hF = o(n α −vF ) is shown
in Lemma 1 below.
6
Proof of Theorems 1 and 2. For F = Ck the result of Theorem 1 (Theorem 2) is an immedi-
ate consequence of Theorem 3. (Theorem 4). For F = Kk the result of Theorem 1 (Theorem
2) follows from Theorem 3 (i) (Theorem 4 (i)) and the fact that condition (3) (condition (7))
1
implies the bound hF = o(n0.5−k ) (the bound hF = o(n α −k )), see Lemma 4.
Proof of Remark 1. We have σF2 = VarNF = VarNF∗ + E(∆∗F )2 , where ∆∗F := NF − NF∗ .
Therefore σF2 < ∞ ⇒ VarNF∗ < ∞ ⇒ E(NF∗ )2 < ∞. To prove that E(NF∗ )2 < ∞ ⇒
σF2 < ∞ it sufficies to show that E(∆∗F )2 < ∞. By Lemma 3.5 of [13], we have E∗ (∆∗F )2 ≺
(NF∗ )2 /ΦF (X, Q), where ΦF (X, Q) = minH⊂F X vH QeH and where E∗ stands for the con-
ditional expectation given X, Q. Furthermore, from the inequality (30), which holds for
balanced F , we obtain
(NF∗ )2
E∗ (∆∗F )2 ≺ = max{(NF∗ )2−2/vF , NF∗ } ≤ max{1, (NF∗ )2 }.
min{(NF∗ )2/vF , NF∗ }
Hence E(NF∗ )2 < ∞ implies E(∆∗F )2 = E(E∗ (∆∗F )2 ) < ∞.
3 Auxiliary lemmas
3.1 Upper bounds for hF .
In Lemmas 1, 4 we upperbound the moments hF for 2-connected F and for F = Kk respec-
tively. Clearly, the result of Lemma 1 applies to F = Kk as well, but the bound of Lemma
4 is tighter for large k.
Lemma 1. Let F be a 2-connected graph with vF ≥ 3 vertices. Let n, m → +∞. Assume
that m = O(n).
(i) Suppose that (14) holds. Then hF = o(n0.5−vF ).
1
(ii) Assume that 0 < α < 2. Suppose that (15) holds. Then hF = o(n α −vF ).
In the proof we use the simple fact that for any s, t, τ > 0 the moment condition
E(X s Qt ) < ∞ implies
E (min{X, n})s+τ Qt = o (nτ ) .

(18)
Denote X̃ := min{X, n}. To see why (18) holds chose 0 < δ < τ /(s + τ ) and split the
expectation

E X̃ s+τ Qt = E X̃ s+τ Qt I{X<nδ } + E X̃ s+τ Qt I{X≥nδ } =: I1 + I2 .

Inequality X̃ ≤ n and E(X s Qt ) < ∞ imply I1 ≤ nτ E X s Qt I{X≥nδ } = nτ · o(1). Inequality
X̃ ≤ X imply I1 ≤ nδ(s+τ ) = o(nτ ).
Proof of Lemma 1. The proofs of statements (i) and (ii) are identical. Therefore we only
prove statement (i).
We start with establishing an auxiliary inequality (19) below, which may be interesting
in itself. Given a partition B̃ = (B1 , . . . , Br ) of the edge set E0 of the graph F0 = (V0 , E0 )
and given i ∈ [r], let Vi be the set of vertices incident to the edges from Bi . Let ρi be the
number of (connected) components of the graph Zi = (Vi , Bi ) and put vi = |Vi |. We claim
that
v1 + · · · + vr ≥ vF + ρ1 + · · · + ρr . (19)
To establish the claim we consider the list H1 , H2 , . . . , Ht of components of Z1 , . . . , Zr
arranged in an arbitrary order. Here t := ρ1 + · · · + ρr . Therefore, each graph Hi is a
component of some Zj and their union H1 ∪ · · · ∪ Ht = Z1 ∪ · · · ∪ Zr = F0 . Let us consider
the sequence of graphs H̄j := H1 ∪ · · · ∪ Hj , for j = 1, . . . , t − 1. Let ρ̄j and v̄j denote
7
the number of components and the number of vertices of H̄j . Let vj0 denote the number of
vertices of Hj . We use the observation that
v̄j ≤ v̄j−1 + vj0 + ρ̄j − ρ̄j−1 − 1 for j = 2, . . . t − 1. (20)
Indeed, ρ̄j−1 = ρ̄j means that the vertex set of (connected graph) Hj intersects with exactly
one component of H̄j−1 . Consequently, Hj and H̄j−1 have at least one common vertex and
therefore (20) holds. Similarly, ρ̄j−1 − ρ̄j = y > 0 means that the vertex set of Hj intersects
with exactly y + 1 different components of H̄j−1 . Consequently, Hj and H̄j−1 have at least
y + 1 common vertices and (20) holds again. The remaining case ρ̄j−1 − ρ̄j = −1 is realized
by the configuration where the vertex sets of Hj and H̄j−1 have no common elements. In
this case (20) follows from the identity v̄j = v̄j−1 + vj0 .
By summing up inequalities (20) we obtain (using ρ̄1 = 1) that
v̄t−1 ≤ v10 + · · · + vt−1

0
+ ρ̄t−1 − t + 1.
Note that given H̄t−1 with ρ̄t−1 components, the vertex set of Ht must intersect with each
component in at least two points, in order to make the union H̄t−1 ∪ Ht = F0 2-connected.
Consequently we have
v̄t ≤ v̄t−1 + vt0 − 2ρ̄t−1 .
Finally, we obtain
vF = v̄t ≤ v10 + · · · + vt0 − ρ̄t−1 − t + 1.
Now the claim follows from the identity v10 + · · · + vt0 = v1 + · · · + vr and inequality ρ̄t−1 ≥ 1.
Let us prove statement (i). Given (B̃, ĩ), we obtain from (11), (19) that
r r
1 Y 1 Y
h(B̃, ĩ) ≤ E X̃ vj Qbj ≤ E X̃ vj Qbj .
nv1 +···+vr j=1
nvF +z1 +···+zr j=1
Given B̃ = (B1 , . . . , Br ) we estimate the sum over all possible collorings (there are (m)r of
them)

r r E X̃ vj Qbj
X (m)r Y Y
h(B̃, ĩ) ≺ vF +ρ1 +···+ρr E X̃ vj Qbj n−vF
n j=1 j=1
nρj −1
ĩ

Yr E X̃ vj Qbj
= n0.5−vF = o n0.5−vF .

n ρ j −1+(bj /(2e F ))
j=1
In the second last identity we use b1 + · · · + br = eF , while the last bound follows by the cain
of inequalities

n1−ρj E X̃ vj Qbj ≤ E X̃ vj +1−ρj Qbj ≤ E X̃ vj +1−ρj Qvj −ρj

= o n(vj −ρj )/(2eF ) = o nbj /(2eF ) .
Here in the first step we used X̃ ≤ n; in the second step we used Q ≤ 1 and bj ≥ vj − ρj
(the latter inequality is based on the observation that any graph with vj vertices and ρj
components has at least vj − ρj edges); the third step follows by (18) from the moment
condition (14) applied to s = vj − ρj ; the last step follows from the inequality bj ≥ vj − ρj .
Finally we conclude that
XX
h(B̃, ĩ) = o n0.5−vF ,

hF = (21)
B̃ ĩ
because the number of partitions B̃ of the edge set of a given graph F is always finite.
8
Before showing an upper bound for hF , F = Kk we introduce some notation. Given
integer b ≥ 1 let b? be the minimal number of vertices that a graph with b edges may have.
Let Hb be such a graph. It has a simple structure described below. Let kb ≥ 2 be the largest
integer satisfying b ≥ k2b . Then
kb
b= + ∆b
2
for some integer 0 ≤ ∆b ≤ kb − 1. For ∆b = 0 we have b? = kb and Hb = Kb? (clique on
b? = kb vertices). For ∆b > 0 graph Hb is a union of Kkb and star K1,∆b such that all the
vertices of the star, but the central vertex, belong to the vertex set of Kkb . In this case
b? = kb + 1. In other words, one obtains Hb from Kkb +1 by deleting kb − ∆b edges sharing a
common endpoint. The next two lemmas establish useful properties of the function b → b? .
Lemma 2. For integers s ≥ t ≥ 1 we have
s? + t? ≥ (s + t − 1)? + 2. (22)
Proof. In the proof we consider graphs Hs and Ht that have disjoint vertex sets so that the
union Hs ∪ Ht has s? + t? vertices.
Note that for t = 1 both sides of (22) are equal. In order to show (22) for s ≥ t ≥ 2 we
consider the chain of neighboring pairs
(s, t) → (s + 1, t − 1) → · · · → (s + t − 1, 1). (23)
In a step (x, y) → (x + 1, y − 1) we remove an edge from Hy and add it to Hx . A simple

analysis of the step (Hx , Hy ) → (Hx+1 , Hy−1 ) shows that
(x + 1)? + (y − 1)? = x? + y ? + 1 whenever ∆x = 0, ∆y 6= 1, (24)

? ? ? ?
(x + 1) + (y − 1) = x + y − 1 whenever ∆x 6= 0, ∆y = 1, (25)
? ? ? ?
(x + 1) + (y − 1) = x + y in the remaining cases. (26)
We call a step (x, y) → (x + 1, y − 1) positive (respectively negative or neutral) if (25)

(respectively (24) or (26)) holds. Therefore as we move in (23) from left to right every positive
(negative) step decreases (increases) the total number of vertices in the union Hx ∪ Hy .
Let us now traverse (23) from the right to left. We observe that the first non-neutral
step encoutered is positive (if we encouter a non-neutral step at all). Furthermore, after a
negative step the first non-neutral step encountered is positive. Note that it may happen that
the last encountered non-neutral step is negative. Therefore, the total number of positive
steps is at least as large as the number of negative ones. This proves (22).
Lemma 3. Let k ≥ 3 and r ≥ 2. Let B1 ∪ · · · ∪ Br = E0 be a partition of the edge set of the

clique Kk . Denote bi = |Bi |, 1 ≤ i ≤ r, and κ = k2 . We have
b?1 + · · · + b?r ≥ (κ − (r − 1))? + 2(r − 1) ≥ k + r. (27)
Proof. The first inequality of (27) follows from (22) and the identity b1 + · · · + br = κ. The
second inequality is simple. Indeed, for r ≥ k the inequality follows from2(r − 1) ≥ k + r − 2
and (κ − (r − 1))? ≥ 2. For r ≤ k − 1 we have κ − (r − 1) ≥ k−1 2 + 1 and therefore
(κ − (r − 1))? ≥ k.
Now we are ready to upper bound hF , for F = Kk .

Lemma 4. Let k ≥ 3. Let 0 < α ≤ 2. Let A > 0 Let n, m → +∞. Assume that m ≤ An.
1
−k
Suppose that F = Kk . Then (7) implies the bound hF = o n α . Note that for α = 2
condition (7) is the same as (3).
9
Proof. We start with the observation that (7) implies

? k
E X b −b/(α eF ) Qb < ∞, 1≤b< . (28)
2
Note that ŝ = s−1

2 + 1 is the smallest integer t such that t? = s. In particular, for
?
any b with b = s we have b ≥ ŝ. Therefore, given 2 ≤ s ≤ k, the moment condition
E X s−ŝ/(α eF ) Qŝ < ∞ implies E X s−b/(α eF ) Qb < ∞ for any b satisfying b? = s. In this
way (7) yields (28)
Let us upperbound hKk . Given a partition B̃ = (B1 , . . . , Br ) of the edge set E0 of
Kk = ([k], E0 ), let vj be the number of vertices incident to the edges from Bj and let
bj = |Bj |. For any vector ĩ = (i1 , . . . , ir ) of distinct colors we have
r r ? r
E X̃ bj Qbj

Y E X̃ vj Qbj Y 1 Y ?
E X̃ bj Qbj .

h(B̃, ĩ) ≤ v
≤ b ? ≤ k+r
j=1
n j
j=1
n j n j=1
Here the first inequality follows from the inequality (X̃)t /(n)t ≤ X̃ t /nt , since X̃ ≤ n. The
second inequality follows from the obvious inequality b? ≤ vj and the fact that X̃ ≤ n. The
last inequality follows from the inequality b?1 + · · · + b?r ≥ k + r of Lemma 3.
For each r-partition B̃ as above we upperbound the sum over all possible collorings ĩ
(there are (m)r of them)
r r
X (m)r Y b?
Ar Y b?
1
j Qbj j Qbj n α −k .

h(B̃, ĩ) ≤ E X̃ ≤ E X̃ = o (29)
nk+r j=1 nk j=1
ĩ
k

In the very last step we used the bounds (with eF = b1 + · · · + bj = 2 )
b
j
?
bj bj
E X̃ Q = o n α eF ,
b

j
b?
j−αe bj
that follow from the moment conditions E X < ∞, see (28), via (18). Finally,
F Q
1
proceeding as in (21) above we derive the desired bound hF = o n α −k from (29).
3.2 Power law tails

Given a graph F = (VF , EF ) with the vertex set VF and edge set EF we denote vF = |VF |
the number of vertices and eF = |EF | the number of edges. Let ΨF = ΨF (n, p) = nvF peF .
Let
ΦF = ΦF (n, p) = min ΨH , mF = max (eH /vH ).
H⊂F H⊂F
Here the minimus/maximum is taken over all subgraphs H ⊂ F with eH > 0. Recall that F
is called balanced if mF = eH /vF . For a balanced F we have for any H ⊂ F with eH > 0
vH vH v /v
ΨH = npeH /vH ≥ npeF /vF = ΨFH F .
Hence
2/vF
ΦF ≥ min{ΨF , ΨF }. (30)
Let ∆F be the maximum degree of F .
Lemma 5. Let k ≥ 3 be an integer. Let a > 0 and 0 < α < 2. Assume that F is balanced and
connected. Let NF be the number of copies of F in G(X1 , Q1 ). Denote NF∗ = E(NF |X1 , Q1 ).
Assume that
P{NF∗ > t) = a(1 + o(1))t−α as t → +∞. (31)
Then
P{NF > t} = a(1 + o(1))t−α as t → +∞. (32)
10
For 0 < α < 2 the tail asymptotics (32) implies that NF belongs to the domain of
attraction of an α-stable distribution. Indeed, the left tail of NF vanishes since P{NF ≥
0} = 1. Therefore the conditions of Theorem 2 in § 35 Chapter 7 of [9] are satisfied.
Proof. By E∗ and P∗ we denote the conditional expectation and probability given (X1 , Q1 ).
Denote ∆∗F = NF − NF∗ . In the proof we often use the fact (Lemma 3.5 of [13]) that
(NF∗ )2
E∗ (∆∗F )2 (1 − Q1 ). (33)
ΦF (X1 , Q1 )
We also use the simple relation NF∗ ΨF (X1 , Q1 ).
To prove (32) we show that the contribution of ∆∗F to the sum NF = NF∗ + ∆∗F is
negligible compared to NF∗ and, therefore, the tail asymptotic (32) is determined by (31). For
this purpose we apply exponential large deviation bounds for subgraph counts in Bernoulli
random graphs [13], [14].
Given large t > 0 and small ε > 0 introduce event H = {−εNF∗ ≤ ∆∗F ≤ εt} and split
P{NF > t} = P{NF > t, H} + P{NF > t, ∆∗F < −εNF∗ } + P{NF > t, ∆∗F > εt}
=: P1 + P2 + P3 . (34)
We firstly consider P1 . Replacement of ∆∗F by its extreme values (on H) yields the inequalities
P{(1 − ε)NF∗ > t, H} ≤ P1 ≤ P{NF∗ > t(1 − ε), H}. (35)
We note that the right side of (35) is at most P{NF∗ > t(1 − ε)} and the left side is at least
P{(1 − ε)NF∗ > t} − P20 − P30 ,
where
P20 := P{(1 − ε)NF∗ > t, ∆∗F < −εNF∗ }, P30 := P{(1 − ε)NF∗ > t, ∆∗F > εt}.
Hence, we have
P{(1 − ε)NF∗ > t} − P20 − P30 ≤ P1 ≤ P{NF∗ > t(1 − ε)}.
Invoking the simple inequalities P2 ≤ P20 and P30 ≤ P3 we obtain from (34) that yield
P{(1 − ε)N1∗ > t} − P20 ≤ P{NF > t} ≤ P{NF∗ > t(1 − ε)} + P20 + P3 . (36)
We show below that for any 0 < ε < 1
P20 = o(t−α ), P3 = o(t−α ) as t → +∞. (37)
Note that (31), (36) together with (37) imply (32). It remains to show (37).
Proof of P20 = o(t−α ). Given (X1 , Q1 ) with 0 < Q1 < 1, we apply Janson’s inequality
(Theorem 2.14 of [13]) to the conditional probability p∗ε := P∗ {∆∗F < −εNF∗ }. In what
follows we assume that the random graph G(X1 , Q1 ) and complete graph KX1 are both
defined on the same vertex set of size X1 and that X1 ≥ 1. Let
X X
δ := E∗ NF2 − δ, E∗ (IH 0 IH 00 ).

δ :=
H 0 ⊂KX1 H 00 ⊂KX1
EH 0 ∩EH 00 =∅
Here the sum runs over ordered pairs (H 0 , H 00 ) of subgraphs of KX1 such that H 0 , H 00 are
copies of F and their edge sets EH 0 and EH 00 are disjoint. Furthermore, IH stands for the
indicator of the event that H is present in G(X1 , Q1 ). Janson’s inequality implies
∗ 2
P∗ {∆∗F < −ηNF∗ } ≤ e−(ηNF ) /δ̄
, ∀ η ∈ (0, 1). (38)
11
In what follows we upperbound δ̄. The (variance) identity E∗ (NF2 ) − (NF∗ )2 = E∗ (∆∗F )2
implies
δ = E∗ (∆∗F )2 + (NF∗ )2 − δ. (39)
Furthermore, by the fact that VH 0 ∩ VH 00 = ∅ implies EF 0 ∩ EH 00 = ∅ and that the latter
relation implies E∗ (IH 0 IH 00 ) = (E∗ IH 0 )(E∗ IH 00 ), we have
X X
E∗ (IH 0 IH 00 ) = NF∗ E∗ ÑF .

δ≥
H 0 ⊂KX1 H 00 ⊂KX1
VH 0 ∩VH 00 =∅
Here ÑF denotes the number of copies of F in G(X1 − k, Q1 ). Recall that k is the number
1 −k)k
of vertices of F . The identity E∗ ÑF = (X(X 1 )k
NF∗ combined with the simple inequalities
(X1 − k)k X1 − k k
≥ ≥1− , for X1 ≥ 2k
(X1 )k X1 X1
implies the lower bound δ ≥ (N1∗ )2 (1 − kX1−1 ). Invoking this bound in (39) we obtain
δ̄ ≤ E∗ (∆∗F )2 + (NF∗ )2 kX1−1 .
Hence the ratio in the exponent of (38)
(NF∗ )2 (NF∗ )2 (NF∗ )2 X1

≥ = min , (40)
δ̄ max{E∗ (∆∗F )2 , (NF∗ )2 kX1−1 } E∗ (∆∗F )2 k
We will show below that there exist ck > 0 such that for large t the inequality NF∗ > t implies
(NF∗ )2
> ck t2/k . (41)
E∗ (∆∗F )2
Furthermore, we note that NF∗ > t implies X1 > (t/cF )1/k , where cF is the number of
different copies of F contained in Kk (indeed, the number Xk1 cF of copies of F in KX1

is larger that the “E∗ -expected” number NF∗ of copies of F in G(X1 , Q1 )). Therefore, the
right side of (40) is at least min{B ln t, (t/cF )1/k }. Finally, from (38) we obtain on the event
NF∗ > t that
1/k
p∗ε ≤ e−ε min{B ln t,(t/cF ) } = o(t−α ) as t → +∞,
provided that we choose B satisfying B > 2/ε. We conclude that P20 = o(t−α ). It remins to
show (41). Recall that NF∗ ΨF (X1 , Q1 ) = X1vF Qe1F . Hence for large t > 0 on the event
{NF∗ > t} we have ΨF (X1 , Q1 ) t 1. By (33), the ratio
(NF∗ )2 ΦF (X1 , Q1 )
∗ ∗ 2
. (42)
E (∆F ) 1 − Q1
2/vF
Using ΦF (X1 , Q1 ) ≥ ΨF (X1 , Q1 ), see (30), we lowerbound the right side of (42)
ΦF (X1 , Q1 ) 2/v 2/k

≥ ΦF (X1 , Q1 ) ≥ ΨF F (X1 , Q1 ) = ΨF (X1 , Q1 ) t2/k .
1 − Q1
Proof of P3 = o(t−α ). In the proof we apply exponential inequalities for upper tails of
subgraph counts in Bernoulli random graphs [14]. For readers convenience we state the result
of [14] we are going to use. Let

1
 if p < n−1/mF ,
1/α∗H
MF (n, p) = minH⊂F ΨH (n, p) if n−1/mF ≤ p ≤ n−1/∆F ,

if p ≥ n−1/∆F .
 2 ∆F
n p
12
∗
Here αH is the fractional independence number of H, see [14]. We will use the lower bound
∗
αH ≤ vH − 1 that holds for any H with eH > 0, see formula (A.1) in [14]. Let ξF be the
number of copies of F in G(n, p). By Theorems 1.2, 1.5 of [14], for any η > 0 there exists
cη,F such that uniformly in p and n ≥ k (here k = vF is the number of vertices of F ) we
have
P{ξF ≥ (1 + η)EξF } ≤ e−cη,F MF (n,p) . (43)
We apply (43) to the number NF of copies of F in G(X1 , Q1 ). We show that P{∆∗F >
s} = o(s−α ) as s → +∞. We split
P{∆∗F > s} = P{∆∗F > ηNF∗ , ∆∗F > s} + P{∆∗F ≤ ηNF∗ , ∆∗F > s} =: P31 + P32 .
The second term

P32 ≤ P{NF∗ > s/η} = η α (a + o(1))s−α (44)
can be made negligibly small by choosing η arbitrarily small.
Now we upperbound the first term P31 . Introduce events
−1/mF −1/mF −1/∆F −1/∆F
A1 = Q1 ≤ X1 , A21 = X1 < Q1 < Q1 , A22 = Q1 ≥ X1
and denote A2 = A21 ∪ A22 . We split
P31 = P̃1 + P̃2 , P̃i := P{∆∗F > ηNF∗ , ∆∗F > s, Ai } (45)
and upperbound P̃1 and P̃2 .

−1/mF
We firstly consider P̃1 . The inequality Q1 < X1 implies ΨF (X1 , Q1 ) < 1. The latter
∗ ∗ 2
inequality together with (30), (33) imply E (∆F ) ≤ ck ΨF (X1 Q1 ) ≤ ck for some ck > 0.
Hence, on the event A1 we have E∗ (∆∗F )2 ≤ ck . Finally, by Markov’s inequality,
P̃1 ≤ P{∆∗F > s, A1 } = E IA1 E∗ I{∆∗F >s} ≤ E IA1 E∗ (∆∗F )2 s−2 ≤ ck s−2 .

(46)
F −1/m
We secondly consider P̃2 . The inequality X1 ≤ Q1 implies ΨF (X1 , Q1 ) > 1. For
∗
balanced F this yields ΦH (X1 , Q1 ) > 1 for every H ⊂ F with eH > 0. Then using αH ≤
vH − 1 we obtain
1/α∗H 1/vH 1/vF
min ΨH (X1 , Q1 ) ≥ min ΨH (X1 , Q1 ) = ΨF (X1 , Q1 ) .
H⊂F : eH >0 H⊂F : eH >0
Hence, on the event A21 we have (recall that vF = k)

1/k
MF (X1 , Q1 ) ≥ ΨF (X1 , Q1 ) . (47)
−1/∆
F
Likewise on the event A22 the inequality Q1 ≥ X1 yields MF (X1 , Q1 ) ≥ X1 . Now the
vF
trivial inequality X1 > ΨF (X1 , Q1 ) implies (47).
From (43) and (47) we obtain the exponential bound
1/k
P∗ {∆∗F > ηNF∗ } ≤ ecη,F MF (X1 ,Q1 ) ≤ ecη,F ΨF (X1 ,Q1 ) . (48)
Let us upperbound P̃2 . We fix (a large) number B > 0 and introduce events
B1 = {ΨF (X1 , Q1 ) > B lnk s}, B2 = {ΨF (X1 , Q1 ) ≤ B lnk s}.
We split
P̃2 = P̃21 + P̃22 , P̃2i = P{∆∗F > ηNF∗ , ∆∗F > s, A2 , Bi }
and estimate using (48)
P̃21 ≤ P{∆∗F > ηNF∗ , A2 , B1 } = E (IB1 IA2 P∗ {∆∗F > ηNF∗ })

1/k
1/k
≤ E IB1 e−cη,F (ΨH (X1 ,Q1 )) ≤ e−cη,F B ln s . (49)
13
Now we estimate P̃22 . The inequalities 1 ≤ ΨF (X1 , Q1 ) ≤ B lnk s that hold on the event
A2 ∩ B2 imply, see (30), (33),
2−(2/k)
E∗ (∆∗F )2 ΨF (X1 , Q1 ) (1 − Q1 ) ≤ ck (B lnk s)2−(2/k) .
Combining the inequalities above with Markov’s inequality P∗ (∆∗F > s) ≤ s−2 E∗ (∆∗F )2 we
obtain
P̃22 ≤ P{∆∗F > s, A2 , B2 } = E IA2 IB2 P∗ {∆∗F > s}

≤ E IA2 IB2 s−2 E∗ (∆∗F )2 ≤ ck B 2−(2/k) s−2 ln2k−2 s.

(50)
Finally we show that for any given 0 < ε < 1 the probability P3 = o(t−α ) as t → +∞.
We fix ε and denote s = εt. Then
lim sup tα P3 ≤ lim sup tα P{∆∗F > εt} = ε−α lim sup sα P{∆∗F > s}
t→+∞ t→+∞ s→+∞
−α
≤ε lim sup s (P̃1 + P̃21 + P̃22 + P32 ) ≤ (η/ε)α a.
α
s→+∞
The last inequality follows from (44) and (46), and (49), (50). Indeed, given η > 0 we
choose B = B(η) (in (49), (50)) large enough so that cη,F B 1/k > 2. Then P̃21 ≤ s−2 and
lim sups sα P̃21 = 0. We also mention obvious relations lim sups sα P̃1 = 0, lim sups sα P̃22 = 0.
References
[1] M. Bloznelis and V. Kurauskas. Clustering coefficient of random intersection graphs
with infinite degree variance. Internet Mathematics (2016) doi.org/10.24166/im.02.2017
[2] M. Bloznelis and J. Jaworski. The asymptotic normality of the global clustering coeffi-
cient in sparse random intersection graphs. Algorithms and models for the web graph,
16–29, Lecture Notes in Comput. Sci., 10836, Springer, 2018.
[3] M. Bloznelis and L. Leskelä, Clustering and percolation on superpositions of Bernoulli
random graphs, arXiv:1912.13404, 2019.
[4] M. Bloznelis, J. Karjalainen and L. Leskelä, (2021). Assortativity and bidegree distri-
butions on Bernoulli random graph superpositions. Submitted
[5] Breiger, R.L. (1974) The duality of persons and groups. Social Forces 53(2), 181–190.
[6] Britton, T., Deijfen, M., Lageras, A.N., Lindholm, M. (2008) Epidemics on random
graphs with tunable clustering. J. Appl. Probab. 45(3), 743–756.
[7] Frieze, A., Karoński, M. Introduction to Random Graphs. Cambridge University Press
(2015).
[8] Godehardt, E., Jaworski, J. (2001) Two models of random intersection graphs and their
applications. Electronic Notes in Discrete Mathematics 10, 129–132.
[9] B. V. Gnedenko, A. N. Kolmogorov, Limit distributions for sums of independent random
variables. Addison-Wesley, Cambridge, 1954.
[10] Gröhn, T., Karjalainen, J., Leskelä, L.: Clique and cycle frequencies in a sparse random
graph model with overlapping communities, arXiv:1911.12827
[11] Guillaume, J. L.& Latapy, M. (2004) Bipartite structure of all complex networks. In-
form. Process. Lett. 90, 215–221.
[12] Hladký, J., Pelekis, Ch., Šileikis, M. (2021+) A limit theorem for small cliques in
inhomogeneous random graphs. J. Graph Theory.
[13] S. Janson, T. Luczak, and A. Ruciński, Random Graphs, Wiley, New York, 2000.
14
[14] S. Janson, K. Oleszkiewicz, and A. Ruciński, (2004). Upper tails for subgraph counts
in random graphs. Isr. J. Math. 142, 61–92.
[15] J. Karjalainen, J. S. H. van Leeuwaarden, L. Leskelä, Parameter estimators of random
intersection graphs with thinned communities. In A. Bonato. P. Pralat A. Raigorodskii
(Eds.): Algorithms and models for the web graph - 15th International workshop, WAW
2018, Lecture Notes in Comput. Sci. 10836, Springer (2018), 44–58.
[16] Kurauskas, V.: On local weak limit and subgraph counts for sparse random graphs,
arXiv preprint arXiv:1504.08103v2 (2015).
[17] Ouadah, S., Latouche, P. Robin, S. Motif-based tests for bipartite networks,
arXiv:2101.11381 (2001).
[18] Petti, S., Vempala, S.: Approximating sparse graphs: The random overlapping commu-
nities model, arXiv: 1802.03652 (2018).
[19] Privault, N., Serafin, G. (2020) Normal approximation for sums of weighted U-
statistics—application to Kolmogorov bounds in random subgraph counting. Bernoulli
26: 587–615.
[20] Rollin, A. (2021+) Kolmogorov bounds for the normal approximation of the number of
triangles in the Erdos-Renyi random graph, Probability in the Engineering and Infor-
mational Sciences (2021+), doi:10.1017/S0269964821000061
[21] Ruciński, A. (1988). When are small subgraphs of a random graph normally distributed?
Probability Theory and Related Fields 78: 1–10.
[22] Yang, J. and Leskovec, J.: Community-affiliation graph model for overlapping network
community detection. In 2012 IEEE 12th International Conference on Data Mining,
pages 1170–1175. IEEE, (2012).
[23] Yang, J., Leskovec, J.: Structure and overlaps of ground-truth communities in networks.
ACM Trans. Intell. Syst. Technol. 5(2) (Apr 2014).
[24] Statistical analysis of networks, 8th Nordic-Baltic Biometrics Virtual Conference,
Helsinki, June 7–10, 2021, https://nbbc21.helsinki.fi/
15

Normal and Stable Approximation To Subgraph Counts in Superpositions of Bernoulli Random Graphs

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Normal and Stable Approximation To Subgraph Counts in Superpositions of Bernoulli Random Graphs

Uploaded by

Copyright:

Available Formats

Normal and stable approximation to subgraph counts

in superpositions of Bernoulli random graphs

Mindaugas Bloznelis1 , and Joona Karjalainen2 , and Lasse Leskelä2

Naugarduko 24, LT-03225 Vilnius, Lithuania

School of Science, Aalto University

For F = Kk we assume that

where we denote r̂ := r−1

P{NF∗ > t} = (a + o(1))t−α as t → +∞. (5)

For F = Kk we assume that

2 General results and proofs

Hence P{S̃F 6= SF } = o(1).

Then hF = o(n0.5−vF ). Here eF stands for the number of edges of F .

Proof of Theorem 4. Proceeding as in the proof of Theorem 3 above we obtain (NF − SF ) =

Hence E(NF∗ )2 < ∞ implies E(∆∗F )2 = E(E∗ (∆∗F )2 ) < ∞.

v̄j ≤ v̄j−1 + vj0 + ρ̄j − ρ̄j−1 − 1 for j = 2, . . . t − 1. (20)

v̄t−1 ≤ v10 + · · · + vt−1

(s, t) → (s + 1, t − 1) → · · · → (s + t − 1, 1). (23)

In a step (x, y) → (x + 1, y − 1) we remove an edge from Hy and add it to Hx . A simple

(x + 1)? + (y − 1)? = x? + y ? + 1 whenever ∆x = 0, ∆y 6= 1, (24)

We call a step (x, y) → (x + 1, y − 1) positive (respectively negative or neutral) if (25)

Lemma 3. Let k ≥ 3 and r ≥ 2. Let B1 ∪ · · · ∪ Br = E0 be a partition of the edge set of the

b?1 + · · · + b?r ≥ (κ − (r − 1))? + 2(r − 1) ≥ k + r. (27)

Now we are ready to upper bound hF , for F = Kk .

Note that ŝ = s−1

3.2 Power law tails

P{(1 − ε)NF∗ > t, H} ≤ P1 ≤ P{NF∗ > t(1 − ε), H}. (35)

P{(1 − ε)NF∗ > t} − P20 − P30 ,

P{(1 − ε)NF∗ > t} − P20 − P30 ≤ P1 ≤ P{NF∗ > t(1 − ε)}.

We show below that for any 0 < ε < 1

P20 = o(t−α ), P3 = o(t−α ) as t → +∞. (37)

δ̄ ≤ E∗ (∆∗F )2 + (NF∗ )2 kX1−1 .

Hence the ratio in the exponent of (38)

(NF∗ )2 (NF∗ )2 (NF∗ )2 X1

ΦF (X1 , Q1 ) 2/v 2/k

The second term

and denote A2 = A21 ∪ A22 . We split

and upperbound P̃1 and P̃2 .

Hence, on the event A21 we have (recall that vF = k)

B1 = {ΨF (X1 , Q1 ) > B lnk s}, B2 = {ΨF (X1 , Q1 ) ≤ B lnk s}.

P̃21 ≤ P{∆∗F > ηNF∗ , A2 , B1 } = E (IB1 IA2 P∗ {∆∗F > ηNF∗ })

P̃22 ≤ P{∆∗F > s, A2 , B2 } = E IA2 IB2 P∗ {∆∗F > s}

≤ E IA2 IB2 s−2 E∗ (∆∗F )2 ≤ ck B 2−(2/k) s−2 ln2k−2 s.

You might also like