Pooled Mining Is Driving Blockchains Toward Centralized Systems
Abstract: The decentralization property of blockchains stems from the fact that each miner accepts or refuses transactions and blocks based on its own verification results. However, pooled mining causes blockchains to evolve into centralized systems because pool participants delegate their decision-making rights to pool managers. In this paper, we established and validated a model for Proof-of-Work mining, introduced the concept of equivalent blocks, and quantitatively derived that pooling effectively lowers the income variance of miners. We also analyzed Bitcoin and Ethereum data to prove that pooled mining has become prevalent in the real world. The percentage of pool-mined blocks increased from 49.91% to 91.12% within four months in Bitcoin and from 76.9% to 92.2% within five months in Ethereum. In July 2018, Bitcoin and Ethereum mining were dominated by only six and five pools, respectively.

Keywords: public blockchain; centralization; pooled mining; modeling; data analysis

I. INTRODUCTION

Bitcoin, introduced by Satoshi Nakamoto in [1] as a cash system without a central authority, is now experiencing centralization due to pooled mining. In a blockchain network, a full node maintains a copy of the public ledger recording what transactions have happened and in what order. The process of appending blocks to the ledger is called mining. Bitcoin relies on Proof-of-Work (PoW) to select a leader to propose the next block. In order to motivate miners to participate in the consensus process and extend the ledger, Bitcoin rewards miners with newly minted coins. Specifically, there are seven steps in block appending: (1) listening for transactions and validating them; (2) listening for blocks and validating them; (3) maintaining a replica of the blockchain and choosing the parent block based on the consensus rule; (4) assembling a block; (5) solving a PoW puzzle; (6) publishing the block to the network; (7) waiting for the block to be buried deep enough in the blockchain so that the reward becomes spendable [2].

In the first three steps, miners secure the network by validating transactions and blocks; in the following three, miners append new transactions to the ledger; in the last step, miners are rewarded for their dedication to the work. However, pooled mining destroys this incentivizing mechanism. Pool participants only do Step (5) for rewards and delegate their voting rights to the pool manager. This situation drives blockchain systems toward centralization, which violates the essential decentralization property of blockchains. There is no effective pool-resistant mechanism for Bitcoin and Ethereum at present, as pools still dominate mining [3] [4].

This paper makes two main contributions:
• We build the first quantitative model which reveals that the income variance of a miner is inversely proportional to pool sizes and that the expected income is hardly affected by pooling.
• We analyze real-world data from both Bitcoin and Ethereum. The results show that pools dominate mining for both blockchains starting from late 2017, coinciding with the surge in their prices.

II. BACKGROUND

In PoW-based blockchains, a valid block header hash must be smaller than or equal to a target threshold, usually with tens of leading zeros. Since hash functions are one-way, miners have to adjust some fields (e.g. the nonce in the block header) and then hash the header to check if it meets the requirement.

A pool manager joins a blockchain network as a single miner. Instead of solving PoW puzzles by itself, the pool manager outsources the task to pool participants after validating transactions and assembling a block. A pool manager assembles a block with its own address as the receiver in the coinbase transaction. Then the block header is sent to every participant in the pool, as shown in Figure 1, and everyone tries different nonces and timestamps for this block header.¹

Participants are not able to mine blocks for themselves because they do not know the coinbase transaction in the block body. If a participant finds a valid nonce, it sends the nonce back to the pool manager, who later distributes the reward based on how much work each participant has done. Participants prove their contribution by sending shares, i.e., near-valid blocks, to the pool manager. For example, if the target threshold starts with 75 leading zeros, a share may have only 70 leading zeros. Depending on the pooling protocol [5], pool participants may receive a reward either when they submit shares or when the pool finds a block. A block reward consists of a block subsidy, i.e. newly minted coins, and transaction

¹ For Bitcoin, the timestamp is a UNIX timestamp that must be greater than the median of the previous 11 block timestamps and less than two hours in the future.
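The hash check and the notion of shares described in this section can be sketched as follows. This is a minimal illustration, not a real mining client: the header prefix is a hypothetical byte string rather than a serialized 80-byte Bitcoin header, the thresholds are toy values chosen so the search finishes quickly, and all function names are assumptions introduced here. Double SHA-256 is used because that is the hash Bitcoin applies to block headers.

```python
import hashlib

def block_hash(header: bytes) -> int:
    """Double SHA-256 of the header bytes, read as a little-endian integer."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "little")

def meets_target(header: bytes, target: int) -> bool:
    """A header is valid when its hash is smaller than or equal to the target."""
    return block_hash(header) <= target

def leading_zero_bits(value: int) -> int:
    """Number of leading zero bits of a 256-bit value."""
    return 256 - value.bit_length()

def search_nonce(prefix: bytes, target: int, max_tries: int):
    """Try successive 4-byte nonces; return the first that meets the target, else None."""
    for nonce in range(max_tries):
        if meets_target(prefix + nonce.to_bytes(4, "little"), target):
            return nonce
    return None

# Toy thresholds (assumptions): a block needs 16 leading zero bits,
# a share only 8, so every valid block is also a valid share.
BLOCK_TARGET = (1 << (256 - 16)) - 1
SHARE_TARGET = (1 << (256 - 8)) - 1

# A participant searching for a share on a hypothetical header prefix.
share_nonce = search_nonce(b"hypothetical-header-prefix", SHARE_TARGET, 200_000)
```

Because the share threshold is easier than the block threshold, a participant submits shares far more often than blocks, which is what lets the pool manager estimate each participant's work.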
[Figure 2: four histogram panels, (a) Entire network, (b) BTC.com, (c) ViaBTC, (d) BTCCPool. Horizontal axis: number of blocks found in one hour. Vertical axis: frequency. Each panel compares data from the Bitcoin blockchain with theoretical values for D/hetr = 9.15; panel (a) also shows theoretical values for D/hetr = 10.]
Fig. 2. Frequency histogram of number of blocks mined in one hour (from Bitcoin block height 516096 to 518111)
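The theoretical values plotted in Fig. 2 can be approximated with a short sketch. It assumes that Equation 1 is the Poisson probability mass function, that the network-wide mean number of blocks per window is the window length divided by the mean time between blocks, and that a pool's mean scales with its relative hash power from Table I. The function names and the way the number of one-hour windows is derived are illustrative assumptions, not the authors' code.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def expected_frequencies(mean_block_minutes: float,
                         relative_hash_power: float,
                         n_windows: int,
                         window_minutes: float = 60.0,
                         max_k: int = 16) -> list:
    """Expected number of windows containing k = 0..max_k blocks."""
    lam = relative_hash_power * window_minutes / mean_block_minutes
    return [n_windows * poisson_pmf(k, lam) for k in range(max_k + 1)]

# Assumption: 2016 blocks at a mean of 9.15 minutes span about 307 one-hour windows.
n_windows = round(2016 * 9.15 / 60)

network = expected_frequencies(9.15, 1.0, n_windows)   # entire network
btc_com = expected_frequencies(9.15, 0.26, n_windows)  # relative hash power 0.26
```

Comparing such expected frequencies against the observed hourly counts is the input a Chi-squared goodness-of-fit test needs.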
where E(X) denotes the expected value of variable X.

We analyzed the distribution of the number of blocks mined in one hour (t = 60 minutes) for the entire network and three pools with different hash rates as listed in Table I. The blue curve in Figure 2a comprises theoretical values produced by Equation 1. There is an obvious offset between the blue curve and actual Bitcoin data because D is adjusted based on the previous 2016 blocks but used for block validation for the next 2016 blocks. Since miners are incentivized to upgrade their mining hardware over time, the actual value of D/hetr should be less than 10. We then calculated the actual mean time between blocks of the 2016 blocks under consideration using the timestamp field in block headers. Unsurprisingly, the value is 9.15 minutes. With D/hetr set to 9.15, the red curve in Figure 2a matches practical data better than the blue one.

TABLE I
MINERS AND THEIR RELATIVE HASH POWER

Pool Name        Blocks Found    Relative Hash Power
Entire Network   2016            1
BTC.com          530             0.26
ViaBTC           221             0.11
BTCCPool         63              0.03

The model also matches the data of the three pools as illustrated in Figures 2b, 2c, and 2d. We performed a Chi-squared goodness-of-fit test to determine whether a Poisson process is an appropriate model for the observed Bitcoin data. Table II gives the test results. Since all p-values are greater than 0.05, there is no evidence to suggest that the data do not follow a Poisson distribution. However, it is not hard to observe that our model produces a relatively large error for BTC.com in Figure 2b. The difference could result from the violation of Assumption 5 in the real world. In order to forward a block a single hop, the sending miner has to verify the block, announce its knowledge of the block to the receiving miner, wait for the response of the receiving miner, and finally send the block if the receiving miner has not heard of the block [10]. Decker et al. found that the mean time until a node receives a block is 12.6 seconds in the Bitcoin network [10]. When block propagation delay is considered, a miner is always the first to receive blocks mined by itself and thus has more time
than others to work on the next blocks. This situation makes block discoveries not completely independent, especially for large pools.

TABLE II
GOODNESS-OF-FIT TEST RESULTS

Pool Name        p-value
Entire Network   0.44
BTC.com          0.09
VIA              0.33
BTCC             0.80

IV. INCENTIVES TO JOIN A POOL

This section quantitatively analyzes the main benefit of joining a pool and briefly introduces other minor factors. In our analysis, we use small miners to refer to miners owning relatively small amounts of computational resources, and similarly, large miners to refer to miners owning relatively huge amounts of computational resources. From the perspective of the network, pools and large solo miners are indistinguishable,

[...] follows:

\[
E(X_{pool}) = \sum_{k} P(X_{pool} = k) \cdot k = \frac{Ht}{D} \quad \text{(by Poisson dist.)} \tag{4}
\]

\[
Var(X_{pool}) = \sum_{k} P(X_{pool} = k) \cdot [k - E(X_{pool})]^2 = \frac{Ht}{D} \quad \text{(by Poisson dist.)} \tag{5}
\]

Since $P(X_{eq} = \frac{k}{S}) = P(X_{pool} = k)$, we have

\[
\begin{aligned}
E(X_{eq}) &= \sum_{k} P\left(X_{eq} = \frac{k}{S}\right) \cdot \frac{k}{S} \\
          &= \sum_{k} P(X_{pool} = k) \cdot \frac{k}{S} \\
          &= \frac{1}{S} \cdot E(X_{pool}) \\
          &= \frac{h}{H} \cdot \frac{Ht}{D} = \frac{ht}{D}
\end{aligned} \tag{6}
\]

\[
\begin{aligned}
Var(X_{eq}) &= \sum_{k} P\left(X_{eq} = \frac{k}{S}\right) \cdot \left[\frac{k}{S} - E(X_{eq})\right]^2 \\
            &= \sum_{k} P(X_{pool} = k) \cdot \left[\frac{k}{S} - \frac{E(X_{pool})}{S}\right]^2 \\
            &= \frac{1}{S^2} \sum_{k} P(X_{pool} = k) \cdot [k - E(X_{pool})]^2 \\
            &= \frac{1}{S^2} \cdot Var(X_{pool}) \\
            &= \frac{h^2}{H^2} \cdot \frac{Ht}{D} = \left(\frac{h}{H}\right) \cdot \frac{ht}{D} = \frac{1}{S} \cdot \frac{ht}{D}
\end{aligned} \tag{7}
\]

Equations (6) and (7) show that joining a pool keeps a miner's expected income unchanged but lowers the income variance by a factor of S, as summarized in Table III.

TABLE III
COMPARISON OF MINING STRATEGIES

Mining Strategy   Expected Blocks   Variance
Solo              ht/D              ht/D
Pooled            ht/D              (ht/D)/S

[Figure 3: probability mass function plot. Horizontal axis: number of equivalent blocks, from 0.0 to 1.0. Vertical axis ticks run from 0.0 to 0.6.]
Fig. 3. PMF of the number of equivalent blocks mined by an example miner

Example. The following example visually demonstrates the effect of joining a pool from a miner's perspective. Suppose a miner m uses Application-Specific Integrated Circuit (ASIC) mining hardware whose capacity is $h = 14 \times 2^{40}$ hashes/s to mine Bitcoin continuously for a year. On March 9, 2018, the nBits value is 0x175589a3 in hexadecimal, which can be converted to a target threshold of $\text{0x5589a3} \times 256^{\text{0x17} - 3} = \text{0x5589a3} \times 2^{160}$ [14]. Since the threshold starts with 73 leading zero bits, we use $2^{73}$ to approximate the mining difficulty D in this example. When m joins pools of different sizes, the PMF of $X_{eq}$ varies as shown in Figure 3. It is obvious that m is faced with high variance under solo mining. The probability of mining at least one block in a year is only 0.05, whereas that of not even receiving any reward in a year is as high as 0.95. Figure 3 clearly illustrates that: (1) the expected number of equivalent blocks stays constant, regardless of pool sizes, and (2) the variance drops significantly if m joins a pool. The larger the pool size, the lower the variance.
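The numbers in the example can be reproduced with a short sketch. It assumes a 365-day year, ignores the sign bit of the compact nBits encoding (which is zero for this value), and follows the paper's approximation D = 2^73. The function names are illustrative assumptions.

```python
import math

def nbits_to_target(nbits: int) -> int:
    """Decode a compact nBits value as mantissa * 256^(exponent - 3)."""
    exponent = nbits >> 24        # high byte, 0x17 in the example
    mantissa = nbits & 0xFFFFFF   # low three bytes, 0x5589a3 in the example
    return mantissa * 256 ** (exponent - 3)

def equivalent_block_stats(h: float, t: float, D: float, S: float):
    """Mean and variance of equivalent blocks for a miner in a pool S times its size."""
    mean = h * t / D
    return mean, mean / S

target = nbits_to_target(0x175589A3)
zero_bits = 256 - target.bit_length()   # leading zero bits of the 256-bit target

h = 14 * 2 ** 40           # hashes per second
t = 365 * 24 * 3600        # seconds in a year (assumption: 365 days)
D = 2 ** 73                # approximate difficulty used in the example

lam = h * t / D                      # expected blocks under solo mining
p_no_reward = math.exp(-lam)         # Poisson probability of zero blocks
p_at_least_one = 1.0 - p_no_reward

for S in (1, 10, 100, 1000):
    mean, var = equivalent_block_stats(h, t, D, S)
    print(f"S={S:5d}  mean={mean:.4f}  variance={var:.6f}")
```

The mean column stays at the same value for every S while the variance column shrinks in proportion to 1/S, which is the content of Equations (6) and (7).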
Low income variance implies steady income and hence
predictable short-term income, which are strong incentives
because people usually value certainty especially when it
comes to money.
V. REAL-WORLD BLOCKCHAIN DATA ANALYSIS

In Section IV, a Poisson process model is employed to explain why miners are incentivized to pool in theory. In this section, real-world data from Bitcoin and Ethereum are examined to reveal that pooling has indeed become a trend and blockchains may not be as decentralized as claimed.

Miners are identified based on tags labeled by block explorers of Bitcoin [3] and Ethereum [15] [16]. This approach was also taken by Gencer et al. [17]. The analytical results would be representative even if miner tags were wrong because blocks are grouped by miner addresses instead of tags. In other words, each slice in Figures 5 and 7 (excluding the slice labeled as other miners and the slice aggregating multiple small pools) corresponds to one distinct block reward receiver address, but the name of the receiver might be inaccurate.

While identifying pools, we found that although miners were allowed to create an unlimited number of addresses, most pools continued using the same one, and others used fewer than five addresses over the years. We suspect that pools want to be identified so that they can leverage their mining records on blockchains to build up a reputation. Further analysis is needed to support this hypothesis.

Fig. 4. Number of pool-mined and non-pool-mined Bitcoin blocks from January 2017 to July 2018

[Figure 5: pie chart. Legend entries: BTC.com, BitClub, Bixin, AntPool, BTC.TOP, KanoPool, Huobi.pool, BitcoinRussia, SecretSuperstar, ConnectBTC, tigerpool.net, DPOOL, tiger, CanoePool, BitMinter, Bitcoin.com, F2Pool, Other miners, 58COIN, BWPool, SlushPool, BTCC, ViaBTC. Slice values, not matched to labels in the extracted text: 21.2%, 12.5%, 12.1%, 11.7%, 11.1%, 9.3%, 8.9%, 2.7%, 2.1%, 2.0%, 2.0%, 2.0%, 1.4%, 0.9%.]
Fig. 5. Bitcoin block distribution among miners in July 2018. Pool names were collected from [3].
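The address-based grouping described above can be sketched as follows. This is a minimal illustration under stated assumptions: blocks are represented as hypothetical (height, receiver address) pairs, the set of pool addresses stands in for the tags taken from block explorers, and the function names are introduced here rather than taken from the authors' tooling.

```python
from collections import Counter

def count_blocks_by_address(blocks):
    """Count blocks per reward receiver address; blocks are (height, address) pairs."""
    return Counter(address for _, address in blocks)

def pool_mined_share(counts, pool_addresses):
    """Fraction of blocks whose receiver address belongs to a tagged pool."""
    total = sum(counts.values())
    pooled = sum(n for address, n in counts.items() if address in pool_addresses)
    return pooled / total if total else 0.0

def top_k_share(counts, k):
    """Fraction of blocks mined by the k most productive addresses."""
    total = sum(counts.values())
    return sum(n for _, n in counts.most_common(k)) / total if total else 0.0

# Hypothetical toy data: four blocks and two addresses tagged as pools.
blocks = [(1, "addrA"), (2, "addrA"), (3, "addrB"), (4, "addrC")]
tagged_pools = {"addrA", "addrB"}

counts = count_blocks_by_address(blocks)
print(pool_mined_share(counts, tagged_pools))  # share of pool-mined blocks
print(top_k_share(counts, 1))                  # share of the largest address
```

Grouping by address first and attaching names afterwards means a wrong tag changes only a label, not the size of a slice.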
A. Bitcoin

We ran a Bitcoin full node implemented in C++ [13]. Data on the blockchain were retrieved via the Remote Procedure Call (RPC) Application Programming Interface (API) and stored in a MySQL database for analysis. We extracted data between January 1, 2017 and July 31, 2018. This time range was intentionally chosen because it witnessed the growth of pool-mined blocks, from 49.9% in November 2017 to 91.1% in February 2018. The huge increase in pool-mined blocks co-occurred with the surge in Bitcoin price [18]. We suspect that the reason was that people became serious about gaining short-term income when the Bitcoin price grew. Detailed analysis is needed to prove this hypothesis. Afterwards, over 88% of blocks were mined by pools every month as shown in Figure 4.²

The July 2018 data were further investigated to study block distribution among pools. Figure 5 shows that 75.1% of blocks were mined by only six pools, four of which collectively possessed enough computational resources to subvert Bitcoin.

B. Ethereum

We ran an Ethereum full node implemented in the Go language [19] and retrieved data via the JSON RPC API. To be consistent, we analyzed data of the same time range (January 1, 2017 to July 31, 2018) as we did for Bitcoin. Figure 6 depicts that pool-mined blocks increased from 76.9% in June 2017 to 92.2% in October 2017.³ The increase in pool-mined blocks co-occurred with the first surge in Ethereum price [21].⁴ Thereafter, more than 90% of blocks were mined by pools every month. In July 2018, five pools collectively mined 82.7% of all 182526 blocks as shown in Figure 7. These data suggest that Ethereum suffered more from centralization than Bitcoin.

VI. RELATED WORK

Narayanan et al. [2] and Rosenfeld [5] mentioned that the number of blocks mined in a fixed time interval conformed

² The non-pool portion in Figure 4 may contain blocks mined by unrecognized pools.
³ Block count dropped between March 2017 and September 2019 due to the hard fork Byzantium [20].
⁴ The non-pool portion in Figure 6 may contain blocks mined by unrecognized pools.
VII. CONCLUSION

In this paper, we established a Poisson process model for
PoW mining and validated the model with Bitcoin data. Based
on the model, we introduced the concept of equivalent blocks
and quantitatively derived that income variance is inversely
proportional to the pool size that a miner participates in,
whereas the expected income is hardly affected by pooling.
Low income variance incentivizes miners to pool and this fact
has been demonstrated by real-world data from both Bitcoin
and Ethereum. As a pool manager verifies transactions on
behalf of all participants, the data analysis results confirmed
that decentralization is a real challenge facing blockchains.