
Sometime in the fall of 1955, a Chinese statistical worker by the name of Feng Jixi penned what
might well be the most romantic sentence ever written about statistical work. ‘Every time I complete a
statistical table,’ Feng wrote:

my happiness is like that of a peasant on his field catching sight of a golden ear of wheat, my
excitement like that of a steelworker observing molten steel emerging from a Martin furnace, [and]
my elation like that of an artist completing a beautiful painting.

Feng’s clever juxtaposition allowed statistics, that most staid of subjects, to intrude into a much
broader and more affective consciousness, one populated by more readily discernible achievements in
industry, agriculture and the arts.

This intrusion does not sit easily. After all, most of us don’t care too much for statistics. We might
celebrate Olympic medal tallies or share consternation about a decline in GDP, but our engagement
remains superficial. It’s only in moments of crisis that we begin to pay attention. Our current
obsession with all kinds of data related to COVID-19 is a case in point. But even in such moments, we
focus largely on the numbers themselves, wondering about their reliability, politicising them, arguing
about their possible manipulation, or making comparisons within and across societies. Implicit in
these actions is the assumption that there exists a neutral, untainted truth that these numbers can
accurately and unequivocally capture. This is, of course, patently false. Statistics are neutral only if
we accept that how we come to know something has no bearing on what we know (and, of course,
vice versa).

The 1950s were witness to arguably the most vigorous disagreement over this question
of what and how. As the world emerged from the devastation of the Second World War and entered a
period of decolonisation and imperial collapse, countries both old and new reposed great faith in the
authority of quantitative methods and statistics. Collecting and analysing data using advanced
statistical methods came to define modern governance. This shared faith, however, didn’t always
translate into shared methods. Refracted through an increasingly thick Cold War lens, the universal
desire for ever-increasing quantitative control splintered, taking forms that not only varied
significantly but were seen as each other’s correctives.

In October 1949, Mao Zedong and the Communists declared victory over Chiang Kai-shek and his
Nationalist government, putting an end to nearly four decades of chaos punctuated by rebellion,
warlord-rule, a Japanese invasion and, finally, a bloody civil war. Buoyed by their victory and
confident of the transcendence of Marxism-Leninism, the Communists set about transforming China.
Few, if any, were ever more optimistic. But the great promises of improvement also helped rationalise
repressive and violent measures, leaving behind scars both physical and psychological. Powerful and
progressive reforms, such as the redistribution of land to peasants and the enactment of a new,
egalitarian marriage law, coexisted with the need to discipline and subdue different sections of
society, from bureaucrats to merchants to intellectuals. The final campaign of the decade brought the
story full circle. A terrible famine devastated the same peasants that had so spectacularly been
empowered 10 years earlier. Across this vivid canvas, the story of statistics, subject to benign neglect
at the best of times, ought not to occupy pride of place. And yet, it lies at the heart of China’s socialist
experiment in the 1950s.

Ever since their miraculous escape from Nationalist armies in 1934 (mythologised in Party lore as
the Long March), the Communists had sought to distinguish themselves as a party and a movement
with a difference. From their new base in the dusty hills of north-central China, they began
experiments in communist governance, balancing practical and ideological objectives. Recruitment of
a peasant army, land reform and cadre training were accompanied by the creation of a distinct
Marxism-Leninism-inspired theoretical apparatus. After 1949, as they took control of the mainland,
they were in a position to act on their longtime claims to difference. Few domains demonstrated as
clean a break from the past as did statistics. In a speech in 1951, Li Fuchun, one of a handful of
technocratically minded leaders, summarily dismissed the utility of Nationalist-era statistics, branding
them an Anglo-American bourgeois conceit, unsuitable for ‘managing and supervising the country’.
New China needed a new kind of statistics, he declared.

A new kind of statistics China did indeed fashion. And it offered emphatic answers to the interwoven
and essential questions of what and how. Rejecting the Nationalist (and globally dominant)
understanding of statistics as a universal science, Chinese statisticians joined their Soviet compatriots
in redefining statistics as a social science. Its true object was the social world. Eschewed outright were
the physical and natural worlds. These latter areas became the subject of mathematical statistics,
whose correct place was in departments of mathematics, physics, engineering and so on. Symptomatic
of these divides is the career of Xu Baolu, one of China’s foremost probabilists. A professor in the
mathematics department of Peking University during the 1950s, Xu had no known interaction with the
State Statistics Bureau or with the newly established department of statistics at the nearby People’s
University. With their sights set and rightful purpose claimed, Chinese statisticians proceeded to
interpret Marxism’s explicit teleology as grounds to reject the existence of chance and probability in
the social world. In their eyes, there was nothing uncertain about mankind’s march towards socialism
and, eventually, communism. What role, then, could probability or randomness play in the study of
social affairs?

The implications for statistical methods were profound. In rejecting probability, and the larger area of
mathematical statistics within which it belonged, China’s statisticians discarded a large array of
techniques, none more critical than the era’s newest and most exciting fact-generating technology –
large-scale random sampling. Instead, they decided that the only correct way to ascertain social facts
was to count them exhaustively. Only in this way could extensive, complete and objective knowledge
be generated. Out of this understanding emerged a strict hierarchy of methods. At the top was
complete enumeration, realised through a vast system of comprehensive and periodic reports covering
all sectors of the economy. Next came one-time censuses, which were used to collect data on an
ad-hoc, as-needed basis. Finally, only when an exhaustive count wasn’t possible did Chinese
statisticians also use non-randomised (ethnographic) sample surveys.

The most recognisable example of an exhaustive enumeration is the population census. A crucial tool
to help society understand itself, it also serves as a basis for policymaking. The currently underway
2020 United States census is portrayed as a ‘once-in-a-decade chance to shape the future of your
family and community’. In the US, the census has been a decennial ritual since 1790; in the United
Kingdom, since 1801; and in India, since 1871. China had to wait until 1953 for its first complete
population census. When results were formally declared in November of the following year, China’s
population stood at 583 million. But only a handful of fields – such as name, age, sex and nationality
– had been enumerated and the absence of disaggregated data, which the State Statistics Bureau
withheld, rendered its value for research or policymaking negligible.

Instead, the question of whether China had too many people dominated conversation. In September
1949, Mao had declared: ‘We have very favourable conditions: a population of 475 million people.’
By the middle of the decade, his pro-natalist views had softened and he began to speak of birth control
and birth planning. In 1957, he enigmatically proclaimed: ‘the fact that China has a large population is
good and bad; China’s advantage is that people are many, [its] disadvantage also is that people are
many.’ These vacillations rendered the question of population a fraught matter. Just how fraught
became clear that same year, when the Party mounted a concerted campaign of vilification against the
president of Peking University, Ma Yinchu. By October 1959, academic colleagues and Party
functionaries had authored 200 articles castigating Ma for wrongly predicting a Malthusian crisis and
advocating birth control measures. Dismissed from the university presidency in 1960, Ma would have
to wait until 1979 for a formal apology from the Party.

The census was reduced to a symbolic appendage, invoked to conjure up China’s enormity but of little
use otherwise. Instead, China’s statisticians and planners deemed a different set of numbers more
important. These numbers, the poet and author Ba Jin gushed:

gather the sentiments of 600 million people, they also embody their common aspirations and are their
signpost [pointing to the future]. With them, correctly, step by step, we shall arrive on the road to
[building a] socialist society. They are like a bright lamp, illuminating the hearts of 600 million.

The numbers about which Ba waxed poetic were those related to planning and economic
management.

In standard Chinese chronologies, 1949-52 is described as the period of economic recovery, and
1953-57 are the years of the first five-year plan. The plan itself consisted of a plethora of constituent
parts, broken into annual, semi-annual and quarterly timeframes, which planners applied to the entire
socioeconomic landscape. In its idealised form, the state would collect data across all spheres of
society. The headquarters of the State Statistics Bureau, established in Beijing in 1952, was divided
into 13 branches, dealing with industry, agriculture, capital construction, trade, the distribution of
supplies, transportation, labour wages, culture, education and publication. Each of these branches
received data that had made their way up progressively from villages to counties to provincial
bureaus. In Beijing, the data were further collated and compiled, and then sent to the State Planning
Commission. The plans drawn up from them were then sent back down the same route to provincial,
municipal and county planning offices, and through them, to the corresponding statistics offices. At its mid-decade
peak, this system of statistical data collection employed 200,000 full-time cadres spread across 2,200
counties and 750,000 villages.

The supremacy of planning was accompanied by a particular emphasis on material production and a
corresponding neglect of service-based activities, such as administration, retail and accounting. It also
generated a hierarchical relationship between the two most important sectors of the economy.
Although the Communist Revolution was won by peasant armies, economic policy as a whole placed
primary emphasis on the rapid growth of heavy industries. Agriculture, which constituted nearly 50
per cent of the economy in 1949, was relegated to the background, its primary task to generate surplus
for investment in heavy industries.

The scale of the statistical operation, the privileging of material production, and within that the
emphasis on heavy industry and relative neglect of agriculture, all contributed to ensure that the
dream of total information, so alluring as an ideal, was a nightmare in practice. Every level of the
statistical system contributed to the overproduction of data. In a system that valued the production of
material goods above all else, the only way a white-collar service such as statistics could draw
attention to itself was by claiming, as Feng did, that statistical tables were a material contribution to
the economy, just like wheat and steel. With the production of tables so incentivised, the entire system
responded with gusto to produce them. Soon, there were so many reports circulating that it was
impossible to keep track of them. Internal memoranda bemoaned the chaos, but it was a pithy
four-character phrase that truly captured the exasperation. Translated, it reads: ‘Useless at the moment of
creation!’

Compounding the problem of overproduction was the issue of data irreconcilability. All too often
provincial bureaus or headquarters in Beijing encountered disparate numbers for a given product. The
State Statistics Bureau issued catalogues of industrial products in the hopes of achieving some
standardisation, but to little avail. Units of measurement presented another source of confusion, with
regional differences in weights, units of volume, and groupings (think dozens versus tens) making
aggregate estimations frequently incommensurable. Overproduction and irreconcilability of statistical
data, in turn, fuelled chronic delays.

Nowhere were these problems as pronounced as in the agricultural sector. Already neglected under
the prevailing industry-centric orientation, the scale of the agricultural sector and the large variation in
terrain, crops and seasons compounded the tendencies to overreport, delay and generate
incommensurable data. The waves of collectivisation that began in 1952 and forced villagers into
larger and larger collectives also exacerbated the problem of reliable and standardised data.

As the 1950s wore on, two other trends also shaped statistical work. The first was the increasingly
complex nature of the economy. The state set up new factories and industries, nationalised older
private ones, and continued to reorganise agriculture. A second factor was a growing recognition
among some of the statistical and planning leadership that the State Statistics Bureau ought not just
to provide data but to conduct analyses as well.

Out of these circumstances emerged data’s own uncertainty principle: accuracy and timeliness were in
conflict; prioritising precision in one typically meant compromising precision in the other. If numbers
weren’t provided at the right time, decision-making suffered. But what good would those decisions be
if the numbers themselves were of poor quality? The paradox was debilitating. In September 1957, the
head of the State Statistics Bureau Xue Muqiao broke the deadlock by declaring that:

In order for the leading authorities to understand the situation, research questions, and decide on
policies, they frequently need reference data on a timelier basis. Such data need not possess a high
degree of accuracy or be comprehensive, but it must be supplied in a timely fashion.

Xue’s decision released the cat of estimation among the pigeons of accuracy. Higher levels of the
statistical system set ever stricter deadlines and the lower levels responded with more and more
estimated numbers. As these numbers travelled up the chain, from county to province to Beijing, they
were combined with other estimates, leaving provincial and, eventually, national data with ever larger
margins of error.

By 1957, Chinese statisticians were well aware that they had built a system that generated copious
quantities of facts but left them poorly informed. For instance, although the State Statistics Bureau
had detailed granular data for grain and cotton yields, once aggregated to provincial and national
levels, these data were often found to be so inconsistent across years that statisticians and planners
were unable to assess plan completion or the efficacy of specific policy measures. Desperate for
change, Xue and one of his deputies, Wang Sihua, began to consider options that had until then been
anathema. In one of the earliest instances of scientific exchange among countries of the Global South,
they reached out to statisticians at the Indian Statistical Institute in Calcutta. The director of the
institute, a physicist by the name of P C Mahalanobis, had pioneered the use of large-scale random
sampling across India. As the Indian experience made clear, random sampling offered a cheaper, more
accurate and faster way to collect grassroots data. In the summer of 1957, Mahalanobis spent three
weeks in Beijing. Chinese statisticians, including Wang, visited Calcutta in 1957 and 1958. Back in
Beijing, Xue and Wang led efforts to prepare the grounds for wider adoption of the large-scale
random sampling method, hoping especially to employ it in the agricultural sector.

The question of what would have happened had China adopted large-scale random sampling now
occupies the murky realm of counterfactual history. In 1958, Mao and the Party leadership initiated a
new campaign, declaring that China would surpass Britain’s industrial production in 15 years. Known
as the Great Leap Forward, it entailed a fundamental reorganisation of labour and production
technologies. In the countryside, villages and cooperatives went through a final round of
collectivisation, creating massive communes with tens of thousands of people. These people worked
on nationalised farms, at backyard blast furnaces and on myriad infrastructural projects such as dams
and canals. Their urban counterparts experienced similar communal and labour-intensive practices. In
the world of data collection, the Great Leap Forward marked a turn away from exhaustive
enumeration and the adoption, instead, of decentralised and ethnographic methods. A tract from 1927
on rural investigation, authored by Mao, became the new methodological model for data collection.
True knowledge could be gained only by a detailed, in-person investigation, not through vast
exhaustive surveys nor through randomised sampling. The shift left the statistical apparatus with no
reliable means to check its own data. Most tellingly, it contributed to the state’s reduced capacity to
ascertain accurately the devastating famine that overtook the countryside starting in 1959. Estimates
vary, but most scholars agree that at least 30 million people, and possibly many more, lost their lives
by the time the Great Leap Forward ended in 1962.

It would take more than a decade, until after the death of Mao’s designated successor Lin Biao in
1971, to restore statistical work in China. And another decade still before the post-1978 policies of
reform and opening up created the grounds for a thorough reappraisal of statistics and for the slow
reintegration of probabilistic methods.

From the vantage point of today, the travails of China’s statisticians during the 1950s might appear
quaint, their obsession with definitional issues and their rejection of probabilistic methods an artifact
of a more ideologically driven time. That would be a mistake. The concerns that drove them are with
us today, as alive and as urgent as they were 70 years ago. At their heart is a set of basic and timeless
questions: what do we need to know and how should we know it? Their answers gave them
confidence to value exhaustive enumeration above all else. This confidence has echoes in today’s Big
Data revolution, which similarly insists that the more information we quantify, the better shall our
knowledge be, and the more appropriate our solutions.

There are other lessons. Today, as in the 1950s, randomised sampling and in-depth case studies
remain valuable but are increasingly neglected. Instead of ignoring them, we need to recognise that
each method – the randomised, the ethnographic and the exhaustive – offers unique insights. And
although none is a panacea, together they constitute a far more supple toolkit, expanding
both what we can know and how we can know.

The COVID-19 pandemic has forced the world to confront the uncertainty principle afresh. Much as
mid-20th-century Chinese statisticians discovered, the timely delivery of data and the guarantee of
their accuracy sit in some tension with one another. To achieve precision in both remains as great a
challenge today. The choices that Chinese officials made then weren’t always easy or self-evident
ones; neither are those that are being made today. And they affect two other values that we ought to
cherish: transparency and commensurability. Timeliness and accuracy are of little use if we don’t
make the data freely available and if we don’t use commonly agreed standards. A lack of both
transparency and commensurability hobbled statistical work in 1950s China. They remain as
intractable today, generating confusion that, in its most vicious form, can sow deep distrust between
researchers, institutions and communities.

As we continue to confront biased and manipulated data in our daily lives, the example of 1950s
China reminds us of the importance of separating outcomes that can be traced to first principles
(‘statistics is a social science’) from those that are a result of post-hoc manipulation (‘this estimate is
too low, let’s report a higher one’). In a world increasingly divided by narrow nationalist visions,
recognising that all data are biased, but that not all biases are the same, might well be a matter of life
and death.
