
Article

Transparency and quality assessment of research data in post-Soviet area studies: The potential of an interactive online platform

Andreas Heinrich, Felix Herrmann and Heiko Pleines
Research Centre for East European Studies, University of Bremen, Germany

Corresponding author: Heiko Pleines, Research Centre for East European Studies, University of Bremen, Klagenfurter Str. 8, 28359 Bremen, Germany. Email: pleines@uni-bremen.de

Journal of Eurasian Studies, 2019, Vol. 10(2), 136–146
© The Author(s) 2019
DOI: 10.1177/1879366519850698
journals.sagepub.com/home/ens

Abstract
The social sciences are increasingly addressing the quality of research data and debating ways to improve data
transparency, that is, the availability of original research data to corroborate claims made in academic publications. This
article offers a systematic discussion of related problems and challenges with the example of post-Soviet area studies. It
goes on to examine ways to improve data transparency. Although the Internet has a huge potential for linking research
with resulting publications and underlying data as well as for organizing a collective discussion around the research,
current data repositories do not truly go beyond basic upload and download functions for datasets. With the example
of the Discuss Data project, this article gives an overview of more elaborate features that can easily be implemented to improve the visibility and quality assessment of data collections. Finally, it discusses ethical concerns about data transparency related to privacy protection and copyright.

Keywords
Data repository, data transparency, post-Soviet region, research data evaluation, research data quality, virtual research
environment

Introduction

The social sciences are increasingly addressing the quality of research data and debating ways to improve data transparency. In this context, data refer not just to quantitative (numerical) data but also to all kinds of qualitative data, ranging from different forms of texts and artifacts (or pictures of them) to audio and video files.

Making data available to other researchers has two main purposes. First, it allows other researchers to check whether the claims made on the basis of specific data are correct. Spectacular cases in which the quality assessment of research data by the academic community has failed repeatedly make the headlines. However, they only address academic studies that have gained public attention. A typical example is the study by two renowned US economists, published in 2013, which claimed on the basis of an analysis of worldwide economic statistics that economic growth declines drastically as soon as a country’s public debt exceeds 90% of its gross domestic product. In times of austerity, the study was frequently cited in mass media. Because the authors made their dataset publicly available, mistakes in their statistics were found and, therefore, could be corrected. The results of the original study were no longer supported (Cassidy, 2013; “The 90% Question,” 2013; “Trouble at the Lab,” 2013).

However, it is not only mistakes in the data that cause problems. The interpretation of data can also be misleading. An assessment of 6,700 empirical studies in economics came to the conclusion that in half of the research areas, nearly 90% of the studies used samples that were too small for reliable conclusions, while of the remaining studies, the vast majority exaggerated the actual results (Ioannidis, Stanley, & Doucouliagos, 2017).


An analysis of some 300 studies published in three prominent political behaviorist journals found that only about half of the authors provided access to the underlying data. Of the accessible datasets, roughly 25% were presented “so poorly that replication was impossible” (Stockemer, Koehler, & Lentz, 2018, p. 799).

There is also an important—though much less discussed—second reason for why the publication of data makes sense. Starting from the idea of academic research as a collective endeavor toward a better understanding of the world, secondary data analysis can provide huge benefits to researchers. Most importantly, it can offer access to unique information if, for instance, historical data can no longer be found or generated. It can also add further evidence to original data, supporting or challenging one’s own results, and thus allowing for triangulation (on triangulation, see, for example, Junk, 2011). Finally, access to secondary data can save costs compared with own data collection, which is especially important for early-stage researchers who lack funding (e.g., Heaton, 2008). Secondary analyses of research data, however, also bear the risk of reproducing existing mistakes if these have not been discovered or sufficiently communicated.

Data transparency, that is, free availability of the relevant data, constitutes the basis for both quality assessment of related published research results and secondary data analysis. A necessary second step is the competent discussion of the data collections themselves by the academic community.

The challenges related to data transparency and data quality assessments are, of course, dependent on the academic discipline and the object of study. In this article, we elaborate on social science data based on the example of post-Soviet area studies. We start with a typology of the problems of data quality and data interpretation. We go on to give an overview of current attempts to improve data transparency and offer a brief sketch of a new project that aims to go beyond current approaches. Finally, we discuss the ethical issues that have to be taken into consideration when discussing data transparency.

Problems of data quality

Based on the underlying cause, we differentiate three quality problems concerning quantitative as well as qualitative data. First, there are intentionally falsified data. Second, there can also be unintended mistakes in the data. Third, data can be incomplete and thus misleading. In all cases, the results are data collections of low quality that should only be used after a proper assessment of their shortcomings.

The country context impacts the underlying causes in multiple ways. One relevant difference of the post-Soviet region, compared with the Organisation for Economic Co-operation and Development (OECD) world, is deficient information infrastructures. Many resources and services provided centrally in the Soviet Union have been disconnected or can no longer be provided by the successor states (Johnson, 2014). Therefore, the availability and quality of statistical data are limited in many regards (Bessonov, 2013; Kryukov & Sokolin, 2010), and the access to official documents is often impeded. Moreover, in authoritarian countries, conducting interviews or collecting data on politically relevant topics can lead to legal problems.

The following paragraphs provide a number of examples, which illustrate the three types of quality problems. The examples are taken from the post-Soviet region, but this does in no way imply that related problems are less pressing or necessarily of a different nature in other parts of the world. The examples from mainstream economic science cited in the introduction clearly indicate that the general problem is pervasive globally.

Intentional falsification of data

When individual researchers falsify data so that their conclusions look more convincing, this can only be detected on a case-by-case basis. We are not aware of any such instances in post-Soviet area studies.

In any case, the systematic falsification of data concerns foremost nonacademic sources, namely, official state agencies. Especially in authoritarian regimes, state organs may simply produce the information that is desired, instead of the information that has been collected. This often concerns economic statistics. While Uzbekistan still claimed in 2015 that the country had not been affected by the severe global economic downturn after 2009, Russian mirror statistics, foreign investors, and commodity prices all pointed to the opposite (Focus-Economics, 2016). Discrepancies in Russian trade statistics or regional statistics can also—at least partly—be explained by deliberate misreporting with the aim to obtain the politically desirable results (Simola, 2012; Zubarevich, 2012).

The detection of falsifications of election results has even developed into a new subdiscipline in political science, the “Election Fraud Forensics” (Alvarez, Ansolabehere, & Stewart, 2005; Alvarez, Hall, & Hyde, 2008; Beber & Scacco, 2012; Deckert, Myagkov, & Ordeshook, 2011; Lehoucq, 2003; Magaloni, 2010; Vickery & Shein, 2012). Various studies address such practices in the post-Soviet region (e.g., Hyde, 2007; Myagkov, Ordeshook, & Shakin, 2009; Senyuva, 2010; Tucker, 2007).
To give one example, according to the final results of the Central Election Commission of the Russian Federation for the 2011 parliamentary election, the party “United Russia” received nearly 50% of the votes and, thus, the absolute majority of deputies in parliament. However, in 60% of the 60,000 voting districts, irregularities were reported; in 3,000 voting districts (especially in Dagestan and North Ossetia in the North Caucasus), “United Russia” received 100% of the votes. This “ballot stuffing” obviously changed the election results. It has been estimated that in case of a normal statistical distribution, as occurred in other elections, “United Russia”—the party supporting Russian President Vladimir Putin—would have more likely received between 30% and 35% of the votes. Similar irregularities occurred in Russia during the presidential elections of 2012 (Klimek, Yegorov, Hanel, & Thurner, 2012).
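The intuition behind such forensic screening can be illustrated with a small sketch in Python. This is not the procedure used by Klimek et al. (2012), who compare the full joint distribution of turnout and winner vote share against a baseline model; it only flags the crude fingerprint described above, namely precincts where both turnout and the leading party's vote share approach 100%. All precinct figures are invented.

```python
# Minimal sketch of an election-forensics screen: flag precincts in which both
# turnout and the leading party's vote share are implausibly close to 100%.
# Thresholds and precinct figures are invented for illustration; this is not
# the statistical model of Klimek et al. (2012).

# Hypothetical precinct records: (precinct id, registered voters, ballots cast,
# votes for the leading party).
precincts = [
    ("P-001", 1200, 780, 390),   # unremarkable
    ("P-002", 950, 940, 935),    # near-100% turnout and vote share
    ("P-003", 2100, 1350, 700),  # unremarkable
    ("P-004", 800, 800, 800),    # 100% turnout, 100% vote share
]

def suspicious(registered, cast, for_leader, turnout_cutoff=0.95, share_cutoff=0.95):
    """Return True if turnout and the leader's vote share jointly exceed the cutoffs."""
    turnout = cast / registered
    share = for_leader / cast if cast else 0.0
    return turnout >= turnout_cutoff and share >= share_cutoff

flagged = [pid for pid, reg, cast, lead in precincts if suspicious(reg, cast, lead)]
print("Precincts flagged for manual review:", flagged)
# -> Precincts flagged for manual review: ['P-002', 'P-004']
```

In practice, such a screen only identifies candidates for closer inspection; the published forensic studies rely on the full distribution of results across tens of thousands of precincts rather than on fixed cutoffs.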
However, to present the desired picture, governments do manipulate not only numbers, that is, quantitative data, but also all kinds of qualitative data. For example, in the Soviet Union, even before the digital age, official photos of leading party members were manipulated to delete the faces of people who had fallen out of favor (King, 2014). A common strategy not limited to authoritarian regimes is the production of “fake news,” which often means the spreading of rumors to discredit opponents. This form of falsification first of all concerns text documents such as official statements and media reports.

Unintended mistakes in data

Although ethically not as problematic as intended falsifications, it can be assumed that unintended mistakes are at least no less numerous in practice. “Unintended” here refers to those in charge of data collection. While, in the above examples of falsifications, those collecting and publishing the data have been the culprits, in the case of unintended mistakes, such actors may, for example, fall victim to false claims from the subjects of their research.

An area where this might be frequent is public opinion polls on political issues in authoritarian regimes, where freedom of opinion is restricted. In a public opinion poll conducted in Russia by the Levada Center in July 2016, only 30% of respondents stated that they would always honestly answer questions related to politics; furthermore, only 12% of them assumed that other people would do so (Levada Center, 2016). Partly related to this uneasiness with talking about politics is the high rejection rate in public opinion surveys. It has been claimed that only a small part of the Russian populace (between 10% and 30%) is willing to take part in such polls and surveys (Napeenko, 2017).

A related case—though not due to authoritarian repression—is the Gini coefficient produced by the “Azerbaijan Household Income and Expenditure Survey,” which has been included in global datasets. The Gini coefficient for Azerbaijan had a low value, which indicated a low degree of social inequality in the country. This finding was unexpected for a country experiencing an oil boom; theory would rather suggest high inequality. In fact, the low value was partly caused by better-off, middle-class households not participating in the survey for fear that their nondeclared income would be detected, resulting in higher taxation (Ersado, 2006).
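The mechanism behind the Azerbaijani example can be reproduced in a few lines of Python. The household incomes below are invented; the point is only that when the best-off households drop out of the sample, the computed Gini coefficient falls, suggesting less inequality than actually exists.

```python
# Sketch: how non-response by wealthier households depresses the Gini
# coefficient. All incomes are invented for illustration.

def gini(incomes):
    """Gini coefficient via the mean absolute difference (0 = perfect equality)."""
    n = len(incomes)
    mean = sum(incomes) / n
    mad = sum(abs(x - y) for x in incomes for y in incomes) / (n * n)
    return mad / (2 * mean)

full_population = [120, 150, 180, 200, 240, 300, 420, 900, 2500, 6000]
respondents = [x for x in full_population if x <= 1000]  # richest households decline to answer

print(f"Gini, full population:  {gini(full_population):.2f}")
print(f"Gini, respondents only: {gini(respondents):.2f}")  # noticeably lower
```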
Another example is participants in large-scale demonstrations. In his codebook for disaggregated event data “Mass demonstrations and mass violent events in the former USSR, 1987–1992,” Beissinger (2003) explains,

    The number of participants in a demonstration can often fluctuate drastically over the course of a single event. Crowds of 10 thousand, for instance, may gather on a square in the morning; by evening, the same demonstration may have tens or hundreds of thousands of participants. The variables here all reflect reported information on the peak number of participants mentioned in each description of the event. In all, specific information on the number of participants was available for 68.4 percent of the demonstrations recorded. Since estimating the size of crowds is an art rather than a science, divergent estimates were recorded whenever available. (p. 7)

The problem is even more challenging for complex phenomena such as migration. The International Federation of Human Rights (2016) states in a report,

    It should be noted that the lack of reliable statistics pertaining to migratory flows from Kyrgyzstan, and especially lack of disaggregated statistics specifically on the movement of women and children at a national and regional levels, makes it difficult to assess the full impact of migration on women and children. Various experts agree that these data underestimate the number of Kyrgyz migrants working abroad, which could be up to one million. It is challenging to have a real picture of migratory flows mostly because of: 1) the visa-free regime in post-Soviet countries where Kyrgyz migrants tend to work, 2) significant gaps in data recording at border check points, and 3) the majority of Kyrgyz migrant workers are undocumented. As a result, statistics from both the Kyrgyz State Migration Service and the Russian Federal Migration Service (FMS), as well as estimations of experts on migration, do not match. (p. 9)

Incomplete data collection

Incompleteness is easily visible in quantitative datasets, as missing figures are marked as not available in tables. Nevertheless, quantitative analyses often simply ignore missing data, thus potentially introducing a bias. This is a major issue, as an advisory group to the United Nations concluded in 2014 that over the last two decades, the percentage of missing data for basic socioeconomic development indicators in 157 countries was on average 30% to 40%; an improvement was not considered likely (“Data and Development,” 2014; on the case of Russian official statistics, see Baranov, 2013; Khaninym, 2012; Korhonen, 2012).
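A short sketch, again with invented numbers, shows why simply ignoring missing values biases results whenever the missingness is related to the quantity of interest, for example, when the poorest regions are the ones that fail to report.

```python
# Sketch: bias from listwise deletion (ignoring missing values) when data are
# not missing at random. All figures are invented for illustration.

# Hypothetical poverty rates (%) for 10 regions; the poorest regions fail to
# report, so their values are missing (None).
reported = [12.0, 14.5, 11.0, 16.0, None, None, 13.5, None, 15.0, None]
true_values = [12.0, 14.5, 11.0, 16.0, 31.0, 30.0, 13.5, 34.0, 15.0, 29.0]

complete_cases = [value for value in reported if value is not None]

naive_mean = sum(complete_cases) / len(complete_cases)  # listwise deletion
true_mean = sum(true_values) / len(true_values)

print(f"Mean poverty rate, reported regions only: {naive_mean:.1f}")  # 13.7
print(f"Mean poverty rate, all regions:           {true_mean:.1f}")   # 20.6
```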
Moreover, existing data collections often stand alone and isolated; they do not refer to comparable or complementary datasets to check validity or fill in gaps. For instance, concerning public opinion polls, the “Caucasus Barometer” has since 2008 published data that include many questions taken from the “World Values Survey” (WVS) for Armenia, Azerbaijan, and Georgia. Such complementary data, missing in the WVS, would also invite a discussion about the methodological comparability of the two surveys.
In the case of qualitative data, incompleteness is often harder to identify, and the implications for the conclusions drawn from them are less obvious. A typical example is expert or elite interviews. The researcher starts with a list of ideal candidates for such interviews. However, the final list of interviews conducted usually looks very different, as many decline to give an interview, while other respondents are added based on suggestions from earlier interview partners or from those who declined themselves. This technique has been formalized as “snowball sampling” (e.g., the entry in Lewis-Beck, Bryman, & Futing Liao, 2004). Here, it is far from clear whether the first sample—which may have been smaller—is “more complete” than the actual one, which may include important additional respondents as a result of snowball sampling.

At the same time, at a general level, two biases are likely to result from this approach. First, more important people (in terms of relevant responsibilities and knowledge) are likely to delegate the “task” of the interview to less important people. Second, in the snowball approach, respondents will most likely suggest like-minded people for interviews. In addition, in hybrid and authoritarian regimes, specific respondents may be discouraged from talking to academic researchers or may self-censor their answers (Beisembayeva, Papoutsaki, Kolesova, & Kulikova, 2013; Goode, 2010; Richardson, 2014; Roberts, 2013; Shih, 2015).

Another example of incomplete qualitative data is the manual content analysis of media reporting. As such an analysis requires reading all texts, the sample of media included in the analysis is often small. Moreover, print media available in electronic databases are preferred, as full-text search functions immensely simplify the creation of the text corpus. That means that TV, which is by far the most important source of news for the population in all post-Soviet countries, is often not included in the analysis (for an alternative approach including TV reports, see Heinrich & Pleines, 2015, 2018).

Moreover, media reporting itself may lack relevant data (i.e., information). For example, Fredheim (2017) finds that pressure from news owners who are close to the ruling elites had a significant effect on journalistic output at the popular Russian online newspapers “Lenta” and “Gazeta.” Editorial changes in both publications were accompanied by a shift from core news areas (such as domestic and international politics) toward lifestyle and human interest subjects. In a similar vein, the Kazakhstani government has systematically prevented political analysis on the country’s websites (Anceschi, 2015; Lewis, 2016). Those compiling a protest event database (like the one by Beissinger referred to above) may thus miss out on smaller protests because they are no longer being reported.

Problems of data interpretation

Even data that are correct and complete can lead to wrong results. This is, in fact, not a problem of data quality in the narrow sense but a problem of data interpretation. Here, we again distinguish three forms, related to, first, the proper implementation of the method of analysis; second, the over-interpretation of data; and third, the misinterpretation of data.

Problems related to the proper implementation of the method of data analysis can come in many forms, which are specific to the method being used. Regression analyses can obviously suffer from mathematical mistakes, and results can also differ depending on the model chosen for calculations. For content or discourse analysis, proper implementation implies, among many other things, native-speaker command of the respective language and a consistent coding scheme. This issue is the topic of textbooks on the respective methods.

However, problems of data interpretation can also be due to an “over-interpretation” of the data, assigning them more reliability than they actually have. Quantitative data, especially, suggest an accuracy that may not be supported by the underlying information.

An example of over-interpretation is indices based on expert opinions. The organization “Reporters without Borders” explicitly warns that its ranking of media freedom does not fulfill academic standards.1 Until 2012, the “Corruption Perceptions Index (CPI)” by “Transparency International” used changing data sources and moving averages; thus, the organization itself stated that the index scores could not be compared over time.2 The “Freedom House” ranking on political freedoms, which is often used to identify political regime types, has been criticized for its unsystematic methodology and for a bias in favor of allies of the United States (Giannone, 2010; Steiner, 2016). However, political scientists often use these and other rankings without references to studies critically assessing the validity of these rankings (like, for example, Andersson & Heywood, 2009; Apaza, 2009; Bühlmann, Merkel, Müller, Giebler, & Weßels, 2012; Giannone, 2010; Høyland, Moene, & Willumsen, 2012; Knack, 2006; Møller & Skaaning, 2012; Munck, 2011; Muno, 2012; Pickel & Pickel, 2011, 2012; Pleines, 2018; Steiner, 2016; Teorell, 2011).

Finally, problems of data interpretation can take the form of actual misinterpretation. Qualitative studies can be ignorant of relevant context, thus misinterpreting sources due to lack of information or cultural knowledge of specific meanings, as in the case of the term “democracy” described below, or of relevant modes of interpretation, for example, missing out on irony.
A similar problem in quantitative studies is the rather common use of the CPI as a proxy for a country’s level of corruption, although a study documented by Transparency International itself has confirmed that there is no systematic relation between expert assessments, such as those used in the CPI, and levels of actual corruption as reported in representative public surveys (Razafindrakoto & Roubaud, 2005).

A standard example of misinterpretation for the post-Soviet region is public survey data about “democracy.” When asked about the desirability of democracy as a regime type, large parts of the populaces in the post-Soviet region do not think of the ideal type “democracy” but of their own experiences with democratically elected governments in the 1990s, a time period characterized by corruption and social disruption (Carnaghan, 2011). Accordingly, answers are to a large degree (but not completely) related to the desirability of a “return to the 1990s.” They have to be interpreted accordingly and cannot simply be included in global comparisons about perceptions of democracy. In general, it is well known that the questionnaire design can strongly influence the answers of the respondents (see, for example, Lyons, 2012, pp. 257–269).

In authoritarian regimes, the possibility of repressions also fosters self-censorship within the mass media and in social media (Alexanyan et al., 2012; Bekmurzaev, Lottholz, & Meyer, 2018; Malthaner, 2014; Roberts, 2013; Shklovski & Valtysson, 2012). Accordingly, all forms of content and discourse analysis might include self-censored and actually censored forms of expression. To take them as honest statements might be a misinterpretation.

From transparency to discussion

The solution to problems of data quality and data interpretation currently promoted in the social sciences is data transparency. The idea behind data transparency is that by archiving research data online, they become publicly available, which offers all researchers the opportunity to assess not just the presentation of research results in academic publications but also the underlying data (Elman & Kapiszewski, 2014; Moravcsik, 2014). Related initiatives such as the “Data Access & Research Transparency Joint Statement” (DA-RT)3 and the “FAIR principles” (Wilkinson et al., 2016) require the underlying research data of published journal articles to be openly accessible, findable, interoperable, and well documented. Recent efforts on national, European, and global levels have led to the creation of discipline-specific infrastructures for the upload, search, and long-term storage of research data.4

Obviously, online availability is an important step toward increased data quality. If mistakes in the data cannot be corrected because the correct information is not available, false data can at least be marked as such. As in the case of incomplete data collection, this allows researchers to become aware of and discuss the impact this has on the research results (Alvarez, Key, & Núñez, 2018). Data transparency also offers great opportunities for secondary analyses, as highlighted in the introduction.

However, the expectation that once data collections are published online, mistakes and problems will be easily spotted and immediately corrected is overoptimistic for at least two reasons. First, often the problem is not a simple figure that has to be corrected—such as the Gini coefficient in one of the examples given above—but a broader issue related to the validity, applicability, and contextuality of research data, such as the reliability of opinion polls or the Freedom House country rankings, which are challenged by a number of complex arguments. Related assessments and debates are not included in the repositories that store the data collections in question. Instead, they are spread over a broad range of academic publications in different disciplines, as demonstrated by the bibliographic references in the section on country rankings above. Accordingly, academic researchers downloading a data collection have no easy and systematic access to related data quality assessments.

Second, many of the problems of data interpretation listed in the respective section above are specific to individual analyses and related academic publications. Accordingly, they are not addressed in the literature, as long as they are not among the very few famous studies that stir broader debates. In most cases, an assessment of the reliability of data collection and data interpretation is only conducted by discussants at academic conferences or peer reviewers in the publication process. Their comments are not available at all to the broader academic community.

Finally, it has been argued that data collections related to qualitative research methods can often not (easily) be prepared for online publication in the context of transparency initiatives. Here, mistakes and misinterpretations are much harder to substantiate than in disciplines that work solely with quantitative methods (Büthe & Jacobs, 2015, p. 2). In a similar vein, Monroe (2018) finds DA-RT insufficiently sensitive to the needs of qualitative data and sensitive environments, such as authoritarian regimes.

In summary, this means that the online availability of data collections is not enough to tackle the problems of data quality and data interpretation. In addition, links to the relevant literature discussing or using the respective data collection and, most importantly, a peer discussion placed next to the actual data collection are needed. In technical terms, virtual research environments, social media, and commercial online services already offer the necessary functionalities.

From a technical and organizational standpoint, it is sensible to establish an online platform as a complementary layer linking already existing services for long-term storage with services for discipline-specific commenting and curating (Akers & Doty, 2013; Anderson & Blanke, 2012).
A close cooperation between infrastructure providers and specific academic communities in the creation of an interactive online discussion platform has the potential to establish the most useful digital services for research data as well as to ingrain the idea of open access to research data and transparency within the academic communities.

As many academic disciplines are split into subdisciplines, sometimes with different research practices and requirements (Quandt & Mauer, 2012, p. 61), the direct involvement of these academic (sub-)communities in the development and operation of such infrastructures seems to be advisable (Pfeiffenberger, 2007). This is especially important in the area of quality assessment of data and interaction with the community (Klump & Ludwig, 2013, p. 261).

Exactly this is the aim of the Discuss Data Project, an “Open Platform for the Interactive Discussion of Research Data Quality (on the example of area studies on the post-Soviet region),” created and operated by the Göttingen State and University Library and the Research Centre for East European Studies at the University of Bremen. Discuss Data aims to create an online platform that combines the publication of research data not only with a documentation of the data collection process but also with an interactive place of communication to discuss, evaluate, and contextualize these research data. The expert community will be enabled to indicate faulty or misleading data, to recommend complementary datasets (in case of gaps in the data collection), and to discuss extensively the validity, applicability, and interpretation of the data. This platform creates the opportunity to gather—in a structured way—the feedback on research data that is currently scattered among journal articles, conference papers, and blog posts or has not been published at all. The evaluation of research data is, therefore, transformed from an individual to a collective endeavor benefiting the whole expert community. Although the Discuss Data Project is organized by academic institutions on the basis of academic concerns about data quality, the online platform is open to users with nonacademic backgrounds as well, as the assessment of data quality is relevant for a much broader audience, including, for example, nonacademic researchers in think tanks or business, journalists, or policy makers.

For the storage and long-term archiving of the content, Discuss Data will use the services of the DARIAH-DE repository5 and (eventually) the Humanities Data Centre.6 Beyond the presentation of data collections that have so far not been published (or not been published in the respective repository infrastructures), Discuss Data will also link to research data published in other interdisciplinary repositories, such as the Harvard Dataverse, to connect users interactively to all available knowledge on the post-Soviet region. The validity and comparability of these data will be discussed; complementary data will be recommended; and all data will be presented in an easily accessible and comprehensible online platform.

When data collections are published—a broad spectrum of quantitative and qualitative data sources, forms, and formats from the social sciences and the humanities dealing with the post-Soviet region—their metadata (i.e., detailed descriptions of the data) and a documentation describing the process of data collection are closely connected with the quality assessment and contextualization of these data in a single place. The quality assessment is best conducted by experts who are familiar with the content, method, and/or context of the dataset (Devarajan, 2013; Jerven, 2013; Seligson, 2004). This discussion will be based on a moderated and gated peer-review process7 as well as the subsequent involvement of the interested academic community through comments and references. For these community discussions, an intuitively usable interface will be developed that will enable a direct response to posts by other users as well as the presentation of complex debates. For posting comments and other annotations, users will have to register on the platform providing their real names and institutional affiliations to avoid spamming and trolling.
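To make the role of metadata and documentation more concrete, the sketch below shows what a minimal, machine-readable description of a data collection might look like. The field names are hypothetical; they do not reproduce the actual Discuss Data or DARIAH-DE schema and merely gather the kinds of information discussed above (what the data are, how they were collected, which limitations are known, and under which conditions they may be reused).

```python
# Hypothetical, minimal metadata record for a data collection, written as a
# plain Python dictionary. Field names are illustrative only and are not the
# actual Discuss Data or DARIAH-DE schema.

dataset_record = {
    "title": "Protest events in Kazakhstan, 2015-2018 (invented example)",
    "creators": ["Jane Researcher"],
    "discipline": "post-Soviet area studies",
    "data_type": "event database based on qualitative coding of media reports",
    "collection_method": "manual content analysis of selected online news outlets",
    "known_limitations": [
        "TV reporting is not part of the source corpus",
        "smaller protests may be under-reported by the selected outlets",
    ],
    "access_conditions": "open access after anonymization of informants",
    "license": "CC BY-NC 4.0",
    "related_publications": ["doi:10.0000/placeholder"],  # placeholder identifier
}

REQUIRED_FIELDS = ("title", "creators", "collection_method", "access_conditions")

def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required documentation fields that are absent or empty."""
    return [field for field in required if not record.get(field)]

print("Missing required fields:", missing_fields(dataset_record))  # -> []
```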
Important editorial tasks—such as the upload of data collections, user administration, and user support—will be gradually transferred to the user community itself. Thereby, a new model for the sustainable establishment of a permanent online platform through the transfer of responsibilities and tasks to the user community will be tested. An important project goal is to organize and moderate these transfer processes and make them technically possible to gradually establish a self-organization by the user community. Such a form of self-organization is possible, as has been proven (even though outside of academic structures) by the success of the online encyclopedia Wikipedia (for the not always unproblematic relationship of academia toward Wikipedia and its knowledge organization, see Black, 2008; Brandt, 2009; Eijkman, 2010).

With the upload of qualitative data collections, the question of the protection of privacy requires special attention, as will be elaborated in the following section.

Ethical issues

Concerning the role of people as subjects of research in the social sciences, quantitative as well as qualitative data are mainly gathered in two forms. They can either be collected from participants who have been recruited by the researcher or taken from sources where they have been produced authentically (i.e., independently of the research project). An example of the latter is the large amounts of data created directly by users in online and social media, which are called “big data.”8

If the people included in the study (i.e., respondents, participants, or data providers) are recruited by the researcher, ethics guidelines regularly demand “informed consent.” Participants should be informed about the purpose of the study and the specific use of the generated data, that their participation is voluntary, and that they can withdraw from the study at any time.
In a final step, data providers should also be informed about the research results (Fossheim & Ingierd, 2015, p. 11; Lüders, 2015, p. 79). Consequently, Elgesem (2015) argues that research should not entail a risk of harm or discomfort for the data provider.

In our view, the idea of “informed consent” poses three challenges to the researcher. First, this consent may not be granted due to a general feeling of mistrust that is not related to any specific concerns about the respective research project. Moreover, this mistrust may be directed not so much toward the research as such (e.g., giving an interview) but toward the specific formal requirements of “informed consent” (e.g., signing a consent form).

Second, even if “informed consent” is given, neither the respondent nor the researcher can be sure about the consequences a participation in a research project may have. Examples are that the researcher later becomes “undesired” by the state authorities, which will badly reflect on the “informers,” or that the respondent makes a statement that only later becomes “subversive” after official attitudes have changed. This aspect is especially relevant if the data (e.g., interviews) are to be published in full to ensure data transparency. This can require careful and time-intensive preparations by the researchers (cf., Monroe, 2018, pp. 144–145).

Third, in some cases, such as concerning corrupt state officials, militant separatists, or political radicals, research based on direct interaction with the subjects of research is hardly feasible if based on “informed consent.”

The situation is different if the researcher is not in direct contact with the “data provider,” as in the case of online data. Social media might, for instance, make compromises regarding the privacy of data and the anonymity of data providers or users that do not meet the standards of “informed consent.” The ubiquity of publicly available social media data creates enormous possibilities for privacy violations (Albright, 2011; Elgesem, 2015, pp. 24–25; Hoser & Nitschke, 2010, p. 184; Utaaker Segadal, 2015, p. 36). It often seems impossible (or at least impractical) to obtain the informed consent of data providers, who also cannot be informed about the research results due to their large number and due to lack of contact details.

McKee and Porter (2009, p. 88) identify four factors that affect the need to obtain consent when research using online data is conducted:

• the degree of accessibility in the public sphere (public vs. private),
• the sensitivity of the information,
• the degree of interaction between the researcher and the research subjects, and
• the vulnerability of the research subjects.

Based on the first criterion, many scholars have proposed that data should not be used without consent if “the people being studied do not have an expectation that the information will be used in research” (Elgesem, 2015, p. 23, emphasis in the original). However,

    assessing the acknowledged publicity of an online venue is not always straightforward, at least not as seen from the point of view of the participants. A personal blog might be publicly available for all to read, though very often it can be regarded as a personal and private space by the author. (Lüders, 2015, p. 80)

An additional aspect is the legal regulation in the country where the researcher is based. Research activities are not considered incompatible with the original purpose of data generation in the European Union (and Norway), as science is afforded a special position in the respective legal framework: “This provision might be seen as a fundamental principle guaranteeing further use of data for research purposes regardless of the original reason for their production. This leaves open the possibility to conduct research on information obtained online without consent” (Utaaker Segadal, 2015, p. 42).

To achieve the overarching goal of privacy protection, which is also demanded by law in the European Union, a widely used method in quantitative research is the anonymization of data to protect the privacy of the individual research subjects. The method of anonymization can entail one or several of these techniques (Albright, 2011, p. 779):

• (micro-)aggregation (e.g., unspecified gender or age),
• alteration of the data,
• suppression of certain variables (that might identify the data provider),
• data swap (data from one research subject are ascribed to another research subject and vice versa),
• random noise (to distort the original data to some degree).
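A brief sketch illustrates three of the techniques listed above, suppression, (micro-)aggregation, and random noise, applied to invented survey records. Real anonymization would calibrate these steps to the disclosure risk of the specific dataset rather than use fixed parameters.

```python
import random

# Sketch of three anonymization techniques from the list above, applied to
# invented survey records: suppression, (micro-)aggregation, and random noise.

records = [
    {"name": "Respondent A", "age": 34, "district": "Yasamal", "income": 5200},
    {"name": "Respondent B", "age": 61, "district": "Nasimi", "income": 2100},
]

def anonymize(record, noise_scale=0.1):
    anonymized = dict(record)
    anonymized.pop("name")  # suppression of a directly identifying variable
    age = anonymized.pop("age")
    anonymized["age_band"] = f"{(age // 10) * 10}s"  # (micro-)aggregation into decades
    noise = random.uniform(-noise_scale, noise_scale)  # random noise on the income figure
    anonymized["income"] = round(anonymized["income"] * (1 + noise))
    return anonymized

for record in records:
    print(anonymize(record))
# e.g. {'district': 'Yasamal', 'income': 5083, 'age_band': '30s'}
```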
However, Aiden and Michel (2014) claim that big data necessarily cast “big shadows.” A shadow is a projection of the real object, a “visual transformation that preserves some aspects of the original object while filtering out others” (Aiden & Michel, 2014, p. 60). It has been shown that the anonymization of quantitative datasets can be broken, revealing personal and sensitive information about the original single data provider (Albright, 2011, p. 778; Aiden & Michel, 2014, pp. 61–62; Zimmer, 2010).

Whereas quantitative research often has no need to identify individual data providers, qualitative research is often based on rather detailed profiles of individual data providers. It is already a challenge to publish the research results anonymously
    if one wishes to publish direct quotes, as these will be searchable on the Internet. It is also important to note that pseudonyms or nicknames may be identifiable because they may be used in various contexts online and hence function as a digital identity. (Utaaker Segadal, 2015, p. 43; see also Elgesem, 2015, p. 29)

Anonymity and privacy protection are of special importance for data providers in politically sensitive regions such as the former Soviet Union (Côté, 2013). Authoritarian governments increasingly use the Internet and social media to identify users and possibly harass them to suppress opposing views and criticism. In the case of Azerbaijan, for example, punitive measures by state agencies are evoked not only by outright political opposition to the ruling regime but also, for example, by insulting members of the presidential family, which is a very ambiguous and stretchable offense. Thus, the increased visibility and surveillance of opinions shared on social media have made it easier for the Azerbaijani government to swiftly and severely punish online activities (Alexanyan et al., 2012; Pearce, 2015; Pearce & Guliyev, 2016). As Roberts (2013, p. 344) points out, “there can be no limit on the provision of anonymity and care in handling data; even in cases when the respondent does not ask for that provision” (see also Richardson, 2014, p. 185).

A second legal issue related to the online publication of data collections is copyright. In the case of content or discourse analyses, event databases based on media reporting, or audio and video collections, copyright regulations may prohibit the publication of the underlying data collection. However, the results of data analysis or interpretation (e.g., coding results in the case of content analysis or descriptions of pictures, audio, or video sources) can be published online.

To protect privacy and copyright on the Discuss Data platform, every data provider specifies the extent to which data collections can become available online. Any data upload has to go through a multi-level review process to ensure that technical as well as legal requirements are fulfilled.

Conclusion

The availability and quality of research data are limited in many regards: the challenges range from intended falsification of data, unintended mistakes, and incomplete datasets to the over- or misinterpretation of correct (and complete) datasets. Moreover, the publication of social science research data gives rise to ethical concerns, most importantly about privacy protection.

For the often-demanded transparency in academic knowledge production, the careful publication of the underlying research data is a necessary first step. The discussion of these data is the second and, in our view, even the more vital step. However, academic fora that enable such a discussion in the digital age are missing so far.

By addressing these problems with Discuss Data, we want to create a digital infrastructure that functions as a virtual place of communication enabling the discussion of publicly available research data. Evaluating the validity and reliability of research data is a hot topic in the social sciences at the moment. Discuss Data hopes to be able to present an easy and effective solution.

Acknowledgements

Discuss Data is jointly conducted by the Göttingen State and University Library and the Research Centre for East European Studies at the University of Bremen. A first version of the Discuss Data platform will be available online in 2020.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This publication has been produced in the context of the Discuss Data project, which is funded by the German Research Foundation (project numbers PL 621/3-1 and HO 3987/26-1). The funding source does not interfere with the conduct of the project.

Notes

1. See https://rsf.org/en/world-press-freedom-index and https://rsf.org/en/detailed-methodology.
2. See http://www.transparency.org/cpi2014/in_detail.
3. See http://www.dartstatement.org/.
4. Cf., for example, Harvard Dataverse (http://Dataverse.org/), GESIS Datorium (https://datorium.gesis.org), TextGrid Repository (http://www.textgridrep.de/).
5. See Digitale Forschungsinfrastruktur für die Geistes- und Kulturwissenschaften (https://de.dariah.eu).
6. See http://humanities-data-centre.org/.
7. On the continued relevance of peer-review procedures in the digital age, see Fitzpatrick (2012) and Nicholas, Watkinson, and Jamali (2015); for the area of datasets, see “Editorial: The Guardian” (2014).
8. Big data refer to extremely large sets of semi- or unstructured digital data on social transactions that are, deliberately or passively and in various shapes and forms, generated in our daily interactions with technology. These digital traces constitute enormous datasets available to others that may be analyzed to reveal patterns, trends, and associations, especially relating to human behavior and interactions (Prabhu, 2015, p. 158; Steen-Johnsen & Enjolras, 2015, p. 122).

References

Aiden, E., & Michel, J.-B. (2014). Uncharted: Big data as a lens on human culture. New York, NY: Riverhead Books.
Akers, K. G., & Doty, J. (2013). Disciplinary differences in faculty research data management: Practices and perspectives. International Journal of Digital Curation, 8(2), 5–26.
Albright, J. J. (2011). Privacy protection in social science research: Possibilities and impossibilities. PS: Political Science & Politics, 44, 777–782.
Alexanyan, K., Barash, V., Etling, B., Faris, R., Gasser, U., Kelly, J., . . . Roberts, H. (2012). Exploring Russian cyberspace: Digitally-mediated collective action and the networked public sphere. Cambridge, MA: The Berkman Center for Internet & Society Research at Harvard University. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2014998
Alvarez, R. M., Ansolabehere, S., & Stewart, C., III. (2005). Studying elections: Data quality and pitfalls in measuring the effects of voting technologies. Policy Studies Journal, 33, 15–24.
Alvarez, R. M., Hall, T. E., & Hyde, S. D. (2008). Introduction: Studying election fraud. In R. M. Alvarez, T. E. Hall, & S. D. Hyde (Eds.), Election fraud: Detecting and deterring electoral manipulation (pp. 1–17). Washington, DC: The Brookings Institution Press.
Alvarez, R. M., Key, E. M., & Núñez, L. (2018). Research replication: Practical considerations. Political Science, 51, 422–426.
Anceschi, L. (2015). The persistence of media control under consolidated authoritarianism: Containing Kazakhstan’s digital media. Demokratizatsiya, 23, 277–295.
Anderson, S., & Blanke, T. (2012). Taking the long view: From e-science humanities to humanities digital ecosystems. Historical Social Research/Historische Sozialforschung, 37, 147–164.
Andersson, S., & Heywood, P. M. (2009). The politics of perception: Use and abuse of Transparency International’s approach to measuring corruption. Political Studies, 57, 746–767.
Apaza, C. R. (2009). Measuring governance and corruption through the worldwide governance indicators: Critiques, responses, and ongoing scholarly discussion. PS: Political Science & Politics, 42, 139–143.
Baranov, E. F. (2013). Russian statistics: Achievements and problems. Problems of Economic Transition, 55(11), 24–35.
Beber, B., & Scacco, A. (2012). What the numbers say: A digit-based test for election fraud. Political Analysis, 20, 211–234.
Beisembayeva, D., Papoutsaki, E., Kolesova, E., & Kulikova, S. (2013, June 25–29). Social media, online activism and government control in Kazakhstan. Paper presented at the IAMCR 2013 Conference “Crises, ‘Creative Destruction’ and the Global Power and Communication Orders,” Dublin.
Beissinger, M. (2003). Codebook for disaggregated event data: “Mass demonstrations and mass violent events in the former USSR, 1987–1992.” Retrieved from https://scholar.princeton.edu/mbeissinger/publications/mass-demonstrations-and-mass-violent-events-former-ussr-1987-1992-these
Bekmurzaev, N., Lottholz, P., & Meyer, J. (2018). Navigating the safety implications of doing research and being researched in Kyrgyzstan: Cooperation, networks and framing. Central Asian Survey, 37, 100–118.
Bessonov, V. A. (2013). On the problems of Russian statistics. Problems of Economic Transition, 55(11), 36–49.
Black, E. W. (2008). Wikipedia and academic peer review: Wikipedia as a recognised medium for scholarly publication? Online Information Review, 32, 73–88.
Brandt, D. (2009). Postmodern organisation of knowledge or: How subversive is Wikipedia? LIBREAS. Library and Ideas, 14, 4–18.
Bühlmann, M., Merkel, W., Müller, L., Giebler, H., & Weßels, B. (2012). Democracy Barometer: A new instrument for comparative political science. Zeitschrift für Vergleichende Politikwissenschaft, 6, 115–159.
Büthe, T., & Jacobs, A. M. (2015). Introduction to the symposium. Qualitative & Multi-Method Research, 13(1), 2–8.
Carnaghan, E. (2011). The difficulty of measuring support for democracy in a changing society: Evidence from Russia. Democratization, 18, 682–706.
Cassidy, J. (2013, April 26). The Reinhart and Rogoff controversy: A summing up. The New Yorker. Retrieved from http://www.newyorker.com/news/john-cassidy/the-reinhart-and-rogoff-controversy-a-summing-up
Côté, I. (2013). Fieldwork in the era of social media: Opportunities and challenges. PS: Political Science & Politics, 46, 615–619.
Data and development: Off the map. (2014, November 15). The Economist. Retrieved from http://www.economist.com/news/international/21632520-rich-countries-are-deluged-data-developing-ones-are-suffering-drought
Deckert, J., Myagkov, M., & Ordeshook, P. C. (2011). Benford’s law and the detection of election fraud. Political Analysis, 19, 245–268.
Devarajan, S. (2013). Africa’s statistical tragedy [Special issue]. Review of Income and Wealth, 59, S9–S15.
Editorial: The Guardian view on the end of the peer review. (2014, July 6). The Guardian. Retrieved from https://www.theguardian.com/commentisfree/2014/jul/06/guardian-view-end-peer-review-scientific-journals
Eijkman, H. (2010). Academics and Wikipedia: Reframing Web 2.0+ as a disruptor of traditional academic power-knowledge arrangements. Campus-Wide Information Systems, 27, 173–185.
Elgesem, D. (2015). Consent and information: Ethical considerations when conducting research on social media. In H. Fossheim & H. Ingierd (Eds.), Internet research ethics (pp. 14–34). Oslo, Norway: Cappelen Damm Akademisk.
Elman, C., & Kapiszewski, D. (2014). Data access and research transparency in the qualitative tradition. PS: Political Science & Politics, 47(1), 43–47.
Ersado, L. (2006). Azerbaijan’s household survey data: Explaining why inequality is so low (Policy Research Working Paper WPS 4009). Washington, DC: World Bank. Retrieved from http://documents.worldbank.org/curated/en/2006/09/7063026/azerbaijans-household-survey-data-explaining-inequality-so-low
Fitzpatrick, K. (2012). Beyond metrics: Community authorization and open peer review. In M. Gold (Ed.), Debates in the digital humanities (pp. 452–459). Minneapolis: University of Minnesota Press.
Focus-Economics. (2016). Uzbekistan economic outlook. Retrieved from https://www.focus-economics.com/countries/uzbekistan
Fossheim, H., & Ingierd, H. (2015). Introductory remarks. In H. Fossheim & H. Ingierd (Eds.), Internet research ethics (pp. 9–13). Oslo, Norway: Cappelen Damm Akademisk.
Fredheim, R. (2017). The loyal editor effect: Russian online journalism after independence. Post-Soviet Affairs, 33(1), 34–48.
Giannone, D. (2010). Political and ideological aspects in the measurement of democracy: The Freedom House case. Democratization, 17(1), 68–97.
Goode, J. P. (2010). Redefining Russia: Hybrid regimes, fieldwork, and Russian politics. Perspectives on Politics, 8, 1055–1075.
Høyland, B., Moene, K., & Willumsen, F. (2012). The tyranny of international index rankings. Journal of Development Economics, 97, 1–14.
Heaton, J. (2008). Secondary analysis of qualitative data: An overview. Historical Social Research, 33(3), 33–45.
Heinrich, A., & Pleines, H. (2015). Mixing geopolitics and business. How ruling elites in the Caspian states justify their choice of export pipelines. Journal of Eurasian Studies, 6(2), 107–113.
Heinrich, A., & Pleines, H. (2018). The meaning of “limited pluralism” in media reporting under authoritarian rule. Politics and Governance, 6, 103–111.
Hoser, B., & Nitschke, T. (2010). Questions on ethics for research in the virtually connected world. Social Networks, 32, 180–186.
Hyde, S. D. (2007). The observer effect in international politics: Evidence from a natural experiment. World Politics, 60, 37–63.
International Federation of Human Rights. (2016). Women and children from Kyrgyzstan affected by migration. Paris, France: FIDH.
Ioannidis, J. P. A., Stanley, T. D., & Doucouliagos, H. (2017). The power of bias in economics research. Economic Journal, 127, F236–F265.
Jerven, M. (2013). Comparability of GDP estimates in sub-Saharan Africa: The effect of revisions in sources and methods since structural adjustment [Special issue]. Review of Income and Wealth, 59, S16–S36.
Johnson, I. M. (2014). The rehabilitation of library and information services and professional education in the post-Soviet republics: Reflections from a development project. Information Development, 30, 130–147.
Junk, J. (2011). Method parallelization and method triangulation: Method combinations in the analysis of humanitarian interventions. German Policy Studies, 7(3), 83–116.
Khaninym, G. I. (2012). Numbers continue to be deceitful. EKO. Vserossiiskii ekonomicheskii zhurnal, 3, 4–13.
King, D. (2014). The commissar vanishes: The falsification of photographs and art in Stalin’s Russia (new ed.). London, England: Tate.
Klimek, P., Yegorov, Y., Hanel, R., & Thurner, S. (2012). Statistical detection of systematic election irregularities. Proceedings of the National Academy of Sciences of the United States of America, 109, 16469–16473.
Klump, J., & Ludwig, J. (2013). Research data management. In H. Neuroth, N. Lossau, & A. Rapp (Eds.), Evolution der Informationsinfrastruktur. Kooperation zwischen Bibliothek und Wissenschaft (pp. 257–275). Glückstadt, Germany: VWH.
Knack, S. (2006). Measuring corruption in Eastern Europe and Central Asia: A critique of the cross-country indicators (World Bank Policy Research Working Paper No. 3968). Washington, DC: World Bank. doi:10.1596/1813-9450-3968
Korhonen, V. (2012). Russian statistics. A view from the sidelines. EKO. Vserossiiskii ekonomicheskii zhurnal, 4, 56–73.
Kryukov, V. A., & Sokolin, V. L. (2010). Russian statistics. Gains and losses. EKO. Vserossiiskii ekonomicheskii zhurnal, 8, 5–23.
Lehoucq, F. (2003). Electoral fraud: Causes, types and consequences. Annual Review of Political Science, 6, 233–256.
Levada Center. (2016, August 12). Trust in mass media and readiness to state one’s opinion. Author. Retrieved from https://www.levada.ru/2016/08/12/14111/
Lewis, D. (2016). Blogging Zhanaozen: Hegemonic discourse and authoritarian resilience in Kazakhstan. Central Asian Survey, 35, 421–438.
Lewis-Beck, M. S., Bryman, A., & Futing Liao, T. (Eds.). (2004). The SAGE encyclopedia of social science research methods. Thousand Oaks, CA: SAGE.
Lüders, M. (2015). Researching social media: Confidentiality, anonymity and reconstructing online practices. In H. Fossheim & H. Ingierd (Eds.), Internet research ethics (pp. 77–97). Oslo, Norway: Cappelen Damm Akademisk.
Lyons, P. (2012). Theory, data and analysis. Data resources for the study of politics in the Czech Republic. Prague: Institute of Sociology, Academy of Sciences of the Czech Republic.
Møller, J., & Skaaning, S.-E. (2012). The inconsistency between concept and measurement in current studies of democracy. Zeitschrift für Vergleichende Politikwissenschaft, 6(1), 233–251.
Magaloni, B. (2010). The game of electoral fraud and the ousting of authoritarian rule. American Journal of Political Science, 54, 751–765.
Malthaner, S. (2014). Fieldwork in the context of violent conflict and authoritarian regimes. In D. della Porta (Ed.), Methodological practices in social movement research (pp. 173–194). Oxford, UK: Oxford University Press.
McKee, H. A., & Porter, J. E. (2009). The ethics of Internet research: A rhetorical, case-based process. New York, NY: Peter Lang.
Monroe, K. R. (2018). The rush to transparency: DA-RT and the potential dangers for qualitative research. Perspectives on Politics, 16, 141–148.
Moravcsik, A. (2014). Transparency: The revolution in qualitative research. PS: Political Science & Politics, 47, 48–53.
Munck, G. L. (2011). Measuring democracy: Framing a needed debate. Comparative Democratization, 9(1), 1–7.
Muno, W. (2012). Measuring the world. An analysis of the World Bank’s Worldwide Governance Indicators. Zeitschrift für Vergleichende Politikwissenschaft, 6(1), 87–113.
Myagkov, M., Ordeshook, P. C., & Shakin, D. (2009). The forensics of election fraud: Russia and Ukraine. Cambridge, UK: Cambridge University Press.
Napeenko, G. (2017, March 13). If you scratch a domestic liberal, you get an educated conservative: Sociologist Grigory Yudin about deceiving public polls, elites’ fear of the people and the political suicide of the intelligentsia. Colta.ru. Retrieved from http://www.colta.ru/articles/raznoglasiya/14158
Nicholas, D., Watkinson, A., & Jamali, H. R. (2015). Peer review: Still king in the digital age. Learned Publishing, 28(1), 15–21.
Pearce, K. E. (2015). Democratizing kompromat: The affordances of social media for state-sponsored harassment. Information, Communication & Society, 18, 1158–1174.
Pearce, K. E., & Guliyev, F. (2016). Digital knives are still knives: The affordances of social media for a repressed opposition against an entrenched authoritarian regime in Azerbaijan. In A. Bruns, G. Enli, E. Skogerbo, A. O. Larsson, & C. Christensen (Eds.), The Routledge companion to social media and politics (pp. 364–378). Abingdon, UK: Taylor & Francis.
Pfeiffenberger, H. (2007). Open access to primary scientific data. Zeitschrift für Bibliothekswesen und Bibliographie, 54(4), 207–210.
Pickel, G., & Pickel, S. (Eds.) (2011). Indices in comparative political science. Wiesbaden, Germany: VS Verlag.
Pickel, S., & Pickel, G. (2012). The measurement of indices in comparative political science: Methodological sophistry or substantial need. Zeitschrift für Vergleichende Politikwissenschaft, 6(1), 1–17.
Pleines, H. (2018). Political regime-related country rankings. Caucasus Analytical Digest, 106, 2–19.
Prabhu, R. (2015). Big data—Big trouble? Meanderings in an uncharted ethical landscape. In H. Fossheim & H. Ingierd (Eds.), Internet research ethics (pp. 157–172). Oslo, Norway: Cappelen Damm Akademisk.
Quandt, M., & Mauer, R. (2012). Social sciences. In H. Neuroth, S. Strathmann, A. Oßwald, R. Scheffel, J. Klump, & J. Ludwig (Eds.), Langzeitarchivierung von Forschungsdaten. Eine Bestandsaufnahme (pp. 61–81). Glückstadt, Germany: VWH.
Razafindrakoto, M., & Roubaud, F. (2005). How far can we trust expert opinions on corruption? An experiment based on surveys in francophone Africa. In Transparency International (Ed.), Global corruption report (pp. 292–295). London, England: Pluto Press.
Richardson, P. B. (2014). Engaging the Russian elite: Approaches, methods and ethics. Politics, 34, 180–190.
Roberts, S. P. (2013). Research in challenging environments: The case of Russia’s “managed democracy.” Qualitative Research, 13, 337–351.
Seligson, M. A. (2004). Comparative survey research: Is there a problem? Comparative Politics, 15(2), 11–14.
Senyuva, O. (2010). Parliamentary elections in Moldova, April and July 2009. Electoral Studies, 29, 190–195.
Shih, V. (2015). Research in authoritarian regimes: Transparency tradeoffs and solutions. Qualitative & Multi-Method Research, 13, 20–22.
Shklovski, I., & Valtysson, B. (2012). Secretly political: Civic engagement in online publics in Kazakhstan. Journal of Broadcasting & Electronic Media, 56, 417–433.
Simola, H. (2012). The quality of Russian import statistics. EKO. Vserossiiskii ekonomicheskii zhurnal, 3, 95–104.
Steen-Johnsen, K., & Enjolras, B. (2015). Social research and Big Data: The tension between opportunities and realities. In H. Fossheim & H. Ingierd (Eds.), Internet research ethics (pp. 122–140). Oslo, Norway: Cappelen Damm Akademisk.
Steiner, N. D. (2016). Comparing Freedom House democracy scores to alternative indices and testing for political bias: Are US allies rated as more democratic by Freedom House? Journal of Comparative Policy Analysis, 18, 329–349.
Stockemer, D., Koehler, S., & Lentz, T. (2018). Data access, transparency, and replication: New insights from the political behavior literature. Political Science, 51, 799–803.
Teorell, J. (2011). Over time, across space: Reflections on the production and usage of democracy and governance data. Comparative Democratization, 9, 7–11.
The 90% question: A seminal analysis of the relationship between debt and growth comes under attack. (2013a, April 20). The Economist. Retrieved from http://www.economist.com/news/finance-and-economics/21576362-seminal-analysis-relationship-between-debt-and-growth-comes-under
Trouble at the lab: Scientists like to think of science as self-correcting. To an alarming degree, it is not. (2013b, October 18). The Economist. Retrieved from http://www.economist.com/news/briefing/21588057-scientists-thinkscience-self-correcting-alarming-degree-it-not-trouble
Tucker, J. A. (2007). Enough! Electoral fraud, collective action problems, and post-communist colored revolutions. Perspectives on Politics, 5, 535–551.
Utaaker Segadal, K. (2015). Possibilities and limitations of Internet research: A legal framework. In H. Fossheim & H. Ingierd (Eds.), Internet research ethics (pp. 35–47). Oslo, Norway: Cappelen Damm Akademisk.
Vickery, C., & Shein, E. (2012). Assessing electoral fraud in new democracies: Refining the vocabulary. Washington, DC: International Foundation for Electoral Systems.
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., . . . Mons, B. (2016). The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3, Article 160018. doi:10.1038/sdata.2016.18
Zimmer, M. (2010). ‘But the data is already public’: On the ethics of research in Facebook. Ethics and Information Technology, 12, 313–325.
Zubarevich, N. V. (2012). “Deceitful numbers” on the map of the homeland. EKO. Vserossiiskii ekonomicheskii zhurnal, 4, 74–85.

Author biographies

Andreas Heinrich is a senior researcher in the Department of Politics and Economics, Research Centre for East European Studies at the University of Bremen. His research interest is focused on the political role of the energy sector in the post-Soviet region.

Felix Herrmann is a research associate for e-research and digital humanities at the Research Centre for East European Studies at the University of Bremen. His research interests include the development of digital infrastructures to support academic research and the history of economy, industry, and technology in the COMECON countries.

Heiko Pleines is head of the Department of Politics and Economics, Research Centre for East European Studies, and professor of comparative politics at the University of Bremen. His main research interest is the role of nonstate actors in authoritarian regimes.
