
SIMULTANEOUS ESTIMATION OF MULTIPLE SOURCES OF ERROR

IN A SMARTPHONE-BASED SURVEY

JANINE VILLANUEVA LEQUIRON


Bachelor of Science in Civil Engineering-2D

ENG. JOEL MOLINA


Engineering Data Analysis Instructor
INTRODUCTION

The use of smartphones for data collection has become increasingly popular in recent years. However,
like any survey methodology, smartphone-based surveys are subject to various sources of error that can
affect the quality of the data collected. In order to ensure that the data collected from smartphone-based
surveys are accurate and reliable, it is important to simultaneously estimate and account for multiple
sources of error. This statistical article focuses on the topic of simultaneous estimation of multiple sources
of error in a smartphone-based survey and discusses the methods that can be used to achieve this. By
understanding and applying these methods, researchers can improve the quality of data collected through
smartphone-based surveys and ensure that the results obtained are both valid and reliable.
Survey organizations face many challenges in their efforts to produce high-quality survey data. The costs
of data collection and the demand for data products are greater than ever, and survey budgets often are
under serious strain to meet these demands. Declining survey response rates further complicate cost and
data-quality considerations. Given these challenges, survey organizations increasingly are exploring the
possibility of linking survey data to administrative records. Combining survey and administrative data on
the same sample unit has the potential to reduce the cost, length, and perceived burden of a survey; enrich
our understanding of the underlying substantive phenomena; and offer a mechanism for targeted
assessments of survey error components.
Linking survey data to administrative data sources on the same individual or household requires matching
records from one dataset to the other. The efficiency and success of this matching process depends on the
variables and linkage strategy used to establish the link. Exact matching techniques are most successful
when unique identifying information such as a social security number (SSN) is available, but these
techniques can also be effective in the absence of unique identifiers when combinations of other personal
variables are compared. Before any linkage attempt can be made, however, most countries require that
survey respondents give their informed consent to link, and consent rates can vary considerably. Lower
consent rates are potentially a major challenge to wider adoption of record-linkage in statistical agencies
because they increase the risk of bias in estimates derived from combined data (to the extent that there are
systematic differences in key outcome measures between those who consent to link and those who do
not).
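To make the matching strategy concrete, the sketch below illustrates, in Python/pandas, a simple two-pass exact-matching approach: link on a unique identifier where it is available, and fall back to a combination of other personal variables otherwise. The datasets, field names, and values are hypothetical and are not drawn from the paper.

import pandas as pd

# Hypothetical survey and administrative extracts; all fields are illustrative.
survey = pd.DataFrame({
    "ssn":        ["123-45-6789", None, "555-12-3456"],
    "last_name":  ["Reyes", "Cruz", "Santos"],
    "birth_date": ["1990-02-14", "1985-07-30", "1978-11-02"],
    "zip_code":   ["6000", "6014", "6045"],
    "income":     [52000, 41000, 66000],
})
admin = pd.DataFrame({
    "ssn":        ["123-45-6789", "999-88-7777", "555-12-3456"],
    "last_name":  ["Reyes", "Cruz", "Santos"],
    "birth_date": ["1990-02-14", "1985-07-30", "1978-11-02"],
    "zip_code":   ["6000", "6014", "6045"],
    "tax_paid":   [3100, 2500, 4200],
})

# Pass 1: exact match on the unique identifier where it exists.
with_ssn = survey.dropna(subset=["ssn"]).merge(
    admin, on="ssn", suffixes=("", "_adm"))

# Pass 2: for records without an SSN, match on a combination of other variables.
no_ssn = survey[survey["ssn"].isna()].merge(
    admin, on=["last_name", "birth_date", "zip_code"], suffixes=("", "_adm"))

linked = pd.concat([with_ssn, no_ssn], ignore_index=True)
print(linked[["last_name", "income", "tax_paid"]])

In practice, probabilistic linkage methods extend this idea by scoring partial agreement across fields rather than requiring exact equality.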
As interest in and adoption of record-linkage methods have increased, so too have investigations into
factors associated with respondents’ consent decisions and their potential impact on consent bias. In
general, consent-to-link phenomena can be viewed as a type of incomplete-data problem and, thus, can
make use of the broad spectrum of conceptual and methodological tools that have been developed for work
with incomplete data.
SUMMARY

The article "Simultaneous Estimation of Multiple Sources of Error in a Smartphone-Based Survey"


discusses the importance of simultaneously estimating and accounting for multiple sources of error in
smartphone-based surveys. The article highlights the various sources of error that can affect the quality of
data collected in smartphone-based surveys, including measurement error, non-response bias, and
coverage error. The article then discusses several statistical methods that can be used to simultaneously
estimate and account for these sources of error, including multiple imputation, weighting, and calibration.
By using these methods, researchers can improve the accuracy and reliability of data collected through
smartphone-based surveys, which can lead to more valid and useful results. Overall, the article provides
valuable insights into the importance of error estimation in smartphone-based surveys and the methods
that can be used to achieve it.
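As a rough illustration of the weighting and calibration idea mentioned above, the sketch below post-stratifies a small hypothetical sample so that its weighted age-group shares match assumed population shares. The variable names, outcome values, and population figures are invented for illustration, and post-stratification is only one simple form of calibration.

import numpy as np
import pandas as pd

# Hypothetical respondents and known population shares for one auxiliary variable.
resp = pd.DataFrame({
    "age_group": ["18-34", "18-34", "35-54", "55+", "35-54", "18-34"],
    "y":         [3.2, 4.1, 2.8, 2.5, 3.0, 3.9],   # some survey outcome
})
population_share = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

# Weight each respondent so that the weighted age-group shares match the
# assumed population shares.
sample_share = resp["age_group"].value_counts(normalize=True)
resp["weight"] = resp["age_group"].map(
    lambda g: population_share[g] / sample_share[g])

unweighted = resp["y"].mean()
weighted = np.average(resp["y"], weights=resp["weight"])
print(f"unweighted mean: {unweighted:.2f}, calibrated mean: {weighted:.2f}")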
Legal and social environments often require that sampled survey units provide informed consent before a survey organization may link their responses with administrative or commercial records. In these cases, survey organizations must assess a complex range of factors, including: (a) the general willingness of a respondent to consent to linkage; (b) the probability of successful linkage with a given record source, conditional on consent in (a); (c) the quality of the linked source; and (d) the impact of linkage on the properties of the resulting estimators that combine survey and linked-source data. Rigorous assessment of (b) and (c) in a production environment can be quite expensive and time-consuming, which in turn appears to have limited the extent and pace of exploration of record linkage as a supplement to standard sample survey data collection.
To address this problem, the paper considered an approach based on the inclusion of a simple “consent-to-link” question in a standard survey instrument, followed by analyses to address issue (a). The resulting models for the propensity to object to linkage identified significant factors among standard demographic characteristics, proxy variables related to respondent attitudes, and related two-factor interactions. In addition, follow-up analyses of several economic variables (directly reported income, property tax, property value, and rental value) identified substantial differences between the full-population means and the corresponding propensity-adjusted means of the consenting subpopulation. Further analyses of the estimated quantiles of these economic variables did not indicate that these mean differences were attributable to simple tail-quantile phenomena. Finally, empirical assessment of consent-to-link propensity patterns and related potential consent biases naturally involves a complex set of trade-offs, including: the degree to which a given set of test conditions is relevant to current or prospective production conditions; the ability to control applicable design factors and to measure relevant covariates within the context of those production conditions; the ability to measure and model specific portions of the complex processes that lead to respondent consent and cooperation in a given setting; and constraints on resources, including both the direct costs of testing consent-to-link options and the indirect costs arising from the potential impact of testing on current survey production.
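The sketch below illustrates, in simplified form, the kind of consent-propensity analysis described above: a logistic model for consent fit on simulated demographic covariates, followed by an inverse-propensity-weighted mean of an economic variable among consenters for comparison with the full-sample mean. The data, covariates, and coefficients are simulated, and the specifics differ from the paper's models.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated respondent data; variables and effect sizes are illustrative only.
rng = np.random.default_rng(0)
n = 500
age = rng.integers(18, 80, n)
female = rng.integers(0, 2, n)
college = rng.integers(0, 2, n)
income = 30 + 0.4 * age + 15 * college + rng.normal(0, 10, n)   # in $1,000s
logit_p = -1.0 + 0.02 * age + 0.5 * college - 0.3 * female
consent = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

df = pd.DataFrame({"age": age, "female": female, "college": college,
                   "income": income, "consent": consent})

# Model the propensity to consent to linkage as a function of demographics.
X = sm.add_constant(df[["age", "female", "college"]])
fit = sm.Logit(df["consent"], X).fit(disp=0)
df["p_consent"] = fit.predict(X)

# Compare the full-sample mean with an inverse-propensity-weighted mean
# computed on the consenting subsample.
consenters = df[df["consent"] == 1]
adjusted_mean = np.average(consenters["income"],
                           weights=1.0 / consenters["p_consent"])
print("full-sample mean income:", round(df["income"].mean(), 2))
print("propensity-adjusted consenting mean:", round(adjusted_mean, 2))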

CRITIQUE

Simultaneous estimation of multiple sources of error is an important aspect of survey research, and the
use of smartphones for data collection has become increasingly popular in recent years. The topic of
simultaneous estimation of multiple sources of error in a smartphone-based survey is therefore an
important one. One potential critique of this topic is that it may be too specific, as it addresses only error
sources in smartphone-based surveys. While smartphones are widely used for data collection, they are not
the only mode of data collection. Therefore, the findings may not be generalizable to other modes of data
collection, such as web-based or paper-based surveys.
Another critique is that the topic may be too technical for a general audience. The statistical methods used
in simultaneous estimation of multiple sources of error can be complex and difficult to understand for
those without a strong background in statistics. Therefore, the topic may be more suitable for a
specialized audience, such as survey researchers or statisticians.
Finally, while simultaneous estimation of multiple sources of error is important in survey research, it is
only one aspect of survey quality. Other factors, such as sampling design, questionnaire design, and data
collection procedures, can also affect survey quality. Therefore, it is important to consider these factors in
addition to error estimation when designing and conducting surveys.
The aim was to estimate mode effects in a smartphone web survey against a PC web reference survey for several survey variables and then decompose those mode effects into specific error sources. While measurement error (discrepancies between reported values and a “true” value on a survey variable) is often assumed to be the reason for mode effects, the Total Survey Error framework provides a typology of other specific error sources, divided into errors of observation and errors of non-observation (such as coverage error and nonresponse error), that can bias survey estimates. Under this framework, the aggregation of error differences between two modes is equivalent to the overall mode effect. The authors therefore move beyond measurement error and also examine the effect of smartphone data collection on two other error sources: coverage and nonresponse. They focus on these errors rather than sampling, specification, and processing errors because the likelihood of the latter types is assumed to be the same across smartphone and PC web surveys.
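A minimal sketch of how such a decomposition might be computed is shown below, assuming a within-subjects dataset with a PC report for every panel member, indicators for smartphone ownership and smartphone response, and a smartphone report for smartphone respondents. The column names and simulated values are illustrative, not the study's data.

import numpy as np
import pandas as pd

# Hypothetical within-subjects data: every panel member answered by PC;
# smartphone owners who responded also answered by smartphone.
rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({
    "pc_answer": rng.normal(50, 10, n),            # reference (PC web) report
    "owns_smartphone": rng.binomial(1, 0.8, n),    # coverage indicator
})
df["responded_sp"] = df["owns_smartphone"] * rng.binomial(1, 0.7, n)
# Smartphone reports exist only for smartphone respondents.
df["sp_answer"] = np.where(df["responded_sp"] == 1,
                           df["pc_answer"] + rng.normal(0, 2, n), np.nan)

full   = df["pc_answer"].mean()                                   # target estimate
owners = df.loc[df["owns_smartphone"] == 1, "pc_answer"].mean()
resp   = df.loc[df["responded_sp"] == 1, "pc_answer"].mean()
sp     = df.loc[df["responded_sp"] == 1, "sp_answer"].mean()

coverage_err    = owners - full   # owners vs. full sample (PC reports)
nonresponse_err = resp - owners   # respondents vs. owners (PC reports)
measurement_err = sp - resp       # smartphone vs. PC reports, same people
mode_effect     = sp - full       # equals the sum of the three components
print(coverage_err, nonresponse_err, measurement_err, mode_effect)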
Coverage error, which in this case refers to differences in survey statistics between the target population and the population of smartphone owners, is a concern because smartphone ownership and use are not universal. Furthermore, there appear to be differences between smartphone owners and non-owners, with the former tending to be younger and better educated. Nonresponse error, or discrepancies in survey statistics between the pool of initial sample members and the pool of respondents to a survey, is of special concern in smartphone surveys because sample members tend to respond at lower rates when using smartphones than when using PCs. The sizes of coverage, nonresponse, and measurement errors in smartphone surveys may vary based on several factors, including the methods used and the target population about which one wants to make inferences. Nonetheless, it seems likely that the mechanisms underlying errors in smartphone surveys described above would apply irrespective of the target population, and that an analysis of errors with the study's sample might still yield insights into the viability of smartphone data collection with other samples.

CONCLUSION

The study "Simultaneous Estimation of Multiple Sources of Error in a Smartphone-Based Survey" aimed
to identify and quantify various sources of error that can affect data collected through a smartphone-based
survey. The survey was conducted to estimate the following sources of error: noncoverage, nonresponse,
measurement, and processing errors.
The authors found that the smartphone-based survey had lower response rates and that its estimates were affected by noncoverage bias. They also found that measurement error was higher in the smartphone-based survey, particularly for self-reported height and weight. Processing errors were also identified, such as missing data and errors in data imputation.
Overall, the study highlights the importance of considering multiple sources of error when estimating
survey data and suggests that smartphone-based surveys may be subject to different sources of error than
traditional surveys. The authors recommend that researchers carefully consider the potential sources of
error and take steps to minimize them when designing and conducting smartphone-based surveys.
Although web surveys in which respondents are encouraged to use smartphones have started to emerge, it
is still unclear whether they are a promising alternative to traditional web surveys in which most
respondents use desktop computers. For sample members to participate in smartphone-based surveys,
they need to have access to a smartphone and agree to use it to complete the survey; this raises concerns
about coverage and nonresponse, as well as measurement if those who agree to participate have any
difficulty using smartphones. In an analysis of data from a smartphone versus desktop (within-subjects) experiment conducted in a probability-based web panel, the authors compare estimates produced by the smartphone web survey (one condition) and the PC web survey (the other condition). They estimate mode effects and then examine the extent to which these effects are attributable to coverage, nonresponse, and measurement errors in the smartphone-based survey. While mode effects were generally small, they find that the smartphone web survey produced biased estimates relative to the PC web survey for a subset of survey variables, largely due to noncoverage and, to a lesser extent, nonresponse. They find no evidence of measurement effects. These findings point to the trade-off between the advanced data-collection opportunities of smartphones and the potential selection errors that such devices may introduce.
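Because each respondent in such a design answers in both modes, the measurement component can be checked directly with a paired comparison. The sketch below runs a paired t-test on simulated smartphone and PC reports; the data are invented, and a paired test is only one simple way to examine measurement differences.

import numpy as np
from scipy import stats

# Hypothetical paired reports from respondents who completed both the
# smartphone and the PC versions of the same question (simulated values).
rng = np.random.default_rng(2)
pc_reports = rng.normal(50, 10, 200)
sp_reports = pc_reports + rng.normal(0, 3, 200)   # no systematic shift

# A paired comparison isolates the measurement component of the mode effect.
t_stat, p_value = stats.ttest_rel(sp_reports, pc_reports)
print(f"mean difference: {np.mean(sp_reports - pc_reports):.3f}, p = {p_value:.3f}")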
Are smartphone surveys viable for making inferences about a target population? Much like with PC web surveys, the results suggest that they might be viable in situations where the problem of noncoverage can be overcome. Still, the findings suggest that until mobile ownership rates increase, estimates from smartphone surveys should be interpreted with caution, especially for attributes that directly influence one's likelihood of owning a smartphone.
SOURCE AND CITATION

1. Antoun, Christopher, et al. Simultaneous Estimation of Multiple Sources of Error in a Smartphone-Based Survey. Journal of Survey Statistics and Methodology, Volume 7, Issue 1, March 2019, Pages 93–117. https://doi.org/10.1093/jssam/smy002
