Wake Forest University Department of Mathematics 2011-2012: Gentry Lectures
In genetic epidemiology, there is widespread interest in knowing whether genetic status and environmental exposures interact to
affect the probability of disease, e.g., whether certain carriers of a particular genetic status can have their risk of breast cancer
enhanced or lowered depending on their long-term nutrient intakes. Much of the data available to understand this question arise
from genome-wide association studies (GWAS), based on a particular statistical data design called a case-control study. Case-control
studies have a very different structure from textbook random sampling, and an entirely different statistical analysis. I will
describe the difficulty with uncovering the existence of gene-environment interactions, and the analysis of GWAS when one is willing
to make assumptions about the relationship of genetic status and environmental exposure in the population. The gains in statistical
power when one makes such assumptions can best be described as astonishing, but the analysis itself is very simple in the most
important contexts, and can be understood with the most basic probability calculations. Examples include (a) the BRCA1/2
mutation and use of oral contraceptives; (b) Vitamin D intake markers in the Vitamin D receptor pathway; and (c) smoking status
and a gene related to addiction to smoking. The talk is entirely statistical: no genetic knowledge is needed.
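As a rough illustration of why an assumption about genes and environment can buy so much power, the sketch below compares two estimates of a multiplicative gene-environment interaction from 2x2 tables of carrier status G and exposure E. It is not necessarily the method presented in the talk: it uses the simpler, well-known case-only estimator, which is valid when G and E are independent in the population and the disease is rare. The counts are hypothetical and were invented for this example, as are the function names.

```python
import math

# Hypothetical counts, invented for illustration only (not data from the talk).
# Keys are (G, E): 1 = carrier / exposed, 0 = non-carrier / unexposed.
cases = {(1, 1): 40, (1, 0): 60, (0, 1): 80, (0, 0): 220}
controls = {(1, 1): 20, (1, 0): 80, (0, 1): 80, (0, 0): 320}


def log_or(table):
    """Log odds ratio between G and E within one 2x2 table."""
    return math.log(table[(1, 1)] * table[(0, 0)] / (table[(1, 0)] * table[(0, 1)]))


def woolf_var(table):
    """Woolf's large-sample variance of a log odds ratio: sum of 1/count."""
    return sum(1.0 / n for n in table.values())


def case_control_interaction(cases, controls):
    """Standard estimate: G-E log odds ratio in cases minus that in controls."""
    est = log_or(cases) - log_or(controls)
    se = math.sqrt(woolf_var(cases) + woolf_var(controls))
    return est, se


def case_only_interaction(cases):
    """Assumes G-E independence in the population and a rare disease,
    so the G-E log odds ratio among controls is taken to be zero."""
    return log_or(cases), math.sqrt(woolf_var(cases))


cc_est, cc_se = case_control_interaction(cases, controls)
co_est, co_se = case_only_interaction(cases)
print(f"case-control: log OR = {cc_est:.3f}, SE = {cc_se:.3f}")
print(f"case-only:    log OR = {co_est:.3f}, SE = {co_se:.3f}")
```

With these made-up counts the controls show no G-E association, so the two point estimates agree. The case-only standard error is smaller because four of the eight reciprocal-count terms drop out of the variance. If the independence assumption is wrong, the case-only estimate is biased, which is the price of the gain.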
In a series of papers on Lidar data, magically good classification rates are claimed once data are deconvolved and a dimension
reduction technique applied. The latter can certainly be useful, but it is not clear a priori that deconvolution is a good idea in this
context. After all, deconvolution adds noise, and added noise leads to lower classification accuracy. I will give a more or less
formal argument that in a closely related class of deconvolution problems, what statisticians call "Measurement Error Models",
deconvolution typically leads to increased classification error rates. An empirical example in a more classical deconvolution
context illustrates the results.
www.math.wfu.edu/Events/Gentry/Carroll.html