Research Statement

My methodological and theoretical research, as well as a considerable portion of my applied and
collaborative work, addresses functional and longitudinal data. An innovative contribution of my
work is a new perspective on the analysis of such data, in which one assumes that a smooth
stochastic process generates underlying subject trajectories that are observed at discrete time
points with measurement error. This perspective creates a new class of functional
estimation procedures that nonparametrically account for the within-subject correlation in an
intuitive and efficient manner. These estimation procedures are motivated by the philosophy that
efficient functional second moment based procedures should directly target the covariance
structure. I adhere to this philosophy by creating penalized estimators based on a new measure of
regularity on the functional covariance that is induced from the smoothness of the subject
trajectories. This regularity is used as the basis of a penalty function. Parameters used to control
the influence of this penalty and parameters indexing other aspects of the second moment based
estimator are jointly estimated through a Kullback-Leibler metric over the covariance space. From
the empirical Bayesian perspective, this is equivalent to assuming a priori the same smoothness for
the subject trajectories and choosing the hyperparameters controlling second moment prior
assumptions to balance the fit of second moment estimators to the observed data as covariance
I developed this class of estimation procedures in response to the large amounts of functional data
that I encountered during collaborative projects and to address a dearth of unifying philosophical
motivation for the analysis of functional data that nonparametrically accounts for the within
subject covariance. This framework has produced interesting and potentially powerful results in
applications to the analysis of three main data sets. In addition to the collaborative work that
motivated the development of these new functional data analysis procedures, I am also involved in
applied and collaborative projects dealing with a wide range of data and design structures in a
large number of scientific settings.
Functional Principal Components Analysis with Application to Gene Expression Data: Since
the high dimensionality of functional data often makes effective visualizations difficult and a
dimension reduction necessary, functional principal components analysis (FPCA) can be a
powerful tool. In collaboration with Dr. Wensheng Guo from the Division of Biostatistics at Penn,
I formulated a penalized method for performing FPCA which estimates the smoothing parameter,
number of principal components, and random noise jointly via the Kullback-Leibler distance
between the estimated distribution and true distribution. This methodology was motivated by
time course gene expression data from fibroblast cells, which are essential in wound healing, and
its application provides insight into the functions of different genes. I presented this work at
ENAR's 2006 meeting, where it won a Distinguished Student Paper Travel Award. The asymptotic
consistency rates calculated for this paper lend mathematical support to the intuition that
smoothing parameters controlling the smoothness of a second moment quantity should be based on
metrics over a covariance space.
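As a point of reference for the FPCA step described above, the following is a minimal sketch of ordinary, unpenalized FPCA on curves observed on a common time grid. It is not the penalized estimator of this statement: it applies no smoothness penalty and does not select the smoothing parameter, number of components, or noise variance through the Kullback-Leibler distance. The function names (fpca, simulate), the simulated sine and cosine trajectories, and all parameter values are illustrative assumptions.

```python
import numpy as np


def fpca(curves, n_components):
    """Plain (unpenalized) FPCA for curves sampled on a common grid.

    curves: array of shape (n_subjects, n_timepoints).
    Returns (mean, eigenvalues, eigenvectors, scores).
    """
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)
    centered = curves - mean
    # Sample covariance across time points (n_timepoints x n_timepoints).
    cov = centered.T @ centered / (curves.shape[0] - 1)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the n_components largest eigenvalues, in descending order.
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Subject-level principal component scores.
    scores = centered @ eigvecs
    return mean, eigvals, eigvecs, scores


def simulate(n_subjects=200, n_timepoints=50, noise_sd=0.1, seed=0):
    """Hypothetical data: smooth trajectories plus measurement error."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_timepoints)
    phi1 = np.sin(2 * np.pi * t)
    phi2 = np.cos(2 * np.pi * t)
    # Random subject scores with standard deviations 2 and 1.
    xi = rng.normal(size=(n_subjects, 2)) * np.array([2.0, 1.0])
    smooth = xi[:, :1] * phi1 + xi[:, 1:] * phi2
    noisy = smooth + rng.normal(scale=noise_sd, size=smooth.shape)
    return t, noisy


t, y = simulate()
mean, eigvals, eigvecs, scores = fpca(y, n_components=2)
```

In this simulated setting two components capture nearly all of the variation because the trajectories were generated from two smooth curves. With real data the number of components and the amount of smoothing must be chosen, which is the problem the joint Kullback-Leibler criterion above is meant to address.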
Varying Coefficient Model with Application to Tumor Growth: Dr. George Coukos's
laboratory in the Abramson Family Cancer Institute collected data that examine the benefits of
supplementing chemotherapy drugs with antigenic drugs in two different classes of ovarian tumors
in mice. Since parametric methods did not adequately fit these data, along with Drs. Wensheng
Guo and Phyllis Gimotty from Penn's Division of Biostatistics, I developed an intuitive functional
extension of iteratively reweighted least squares to fit the varying coefficient model while
nonparametrically estimating the within subject covariance. My analysis of these data not only
led to a paper that received a very positive first review from Biometrics and a paper that is
currently being submitted to a clinical journal, but it has also revealed unexpected insight into the
overlapping pathways of these two classes of drugs and is the motivation for additional
FPCA of EEG Data for the Prediction of Seizure Onset: EEG data from epileptic patients
immediately preceding and during a seizure take the form of non-linear time series and a study of
their time varying spectrum can lead to an understanding of seizure onset. I am in the process of
extending the ideas of penalized FPCA on data where underlying subject trajectories are curves to
the non-linear time series setting where the unit of interest is now the surface of a time varying
spectrum. This work is motivated by and will be applied to a set of EEG scans of epileptic patients
collected by Dr. Brian Litt from the University of Pennsylvania's Epilepsy Center.
Other Collaborative Work: In addition to my work in analyzing functional data, I have also
worked on other collaborative projects dealing with many types of data. I have worked on the
design and presentation of results to data monitoring boards of a clinical trial by Dr. Mitch Schall
from the Department of Radiology at Penn which examines the effects of five different imaging
modalities on the diagnosis of breast cancer. Along with Drs. Charles Branas and Doug Wiebe
from the Division of Epidemiology at Penn and Dr. Michael Elliott, now at the Department of
Biostatistics at the University of Michigan, I performed a spatial analysis to investigate the
relationship between the presence of licensed firearms dealers and homicide rates due to firearms
in the United States and found that this relationship is non-linearly dependent on urbanicity. In
collaboration with the group headed by Dr. James Orsini at Penn's Veterinary School, I have also
designed and analyzed a case control study to explore risk factors in the development of laminitis
in horses.
Future Work: An interesting and powerful use of the framework underlying my methodological
work is the development of an intuitive method for performing functional discriminant and
classification analysis. I am also interested in extending these functional data techniques to
functional latent class models. All of my methodological and theoretical work is extendable to any
differentiable Hilbert space, and I intend to extend it to the analysis of image, spatial, and
spatial-temporal data.
In addition to this methodological work, I am interested in continuing to collaborate across
a wide range of applications. With an academic minor in imaging and experience through an NCI
biostatistics predoctoral training grant, which includes exposure to both pre- and post-clinical
research, I am well suited to enter into collaborations involving PET, MRI, microarray, and growth
curve data. I have further training and experience in collaborations in clinical trials, clinical case
control studies, epidemiological analysis, and analysis of basic science experiments. I intend
the methodological work described above to be only part of my future theoretical
work, and I expect my collaborative efforts to direct me towards open problems to which I can apply
my theoretical and mathematical skills in the development of useful and novel methodologies.

