Welcome to Scribd!

Preview First Page

Uploaded by

0% found this document useful (0 votes)

8 views1 page

1) The document proposes a new method called Multiple Imputation Random Lasso (MIRL) to perform variable selection and prediction on incomplete high-dimensional data, such as from an epidemiological study of eating and activity in teens where 80% of individuals had at least one missing variable. 2) MIRL combines penalized regression, multiple imputation, and stability selection. Simulation studies show MIRL outperforms alternatives in high-dimensional scenarios, especially when variable correlation and missing proportions are high. 3) MIRL achieved improved performance over other methods when applied to the teen eating and activity study, predicting outcomes for boys and girls separately and for a subgroup of low SES Asian boys at high risk for obesity.

Original Description:

Original Title

PreviewFirstPage

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

0% found this document useful (0 votes)

8 views1 page

Preview First Page

Uploaded by

habeeb4sa

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

Jump to Page

You are on page 1of 1

Search inside document

The Annals of Applied Statistics

2016, Vol. 10, No. 1, 418–450

VARIABLE SELECTION AND PREDICTION WITH INCOMPLETE

HIGH-DIMENSIONAL DATA1

B Y Y ING L IU , Y UANJIA WANG , YANG F ENG AND M ELANIE M. WALL

Columbia University
We propose a Multiple Imputation Random Lasso (MIRL) method to
select important variables and to predict the outcome for an epidemiologi-
cal study of Eating and Activity in Teens. In this study 80% of individuals
have at least one variable missing. Therefore, using variable selection meth-
ods developed for complete data after listwise deletion substantially reduces
prediction power. Recent work on prediction models in the presence of in-
complete data cannot adequately account for large numbers of variables with
arbitrary missing patterns. We propose MIRL to combine penalized regres-
sion techniques with multiple imputation and stability selection. Extensive
simulation studies are conducted to compare MIRL with several alternatives.
MIRL outperforms other methods in high-dimensional scenarios in terms of
both reduced prediction error and improved variable selection performance,
and it has greater advantage when the correlation among variables is high
and missing proportion is high. MIRL is shown to have improved perfor-
mance when comparing with other applicable methods when applied to the
study of Eating and Activity in Teens for the boys and girls separately, and to
a subgroup of low social economic status (SES) Asian boys who are at high
risk of developing obesity.

1. Motivating example. In large epidemiological studies, accurately predict-

ing outcomes and selecting variables important for explaining the outcomes are
two main research goals. One commonly encountered complication in these stud-
ies is missing data due to subjects’ loss to follow up or nonresponses. It is not
straightforward to handle missing data when performing variable selection since
most existing variable selection approaches require complete data.
Our motivating study is the Eating and Activity in Teens (Project EAT) with a
focus of identifying risk and protective factors for adolescent obesity [Larson et al.
(2013), Neumark-Sztainer et al. (2012)]. A primary research goal is to identify the
most important household, family, peer, school and neighborhood environmental
characteristics predicting a teenagers’ weight status in order to provide recom-
mendations for potential prevention strategies. A strength of Project EAT is the

Received July 2014; revised December 2015.

1 The Project EAT was supported by grant number R01HL084064 (D. Neumark-Sztainer, principal
investigator) from the National Heart, Lung, and Blood Institute. Supported in part by NIH Grants
NS073671, NS082062 and NSF Grant DMS-13-08566.
Key words and phrases. Missing data, random lasso, multiple imputation, variable selection, sta-
bility selection, variable ranking.
418

Applied Multivariate Statistical Analysis Solution Manual PDF
Document18 pages
Applied Multivariate Statistical Analysis Solution Manual PDF
ALVYAN ARIF
No ratings yet
Speech Acts: Are The Speaker's Utterances Which Convey Meaning and Make Listeners Do Specific Things (Austin, 1962)
Document10 pages
Speech Acts: Are The Speaker's Utterances Which Convey Meaning and Make Listeners Do Specific Things (Austin, 1962)
Ashi Belmonte
100% (4)
BTVS 5x16 "The Body" Audio Commentary
Document12 pages
BTVS 5x16 "The Body" Audio Commentary
njz
No ratings yet
2 Biostatics UNIT-II
Document18 pages
2 Biostatics UNIT-II
vidya gadkari
No ratings yet
Jurnal Farmakoepidemiologi - M. Ari Wisnu - 252020536U
Document11 pages
Jurnal Farmakoepidemiologi - M. Ari Wisnu - 252020536U
anisdhafin135
No ratings yet
Data Acquisition and Preprocessing in Studies On Humans What Is Not Taught in Statistics Classes
Document8 pages
Data Acquisition and Preprocessing in Studies On Humans What Is Not Taught in Statistics Classes
Courtney Adams
No ratings yet
Applied Sciences
Document17 pages
Applied Sciences
m.ansari722
No ratings yet
Global Ecology and Biogeography - 2020 - Johnson
Document12 pages
Global Ecology and Biogeography - 2020 - Johnson
Sabrinah Yap
No ratings yet
Research in Autism Spectrum Disorders: Shin-Yi Wang, Ying Cui, Rauno Parrila
Document8 pages
Research in Autism Spectrum Disorders: Shin-Yi Wang, Ying Cui, Rauno Parrila
Yusrina Ayu Febryan
No ratings yet
Output 8 and 9
Document11 pages
Output 8 and 9
Angelo Parao
No ratings yet
Missing Data Review
Document31 pages
Missing Data Review
Mila Anasanti
No ratings yet
How Handling Missing Data May Impact Conclusions - A Comparison of Six Different Imputation Methods For Categorical Questionnaire Data
Document20 pages
How Handling Missing Data May Impact Conclusions - A Comparison of Six Different Imputation Methods For Categorical Questionnaire Data
Isaac Iconoclasta
No ratings yet
The Effectiveness of Intervention On The Behavior of Individuals With Autism
Document21 pages
The Effectiveness of Intervention On The Behavior of Individuals With Autism
raras
No ratings yet
Paper 2, Modeling Human Development Index Using Discriminant Analysis
Document21 pages
Paper 2, Modeling Human Development Index Using Discriminant Analysis
Iwuagwu, Chukwuma Emmanuel
No ratings yet
Stunting 2 PDF
Document7 pages
Stunting 2 PDF
Permata Hati
No ratings yet
Meta-Analysis in Clinical Research
Document2 pages
Meta-Analysis in Clinical Research
Andrea Chew
No ratings yet
Sample Research Paper
Document6 pages
Sample Research Paper
Melissa Aintgonnanevermiss'em Carroll- Hall
No ratings yet
Tutorial in Biostatistics Using The General Linear Mixed Model To Analyse Unbalanced Repeated Measures and Longitudinal Data
Document32 pages
Tutorial in Biostatistics Using The General Linear Mixed Model To Analyse Unbalanced Repeated Measures and Longitudinal Data
pedro veiga
No ratings yet
Statistics in Medicine - 2019 - Robbins
Document18 pages
Statistics in Medicine - 2019 - Robbins
Gabriel Alonso González Medina
No ratings yet
Population Pharmacokinetics Is The Study of The Sources and Correlates of Variability in Drug
Document2 pages
Population Pharmacokinetics Is The Study of The Sources and Correlates of Variability in Drug
chandru sahana
No ratings yet
2017 - Alomari Et Al
Document18 pages
2017 - Alomari Et Al
azeemathmariyam
No ratings yet
202266-Article Text-506509-1-10-20201216
Document16 pages
202266-Article Text-506509-1-10-20201216
Periasamy Rathinasamy
No ratings yet
The Effect of Medical Experience On The Economic Evaluation of Health Policies. A Discrete Choice Experiment
Document13 pages
The Effect of Medical Experience On The Economic Evaluation of Health Policies. A Discrete Choice Experiment
Guillermo Sanchez Vanegas
No ratings yet
Bài 3
Document2 pages
Bài 3
Quỳnh Như Nguyễn Trịnh
No ratings yet
Kavanagh 2014 Simplifying The Assessment of Antid
Document2 pages
Kavanagh 2014 Simplifying The Assessment of Antid
ok
No ratings yet
Untitled Document
Document11 pages
Untitled Document
robymuiruri42
No ratings yet
Significant Oral Cancer Risk Associated With Low Socioeconomic Status
Document2 pages
Significant Oral Cancer Risk Associated With Low Socioeconomic Status
WJ
No ratings yet
Lead Exposure
Document5 pages
Lead Exposure
mcproenca
No ratings yet
Statistic - Task
Document5 pages
Statistic - Task
gelma furing lizaliza
No ratings yet
QUALITATIVE AND MIXED METHODS RESEARCH DESIGNS-revison
Document3 pages
QUALITATIVE AND MIXED METHODS RESEARCH DESIGNS-revison
david murimi
No ratings yet
Workings Papers Referencias
Document5 pages
Workings Papers Referencias
GianMarcoCanchanyaVillegas
No ratings yet
Standardised Regression Coefficient-Metaanalysis
Document15 pages
Standardised Regression Coefficient-Metaanalysis
Jasleen Kaur
No ratings yet
Review 2008 en
Document6 pages
Review 2008 en
vimlen
No ratings yet
Quantitative Research: Module 1: The Nature of Inquiry and Research
Document19 pages
Quantitative Research: Module 1: The Nature of Inquiry and Research
margilyn ramos
No ratings yet
Statistical Causal Inferences and Their Applications in Public Health Research-Springer International Publishing (2016)
Document324 pages
Statistical Causal Inferences and Their Applications in Public Health Research-Springer International Publishing (2016)
Katerina Ts
No ratings yet
8234 30157 1 PB
Document9 pages
8234 30157 1 PB
Evi Nurhidayati
No ratings yet
PPhA NatCon 2024 Presentations Book of Abstracts
Document53 pages
PPhA NatCon 2024 Presentations Book of Abstracts
Rey Tabay
No ratings yet
Journal Pone 0089482
Document13 pages
Journal Pone 0089482
Monica Lauretta Sembiring II
No ratings yet
Statistics Biology Experiments Medicine Pharmacy Agriculture Fishery
Document15 pages
Statistics Biology Experiments Medicine Pharmacy Agriculture Fishery
kavya nainita
No ratings yet
Section Iv
Document8 pages
Section Iv
yogi.kristiyanto
No ratings yet
Adherence 3
Document29 pages
Adherence 3
b.mod
No ratings yet
Nature and Inquiry of Research 1
Document18 pages
Nature and Inquiry of Research 1
MAUREEN BRILLANTES ENCABO (Lunduyan 4)
No ratings yet
Investigating The Utility of Machine Learning Models in Disease
Document15 pages
Investigating The Utility of Machine Learning Models in Disease
tocaofficial110
No ratings yet
His 9
Document158 pages
His 9
Anjum
100% (6)
Pre-Post OT Intervention
Document16 pages
Pre-Post OT Intervention
Idanerlis
No ratings yet
Sample Article Critique
Document7 pages
Sample Article Critique
rebecca
No ratings yet
Parents Suggest Which Indicators of Progress and Outcomes Should Be Measured in Young Children With Autism Spectrum Disorder
Document11 pages
Parents Suggest Which Indicators of Progress and Outcomes Should Be Measured in Young Children With Autism Spectrum Disorder
Ana Gamba
No ratings yet
Kangaroo Mother Care and Neonatal Outcomes A Meta-Analysis
Document18 pages
Kangaroo Mother Care and Neonatal Outcomes A Meta-Analysis
Imtiyazi Nabila
No ratings yet
PR2:Nature and Inquiry of Research
Document18 pages
PR2:Nature and Inquiry of Research
Mary Joy Llosa Redulla
No ratings yet
Rss Abstracts Booklet 2019 A4 PDF
Document324 pages
Rss Abstracts Booklet 2019 A4 PDF
Slice Le
No ratings yet
Rahibu Concept Note
Document4 pages
Rahibu Concept Note
Fadhil Chiwanga
No ratings yet
Objectives:: SAGE Open Medicine
Document34 pages
Objectives:: SAGE Open Medicine
Isaac Iconoclasta
No ratings yet
Statistics
Document50 pages
Statistics
AGEY KAFUI HANSEL
No ratings yet
Ralph2015 Metaanálise
Document9 pages
Ralph2015 Metaanálise
Thayane Fraga
No ratings yet
3 Delinquent Mta Molinaetal07
Document13 pages
3 Delinquent Mta Molinaetal07
lucasher35
No ratings yet
Bednets, Information and Malaria in Orissa: Aprajit Mahajan Alessandro Tarozzi Joanne Yoong Brian Blackburn March 2009
Document42 pages
Bednets, Information and Malaria in Orissa: Aprajit Mahajan Alessandro Tarozzi Joanne Yoong Brian Blackburn March 2009
aprajit.mahajan
No ratings yet
4 5848221817503747052
Document15 pages
4 5848221817503747052
Yasin Ahmed
No ratings yet
10 1 1 703 9931 PDF
Document10 pages
10 1 1 703 9931 PDF
werwerwer
No ratings yet
Clayson 2019
Document17 pages
Clayson 2019
injuwanijuwa
No ratings yet
JDS 712
Document9 pages
JDS 712
aaditya01
No ratings yet
Effectiveness of Paediatric Occupational Therapy For Chidren With Disabilities A Systematic Review
Document16 pages
Effectiveness of Paediatric Occupational Therapy For Chidren With Disabilities A Systematic Review
Cynthia Rodrigues
No ratings yet
Concise Biostatistical Principles & Concepts: Guidelines for Clinical and Biomedical Researchers
From Everand
Concise Biostatistical Principles & Concepts: Guidelines for Clinical and Biomedical Researchers
Franklin Opara
No ratings yet
Efficient Skyline Computation On Massive Incomplet
Document19 pages
Efficient Skyline Computation On Massive Incomplet
habeeb4sa
No ratings yet
Online Feature Selection and Classification With Incomplete Data
Document13 pages
Online Feature Selection and Classification With Incomplete Data
habeeb4sa
No ratings yet
The Behaviour of Rank Correlation Coefficients For
Document15 pages
The Behaviour of Rank Correlation Coefficients For
habeeb4sa
No ratings yet
AI - Learning.websites and Other Educational Websites
Document18 pages
AI - Learning.websites and Other Educational Websites
habeeb4sa
No ratings yet
Bias and Inaccuracy in AI Chatbot Ophthalmologist
Document9 pages
Bias and Inaccuracy in AI Chatbot Ophthalmologist
habeeb4sa
No ratings yet
7 Steps of Writing 10X Better Prompts
Document63 pages
7 Steps of Writing 10X Better Prompts
habeeb4sa
No ratings yet
ASurveyon AIChatbots
Document10 pages
ASurveyon AIChatbots
habeeb4sa
No ratings yet
Call Out Culture
Document6 pages
Call Out Culture
Eugene Toribio
No ratings yet
Ccts Maine Standards Rev 11 20
Document11 pages
Ccts Maine Standards Rev 11 20
api-607068017
No ratings yet
Report On Business February 2010
Document14 pages
Report On Business February 2010
investingthesis
No ratings yet
Otp System Documentation
Document345 pages
Otp System Documentation
jhernandezb219
No ratings yet
Case Analysis On Saffola
Document12 pages
Case Analysis On Saffola
Shiva Krishna Padhi
100% (3)
Self Efficacy Towards Geometry Scale 2
Document13 pages
Self Efficacy Towards Geometry Scale 2
S2118169 STUDENT
No ratings yet
D E R I V A T I O N of A Diggability Index For Surface Mine Equipment Selection
Document18 pages
D E R I V A T I O N of A Diggability Index For Surface Mine Equipment Selection
Zuhaib Shaikh
No ratings yet
Lesson Plan-Elm 375
Document3 pages
Lesson Plan-Elm 375
api-582006278
No ratings yet
Assignment Thermo
Document6 pages
Assignment Thermo
Shahrul Naim
No ratings yet
Articles of Association SIDCL
Document19 pages
Articles of Association SIDCL
david
No ratings yet
When Sexual and Religious Orientation Collide
Document25 pages
When Sexual and Religious Orientation Collide
abdurrakhman sapar
No ratings yet
Instructions To Candidate: REAS - SPL - 2019 TEST ID - RAB-87148324
Document4 pages
Instructions To Candidate: REAS - SPL - 2019 TEST ID - RAB-87148324
Jitendra Gautam
No ratings yet
Experiment No: 1 Objective:: Force Area
Document3 pages
Experiment No: 1 Objective:: Force Area
SAIF
No ratings yet
Commercial Dispatch Eedition 12-21-15
Document16 pages
Commercial Dispatch Eedition 12-21-15
The Dispatch
No ratings yet
3.3.7 Practice Mapping Cold War Alliances
Document2 pages
3.3.7 Practice Mapping Cold War Alliances
sweensoft
No ratings yet
CHALLENGES IN TRANSLATING CHINESE CLASSICAL NOVEL 《魔道祖师》 (MODAOZUSHI) INTO ENGLISH
Document16 pages
CHALLENGES IN TRANSLATING CHINESE CLASSICAL NOVEL 《魔道祖师》 (MODAOZUSHI) INTO ENGLISH
tang qiao er
No ratings yet
Hemodialysis and Peritoneal Dialysis - Patients' Assessment of Their Satisfaction With Therapy and The Impact of The Therapy On Their Lives
Document6 pages
Hemodialysis and Peritoneal Dialysis - Patients' Assessment of Their Satisfaction With Therapy and The Impact of The Therapy On Their Lives
maria teresa pertuz rondon
No ratings yet
Admas University: Faculty of Business
Document5 pages
Admas University: Faculty of Business
eyob negash
No ratings yet
The Most Amazing Structure On Earth
Document2 pages
The Most Amazing Structure On Earth
Strafalogea Serban
No ratings yet
Fiber Optic Cable Outdoor,: Bellcomms Bellcomms
Document1 page
Fiber Optic Cable Outdoor,: Bellcomms Bellcomms
Chawengsak Choomuang
No ratings yet
Fire Cable EL Sewedy
Document3 pages
Fire Cable EL Sewedy
Motaz Ali
No ratings yet
EF3e Elem Quicktest 01
Document3 pages
EF3e Elem Quicktest 01
kujtim78
No ratings yet
Microchips Small and Demanded
Document4 pages
Microchips Small and Demanded
xellosdex
No ratings yet
Travel Hacking Airlines
Document47 pages
Travel Hacking Airlines
Ben Hughes
100% (2)
4471242v6 1rq National Collective Agreement 2007-2009 Wage Attachment 2016
Document15 pages
4471242v6 1rq National Collective Agreement 2007-2009 Wage Attachment 2016
Aditya
No ratings yet
Al., G.R. No. 209271, December 8, 2015
Document116 pages
Al., G.R. No. 209271, December 8, 2015
Martin S
No ratings yet
China Banking Corp vs. CA - GR No.140687
Document13 pages
China Banking Corp vs. CA - GR No.140687
Leizl A. Villapando
No ratings yet
IMRAD Format
Document3 pages
IMRAD Format
En Ash
100% (1)