Data Analytics End Term QP
Data Analytics
End Term Question Paper
Overview of Data:
o Principal Investigator(s): United States Department of Health and
Human Services. Centers for Medicare and Medicaid Services
o Summary: This dataset obtained from the ICPSR website contains results of the
Medicare Health Outcomes Survey (MHOS), a survey intended to provide data
for investigation of health outcomes in older adults. The MHOS is administered
annually to 1200 randomly selected Medicare beneficiaries per plan in regions
throughout the US. The self-administered surveys are sent by mail, with
telephone follow-up. The MHOS includes the SF-36, an instrument to assess
eight domains of mental and physical health-related quality of life, as well as
questions regarding chronic medical conditions, smoking status, height and
weight, depression, activities of daily living, and fall risk. Demographic and
clinical variables, including age, race, gender, ethnicity, education, marital
status, geographic area, and comorbid conditions are self-reported on the
MHOS. The MHOS is longitudinal, with the same survey administered at
baseline and two-year follow-up. The MHOS has been administered since 1998,
with data from twelve cohorts (1998-2011) currently available for analysis. Self-
reported demographic information is verified in the Medicare Enrollment
Database. Comorbid conditions self-reported in the MHOS include
hypertension, diabetes, angina or coronary artery disease, heart failure (HF), stroke, history of
myocardial infarction, asthma, COPD, arthritis, depression, cancer, and others.
o Note: To limit the size of the dataset, the dataset uploaded for this case study
includes only baseline data for 1000 participants in cohort 4 (2001-2003).
QUESTION: List and answer a set of 3-5 questions with clear healthcare
applications that might be addressed, or at least examined, using the dataset.
Potential questions that may be addressed using this data include, but are not limited to, the
following (you may add or substitute your own questions as well):
1. Among Medicare recipients in the US, which chronic conditions are most strongly
associated with fall risk (loss of balance) and/or impaired activities of daily living?
2. Among Medicare recipients in the US, which chronic conditions are associated with the
greatest decline in HRQL over time? (i.e., during the two-year period between baseline
MHOS and follow-up MHOS).
3. Among Medicare recipients in the US, how do chronic conditions such as heart failure
impact physical and mental quality of life over time? (What changes in HRQL occur
during the two-year period between baseline MHOS and follow-up MHOS?).
4. Among Medicare recipients in the US, how does prevalence of COPD vary by race or
gender? (Or geographic region?)
5. Do patient health outcomes vary among different Medicare managed care plans
(incidence of chronic disease, impaired ADLs, etc.)?
6. BONUS QUESTION (If even one student attempts this, no one will be happier than
me. Those who do not attempt it will not be penalized. Those who do attempt it will be
“praised”.): Can you run some kind of dimensionality reduction algorithm (psst, I
am talking about PCA) to find out which factors contribute least to the
general Quality of Life?
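As a starting point for the bonus question, the following is a minimal sketch of PCA on standardized variables, assuming NumPy is available. The data are synthetic and the column names are hypothetical stand-ins, not actual MHOS variable names. Ranking features by variance-weighted squared loadings is one reasonable heuristic for "least contribution", not the only one.

```python
import numpy as np

def pca_loadings(X):
    """Run PCA on the correlation matrix of X (n_samples x n_features).

    Returns the explained-variance ratio of each component and the matrix
    of loadings (one column per component), both sorted by descending variance.
    """
    X = np.asarray(X, dtype=float)
    # Standardize columns so every variable is on the same scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs

def feature_contributions(ratios, eigvecs, n_components):
    """Variance-weighted squared loadings of each feature on the kept components."""
    return (eigvecs[:, :n_components] ** 2) @ ratios[:n_components]

# Synthetic illustration only: three correlated "domain" scores plus one
# unrelated item. Names are hypothetical, not MHOS column names.
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
names = ["physical_functioning", "bodily_pain", "general_health", "unrelated_item"]
X = np.column_stack(
    [latent + 0.3 * rng.normal(size=500) for _ in range(3)]
    + [rng.normal(size=500)]
)

ratios, loadings = pca_loadings(X)
contrib = feature_contributions(ratios, loadings, n_components=1)
least = names[int(np.argmin(contrib))]
print("Explained variance ratios:", np.round(ratios, 3))
print("Feature contributing least to the first component:", least)
```

On the real dataset, missing values would need to be handled before standardizing, and the number of retained components should be chosen from the explained-variance ratios rather than fixed at one.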
Meta-data:
MHOS participants fill out a paper survey that is mailed to them, with follow-up by phone for non-
respondents. From 1998 to 2006, the MHOS survey used the RAND Short Form-36 (SF-36) to measure
HRQL, along with self-reported demographic and clinical data.
Measures of HRQL to be derived from the surveys (Table 1) include the Physical Component Score
(PCS), Mental Component Score (MCS), and 8 domain subscale scores: Physical Functioning, Role-
Physical, Bodily Pain, General Health, Vitality, Social Functioning, Mental Health, and Role-emotional
(Table 2). The SF-36 is a valid and reliable measure of HRQL that has been validated in numerous
populations, including cancer, heart failure, and healthy populations (Turner-Bowker, 2002). All scales
within the SF-36 are reported as standardized T-scores on a 0 to 100 scale (mean=50, SD=10), with published US
population norms available. A score below 50 can be interpreted as below average, and a score above
50 as above average.
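The norm-based scale described above can be illustrated with the standard T-score transformation, T = 50 + 10 × (raw − norm mean) / norm SD. The sketch below is illustrative only: the function names are hypothetical, and the reference mean and SD in the usage lines are made-up numbers, not the published US norms.

```python
def to_t_score(raw, norm_mean, norm_sd):
    """Convert a raw scale score to a T-score (mean 50, SD 10)
    relative to a reference population."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd
    return 50.0 + 10.0 * z

def interpret(t_score):
    """Label a T-score relative to the reference-population average of 50."""
    if t_score < 50:
        return "below average"
    if t_score > 50:
        return "above average"
    return "average"

# Made-up reference values for illustration (NOT published SF-36 norms).
example = to_t_score(raw=70, norm_mean=80, norm_sd=20)
print(example, interpret(example))
```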
Turner-Bowker DM, Bartley PJ, Ware JE. SF-36® Health Survey & “SF” Bibliography: Third Edition
(1988-2000). Lincoln, RI: Quality-Metric, Inc.; 2002.
Demographic variables:
o Age (less than 65, 65 to 74, or greater than 74)
o Race
o Gender
o Marital status
o Education level (less than high school/GED, high school/GED, or greater than high school/GED)
Self-reported chronic conditions (the survey uses non-medical terminology for these conditions,
e.g. hypertension is “high blood pressure”; myocardial infarction is “heart attack”):
o Hearing or vision loss
o Urinary incontinence
o Hypertension
o Angina or coronary artery disease (CAD)
o Congestive heart failure (CHF)
o Acute myocardial infarction
o Other cardiac conditions
o Stroke
o Emphysema, asthma, or COPD
o GI problems
o Arthritis
o Sciatica
o Diabetes
o History of any cancer (other than skin cancer)
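With demographic groups and 0/1 condition flags like those listed above, questions such as prevalence by gender or the association between a condition and fall risk reduce to simple counts. The following is a minimal sketch using only the Python standard library. The records are synthetic and the field names (`gender`, `copd`, `balance_problem`) are hypothetical, not actual MHOS column names.

```python
from collections import Counter

# Synthetic records for illustration; field names are hypothetical.
records = [
    {"gender": "F", "copd": 1, "balance_problem": 1},
    {"gender": "F", "copd": 1, "balance_problem": 1},
    {"gender": "F", "copd": 0, "balance_problem": 0},
    {"gender": "F", "copd": 0, "balance_problem": 1},
    {"gender": "M", "copd": 1, "balance_problem": 1},
    {"gender": "M", "copd": 0, "balance_problem": 0},
    {"gender": "M", "copd": 0, "balance_problem": 0},
    {"gender": "M", "copd": 1, "balance_problem": 0},
    {"gender": "M", "copd": 0, "balance_problem": 0},
]

def prevalence_by_group(rows, group_key, condition_key):
    """Share of rows in each group that report the condition (coded 0/1)."""
    totals, cases = Counter(), Counter()
    for row in rows:
        totals[row[group_key]] += 1
        cases[row[group_key]] += row[condition_key]
    return {group: cases[group] / totals[group] for group in totals}

def odds_ratio(rows, exposure_key, outcome_key):
    """Odds ratio from the 2x2 table of exposure (0/1) versus outcome (0/1)."""
    a = b = c = d = 0
    for row in rows:
        exposed, outcome = row[exposure_key], row[outcome_key]
        if exposed and outcome:
            a += 1
        elif exposed:
            b += 1
        elif outcome:
            c += 1
        else:
            d += 1
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is empty.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)

print(prevalence_by_group(records, "gender", "copd"))
print(odds_ratio(records, "copd", "balance_problem"))
```

An odds ratio above 1 suggests the outcome is more common among those reporting the condition. On the real sample, a confidence interval or a chi-square test would be needed before drawing conclusions.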
Fleishman JA, Selim AJ, Kazis LE. Deriving SF-12v2 physical and mental health summary scores: a
comparison of different scoring algorithms. Qual Life Res. 2010; 19:231-241.
Citation:
United States Department of Health and Human Services. Centers for Medicare and Medicaid
Services. Medicare Health Outcomes Survey (HOS), 1998-2012. ICPSR23380-v2. Ann Arbor,
MI: Inter-university Consortium for Political and Social Research [distributor], 2014-11-13.
http://doi.org/10.3886/ICPSR23380.v2