
Basics and Beyond: Displaying Your Data

Mario Davidson, PhD


Vanderbilt University School of Medicine
Department of Biostatistics
Instructor
Objectives
1. Understand the types of data and levels of measurement
2. Understand how a Table 1 typically looks
3. Be able to interpret all of the basic graphs
4. Know the types of displays that may be used, depending on the type of data and level of measurement
5. Be introduced to less familiar displays of the data
Types of Data (Obj1)
Qualitative Data


Consist of attributes, labels, or non-numerical entries.

If you can’t perform mathematical operations or order data,
it’s qualitative.

Ex: Colors in a box of crayons; names; county
Quantitative Data


Consist of numerical measurements or counts.

Ordering is a dead giveaway

Ex: BMI; age; numerical grade
Levels of Measurement (Obj1)


Nominal

Qualitative

Categorized using names, qualities, or labels

Ex: Top 5 movies, jersey numbers, type of drug

Ordinal

Quantitative or Qualitative

Can order

Differences between data are not meaningful.

Ex: Letter grade, Likert scale such as very
dissatisfied to very satisfied
Levels of Measurement (Obj1)

Interval Level of Measurement



Quantitative

Can order

Can calculate meaningful differences

No value means “nothing/none.” A zero entry merely represents a position on a scale (i.e., there is no inherent zero).

Ex: Time of day, temperature
Ratio Level of Measurement


Quantitative

Can order

Can calculate meaningful differences

There’s a value that means “nothing/none.”

Ex: Age, weight, test score
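The four levels build on one another, which can be sketched in a few lines of Python. This is only an illustration: the names `Level` and `allowed_operations` are hypothetical, and "compute ratios" is the standard textbook consequence of having a true zero rather than something stated on the slides.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels of measurement, ordered from least to most structure."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def allowed_operations(level):
    """List the operations that are meaningful at a given level."""
    ops = ["count categories"]
    if level >= Level.ORDINAL:
        ops.append("order values")
    if level >= Level.INTERVAL:
        ops.append("compute differences")
    if level >= Level.RATIO:
        ops.append("compute ratios")  # requires a true zero
    return ops


# Examples taken from the slides above
examples = {
    "type of drug": Level.NOMINAL,
    "letter grade": Level.ORDINAL,
    "temperature": Level.INTERVAL,
    "age": Level.RATIO,
}

for name, level in examples.items():
    print(f"{name:13s} {level.name:9s} {', '.join(allowed_operations(level))}")
```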
Popular Displays
Description of Table 1 (Obj2)
• Typically summarizes baseline characteristics of the data.
• Compares statistics between groups.
• May provide means, medians, confidence intervals, percentiles, percentages, p-values, standard deviations, etc.
• Summaries of all types of data (e.g. continuous, categorical, nominal, ordinal, interval, ratio) may be used.
• Likert scale: Scale indicating degree of agreement (e.g. Rate the following statement: “I have had a difficult time focusing on my studies this semester”: SD D N A SA).
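As a rough sketch of how the entries of a Table 1 might be computed, the following uses only Python's standard `statistics` module. The subjects, group names, and the helper `table1_row` are all made up for illustration; a real Table 1 would usually come from a statistics package.

```python
import statistics

# Hypothetical baseline data: (group, age in years, sex)
subjects = [
    ("treatment", 34, "F"), ("treatment", 41, "M"),
    ("treatment", 29, "F"), ("treatment", 50, "M"),
    ("control", 38, "M"), ("control", 45, "F"),
    ("control", 31, "M"), ("control", 52, "M"),
]


def table1_row(rows):
    """Summarize one group: a continuous variable (age) and a categorical one (sex)."""
    ages = [age for _, age, _ in rows]
    n_female = sum(1 for _, _, sex in rows if sex == "F")
    return {
        "n": len(rows),
        "age_mean": statistics.mean(ages),
        "age_sd": statistics.stdev(ages),      # sample standard deviation
        "age_median": statistics.median(ages),
        "pct_female": 100 * n_female / len(rows),
    }


table1 = {
    group: table1_row([r for r in subjects if r[0] == group])
    for group in ("treatment", "control")
}

for group, row in table1.items():
    print(f"{group:10s} n={row['n']}  "
          f"age mean (SD) {row['age_mean']:.1f} ({row['age_sd']:.1f})  "
          f"median {row['age_median']}  female {row['pct_female']:.0f}%")
```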
Example of a Table 1 (Obj2)
Test Your Knowledge

Interpret the following graphs.


Test Your Knowledge

Interpret the following graphs.

Cherry or Apple Pies sold the most in January. “Other” pies sold the least.

Nearly 15 subjects chose Saturday as their favorite day. Sunday was the least chosen.
Pie Charts (Obj3)

Features (Obj4)
– Nominal or Ordinal
– Compares Levels of One
Characteristic

Advantages:

Easily Interpreted
• Larger Area; Greater
Proportion

Easy to Create

Disadvantages

Difficult to Judge Areas

Wastes Ink
Bar Plots (Obj3)

Features (Obj4)
– Nominal and
Ordinal
– Compares

Advantages

Same as Pie Chart

Disadvantages

Similar to Pie Chart

No such thing as an
Analyte 2.5

Ordering can Change
Perception
Test Your Knowledge

Interpret the following graphs


Test Your Knowledge

The most frequent BMI seems to be approximately 24-26.

There were 8 subjects weighing approximately 0 grams. There was only one weighing 10 grams.
Histograms (Obj3)
• Features
– Shows Distribution
– Continuous
– One Characteristic (Obj4)
• Advantages
– Easy to Interpret
– Easy to Produce
• Disadvantages
– Size of Bins can Change Perception
– Cannot Read Exact Values
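To see why the size of the bins can change perception, here is a minimal sketch that bins the same values twice with different widths. The BMI values and the helper `bin_counts` are hypothetical and only meant to illustrate the point.

```python
from collections import Counter


def bin_counts(values, width, start=0):
    """Count values per half-open bin [lo, lo + width), keyed by the bin's lower edge."""
    counts = Counter()
    for v in values:
        lo = start + ((v - start) // width) * width
        counts[lo] += 1
    return dict(sorted(counts.items()))


# Hypothetical BMI values (not the data shown on the slide)
bmi = [19, 21, 22, 24, 24, 25, 25, 26, 27, 29, 31, 34]

narrow = bin_counts(bmi, width=2, start=18)
wide = bin_counts(bmi, width=6, start=18)

for label, counts, width in (("width 2", narrow, 2), ("width 6", wide, 6)):
    print(label)
    for lo, n in counts.items():
        print(f"  {lo:2d}-{lo + width - 1:2d} | {'#' * n}")
```

With narrow bins the peak at 24-25 stands out and a gap appears at 32-33; with wide bins the same data look like a single broad hump.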
Dot Plot (Obj3)
• Features (Obj4)
– One Characteristic
– Ordinal
• Advantages
– Good for Small and Moderate Data
– Easily Interpreted
• Disadvantages
– May not be Best Option with Large Data
– Not Produced in all Packages
Stem and Leaf Plot (Obj3)
• Features (Obj4)
– One Characteristic
– Ordinal
• Advantages
– Useful with Small Data and May be Used with Large Data
– Can be Produced by Hand
– Easily Interpreted
– Useful with Numeric Data
• Disadvantages
– May be Difficult to Measure Center
– Not Appealing

The most frequent USMLE1 scores in our data were in the 220's, 230's, and 260's. The lowest and highest scores were 190 and 278, respectively.
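Because a stem and leaf plot can be produced by hand, it is also easy to sketch in code. The example below splits each value into a stem (all but the last digit) and a leaf (the last digit). The scores are hypothetical, not the USMLE data from the slide, and `stem_and_leaf` is an illustrative helper name.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: [leaves]} for non-negative integers, using the last digit as the leaf."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical scores
scores = [190, 204, 211, 218, 223, 225, 229, 231, 236, 238, 247, 262, 265, 278]

plot = stem_and_leaf(scores)

# Print every stem in range, including empty ones, so gaps stay visible
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>3} | {leaves}")
```

Reading a row such as `22 | 359` gives the values 223, 225, and 229, so the exact data remain recoverable, unlike with a histogram.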
Test Your Knowledge

Why is this graph difficult to interpret? What is the trend?

What is the trend? An outlier is a data point that lies at an unusual numerical distance from the rest. Can you find one?
Test Your Knowledge

There is no y-label. R is a statistical software package. From Jan-Dec, there is an upward trend.

Seems to be a slight positive trend: as age increases so does POMS. The arrows suggest 2 possible outliers.
Line Graph (Obj3)
• Features (Obj4)
– One Characteristic
– Used with Ordinal and Continuous
– Displays Associations, Trends, and Range
• Advantages
– Produced in Most Packages
Line Graph with Rugplot
Scatterplot (Obj3)
• Features (Obj4)
– Continuous and Ordinal
– Shows Associations
– Shows Trend
• Advantages
– Shows all of Data
– Produced in Most Packages (though not the Line)
– Exact values shown
– Easily Interpreted
• Disadvantage
– May not be Best Way for Large Data
Less Familiar Graphs
Boxplot (Obj3 and Obj5)
• Features
– Continuous by Nominal or Ordinal (Obj4)
– May Compare Groups
• Advantages
– Good Summary: Min, Q1, Q2 (median), Q3, Max
• Disadvantages
– Does not Display All the Data
– Not as Appealing
– Cannot be Created in All Packages
– May not be as Recognized by Some
Boxplot
• The median tooth length for orange juice at 1 dose of Vitamin C was roughly 25 units.
• The first quartile length for 1 dose of ascorbic acid was approx. 15.
• As Vitamin C doses increase, tooth length increases. Overall, it appears that those using orange juice had greater length given the same dose, possibly excluding a Vitamin C dose of two.
• There was an outlier for the ascorbic acid at dose 1.
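The five numbers a boxplot summarizes can be computed directly. The sketch below uses `statistics.quantiles` from the standard library (Python 3.8+) on hypothetical measurements. Flagging points beyond 1.5 times the interquartile range is a common plotting convention and an assumption here; the slides do not say which rule the software used, and packages also differ in how they compute quartiles.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3, and max (quartiles by the 'inclusive' method)."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": q2, "q3": q3, "max": data[-1]}


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3 (assumed convention)."""
    s = five_number_summary(values)
    iqr = s["q3"] - s["q1"]
    low, high = s["q1"] - k * iqr, s["q3"] + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements (not the tooth-length data shown on the slide)
lengths = [4, 7, 8, 9, 10, 11, 12, 13, 30]

summary = five_number_summary(lengths)
outliers = iqr_outliers(lengths)

print(summary)
print("outliers:", outliers)
```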
Boxplot Overlaid with Stripchart (Obj5)
• Features
– Same as Boxplot
• Advantages
– Same as Boxplot
– Can See All of the Data
• Disadvantage
– Many Programs Cannot Create
Dot Chart (Obj5)
• Features
– Nominal, Ordinal Characteristics with a Continuous Outcome (Obj4)
– Can Compare Levels and Groups
• Advantages
– Easily Interpreted
– Size of Data Irrelevant
• Disadvantage
– Not as Recognized as Bar Graphs and Pie Charts
Kaplan Meier Curve (Obj5)

Demonstrates the
probability of
survival

The plot suggests
that males have a
more favorable
rate of survival
over the years.

Can be created in
most programs
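The survival probabilities behind a Kaplan Meier curve are a running product: at each time with at least one event, the current probability is multiplied by (1 - events / number at risk). The following is a minimal sketch of that calculation. The follow-up times are hypothetical and `kaplan_meier` is an illustrative helper, not the routine used to draw the slide's figure.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    times:  follow-up time for each subject
    events: True if the event was observed, False if the subject was censored
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = censored = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t leave the risk set
        at_risk -= deaths + censored
    return curve


# Hypothetical follow-up times (years) and event indicators
times = [1, 2, 2, 3, 4, 5]
events = [True, True, False, True, False, True]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}  S(t)={s:.3f}")
```

Censored subjects never lower the curve themselves; they only shrink the number at risk for later time points.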
Probably Even Less Familiar
Graphs
Spaghetti Plot (Obj5)
● Alzheimer's Disease
● Verbal IQ – Words that could not be sounded out (e.g. Depot)
Spaghetti Plot
• Features (Obj4)
– Continuous, Longitudinal
– Two Characteristics
– Shows Trend
• Advantages
– Shows all of the Data
• Disadvantages
– Not Available in All Packages
– May be Difficult to Interpret

The overall trend suggests that as age increases so do earnings.

[Figure: spaghetti plot of Earnings against Age (yrs)]
Dendrogram: Cluster (Obj5)

Useful for
Determining
Clustering

May Help to
Remove
Variables (Data
Reduction)

PGY clustered
Clinical Year
Scatter Plot with Marginal
Histograms (Obj5)

Continuous

Visually appealing

Shows trends,
associations, and
the distributions of
the data

Cannot be created
in many programs
Large Data Sets
Sunflower Plot (Obj5)

Large data sets

The more ink used,
the more dense the
data

Ordinal

More fresh embryos were
transferred to the uterus
on day 3.
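A sunflower plot handles large, overlapping data by drawing one "petal" per observation at each location, which is why more ink means denser data. The counting step can be sketched as follows. The (day, embryos) pairs are hypothetical and are not the data behind the slide's figure.

```python
from collections import Counter

# Hypothetical (day of transfer, number of embryos transferred) pairs
observations = [(3, 2), (3, 2), (3, 2), (3, 3), (5, 1), (5, 2), (3, 3), (5, 2)]

# Number of petals to draw at each plotted location
petals = Counter(observations)

for (day, n_embryos), count in sorted(petals.items()):
    print(f"day {day}, {n_embryos} embryos: {'*' * count}")
```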
Heat Map (Obj5)

Encephalitis

Red

Proportion
of Presence

Green

Proportion of
Absence

White

Missing

Light/Dark

Intensity of
Presence of
Attribute
Heat Map

Similar to the
Hexagon Plot

Lightness or
Darkness
Indicates Intensity

May not be
Created in Some
Programs
Nomogram (Obj5)
• May Provide Risk, Probability, etc.
• Useful in Providing Predictive Scores
• Sum the “Points” for each category, find the “Total Points,” then look at the corresponding “Risk of Death.”
• A 40-year-old male with 200 Cholesterol and 170 BP has Approximately a 48% Risk of Death
Multidimensional Plot (Obj5)
http://data.vanderbilt.edu/rapache/bbplot/
Multidimensional Plot (Obj5)
Conclusion

Always try to think of the best way to display
your story (data).

Consider your target audience.

When publishing, color may cost.
References
Hamid, et al. BMC Infectious Diseases 2010, 10:364.
http://www.biomedcentral.com/1471-2334/10/364
Grober, E, Hall, CB, Lipton, RB, Zonderman, AB, Resnick, SM, and Kawas, C
(2009). Memory impairment, executive dysfunction, and intellectual decline
in preclinical Alzheimer's disease. Journal of the International
Neuropsychological Society, 14(2), 266-278.
http://data.vanderbilt.edu/rapache/bbplot/
