
Department of Biostatistics

How to create a box and whiskers plot in EXCEL

A typical box and whiskers plot:



Step 1: You should have a continuous outcome, plus one or more groups. In this example, we have two
test scores and we want to examine these test score distributions. Here is a small subset of the data so
you can see what it should look like in EXCEL. Assume that our data is in rows 2 to 51 (n=50 respondents
in total), and we have two types of test scores – score1 and score2.

A B C

ID score1 score2
1 235 163
2 304 148
3 283 155
4 280 148
5 245 132
6 221 138
7 221 155
8 231 150
… …. ….

Step 2: You need to compute some summary statistics before graphing (EXCEL does not make generating this graph easy).

Summary statistic    EXCEL function
Minimum              =MIN(B2:B51)
25th percentile      =QUARTILE(B2:B51,1)
Median               =QUARTILE(B2:B51,2)
75th percentile      =QUARTILE(B2:B51,3)
Maximum              =MAX(B2:B51)

In the QUARTILE function, quartile '1' is the 25th percentile, '2' is the 50th percentile (median), and '3' is the 75th percentile. B2:B51 is the range of data you want to compute these summaries for. Since our data for score1 is in column B in rows 2 to 51, this is the range we specify.

NOTE – TO GENERATE A BOX PLOT FOR BOTH TEST SCORES, YOU NEED TO DO THIS TWICE: ONCE FOR
SCORE1 (COLUMN B) AND ONCE FOR SCORE2 (COLUMN C).
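
As a cross-check outside EXCEL, the same five summaries can be sketched in Python. This is only a minimal sketch and not part of the EXCEL workflow: the function name five_number_summary is hypothetical, and it assumes that statistics.quantiles with method="inclusive" interpolates the same way as the QUARTILE function. It uses only the eight score1 values listed in Step 1, so its results will differ from those for the full n=50 data.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # n=4 splits the data into quartiles; "inclusive" treats the data
    # minimum and maximum as the 0th and 100th percentiles.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# The eight score1 values visible in Step 1 (a subset, not all 50 rows).
score1_subset = [235, 304, 283, 280, 245, 221, 221, 231]
print(five_number_summary(score1_subset))
```

Running the same function on the score2 column gives the second set of summaries, mirroring the note above about repeating the step for each score.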

Step 3: Now you need to compute a few more numbers based on those summary statistics (you just
need to subtract a few quantities) so we can get the box and whiskers to graph properly.

                           SCORE1    SCORE2
MIN                        210       126
25TH PERCENTILE (Q1)       241.25    139.5
50TH PERCENTILE (MEDIAN)   303       149
75TH PERCENTILE (Q3)       327       157
MAX                        415       170

Q1 - MIN                   31.25     13.5
25TH PERCENTILE (Q1)       241.25    139.5
MEDIAN - Q1                61.75     9.5
Q3 - MEDIAN                24        8
MAX - Q3                   88        13
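
The subtractions in Step 3 can be written out as a short Python sketch. The function name stacked_segments and the dictionary keys are hypothetical labels chosen for illustration; the input numbers are the SCORE1 and SCORE2 summary statistics from the table above.

```python
def stacked_segments(minimum, q1, median, q3, maximum):
    """Return the values Step 3 derives from the five summary statistics."""
    return {
        "Q1 - MIN": q1 - minimum,       # length of the lower whisker
        "Q1": q1,                       # base column (zero to Q1)
        "MEDIAN - Q1": median - q1,     # lower half of the box
        "Q3 - MEDIAN": q3 - median,     # upper half of the box
        "MAX - Q3": maximum - q3,       # length of the upper whisker
    }


score1 = stacked_segments(210, 241.25, 303, 327, 415)
score2 = stacked_segments(126, 139.5, 149, 157, 170)
print(score1)
print(score2)
```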

Step 4: Select the Q1 (25th percentile), MEDIAN - Q1, and Q3 - MEDIAN rows as your data to plot, then insert a stacked 2D column chart.

Note: With this stacked column chart, you can see the start of the box plot. The top of the green
box represents the 75th percentile, while the bottom of the red box represents the 25th percentile. The
blue box goes from zero to the 25th percentile. We now need to remove some of the filled-in areas and
add the whiskers!

Step 5: Get rid of the FILL and BORDER color for the blue boxes by clicking on that box and selecting
those options (NO FILL, NO BORDER). Now you can see the box plot looks like it should – with the
middle line representing the median, the top of the box representing the 75th percentile, and the
bottom of the box representing the 25th percentile.
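
To see why the stacked columns land on the right percentiles, here is a minimal Python sketch of the stacking arithmetic. The function name box_edges is hypothetical; the inputs are the SCORE1 and SCORE2 values for Q1, MEDIAN - Q1, and Q3 - MEDIAN from Step 3.

```python
from itertools import accumulate


def box_edges(q1, median_minus_q1, q3_minus_median):
    """Return the heights where each stacked segment ends."""
    # Stacking places each segment on top of the previous one, so the
    # running total gives the top edge of every segment.
    return list(accumulate([q1, median_minus_q1, q3_minus_median]))


print(box_edges(241.25, 61.75, 24))  # SCORE1
print(box_edges(139.5, 9.5, 8))      # SCORE2
```

The three running totals are the 25th percentile, the median, and the 75th percentile, which is why hiding the bottom segment leaves a correctly positioned box.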

Step 6: Add the error bars, remove gridlines, and change colors

Adding the error bars is a bit tricky. In CHART TOOLS -> LAYOUT there is an option for ERROR BARS.
Select this and then select MORE ERROR BAR OPTIONS. We want the whiskers to represent the
maximum and the minimum values, so we will use the computed numbers MAX - 75th percentile and
25th percentile - MIN, respectively.

To add the lower error bar, make sure that you have clicked in SERIES 1, the now
invisible box (recall we just removed the blue color from this box) that spans from zero to the
25th percentile. Once it is selected, add a CUSTOM VALUE for the negative error value. That
custom value is equal to your computed 25th percentile - MIN value.

Screenshot callouts: the selected series is the invisible blue box. The negative error value is equal to
your two 25th percentile - MIN values for score1 and score2.

Now you have your first whisker – notice that it extends all the way from the 25th percentile to the
minimum observed value. CHECK THIS in your data by looking at your computed summary statistics.

Do the same steps to get the upper whiskers, except now you want to be in the top box corresponding
to the area between the median and 75th percentile, and you want to specify a custom error bar in the
PLUS direction equal to your computed MAX - 75th percentile value.
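
The check on the whiskers can also be sketched in Python. The function name whisker_ends is hypothetical; the sketch assumes the lower error bar starts at the 25th percentile and the upper error bar starts at the 75th percentile, as described in this step, and uses the SCORE1 and SCORE2 values from Step 3.

```python
def whisker_ends(q1, q3, q1_minus_min, max_minus_q3):
    """Return (lower_end, upper_end) of the two whiskers."""
    lower_end = q1 - q1_minus_min   # minus-direction error bar from Q1
    upper_end = q3 + max_minus_q3   # plus-direction error bar from Q3
    return lower_end, upper_end


print(whisker_ends(241.25, 327, 31.25, 88))  # SCORE1
print(whisker_ends(139.5, 157, 13.5, 13))    # SCORE2
```

If the custom error values were entered correctly, the two ends equal the MIN and MAX in your summary statistics.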

Step 7: Change colors as desired, remove gridlines, and add axis titles and labels.

[Chart: box and whiskers plot. Y-axis: Test Scores, 0 to 450. X-axis: Test 1, Test 2.]

Figure 1: Box and Whiskers Plot of Test scores observed for Test version 1 and
version 2 (n=50 participants). The upper, mid, and lower box edges represent the
75th, 50th, and 25th percentiles, respectively. The whiskers represent maximum
and minimum observed test scores.
