Bio 55/155

Lab Objectives
• Develop a better understanding of “randomness” and probability
• Collect data, enter raw data into Excel, and make a histogram in Excel
• Label graphs and write figure captions

Part 1: Randomness and Probability

Statistical hypothesis testing (inferential statistics) works by either computing or estimating
the probability or likelihood (p-value) that the results obtained in a particular experiment
were due to chance alone rather than due to a real effect, difference, or relationship. As
such, understanding probability is essential. In this lab, you will explore the nature of
probability and randomness using something you should be familiar with: coin flips.

First of all, let’s review some basic facts about probability. Read over your lecture notes on

Decimal places/rounding for probability in this course: When you calculate a probability,
if the answer divides out to a “round” number and doesn’t need to be rounded, such as 0.5 or
0.78, include all decimal places. If the answer does not divide out to a “round” number, round
the answer to 3 decimal places following any leading zeros, e.g. 0.565, 0.0531, 0.704, 0.0704.
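
The rounding rule above can be sketched in Python as follows. This is only an illustration: the helper name round_probability is hypothetical, and it reads "3 decimal places following any leading zeros" as three significant figures, which is an assumption about the intended rule.

```python
import math

def round_probability(p, digits=3):
    """Round p to `digits` places after any leading zeros (hypothetical helper)."""
    if p == 0:
        return 0.0
    # Position of the first non-zero digit: 0.0531 -> -2, 0.565 -> -1
    exponent = math.floor(math.log10(abs(p)))
    return round(p, digits - 1 - exponent)

print(round_probability(1 / 2))      # a "round" number is left unchanged
print(round_probability(17 / 320))   # 0.053125 is rounded after its leading zero
```

Because the result is an ordinary number, a value such as 0.5 prints without padded zeros, which matches the "include all decimal places" case.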

If you flip a coin once, what is the probability that it will show heads?

If you flip a coin three times, what is the probability that it will show heads three
times in a row?

If you flip a coin 49 times and it comes up heads every time, what is the probability that it
will be heads on the 50th coin flip? Hint: if you find yourself reaching for a calculator,
rethink your strategy.

Now, let’s work on developing a good understanding of randomness. We know that every
time you flip a coin, there is a random, 50% chance (probability = 0.5) that it shows heads
(or tails), but what does random really mean?

Flip a coin 100 times (it’s a lot of flipping, but it’s usually better to have more data)
and record the results in the table.
Data Sheet for the Coin Flip Lab
Flip Face Flip Face Flip Face Flip Face

1 26 51 76
2 27 52 77
3 28 53 78
4 29 54 79
5 30 55 80
6 31 56 81
7 32 57 82
8 33 58 83
9 34 59 84
10 35 60 85
11 36 61 86
12 37 62 87
13 38 63 88
14 39 64 89
15 40 65 90
16 41 66 91
17 42 67 92
18 43 68 93
19 44 69 94
20 45 70 95
21 46 71 96
22 47 72 97
23 48 73 98
24 49 74 99
25 50 75 100

Count up the total number of heads and tails and record them below.

Number of heads: __________ = __________%

Number of tails: __________ = __________%

Hopefully, each percentage was close to 50%.

Take a look at the list. Does it contain a “run,” or sequence, of six heads or six tails in
a row? Does it contain even longer consecutive runs?

Let’s count the number of consecutive, same-side coin flips and record them in Table
1. We are recording run lengths. A run length of 5 means you observed H H H H H T or T
T T T T H. An easy way to determine the run length is to draw a line between any two
consecutive flips that were different, and then count the number of flips within each pair of
lines. For example:

H H / T / H H H / T

2 / 1 / 3 / 1

Count the total number of run lengths of 1, 2, etc. and put those numbers in Table 1
below in the Observed Number column.
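
If your flips are typed into a computer, the same tally can be sketched in Python. This is a minimal illustration, not part of the lab procedure; the function name run_lengths is hypothetical, and the input is the short example sequence shown above.

```python
from collections import Counter
from itertools import groupby

def run_lengths(flips):
    """Return the length of each run of identical consecutive flips, in order."""
    return [len(list(group)) for _, group in groupby(flips)]

flips = "HHTHHHT"                 # the example sequence H H / T / H H H / T
lengths = run_lengths(flips)      # one entry per run, like the numbers between the lines
observed = Counter(lengths)       # how many runs of each length were seen

print(lengths)
for length in sorted(observed):
    print(length, observed[length])
```

Here groupby plays the role of the lines drawn between different consecutive flips, and the Counter gives the values for the Observed Number column.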

Table 1. Expected numbers of runs of lengths 1-10 and the observed number of these runs
in the 100 coin flips.
Run Length Probability Expected Number Observed Number
1 0.5 25
2 0.25 12.5
3 0.125 6.25
4 0.0625 3.125
5 0.03125 1.563
6 0.01563 0.782
7 0.00781 0.391
8 0.00391 0.196
9 0.00195 0.098
10 0.00098 0.049
You’ll notice there are two additional columns that have been filled in for you. The
“Probability” column indicates the probability of getting a consecutive run of the specified
length. “Expected Number” is the probability times 50 (to be explained below).

To understand where these numbers come from, imagine that you flipped a coin and it
landed on “heads.” The probability of a consecutive run of length=1 is just the probability of
it not coming up “heads” on the second flip (0.5). To get a consecutive run of length=2, you
need to flip “heads” on the second flip and “tails” on the third flip, i.e. 1 x 0.5 x 0.5 = 0.25. A
pattern should emerge, but make sure you understand how to calculate the probability.
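
The pattern can be checked with a short Python sketch. It assumes a fair coin, and the function name run_probability is hypothetical: a run of length k needs the same face k - 1 more times after the first flip, followed by one flip of the other face.

```python
def run_probability(k, p_same=0.5):
    """Probability that a run has length exactly k.

    The run continues k - 1 times (each with probability p_same)
    and then ends on a different face (probability 1 - p_same).
    """
    return p_same ** (k - 1) * (1 - p_same)

for k in range(1, 11):
    print(k, run_probability(k))
```

The printed values are unrounded, so they may show more decimal places than the Probability column of Table 1.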

Based on these probabilities, on average how many strings of 6 consecutive coin flips
coming up on the same side should you have seen in your data sets? To find this out, your
first instinct may be to multiply the probability you just calculated by 100 (the number of
coin flips). This would be a mistake. Think about what this probability tells you. P[run
length of 1] = 0.5, means that half of all consecutive runs should have a run length of 1. If
you multiplied this probability by 100 coin flips, you might think that 50 out of 100 coins
would be in a run length of 1, but think about what happens with a consecutive run of
6…how many coins are in a single consecutive run of 6 (hint: it’s 6 and I hope you knew
that). If you think about it, the probability has nothing to do with the number of coin flips
and everything to do with the total number of consecutive runs.

To find out how many consecutive runs occur, on average, in a sequence of 100 coin flips,
you need to use something called a Markov Chain and some calculus. In short, you won’t be
doing it for this lab but the numbers in Table 1 are based on this calculus. If you had an
infinite number of data sets, the average number of consecutive runs would be
approximately 50. The “Expected” column in the table above has been filled in by
multiplying each probability by 50. Before you ask, it’s not a coincidence that this number is
# total events / # possible outcomes, but the formula is more complicated than that.
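
One way to convince yourself of the "approximately 50" figure without the calculus is a simulation. The sketch below is only an illustration under stated assumptions: a fair coin, 2000 simulated data sets, and hypothetical function names count_runs and average_runs.

```python
import random
from itertools import groupby

def count_runs(flips):
    """Number of runs (maximal blocks of identical consecutive flips)."""
    return sum(1 for _ in groupby(flips))

def average_runs(n_flips=100, n_trials=2000, seed=1):
    """Estimate the mean number of runs in n_flips fair coin flips by simulation."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = 0
    for _ in range(n_trials):
        flips = [rng.choice("HT") for _ in range(n_flips)]
        total += count_runs(flips)
    return total / n_trials

print(average_runs())  # should land near 50

# Expected Number column: probability of each run length times 50
expected = {k: 0.5 ** k * 50 for k in range(1, 11)}
print(expected[1], expected[2], expected[3])
```

The simulated average will vary slightly with the seed and the number of trials, which is itself a small demonstration of randomness.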

Part 2: Entering and Graphing Results in Excel

How different were your results in Table 1
from the expected distribution? While you
can eyeball it using the table, let’s look at this
graphically. Start Excel and start a new Blank
Workbook. Type the coin flip data from your
Table 1 above into the Excel sheet so that it
looks like the data set to the left, but note that
the data on the left are for flank scars and not
coin flips. Create the following columns
in your sheet: Run Length, Expected
Number, Observed Number.
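
If you would rather not type the sheet by hand, the three columns could also be generated as comma-separated text that Excel can open. This is an optional sketch, not the lab's required method; the function name table_rows is hypothetical, and the observed counts used here come from the short example sequence H H T H H H T rather than from real lab data.

```python
import csv
import io

def table_rows(observed, n_runs=50, max_length=10):
    """Build rows of Run Length, Expected Number, Observed Number."""
    rows = [["Run Length", "Expected Number", "Observed Number"]]
    for k in range(1, max_length + 1):
        expected = 0.5 ** k * n_runs          # probability of run length k times 50
        rows.append([k, expected, observed.get(k, 0)])
    return rows

# Counts from the example sequence H H T H H H T (illustration only)
observed = {1: 2, 2: 1, 3: 1}

buffer = io.StringIO()
csv.writer(buffer).writerows(table_rows(observed))
print(buffer.getvalue())
```

Replace the observed dictionary with your own counts from Table 1 before using the output.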

Next, we will make a histogram or frequency graph. Note: To use the following
instructions, DO NOT highlight any of the data. These instructions will teach you how to
manually select the data values you want. This is the method for getting Excel to make the
correct graph.

1. From the top menu tabs, click the Insert tab.

2. Click the small arrow next to the graph
button and then choose the 2D Column
option in the window that pops up. Choose
the first option (Clustered column) in this
menu. A blank graph will appear on your
screen and the Design menu tab will show.

3. Click on the Select Data button. A new
window will pop up, titled Select Data
Source (below). This window allows you to
designate series, input data, and use a
column in your spreadsheet as labels for the
X-axis if needed.

4. In the Select Data Source window, click the Add button under Legend Entries
(Series). For this graph, you will add two series: Expected Number, Observed
Number. We add Expected Number first, as follows.

5. Click in the box under Series name, and
then in your spreadsheet click only the cell
that says “Expected Number.”
6. Next to Series values, delete ={1}. Then,
highlight all of the data points (like below) in
the Expected Number series (don’t highlight
“Expected Number”). The result is in the
picture on the left.

7. Click on OK.

8. In the Select Data Source window, again click on Add and then use steps 5, 6,
and 7 to add the series name and series values (data points) for Observed
Number. Then click on OK to see the resulting graph.

9. Check the labeling of the ticks on the X-axis of your graph. This one looks correct
(labeled 1-10).

The resulting graph is a little…naked. A good graph in science must pass a basic litmus
test… Imagine that you printed out this graph and dropped it on the ground on your way
home. If another biology student found it, how much information could they get out of it?
Ideally, a graph and its caption should be sufficiently detailed that, if you dropped it, the
finder would be able to understand exactly what results the graph was showing. In this
case, your figure needs the following details: axis labels, a legend designating what
series each color corresponds to and a … caption! You thought I was going to say title
didn’t you? Titles are for sixth grade science fair projects, not professional graphs! Flip
open a journal and study the figures. You’d be hard pressed to find one with a title.

10. To add the missing elements, you will
need to use the “Add Chart Element” (+
button) that appears to the right of
your chart when it is selected. You can
also use the Add Chart Element button
in the menu on the top of your sheet.
Check the boxes for Axis Titles and

11. Edit the axis labels. The X-axis label should be “Consecutive Same-side Coin-Flips” and
the Y-axis should be “Frequency.”

12. The legend indicates what series each bar
color corresponds to. You will get an option
to decide where to put it, but most journals
would ask you to add it on the right by

Much better, but you really need a caption
for the reader to figure out what is going on.
Examine the figure to the left. You’ve
probably been wondering what it’s about.
What organism is this? Why are there scars on their flanks? The axis label added another
mystery: why is it only the right flank? A good caption would clarify things greatly.

Unfortunately, Excel doesn’t have a good way to add captions. We will use Microsoft
Word instead to add a caption.

Part 3: Adding a Caption in Microsoft Word

1. In Excel, make sure your graph is selected and then copy (ctrl+c) your entire graph area
to the clipboard (if you’re not familiar with the term clipboard, this is the magical virtual
place where all your copied information goes until you copy something else).

2. Start Word and open a new, blank document. You could simply paste (ctrl+v), but the
default behavior is to paste it as an active, editable figure. This can be really useful if you
will be making changes to the figure … but it can also be really frustrating when your
figure gets changed converting between different versions of Office. Instead, let’s paste
the figure in as a PNG, or Portable Network Graphics file. You’ve likely worked with JPEGs
most of the time. Unlike JPEG, PNG files do not lose information when compressed.
You’ve probably noticed how hard it is to resize a JPEG without making it
blurry. PNGs are easier to work with in that sense.

From the Home tab, click the “Paste” drop-down menu as
shown to the left. Select “Paste Special” and then “Picture

3. Resize the figure so it fills up as much space on the
page as possible, without going past the margins.

4. Next, type the caption. For a figure (image or graph), the caption should be placed
underneath the figure. For a table, the caption (header) should be positioned above
the table. Captions always begin with a figure or table number. This figure is your only
figure, so “Figure 1” will work. The caption also states what is depicted in the graph
rather than interpreting or reporting the results obtained. Review the figure and
caption below to see what’s in an effective figure and caption, and then type your

5. Save the Word file with the name LastName_ProbabilityHistogram.

Figure 1. Scars left by scale-eating cichlids on the right flanks of male and female
Pseudotropheus zebra. Examination of 204 specimens collected from Lake Malawi, Africa in
July of 2009.

The explanation of results, such as "Males showed more scars on the right flank than
females, indicating a preference for attacking males among “left handed” scale eaters."
usually goes in the text of the Results section, rather than in the graph caption.

Before leaving the lab, make sure you complete the review questions on Canvas to
earn points! You will upload your Word file with your figure as part of the Review

