
CHAPTER FIVE: PRODUCING DATA

Objectives:
Distinguish between, and discuss the advantages of, observational studies and experiments.
Identify and give examples of different types of sampling methods, including a clear definition of a simple random sample.
Identify and give examples of sources of bias in sample surveys.
Identify and explain the three basic principles of experimental design.
Explain what is meant by a completely randomized design.
Distinguish between the purposes of randomization and blocking in an experimental design.
Use random numbers from a table or technology to select a random sample.

SECTION 5.1: Designing Samples

What if we wanted to find out, on average, how many evenings per week all Americans dine out? Since we cannot put a question to the entire population of the U.S., we can put the question to

Sample studies are one type of observational study.

Caution:

In order to see the response change, we must actually impose the change.

Observational studies of the effect of one variable on another often fail because the explanatory variable is confounded with lurking variables.

Review: Confounding:

Simulation provides an alternative method in these circumstances. After producing data, the next logical step is to use formal statistical inference, which answers specific questions with a known degree of confidence.

DEFINITIONS:
1.
2.
3.
4.

If conclusions based on a sample are to be valid for the entire population, a sound design for selecting the sample is required. The design of a sample refers to

Poor sample designs can produce misleading conclusions.

5.

6.

The design of a study is

Simple Random Samples

The statistical remedy for bias from personal choice in sampling is to allow impersonal chance to choose the sample. A sample chosen by chance allows neither favoritism by the sampler nor self-selection by respondents. Choosing a sample by chance attacks bias by

The simplest way to use chance to select a sample is to place names in a hat ( ) and draw out a handful ( ). This is the idea of

DEFINITION:

An SRS not only gives each individual an equal chance to be chosen, but it also

The idea of an SRS is to choose our sample by drawing names from a hat. In practice, computer software can choose an SRS from a list of individuals in the population by using a random number generator or by consulting a table of random digits.

A table of random digits is a long string of the digits 0, 1, 2, 3, ..., 9 with two properties:
1.
2.

Table B on the back page of your book is a table of random digits.

HOW TO CHOOSE AN SRS:
1.
2.
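As a rough illustration of how software can stand in for the hat or for Table B, the sketch below draws an SRS with Python's standard random module. The population of 30 labeled students, the sample size of 5, and the function name choose_srs are all hypothetical and are not taken from the book.

```python
import random

def choose_srs(population, n, seed=None):
    """Return a simple random sample of n individuals from population."""
    rng = random.Random(seed)
    # sample() draws without replacement, so every possible set of
    # n individuals has the same chance of being the sample chosen.
    return rng.sample(population, n)

# Hypothetical population: 30 students labeled 01 through 30.
students = [f"{label:02d}" for label in range(1, 31)]
sample = choose_srs(students, 5, seed=1)
print(sample)
```

Fixing the seed only makes the draw repeatable for checking work; leaving it as None gives a fresh sample each time.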

The use of chance to select the sample is the essential principle of statistical sampling. A probability sample is a sample chosen by chance. We must know what samples are possible and what chance, or probability, each possible sample has.

When sampling from large populations, it is common to sample important groups within the population separately, then combine these samples. This is a

To select a stratified random sample:
1.
2.
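The idea of sampling groups separately and then combining the samples can be sketched in a few lines of Python. This is only one possible illustration: the two groups, their sizes, the number drawn from each, and the function name stratified_sample are assumptions made up for the example.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a separate SRS from each group, then combine them."""
    rng = random.Random(seed)
    combined = []
    for name, members in strata.items():
        # An SRS of the requested size is taken within this group only.
        combined.extend(rng.sample(members, sizes[name]))
    return combined

# Hypothetical groups: 40 freshmen and 20 seniors.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 41)],
    "seniors": [f"S{i}" for i in range(1, 21)],
}
sizes = {"freshmen": 4, "seniors": 2}
chosen = stratified_sample(strata, sizes, seed=2)
print(chosen)
```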

Another common means of restricting random selection is to choose the sample in stages. For example, the Current Population Survey uses a multistage sampling design, as do opinion polls and other national samples.

There are a few cautions about using sample surveys in particular. When the population consists of human beings, accurate information from a sample requires much more than a good sampling design. To begin, we need an accurate and complete list of the population. Because such a list is rarely available,

A more serious source of bias in most sample surveys is

Again,

In addition, the behavior of the respondent or of the interviewer can cause response bias.

Some final comments:

Because we deliberately use chance, the results obey the laws of probability that govern chance behavior. (We will study the laws of probability in Chapter 6.) Results from a survey usually come with a margin of error, which we will learn about in Chapter 10. Finally,

and part of 5.1 HW

SECTION 5.2: Designing Experiments

Experiment:

Experimental units:

Treatment:

The distinction between explanatory and response variables is important!

Factors:

Many experiments study the joint effects of several factors. Each treatment is formed by combining a specific value, called a level, of each of the factors.

Does regularly taking aspirin help protect people against heart attacks? The Physicians' Health Study looked at the effect of aspirin. One-half of the 21,966 physicians in the study took aspirin and the other half did not.

The subjects:
The factors:
Treatments:

The physicians who did not take the aspirin took a ______________: Why do we use them?

Laboratory experiments in science and engineering often have a simple design with only a single treatment, which is applied to all of the experimental units. The design of such an experiment can be outlined as

With comparative experiments we need to guard against the placebo effect: Especially in medical experiments where there are two groups, one treated with medication and one treated with a placebo,

To limit the effect of the placebo response and also the confounding of variables,

The group of patients who receives a sham treatment is called a _______________ because

1st rule:

2nd rule:

How can we assign experimental units to treatments in a way that is fair to all of the treatments? The statistician's remedy is to rely on chance to make an assignment that does not depend on any characteristic of the experimental units and that does not rely on the judgment of the experimenter in any way. The use of chance can be combined with matching, but the simplest design creates groups by chance alone.
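A minimal sketch of assignment by chance alone follows, assuming 20 hypothetical subjects and two treatment labels borrowed from the aspirin example. The function name randomize is invented for illustration; shuffling and then dealing units out is just one reasonable way to let chance form the groups.

```python
import random

def randomize(units, treatments, seed=None):
    """Assign units to treatments using chance alone."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance, not the experimenter, sets the order
    groups = {t: [] for t in treatments}
    # Deal the shuffled units out in turn so the groups are equal in size.
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 20 subjects, two treatments.
subjects = list(range(1, 21))
groups = randomize(subjects, ["aspirin", "placebo"], seed=3)
print(groups)
```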

3rd rule: When treating two groups,

To avoid making quick conclusions about cause and effect from a single study,

The 3 principles of good experimental design are:
1.
2.
3.

You will often see the term "statistically significant" in reports of investigations in many fields of study.

Cautions about experimentation:

One type of study is called a double-blind experiment:

The most serious potential weakness of experiments is lack of realism:

Many behavioral science experiments use subjects that know they are subjects... Sometimes a matched pairs design is used to compare subject preference between two objects. The two objects taken together are called a ____________. Arrangement of the treatments within the block must be _______ and _________. Tossing a coin, which doesn't seem too scientific, is a great way to accomplish this. The order that the treatments are given can also influence the subject response, so ___________________________________________ also, again by a coin toss. The matched pairs design uses

A block design is:

A block is:

Blocks are a form of control.
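To make the contrast with a completely randomized design concrete, the sketch below randomizes separately inside each block. The blocks (women and men), the six units per block, the treatment labels, and the function name block_randomize are all hypothetical choices for the example.

```python
import random

def block_randomize(units_by_block, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # chance operates inside the block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks of six subjects each.
blocks = {
    "women": ["W1", "W2", "W3", "W4", "W5", "W6"],
    "men": ["M1", "M2", "M3", "M4", "M5", "M6"],
}
assignment = block_randomize(blocks, ["new", "standard"], seed=4)
print(assignment)
```

Because the shuffle happens within each block, every block ends up with the same number of units on each treatment.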

The idea of blocking is an important additional principle of statistical design of experiments. A wise experimenter will form blocks based on

Randomization will

EXPERIMENTAL DESIGN ACTIVITY

The Surfing Clams of North Carolina
Snood Power!
Personal Space: The Final Frontier
Return of the Hungry Turkeys

The Surfing Clams of North Carolina


In the field of marine invertebrate biomechanics, a researcher has reason to believe that coquina clams in North Carolina attempt to stay near the water's edge by, in effect, surfing. Her hypothesis is that the clams use their muscular tongue-like foot to propel themselves toward the beach, where the waves break and stir up the plant and animal material that is the clams' food. One difficulty she faces in evaluating the hypothesis is that the surf propels both live and dead clams toward the beach. She would like to design an experiment to test her hypothesis, using life status of the clams as an explanatory variable. She has budgeted for 200 clams, 100 live and 100 deceased, to use in her experiment. Her plan is to place the clams in the water at different locations in the surf; she reasons that the live clams, on average, should end up closer to the beach than the deceased ones. The clam habitat exists roughly for two miles along the beach, and about 20 yards back into the ocean from low tide. The elevation of the shore decreases slowly at a fairly constant rate over the 20 yards. Over the 2 miles along the beach, the bottom of the ocean ranges from large plant density in the North to a more rocky terrain in the South. Tides come in and go out twice a day, once during daylight hours and once during nighttime hours. For purposes of this experiment, you may assume that high tide occurs at noon and at midnight. Your task is to design an experiment to address this question: do the coquina clams propel themselves toward the food source? Be particularly clear about how you assign the clams their initial positions in the surf and the times of entry into the water. Since the clams may not move very far during the experimental period, you will need to be precise in your description of the measurement process.
You may assume that marking the clams L and D for live and deceased will not attract predators, and that the experiment is of short enough duration that the likelihood of the experimental mortality is vanishingly small.

Snood Power!
The wild turkey, Meleagris gallopavo, is an example of a sexually dimorphic species exhibiting an array of extravagant behavioral and morphological [characteristics] that serve no obvious function other than to attract mates. One of these characteristics is a distensible process at the base of the upper mandible known as a snood. From the female perspective, better father turkeys will mean that the next generation of turkeys will have a greater chance of survival; a big snood, it is suggested, is taken by the female to indicate good genes. Thus, from a biological standpoint, it makes sense to select mates from males with bigger snoods. It is possible that the snood length might also be regarded as a measure of tough turkey maleness by male turkeys. From the male turkey perspective, it might not be a good idea to ruffle the feathers of a big-snooded male because he might ruffle your feathers in response. Your task is to design an experiment to address this question: does the male turkey regard other males' snood length as a measure of don't-tread-on-me capability? An experiment is envisioned as follows. One male turkey decoy is to be placed in a small arena with a pile of birdseed nearby. (Using a decoy will give direct control over some possible confounding variables. For example, a live turkey might provide a different aggressive display for different situations, or be at different levels of hunger during the experiment.) A hungry live male turkey would then be placed in the arena. It is reasoned that if male turkeys regard the snood as indicative of a powerful turkey, they will be less fearful of competing for food against a short-snooded turkey. You have budgeted for 60 male turkey subjects for your experiment and can order the animals to conform to your size specifications. For example, a 5 lb turkey would be small; a 20 lb turkey would be large.

Personal Space: The Final Frontier


Experimental studies of human interaction have examined the concept of personal space, a zone of distance that people maintain between themselves and others. The amount of personal space maintained, of course, depends on the relationship between the persons and the particular situation. The studies have demonstrated that as experimenters approach subjects, various signs of discomfort emerge in the subjects. It is possible that the nervousness on the part of subjects is due to the uncertainty of the motives of the experimenter, or natural fear of an unknown person, rather than simply the close proximity. An experiment is envisioned which forces the subject to approach a stationary experimenter. The idea is that the experimenter will sit in a chair some distance from a water fountain, reading a book and ignoring the hall traffic. It is reasoned that if subjects of the experiment (the potential water drinkers) avoid drinking water near the experimenter, they do so because of their reluctance to invade the personal space of another. The subjects have a perfectly good reason to invade the space, i.e., getting a drink, and they would have no reason to believe the experimenter has any malicious intent; he is just reading a book. You are to design an experiment to examine the existence and effect of personal water fountain space.

Return of the Hungry Turkeys


The color vision of birds plays a role in their foraging behavior; birds use color to select and avoid food. Whether such behavior is evolutionary or learned is not known, and the relative preference and avoidance of particular foods varies from species to species. In designing food for future Thanksgiving turkeys, it would be cost effective for the turkey farms if the manufacturers could produce food that turkeys would eat, but that other bird species would avoid. It is envisioned that turkey raisers would exist in environments with different competing birds. It is of interest to try to answer the following questions:
1. Do turkeys have a color preference/avoidance pattern for food?
2. If so, does this pattern exist from birth?
3. Can they be trained by standard methods to eat certain colored foods while avoiding others?
You may assume that you have an unlimited number of turkeys of all ages for your experimental use, and that the gender of the turkey may be ascertained at birth. You may further assume that turkeys are able to discriminate among the colors blue, green, yellow, and red, and that the turkey food can be dyed those colors without changing the palatability of the food. You may also assume the existence of an agent that renders the food unpalatable and which works equally well for all colors of dye. (Are these miracles or what!?)

SECTION 5.3: Simulating Experiments

There are three methods for answering questions like "What is the likelihood that...?", "What are the chances that...?", or "What are the chances that this will happen before that?"

We can:
1) Try to estimate the likelihood of a result of interest by actually carrying out the experiment many times and calculating the result's relative frequency. This is slow, possibly costly, and often impractical.
2) Develop a probability model and use it to calculate a theoretical answer. This requires that we know something about the rules of probability and may not be feasible.
3) Start with a model that reflects the truth about the experiment, and develop a procedure for imitating, or simulating, a number of repetitions of the experiment. This is quicker, cheaper, and relatively easy using technology.

The imitation of chance behavior, based on a model that accurately reflects the experiment under consideration, is called a ________________________.

Old statistical technology includes equipment like two-sided fair coins, dice with any number of distinct faces, and random number tables.
Flipping and tossing are really mathematical concepts. Coin tosses are independent because the result of one toss has no effect or influence over the next coin toss. Shooting 10 free throws in basketball and observing the sexes of 10 children have similar models and are simulated the same way. A model is based on opinion and past experience.

Assigning digits can be done in different ways, but some are more efficient and preferred. New technology includes assigning outcomes to random digits using a calculator or computer software. These machines will calculate as many repetitions as requested quickly and without complaining.

Using the TI-83 we can use the command randInt (found under MATH/PRB/5) to generate lists of random integers at will. The command randInt (0, 9, 8) will generate lists of 8 random numbers between 0 and 9 with successive enter commands.

The command randInt (0, 99, 12) generates lists of ____ random numbers between ________. How could we simulate the outcomes of rolling a 6 sided die 15 times?
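For readers working on a computer instead of a TI-83, here is a rough Python counterpart of randInt, offered as one possible approach and not as the calculator's actual implementation. The helper name rand_int_list is hypothetical; it relies on random.randint, which includes both endpoints just as randInt does.

```python
import random

def rand_int_list(low, high, count):
    """Return count random integers from low to high, endpoints included."""
    return [random.randint(low, high) for _ in range(count)]

# One way to imitate rolling a six-sided die 15 times.
rolls = rand_int_list(1, 6, 15)
print(rolls)
```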
To repeat, the steps for carrying out a simulation (most often carried out with random numbers) are:

1.
2.
3.
4.
5.
6.

Experiment: Pepsi vs. Coke

In this activity, we want to determine if students can tell the difference between two popular colas. The students will be given 3 cups of cola. Two will be of one type and the other will be of the other type. The goal is not to be able to determine which is Coke and which is Pepsi but to be able to tell which one cup contains a cola different from the other two. We are interested in determining the proportion of students that can tell a difference in the colas.

What are some things to keep in mind in conducting this study in order to validate the results? How will we account for these?

Each student will take the taste test. The student administering the test will record the result (correct or incorrect in identifying the different cola).

Results: We are interested in the proportion of correct choices. Record the number of correct and incorrect choices in the table below.

Correct        Incorrect

What proportion of the students chose correctly?

If a student were blindly guessing, what proportion would he/she have guessed correctly?

Did our class do better than blindly guessing? Was our proportion significantly higher than blindly guessing to say that some students can distinguish between the flavors of two popular colas?


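Simulate 40 classes blindly guessing using the calculator. Clearly describe the procedure, and carry out the simulation.

Trial   Proportion correct      Trial   Proportion correct
1       ________                21      ________
2       ________                22      ________
3       ________                23      ________
4       ________                24      ________
5       ________                25      ________
6       ________                26      ________
7       ________                27      ________
8       ________                28      ________
9       ________                29      ________
10      ________                30      ________
11      ________                31      ________
12      ________                32      ________
13      ________                33      ________
14      ________                34      ________
15      ________                35      ________
16      ________                36      ________
17      ________                37      ________
18      ________                38      ________
19      ________                39      ________
20      ________                40      ________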

What proportion of these simulations did as well or better than our class did in choosing the different cola?

Now use this number to describe whether the proportion of correct choices by our class was significantly higher than blindly guessing to say that some students can distinguish between the flavors of two popular colas.
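The same kind of simulation can be sketched in Python as a check on the calculator work. Everything specific here is an assumption for illustration: a class of 25 students, 13 of whom chose correctly, and a 1-in-3 chance that a blind guesser picks the odd cup out of three. The function names are hypothetical as well. Substitute the class's real counts before drawing any conclusion.

```python
import random

def simulate_class(n_students, p_correct, rng):
    """Proportion correct in one simulated class of blind guessers."""
    correct = sum(rng.random() < p_correct for _ in range(n_students))
    return correct / n_students

def share_at_least(observed, n_students, p_correct, n_trials=40, seed=None):
    """Simulate n_trials classes and report how often guessing alone
    does as well as or better than the observed proportion."""
    rng = random.Random(seed)
    results = [simulate_class(n_students, p_correct, rng)
               for _ in range(n_trials)]
    as_good = sum(r >= observed for r in results)
    return as_good / n_trials, results

# Hypothetical class: 25 students, 13 correct, guess probability 1/3.
share, results = share_at_least(observed=13 / 25, n_students=25,
                                p_correct=1 / 3, n_trials=40, seed=5)
print(share)
```

A small share suggests that blind guessing rarely matches the class's result; with only 40 trials the estimate is rough, so raising n_trials gives a steadier number.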

Ch 5 review
