
Reading Scientific Papers

And other useful mathematical tools for journalists
A Mathematician's Perspective


Rebecca Goldin
December 10, 2012
National Press Foundation

Statistical Assessment Service www.stats.org


Jon Entine, Senior Fellow Cynthia Merrick, Intern

Rebecca Goldin, Director of Research Trevor Butterworth, Editor

Statistical Concepts in Scientific Journal Articles


Mean, median, mode
Standard deviation
Confidence intervals
Orders of magnitude
Confounding factors
Percentages
Absolute vs. relative risk
Scientific methods
Causation versus correlation

In the beginning: a press release


Press Releases Don't Tell the Whole Story.
They are designed to get the attention of the press.
They present the results in the rosiest terms possible.
They don't put the results in context.

Abstracts Don't Tell the Whole Story. They don't answer:
How were the subjects recruited?
What was the design of the experiment?
What methods were used to analyze the data?
What are the weaknesses of the conclusions?

Basic Advice for a Journalist with Limited Time and Ideas


Read as a skeptic at all times.
Avoid most conclusions of causality.
A lot can be understood by even a cursory read (<10 minutes) of the summary, the abstract, and the conclusion.
Avoid the press release.
The summary and abstract will tell you the results, but hardly ever hint at the limitations. The conclusion will often tell you some caveats.
Look up key words on PubMed.gov to see whether other literature has been published on the topic, and give other research equal time!

(Easy) Questions to Ask While Reading a Journal Article

How are the people in the study recruited? In particular, would the recruitment method itself bias the results by involving people who might not be typical with regard to the thing measured?
Recent example: Triple P (Positive Parenting Program) seemed to have a clear positive impact for families with children with conduct disorder. However, most participants in most studies were self-selecting: they were more likely to be motivated, more likely to be literate, and more likely to be confident enough to volunteer their family (possibly leading to higher levels of compliance with treatment).
For example, are women who have a family history of breast cancer more likely to get a mammogram? If so, then rates of cancer detection among women getting

(Easy) Questions to Ask While Reading a Journal Article


How is data collected? Is there room for bias? (This is especially important in survey, opinion, and food studies.)
Is the data biologically relevant?
Are the numbers significant? (More on significance in a bit.)
Usually scientific articles are trying to implicate something or some behavior: "red meat eaters have more cancer" translates as "red meat causes more cancer."
Are the authors considering several possible explanations for observed data (such as the different

The Moral and Statistical Collide


Medical Abortion Obesity Nursing vs. Bottle Feeding Smoking Homosexuality Daycare Food/Alcohol Natural versus Chemical Pollution Crime

Causation or Correlation

It's easy to be fooled


Height correlates with reading skills in children under 10.
Income correlates with success in college.
Ratio of finger lengths correlates with aggression.
Facebook correlates with poor grades.
Facebook correlates with good grades.
Doing heroin correlates with doing marijuana.
Higher taxes correlate with high annual growth, and are inversely correlated with poverty rates.
Alcoholism correlates with less gray matter in the prefrontal cortex.

fMRI studies: a case study


fMRI machines are large magnets that measure oxygen levels in blood.
People can engage in activities inside the machine.
Patterns of blood flow are thought to reflect patterns of brain activity (more on that in a bit).
Typical studies assume that observed patterns only occur when the tested behavior occurs.
Typical studies assume that observed patterns are caused by the tested behavior.

fMRI studies: a case study


Claim: Lying can be determined by patterns of fMRI scans.
But perhaps stress or anxiety can lead to the same patterns.
Claim: Violent video gaming leads to violent brain patterns.
But perhaps any competitive play, including non-violent non-video games, has similar brain patterns. Plus, there is no indication of actual violence.
Claim: Math anxiety triggers activity in the pain center of the brain.
But no pain is experienced by subjects with math anxiety. Perhaps anxiety, not mathematics, correlated with

Jumping from Correlation to Cause


You don't always have to know why it may not be causal. Be wary of any claims of causality.
Some common reasons that a correlation could look causal when it's not include: not adjusting for confounders, misunderstanding the mechanism, having an unknown confounder.
A causal relationship might be reasonable to suspect when the statistics are:
Overwhelming
Observed in many different contexts
Repeated: tests show the same effect, on large numbers of people
From double-blind case-control studies
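To make the confounder idea concrete, here is a minimal simulation sketch of the "height correlates with reading skills in children under 10" example. It assumes, purely for illustration, that age drives both height and reading score and that the two are otherwise unrelated. Every number in it (growth per year, score per year, noise levels, sample sizes) is made up and not taken from any study.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical children aged 5 to 9. Age raises BOTH height and reading score;
# height has no direct effect on reading in this made-up world.
children = []
for age in range(5, 10):
    for _ in range(200):
        height_cm = 80 + 6 * age + rng.gauss(0, 5)
        reading_score = 10 * age + rng.gauss(0, 8)
        children.append((age, height_cm, reading_score))

# Correlation across all children, ignoring age.
overall_r = pearson([c[1] for c in children], [c[2] for c in children])

# Correlation within each age group, i.e. after adjusting for the confounder.
within_age_r = []
for age in range(5, 10):
    group = [c for c in children if c[0] == age]
    within_age_r.append(pearson([c[1] for c in group], [c[2] for c in group]))
mean_within_age_r = sum(within_age_r) / len(within_age_r)

print(f"Ignoring age:      r = {overall_r:.2f}")
print(f"Within age groups: r = {mean_within_age_r:.2f} (average)")
```

In this toy setup the strong overall correlation should largely disappear once children are compared only with others of the same age, which is what "adjusting for a confounder" means in practice.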

Causation vs. correlation is not the only thing to worry about in medical research

The role of randomness


Given a huge urn of balls, 30% of the balls are white, and the rest are other colors.
Each of 100 people picks 10 balls, writes down their colors, then returns the balls to the urn.
Some people will have 3 white balls, but others will have more or fewer.

Number of Whites is Not Determined


[Figure: Randomness in Choosing 10 Balls: How many are white? Horizontal axis: Number of White Balls Chosen (0 to 10). Vertical axis: Probability (0 to 0.3).]

About 27% chance you will get 3 white balls; it's much more likely you'll get some other number.
About 1% chance of getting 7 white balls.
Suppose that white represents something random, and bad, like producing a defective product at your factory. If one factory produces 70% defects while the others only have 30% defects, wouldn't you think there's a reason? Our statistics suggest that maybe not. But we look for reasons, convinced of
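The probabilities behind the urn example can be reproduced with the binomial formula. The sketch below assumes each of the 10 balls is white with probability 30%, independently of the others (reasonable when the urn is huge); the function and variable names are illustrative only.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

N_DRAWS = 10    # balls picked by each person
P_WHITE = 0.30  # share of white balls in the urn

# Probability of drawing exactly 0, 1, ..., 10 white balls.
white_ball_probs = [binomial_pmf(k, N_DRAWS, P_WHITE) for k in range(N_DRAWS + 1)]

for k, prob in enumerate(white_ball_probs):
    print(f"{k:2d} white: {prob:6.1%}")
```

The rows for 3 and 7 white balls correspond to the "about 27%" and "about 1%" figures quoted on the slide.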

On p-values
Suppose you are flipping a coin many times, and you think this coin is biased, because you aren't getting close to half heads and half tails. How can you quantify your suspicion?
The p-value is a measurement based on the data you have seen. It answers the question: if the coin were fair, how likely would I be to see the data I am seeing? In other words, if you had a fair coin, is it reasonable to see this proportion of heads and tails, or is it very unlikely?
If you flip 1000 times and get 520 heads, there is about an 11% chance of getting this many heads (or more) from a fair coin.
In contrast, if you had 550 heads in 1000 flips, the chance of this happening randomly is only about 0.1%, i.e. very unlikely if the coin were fair.
The biomedical community generally accepts p=.05 (5%) as
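For readers who want to check the coin-flip figures, here is a small sketch that computes the exact one-sided p-value for a fair coin, meaning the chance of seeing at least the observed number of heads. It uses exact integer arithmetic on binomial counts; the function name is illustrative.

```python
from math import comb

def fair_coin_upper_tail(heads, flips):
    """Chance of at least `heads` heads in `flips` flips of a fair coin."""
    # Count the head/tail sequences with `heads` or more heads, then divide
    # by the total number of equally likely sequences (2 ** flips).
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2**flips

p_520 = fair_coin_upper_tail(520, 1000)
p_550 = fair_coin_upper_tail(550, 1000)

print(f"At least 520 heads in 1000 flips: {p_520:.2%}")
print(f"At least 550 heads in 1000 flips: {p_550:.3%}")
```

This is a one-sided calculation (too many heads only). A two-sided test, which also counts results with too few heads, would roughly double these values.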

Confidence Intervals, Odds Ratios, Confounders

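A confidence interval for an odds ratio (or a relative risk) indicates a statistically significant result when it does not contain the number 1.0, the value that means no difference between the groups.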
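As an illustration of the "does it contain 1.0?" check, the sketch below computes an odds ratio and an approximate 95% confidence interval from a 2x2 table using the standard log-odds-ratio (Woolf) approximation. The counts are hypothetical, chosen only to show that the same odds ratio can be non-significant in a small study and significant in a larger one.

```python
from math import exp, log, sqrt

def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Odds ratio with an approximate 95% CI (log / Woolf method)."""
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases
    )
    # Standard error of log(odds ratio) from the four cell counts.
    se_log_or = sqrt(
        1 / exposed_cases + 1 / exposed_noncases
        + 1 / unexposed_cases + 1 / unexposed_noncases
    )
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical small study: 30 of 100 exposed vs. 20 of 100 unexposed get sick.
small_study = odds_ratio_ci(30, 70, 20, 80)
# Hypothetical larger study with the same proportions, twice the people.
large_study = odds_ratio_ci(60, 140, 40, 160)

for label, (ratio, lower, upper) in [("small", small_study), ("large", large_study)]:
    verdict = "contains 1.0" if lower <= 1.0 <= upper else "excludes 1.0"
    print(f"{label}: OR = {ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f}), {verdict}")
```

Real papers usually report adjusted odds ratios from regression models rather than this raw calculation, so treat it as a way to read the numbers, not to reproduce them.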

Multiple Testing, or How to Guarantee Results


Once you have a standard, like p<.05, you have ways of gaming the system.
There is always a small chance, something less than 5%, that you will see something that looks suspicious when it really isn't.
Sometimes your coin will favor heads by a suspicious amount, even though the coin really is fair.
The more hypotheses you check, the more likely this is to happen.
And once you find something suspicious, you can write a scientific article about that.
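A rough sense of how fast this grows: the sketch below computes the chance of at least one spurious "significant" result when several hypotheses are tested and none of them is actually true. It assumes the tests are independent and that each has exactly a 5% false-alarm rate, which is a simplification of real research settings.

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """Chance that at least one of `num_tests` independent tests of true
    null hypotheses comes out 'significant' at level `alpha`."""
    return 1 - (1 - alpha) ** num_tests

for num_tests in (1, 5, 14, 20, 100):
    print(f"{num_tests:3d} tests: {chance_of_false_alarm(num_tests):.0%}")
```

Under these assumptions, checking 14 unrelated hypotheses already gives better-than-even odds of finding something "suspicious" by chance alone.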

What happens in the lab: Experiments Galore...

What the rest of the world sees

Metaphors for Bad Statistical Methods


Drunk looking for his keys under the lamppost

Metaphors for Bad Statistical Methods


Texas Sharpshooter Fallacy

In real life? Value Added Models


Value-added models evaluate teachers based on the progress of the kids in their classes, measured by standardized tests, during the year they teach them.
The idea is to use these test scores to evaluate whether the teachers are effective or not.
Never mind the difficulties of the measurement, or the question of what "effective" means.

In real life? Value Added Models


If kids were distributed randomly to the teachers, some teachers would get unlucky and have some bad learners.
Just as it's unlikely to get 7 white balls out of 10 when there is only a 30% chance of getting a white ball, it's bound to happen when you do it on a large scale.
Bad teachers may fare well, and good teachers may fare poorly, purely due to chance.
If you correct for known problems, such as family income, you introduce more variance.
Lots of people know about these issues, but no one is reporting on them.
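The "bound to happen on a large scale" point can be quantified with the urn numbers. The sketch below treats each teacher as one person drawing 10 balls with a 30% chance of white, and asks how likely it is that at least one teacher in a group draws 7 or more. This is a toy analogy with independent draws, not a model of real classrooms.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """Chance of k_min or more successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Chance that one particular draw of 10 balls contains 7 or more white ones.
p_unlucky = prob_at_least(7, 10, 0.30)

def prob_someone_unlucky(group_size):
    """Chance that at least one of `group_size` independent draws is that unlucky."""
    return 1 - (1 - p_unlucky) ** group_size

print(f"One draw:    {p_unlucky:.1%}")
for group_size in (10, 100, 1000):
    print(f"{group_size:4d} draws: {prob_someone_unlucky(group_size):.1%}")
```

An outcome that is rare for any single teacher becomes likely somewhere in a district of a hundred teachers, and nearly certain in one of a thousand.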

Absolute versus relative risk


Absolute risk is the risk you actually undergo.
Women who take the birth control pill have an absolute risk of venous thrombosis (blood clot) of about 15 in 100,000 (1.5 in 10,000) per year. The absolute risk for women who do not take the pill is about 10 in 100,000 (1 in 10,000) per year.
Relative risk is a risk compared to another group.
Women who take the birth control pill have a 50% higher relative risk of venous thrombosis, compared to women who don't take the pill.

Relative risk representations have consequences


In 1995, the Committee on Safety of Medicines in the UK concluded that the 3rd generation birth control pill was riskier than previous versions.
Some press reported a 100 percent increase in risk of Deep Vein Thrombosis (DVT, blood clots); others reported twice the risk.

The absolute risk for DVT was 15 per 100,000 for 2nd generation birth control pills.
The absolute risk for DVT was 30 per 100,000 for 3rd generation birth control pills.
The media blitz led to many women not taking their medications (rather than immediately replacing them) and a resulting increase in unwanted pregnancies.
The abortion rate went up 9% from 1995 to 1996.
The absolute rate of DVT for pregnant women is 80 per 100,000.
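The gap between the two framings is simple arithmetic. The sketch below takes the two DVT rates quoted above (15 and 30 per 100,000) and reports both the relative and the absolute change; the function name and output layout are illustrative only.

```python
def compare_risks(baseline_per_100k, exposed_per_100k):
    """Return relative and absolute comparisons of two risks per 100,000."""
    relative_risk = exposed_per_100k / baseline_per_100k
    relative_increase_pct = (relative_risk - 1) * 100
    extra_cases_per_100k = exposed_per_100k - baseline_per_100k
    # How many people must be exposed for one additional case to occur.
    people_per_extra_case = 100_000 / extra_cases_per_100k
    return relative_risk, relative_increase_pct, extra_cases_per_100k, people_per_extra_case

rr, pct, extra, per_case = compare_risks(baseline_per_100k=15, exposed_per_100k=30)

print(f"Relative risk:     {rr:.1f}x ({pct:.0f}% increase)")
print(f"Absolute increase: {extra} extra cases per 100,000 women")
print(f"That is about one extra case per {per_case:,.0f} women")
```

"Twice the risk" and "15 extra cases per 100,000 women" describe the same data, and both remain well below the 80 per 100,000 rate quoted for pregnancy.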

Media Impact is Great

In 1998, Andrew Wakefield published a study on 12 children which was the basis for the belief that autism is a result of vaccinations.
The press repeatedly reported these results, even though the scientific community was unable to reproduce them.
The existence of this study gave greater voice to other studies

A journalist was responsible for an investigation of the scientific integrity of Wakefield's work.
After his autism study was discredited, most media coverage about vaccinations reports both sides of the story about whether vaccines are safe or not. However, the medical community almost universally endorses vaccines and believes that they are safe.
Pockets of measles and croup due to vaccination refusal or lack of herd effect have been found in the U.S. and in the UK.

Scientific Culture
Scientists hear of a surprising result and wonder, "What went wrong?"
Titles of papers are meant to be informative about the paper, not conclusive of the results.
Abstracts tell only part of the story.
Results cannot be divorced from methods.
The timeline of a scientific result is months to years.
Great ideas should be kept secret until they can be proven.
The media and lay people don't matter as much as scientific journals.
Any result should be contextualized.
Vested interests include funding sources, prestige, and belief systems.

Journalist Culture
Journalists hear of a surprising result and think it makes for a great story.
Headlines are meant to sell the story.
Scientific abstracts tell the whole story about the science.
Results are results, unless there is a money trail telling a story.
Journalists are on a tight timeline (often a couple of days).
Great ideas should be shared right away, before they get scooped.
One scientific result is enough to write a story.
A recent meta-analysis found that spin in news coverage correlated highly with spin in abstract article conclusions.

Research is routinely plagued


Research is plagued by:
Low levels of significance
Multiple testing
No acknowledgement of randomness in research design
Lack of context/repeated experiments
Scientists don't know how to talk to journalists. But if you are looking to find one scientist willing

What can a journalist do?
Write about the levels of significance, bias, caveats.
Ask the researchers about multiple testing. Did they adjust for it?
Write about absolute risks.
Look for a body of research rather than one specific paper.
Cite your sources!
DON'T INDICATE

Doting on Data

What cannot be found in the data?
Answers to our moral questions (though they can be informed by data)
Answers to extremely complicated questions
100% certainty
Lots of data is unavailable.
Answers that take care of all possibly

Lessons to be Learned

Don't over-estimate the ability of poor data to give answers.
The world is complicated; many things interact with each other.
The public voice is at least as loud as the scientific voice.
Consensus is extremely important

To Life!

Thank you!
