Motion in Limine On Statistical and Scientific Evidence

UNITED STATES DISTRICT COURT
FOR THE DISTRICT OF COLUMBIA

DONALD RAYNOR, JR., ET AL.
Plaintiffs,
vs.
RICHARDSON-MERRELL INC.,
Defendants.
Civil Action No. 83-3506
(Judge Thomas F. Hogan)
MOTION IN LIMINE REGARDING THE SCOPE OF
SCIENTIFIC EVIDENCE TO BE OFFERED AT TRIAL
Defendants Merrell Dow Pharmaceuticals Inc., sued herein under its former
name Richardson-Merrell Inc., and Standard Drug Company Inc. respectfully move this
Court for an order governing the scope of the statistical evidence to be offered at trial.
Unless statistical evidence used at trial conforms to generally accepted standards and is
material, the danger of misleading the jury and prejudicing the party against whom such
evidence is offered is so grave as to require that such evidence be excluded. The reasons
for this motion are more fully set forth in defendants' accompanying memorandum of
points and authorities.
Dated: July 3, 1986
Respectfully submitted,
MARK L. AUSTRIAN Bar No. 346593
PATRICK J. COYNE Bar No. 366841
COLLIER, SHANNON, RILL & SCOTT
1055 Thomas Jefferson Street, N. W.
Washington, D.C. 20007
(202) 342-8400
Attorneys for Defendants
Merrell Dow Pharmaceuticals Inc. and
Standard Drug Co., Inc.
ORDER GOVERNING SCOPE OF
STATISTICAL EVIDENCE OFFERED AT TRIAL
On the basis of the motion in limine submitted by defendants Merrell Dow
Pharmaceuticals Inc. and Standard Drug Company Inc. regarding the scope of scientific
evidence to be offered at trial, the memoranda of the parties in support of and in
opposition to that motion, and argument of counsel, it is this day
of ________ , 1986, hereby
ORDERED that:
1. Statistical evidence be, and hereby is, admissible in evidence only if that
evidence is statistically significant at a confidence level of 95%; and
2. Plaintiffs be, and hereby are, precluded from presenting alleged
methodological flaws and other errors in epidemiological and animal studies in their case
in chief unless plaintiffs can make a satisfactory showing that, absent the flaws, the
studies would affirmatively demonstrate that Bendectin is a teratogen at the 95%
confidence level.
THOMAS F. HOGAN
UNITED STATES DISTRICT JUDGE
MEMORANDUM OF DEFENDANTS
MERRELL DOW PHARMACEUTICALS INC. AND
STANDARD DRUG COMPANY INC.
IN SUPPORT OF MOTION IN LIMINE
REGARDING THE SCOPE OF
STATISTICAL EVIDENCE TO BE OFFERED AT TRIAL
Defendants Merrell Dow Pharmaceuticals Inc. ("Merrell Dow"), sued herein
under its former name Richardson-Merrell Inc., and Standard Drug Company Inc.
("Standard") respectfully submit this memorandum in support of their motion in limine
regarding the scope of the scientific evidence to be offered at trial. One of the primary
issues at trial will be whether the drug Bendectin, manufactured by Merrell Dow, causes
birth defects. Both parties will rely on a number of epidemiological studies in presenting
their arguments on this issue of causation. These studies involve statistical analysis of
populations of individuals with respect to Bendectin usage and birth defects.
In order to be admissible, statistical evidence must conform to generally
accepted standards used by professional epidemiologists and statisticians. As will be
discussed below, in order for a study to show a valid statistical relationship between an
exposure to a drug and a disease or malformation it must be "statistically significant" at
the 95% level. Merrell Dow requests an order providing that any statistical proof
concerning a relationship between Bendectin and birth defects be limited to those studies
which show such a statistically significant association. If a study does not, that
statistical evidence is not of a type reasonably relied upon by experts in the field to
which it pertains and cannot form the basis for expert opinion at trial. The danger of
prejudice or of misleading or confusing the jury requires that evidence of alleged
associations that are not statistically significant must be excluded.
Further, in attacking alleged flaws in the epidemiological studies that have
been conducted on Bendectin, Merrell Dow requests an order providing that those flaws
must necessarily alter the authors' conclusions and result in a statistically significant
association at the 95% level in order for any such alleged flaws to be material and to be
presented in plaintiffs' case in chief. Identical motions have been granted by Judge
Jackson in Richardson v. Richardson-Merrell Inc., No. 83-3505 (D.D.C. June 9, 1986) and
Judge Johnson in Koller v. Richardson-Merrell Inc., No. 80-1258 (D.D.C. February 25,
1983), discussed below.1/
I. INTRODUCTION
A. Statistical Significance
Medicine is concerned with limiting or preventing disease. Once the cause
of a particular disease is established, the ultimate objective is to prevent its
occurrence. A factor may be said to cause a disease when its presence is shown to
contribute to the development of the disease and its removal is shown to reduce the
frequency of disease.
The first step in determining whether there is a relationship between
exposure and a disease or malformation is to determine whether a statistical association
1/ A copy of Judge Jackson's Order is attached hereto as Exhibit A. A copy of Judge
Johnson's decision is attached hereto as Exhibit B. Judge Jackson's Order states that the
Order is granted preliminarily.
exists. Lilienfeld, A., Lilienfeld, D., Foundations of Epidemiology 289 (2d ed. 1980),
Exhibit C. In other words, is there a statistical relationship between exposure to the
drug and the occurrence of the disease or malformation? If there is no statistical
relationship, then there is no reason to suspect the exposure.
In the field of congenital malformations, the initial determination of
whether a statistical association exists is through the use of epidemiological studies.
Because of the dangers of incorrect conclusions based on statistical evidence, scientists
use great care before concluding that a statistical association exists between exposure
and a disease. Scientists generally use a confidence level of 95% before making a
determination that an association is "statistically significant". D. Freedman, R. Pisani,
R. Purves, Statistics at 444 (1980), Exhibit D. The use of a 95% confidence level is
consistent not only with the published texts but also with the opinions of plaintiffs'
experts in this case. For example, Dr. Done has testified as follows:
Q. So in your FDA study you are telling me that you simply
never addressed the issue of causation?
A. No. That is not correct. You use the level of confidence
that the difference exists at in drawing your conclusions
about whether causation is likely. And there you would
use whatever standard is required by whomever is asking
that question. Scientifically, you would not say that you
could show a probability of causation from whatever
association you find, unless you find that at a 95% level.
That is the scientific level that scientists conventionally
use. (Emphasis added.)
Deposition of Alan K. Done at 225-26.2/
As will be discussed later, the 95% level is generally accepted in the
scientific community. Dr. Done again acknowledged this point:
Q. Before you start the exercise [referring to causation],
that is the minimum confidence level, that is 95%?
2/ Deposition of Dr. Alan K. Done in Schumacher v. E.R. Squibb & Sons, No. 84-2955
(C.D. Cal.) taken on July 2, 1985 ("Done Dep."). Copies of the cited pages are attached
hereto as Exhibit E.
A. Yes.
Q. That is the one that is generally accepted by the
scientific community, as I understand.
A. Yes.
Q. You accept that too?
A. Yes.
Done Dep. at 226. Exhibit E.
Another of plaintiffs' experts in this case, Dr. Nancy Lord, stated:
Q. Would it be true that you cannot draw any cause and
effect inferences or conclusions from epidemiologic
studies where the association is not statistically
significant?
A. Yes, in general I'd agree.
Deposition of Nancy T. Lord at 206.3/
A statistically significant relationship does not itself establish causation,4/
but it is a critical first step. Exhibit C at 289.
1. The Anticipated Problem
Merrell Dow anticipates that plaintiffs will in this action, as they have in
other Bendectin cases, attempt to introduce expert testimony based on statistical
analyses which are not generally accepted in the scientific community. Merrell Dow
believes that plaintiffs' experts will attempt to ignore the limitations on statistical
3/ Deposition of Dr. Nancy Lord in Velleff v. Ortho Pharmaceutical Corp., No. 80 L 4988
(Cir. Ct. Cook County, Ill.), taken on July 31 - Aug. 2, 1985. Copies of the cited pages
are attached hereto as Exhibit F.
4/ Once a statistically significant association has been found, a number of other factors
must be evaluated before a conclusion as to causation can be reached. The Advisory
Council to the Surgeon General of the Public Health Service in 1964 defined five criteria
that should be fulfilled to establish a causal relationship: (1) the consistency of the
association; (2) the strength of the association; (3) the specificity of the association; (4)
the temporal relationship of the association and (5) the coherence of the association.
significance and testify that associations from epidemiological studies are evidence of
causation even though the 95% confidence level is not met. The rationale appears to be
that plaintiffs' experts are entitled to ignore scientific principles when they testify in
court and can use the legal test of "more probable than not" and say that a statistical
association is significant at a 51% level of confidence. Thus, plaintiffs desire to inject
mathematical evidence into this case while ignoring the inherent limitations of that
evidence. It is Merrell Dow's position that, under the Rules of Evidence, experts are not
entitled to ignore generally accepted scientific principles when testifying.
2. The Richardson and Koller Decisions
There is precedent directly on point in this District in other Bendectin
cases. In Richardson v. Richardson-Merrell Inc., No. 83-3505 (D.D.C. June 9, 1986) Judge
Jackson recently granted, on a preliminary basis, defendant's motion in limine regarding the scope of statistical
and scientific evidence to be offered at trial. In his Order Judge Jackson
provided that "plaintiffs shall present no evidence absent a showing of statistical
Similarly, in Koller v. Richardson-Merrell Inc., No. 80-1258 (D.D.C.
February 25, 1983), Judge Norma Holloway Johnson granted Merrell Dow's motion to
limit plaintiffs' proof:
With respect to statistical significance, no statistical evidence
will be admitted during the course of the trial unless it meets a
confidence level of 95%.
Id. at 1-2. (Footnote omitted.)6/ In its opinion, the Court cited a series of cases
establishing that statistical evidence is inadmissible unless it meets the 95% confidence
level generally accepted by statisticians. These authorities are discussed at page 16,
5/ A copy of Judge Jackson's Order is attached hereto as Exhibit A.
6/ A copy of Judge Johnson's decision is attached hereto as Exhibit B.
B. Order of Proof
Under the best of circumstances, the jury will have a difficult task
understanding the epidemiological proof presented at trial. Since none of the
statistically significant epidemiological studies support plaintiffs' case on limb defects,
Merrell Dow anticipates that plaintiffs will attempt in their case in chief to attack the
epidemiological studies without any regard to whether these alleged flaws would alter
the results of the studies. As a result of these attacks, the jury will eventually be
confused and lose sight of the fact that plaintiffs are obliged to produce statistically
significant evidence to prove causation. In Koller, Judge Johnson recognized the
potential for confusion and concluded:
As for plaintiffs' plans to attack the epidemiological and
animal studies of Bendectin, such evidence may not be
presented in plaintiffs' case in chief unless plaintiffs first
establish a foundation that the particular flaw would alter the
conclusions of the study in a statistically significant manner.
* * *
Plaintiffs have no right to attempt to preempt defendants'
anticipated defense in their case in chief by advancing alleged
weaknesses in studies that are favorable to defendant. This
approach would confuse the jury and would badly obscure the
fundamental requirement that plaintiffs prove that Bendectin
is a teratogen that caused the birth defects of Anne Koller.
Koller, mem. op. at 2-3. Exhibit B.
II. ARGUMENT
A. It Is Essential That The Court Exercise Its Inherent Authority
To Control The Mode And Order Of The Presentation Of
Scientific Evidence At Trial In Order To Prevent The Jury
From Being Confused Or Misled
As developed under the common law, the judge has broad powers to control
the mode and order of the presentation of evidence at trial. Moreover, the trial judge
has an obligation to exercise that power in appropriate circumstances. Koller, mem. op.
at 3. Exhibit B. Rule 611 of the Federal Rules of Evidence codifies this common law
power and responsibility. Rule 611 provides, in pertinent part, that: "[t]he court shall
exercise reasonable control over the mode and order of presenting evidence so as to
(1) make the interrogation and presentation effective for the ascertainment of the truth
[and] (2) avoid needless consumption of time." Fed. R. Evid. 611(a); Baker v. United
States, 401 F.2d 958, 987 (D.C. Cir. 1968), cert. denied, 400 U.S. 965 (1970); Wright v.
United States, 183 F.2d 821, 822 (D.C. Cir. 1950); United States v. Bender, 218 F.2d 869,
874 (7th Cir. 1955).
This power is particularly critical in a case such as this which involves
complex scientific evidence. The California Supreme Court has noted that,
"mathematics, a veritable sorcerer in our computerized society, while assisting the trier
of fact in the search for truth, must not cast a spell over him." People v. Collins, 68 Cal.
2d 319, 320, 438 P.2d 33 (1968).
Courts are mindful of the danger that testimony expressing opinions or
conclusions in terms of statistical probabilities may mislead and confuse the jury. United
States ex rel. DiGiacomo v. Franzen, 680 F.2d 515 (7th Cir. 1982); United States v.
Massey, 594 F.2d 676, 681 (8th Cir. 1979). The concern is that statistical evidence will
have a potentially exaggerated impact on the trier of fact. "Testimony expressing
opinions or conclusions in terms of statistical probabilities can make the uncertain seem
all but proven." State v. Carlson, 267 N.W.2d 170 (Minn. 1978). Professor Tribe has
noted that:
the very mystery that surrounds mathematical arguments - the
relative obscurity that makes them at once impenetrable by
the layman and impressive to him - creates a continuing risk
that he will give such arguments a credence they may not
deserve and a weight they cannot logically claim.
Tribe, Trial By Mathematics: Precision and Ritual In The Legal Process, 84 Harv. L. Rev.
1329, 1334 (1971).
The mystique surrounding statistics merely compounds the problem. In
EEOC v. Federal Reserve Bank, 698 F.2d 633 (4th Cir. 1983), rev'd on other grounds, 467
U.S. 867, 104 S. Ct. 2794 (1984), the United States Court of Appeals for the Fourth
Circuit noted that:
[i]naccuracies or variations in data or in the formulae used to
test such data may easily lead to different, contradictory, or
even misleading conclusions by experts. This fact prompted
one court to comment that too often statistical conclusions
"appear to depend in large part on the side producing them."
Id. at 645, quoting Stastny v. Southern Bell Telephone & Telegraph Co., 458 F. Supp. 314,
324 (W.D.N.C.), aff'd in part and reversed in part, 628 F.2d 267 (4th Cir. 1980). The
manipulability of statistics has caused, and should cause, courts great concern:
[S]tatistical evidence, like any other type of circumstantial
evidence, "must not be accepted uncritically," and, because of
the sophistication and complexity of many of the statistical
models being used in discrimination cases by professional
econometricians, courts must give "close scrutiny [to the]
empirical proof" on which the models are erected, in order to
guard against the use of statistical data which may have been
"segmented and particularized and fashioned to obtain the
desired result."
EEOC v. Federal Reserve Bank, 698 F.2d at 645-46 (citations omitted). Even expert
statisticians can overlook critical factors in performing statistical analyses. Tribe, 84
Harv. L. Rev. at 1363. "[I]f [even the experts] were seduced by the mathematical
machinery, one is entitled to doubt the efficacy of even the adversarial process as a
corrective to the jury's natural tendency to be similarly distracted." Id. (Footnote
omitted). Professor Tribe noted:
[t]he problem of the overpowering number, that one hard piece
of information, is that it may dwarf all efforts to put it into
perspective with more impressionistic sorts of evidence.
The problem - that of the overbearing impressiveness of
numbers - pervades all cases in which the trial use of
mathematics is proposed.
Id. at 1360-61.
In spite of all these dangers, the Court cannot simply jettison statistics in
this case. For better or worse, statistical analysis provides the most direct evidence of
whether or not Bendectin causes birth defects. The evidence must be used in spite of its
potential dangers. In re "Agent Orange" Products Liability Litigation, 603 F. Supp. 239
(E.D.N.Y. 1985), 611 F. Supp. 1223 (E.D.N.Y. 1985) (properly developed epidemiological
evidence is sound, reliable, and must be relied on in mass products liability litigation);
Terrell v. United States, 517 F. Supp. 374, 379 (N.D. Tex. 1981) (rejecting finding of
causation as speculative in the absence of epidemiological evidence or scientific
understanding as to causation); Heyman v. United States, 506 F. Supp. 1145, 1149 (S.D.
Fla. 1981) (physician cannot make accurate prediction of causation without at least some
reference to epidemiological studies). Faced with the tension between the need for
epidemiological evidence and its substantial potential for confusion, how then can
epidemiological evidence be used without being abused?
The court has at its disposal several measures that enable it to regulate the
type and quality of the statistical proof admitted at trial. Fed. R. Evid. 611, 403, and
703. In addition, Section 1.80 of the Manual for Complex Litigation provides that when
desirable to expedite the case, the court should provide an efficient method for
submission and determination of preliminary legal questions. Manual For Complex
Litigation 1.80, cited with approval in Tcherepnin v. Franz, 461 F.2d 544, 548 n.4 (7th
Cir.), cert. denied, 409 U.S. 1038 (1972); Control Data Corp. v. International Business
Machines Corp., 306 F. Supp. 839, 852 (D. Minn. 1969), appeal dismissed, 421 F.2d 323
(8th Cir.), affirmed sub nom., Data Processing Financial & General Corp. v. International
Business Machines Corp., 430 F.2d 1277 (8th Cir. 1970).7/ The need for a thorough
analysis under Fed. R. Evid. 703 is illustrated by the recent United States Supreme Court
decision in Matsushita Electric Industrial Co. v. Zenith Radio Corp., No. 83-2004, slip op.
(March 26, 1986), which expressly approved the lower court's detailed analysis under Fed.
7/ In simplifying the proof for submission to the jury, the court has power to limit the
evidence. Manual For Complex Litigation 4.30 at 184-85; United States v. Maryland &
Virginia Milk Producers Ass'n., 20 F.R.D. 441 (D.C. Cir. 1957) (reducing the period
covered by the evidence to a reasonable length). Similarly, the courts have excluded
statistical proof that did not relate directly to the ultimate question of discrimination to
be resolved in employment discrimination cases under Title VII and the Equal Protection
Clause. New York Transit v. Beazer, 440 U.S. 568 (1979); Coe v. Yellow Freight System,
Inc., 646 F.2d 444, 452 (10th Cir. 1981).
R. Evid. 703 excluding plaintiffs' proffered expert testimony. Slip op. at 18 n.19. (A
copy of this opinion is attached as Exhibit G hereto.) This motion raises just such a
preliminary question.
B. Admissibility of Epidemiological Evidence
In order to be admissible, scientific evidence must first be of the type
generally accepted by experts in the particular field to which that evidence pertains.
Only then can it be reasonably relied upon by expert witnesses. Fed. R. Evid. 703.
There is essentially no dispute as to the general principles this Court should apply in
determining whether statistical evidence is of the type generally accepted by
statisticians and epidemiologists. There may, however, be some confusion in
incorporating these principles into the standard of proof at trial. It is important,
therefore, that the jury, the court, and the parties understand the threshold issue that
statistical evidence must first be determined by the court, not the jury, to conform to
generally acceptable scientific principles before it can be used by expert witnesses as a
basis for their opinion testimony.
1. Only Statistical Evidence That Conforms To Generally
Accepted Statistical and Epidemiological Principles Can Be
Relied On As A Basis for Expert Testimony
In Frye v. United States, 54 App. D.C. 46, 293 F. 1013 (D.C. Cir. 1923), the
U.S. Court of Appeals for the District of Columbia Circuit set forth the standard by
which questions of admissibility of expert testimony based on methods of scientific
measurement are to be resolved. United States v. Addison, 498 F.2d 741 (D.C. Cir.
1974). Frye requires that scientific evidence be excluded unless the process, system, or
theory on which the evidence is based is "sufficiently established to have gained general
acceptance in the particular field to which it belongs." 293 F. at 1014. The Frye
standard has been applied to a wide variety of scientific evidence including radar, public
opinion surveys, breathalyzers, psycholinguistics, trace metal detection, bite mark
comparisons, blood-spattering deductions, and psychological stress syndromes as well as a
range of other studies, experiments, and tests.
Courts have recognized that the foundational prerequisites set forth in Frye
are needed to predict, and protect against, the possible prejudicial dangers inherent in
any expert scientific testimony used at trial.
There are good reasons why not every ostensibly scientific
technique should be recognized as a basis for expert
testimony. Because of its apparent objectivity, an opinion that
claims a scientific basis is apt to carry undue weight with the
trier of fact. In addition, it is difficult to rebut such an
opinion except by other experts or by cross-examination based
on a thorough acquaintance with the underlying principles. In
order to prevent deception or mistake and to allow the
possibility of effective response, there must be a
demonstrable, objective procedure for reaching the opinion and
qualified persons who can either duplicate the result or
criticize the means by which it was reached, drawing their own
conclusions from the underlying facts.
United States v. Brady, 595 F.2d 359, 362-63 (6th Cir.), cert. denied, 444 U.S. 862 (1979),
quoting United States v. Brown, 557 F.2d 541 at 566 (6th Cir. 1977) and United States v.
Baller, 519 F.2d 463, 466 (4th Cir.), cert. denied, 423 U.S. 1019 (1975). Thus, the Frye
standard enhances the search for the truth and ensures fairness in the presentation and
review of scientific evidence.
2. Epidemiology and Causation
Numerous problems can be avoided if the Court requires the epidemiological
evidence to conform to generally accepted epidemiological and statistical principles.
Epidemiology is the only generally accepted scientific discipline that uses statistical
techniques to identify the causes of human disease. It allows a scientific estimate of the
degree of risk of a disease or condition that can be attributed to a given factor, such as
exposure to an allegedly harmful drug. Epidemiology provides courts with a rational and
consistent method for evaluating evidence of causation between exposure to a given
factor and the incidence of disease.
Epidemiology has been described as a two-step process, beginning with
statistical analysis and then attempting to draw biological conclusions from the results of
that analysis.
Basically, the epidemiologist uses a two-stage sequence of
reasoning:
1. The determination of a statistical association between a
characteristic and a disease;
2. The derivation of biological information from such a
pattern of statistical associations.
Lilienfeld, A., Lilienfeld, D., Foundations of Epidemiology at 13 (2d ed., 1980),
Exhibit C. The epidemiologist attempts to discern the relationship between a disease and
a factor suspected of causing it. This relationship is developed by comparing the disease
experiences of people exposed to the factor with those not exposed to the factor. Id. at
3. This relationship between the factor and the disease is known as an "association."
In the first step of the epidemiological process, development of an
association, epidemiologists use statistical concepts to determine whether an association
exists between a factor, such as exposure to a drug and a disease condition, and, if so,
how large that association is. The epidemiologist compares the rate of incidence of the
condition being studied among those exposed to the factor with the rate among those who
are not exposed. These incidence rates are a measure of the probability that an
individual will develop the condition. In effect, the epidemiologist is trying to determine
if exposure to the factor increases the probability that an individual will develop the
condition.
There are several types of epidemiological studies. Two principal types of
studies have been performed on Bendectin: cohort and case control.8/ In both, a statistic
is developed that represents the increased risk of birth defects, if any, that may result
from Bendectin use. Before this statistic can be used as a basis for concluding that there
is an association, however, the investigator must assess whether any difference in the
incidence rates of birth defects between the groups studied is real or whether it merely
results from chance because the investigator has examined fewer than all of the
individuals in the group being studied. Only if the investigator can be relatively certain
that the difference is not due to chance can the rates be used to say anything about
whether an association exists. The process of testing the rates to see whether there is
any true difference between them is called "significance testing." If the investigator can
reasonably exclude the possibility of chance, the difference is said to be "significant."
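By way of illustration only, the significance-testing step described above can be sketched as a two-sided test for a difference between two incidence rates. The sketch below is an assumption for exposition, not the method used in any study cited in this memorandum: the function names, the choice of a pooled two-proportion z-test, and the counts in the usage note are all hypothetical.

```python
import math


def two_proportion_z_test(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Two-sided z-test for a difference between two incidence rates.

    Hypothetical helper: uses the pooled-proportion normal approximation,
    which is one common large-sample test among several.
    """
    rate_exposed = cases_exposed / n_exposed
    rate_unexposed = cases_unexposed / n_unexposed
    # Pooled rate under the hypothesis that exposure makes no difference.
    pooled = (cases_exposed + cases_unexposed) / (n_exposed + n_unexposed)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_unexposed))
    z = (rate_exposed - rate_unexposed) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def is_significant(p_value, alpha=0.05):
    """True when the result meets the 5% significance (95% confidence) level."""
    return p_value < alpha
```

With hypothetical counts of 60 malformations among 1,000 exposed and 30 among 1,000 unexposed, `is_significant(two_proportion_z_test(60, 1000, 30, 1000)[1])` would report a significant difference; with 33 against 30 it would not, because a gap that small is well within what sampling variation alone can produce.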
If the statistic is significant, the next step is to estimate the magnitude of
the association. The generally accepted means of measuring an association in a cohort
study is to calculate what is called the "relative risk." This is simply the ratio of the
incidence rate of the condition being studied in the group exposed to the factor divided
by the rate in the group that was not exposed. In a case-control study9/ the measure is
8/ The first type, called a cohort study, involves two groups of people, one exposed to
the factor and one not exposed. Exhibit C at 226-27. The investigator follows these two
groups and observes the incidence rates of the condition in each group. The second type of
study is called a "retrospective" or "case control" study. Rather than looking at the rate
at which the condition occurs in groups that are exposed and not exposed, a case control
study begins with individuals who already have the condition and individuals who do not.
The investigator then examines past exposures to the factor to determine whether the
group that has the condition was exposed to the factor more frequently than the group
that does not. As with a cohort study, the investigator must first determine that any
difference in rates of exposure between the two groups is in fact real before any
inference can be drawn from the rates. Similar methods of significance testing are
applied with case control studies and cohort studies.
called an "odds ratio." The odds ratio is based on the rate of usage of the drug among
patients who have the defect under study compared to the usage rate among patients who
do not. The relative risk and odds ratio are both estimates of the magnitude of any
association that can be drawn from the data. They are known as "point estimates." If
there is no association between the factor and the disease, the point estimate is 1.0.
That is, the rates of the two groups are equal.
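For illustration, the two point estimates described above can be computed from simple counts. The sketch below is offered under stated assumptions: the function and parameter names are hypothetical, and the odds ratio is written in its conventional cross-product form, which is a common formulation and not necessarily the calculation used in any study cited here.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Cohort study: incidence rate among the exposed divided by the
    incidence rate among the unexposed. A value of 1.0 means equal rates."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Case-control study: odds of exposure among cases divided by the
    odds of exposure among controls (cross-product form)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
```

Using hypothetical counts, `relative_risk(30, 1000, 30, 1000)` and `odds_ratio(20, 80, 20, 80)` both return 1.0, the value that corresponds to no association. Either figure is only a point estimate and says nothing by itself about whether the underlying difference is significant.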
Thus, before a point estimate can be used as a basis for inferring anything
about an association, it must be shown to be significant. That is, the difference that
gave rise to the point estimate must be true and the investigator must be reasonably
certain that it is not due to chance simply as an artifact of the "luck of the draw." Only
if the point estimate is significant is the epidemiologist or statistician concerned with
whether it is large enough in magnitude to support the conclusion that there is an
association between the factor and the condition or disease.10/
Only after both of these criteria are satisfied, can the epidemiologist move
to the second stage of the epidemiological reasoning process - developing biological
inferences from a pattern of statistical associations. The process of drawing biological
inferences from an association is beyond the scope of this motion. The existence of an
association, however, is a threshold requirement that must be satisfied before any
9/ Because of inherent limitations on the design of case control studies, the investigator
cannot directly determine incidence rates among the exposed and non-exposed groups.
Accordingly, a relative risk cannot be calculated directly. Cornfield, "A Method Of
Estimating Comparative Rates From Clinical Data: Applications to Cancer of the Lung,
Breast and Cervix," 11 J. National Cancer Institute 1269 (1951). The same type of
calculation, the rate of exposure among cases divided by the rate of exposure among
controls, however, can be calculated. This measure is called the "odds ratio" for a case
control study. The odds ratio closely approximates the relative risk. Hence, the two can
be used almost interchangeably as measures of an association.
10/ The greater the magnitude of the observed relative risk, the stronger the association
between the factor and the disease. When a statistically significant relative risk of ten
or more is found, one can be certain that the factor causes the disease or
condition.
biological inference can be drawn from it. The requirement that the association be real,
that is to say "significant" in statistical terms, is in turn a threshold requirement that
must be satisfied before the investigator can conclude that an association in fact exists.
It is critical that statistical evidence conform to these generally accepted
principles of statistical significance before it can be relied on by expert witnesses who
will provide their opinions as to causation on the basis of that evidence. If the
association is not significant, it cannot reasonably be relied upon by an expert. The
degree of significance required by epidemiologists and statisticians in order to be
relatively certain that an association is not due simply to chance is relatively high.
Merrell Dow anticipates that plaintiffs will argue that it is "too high" and that the Court
should allow plaintiffs' experts to give their opinion on causation based on statistical
evidence that epidemiologists or statisticians would not accept as reflecting a true
association. Merrell Dow further anticipates that plaintiffs will argue that an
association need only be more likely than not real, rather than due to chance.
C. The Epidemiological Evidence Must Be Statistically Significant
At a 95% Confidence Level Before It Is Admissible Into
Evidence Or Can Be Relied On By Expert Witnesses At Trial
Epidemiologists and statisticians generally use a confidence level of 95% in
testing whether a difference in the rates of disease or exposure between two
groups is real. This means that epidemiologists and statisticians demand that there be no
more than a 5% chance that the difference is due to the "luck of the draw." The
epidemiological studies done on Bendectin and its components have been based on
generally accepted scientific principles. Because these studies do not, taken as a whole,
show a statistically significant association between Bendectin and an increased incidence
of birth defects, plaintiffs have attempted to distort the accepted standards of
statistical significance in order to arrive at conclusions contrary to those of the authors
of the studies. This distortion violates basic principles of statistics and epidemiology.
The lower confidence limits suggested by plaintiffs are not generally accepted by
epidemiologists and statisticians and should not be accepted into evidence in this case.
1. Epidemiologists And Statisticians Generally Require That
An Association Be Significant At The 95% Confidence
Level Before It Can Be Accepted As A True Association
Epidemiologists must be able to determine the probability that an observed
statistical association is due to chance or errors in the sampling of data instead of
reflecting a true association. The 95% confidence level has become established in the
scientific community as the standard of associations that are "statistically significant."
This 95% confidence level is also referred to as the 5% "significance level." A 99%
confidence level is sometimes used to define results considered "highly statistically
significant." D. Freedman, R. Pisani, R. Purves, Statistics at 444 (1980), Exhibit D;
T. Wonnacott, R. Wonnacott, Introductory Statistics, at 252 n.16 (3d Ed. 1977) Exhibit H.
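A simple significance test of the kind described above can be sketched in Python. The example is a hedged illustration, not the method of any particular study: it uses a standard pooled two-proportion z-test, the counts are hypothetical, and the function name and the 0.05 threshold constant are assumptions introduced here to mirror the 5% significance level.

```python
import math

ALPHA = 0.05  # the 5% significance level (95% confidence level)


def two_proportion_z_test(cases_a, n_a, cases_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a, p_b = cases_a / n_a, cases_b / n_b
    pooled = (cases_a + cases_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 30 defects per 1,000 exposed vs. 20 per 1,000 unexposed.
z1, p1 = two_proportion_z_test(30, 1000, 20, 1000)
# Hypothetical counts with a larger difference: 40 per 1,000 vs. 20 per 1,000.
z2, p2 = two_proportion_z_test(40, 1000, 20, 1000)

for label, z, p in (("30 vs 20", z1, p1), ("40 vs 20", z2, p2)):
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{label}: z = {z:.2f}, p = {p:.3f} ({verdict} at the 5% level)")
```

In this sketch the first difference does not reach significance at the 5% level, while the second does, even though both point estimates exceed 1.0.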
Epidemiologists generally, and all of the experts whose studies will be
offered by Merrell Dow in this case, 11/ use a 95% confidence level in determining
whether the differences they observe are likely to be due to chance alone. Testimony of
Ollie Heinonen in Mekdeci v. Merrell-National Laboratories Inc., No. 77-255-Orl-Civ-Y
(M.D. Fla. 1981), aff'd, 711 F.2d 1510 (11th Cir. 1983) Tr. at 4505-4510, Exhibit I;
Testimony of Brian MacMahon in Mekdeci, Tr. at 5420-21, 5434-35, Exhibit J. The
published epidemiological studies on Bendectin or its components generally use a
11/ Merrell Dow has filed simultaneously with this motion a motion to admit the
relevant epidemiological studies on Bendectin into evidence. References to Appendices I
and II contained herein are to those Appendices contained in that motion.
confidence level of 95%. 12/ See L. Milkovich and B.J. van den Berg, "An evaluation
of the teratogenicity of certain antinauseant drugs," Amer. J. Obstet. Gynec. 125(2):
244-248 (May 15, 1976); Appendix I, Exhibit 2; G. Greenberg, et al., "Maternal Drug
Histories and Congenital Abnormalities," Brit. Med. J. 2:853-56 (October 1977),
Appendix I, Exhibit 6; G.T. Gibson, et al., "Congenital Anomalies in Relation to the Use
of Doxylamine/Dicyclomine and other Antenatal Factors," Med. J. Aust. 1:410-414 (April
18, 1981), Appendix I, Exhibit 11; J.F. Cordero, G.P. Oakley, et al., "Is Bendectin A
Teratogen?," J. Am. Med. Assoc. 245(22):2307-2310 (June 12, 1981), Appendix I, Exhibit
13; Heinonen, Sloan and Shapiro, "Birth Defects and Drugs in Pregnancy," Publishing
Sciences Group, Inc., Littleton, Mass. (1977), Appendix I, Exhibit 4.
The 5% level for statistical significance, as well as the 1% level for highly
statistically significant findings, is an arbitrary cutoff point. It is, however, generally
accepted in the scientific community. This line is reflected in the practice of academic
journals in accepting articles for publication - they generally use the 5% level of
statistical significance. Freedman at 493, Exhibit D. A significance level greater than
5%, corresponding to a confidence level of less than 95%, is not acceptable in the
profession as establishing statistical significance. This principle is accepted by plaintiffs'
experts in this case. (See pp. 2 - 3, supra.)
12/ The Michaelis study used a 90% confidence interval. J. Michaelis, et al., "Prospective
Study of Suspected Associations Between Certain Drugs Administered During Early
Pregnancy and Congenital Malformations," Teratology 27:57-64 (1983); "Teratogene
Effekte Von Lenotan?" ("Does Lenotan Have Teratogenic Effects?"), Deutsches
Arzteblatt 23:1527-1529 (June 1980) (English translation), Appendix I, Exhibit 8. It did so
in a so-called "two-tailed" test. The issue whether a so-called "one-tailed" or
"two-tailed" test is appropriate is a minor and extremely confusing issue. See Freedman,
Statistics at 494-96, Exhibit D. Merrell Dow does not assert that Bendectin has any
protective effect with respect to birth defects. Further, the lower confidence limit is
the same regardless of which test is used. Hence, the differences between "one-tailed" and
"two-tailed" tests of significance are not relevant. See Koller at 1 n.*, Exhibit B. What
is important is that a 95% confidence level be used in accordance with generally
accepted statistical principles.
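One way to read footnote 12's statement that the lower confidence limit is the same under either test is that a 90% two-tailed interval and a 95% one-tailed bound use the same multiplier of the standard error. The Python sketch below illustrates that reading under a normal approximation; the reading itself and the function names are assumptions introduced here, not something the footnote spells out.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution (mean 0, sd 1)


def two_sided_critical_value(confidence):
    """Multiplier for a two-tailed interval: the error is split between both tails."""
    return std_normal.inv_cdf(1 - (1 - confidence) / 2)


def one_sided_critical_value(confidence):
    """Multiplier for a one-tailed bound: all of the error sits in one tail."""
    return std_normal.inv_cdf(confidence)


z_two_sided_90 = two_sided_critical_value(0.90)  # about 1.645
z_one_sided_95 = one_sided_critical_value(0.95)  # about 1.645
z_two_sided_95 = two_sided_critical_value(0.95)  # about 1.960

print(f"90% two-tailed: {z_two_sided_90:.3f}")
print(f"95% one-tailed: {z_one_sided_95:.3f}")
print(f"95% two-tailed: {z_two_sided_95:.3f}")
```

On this reading, a lower limit computed at 90% two-tailed coincides with one computed at 95% one-tailed, while a 95% two-tailed interval is wider.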
The 95% requirement has been recognized time and again by the courts. It is
now firmly ensconced in judicial precedent as well as scientific practice. Courts have
held that scientific evidence must conform to the standards recognized by professional
statisticians, including the 95% confidence level as a measure of statistical
significance. In Moultrie v. Martin, 690 F.2d 1078, 1082-85 (4th Cir. 1982), plaintiff
sought to demonstrate discriminatory selection of grand jurors based in part on a showing
of historical underrepresentation. Plaintiff failed, however, to calculate the statistical
significance of the figures presented. The court stated:
When a litigant seeks to prove his point exclusively
through the use of statistics, he is borrowing the
principles of another discipline, mathematics, and
applying these principles to the law. In borrowing from
another discipline, a litigant cannot be selective in which
principles are applied. He must employ a standard
mathematical analysis. Any other requirement defies
logic to the point of being unjust. Statisticians do not
simply look at two statistics, such as the actual and
expected percentage of blacks on a grand jury, and make
a subjective conclusion that the statistics are
significantly different. Rather, statisticians compare
figures through an objective process known as hypothesis
testing . . . . [w]ithout the use of hypothesis testing, a
court may give weight to statistical differences which
are actually mathematically insignificant . . . . For this
reason it is particularly important that courts follow such
formulae before drawing conclusions from statistical
evidence, and we so require it.
Id. at 1082-83. (Citations omitted.)
In Moultrie, 690 F.2d at 1083 n.7, the court recognized that statisticians
usually state their conclusions in terms of "whether the difference between actual and
expected values is statistically significant at a given confidence level. Statisticians
usually use 95% or 99% confidence levels." Generally, courts have required that
statistical significance testing conform to a 95% confidence level. Little v. Master-Bilt
Products, Inc., 506 F. Supp. 319, 327 n.7 (N.D. Miss. 1980) (noting that a 95% confidence
level was required but that an even higher level may be required in some cases); Taylor v.
Teletype Corp., 475 F. Supp. 958, 962 (E.D. Ark. 1979), cert. denied, 454 U.S. 969 (1981);
EEOC v. American National Bank, 652 F.2d 1176, 1192 (4th Cir. 1981).
The Supreme Court has repeatedly held that significance testing must be
used in analyzing statistical evidence. Castaneda v. Partida, 430 U.S. 482, 496 n.17
(1977); Hazelwood School District v. United States, 433 U.S. 299, 307-11 (1977); Mayor of
Philadelphia v. Educational Equality League, 415 U.S. 605, 619-21 (1974).
In so stating, the United States Supreme Court adopted a 95% confidence
level in a series of cases involving employment discrimination. 13/ The Court noted in
Castaneda that some fluctuation from the expected number is anticipated in any
statistical measure:
The important point, however, is that the statistical
model shows that the results of a random drawing are
likely to fall in the vicinity of the expected value. The
measure of the predicted fluctuations from the expected
value is the standard deviation . . . . [I]f the difference
between the expected value and the observed number is
greater than two or three standard deviations, then the
hypothesis that the jury drawing was random would be
suspect to a social scientist.
Castaneda, 430 U.S. at 496 n.17 (citation omitted). The two or three standard deviations
referred to in Castaneda correspond to confidence levels of 95 to 99%. Moultrie, 690
F.2d at 1084 n.10. The Court reinforced its holding in Castaneda in Hazelwood School
District v. United States, 433 U.S. 299, 308 n.14, 311 n.17 (1977), adopting the Castaneda
standard of statistical significance testing based on fluctuations from the expected value
of two or three standard deviations.
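The correspondence between standard deviations and confidence levels can be checked with a short calculation. The Python sketch below is an illustration under a normal approximation: the drawing of 1,000 with an expected proportion of 0.5 and an observed count of 450 is hypothetical, and the function names are assumptions introduced here.

```python
import math


def coverage_within(k_sd):
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return math.erf(k_sd / math.sqrt(2))


def binomial_sd(n, p):
    """Standard deviation of the number of successes in n random draws."""
    return math.sqrt(n * p * (1 - p))


def deviations_from_expected(observed, n, p):
    """How many standard deviations the observed count lies from the expected count."""
    return (observed - n * p) / binomial_sd(n, p)


print(f"within 2 SD: {coverage_within(2):.4f}")   # about 0.9545
print(f"within 3 SD: {coverage_within(3):.4f}")   # about 0.9973

# Hypothetical drawing: 1,000 selections, expected proportion 0.5, 450 observed.
gap = deviations_from_expected(450, 1000, 0.5)
print(f"observed count is {gap:.2f} standard deviations from the expected value")
```

Two standard deviations thus correspond to slightly more than 95% and three to slightly more than 99%, consistent with the range cited from Moultrie.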
13/ The Court's rulings on statistical significance in discrimination cases are equally
applicable in the context of the Bendectin cases. The statistical analysis employed with
respect to employment discrimination is substantially identical to that employed in
epidemiology. The only meaningful difference is that, whereas a doubling of the relative
risk may be needed in order to show that the condition or disease is more likely than not
to have been caused by the alleged teratogen in an epidemiological study, a lower point
estimate may well suffice in a discrimination case where it would be expected that there
is no background rate of discrimination. See Castaneda v. Partida, 430 U.S. 482 (1977);
Hazelwood School District v. United States, 433 U.S. 299 (1977); Mayor of Philadelphia v.
Educational Equality League, 415 U.S. 605 (1974).
This view is supported by the decision of Judge Jackson in another case
involving virtually identical allegations, Richardson v. Richardson-Merrell Inc., No. 83-
3505 (D.D.C. June 9, 1986) where plaintiffs were precluded from presenting any
statistical or scientific evidence absent a showing of statistical significance. Another
court in this District has also held, in a case involving Bendectin, that no statistical
evidence will be admitted during the course of the trial "unless it meets a confidence
level of 95%." Koller v. Richardson-Merrell Inc., No. 80-1258, mem. op. at 1 (D.D.C.
February 25, 1983). Exhibit B. In Koller, Judge Norma Holloway Johnson sua sponte
raised two evidentiary issues relating to the scope of the statistical evidence to be
admitted at trial. 14/ In requiring that statistical evidence conform to the 95%
confidence level, Judge Johnson noted that every study examining whether Bendectin is a
teratogen has employed a confidence level of at least 95%. Id. at 2. In addition,
plaintiffs concede that social scientists routinely utilize a 95% confidence level. Finally,
all legal authorities agree that statistical evidence is inadmissible "unless it meets the
95% confidence level required by statisticians." Koller, mem. op. at 2.
Particularly in view of the Courts' recognition that the 95% confidence level
is generally accepted in the scientific community, there is simply no justification for
allowing lax and unprofessional statistical opinions and analyses at trial. Statistical
evidence that does not conform to the 95% confidence level would not be tolerated
outside the courtroom, nor would it be allowed in scientific journals. Plaintiffs' desire to
alter the level of statistical significance ignores the well-established convention among
epidemiologists. Professional statisticians and epidemiologists recognize that
significance at a confidence level of less than 95% is an indication that additional confirmatory work
14/ Plaintiffs in Koller are now represented by the same plaintiffs' counsel representing
the plaintiffs in this case.
should be done before moving forward with step two of the epidemiological reasoning
process and attempting to derive biological inferences from the data. 15/
Defendants respectfully submit that the appropriate level of statistical
significance at which to consider the published epidemiologic studies in this case is the
level used by the authors of the various studies and employed by the editors of the
scientific journals in which they were published. Confidence levels of 95% or 99%
(corresponding respectively to significance levels of 5% and 1%) are generally recognized
and used by epidemiologists and statisticians. These significance levels are supported by
the case law and the scientific literature. The use of any other level of statistical
significance is unwarranted and would lead the Court to accept allegedly scientific
"results," cloaked in the mystique of mathematics, that scientists themselves would not
accept.
2. Statistics and the Standard of Proof
Any argument that an association drawn from an epidemiological study can
be relied on if it is simply more likely than not to be a true association confuses basic
principles of statistical significance with this Court's standard of proof. The two are
distinct and different. Confusion in this regard will only mislead the jury.
Contrary to plaintiffs' contention, statistical significance does not equate to
the burden of proof. Significance is a threshold issue, akin to a ruling by the Court on
the admissibility of evidence. An association that is valid only at a level of statistical
15/ In addition, plaintiffs misconceive the nature and importance of statistical testing
itself. The existence of one study showing a statistically significant association would
not itself demonstrate causation. Even at the 5% significance level, statistically
significant associations would occur by chance 5% of the time even if there is no true
association. Freedman at 494, Exhibit D. Thus, when over 20 studies have been
conducted, as they have been on Bendectin, at least one statistically significant
association would be expected due to chance alone, even if no association exists in
reality. The relevant epidemiological studies on Bendectin do not support any biological
inference of causation.
significance of 51% is not probative of causation, regardless of the value of the point
estimate. It cannot reasonably be relied upon by epidemiologists or statisticians.
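Footnote 15's point about chance findings across many studies can be made concrete. The Python sketch below is illustrative only and rests on a simplifying assumption that is not in the footnote: that the studies are independent and each tests a single association at the 5% level. The function names are likewise assumptions introduced here.

```python
def expected_false_positives(n_studies, alpha=0.05):
    """Expected number of chance 'significant' results when no true association exists."""
    return n_studies * alpha


def prob_at_least_one_false_positive(n_studies, alpha=0.05):
    """Chance that at least one of n independent studies is 'significant' by chance."""
    return 1 - (1 - alpha) ** n_studies


n = 20  # the footnote refers to "over 20 studies"
expected = expected_false_positives(n)
at_least_one = prob_at_least_one_false_positive(n)

print(f"expected chance findings among {n} studies: {expected:.1f}")
print(f"probability of at least one chance finding: {at_least_one:.2f}")
```

Under these assumptions the expected number of chance findings among 20 studies is one, and the probability of seeing at least one is roughly 64%.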
The statistical measure that corresponds to the standard of proof is the
magnitude of the association, not its significance. Where there is a background rate of a
disease or condition, such as there is with birth defects, it is necessary to show that the
factor is associated with a doubling of the background incidence rate in order to infer
causation. In other words, in order for it to be more likely than not that Bendectin and
not some other factor was the cause of any individual's birth defect, the magnitude of
the point estimate (1) would have to be statistically significant at the 95% confidence
level and (2) would have to be greater than 2.0. 16/ This point was recently recognized by
the court in Marder v. G. D. Searle & Co., No. Y-82-3506, slip op. (D. Md. March 19,
1986) when the court stated:
In epidemiological terms, a two-fold increased risk is an
important showing for plaintiffs to make because it is the
equivalent of the required legal burden of proof -- a
showing of causation by the preponderance of the
evidence or, in other words, a probability of greater than
50%.
Mem. Op. at 14 (Exhibit K).
16/ Since there is a background rate of birth defects, a certain number of birth defects
will be caused by factors other than the alleged teratogen, regardless whether the
substance is or is not teratogenic. Even if it can be shown by a statistically significant
association greater than 1.0 that the factor is associated with the condition, a certain
number of birth defects will be due to factors other than the alleged teratogen. If a
statistically significant association of 1.5 is found, for example, only one out of every
three cases of birth defects could be attributable to the factor. Two of the three cases
would be caused by factors contributing to the background rate. It would be twice as
likely that any individual's defect was part of the background rate, rather than due to the
factor. At a statistically significant association of 1.5, therefore, it would be more
likely than not that the factor did not cause any particular individual's condition. Were a
statistically significant association of 2.0 found, it would be equally likely that any
individual plaintiff's condition would be due to background factors as it would to the
suspected teratogen. Only when a statistically significant association greater than 2.0 is
found, does it become more likely than not that any particular plaintiff's condition was
due to the suspected teratogen.
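The arithmetic in footnote 16 follows from a simple relationship: among exposed persons who have the condition, the share attributable to the factor is the relative risk minus one, divided by the relative risk. The Python sketch below works through the footnote's figures; the function name is an assumption introduced here, and the calculation treats the relative risk as exact, setting aside sampling error.

```python
def attributable_fraction(relative_risk):
    """Share of exposed cases attributable to the factor: (RR - 1) / RR."""
    if relative_risk <= 1.0:
        return 0.0  # no excess risk, so nothing is attributable to the factor
    return (relative_risk - 1.0) / relative_risk


for rr in (1.5, 2.0, 3.0):
    share = attributable_fraction(rr)
    verdict = "more likely than not" if share > 0.5 else "not more likely than not"
    print(f"relative risk {rr}: {share:.0%} attributable ({verdict})")
```

A relative risk of 1.5 gives one in three, a relative risk of 2.0 gives exactly one half, and only a relative risk above 2.0 pushes the share past 50%.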
Hence, the Court must wrestle with two statistical principles. The first is a
threshold issue - whether an association is real, or statistically significant. This issue
relates to the admissibility of evidence of the association. Only if the association is
found to be significant does the Court come to the second issue -- the magnitude of the
association. If the association is not significant, it cannot be used to infer causation. It
is the magnitude, or the importance of the association, that relates to the burden of
proof. Marder, Mem. Op. at 24, Exhibit K. The magnitude goes to the weight of the
evidence. Only if a statistically significant association greater than 2.0 can be found can
it be said that a particular factor under study is more likely than not to have caused any
particular plaintiff's condition.
In Cook v. United States, 545 F. Supp. 306 (N.D. Cal. 1982), the court applied
these principles to plaintiff's allegation that her disease was caused by her immunization
under the swine flu program. Both parties contested the magnitude of the relative risk
developed from certain epidemiological studies. The court noted that:
Whenever the relative risk to vaccinated persons is
greater than two times the risk to unvaccinated persons
there is a greater than 50% chance that a given GBS case
among vaccinees of that latency period is attributable to
vaccination, thus sustaining plaintiff's burden of proof on
causation.
545 F. Supp. at 308. The plaintiff in Cook, however, was unable to show that the relative
risk remained above 2.0 at the time she was vaccinated. Accordingly, plaintiff's data
was insufficient to satisfy plaintiff's burden of proving causation. Id. at 316. Hence, it is
the magnitude of the association and not its significance that corresponds to the "more
likely than not" requirement for evidence to be probative.
D. Methodological Defects Of The Epidemiological Studies
Are Not Probative Unless They Effect Statistically
Significant Changes In The Conclusions Of The Studies
Merrell Dow also anticipates that plaintiffs will attack the studies on the
basis of a number of alleged methodological defects or flaws. These attacks, however,
are neither probative nor material absent evidence that the conclusions reached by the
authors of those studies would have been altered, absent those defects. In view of the
difficulty the jury will have in understanding and digesting the scientific evidence itself,
let alone any alleged defects in that evidence, it is necessary that the Court establish
reasonable ground rules governing the admissibility of this evidence.
1. The Court Has Great Latitude In Excluding
Evidence That Will Mislead The Jury, Confuse The
Issues, Or Waste Time
Evidence should be excluded where it might create undue prejudice or
confuse or mislead the jury. Douglas v. United States, 386 A.2d 289 (D.C. 1978); United
States v. Margiotta, 662 F.2d 131, 143 (2d Cir. 1981), cert. denied, 461 U.S. 913 (1983)
(state law violations excluded in prosecution for violation of federal law because of
prejudicial effect); Rigby v. Beech Aircraft Co., 548 F.2d 288 (10th Cir. 1977) (evidence
of defects other than those in issue would have confused the jury); E. Cleary, McCormick
on Evidence § 185 (2d ed. 1972); 6 Wigmore On Evidence §§ 1864, 1865, 1904 (1976).
These principles are expressly embodied in Federal Rule of Evidence 403 which permits
the court to exclude evidence "if its probative value is substantially outweighed by the
danger of unfair prejudice, confusion of the issues, or misleading the jury, or by
considerations of undue delay, waste of time, or needless presentation of cumulative
evidence."
In conjunction with that power, the court has broad discretionary authority
over the presentation of evidence. 6 Wigmore On Evidence § 1867 (1976); Griffin v.
United States, 164 F.2d 903, 904 (D.C. Cir. 1947), cert. denied, 333 U.S. 857 (1948). As
noted at page 6, supra, Rule 611 of the Federal Rules of Evidence grants the court broad
discretion to exercise reasonable control over the mode and order of presenting
evidence. The court noted in Griffin that it is the duty of
the trial judge to see that the facts of the case are properly developed and that the jury
is not confused:
It cannot be too often repeated, or too strongly
emphasized, that the function of the federal trial judge is
not that of an umpire or a moderator at a town meeting.
He sits to see that justice is done in the cases heard
before him; and it is his duty to see that the case on trial
is presented in such a way as to be understood by the
jury, as well as himself . . . . [The trial judge] has no more
important function than to see that the facts are properly
developed and that their bearing upon the question at
issue are clearly understood by the jury.
Griffin, 164 F.2d at 904-05, quoting Simon v. United States, 123 F.2d 80, 83 (4th Cir.
1941), cert. denied, 314 U.S. 694 (1941). The court's discretionary power to control the
presentation of evidence has been exercised to require a litigant to present a portion of
the evidence he might not otherwise present in order to make his other evidence
comprehensible. Swietlowich v. County of Bucks, 610 F.2d 1157 (3d Cir. 1979); Baker v.
United States, 401 F.2d 958 (D.C. Cir. 1968), cert. denied, 400 U.S. 965 (1970). Similarly,
it is necessary in this case that the Court exercise its power to prevent plaintiffs'
preemptive attack based on irrelevant and immaterial allegations of methodological
flaws in the epidemiological evidence.
2. Plaintiffs' Alleged Methodological Flaws Are
Irrelevant On The Issue Of Causation
Epidemiology is an imperfect science. It is virtually impossible to eliminate
all confounding factors, to match the groups being studied perfectly, to manage a large
sample, to police with perfection the collection of data, and to anticipate every problem
in the methodology. Some of the "criticisms" advanced by plaintiffs' witness, Dr. Done,
would be true of almost all epidemiological studies. Others of the alleged flaws are
simply disagreements of judgment. All are obvious points of which the scientists
evaluating the studies would be well aware.
The studies relied on by Merrell Dow were published in scientific journals.
Most of these journals are peer-reviewed journals, which will not publish work until after it
has been closely scrutinized by a critical group of scientist-editors and found to have
substantial merit. Dr. Done has never authored any such publication involving an
epidemiological study of Bendectin. Nonetheless, his criticisms include:
1. Use of mothers of children with birth defects rather than mothers with
normal children as a control group in the Cordero and Oakley study,
Appendix I, Exhibit 13;
2. Using a large population which includes a relatively small number of
Bendectin mothers: Heinonen, Appendix I, Exhibit 4; Smithells,
Appendix I, Exhibit 7; Michaelis, Appendix I, Exhibit 8;
3. Reliance on prescriptions as evidence of ingestion of the drug:
Smithells, Appendix I, Exhibit 7; Jick, Appendix I, Exhibit 15;
4. Failure to identify possible over-the-counter use of drugs containing
Doxylamine or possible Bendectin use among controls (all studies
except those based on questionnaires); and
5. Dilution of effects on specific periods of organogenesis by using too
long a test period such as the first trimester (all except Smithells).
Each of these observations reflects some of the compromises necessary in
epidemiological studies. In No. 1 above, Cordero and Oakley wished to minimize recall
bias which favors the memory of mothers whose children have birth defects. In No. 2 the
control populations being studied were immense. The larger control populations provide
more powerful statistical analysis and greater certainty in determining the background
rates for specific categories of birth defects. In No. 3, records of filled prescriptions, if
computerized, are at least complete and relatively foolproof. The problem of actual
ingestion can never be solved unless the investigator is observing each mother while she
takes the pill. In most of the studies, prescription use was spot-checked by interview or
questionnaire. The problem identified in No. 4, of unreported drugs, is present in any
study, but is reduced by conference and questionnaire. Finally, the problem of dilution,
No. 5, is also a compromise. Because of the uncertainty as to conception and the
interest in studying more than one kind of birth defect, some dilution is necessary in
order that relevant data not be omitted. These points are not flaws but, rather, reflect
compromises made by the authors on the basis of their professional judgment. To the
extent that these compromises are shortcomings, they have been openly discussed. They
are not quantifiable in precise terms.
In the absence of repeated, confirmed, statistically significant associations
as to categories including limb reduction defects, these observations are not more than
cautionary. While any one may be used to argue that an additional study is needed,
Bendectin has been thoroughly studied by a significant number of investigators in more
than 30 epidemiological studies. None of these alleged flaws would change the outcome
of any of the studies at which they are leveled. In the absence of proof of statistically
significant changes in the results of the studies, these arguments should be left to
plaintiffs' impeachment of Merrell Dow's proof. They do not constitute affirmative
evidence of causation.
Plaintiffs' alleged defects are not relevant to negligence or any other basis
of liability. Openly discussed methodological compromises are expected in almost all
epidemiological studies. There is no evidence that the data of any of the studies was
misread. Interpretation of that data was a matter of scientific judgment. Over twenty
years of continued study shows overwhelming, confirmatory proof of the lack of an
association between Bendectin and an increased incidence of limb reduction defects.
Reliance on these studies, notwithstanding these open "compromises" is not negligent and
does not raise an issue of fact.
3. The Complexity Of The Subject Matter Compels
Exclusion Of Plaintiffs' Conjectured Defects Even
Were They Marginally Relevant
Methodological criticisms, unless they result in a statistically significant
alteration of the study's conclusion, will confuse the jury. In the Court's Memorandum
Opinion of February 25, 1983 in Koller, Judge Johnson considered the foundation which
must be established before plaintiffs may introduce in their case in chief evidence of
alleged methodological flaws and other weaknesses of epidemiological studies. The
Court held that attacks on the epidemiological and animal studies of Bendectin:
may not be presented in plaintiffs' case in chief unless
plaintiffs first establish a foundation that the particular
flaw would alter the conclusions of the study in a
statistically significant manner. In other words, it will be
insufficient to suggest that methodological errors
significantly weaken the conclusions of the
epidemiological studies. To be admissible in plaintiffs'
case in chief, it must be demonstrated initially that the
study, absent the methodological error, would indicate
that Bendectin is a teratogen at the appropriate level of
statistical significance.
Koller, mem. op. at 2-3. Exhibit B. Judge Johnson preserved for plaintiffs their right to
present in rebuttal the weaknesses of the studies.
The Court in Koller properly required that such rebuttal evidence,
nonetheless, be material. The Court limited the nature of the rebuttal:
[P]laintiffs will not be permitted to cite trivial flaws
absent a preliminary showing that the alleged flaws would
materially weaken the conclusions of the studies. That
some of the raw data would have changed if the alleged
flaws had not occurred also is insufficient to establish
materiality.
Id. at 3 n. Exhibit B. The Court stated further:
Plaintiffs have no right to attempt to preempt
defendants' anticipated defense in their case in chief by
advancing alleged weaknesses in studies that are
favorable to defendant. This approach would confuse the
jury and would badly obscure the fundamental
requirement that plaintiffs prove that Bendectin is a
teratogen that caused the birth defects of Ann Koller.
Regardless of how serious plaintiffs believe the flaws in
the epidemiological [and] animal studies are, these flaws
are irrelevant to plaintiffs' case in chief unless plaintiffs
can employ these flaws to show that the studies
affirmatively demonstrate that Bendectin causes birth
defects.
Id. at 3 - 4. Exhibit B. Without this limitation, plaintiffs would be able through the use
of irrelevant and immaterial evidence to effectively shift the burden of proving that
Bendectin is safe to Merrell Dow, while at the same time attempting to destroy the
substantial evidence amassed by the scientific community exonerating Bendectin.
Absent some effect on the conclusions reached by the authors of these studies, the
alleged defects are simply immaterial. Even a straightforward presentation of the
epidemiological evidence in accordance with generally accepted scientific principles will
be extremely difficult for a lay jury to understand and digest. If confounded by
plaintiffs' attempts to needle the studies by injecting immaterial alleged methodological
flaws, the jury will become hopelessly confused. Unless these flaws would have some
effect on the outcome of the studies, they do not support plaintiffs' position on
causation. The alleged flaws should be relegated to their only proper position in this case
- plaintiffs' rebuttal evidence. If plaintiffs are able to show some effect on the outcome
that would be statistically significant, only then will these alleged defects be relevant
and material.
III. CONCLUSION
For the foregoing reasons, defendants respectfully request that the Court
enter an order limiting the statistical evidence to that which is statistically significant
at a confidence level of 95% (statistically significant at the 5% level), and that plaintiffs
be precluded in their case in chief from raising claims of methodological error in the
epidemiological studies unless (1) correction of those alleged errors would vary the
outcome of the studies and (2) plaintiffs' methodological corrections cause the studies'
results to reach a level of statistical significance.
Dated: July 3, 1986
Respectfully submitted,
MARK L. AUSTRIAN Bar No. 346593
PATRICK J. COYNE Bar No. 366841
COLLIER, SHANNON, RILL & SCOTT
1055 Thomas Jefferson Street, N. W.
Washington, D.C. 20007
(202) 342-8400
Attorneys for Defendants
Merrell Dow Pharmaceuticals Inc.
and Standard Drug Co., Inc.
