Festing 1998


ATLA 26, 283-301, 1998

Reducing the Use of Laboratory Animals in Biomedical Research: Problems and Possible Solutions

The Report and Recommendations of ECVAM Workshop 29¹,²,³

Michael F.W. Festing,⁴ Vera Baumans,⁵ Robert D. Combes,⁶ Marlies Halder,⁷ Coenraad F.M. Hendriksen,⁸ Bryan R. Howard,⁹ David P. Lovell,¹⁰ Graham J. Moore,¹¹ Philip Overend¹² and Marie S. Wilson¹³

⁴MRC Toxicology Unit, University of Leicester, Lancaster Road, Leicester LE1 9HN, UK; ⁵Department of Laboratory Animal Science, University of Utrecht, 3508 TD Utrecht, The Netherlands; ⁶FRAME, Russell & Burch House, 96-98 North Sherwood Street, Nottingham NG1 4EE, UK; ⁷ECVAM, JRC Environment Institute, 21020 Ispra (VA), Italy; ⁸National Institute of Public Health and the Environment, 3720 BA Bilthoven, The Netherlands; ⁹Field Laboratories, Medical School, University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK; ¹⁰BIBRA International, Woodmansterne Road, Carshalton, Surrey SM5 4DS, UK; ¹¹Pfizer Central Research, Sandwich, Kent CT13 9NJ, UK; ¹²SmithKline Beecham Pharmaceuticals, New Frontiers Science Park (North), Third Avenue, Harlow, Essex CM19 5AW, UK; ¹³Merck Sharp & Dohme Research Laboratories, Neuroscience Research Centre, Terlings Park, Harlow, Essex CM20 2QR, UK

Preface

This is the report of the twenty-ninth of a series of workshops organised by the European Centre for the Validation of Alternative Methods (ECVAM). ECVAM's main goal, as defined in 1993 by its Scientific Advisory Committee, is to promote the scientific and regulatory acceptance of alternative methods which are of importance to the biosciences and which reduce, refine or replace the use of laboratory animals. One of the first priorities set by ECVAM was the implementation of procedures which would enable it to become well-informed about the state-of-the-art of non-animal test development and validation, and the potential for the possible incorporation of alternative tests into regulatory procedures. It was decided that this would be best achieved by the organisation of ECVAM workshops on specific topics, at which small groups of invited experts would review the current status of various types of in vitro tests and their potential uses, and make recommendations about the best ways forward (1). In addition, other topics relevant to the Three Rs concept of alternatives to animal experimentation have been considered in several ECVAM workshops. This is a report of the first ECVAM workshop to be devoted exclusively to reduction as defined by Russell & Burch (2).

The workshop on Reducing the Use of Laboratory Animals in Biomedical Research:

Address for correspondence: Dr Michael F.W. Festing, MRC Toxicology Unit, University of Leicester, Lancaster Road, Leicester LE1 9HN, UK.
Address for reprints: ECVAM, TP 580, JRC Environment Institute, 21020 Ispra (VA), Italy.
¹ECVAM: European Centre for the Validation of Alternative Methods. ²This document represents the agreed report of the participants as individual scientists. Derek Fry (Home Office, Shrewsbury, UK) was also a participant at the workshop and contributed to this report.
284 M.F.W. Festing et al.

Problems and Possible Solutions was held in Southwell, UK, on 12-15 January 1998, under the chairmanship of Michael Festing (MRC Toxicology Unit, Leicester, UK). The participants, who all attended as individuals, not as representatives of their respective organisations, were very experienced in the use of animals in biomedical research, having a strong commitment to high quality research and the ethical use of animals where such use cannot be avoided. The aims of the workshop were to find ways of reducing the number of animals used in biomedical research without reducing research output, and to make recommendations for practical ways in which this might be achieved.

Introduction

The concept of the Three Rs (replacement, reduction and refinement) was developed by Russell & Burch (2) to provide a framework for improving the conduct and ethical acceptability of experimental techniques on animals. Given that animals used in research may experience pain, suffering or lasting harm, the first step must be to consider whether less sentient or non-sentient alternatives can be used instead (replacement). Where this is not possible, care needs to be taken to minimise any pain that an individual animal may suffer (refinement), both during the actual experiment and before or after the conduct of experiments. Refinement is often achieved, for example, by providing the animals with an environment in which they can feel secure and comfortable, ensuring that they are free of infectious diseases, and by using appropriate anaesthetics and analgesics if surgical techniques are to be used (3). Lastly, the number of animals used in a given project needs to be minimised (reduction), while ensuring that the objectives of the study can still be achieved; typically, this will also reduce the sum total of animal suffering.

For the purposes of this report, reduction means ways of obtaining comparable levels of information from the use of fewer experimental animals, or of obtaining more information from a given number of animals, so that fewer animals are needed to complete a given research project, taking into account individual animal welfare in relation to minimising pain, suffering, distress or lasting harm.

Although the concept of reduction is relatively simple, possible methods of achieving it are not immediately obvious. However, there is often a clear association between reduction and the quality of the resulting science. If animals are used in a poor quality project, which does not make a significant contribution to knowledge, or if a project is undertaken in such a way that it fails to meet its scientific objectives, animals will have been used needlessly. This will also be the case if the scientific objectives are not clear, so that it is not easy to determine whether these have actually been met. Even when the project is of high scientific calibre, there might be scope for reducing animal use by using pilot experiments and/or more advanced experimental designs and statistical methods. Very often this will also reduce the need for other scientific resources and avoid unnecessary work, so it will improve general scientific effectiveness in the long term.

One obvious area where there could be scope for reducing animal use is in ensuring that each experiment is of an appropriate size. Although a large experiment will usually have higher statistical power (i.e. it will be more likely to detect a treatment effect if there is one) than a comparable smaller one, it could lead to inefficient use of resources (including animals), because once an experiment has reached a certain size, the use of additional animals provides relatively little further information (4). However, animals will also be used unnecessarily if an experiment is so small that it is incapable of detecting a scientifically important treatment effect.

The exact design of an experiment can also be important. For example, randomised block experimental designs sometimes help to remove variability due to otherwise uncontrollable time and space variables, and they therefore increase the statistical power, so that fewer animals are needed. Factorial designs can be used to incorporate both sexes and/or more than one strain of animal without increasing the total number. If the treatment is eventually to be studied in both sexes, then such designs can reduce the numbers of animals needed and produce information which could not be obtained in any other way. Thus, reduction needs to be considered
ECVAM Workshop 29: reduction 285

in terms of research strategy, including the actual design, size and scientific information provided by each experiment.

The potential conflict between reduction and refinement should also be considered. It is sometimes possible to reduce the number of animals needed by increasing, for example, the dose of a test chemical to ensure that a toxic effect is observed. This may lead to the use of fewer animals, although each individual animal may suffer more. As it is difficult to quantify pain and suffering, care will need to be taken in such cases to ensure that proposed changes are not counter-productive in terms of total suffering. However, surveys suggest that there are many experiments which could be conducted with fewer animals without an increased burden on those which have to be used (3, 4), and that, in some cases, the experiments could be redesigned and analysed more efficiently to provide more information. Where increased refinement (for example, through the use of analgesics or by environmental enrichment) also reduces variation between animals, it may contribute to reduction. Thus, with careful thought at the design stage, fewer animals can often be used without any loss of information.

Reduction can also be achieved by minimising the wastage of animals which have been bred for research, but which are not used because of failure to match supply and demand. For example, demand may be largely for males so that females are not needed, or it may be so sporadic that a given batch of animals may have become too old or too heavy by the time they are required. Although these animals are not used for research, excess production still poses an ethical problem because the breeding and killing of animals for no real purpose is itself ethically undesirable, and their availability could encourage researchers to use more animals than they would otherwise consider necessary. Matching breeding to realistic use is particularly important for colonies of harmful mutant or transgenic animals, where at least some of the offspring may suffer adverse effects. Cryopreservation can reduce the need to maintain colonies simply to preserve the line.

Appropriate designs can also help to reduce such wastage. Factorial designs can often be used to even out the demand for both sexes, because splitting a single-sex group into two half-sized groups of males and females presents no serious statistical problems, and has the advantage of being able to show whether the two sexes respond in the same way. Similarly, animals heterogeneous for weight or age can be incorporated into an experiment without any loss of precision by using a randomised block design, obviating the need for very narrow weight or age ranges. Some of these points are considered in more detail later in this report.

Russell & Burch (2) suggested that reduction can be achieved by better research strategy, by better control of variation and by the application of better statistical methods.

Improving Research Strategy

There are several ways in which research strategy can be improved, as discussed below.

Objectives
Research objectives need to be clearly specified and flexible, with the definition of appropriate decision points. The latter would help a researcher to decide whether to continue with a particular line of research or to try another approach.

The most appropriate animal model should be chosen. A wide range of inbred strains, mutants, outbred stocks and transgenic strains of mice and rats are available, and the outcome of the project may depend critically on the strain(s) used. The choice of individual strains or stocks needs to be given careful consideration, and should be justified in research proposals. It has been claimed that "the introduction of inbred strains into biology is comparable in importance with that of the analytical balance into chemistry" (5). The uniformity of inbred strains means that, in many cases, fewer animals are needed than if outbred stocks are used (6), and selection of the most appropriate inbred strain from those which are available may lead to further reduction (7). If outbred stocks have normally been used in the past, the possibility of switching to inbred strains should be considered as a way of improving the science as well as of reducing animal numbers (8). However, whether inbred strains or outbred stocks are used, research workers should make some attempt to justify their

choice to indicate that they have at least given it some thought.

Background research
In evaluating the need to undertake a particular project, critical review of existing background information is essential. Surveys of the general biomedical literature suggest that over 50% of published papers have obvious statistical errors, and in some cases the conclusions are not supported by the published data (9-12). Such papers should not be accepted at face value. They may also cause another researcher to select inappropriate strategies and designs.

Time pressures on experimenters
There must be adequate time allowed for the completion of a project. Animals from an outside supplier need to be acclimatised for about two weeks (13), to enable them to adapt to the new environment, diet and microflora, otherwise they may be physiologically and immunologically abnormal. Techniques should have been optimised before the project is started. If staff need to learn manual skills, such as dosing procedures or surgical techniques, during the course of the experiment, this could introduce an unacceptable level of uncontrolled variation which could obscure treatment effects. There may be little apparent incentive for the researcher to reduce the number of animals, and there is sometimes a "comfort factor" in using large numbers, as it is hoped that this may obviate the need to repeat the experiment.

Pilot studies, using a few animals with the objective of determining whether a previously described model can be replicated in a new environment, are important for overall reduction. Such studies can reveal any hidden problems with dose rates or logistics, they may reveal scope for refinement, such as the choice of a more-humane endpoint, and they can provide data which can be used for estimating required sample sizes. Animals are more likely to be used unnecessarily by launching straight into a full-scale experiment, yet this is common practice.

Teamwork
Animal research is multidisciplinary, requiring expert input from research scientists, animal handlers, those concerned with animal welfare, biometricians, and possibly specialists in informatics. Procedures should be developed to allow these people to communicate effectively with each other (14). This will require written protocols and meetings to ensure that the project is feasible and can be done efficiently to the highest scientific standards.

A statistical approach to strategy
Muller et al. (15) provide a good basis for considering statistical aspects of research strategy. They recommend "top down planning", which involves five steps: a) specification of the experimental questions of interest; b) specification of testable hypotheses implied by these questions; c) specification of "target analyses", i.e. the statistical computations which will be necessary to estimate the presence, and size, of any treatment effects arising from the hypotheses to be tested; d) determination of the data sets which will be needed to enable such computations; and e) specification of the information which must be collected to provide the raw data. They also make a distinction between "confirmatory" experiments/analyses, which are designed to test a particular hypothesis that has been explicitly stated at the design stage, and "exploratory" ones, which explore or "mine" the data for unexpected or interesting information (15).

Exploratory data analysis should be encouraged, provided it is recognised that it gives biased estimates of statistical significance. For example, selecting the highest and the lowest mean values and performing a t test to see whether they differ significantly is unlikely to give the correct results if the experiment involves several treatment groups, unless an appropriate correction is made. This is because the t test and associated probability levels are only designed for analysing experiments which involve a comparison of the means of two groups defined before an experiment is undertaken. However, the next experiment could be designed specifically to compare two such mean values.

In some cases, the same experiment can be both confirmatory and exploratory. This is known as the "leapfrog" approach, in which each study is used to investigate a specific hypothesis, and also to generate new hypotheses for further study (15). As already noted, pilot studies can be used to gather

preliminary data which can then be used in the design of more-definitive studies.

Some complex sets of data involve measuring several different parameters for each individual. For example, haematology studies will provide data on red and white blood cell counts, packed cell volume, platelets, reticulocytes, etc. Each parameter could be analysed separately, or a multivariate analysis could be used to analyse the whole data set in a single analysis, taking account of any correlations between various parameters (16). With such complex data, it is often difficult to specify the hypotheses to be studied, and many of the multivariate statistical methods, such as principal components analysis, are essentially exploratory (17). Exploratory methods can also be used with the analysis of variance (ANOVA [18]).

Whether or not the approach suggested by Muller et al. (15) would suit all projects is open to debate. However, all research projects should be subject to strategic review at which the Three Rs are considered both before the experimental work commences, and periodically throughout the project. This could lead to savings of resources as well as to a reduction in animal use.

Experimental Design

Research projects often involve many separate experiments performed either sequentially or in parallel. Some of these can be uncontrolled and qualitative with a clear objective which may or may not be achieved. For example, a project may involve the production of a transgenic mouse strain which is either successful or unsuccessful. Success may depend on many of the factors discussed previously in the section on research strategy.

Controlled studies
Many experiments involve comparative studies ("controlled" experiments), in which two or more groups are compared which, as far as possible, only differ with respect to one or more treatments. These studies are capable of detecting quite subtle treatment effects, and are widely used in safety evaluation where the aim is to define the conditions under which exposure to a chemical has little or no effect. However, they need to be carefully designed if animals are not to be used unnecessarily. Many controlled studies could be improved by quite modest changes in experimental design or in the statistical analysis of the results (19).

Even the definition of what constitutes an "experiment" is not always clear. It is not unknown for a research worker to build up experimental data from a control group and some treated groups, with animals being added on an ad hoc basis without any a priori indication of what the eventual set of data will look like. The problem with this approach is that it assumes that the environment, experimental animals and measurement conditions remain constant. If this is not the case, any treatment comparisons will be confounded by these environmental variables. Thus, all experiments should be fully planned before any data are collected, and the experimental plan should define the treatment groups, species, strain, sex, age and number of animals, manner of randomisation, experimental design, time-scale, data to be collected, and proposed method of statistical analysis. Only in exceptional circumstances should the plan be modified once the experiment has started. Thus, it might be acceptable to eliminate a top dose group if the test chemical is unexpectedly toxic, although this could alter the power of the experiment. However, it would not usually be acceptable to add another dose group once an experiment is under way, because in such circumstances proper randomisation is impossible and there is no assurance that environmental variables will not have changed.

Some experiments could be improved by using more treatment combinations. Mead (20) suggested that most controlled experiments should involve 10-50 treatment combinations (usually in a factorial design) if resources are to be used efficiently, although this must be done with the appropriate statistical analysis. In one survey, only 10% of papers published in two toxicological journals had ten or more treatment combinations (21). Thus, if Mead is correct, there is scope for obtaining more information at little cost in terms of animal welfare or financial considerations.

Some researchers place great value on historical data, but this must be used with great care in view of the many factors that can influence a biological response (22). As noted earlier, factorial experimental designs (Figure 1), in which two or more factors (for example,

treatments, time, sex, strain, age, or diet) are varied simultaneously, usually make more efficient use of resources (including experimental animals) than do designs involving only a single factor (23-25). Where there are many factors which can influence a response, it is even possible to use fractional factorial and so-called "confounded" designs to explore their importance (26), though such designs are rarely used in biomedical research.

It is important to identify the "experimental unit", i.e. the unit which can be assigned at random and independently to a treatment. This may, for example, be an individual animal, a cage of animals, an animal for a specified time-period, or a part of an animal. The appropriate statistical analysis cannot be carried out unless the experimental unit is correctly identified. For example, if an experiment is designed with all the control animals in one cage and all the treated animals in another, the animal cannot be the experimental unit because animals in the same cage may have a common environment, so they are not independent of one another. A statistical analysis based on the assumption that the animal was the experimental unit could show whether or not the means of the two groups differed, but it would not be clear whether this was caused by the treatment or by environmental differences between the cages, possibly as a result of fighting in one, but not in the other, cage. With such a design, the cage is really the experimental unit, and no valid statistical analysis can be conducted because there are only two units in the experiment.

Reducing variability
The importance of uniformity of the experimental material cannot be over-emphasised, as it determines the extent to which treatment groups will be similar at the start of the experiment. Research workers often go

Figure 1: Example of a hypothetical 2 x 2 x 2 factorial experiment

A factorial arrangement of treatments

               Control               Treated
           Diet A    Diet B      Diet A    Diet B
Male         n         n           n         n
Female       n         n           n         n

The main interest might be in comparing the control and treated groups for a particular quantitative parameter. In this hypothetical example, it has been decided to incorporate two different diets and both sexes; "n" is the number of animals in each subgroup. Unequal group numbers can be accommodated with modern statistical analysis packages, although a completely blank cell would create problems. Note that the comparison between the treated and control group would probably require 12-22 animals, even if a single sex and diet were to be used (as estimated by Mead's "resource equation" method; see text). The full factorial design can similarly be done with n = 2 or 3, giving a total of 16-24 animals. However, the factorial design normally provides more information because it shows the extent to which any treatment difference depends on the sex and diet of the animals.

Almost any factor which it is thought could influence the response can be used. For example, instead of two diets, it would have been possible to use two time-points or strains, or another type of treatment, etc. Similarly, the factorial arrangement of treatments can be carried out as a randomised block design by, for example, letting n = 1 and repeating the mini-experiment, say, three times as three blocks.
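
The arithmetic behind Figure 1 can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the workshop's recommendations: the names FACTORS, factorial_cells and error_df are hypothetical, and the form of Mead's resource equation used here, E = N - T - (B - 1), is an assumption inferred from the worked figures quoted in the text (for example, E = 20 - 5 - 4 + 1 = 12 for the design in Figure 2).

```python
from itertools import product

# Factor levels taken from Figure 1; other factors or levels could be substituted.
FACTORS = {
    "treatment": ["Control", "Treated"],
    "diet": ["Diet A", "Diet B"],
    "sex": ["Male", "Female"],
}


def factorial_cells(factors):
    """Return every treatment combination, one tuple per cell of the design."""
    return list(product(*factors.values()))


def error_df(total_animals, treatments, blocks=1):
    """Assumed form of Mead's resource equation: E = N - T - (B - 1)."""
    return total_animals - treatments - (blocks - 1)


cells = factorial_cells(FACTORS)
for n in (2, 3):
    total = n * len(cells)  # n animals in each of the eight subgroups
    print(f"n = {n}: {total} animals, E = {error_df(total, len(cells))}")
```

With eight treatment combinations, n = 2 or 3 gives the totals of 16 and 24 animals quoted in the caption. Passing blocks=4 with 20 animals and five treatments gives the value of E worked out in the text for Figure 2.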

to great lengths to obtain animals of uniform weight and age (often leading to unnecessary wastage, since such heterogeneity can often be accommodated by blocking), so that after random assignment to treatment groups the mean weights and ages are very similar. However, for some reason, the genetic heterogeneity found in outbred stocks is often considered to be advantageous (27), even though it means that treated and control groups are more likely to differ genetically at the start of the experiment than if more-homogeneous animals had been used. Larger numbers of animals must then be used to compensate for these differences. Where genetic variation in response is considered to be important, it should be incorporated into the experimental design by using several different strains, stocks or breeds with a factorial experimental design (2). This can be done without increasing the overall total number of animals. Differences between stocks are usually much greater than differences between individuals within a single stock, and therefore this will often result in a much wider range of susceptibility phenotypes than the use of a single heterogeneous stock (27).

Those factors which could influence the outcome of the experiment, such as the genotype, sex and age of the animals, and sources of uncontrolled variation, such as measurement error or time and space variables, need to be identified. For example, many behavioural and physiological parameters can vary with the time of day due to circadian rhythms (28, 29). Even barometric pressure can affect animal behaviour (30). Many of these factors cannot be standardised, but often can be controlled by using randomised block designs (26). A randomised complete block design is one where the experimental unit (for example, the animals) has been placed into smaller, more-homogeneous subgroups, which can be kept together throughout the experiment to minimise variation due to non-homogeneous material and time and space variables (Figure 2). Such designs often lead to substantial increases in precision at no extra cost. They are widely used in agricultural research, but not by researchers using laboratory animals. Often the experimenters do not know what is causing the variability in their studies. For example, it may be animal-to-animal, day-to-day, sample-to-sample or measurement-to-measurement variation. There are special designs ("nested designs") which can be used to identify the sources of variation. Action can then be taken to control the variability, rather than simply increasing the number of animals.

Size of the experiment
Methods of determining an appropriate size for an experiment are not widely understood. This is not surprising, as this is an area of statistics which is complex and has not yet been solved satisfactorily for all situations by mathematical statisticians. One approach is to use power analysis (31). In the past this has been difficult, as the calculations are complex for experiments with more than two treatment groups. The availability of computer programs for estimating sample sizes, such as nQuery Advisor (32), has partially solved this problem. A power analysis requires: a) an estimate of the effect size likely to be of scientific interest; b) an estimate of the standard deviation (SD); c) specification of the desired power (i.e. the chance of detecting a specified treatment effect); and d) specification of the significance level to be used.

The comparison of two laboratory animal diets, a standard diet and a new formulation designed to reduce obesity, with the body weight of male mice after they have been on the diet for six months being the dependent variable, can be used as an example. From previous work, it is known that the mean body weight ± SD of this strain of mice at six months is 44 ± 3.8g. Suppose it was specified that the result would be of interest if the mice on the new formulation weighed 15% less (i.e. 37.4g) than those on the standard formulation, and that Student's t test with a significance level of 5% and a power of 80% were to be used. Necessarily, this specification is somewhat arbitrary. By using these figures, nQuery Advisor indicates that the experiment can be done by using five mice per group. However, 18 mice per group would be required if the effect size was a reduction in weight of 7.5% compared with the mice on the standard diet.

The results of a power analysis are highly dependent on the specifications, particularly if a small effect is to be detected. Specification of an effect size of potential interest might not be too difficult with a simple experiment such as the one outlined above, but would, for

Figure 2: Diagram of a randomised block experimental design with four blocks and five treatments

Block 1    2  3  1  4  5
Block 2    5  3  2  1  4
Block 3    1  3  5  4  2
Block 4    5  2  1  3  4

The numbers in each block are codes for the treatment given to each animal. The purpose of such a design is to reduce heterogeneity associated with time, space and some physical variation. The five animals in Block 1 would be chosen to be as similar as possible with respect to age, weight, genotype and any other variable which might influence the outcome. They would then be assigned at random to one of the five treatments. The animals would be housed in close proximity, and possibly even in the same cage, if the different treatments can be given in such a situation. When measurements are to be made on the animals, those in Block 1 would be treated as a group, with all measurements being made within a short period by the same person. Animals in the other blocks would likewise be selected to be as uniform as possible, but might differ in body weight or age, etc. from those in Block 1. They could be measured at a different time, if necessary by a different person. Thus, within each block any comparison among treatments would be made on animals which are in all possible ways as similar as possible. The unwanted variation shows up as differences between blocks. This is then removed mathematically in the statistical analysis.

The randomised block design provides a means to break down an experiment into smaller parts, which can be handled more conveniently. In most cases, it will increase the precision at no extra cost, apart from the need for a slightly more-complex statistical analysis.
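
A layout of the kind shown in Figure 2 could be generated as follows. This is a minimal sketch under stated assumptions: the function name randomised_block_layout and the fixed seed are hypothetical choices made for illustration, and the treatments are simply shuffled independently within each block using Python's standard random module.

```python
import random


def randomised_block_layout(n_blocks, treatments, seed=None):
    """Return one independently shuffled treatment order per block."""
    rng = random.Random(seed)  # seeded only so that the layout can be reproduced
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # random assignment of treatments within the block
        layout.append(order)
    return layout


# Four blocks and five treatment codes, as in Figure 2 (the seed is arbitrary).
layout = randomised_block_layout(n_blocks=4, treatments=[1, 2, 3, 4, 5], seed=1998)
for i, order in enumerate(layout, start=1):
    print(f"Block {i}:", *order)
```

Each row is a random permutation of the five treatment codes, so every treatment occurs exactly once in every block. The particular orders printed will differ from those in Figure 2, because the randomisation is independent of the one used to draw the figure.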

example, be difficult for a factorial experimental design with several treatment combinations and dependent variables. An estimate of the SD could be obtained from a previous study, from the literature, or from a pilot study. However, all these sources of information are subject to error. The significance level is usually set, somewhat arbitrarily, at 0.05, and the power is likewise often set at 80-90%. Thus, although doing a power analysis is a useful exercise in showing the potential capability of various proposed experiments to detect an effect of biological interest, it does not always give a definitive answer of the most appropriate size of each experiment. It is strongly recommended, therefore, that sample sizes are continually reviewed as experimental results become available.

An advantage of a power analysis is that it can be used to explore the implications of negative results, i.e. those in which there is no significant difference between treatment means. Such negative results might be of biological interest, particularly in safety testing, if they are real, but are of little interest if the lack of statistical significance was because the experiment was too small to detect a treatment difference of potential interest. Thus, a power analysis can be used to find out the probability that the experiment would have been able to detect a specified treatment effect if it was really there. For example, in the diet experiment discussed previously, suppose that an experiment had been undertaken to compare the two diets with ten mice per group, and that the mean
body weights at the end of the experiment had been virtually identical at 44g with an SD averaged across groups of ± 4g. The statistical analysis would indicate that there was no significant difference in body weight between the animals on the two diets, but would not indicate whether the experiment could have detected a difference of biological interest. A power analysis could be used to determine the probability that this experiment would have detected a 15% change in body weight given that the mean weight of the controls was 44 ± 4g, and there were ten mice per group. This is easily done by using nQuery Advisor, which indicates that with such an experiment there would have been a 93% chance of detecting such a treatment effect if it existed. Thus, if the diet was really capable of reducing body weight by 15%, there would have been a good chance of detecting it.

The alternative "resource equation" method for determining the size of an experiment (20), is a rule-of-thumb approach based on the observation that, for experiments involving quantitative variables, diminishing returns of information are found if the size of the experiment is increased so that there are substantially more than about 20 degrees of freedom for the error term, "E". However, good returns are found from using more animals if there are less than about 10 degrees of freedom for error. Thus, the optimum size of an experiment usually has between 10 and 20 error degrees of freedom. As an example, the earlier experiment involving a new formulation of a mouse diet compares the mean body weights of the mice by using an unpaired t test which has E = n - 2 error degrees of freedom, where n is the total number of animals. Thus, for E = 10-20, the experiment should use a total of 12-22 mice, or 6-11 mice per group, which is in broad agreement with estimates from the power analysis.

The resource equation method is easy to use with quite complex experimental designs. For completely randomised designs (i.e. not randomised block designs), E is the total number of animals minus the total number of treatment combinations; so, assuming a completely randomised design in Figure 1, if n = 4, there will be a total of 32 animals and eight treatment combinations, so E is 32 - 8 = 24. If n is reduced to 3, E will be 24 - 8 = 16. With randomised block designs, the number of blocks less one also needs to be subtracted. For example, in Figure 2 there is a total of 20 animals, with five treatments and four blocks; in this case, E = 20 - 5 - 4 + 1 = 12. Note that for experiments with only two treatment groups, group size might appropriately be between six and 11 to give E = 10-20, whereas for larger experiments such as the factorial experiment shown in Figure 1, which has eight treatment combinations, a group size of only three would be required to give E = 16. Thus, group size can be reduced and more information can usually be obtained if there are several treatment groups. Note that the use of blocking might appear to be counter-productive as it reduces E. However, the reduction in the error variance when using a randomised block design usually more than compensates for this (19).

As the resource equation method does not specify statistical power, effect size of interest, SD or significance level, it will not be known in advance how effective the experiment will be in detecting a particular treatment effect (21). It will be known, however, that little would be gained from using substantially more animals than the numbers required to give about 20 degrees of freedom for error. However, for large and complex experiments, the upper limit of 20 degrees of freedom may be so restrictive that it is impossible to have a balanced experiment with the same numbers of animals in each group, which is also desirable. Therefore, the limits of E = 10-20 should not be applied too rigidly. Also, for some in vitro tests where, for example, the experimental unit might be a tissue culture dish, including more dishes could be inexpensive. In such circumstances, it might be economical to allow E to be much higher than would generally be acceptable if animals were being used. Having done some experiments applying the resource equation method, it might be desirable to explore their power characteristics by using a power analysis.

Designing experiments with excessive numbers of animals resulting in unnecessarily high precision should also be avoided. The existence of very low probability (p) values (for example, p < 0.001) indicates that the experiment may have been unnecessarily large. Hendriksen et al. (33) found that assays of adsorbed diphtheria and tetanus vaccine could usually be undertaken with
half the number of animals presently required, yet still be within the limits of confidence stipulated by the European Pharmacopoeia and the World Health Organization (WHO). They also suggested that there should be some flexibility in national and international requirements to allow for individual circumstances.

Sequential experimentation

Sequential designs (34) in which the outcome of an experimental (and control) treatment is observed with small numbers of animals (in a "mini-experiment"), followed either by reaching a conclusion about the effect of the treatment or by taking a decision to treat another sample of animals, could be more widely used. Indeed, there may be scope for the more-widespread use of such designs in experimental surgery (35). Sequential designs often use substantially fewer animals than those involving fixed numbers (36), but they are only applicable to relatively simple experiments where the results are quickly available. The main limitation is that the results of each mini-experiment must be available before the next one is undertaken, so that the appropriate decision can be taken.

Sequential designs for evaluating the LD50 or ED50 (50% effective dose) of a compound have been known for many years, but are sometimes difficult to apply. Two improved sequential approaches have been described recently. The Fixed Dose Procedure (37) and the Acute Toxic Class method (38) have been evaluated in some detail and both use fewer animals than conventional LD50 tests; in addition, they can incorporate observations of toxic symptoms rather than using death as the endpoint. Wherever possible, an ED50 using a more-humane endpoint than death, such as a change in behaviour or a specified reduction in body temperature (39), should be used, even though it may not strictly indicate acute lethal potential.

A special case of sequential experimentation is to use a Bayesian approach, where the researcher's prior beliefs about the outcome of an experimental treatment are updated and modified by the availability of sequential sets of new experimental data. Unfortunately, Bayesian statistical methods are not discussed in most elementary statistical text books, and even introductory texts (40) have a highly mathematical approach which is inaccessible to most experimentalists. Thus, this approach would normally require the active involvement of a professional statistician.

In conclusion, there seems to be considerable scope for improving the design of individual experiments to reduce the number of animals needed for a given research output. A person with expertise in experimental design should be involved in planning experiments, with this involvement being formally recognised.

Statistical Analysis

The aim of statistical analysis is to extract all useful information from the data. The method of analysis will be closely linked to the experimental design and to the type of data to be produced. Researchers should normally have a clear idea of how they intend to analyse the results at the experimental design stage.

Common problems

A common statistical mistake seems to be the use of Student's t test to analyse experiments which have more than two treatment groups (41). In such circumstances, the t test may lack statistical power, so that real treatment effects can be missed. It can also lead to false positive results if many different comparisons are made, and it is not easy to test for potentiation or interaction, such as a different response to a drug treatment in males and females, in a factorial experimental design. Other common mistakes include failure to take account of variation among heterogeneous experimental groups, and failure to present any statistical analysis even though numerical data are generated (21).

Solutions

In practice, most measurement data from controlled experiments can be analysed by using the ANOVA, a highly versatile technique which can be used to analyse quite complex data sets. The method requires the assumption that the residuals (i.e. the deviations from the group means) are independently and normally distributed and are the same in each group, although a scale transformation can be used to achieve these conditions. Most statistical packages now
provide diagnostic methods for studying these residuals, and scale transformation is easy, should it be necessary. Although factorial designs are commonly used, they are not always correctly analysed in terms of the marginal means of each factor, and the interactions between factors. In view of the importance and value of factorial designs, more training in their use and analysis might be appropriate.

Many other statistical methods are potentially useful and could be more widely used, including various tests for comparing proportions, tests for trends and correlations, and multivariate methods such as principal components analysis for analysing data where there are several dependent variables (16).

In conclusion, there appears to be considerable scope for better statistical analysis of experiments as a means of extracting more useful information which, in the long run, should reduce animal use.

Interpretation and Communication

The results of each experiment need to be interpreted, and in many cases the design of the next experiment depends upon this interpretation, which is sometimes flawed. A common error is to base the interpretation on statistical significance (usually a p value) rather than on the magnitude of the treatment effect. A treatment effect can be statistically significant but of little biological interest and, conversely, a biologically interesting effect might not be statistically significant because the experiment was poorly designed and unable to detect it. As an example, a paper submitted for publication (and rejected in its present form) claimed that in a genetically heterogeneous population exposed to a carcinogen, the levels of DNA adducts (a measure of DNA damage) were significantly associated with genotype at a polymorphic locus. However, the paper was written in such a way that it was impossible to determine whether genotype was numerically important or was just one of many factors that affected the adduct levels. Thus, the question that was answered, as a result of undue emphasis on p values, was "does genotype affect adduct levels?", but the question of real interest, which could have been answered equally well with the available data, was "to what extent does genotype influence adduct levels?"

The results of any experiment should be clearly presented by using suitable tables and graphs. The presentation of tables, in particular, could often be improved. It is almost universal practice to quote a mean with an SD or SE based on the animals within that particular group, even though a pooled SD across groups would provide a better estimate of the population SD (41). If pooled SDs were more widely used, the means could be presented much more clearly without each one having ± SD appended to it. Journal editors and referees could suggest such modifications and also play an important part in improving the statistical quality and presentation of published papers. Papers should generally be sent to a specialist statistical referee, or a biologist with a good understanding of statistics, if they contain any numerical data. If editors have difficulty finding statisticians prepared to referee papers, they should consider offering a fee. Guidelines on statistical analysis have been published in a few journals (15), although these are difficult to develop in view of the wide range of methods that might be used. However, all journals publishing papers on studies which involved the use of experimental animals should include a statement requiring authors to adhere to strict humane standards, pointing out that this implies the use of the minimum number of animals needed to achieve the scientific objectives of the study (42).

In conclusion, there is scope for improving the interpretation and presentation of the results of individual experiments. This would improve the communication of research results, and could also lead to a reduction in animal use.

Legislation and Internal Review

Legislation and internal review procedures employing inspectors and/or ethics or animal experimentation committees have been developed, in part, as a response to the demand for the use of humane techniques. However, the law and requirements for review procedures vary between countries, and between institutes or companies within a country.
Laws relating to the use of animals in the European Union (EU) Member States have been enacted in response to Directive 86/609/EEC (43). Article 7(3) states that: "In a choice between experiments, those which use the minimum number of animals, involve animals with the lowest degree of neurophysiological sensitivity, cause the least pain, suffering, distress or lasting harm and which are most likely to provide satisfactory results shall be selected" (italics added for emphasis). Thus, in the EU, there is a clear requirement that reduction should be considered as an integral part of the review process, although countries differ in the exact wording of their specific national legislation.

Similar legislation has either not been introduced, or has not been enforced, in many countries worldwide. This has resulted in many countries lacking a legal requirement to use the minimum number of animals. Reviews of the development of alternatives in relation to the legislation in force in various countries indicate that approaches vary considerably (44-48). The establishment of some kind of ethics review committee seems to be common, although the composition, remit and effectiveness of such committees varies. Whatever their composition, such committees usually assess the quality of the proposed project, and often suggest improvements which could result in a reduction in animal use.

Adherence to regulations and guidelines not primarily designed to promote animal welfare, such as those associated with Good Laboratory Practice and international standards such as ISO 9001, might also lead to a reduction in the use of animals, because they ensure that procedures are carried out consistently, to predefined standards which are less likely to be erroneous or inappropriate (49). However, there is a danger that such regulations could prove inflexible, and could, in some cases, result in animals being used to satisfy bureaucratic, rather than scientifically justifiable, objectives.

The success of attempts to reduce the use of animals as a combined result of replacement and reduction initiatives should be monitored, although this presents problems. Few countries collect accurate statistics on laboratory animal use, and in no case can the use of laboratory animals be related to research output. Even within the EU, where Directive 86/609/EEC provides a common legislative framework, there is considerable variation between Member States in the collection of statistics on animal use. Some practical proposals have been made to standardise the statistics, including some modifications to the Directive to remove ambiguities, the use of a standard set of tables by all Member States, the use of legal measures to enforce adequate data collection, and the development of some methods of quality control to ensure the accuracy of the data (50). These suggestions need to be implemented.

Reduction in animal use should also be related to research output. Strong emphasis on total numbers and the setting of arbitrary targets should be avoided, as they can be counter-productive in terms of animal welfare. For example, such emphasis could result in excessive re-use of animals, at the expense of their welfare. Pharmaceutical research and chemical production are increasingly conducted by multinational companies, and research can be relocated to another country if the laws regulating animal use in a particular country were to make it too difficult. It would be counter-productive, in terms of animal welfare, if there was too much migration of research and testing to countries with less stringent animal welfare regulations.

International harmonisation (via the International Conferences on Harmonisation) of standards governing the toxicity testing of pharmaceuticals appears to have resulted in nearly a 50% reduction in the number of animals which are used to test some pharmaceuticals (51). If this reduction is realised across the board, it will also represent a considerable financial saving. Such harmonisation should continue and be extended to the testing of other chemical and biological compounds. Guidelines such as those produced by the OECD should be updated periodically, with emphasis being placed on the Three Rs, if possible by using an external ethical review panel. Where improved methods are introduced, their adoption should be promoted, and older guidelines should be deleted after a suitable period of time. Where full harmonisation cannot be achieved, mutual recognition of data by national control or regulatory authorities should be adopted, to avoid the duplication of tests.
In conclusion, there appears to be scope for greater international harmonisation of animal welfare legislation, methodology for internal review, and for the continued harmonisation of testing procedures for pharmaceutical, chemical and biological compounds, with specific emphasis being placed on the implementation of the Three Rs.

Education and Information

The provision of suitable education for researchers, animal house staff, members of ethics committees, and others involved in animal research should result in the use of fewer animals for a given research output (52). However, there is considerable variation between countries, and between institutes within each country, in the provision of such education. In some developing countries, there is no training available for animal technicians or veterinary professionals, and there is no requirement or possibility for researchers to attend courses on humane techniques in their own country. A relatively small additional effort, and financial support, from developed countries working through international organisations such as the International Council for Laboratory Animal Science and the WHO could lead to considerable improvements in laboratory animal welfare standards and the quality of animal experimentation in these countries, in conjunction with a reduction in animal use.

Within developed countries, such as those in the EU, there is usually a well-developed career structure for those working with animals, with a legal requirement for the provision of veterinary staff in laboratory animal facilities. Researchers are also required to undergo training in the handling and use of experimental animals. Several working parties established by the Federation of European Laboratory Animal Science Associations (FELASA) have considered the training of people using experimental animals. Four categories of such people were recognised (53): A - those taking care of animals; B - those carrying out animal experiments; C - those responsible for directing animal experiments; and D - laboratory animal science specialists. The working party covering categories A and C published its report in 1995, including a teaching syllabus (discussed in reference 53). The syllabus for people in category C includes some training in experimental design and statistics, which would serve to improve communication between researchers and statisticians. Although such courses have now been running for several years, it will take some time before all research scientists have been trained. Research should now be undertaken to determine how effective the training has been in improving the quality of research and in reducing animal usage, how the training can be improved, and whether refresher courses are needed.

The potential benefits of reducing the number of animals used in a given project are not always appreciated by research scientists. While the financial impact of using small numbers of additional animals appears to be negligible, research progress is often limited by the resources available. In some cases, researchers spend considerable amounts of time reading slides or making measurements on tissues taken from animals. Smaller experiments would save time and resources in addition to animals and, provided the experiments are well designed, research progress would be more rapid. Researchers might be more receptive to the concept of reduction if the wider economic benefits were realised.

Ideally, each project team should have access to statistical advice. However, communication between biologists and statisticians is often unsatisfactory, and in some (usually academic) institutes there may be a consultancy fee for statistical advice. This is a strong disincentive for the researcher to consult a statistician, with the possible result that animals are used unnecessarily, for the reasons discussed previously. Moreover, there is a need for improved communication between statisticians and researchers. If statisticians were consulted more frequently, and in some cases were included as joint authors, they could more easily acquire an understanding of the practical aspects which have to be taken into account when optimising the design of animal experiments. Funding authorities, and the institute where the work is carried out, should be required to provide sufficient resources to ensure that research is conducted to high standards, and this should include sufficient funds to allow input from a statistician in cases
where the researcher needs such advice. If the regulatory authorities administering animal welfare legislation are not satisfied that adequate statistical advice is available at a cost which the researchers are able to pay, the authority to carry out laboratory animal research on those premises might have to be withdrawn.

The possible value of computer-aided learning of experimental design and statistics needs to be evaluated. A number of programs have been developed for teaching statistics, but they are not all relevant and none has yet been used as a means of implementing the Three Rs. However, the use of such learning aids might help to cater for the heterogeneous backgrounds of researchers working with animals. Some home study, done in conjunction with formal teaching, might provide an economic way of increasing the knowledge of experimental design and statistics of researchers, and might provide useful material for refresher courses.

Conclusions and Recommendations

Reduction through the application of better research strategy, experimental design and statistical methods

1. All projects which might involve the use of experimental animals should be reviewed at regular intervals, to include consideration of how reduction, refinement and replacement are to be incorporated in the experimental matrix or strategy. The review panel should include at least one person independent of the research group undertaking the work.

2. Guidelines and/or checklists should be developed to assist this review process, and methods should be developed to monitor the success or failure of the project reviews in reducing animal use.

3. Workshops covering research strategy should be organised within companies, industry associations, academic institutes and scientific societies every 3-5 years. This topic should be supplementary to any coverage of experimental design, and should presume knowledge equivalent to FELASA's Category C syllabus for research scientists.

4. Lists of suitable reference literature and computer programs should be developed to resource such workshops.

5. To ensure that the optimum number of animals is used, a person with expertise in experimental design should be involved in planning all experiments, and this involvement should be formally recorded.

6. Guidelines should be developed for implementing Recommendation 5, which should include ways in which feedback and improvement might be incorporated into the consultation process.

7. The education and training of experimenters should include discussion of the types of experimental design and their applications. The experimenters should be actively involved by using real case studies wherever possible. Training courses should aim to bring experimenters to a level where they can communicate effectively with experts in experimental design, and should include an awareness of the range of available experimental designs and statistical analyses, and of the interpretation and presentation of results. Those involved in the review process outlined above should be informed of the level of training in experimental design achieved by each researcher.

8. Resource materials, including a syllabus, for such courses should be gathered and/or developed.

9. Journal editors should be encouraged to require authors to provide brief descriptions of the type of experimental design used, and to improve the presentation of data and their analysis in publications.

Legislative and organisational framework

10. Laws to protect laboratory animals and to encourage the development of high ethical standards in the use of animals should be enacted in those countries where no such laws currently exist.

11. There should be increased support for developing countries in establishing guidelines, legislation and educational programmes in relation to the use of experimental animals and the Three Rs,
with particular reference to the potential scientific and economic benefits from improving the quality of biomedical science. This could most effectively be implemented through organisations with suitable links with such countries, such as the WHO and the International Council for Laboratory Animal Science.

12. Where there are scientific and academic exchanges between developing and developed countries, information and discussion on the Three Rs should form part of the study and academic programmes.

13. Where laws exist to protect laboratory animals, national authorities should ensure that they are effectively implemented.

14. All institutes where experimental animals are used should be required to maintain and document internal review processes that specifically address the implementation of the Three Rs. Examples of people who could be involved in this process include animal technical and veterinary staff, those with knowledge of alternative methods, those with expertise in statistics in relation to the needs of biological projects, and people who are independent of the work of the institution.

15. Information on internal review processes, adopted either as a consequence of national legislation or voluntarily, should be collated to assist in further developing these processes.

16. All EU Member States should be required to produce annual statistics on the use of experimental animals which are accurate, comprehensive and comparable. Research should be undertaken to identify methods of measuring trends in animal use relative to scientific output.

17. International harmonisation of testing procedures should be a continuing process, and should specifically address ways in which the Three Rs can be further implemented.

Education, training and information

18. All research workers who use experimental animals should have appropriate training in, for example, the biology of the species to be used, possible microbiological hazards to humans and animals, the design and conduct of experiments, anaesthesia, analgesia, experimental techniques, replacement alternatives, ethical aspects, and analysis of scientific papers. Such training could be based on the curriculum proposed by FELASA, but with adjustment for individual circumstances.

19. In training scientists in the humane use of laboratory animals, the positive benefits in terms of improved scientific quality and output resulting from reducing the numbers of animals used (as well as from the consideration of other alternatives) should be stressed.

20. Funding authorities and organisations involved with animal research should ensure that the necessary resources are available for conducting humane research. These facilities should include access to training and education for all categories of staff, and the provision of adequate statistical advice at a reasonable cost.

21. Communication between researchers and statisticians/biometricians needs to be improved. Consideration should be given to the development of training courses or workshops on practical and theoretical aspects of animal experimentation for non-clinical statisticians/biometricians.

22. An evaluation of computer-aided learning courses, databases and information sources which could assist in implementing reduction should be undertaken, and the results made available to teachers involved in training scientists in humane techniques.

23. Education is a continuing process. Refresher courses should be offered to reinforce and update information. All courses should be taught in a flexible manner to take account of, and benefit from, the heterogeneous background of most participants.

General recommendations and possible methods for implementation

24. The recommendations and conclusions of two previous workshops, on "The
Three Rs: the Way Forward" (54; Table I) and "Guidelines for Reviewing Manuscripts on Studies Involving Live Animals" (55; Table II), which are relevant to this workshop, should be reviewed in terms of the progress made with their implementation. Where progress appears to have been unsatisfactory, ways should be sought to implement the recommendations more effectively.

25. A Standing Committee on Reduction, of 5-10 members with suitable expertise in laboratory animal science and technology, statistics, and biomedical education and information, should be established under the auspices of an appropriate governmental or charitable organisation, specifically to progress the recommendations in this and other reports.

26. A person should be appointed, initially on a short-term contract, to support the above committee and to provide human resources for implementing those recommendations where immediate intervention might be successful.

References

1. Anon. (1994). ECVAM News & Views. ATLA 22, 7-11.
2. Russell, W.M.S. & Burch, R.L. (1959). The Principles of Humane Experimental Technique, 238 pp. London: Methuen.
3. Flecknell, P.A. (1994). Refinement of animal use: assessment and alleviation of pain and distress. Laboratory Animals 28, 222-231.
4. Festing, M.F.W. (1997). Variation and its implications for the design of experiments in toxicological research. Comparative Haematology International 7, 202-207.
5. Gruneberg, H. (1952). The Genetics of the Mouse,

Table I: Reduction alternatives: conclusions and recommendations of ECVAM
workshop report 11 a

13. In cases where a choice between species is possible, there is generally no scientific justi-
fication for using more of the smaller species than of the larger one.

14. Research strategy should be considered carefully, with a view to reducing the numbers of
animals used. The example of Hendriksen et al. (7), in choosing strains of laboratory mice
in order to minimise the numbers needed in specific biological assays, should be followed
for those assays which use large numbers of animals and which are unlikely to be
replaced with in vitro alternatives in the near future.

15. The design of regulatory testing procedures, including the sample sizes required, should
be reviewed regularly, possibly as part of international harmonisation.

16. Substantial reduction in animal use could be achieved by further harmonising toxicity
testing regulations, for example, with respect to group sizes, dose levels and the length of
studies.

17. In view of the uncertainties inherent in "extrapolating" to humans, the need for very
high precision in data provided by animal experiments should be reconsidered.

18. There is evidence that some non-regulatory animal experiments are poorly designed and
incorrectly analysed. As a minimum, all research workers should have adequate training
in experimental design and the proper use of statistical methods.

19. The concept of the "named statistician" as an essential part of the regulatory framework
of animal experimentation should be explored.

a Taken from Balls et al. (54).



Table II: Conclusions from a workshop on guidelines for reviewing manuscripts
on studies involving live animals a

1. All journals publishing papers which might involve animal suffering or distress should be
encouraged to have a statement of journal policy with respect to the use of research ani-
mals. This should normally be published in the instructions to authors.

2. No single policy statement is appropriate for all journals.

3. A set of example statements should be developed, which could be made available to edi-
tors, in order to assist them in developing or enhancing an appropriate policy.

4. A policy statement alone is generally not sufficient to ensure that it is followed. All ref-
erees should have a copy of the guidelines for authors, so that failure to comply is more
likely to be noted by them.

5. Editors could consider requiring authors to sign a declaration that they have followed the
appropriate ethical procedures, or alternatively, an appropriate statement of ethical com-
pliance should be included in the journal article.

6. The paper should contain some justification for the use of animals, stating why no alter-
native approach could be used.

7. In some cases, papers do not give sufficient information about the animals to enable other
research workers to repeat or correctly interpret a study. A checklist of information
which might be appropriately recorded in the materials and methods section of a partic-
ular paper could be helpful to journal editors. It is not suggested that all the information
on the checklist would be used in every paper, but it would act as a reminder to authors,
editors and referees of the need to provide sufficient information.

8. It is not possible for an ad hoc working party to develop and disseminate the appropriate
material to journal editors. This will require effort over quite a long period, and a mod-
est budget.

9. Dissemination should, as far as possible, be through existing "umbrella" organisations for
science editors, such as the Council of Biology Editors (CBE), the European Association
of Science Editors (EASE) and the International Committee of Medical Journal Editors
(ICMJE).

10. It was agreed that ECVAM should be asked to consider establishing a working party on
this subject, with the aim of implementing and extending the ideas set out in this docu-
ment.

a Taken from Festing & van Zutphen (55).

650 pp. The Hague, The Netherlands: Nijhof.
6. Festing, M.F.W. (1991). The reduction of animal use through genetic control of laboratory animals. In Replacement, Reduction and Refinement: Present Possibilities and Future Prospects (ed. C.F.M. Hendriksen & H.B.W.M. Koeter), pp. 193-212. Amsterdam: Elsevier.
7. Hendriksen, C.F.M., Slob, W., van der Gun, J.W., Westendorp, J.H.L., den Bieman, M., Hesp, A. & van Zutphen, L.F.M. (1994). Immunogenicity testing of diphtheria and tetanus vaccines by using isogenic mice with possible implications for potency testing. Laboratory Animals 28, 121-129.
8. Festing, M.F.W. (1990). Contemporary issues in toxicology: use of genetically heterogeneous rats and mice in toxicological research: a personal perspective. Toxicology and Applied Pharmacology 102, 197-204.

9. Gore, S.M., Jones, I.G. & Rytter, E.C. (1977). Misuse of statistical methods: critical assessment of articles in BMJ from January to March 1976. British Medical Journal 8 January 1977, 85-87.
10. Altman, D.G. (1982). Statistics in medical journals. Statistics in Medicine 1, 59-71.
11. Altman, D.G. (1982). Misuse of statistics is unethical. In Statistics in Practice (ed. S.M. Gore & D.G. Altman), pp. 1-2. London: British Medical Association.
12. Altman, D.G. (1991). Practical Statistics for Medical Research, 611 pp. London: Chapman & Hall.
13. Beynen, A.C., Festing, M.F.W. & van Montfort, M.A.J. (1993). Design of animal experiments. In Principles of Laboratory Animal Science (ed. L.F.M. van Zutphen, V. Baumans & A.C. Beynen), pp. 209-240. Amsterdam: Elsevier.
14. Stephens, U.K. & Moore, M.T. (1997). Harmonisation of the research effort: a communication model in the USA. In Harmonisation of Laboratory Animal Husbandry: Proceedings of the Sixth FELASA Symposium, 19-21 June 1996, Basel, Switzerland (ed. P.N. O'Donoghue), pp. 170-172. London: The Royal Society of Medicine.
15. Muller, K.E., Barton, C.N. & Benignus, V.A. (1984). Recommendations for appropriate statistical practice in toxicological experiments. Neurotoxicology 5, 113-126.
16. Festing, M.F.W., Hawkey, C.M., Hart, M.G., Turton, J.A., Gwynne, J. & Hicks, R.M. (1984). Principal components analysis of haematological data from F344 rats with bladder cancer fed the retinoid N-(ethyl)-all-trans-retinamide. Food and Chemical Toxicology 22, 559-572.
17. Maxwell, A.E. (1977). Multivariate Analysis in Behavioural Research, 164 pp. London: Chapman & Hall.
18. Hoaglin, D.C., Mosteller, F. & Tukey, J.W. (1991). Fundamentals of Exploratory Analysis of Variance, 430 pp. New York: John Wiley.
19. Festing, M.F.W. (1992). The scope for improving the design of laboratory animal experiments. Laboratory Animals 26, 256-267.
20. Mead, R. (1988). The Design of Experiments, 620 pp. Cambridge: Cambridge University Press.
21. Festing, M.F.W. (1996). Are animal experiments in toxicological research the "right" size? In Statistics in Toxicology (ed. B.J.T. Morgan), pp. 3-11. Oxford: Clarendon Press.
22. Roe, F.J.C. (1994). Historical histopathological control data for laboratory rodents: valuable treasure or worthless trash? Laboratory Animals 28, 148-154.
23. Fisher, R.A. (1960). The Design of Experiments, 248 pp. New York: Hafner Publishing Company.
24. Maxwell, S.E. & Delaney, H.D. (1989). Designing Experiments and Analyzing Data, 902 pp. Belmont, CA, USA: Wadsworth Publishing Company.
25. Cox, D.R. (1958). Planning Experiments, 208 pp. New York: John Wiley.
26. Cochran, W.G. & Cox, G.M. (1957). Experimental Designs, 611 pp. New York: John Wiley.
27. Festing, M.F.W. (1995). Use of a multi-strain assay could improve the NTP carcinogenesis bioassay program. Environmental Health Perspectives 103, 44-52.
28. Fox, R.R., Laird, C.W. & Kirshenbaum, J. (1974). Effect of strain, sex, and circadian rhythm on rabbit serum bilirubin and iron levels. Proceedings of the Society of Experimental Biology and Medicine 145, 421-427.
29. Schwartz, W.J. & Zimmerman, P. (1990). Circadian timekeeping in Balb/c and C57BL/6 inbred mouse strains. Journal of Neuroscience 11, 3685-3694.
30. Sprott, R.L. (1967). Barometric pressure fluctuations: effect on the activity of laboratory mice. Science 157, 1206-1207.
31. Muller, K.E. & Benignus, V.A. (1992). Increasing scientific power with statistical power. Neurotoxicology and Teratology 14, 211-219.
32. Elashoff, J.D. (1995). nQuery Advisor User's Guide, 202 pp. Boston, MA, USA: Statistical Solutions.
33. Hendriksen, C.F.M., van der Gun, J.W., Marsman, F.R. & Kreeftenberg, J.G. (1987). The effects of reductions in the numbers of animals used for the potency assay of the diphtheria and tetanus components of adsorbed vaccines by methods of the European Pharmacopoeia. Journal of Biological Standardisation 15, 287-297.
34. Armitage, P. (1975). Sequential Medical Trials, 2nd edn, 191 pp. Oxford: Blackwell Scientific Publications.
35. Festing, M.F.W. (1975). Experimental design. In An Introduction to Experimental Surgery (ed. J. De Boer, J. Archibald & H.G. Downie), pp. 5-45. Amsterdam: Excerpta Medica.
36. Chamove, A.S. (1996). Reducing animal numbers: sequential sampling. AWIC Newsletter 7, 3-6.
37. Whitehead, A. & Curnow, R.N. (1992). Statistical evaluation of the fixed-dose procedure. Food and Chemical Toxicology 30, 313-324.
38. Schlede, E., Mischke, U., Roll, R. & Kayser, D. (1992). A national validation study of the acute-toxic-class method: an alternative to the LD50 test. Archives of Toxicology 66, 455-470.
39. Wong, J.P., Saravolac, E.G., Clement, J.G. & Nagata, L.P. (1997). Development of a murine hypothermia model for study of respiratory tract influenza virus infection. Laboratory Animal Science 47, 143-147.
40. Lee, P.M. (1989). Bayesian Statistics: An Introduction, 344 pp. London: Arnold.
41. Festing, M.F.W. (1994). Reduction of animal use: experimental design and quality of experiments. Laboratory Animals 28, 212-221.
42. Boisvert, D.P.J. (1997). Editorial policies and animal welfare. In Animal Alternatives, Welfare and Ethics (ed. L.F.M. van Zutphen & M. Balls), pp. 399-404. Amsterdam: Elsevier.
43. Anon. (1986). Council Directive of 24 November 1986 on the approximation of laws, regulations and administrative provisions of the Member States regarding the protection of animals used for experimental and other scientific purposes. Official Journal of the European Communities L358, 1-29.
44. Strandberg, J.D. (1997). National/regional developments in alternatives and animal use: an overview. In Animal Alternatives, Welfare and Ethics (ed. L.F.M. van Zutphen & M. Balls), pp. 125-126. Amsterdam: Elsevier.
45. de Greeve, P. & de Leeuw, W. (1997). Developments in alternatives and animal use in Europe. In Animal Alternatives, Welfare and Ethics (ed. L.F.M. van Zutphen & M. Balls), pp. 127-136. Amsterdam: Elsevier.
46. Johnson, N.E. (1997). Developments in alternatives and animal use in Australia/New Zealand. In Animal Alternatives, Welfare and Ethics (ed. L.F.M. van Zutphen & M. Balls), pp. 137-143. Amsterdam: Elsevier.
47. Kuroda, Y. (1997). Developments in alternatives and animal use in Japan. In Animal Alternatives, Welfare and Ethics (ed. L.F.M. van Zutphen & M. Balls), pp. 145-150. Amsterdam: Elsevier.
48. Stephens, M.L. (1997). Developments in alternatives and animal use in North America. In Animal Alternatives, Welfare and Ethics (ed. L.F.M. van Zutphen & M. Balls), pp. 151-154. Amsterdam: Elsevier.
49. de Vrey, P. (1997). The use of quality assurance systems in maintaining control over animal experiments. In Harmonisation of Laboratory Animal Husbandry: Proceedings of the Sixth FELASA Symposium, 19-21 June 1996, Basel, Switzerland (ed. P.N. O'Donoghue), pp. 79-83. London: The Royal Society of Medicine.
50. Rusche, B., Sauer, U.G. & Kolar, R. (1997). Evaluation of the Statistical Information Concerning the Number of Animals Used for Experimental or Other Scientific Purposes in the EU Member States According to Directive 86/609/EEC, 46 pp. Neubiberg, Germany: Akademie für Tierschutz.
51. Lumley, C.E. & van Cauteren, H. (1997). Harmonisation of international toxicity testing guidelines for pharmaceuticals: contribution to refinement and reduction in animal use. EBRA Bulletin November 1997, 4-9.
52. Festing, M.F.W. (1981). The "defined" animal and the reduction of animal use. In Animals in Research: New Perspectives in Animal Experimentation (ed. D. Sperlinger), pp. 285-306. Chichester: John Wiley and Sons.
53. van Zutphen, L.F.M. (1997). Education and training in laboratory animal science. In Harmonisation of Laboratory Animal Husbandry: Proceedings of the Sixth FELASA Symposium, 19-21 June 1996, Basel, Switzerland (ed. P.N. O'Donoghue), pp. 160-166. London: The Royal Society of Medicine.
54. Balls, M., Goldberg, A.M., Fentem, J.H., Broadhead, C.L., Burch, R.L., Festing, M.F.W., Frazier, J.M., Hendriksen, C.F.M., Jennings, M., van der Kamp, M.D.O., Morton, D.B., Rowan, A.N., Russell, C., Russell, W.M.S., Spielmann, H., Stephens, M.L., Stokes, W., Straughan, D.W., Yager, J.D., Zurlo, J. & van Zutphen, L.F.M. (1995). The Three Rs: the way forward. The report and recommendations of ECVAM workshop 11. ATLA 23, 838-866.
55. Festing, M.F.W. & van Zutphen, L.F.M. (1997). Guidelines for reviewing manuscripts on studies involving live animals. Synopsis of the workshop. In Animal Alternatives, Welfare and Ethics (ed. L.F.M. van Zutphen & M. Balls), pp. 405-410. Amsterdam: Elsevier.
