Automated Stratigraphic Correlation - F. Agterberg (Elsevier, 1990) WW
1. A.J. Boucot
EVOLUTION AND EXTINCTION RATE CONTROLS
3. L.J. Salop
PRECAMBRIAN OF THE NORTHERN HEMISPHERE
4. J.L. Wray
CALCAREOUS ALGAE
5. A. Hallam (Editor)
PATTERNS OF EVOLUTION, AS ILLUSTRATED BY THE FOSSIL RECORD
8. D. Jánossy
PLEISTOCENE VERTEBRATE FAUNAS OF HUNGARY
Automated Stratigraphic
Correlation
F.P. Agterberg
Mathematical Applications in Geology Section, Geological Survey of Canada,
601 Booth Street, Ottawa, Ont., K1A 0E8, Canada
ELSEVIER
Amsterdam - New York - Oxford - Tokyo 1990
ELSEVIER SCIENCE PUBLISHERS B.V.
Sara Burgerhartstraat 25
P.O. Box 211, 1000 AE Amsterdam, The Netherlands
ISBN 0-444-88253-7
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or
transmitted in any form or by any means, electronic, mechanical, photocopying, recording or
otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V./
Physical Sciences & Engineering Division, P.O. Box 330, 1000 AH Amsterdam, The Netherlands.
Special regulations for readers in the USA - This publication has been registered with the Copyright
Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC
about conditions under which photocopies of parts of this publication may be made in the USA. All
other copyright questions, including photocopying outside of the USA, should be referred to the
copyright owner, Elsevier Science Publishers B.V., unless otherwise specified.
No responsibility is assumed by the Publisher for any injury and/or damage to persons or property
as a matter of products liability, negligence or otherwise, or from any use or operation of any meth-
ods, products, instructions or ideas contained in the material herein.
FOREWORD
Geological correlation of strata plays a key role in sedimentary basin
analysis. Such correlation, particularly when scaled in linear time,
requires that a series of unique points for non-recurrent events like
occurrences of fossils must first be determined, common to the
sedimentary record as observed at different sites. An important
contention of geological correlation is that once such events, probably
grouped in biozones, have been properly determined and defined, these
units can indeed be used for correlation. This statement, which might
seem to be trivial, is made here because existing stratigraphic codes show
how to construct stratigraphic units but they do not define how to correlate
them. The actual correlation generally takes place in the subjective
domain of regional experts on a particular basin or time period.
Procedures for correlation or stratigraphic equivalence depend on
subjective evaluation of the unique relation of each individual site record
to the derived and accepted standard. It follows that correlation as
practiced in geology cannot be readily verified without a detailed, and
probably exhaustive review of all the underlying facts. Traditionally
there is no method of formulating the uncertainty in fixation of individual
records to the standard. Hence biostratigraphy is often considered more
an art than a science. The problem of using subjective judgement
only is not so much that it leads to right or wrong stratigraphy, but that a
single solution is proposed. An attempt should be made to establish reasonable
criteria for successful correlation by providing insight into the actual
uncertainty in correlation, either in millions of years or in depth in meters.
PREFACE
The purpose of this book is to provide an introduction to recent
developments in automated stratigraphic correlation using computer
programs for ranking and scaling of stratigraphic events. It is intended for
advanced geology students, research workers and teachers with a
background in stratigraphy and an interest in using computer-based
techniques for problem-solving. The mathematical background provided
is sufficient to justify the methods that are used but the equations are
relatively few and concentrated in specific sections (mainly in Chapters 3,
6 and 8) and may be skipped by readers who are not mathematically
inclined. Occasionally, use is made of elementary statistical techniques
(t-test, chi-squared test or analysis of variance) on which additional
explanations can be found in one of the numerous excellent introductory
textbooks on probability and statistics in existence.
After data inventory for a region or time period, the stratigrapher
first proceeds to establish a regional zonation which later can be used for
correlation. Age calibration is a requirement for constructing this
zonation as well as for the process of stratigraphic correlation. The
computer can play an integral role in these procedures. In this book, the
emphasis is on worked-out examples of application of ranking, scaling and
correlation of stratigraphic events using relatively small datasets, for
illustration of the intermediate steps made within the computer between
input and output. It should be clear to the reader that automated
stratigraphic correlation is not a simple automatic process such as
alphabetic sorting. The stratigrapher has to integrate vast amounts of
information which cannot possibly be stored in large databanks. Every
piece of evidence or link between different pieces of evidence or hypotheses
has its own sources of uncertainty associated with it. Using a computer for
problem-solving may violate uncertainties that cannot be quantified.
Computer input, therefore, always should be evaluated critically by expert
stratigraphers and paleontologists.
In total there are ten chapters. The purpose of the first two chapters
is to introduce the probabilistic method for automated stratigraphic
correlation and to discuss principles of quantitative stratigraphy.
Applications of mathematical statistics and computer science not
specifically dealing with ranking and scaling but of interest to
stratigraphers and paleontologists are presented in Chapter 3. Coding and
file management of stratigraphic information (Chapter 4) provides the
input required for ranking and scaling of biostratigraphic events by means
of the RASC method treated in the next two chapters. A number of topics
including rank correlation, precision of the scaled optimum sequence,
normality testing and the modified RASC method are presented
separately (in Chapters 7 and 8) as extensions and refinements of the
RASC method. The chapter on event-depth curves and multi-well
comparison (Chapter 9) contains examples of regional applications with
automated correlation between stratigraphic sections. Finally, in Chapter
10, much of the material on methods presented in earlier chapters is
summarized in a general description of the micro-RASC system of
computer programs for ranking, scaling and regional correlation of
stratigraphic events.
I am indebted to many individuals and organizations for support.
Foremost among these is Felix Gradstein of the Atlantic Geoscience
Centre of the Geological Survey of Canada who started me thinking about
automated biostratigraphic correlation in 1978. From 1979 to 1986, I had
the privilege of being the Leader of Project 148 (Quantitative
Stratigraphic Correlation Techniques) of the International Geological
Correlation Programme co-sponsored by Unesco and the International
Union of Geological Sciences. This project and later the Committee on
Quantitative Stratigraphy of the International Commission on
Stratigraphy provided the framework for regular discussions with most
colleagues active in method development for quantitative stratigraphy. I
have used suggestions of many of these colleagues, especially
P.O. Baumgartner (Université de Lausanne, Switzerland), G.F. Bonham-
Carter (Geological Survey of Canada, Ottawa), J.C. Brower (Syracuse
University, Syracuse, New York, U.S.A.), J.M. Cubitt (Poroperm, Chester,
U.K.), E. Davaud (Université de Genève, Switzerland), P.H. Doeven
(Petro-Canada, Calgary, Canada), C.W. Drooger (University of Utrecht,
the Netherlands), L. Edwards (U.S.G.S., Reston, Virginia, U.S.A.),
C.M. Griffiths (University of Trondheim, Norway), J. Guex (Université de
Lausanne, Switzerland), C.W. Harper, Jr. (University of Oklahoma,
Norman, U.S.A.), W.W. Hay (University of Colorado, Boulder, Colorado,
U.S.A.), I. Lerche (University of South Carolina, Columbia, S.C., U.S.A.),
D.F. Merriam (Wichita State University, Wichita, Kansas, U.S.A.),
M. Rubel (Academy of Sciences, Estonian SSR, Tallinn, U.S.S.R.),
W. Schwarzacher (Queen's University, Belfast, U.K.), B. Stam (Shell
Syria, Damascus), J.E. Van Hinte (Free University, Amsterdam, the
Netherlands) and M. Williamson (Shell Canada, Calgary, Canada).
Thanks are due to these individuals for their critical remarks during
development of the ranking and scaling techniques to be discussed. I am
grateful for assistance by computer programmers at the Geological Survey
of Canada especially to Ning Lew, Louis Nel and Jacqueline Oliver, and to
Dan Byron, Marc D’Iorio, and Kazim Nazli as my students at the Ottawa-
Carleton Geoscience Centre.
For this book I have made extensive use of material in publications
authored or co-authored by me during the past 10 years. On eight
occasions, I was one of the lecturers of the one-week Quantitative
Stratigraphy Short Course given under the auspices of IGCP Project 148
and the Committee on Quantitative Stratigraphy in Canada (2 ×), Brazil,
China, Holland, India, U.K. and U.S.A. Mostly attended by stratigraphers
and quantitative geoscientists from oil companies, this course provided a
stimulating environment for jointly exploring and testing ideas on how to
use computers intelligently. Those familiar with the earlier work will find
many extensions of the RASC method made during the past three years
especially in the fields of coding the original stratigraphic information,
comparison with other methods and statistical evaluation. For example, it
was well known that ranges on average range charts constructed by means
of RASC tend to be shorter than those resulting from most other methods.
The new modified RASC method yields range charts with wider ranges
connecting entries to exits for taxa in those stratigraphic sections where
these taxa were observed at their lowest and highest positions relative to
all other taxa considered.
The Geological Survey of Canada has allowed me to work on this book
project which involved extensive support including drafting and
photography. The project would not have been possible without the
invaluable help in word-processing received from Janet Gilliland, Shirley
Kostiew, Guylaine Leger and Diane Winsor. Martin Tanke of Elsevier has
provided guidance and encouragement. Last but not least I thank my wife
Codien for her help and understanding.
F.P. Agterberg,
Ottawa, January 1990
CONTENTS
Foreword
Preface
CHAPTER 1. PROBABILISTIC METHOD FOR AUTOMATED
STRATIGRAPHIC CORRELATION
1.1 Introduction
1.2 IGCP Project 148
1.3 Quantitative biostratigraphy
1.4 Quantitative chronostratigraphy
1.5 Quantitative lithostratigraphy
1.6 Recent developments in stratigraphy
INDEX
CHAPTER 1
PROBABILISTIC METHOD FOR AUTOMATED
STRATIGRAPHIC CORRELATION
1.1 Introduction
Emphasis in this book is on subjects (1), (5) and (6). This includes the
construction of range charts depicting periods of existence for different
fossil taxa in comparison with one another.
There are few basic studies that shed light on the actual distribution
of fossils in rocks from a statistical point of view. For a review and
applications to modern benthic Foraminifera and Late Cretaceous
molluscs, see Buzas et al. (1982). The geological factors affecting the
chance of event detection generally remain unknown and cannot be
modelled prior to extensive sampling and stratigraphic analysis itself. On
the other hand, it is widely known from repeated observations that for
many groups of organisms, the majority of taxa are found at relatively few
sampling sites and with few specimens. Figure 1.1 shows the cumulative
number of highest or lowest occurrences of taxa in well or outcrop sections
in different areas of a large number of taxa of Mesozoic radiolarians,
Cenozoic dinoflagellates, Cenozoic Foraminifera and Cretaceous
nannofossils. The radiolarian and nannofossil data use lowest and highest
occurrences; the dinoflagellates and foraminifers highest occurrences only.
The graphs of Figure 1.1 show that the number of lowest or highest
occurrences of taxa found in at least 1, 2, 3, ..., n sites, decreases steadily.
In other words, the majority of species (events) occur at few sites and few
species (events) are ubiquitous. It is noted that the sections used for the
examples vary in density and spacing and the shapes of the curves in
Figure 1.1 are influenced by methods of sampling. In Figure 1.1,
dinoflagellate events are most localized and nannofossils least. The use of
first and last occurrences increases traceability of taxa as shown for the
radiolarians and nannofossils. Obviously, quantitative stratigraphic
methods may want to cull the data so as to avoid use of species for which
the number of events is limited and which therefore enhance “noise”. Thresholds in, for
example, ranking and scaling (RASC) are set such that no use is made of
events that occur in fewer than k_c sections; k_c is set by the user. Rare events
of value for age calibration can be re-introduced later, during final
analysis.
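The culling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a mapping from section name to the events observed there), the function names and the parameter `min_sections` (standing in for the user-set threshold) are hypothetical and are not taken from the RASC program itself.

```python
from collections import Counter


def count_sections(sections):
    """Return, for each event, the number of sections in which it occurs."""
    counts = Counter()
    for events in sections.values():
        counts.update(set(events))  # count each event once per section
    return counts


def cumulative_frequency(counts):
    """Number of events found in at least 1, 2, ..., n sections."""
    max_n = max(counts.values())
    return [sum(1 for c in counts.values() if c >= n) for n in range(1, max_n + 1)]


def cull_events(sections, min_sections):
    """Keep only events that occur in at least `min_sections` sections."""
    counts = count_sections(sections)
    keep = {event for event, c in counts.items() if c >= min_sections}
    return {name: [e for e in events if e in keep] for name, events in sections.items()}


# Hypothetical toy data: three wells and the events observed in each.
wells = {
    "well_1": ["A", "B", "C"],
    "well_2": ["A", "B", "D"],
    "well_3": ["A", "E"],
}
counts = count_sections(wells)
curve = cumulative_frequency(counts)          # steadily decreasing, as in Figure 1.1
culled = cull_events(wells, min_sections=2)   # events C, D and E are dropped
```

In this toy case only event A is ubiquitous; raising `min_sections` trades traceability against the number of events retained, and rare events of value for age calibration would have to be re-introduced by hand afterwards.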
Fig. 1.1 Cumulative frequency distributions of stratigraphic first and last occurrences of microfossils in
Mesozoic and Cenozoic strata: 1 = number of dinoflagellates occurring in 2, 3, ... wells; data for 249 last
occurrences of Cenozoic dinoflagellates in 19 wells, northwestern Atlantic margin; 2 = data for 119 first
and last occurrences of late Cretaceous nannofossils in 10 wells, northwestern Atlantic margin; 3 = data
for 220 first and last occurrences of Mesozoic radiolarians at 76 sites, Mediterranean and Atlantic
realms; 4 = data for 116 last occurrences of Mesozoic foraminifers in 16 wells, northwestern Atlantic
margin; 5 = data for 147 last occurrences of Cenozoic foraminifers in 29 wells, central North Sea (from
Agterberg and Gradstein, 1988).
During the late 1950s and early 1960s, Shaw (1964) developed a
simple semi-objective method (Composite Standard method) of the
conservative type for dealing with inconsistencies. First and last
appearances of paleontological events in two sections are plotted against
each other. Next a line is fitted by using the method of least squares and
used for combining the two sections (line of correlation). The updated
positions of first or last appearances are those that are respectively lower
or higher in either of the two sections. A new section is plotted against the
combination of the first few sections. The procedure of adding other
sections is repeated until the “composite standard” is obtained that reflects
the maximum ranges of taxa. Shaw’s (1964) methodology was to a large
extent based on original work by earlier quantitative paleontologists,
notably Brinkmann (1929) who introduced basic concepts of statistical
biostratigraphy.
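One step of the composite standard procedure described above can be sketched as follows. This is a minimal illustration, not Shaw's own formulation: the data layout (positions keyed by taxon and event kind, with larger values stratigraphically higher), the choice of regressing composite positions on section positions, and the function names are all assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def update_composite(composite, section):
    """Merge one section into the composite.

    Both arguments map (taxon, kind) to a position, where kind is
    "first" or "last" and larger positions are stratigraphically higher.
    """
    shared = sorted(set(composite) & set(section))
    xs = [section[key] for key in shared]
    ys = [composite[key] for key in shared]
    a, b = fit_line(xs, ys)  # line of correlation: composite = a + b * section
    merged = dict(composite)
    for key, position in section.items():
        projected = a + b * position
        if key not in merged:
            merged[key] = projected                      # event new to the composite
        elif key[1] == "first":
            merged[key] = min(merged[key], projected)    # keep the lower first appearance
        else:
            merged[key] = max(merged[key], projected)    # keep the higher last appearance
    return merged


# Hypothetical positions (e.g. metres above an arbitrary datum).
composite = {("A", "first"): 10.0, ("A", "last"): 50.0,
             ("B", "first"): 20.0, ("B", "last"): 40.0}
section = {("A", "first"): 4.0, ("A", "last"): 26.0,
           ("B", "first"): 11.0, ("B", "last"): 19.0}
merged = update_composite(composite, section)
```

Repeating `update_composite` for each further section extends the ranges monotonically, which is why the final composite reflects maximum ranges of taxa.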
(1) The uncertainty due to the fact that the optimum, or “true”, sequence
of fossil events has not been established. Under the influence of
Hay’s (1972) paper, ranking of events in time to arrive at their
stratigraphic order is often referred to as “Probabilistic
Stratigraphy”. Binomial theory was used to evaluate superpositional
relations between events for statistical significance. However, as
Agterberg and Nel (1982a,b) have pointed out, there are no simple
models to rank stratigraphic events according to a numerical
probability. The problem is that order in time should be based both on
direct and on indirect estimates. For example, in Hay’s binomial
theory the fact that event A occurs above B in several sections ranks
the same as that A in some sections occurs above events C, D, E, F and
G, and that in some other sections C, D, E, F and G occur above B.
Both situations lead to the conclusion that A occurs above B, although
there is no simple way to express this in terms of numerical
probability and more advanced mathematical methods for multiple
comparison have to be used.
(2) The uncertainty due to the fact that the intervals between fossil
events along a relative time scale are not known (spacing or scaling
problem). In conventional biostratigraphy extensive use is made of
distances in time between events or (non) overlap of ranges to produce
assemblage zones. In the simple, graphical technique of the
composite standard as developed by Shaw (1964), distance between
two or more successive events is a function of the relative dispersion
of each event in the sections considered; first occurrence levels are
minimized and last occurrence levels are maximized, but no direct
standard errors are available for the composite positions.
(3) The uncertainty due to the fact that the geographic distribution of an
event is not known. Drooger (1974) refers to this as traceability. As
pointed out earlier, few taxa are ubiquitous and most species are rare.
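The direct, binomial evaluation of a superpositional relation mentioned under (1) can be sketched as below. This is an illustrative reading only: the section layout (lists of events ordered from top to bottom), the null hypothesis of equal chances (p = 0.5) and the function names are assumptions, and the sketch deliberately uses direct evidence only, which is exactly the limitation discussed above.

```python
from math import comb


def count_above(sections, a, b):
    """Count sections where `a` lies above `b`, and sections containing both.

    Each section is a list of events ordered from top to bottom.
    """
    above = both = 0
    for events in sections:
        if a in events and b in events:
            both += 1
            if events.index(a) < events.index(b):
                above += 1
    return above, both


def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical observations in six sections.
sections = [
    ["A", "B"],
    ["A", "C", "B"],
    ["A", "B", "C"],
    ["B", "A"],
    ["A", "B"],
    ["A", "C"],      # B not observed here, so this section is not counted
]
above, both = count_above(sections, "A", "B")
p_value = binomial_tail(above, both)   # chance of so many "A above B" cases if order were random
```

Sections in which A lies above C and others in which C lies above B carry indirect evidence for "A above B" that this count ignores; combining direct and indirect estimates requires the multiple-comparison methods referred to in the text.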
Three types of error bars are shown in Figure 1.2. A local error bar is
estimated separately for each individual well. It is two standard
deviations wide and has the probable isochron location at its center. Use is
made of the assumption that the rate of sedimentation is linear in the
vicinity of each isochron computed. Consideration of nonlinear
sedimentation rates results in the asymmetrical modified local error bar of
Figure 1.2B. Like the local error bar, a global error bar (Fig. 1.2C) is
symmetric but it is based on estimates of uncertainty in age which are
Fig. 1.2 Example of CASC multi-well comparison with three types of error bar. The probable positions of
the time-lines were obtained from event-depth curves fitted to the biostratigraphic information of
individual wells. For further explanation see text.
[Figure 1.3: plot with the epoch labels Plio-Pleistocene, Miocene, Oligocene, Eocene and Cretaceous along a vertical scale from 0 to 70; horizontal scales from -200 to 300 and from 3.0 to -2.0.]
Fig. 1.3 Comparison of the magnitudes of sea level events of the Tertiary as inferred by Vail et al. (1977)
from seismic stratigraphy, and the composite benthic δ¹⁸O record according to Miller and Fairbanks
(1985). The encircled numbers refer to particular rises and falls examined by Williams et al. (1988). Also
see Table 1.1.
may yield results that are significantly different. For example, Odin
(1982) estimated the age of the Jurassic-Cretaceous boundary at 130 ± 3
Ma but Harland et al. (1982) obtained 144 ± 5 Ma. These 95 percent
confidence intervals do not overlap indicating unresolved problems of
methodology. This subject will be discussed in more detail in Section 3.12.
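The statement that the two confidence intervals do not overlap is simple arithmetic, which the following sketch spells out. The helper names are hypothetical; the numbers are the two estimates quoted above.

```python
def interval(estimate, half_width):
    """Return (lower, upper) bounds for estimate ± half_width."""
    return estimate - half_width, estimate + half_width


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


odin = interval(130, 3)       # Odin (1982): 127 to 133 Ma
harland = interval(144, 5)    # Harland et al. (1982): 139 to 149 Ma

# Width of the gap between the two intervals, in Ma.
gap = max(odin[0], harland[0]) - min(odin[1], harland[1])
```

The 6 Ma gap between 133 and 139 Ma is larger than either quoted half-width, so the disagreement cannot be attributed to the stated uncertainties alone.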
Menning (1989) has provided a synopsis of 30 complete and partial
geochronological time scales for the Phanerozoic published over a 70-year
period to 1986. It is remarkable how close the most recent time scales are
to the first scale of Barrell (1917). For example, Barrell’s estimate of the
Jurassic-Cretaceous boundary was 135 Ma which is identical to the age
estimate for this boundary in the above-mentioned 1989 global
stratigraphic chart. On the other hand, many geologists prefer the 144 Ma
estimate of Harland et al. (1982) and Kent and Gradstein (1985) for the
age of the Jurassic-Cretaceous boundary (cf. Section 3.12).
Seismic stratigraphy and isotope chronostratigraphy (Williams et al.,
1988) are providing new tools for the stratigrapher. For example, Figure
1.3 is a comparison of the magnitude of particular sea level events of the
Tertiary as inferred from seismic stratigraphy (Vail et al., 1977) and the
composite benthic δ¹⁸O record (Miller and Fairbanks, 1985). The two
patterns exhibit a similar long-term trend. Table 1.1 (after Williams et
al., 1988) compares magnitudes of 8 Tertiary sea level events (rises or
falls) based on the two methods. These are 3rd order events. In almost all
instances, the inferred sea-level change using sequence boundary patterns
yielded larger estimated changes than the δ¹⁸O signal. The overall
agreement is not good at this level of detail but both these types of
methodology are new and subject to continuous improvement. For a recent
review of this topic and other approaches of chemical stratigraphy to time-
scale resolution, see Williams (1990).
TABLE 1.1
Comparison of the magnitude of particular sea level rises and falls based on seismically
defined unconformities with the δ¹⁸O record (after Williams et al., 1988, Table 11,
p. 112).
CHAPTER 2
PRINCIPLES OF QUANTITATIVE STRATIGRAPHY
2.1 Introduction
The original meaning of stratigraphy is “description of layers” and
like most earth science disciplines it is essentially a natural philosophy.
This implies that stratigraphy is rooted in a body of organized,
historically-accumulated observations, governed by a series of widely
accepted principles and rules. The two physical principles of this
philosophy are:
2) sedimentary layers are laid down sequentially, one after another and
become younger upwards if left undisturbed (law of Steno; cf. Nowlan,
1986).
Over the last 200 or more years the science of stratigraphy has developed
into several major categories of effort and knowledge.
requires that a series of unique points for non-recurrent events must first
be determined, common to the stratigraphic record as observed at different
sites. An excellent introduction to this field of study is by Schwarzacher
(1985a,b).
[Figure 2.1: zone types illustrated: interval zone, concurrent range zone, assemblage zones A and B, multi-taxon concurrent range zone.]
Fig. 2.1 Types of zones commonly used for biostratigraphic correlation (simplified from Hedberg, Editor,
1976). See text for further explanation.
Fig. 2.2 RASC zonations are based on average stratigraphic events. The average interval zone between
the exits of taxa A and B begins before the highest occurrence of B in section 3 and ends before the
highest occurrence of A in section 2.
Fig. 2.3 Construction of dendrogram for scaled highest occurrences of eight taxa. Intervals between
successive (average) exits are plotted along the distance scale of the dendrogram. Events which are close
together along the distance scale on the left (such as exits 3 to 6) form clusters which can be shaded in the
dendrogram. Clusters separated by longer distances can be useful as (RASC) zones in a regional
biozonation. Because average exits are used, events belonging to the same cluster are characterized by
more frequent cross-overs of tie-lines between sections.
[Figure 2.4: dendrogram; horizontal axis: Distance (0.8 to 0.0); vertical scale from 0.0 to 4.0.]
Fig. 2.4 Same as Fig. 2.3 using lowest and highest occurrences to construct the dendrogram
example of Figure 2.3 are averages. The seven intervals between them
were plotted along the distance scale to the right and a dendrogram was
obtained by constructing perpendicular lines moving downward from the
points that represent the average interval zones. Each perpendicular line
“species zone” methods remain valid today as a summary for the relation
between quantitative and qualitative methods:
“It would seem to me that there is no need to make a choice here, that is, the two methods are not
usually exclusive but complementary. It is indeed not at all possible to draw a sharp boundary between
them. In order to achieve a greater precision in chronology, we use sometimes (in the case of species
zones), second or third series of species in addition to our principal evolutionary series of species. We
compare, furthermore, the time ranges of individual species with one another and so succeed in
recognition of a number of subzones. In such instances, one already considers a certain percentage of the
total fauna. This naturally constitutes a transition to the faunal method. In practice, the latter method
also does not ever utilize the sum total of forms available but only a selection therefrom. The long-
ranging, chronologically useless representatives of a fauna, which usually form its percentage-wise
predominant element, are in this case quietly denied any consideration.”
“A community of organisms is a complex thing, the components of which are characterized by very
different behavior. Some of the individual forms (taxa) are extremely dependent on facies. They only
bloom under quite definite, narrowly limited conditions of life. If these conditions are altered, they
become extinct locally in some instances. In other instances, they emigrate and reappear sometimes, at
least in the instances of long-ranging species in considerably younger horizons, the conditions of
deposition of which have satisfied their specific bionomic requirements. Other organisms are less facies-
dependent. However, their sensitivity varies so that the individual forms concerned (taxa), in turn,
behave very differently whenever the conditions of life undergo changes. The changes of facies are
therefore apt to result in faunal discordances and strong variations in the composition of the faunas
concerned.”
Fig. 2.5 Possible relative age hypotheses for two taxa A and B according to Harper (1981). Vertical line
segments with arrows indicate ranges of taxa in time. Two hypotheses (10 and 11) are further divided on
the basis of presence or absence of a time gap between ranges of the two taxa.
Fig. 2.6 Theoretical example of Davaud (1982) showing distribution in space and time of seven different
taxa with true chronological succession.
Fig. 2.7 Diagrams to illustrate how biological events are recorded in sediments (after Davaud, 1982).
Diagram (a) shows time-space domain for a particular species. Population density is reflected by points
density. Diagram (b) illustrates that during same period of time and in same geographic area, the
sedimentation rate changed. When the sedimentation rate is applied to points of diagram (a) and
integrated over time, the points are moved to new positions in the sedimentary record as shown in
diagram (c). If the probability of detection is proportional to density of points in the sedimentary record,
the end point of the chronological range of a species could be underestimated, especially if sedimentation
rate was high at time of biological disappearance of the species.
Fig. 2.8 Sedimentary record of biological events in four stratigraphic sections corresponding to the
theoretical example of Fig. 2.6. Distortion due to differential rate of sedimentation was similar to the one
shown in Fig. 2.7 (b).
[Figure 2.9: diagram labels: “true” range, observed range, base.]
Fig. 2.9 Relationship between observed range extending from time t1 to t2, and “true” range extending
from time θ1 to θ2. Strauss and Sadler (1989) assumed that the probability of finding a fossil is constant
across its true range. If a species was less abundant at its time of appearance or disappearance, as
illustrated by the density curve in the diagram, it becomes more difficult to estimate the true range even
if facies and sedimentation remained constant.
Strauss and Sadler as unbiased point estimators and their upper range
extension to 95 percent confidence interval. These authors used the
Dirichlet distribution which results from a Poisson process for uniform
sedimentation. It was assumed that each fossil existed for an unknown
period of time. The chances of finding it remained equal during this
period. The density curve for highest finds has a tail that extends in the
stratigraphically downward direction under these conditions.
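The two kinds of range extension mentioned above can be illustrated numerically. The formulas used in this sketch are those commonly quoted for the uniform-recovery model (an observed range R based on H fossil horizons); they are not given in the text above and are stated here as assumptions that should be checked against Strauss and Sadler (1989) before use. The function and parameter names are hypothetical.

```python
def point_extension(observed_range, horizons):
    """Extension of one range end giving an unbiased point estimate.

    Assumes uniform fossil recovery: extension = R / (H - 1).
    """
    return observed_range / (horizons - 1)


def confidence_extension(observed_range, horizons, confidence=0.95):
    """One-sided extension of one range end at the given confidence level.

    Assumes uniform fossil recovery:
    extension = R * ((1 - C) ** (-1 / (H - 1)) - 1).
    """
    alpha = (1.0 - confidence) ** (-1.0 / (horizons - 1)) - 1.0
    return alpha * observed_range


# Hypothetical taxon: observed range of 100 m based on 11 fossil horizons.
unbiased = point_extension(100.0, 11)        # 10 m
upper_95 = confidence_extension(100.0, 11)   # roughly 35 m
```

Under these assumptions a taxon known from few horizons receives a much longer extension than one known from many, which is why sparsely sampled ranges carry wide confidence intervals.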
Fig. 2.10 Ammonite ranges in late Cretaceous strata of Seymour Island, Antarctic Peninsula. Observed
local ranges (heavy vertical lines) and actual finds (solid circles) after Macellari (1986, Fig. 5).
Extrapolated end-points of ranges according to Strauss and Sadler (1989, Fig. 1). Light vertical lines
represent upper range extensions to unbiased point estimators. Dashed vertical lines are upper range
extensions to 95 percent confidence intervals. Numbers assigned to taxa are as follows: 0 =
Diplomoceras lambi; 1 = Maorites seymourianus; 2 = Kitchinites darwini; 3 = Grossouvrites gemmatus;
4 = Maorites weddelliensis; 5 = M. densicostatus morphotype-alpha; 6 = Kitchinites laurae; 7 =
Anagaudryceras seymouriense; 8 = Maorites densicostatus morphotype-gamma; 9 = Pachydiscus
riccardi; 10 = Maorites densicostatus morphotype-beta; 11 = Pseudophyllites loryi; 12 = Pachydiscus
ultimus.
TABLE 2.1
Averages (r), standard deviations (d) and their ratio (V = d/r) as functions of sample size (n) as obtained
by means of computer simulation experiments (after Jasko, 1984).
 n      r       d       V        n      r       d       V
 1    0.000   0.985             16    3.111   1.259   0.405
 2    0.864   1.093   1.265     17    3.203   1.259   0.393
 3    1.355   1.128   0.832     18    3.231   1.247   0.386
 4    1.663   1.163   0.699     19    3.285   1.263   0.385
 5    1.910   1.191   0.623     20    3.323   1.273   0.383
 6    2.112   1.188   0.562     21    3.370   1.267   0.376
 7    2.263   1.199   0.530     22    3.432   1.288   0.375
 8    2.412   1.209   0.501     23    3.514   1.270   0.361
 9    2.541   1.206   0.475     24    3.534   1.277   0.361
10    2.638   1.227   0.465     25    3.586   1.249   0.348
11    2.737   1.247   0.456     26    3.563   1.276   0.358
12    2.817   1.237   0.439     27    3.648   1.287   0.353
13    2.893   1.250   0.432     28    3.692   1.272   0.345
14    2.971   1.250   0.421     29    3.698   1.269   0.345
15    3.052   1.254   0.411     30    3.777   1.292   0.342
TABLE 2.2
Fig. 2.11 Model of Signor and Lipps (1982) for alteration of diversity patterns by artificial range
truncation. In Fig. 2.11A, diversity is suddenly reduced by a catastrophic extinction event. Imposing the
artificial range truncation model illustrated in Fig. 2.11B on the pattern of Fig. 2.11A produces the
apparent gradual decline in diversity of Fig. 2.11C.
diversity than e.g. the curve for dinosaurs below the Cretaceous-Tertiary
boundary (cf. Russell, 1975, 1977; Van Valen and Sloan, 1977).
[Figure 2.12: curves labelled “observed highest occurrence”; horizontal axis: relative time scale.]
Fig. 2.12 Schematic diagram representing frequency distributions for relative abundance (broken lines)
and location of observed highest occurrence (solid lines) for two taxa. Vertical line illustrates that
observed highest occurrences of two taxa can be coeval even when the frequency distributions of these
two taxa are different.
[Figure 2.13: panels (a) and (b); curve regions labelled reworking, downhole contamination and misidentification.]
Fig. 2.13 Edwards’ (1982a) model to display probability of observing lowest - or highest-occurrence event
relative to “true” time of evolution or extinction in outcrop or core material for (a) first occurrence event;
and (b) last occurrence event. According to Edwards (1982), details for curves will vary for every
individual taxon, and gross shapes of curves will vary with kind of organism (e.g. rapidity of dispersal,
facies control) and nature of sample material (core, outcrop, cuttings).
with Edwards’ assumption. Likewise, the model of Signor and Lipps (Fig.
2.11B) is in agreement with that of Edwards because the slope of their
curve continues to increase in the stratigraphically upward direction.
Fig. 2.14 Baumgartner’s (1986) model for frequency curves of last appearance of species A and first
appearance of species B. The two species are actually co-occurring in section 7. The asymmetrical
smoothed curves in Fig. 2.14C are based on the bar-graphs representing the observed frequencies of Fig.
2.14B. In a probabilistic model, it could be assumed that these curves are symmetrical (broken lines)
extending upward and downward from the mean positions. If the means are used for constructing a
range, the result is ~ A B . A symmetrical Gaussian curve has the property that 68 percent of the area
under the curve is contained between its inflection points located at the mean plus or minus one
standard deviation. These intervals are shown as e_A and e_B. The Unitary Associations method would
result in the overlapping ranges for species A and B shown in Fig. 2.14D. The latter result would also be
obtained by using the Gaussian curves and assuming that e_A and e_B would extend two instead of one
standard deviations on either side of the mean.
Edwards (1982b) has pointed out that if both highest and lowest
occurrences of taxa are used, there is a possibility that in some methods of
ranking, the highest occurrence of a taxon would end up below its lowest
occurrence. Possible and impossible arrangements for the events resulting
from 2 taxa are shown in Figure 2.15. Note that all impossible
arrangements have in common that either A (lowest occurrence of first
species) occurs above B (highest occurrence of first species) or that C occurs
above D for the second species. If in a statistical method all events were to
be treated independently, the final ranking might contain impossible
arrangements. A problem of this type can be avoided, e.g. by recognizing
during the coding of the stratigraphic events or within the computer
program for statistical analysis, that the lowest occurrence is below the
highest occurrence for each taxon in theory and practice.
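
The count of possible arrangements can be checked by brute force. The following is a minimal sketch, assuming the event labels of the caption of Fig. 2.15 (A and B are the lowest and highest occurrences of one species, C and D those of a second species); the function name `is_possible` is hypothetical and is not part of any published program.

```python
from itertools import permutations

# A/B: lowest/highest occurrence of species 1; C/D: same for species 2.
EVENTS = "ABCD"

def is_possible(order):
    """order lists the four events from stratigraphically lowest to highest."""
    position = {event: i for i, event in enumerate(order)}
    # A lowest occurrence must lie below the highest occurrence of the same taxon.
    return position["A"] < position["B"] and position["C"] < position["D"]

arrangements = list(permutations(EVENTS))
possible = [order for order in arrangements if is_possible(order)]

print(len(arrangements), "arrangements,", len(possible), "possible")
for order in possible:
    print("".join(order))
```

The same constraint can be applied as a filter to the output of a ranking program: any ranked sequence for which `is_possible` fails for some taxon contains an impossible arrangement.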
[Figure 2.15: diagram of the 24 arrangements of events A, B, C and D; the impossible arrangements are labelled IMPOSSIBLE.]
Fig. 2.15 The 24 arrangements of 4 events, where A and B are first and last occurrences of one species,
and events C and D are first and last occurrences of a second species. Only 6 of these arrangements are
possible (from Edwards, 1982b). Quantitative stratigraphers should always look for impossible
arrangements in computer output and modify their algorithm if required.
Several possible frequency distribution models for highest and lowest
occurrences are shown in Figures 2.16 and 2.17. The spike (A) represents
abrupt disappearance of a taxon in Figure 2.16 and its immediate
widespread appearance in Figure 2.17. Because the spike is symmetrical,
the frequency curve also must be symmetrical when it is narrow (possibly
B in Figs. 2.16 and 2.17). Wider frequency curves have different values for
their mode (1), median (2) and mean (3), respectively. Curves for which
the order of the mode, median and mean is 123 are positively skewed in the
direction of time. Those with order 321 are negatively skewed.
Symmetrical curves have coinciding mode, median and mean. As
shown in the captions of Figures 2.16 and 2.17, all models discussed so far
correspond to one of the 12 possibilities. It can be assumed that, with the
possible exceptions of A and C in Figures 2.16 and 2.17, all these frequency
curves exist in the fossil record. In practice, it is almost always impossible
to precisely measure the shapes of the frequency distributions of the
highest and lowest occurrences of a taxon because one would need large
numbers of sections that are calibrated precisely according to time-lines.
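
Where enough calibrated observations do exist, the order of mode, median and mean gives a quick indication of the direction of skewness. The sketch below assumes that the observations are expressed on a scale that increases in the direction of time; the function `skew_direction` and the sample values are hypothetical.

```python
from statistics import mean, median, mode

def skew_direction(values):
    """Classify a sample by the order of its mode (1), median (2) and mean (3)."""
    m1, m2, m3 = mode(values), median(values), mean(values)
    if m1 < m2 < m3:
        return "positive"      # order 123 along the axis
    if m3 < m2 < m1:
        return "negative"      # order 321 along the axis
    if m1 == m2 == m3:
        return "symmetrical"
    return "unclear"

# Hypothetical positions of the observed highest occurrence of one taxon
# in nine sections, on a relative time scale.
tops = [2, 2, 2, 3, 3, 4, 5, 7, 9]
print(skew_direction(tops))
print(skew_direction([10 - v for v in tops]))
```

With small samples the three statistics are themselves uncertain, so a result of this kind is only suggestive.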
Fig. 2.16 Six possible shapes for the frequency distribution of the observed last occurrence of a taxon. The
top (t) is the truly last occurrence. The numbers 1, 2 and 3 represent mode, median and mean,
respectively. These three statistics coincide for a symmetrical curve. Most paleontologists assume that
Fig. 2.16D is the most widespread shape. Arrow points in direction of time.
Fig. 2.17 Six possible shapes for the frequency distribution of the observed first occurrence of a taxon.
The base (b) is the truly first occurrence. The numbers 1, 2 and 3 represent mode, median and mean,
respectively. Opinions are divided as to which shape (D or F) is most widespread.
Fig. 2.18 Examples of the effect of averaging illustrate the central limit theorem of mathematical
statistics. No matter what shape the frequency distribution of the original observations (a), taking the
average of two (b), four (c) or 25 (d) observations not only decreases the variance but brings the curve
closer to the normal (or Gaussian) limit (after Lapin, 1982; and Davis, 1986).
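
The effect of averaging can be reproduced exactly for a small discrete distribution. The sketch below is an illustration only: the skewed three-valued distribution `SINGLE` is hypothetical, and averages of 1, 2 and 4 observations are used (rather than the 2, 4 and 25 of Fig. 2.18) to keep the enumeration small.

```python
from fractions import Fraction
from itertools import product

# Hypothetical, strongly skewed distribution of a single observation.
SINGLE = {0: Fraction(6, 10), 1: Fraction(3, 10), 2: Fraction(1, 10)}

def distribution_of_mean(n):
    """Exact distribution of the average of n independent observations."""
    dist = {}
    for combo in product(SINGLE, repeat=n):
        prob = Fraction(1)
        for value in combo:
            prob *= SINGLE[value]
        avg = Fraction(sum(combo), n)
        dist[avg] = dist.get(avg, Fraction(0)) + prob
    return dist

def moments(dist):
    """Return (mean, variance, skewness) of a discrete distribution."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    third = sum((x - mu) ** 3 * p for x, p in dist.items())
    return mu, var, float(third) / float(var) ** 1.5

for n in (1, 2, 4):
    mu, var, skew = moments(distribution_of_mean(n))
    print(n, mu, var, round(skew, 3))
```

The mean stays the same, the variance is divided by the number of observations averaged, and the skewness shrinks with the square root of that number, which is the behaviour the figure illustrates.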
Fig. 2.19 Frequency histograms for finding a taxon within its range before and after mixing (from
Edwards, 1982b). See text for further explanation.
CHAPTER 3
APPLICATIONS OF MATHEMATICAL STATISTICS AND
COMPUTER SCIENCE TO ZONATION,
CORRELATION AND AGE INTERPOLATION
3.1 Introduction
This chapter contains background information for various
applications of mathematical statistics and computer science. It can be
skipped by readers who are not primarily interested in mathematically-based
theory. Concepts and methods to be discussed include:
(1) probabilities, Bernoulli trials and the binomial model; (2) graph theory;
(3) multivariate analysis; (4) method of maximum likelihood; and
(5) smoothing splines. Most of these techniques are illustrated by means of
geological examples of interest in paleontology and stratigraphy although
the emphasis in this chapter is on mathematical background. Not all
mathematical discussions are contained in this chapter. Other techniques
will be introduced in separate sections within later chapters as needed.
Modern mathematics and the theory of probability and statistics are
formally based on set theory. There have been several interesting
attempts to formulate conventional stratigraphy in strict logico-mathematical
terms (Dienes, 1974; 1982; Dienes and Mann, 1977;
Carimati et al., 1982). The language of set theory, although a necessity in
pure mathematics, is not of immediate practical usefulness in stratigraphy
which has a well-developed language of its own. Although superpositional
relations between stratigraphic events can be precisely formulated in
terms of sets, the nomenclature of set theory is unpalatable to most
stratigraphers as pointed out by Tipper (1989, p. 480).
The mathematical techniques introduced in this chapter are required
for statistical applications and for use in computer-based graphs and
graphics. Although these techniques are widely applied in other fields of
science, and may be elementary to those trained in mathematical
statistics, they have been used hardly at all in stratigraphy. The purpose
of this chapter is not only to review statistical methods that have been
The binomial test for randomness will be briefly discussed (cf. Hay,
1972; Southam et al., 1975; Blank and Ellis, 1982). If the sequence of a
pair of biostratigraphic events is random, the probability of one event
preceding the other is p = 1/2. Each observed superpositional relation is
thought to be the outcome of a Bernoulli trial. Suppose that two events (A
and B) both occur in N sections. Then the probability that A occurs above
B k times satisfies
$$P(k) = {}_N C_k \, 2^{-N} \tag{3.1}$$
where
$${}_N C_k = N! \, [\, k! \, (N-k)! \,]^{-1} \tag{3.2}$$
For example, if N = 5, then P(0) = P(5) = 1/32; P(1) = P(4) = 5/32; and
P(2) = P(3) = 10/32. These probabilities add to one. It is also possible to
write P(0 or 5) = 1/16, P(1 or 4) = 5/16 and P(2 or 3) = 10/16. In practice,
the observation that A occurs k times above B generally cannot be
distinguished from B occurring k times above A when the hypothesis
p = E(K/N) = 1/2 is being tested. In this expression, E(...) denotes
expected value. K denotes the binomial random variable with observed
frequencies k (= 0, 1, 2, ..., N). The test hypothesis obviously cannot be
rejected if K/N becomes equal to 1/2, a situation which may be observed
when N is even. For k > N/2, the probability
$$P_c(k) = 2 \sum_{r=k}^{N} {}_N C_r \, 2^{-N} \tag{3.3}$$
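
The probabilities of the binomial test for randomness are easily tabulated. The following sketch evaluates Equations (3.1) and (3.3) with exact fractions; the function names `p_exact` and `p_two_sided` are hypothetical and chosen for this illustration only.

```python
from fractions import Fraction
from math import comb

def p_exact(k, n):
    """Eq. (3.1): probability that A occurs above B exactly k times in n sections (p = 1/2)."""
    return Fraction(comb(n, k), 2 ** n)

def p_two_sided(k, n):
    """Eq. (3.3), stated for k > n/2: probability of a split at least as uneven as k to n - k."""
    if 2 * k <= n:
        raise ValueError("Eq. (3.3) applies for k > N/2 only")
    return 2 * sum(p_exact(r, n) for r in range(k, n + 1))

n = 5
for k in range(n + 1):
    print(k, p_exact(k, n))
print("P_c(5) =", p_two_sided(5, n), " P_c(4) =", p_two_sided(4, n))
```

For N = 5 even the most extreme outcome (A above B in all five sections) has a two-sided probability of 1/16, so that randomness cannot be rejected at the 5 percent level with so few sections.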
$$P(K=k) = P(k) = {}_N C_k \, p^k (1-p)^{N-k} \quad (k = 0, 1, \ldots, N) \tag{3.4}$$
$$P(k) = e^{-\lambda} \lambda^k / k! \quad (k = 0, 1, \ldots, N) \tag{3.5}$$
It is noted that the two scales in Figure 3.1 are logarithmic and that
the lines are approximately straight unless p is relatively large. This is
because the equation for zero probability of the Poisson distribution, which
provides a good approximation when p is small, plots as a straight line on
logarithmic graph paper. If 10 is used as the base of the logarithms, the
equation of each line in Figure 3.1 is simply log10 N = log10 λ − log10 p with
P = P(K = 0) = exp(−λ) as follows from Equation (3.5).
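
The relation between sample size, proportional abundance and probability of failure can be sketched as follows. The binomial version uses the zero-probability (1 − p)^N of Equation (3.4); the Poisson version uses P = exp(−λ) with λ = Np. Function names and the numerical example are assumptions made for this illustration.

```python
import math

def sample_size_binomial(p, failure_prob):
    """Smallest N for which (1 - p)**N does not exceed the accepted failure probability."""
    return math.ceil(math.log(failure_prob) / math.log(1.0 - p))

def sample_size_poisson(p, failure_prob):
    """Poisson approximation: N = lambda / p with failure_prob = exp(-lambda)."""
    lam = -math.log(failure_prob)
    return lam / p

# Hypothetical case: a species making up 1 percent of the population,
# with a 5 percent chance of missing it altogether.
print(sample_size_binomial(0.01, 0.05))
print(round(sample_size_poisson(0.01, 0.05), 1))
```

For small p the two answers nearly coincide, which is why the lines of Figure 3.1 are almost straight on logarithmic scales.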
The binomial distribution model on which Figure 3.1 is based also can
be used to estimate confidence intervals for any specific proportion value
( p ) . Unfortunately, it turns out that large samples would be needed to
estimate, with precision, the relative abundances of many different
species. In general, proportions estimated from actual samples are
Fig. 3.1 Size of random sample (n) needed to detect a species occurring with proportional abundance (p)
in population with probability of failure to detect its presence fixed at P (after Dennison and Hay, 1967).
Geological background
[Figure 3.2: columnar sections Tojeira 1 (left) and Tojeira 2 (right) with sample positions; vertical scale in metres.]
Fig. 3.2 Left side: Tojeira 1 section with sample numbers 6.2-6.29 (after Stam, 1986); ammonite zones
(Planula and Platynota Zones) of Mouterde et al. (1973) also are shown. This section is immediately
overlain by the poorly exposed sandy Cabrito Formation. Right side: Tojeira 2 section with sample
numbers 12.1-12.11 and 11.1-11.23 (after Stam, 1986).
[Figure 3.3: scatter plots of species proportions (percent), one panel per species; one panel is labelled E. mosquensis.]
Fig. 3.3 Left side: Proportions of four benthonic Foraminifera for seven replicate samples from same
sites in Tojeira 1 section based on determinations by Stam (horizontal axis) and Gradstein (vertical axis).
Right side: ditto for eleven replicate samples in Tojeira 2 section. See text for discussion of lack of
agreement.
proportion values as well as total benthos counted for these 5 samples were
shown in Agterberg et al. (1990, Table 1). The measured proportions are
markedly different, again illustrating the uncertainty commonly
associated with microfossil abundance data.
As a first step for an M.Sc. project, Nazli (1988) subjected Stam's data
for 14 benthonic species in 31 samples from Tojeira 1 to the ARIMA
(AutoRegressive Integrated Moving Average) procedure of the Statistical
Analysis System (SAS) as implemented on the IBM mainframe computer
at the University of Ottawa in 1986. SAS is
a statistical software package with separate versions for mainframes and
personal computers (available from SAS Institute Inc., Box 8000, Cary,
NC, U.S.A.). The ARIMA method was originally developed by Box and
Jenkins (1976). The first part of SAS ARIMA output for E. mosquensis is
shown in Figure 3.4. In autocorrelation, successive values along a time
series are correlated with one another for different lags (= intervals along
the series). Normally in applications of ARIMA, the values are equally
spaced along the time axis. The decompacted sedimentation rate during
deposition of the Tojeira Formation was about 5 cm per 1000 years.
Although the shale is homogeneous in composition, it cannot be taken for
granted that sampling it at equal intervals would yield a series with points
SAS
ARIMA PROCEDURE
Tojeira 1: E. mosquensis
AUTOCORRELATIONS
LAG   COVARIANCE   CORRELATION
  0    160.079      1.00000
  1    79.9485      0.49943
  2    85.2347      0.53245
  3    58.3794      0.36469
  4    32.1471      0.20145
  5    27.9955      0.17489
  6    14.9058      0.09312
  7    25.9934      0.16238
  8    23.4033      0.14620
  9    19.8307      0.12388
 10    12.4919      0.07804
Fig. 3.4 Partial output of SAS ARIMA procedure for E. mosquensis proportions in Stam's 31 samples
from Tojeira 1 (for complete print-out, see Nazli, 1988, Fig. 4-12, p. 98). ARIMA maximum likelihood
estimation gave three statistically significant coefficients for first order autocorrelation coupled with
two-term moving average. This result is compatible with assumption of signal-plus-noise model in
Figure 3.5.
[Figure 3.5: autocorrelation coefficient (logarithmic vertical scale) plotted against lag.]
Fig. 3.5 Estimated autocorrelation coefficients of Figure 3.4 plotted along logarithmic scale and
approximated by exponential function.
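
An exponential function plots as a straight line on a logarithmic scale, so it can be fitted by simple linear regression of log-correlation on lag. The sketch below is one way of doing this and is not necessarily the procedure used for Figure 3.5; the coefficients are those listed in Fig. 3.4 for lags 1 to 8, and the parameter names `c` and `a` are assumptions.

```python
import math

# Autocorrelation coefficients for lags 1 to 8, as listed in Fig. 3.4.
r = [0.49943, 0.53245, 0.36469, 0.20145, 0.17489, 0.09312, 0.16238, 0.14620]
lags = list(range(1, len(r) + 1))

# Fit log(r) = log(c) - a * lag by ordinary least squares.
y = [math.log(value) for value in r]
n = len(lags)
mean_x = sum(lags) / n
mean_y = sum(y) / n
slope = (sum((x - mean_x) * (v - mean_y) for x, v in zip(lags, y))
         / sum((x - mean_x) ** 2 for x in lags))
intercept = mean_y - slope * mean_x
c, a = math.exp(intercept), -slope

def fitted(lag):
    """Exponential approximation c * exp(-a * lag) of the autocorrelation function."""
    return c * math.exp(-a * lag)

print(round(c, 3), round(a, 3))
print([round(fitted(lag), 3) for lag in lags])
```

A fitted value of c below one at lag zero is what would be expected if the observed proportions consist of an autocorrelated signal with uncorrelated noise superimposed on it.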
that are equally spaced in time. The 31 samples used for Figure 3.4 are
approximately equally spaced in the stratigraphic direction (see Fig. 3.2,
left side). The resulting autocorrelation pattern for E. mosquensis is
approximately exponential. In Figure 3.4, the first few estimated
autocorrelation coefficients (lags 1 and 2) are greater than zero with a
Discussion
TABLE 3.1
Comparison of standard deviations (in percent) due to counting (sg) and total local random variability
(s~) for species with average proportion p̄ (in percent) and approximately exponential autocorrelation
function (after Agterberg et al., 1990).
trinomial model successfully estimated the probability that two events are
coeval in several applications (see Section 6.10).
In the RASC model, observed ties are not ignored but each tie of two
events E_i and E_j is scored as a 50 percent probability that E_i occurs above
E_j and a 50 percent probability that E_j occurs above E_i. Observed scores S_o
can be compared with estimated frequencies S_e = P_e × R in which the
estimated probabilities P_e (for E_i occurring above E_j) satisfy P_e = Φ(d_e); d_e
may be estimated by means of the weighted scaling option of the RASC
computer program in which variations of sample size R are considered.
The agreement between observed and estimated scores was excellent for
Cenozoic Foraminifera on the Labrador Shelf - Grand Banks (see Section
6.10, for details). The chi-squared test for goodness of fit was used for
making this comparison. This shows that the scaling method of RASC
permits the use of significance tests for comparing pairs of events with one
another on the basis of probabilities estimated from the order relationship
of all events considered simultaneously.
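
The scoring of ties and the comparison with estimated frequencies can be sketched as follows. This is an illustration of the arithmetic only, not the RASC program itself; the function names, the counts and the distance value are hypothetical.

```python
from statistics import NormalDist

def observed_score(n_above, n_tied):
    """Observed score for E_i above E_j: every tie counts as half an occurrence."""
    return n_above + 0.5 * n_tied

def estimated_score(distance, sample_size):
    """Estimated frequency P * R, with P the standard normal cumulative frequency of the distance."""
    return NormalDist().cdf(distance) * sample_size

# Hypothetical pair of events found together in R = 10 sections:
# E_i above E_j six times, two ties, E_j above E_i twice.
R = 10
s_obs = observed_score(n_above=6, n_tied=2)
s_est = estimated_score(distance=0.5, sample_size=R)
print(s_obs, round(s_est, 3))
```

Differences between observed and estimated scores of this kind, accumulated over many pairs of events, are what a chi-squared test for goodness of fit evaluates.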
Fig. 3.6 Example of concepts of graph theory applied in biostratigraphy (after Guex, 1980). (a)
Adjacency matrix containing same information as Fig. 3.6f for sections in Fig. 3.6b; (b) space-time
relationship of 8 species numbered 1 to 8; heavy black vertical lines represent stratigraphic sections with
observations on domains of existence (closed regions) of the eight species; T = time, E = space; (c)
relative chronological position of the intervals I to VI for maximal cliques representing “Unitary
Associations” derived from Figs. 3.6d and 3.6g; (d) matrix relating maximal cliques (K) of Fig. 3.6g to the
eight species (X); (e) maximal cliques (K) identified in four sections (p1-p4) of Fig. 3.6b; (f)
biostratigraphical graph G representing co-occurrences and superpositional relationships between the 8
species as observed in the four sections; (g) undirected graph G, representing co-occurrences of Fig. 3.6f
only; (h) directed graph Ga with arcs for superpositional relationships. The original purpose of this
diagram was to illustrate, for a simple example, that construction of an interval graph (see Fig. 3.7)
normally does not result in a chronological ordering. Only “reproducible Unitary Associations” are
chronologically ordered as shown in Fig. 3.6e (Guex, 1980).
[Figure 3.7: diagrams of undirected graphs with their interval assignments.]
Fig. 3.7 G1 and G2 are examples of interval assignments J(i), i = 1, 2, ... for undirected graphs. An
interval assignment for Z4 with vertices u, v, w and z does not exist (after Roberts, 1976).
each arc (e.g. u to v) where c represents number of times this arc occurs in a
C, within the strong component and r is the total number of times the arc
occurs in the strong component. If the coefficient s of an arc is high, this
may indicate reworking or contamination. If reworking is suspected, u is
omitted in beds where it was observed to occur above v. For
contamination, v would be removed from below u.
Guex and Davaud (1984) have developed further rules for interactive
or automated elimination of other forbidden structures from G. For
example, Z4 is removed by assuming “virtual” co-occurrence for either a
pair of two or all four of the fossils involved. Two fossil species are said to
co-occur virtually if their co-occurrence was not observed but inferred.
After elimination of all inconsistencies, the biostratigraphic graph G
yields an interval graph G, of which the maximal cliques can be
determined. These are the Initial Unitary Associations (I.U.A.’s). They
are called “initial” because Guex and Davaud (1984) added the following
method for combining some of the I.U.A.’s with one another in order to
form the U.A.’s. The I.U.A.’s are identified in sections as previously
illustrated for the Unitary Associations in Figure 3.6e. A complete I.U.A.
may not be observed in a section. However, a given I.U.A. is fully
characterized by any one of its unique species or pairs of species. I.U.A.’s
characterized by “virtual” (inferred, not observed) co-occurrences of fossils
only cannot be identified in sections. Guex and Davaud (1984) then
proceeded by constructing the directed graph Gk of superpositional
relations between the I.U.A.’s as identified in the sections. The
construction of Gk with the I.U.A.’s as vertices is identical to the
extraction of Ga for the original biostratigraphical graph G. Next they
find the I.U.A.’s with the longest path in Gk. In general, a vertex in a
directed graph Ga is connected to another vertex by means of a “path” if
the arrows on the arcs between these two vertices point in the same
direction. Each I.U.A. not on the longest path is combined with the I.U.A.
on the path with which it has an interval in common. This gathering
process yields the final Unitary Associations (U.A.’s) which are identified
in the sections as the I.U.A.’s were before. If the new I.U.A.-U.A. method
is applied to the example of Figure 3.6, the Initial Unitary Associations II
and III would be combined with one another.
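
The determination of maximal cliques in an undirected graph of co-occurrences can be sketched with the classical Bron-Kerbosch procedure. This is a generic graph algorithm used here for illustration; it is not claimed to be the algorithm implemented by Guex and Davaud (1984), and the species numbers and co-occurrence pairs are hypothetical.

```python
def maximal_cliques(adjacency):
    """Bron-Kerbosch enumeration; adjacency maps each vertex to the set of its neighbours."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))   # cannot be enlarged: maximal
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adjacency[v], excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return cliques

# Hypothetical co-occurrences: pairs of species found together in at least one sample.
pairs = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
adjacency = {}
for a, b in pairs:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

cliques = maximal_cliques(adjacency)
for clique in cliques:
    print(sorted(clique))
```

Each maximal clique is a largest set of mutually co-occurring species, which is the role played by the Initial Unitary Associations once the graph has been made free of inconsistencies.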
Fig. 3.8 Schematic diagrams of cubic interpolation spline and cubic smoothing spline. The cubic
polynomials between successive knots have continuous first and second derivatives at the knots. The
smoothing factor (SF) is zero for interpolation splines. Here as well as in later applications, the abscissae
of the knots coincide with those of the data points.
[Figure 3.9: plot of y against x.]
Fig. 3.9 De Boor (1978, Fig. 8.1, p. 224) simulated irregular spacing along x-axis by selecting 12 points
(solid circles) from a set of 49 regularly spaced measurements of a variable (y) as a function of another
variable (x). The optimum fifth order interpolation spline (with 7 knots) provides poor fit except around
the peak.
[Figure 3.10: six panels (A to F); panels A, B, C and F plot y against x; panels D and E have level on the horizontal axis.]
Fig. 3.10 Top part: Cubic interpolation splines with knots at data points fitted to irregularly spaced
data. (A) Use of same 12 points as in Fig. 3.9 gives good result; (B) deletion of 3 points in the valleys still
gives fair interpolation spline although local minima at both sides of the peak are not supported by
original data set of 49 measurements; (C) deletion of 2 more points in the valleys results in poor cubic
interpolation spline. Bottom part: Indirect method of cubic spline-fitting. (D) The six intervals along
the x-axis between data points were made equal before calculation of cubic interpolation spline; (E) non-decreasing
cubic spline with small positive value of smoothing factor (SF = 0.038) was fitted to interval
as function of “levels”; (F) curves of (D) and (E) were combined with one another and re-expressed as
cubic spline function which does not show the unrealistic fluctuations of the cubic interpolation spline of
Fig. 3.10C.
[Figure 3.11: spline curves plotted against distance in metres for Tojeira 1 (left) and for both sections superimposed (right).]
Fig. 3.11 Left side: Indirect method of cubic spline-fitting illustrated in Fig. 3.10 (D-F) applied to
probits of E. mosquensis abundance data for Tojeira 1 section. Right side: Same with observations and
spline-curve for Tojeira 2 section superimposed. Patterns were slid with respect to one another until a
reasonably good fit was achieved. Zero distance (at sample 6.29 in Tojeira 1) falls just below base of
overlying Cabrito Formation (cf. Fig. 3.2). Correlation between the two sections is poorest along the 35m
data gap in Tojeira 2.
its fractile of the normal distribution in standard form and adding 5 to the
result). The purpose of the latter expression is to reduce the relative
influence of both relatively high and low values. Such “normalization” is
desirable because smoothing splines are fitted by using the method of least
squares in which the influence of each deviation from the curve increases
according to the square of its magnitude. The smoothing factor (SF)
should not be mainly determined by relatively few values only.
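
The transformation described above (replacing a percentage by its standard normal fractile and adding 5) can be sketched in a few lines. The function name `probit` is a label chosen for this illustration; note that percentages of exactly 0 or 100 have no finite fractile and would need separate treatment, which is not shown here.

```python
from statistics import NormalDist

def probit(percent):
    """Standard normal fractile of a percentage value, increased by 5."""
    proportion = percent / 100.0
    return NormalDist().inv_cdf(proportion) + 5.0

# Hypothetical percentages, from very low to very high.
for value in (1.0, 10.0, 50.0, 90.0, 99.0):
    print(value, round(probit(value), 2))
```

Because the fractile grows slowly in the tails, very high and very low percentages are pulled towards the centre, which limits their influence on a least-squares fit.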
Results for the indirect method applied to E. mosquensis in Tojeira 1
and 2 are shown in Figures 3.11A and B, respectively. The two spline-
curves were slid with respect to one another until a “best” fit was found
(see Fig. 3.11B). A 10m downward movement of the Tojeira 2 sequence,
which places the base of the overlying Cabrito Formation in nearly the
same stratigraphic position in both sections, produces the best correlation.
It is noted that there is a 35m data gap in the Tojeira 2 section so that the
local maximum and minimum located within the equivalent of this gap in
Tojeira 1 could exist in Tojeira 2 as well. For Tojeira 1, sampling was
restricted to the shales of the Tojeira Formation whereas samples for the
underlying Montejunto Formation in which E. mosquensis is absent or
rare were also obtained and used for Tojeira 2. In real distance, the two
sections are about 2 km apart. It may be concluded from the pattern of
Figure 3.11B that it is likely that both Tojeira 1 and 2 share essentially
the same relative changes in abundance of E. mosquensis during
deposition of the approximately 70 m of late Jurassic shale in this part of
the Lusitanian Basin.
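
Sliding one curve along another can also be done numerically. The sketch below searches for the shift that minimizes the mean squared difference between two smoothed curves sampled at a common spacing. It is an assumption that a least-squares criterion is appropriate here; the text describes the fit only as having been found by sliding the patterns, and the function `best_shift` and the test curves are hypothetical.

```python
import math

def best_shift(reference, other, shifts, min_overlap=3):
    """Return (shift, mean squared difference) for the best-fitting shift of `other`.

    reference, other: dicts mapping integer level -> curve value on a common spacing.
    """
    best = None
    for shift in shifts:
        pairs = [(reference[level + shift], value)
                 for level, value in other.items() if level + shift in reference]
        if len(pairs) < min_overlap:       # too little overlap to judge the fit
            continue
        mse = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
        if best is None or mse < best[1]:
            best = (shift, mse)
    return best

# Hypothetical curves: `other` is a copy of part of `reference`, displaced by 4 levels.
reference = {i: math.sin(i / 3.0) + 0.1 * i for i in range(30)}
other = {i: reference[i + 4] for i in range(15)}
print(best_shift(reference, other, range(-8, 9)))
```

With real data the minimum is rarely sharp, and a gap in one of the sections, such as the 35m gap in Tojeira 2, reduces the overlap on which the comparison is based.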
Stam’s (1986) plots for the P/B (plankton/benthos) ratio in the Tojeira
sections suggested that there may exist several oscillations with peaks
where benthos and plankton are nearly equally abundant separated by
valleys with little or no plankton. Precise correlation of these peaks and
valleys is not possible because of “noise” which became even more
prominent when P/B ratios for Nazli’s samples were added. Agterberg et
al. (1989) showed results obtained by the indirect method of spline fitting
applied to the transformed data for P/B ratio in the two sections. Locations
of samples were shown with respect to Stam’s sample 6.29 in both sections
(Tojeira 2 was slid 10m downward as in Fig. 3.11B). Although, on the
average, more plankton was deposited in the area of Tojeira 2, the spline-
curves display patterns that can be interpreted as similar. In total, there
were probably four peaks in the P/B ratio indicating successive periods of
planktonic bloom during deposition of the upper Jurassic shale. This
result corroborates the one described for the E. mosquensis abundance data
(see Fig. 3.11).
Not only abundance data can be used for correlation. Reyment (1980)
has reviewed basic techniques combining statistics and time series
analysis applied to morphometrics of evolutionary sequences. Ecologically
induced changes in morphology may be useful for biostratigraphic
correlation as well.
In 1982 two time scales were published (Odin 1982; Harland et al.
1982). There is general agreement on the ages along most of these time
scales. The largest discrepancies amount to about 10 percent of the ages
estimated (also see Section 1.6). Harland et al. (1982) estimated 144 Ma
for the Jurassic-Cretaceous boundary and 590 Ma for the Precambrian-
Cambrian boundary, and Odin (1982) 130 Ma and 530 Ma, respectively.
Such differences are related to the nature of the materials used for dating.
Although they are helpful for pointing out the existence of significant
discrepancies (see e.g. Gradstein et al., 1988), statistical methods cannot
be used to resolve difficulties related to the nature of the materials used for
dating. Neither can they solve the problem of choosing decay constants in
order to avoid bias in radiometric dating. However, any radiometric
method is subject to a measurement error which increases with age and is
usually much greater than the uncertainties associated with the relative
ordering of events using methods of stratigraphic correlation (e.g.
biostratigraphic or magnetopolarity methods). The problem of having to
estimate the age of stage and chronozone boundaries from relatively
imprecise isotope determinations remains even if all sources of bias
related to these methods could be eliminated.
I=1 (3.6)
[Figure 3.12: panels A and B; weighting functions plotted against x.]
Fig. 3.12 Weighting functions on basis of which likelihood function can be estimated. A. The function
f(x) follows from assumption that every age determination is sum of random variables for (1) uniform
distribution of (unknown) true ages, and (2) Gaussian distributions for measurements. B. The function
f_a(x) is for inconsistent ages only. Its log-likelihood function is −E².
(3.7)
or:
(3.9)
(3.10)
older than the age of this boundary. This problem can be solved if a
weighting function f(x) is defined. The boundary is assumed to occur at the
point where x = 0. If one is only interested in the lower boundary of a
stage, Φ{(t − t_e)/σ} can be set equal to one yielding the weighting function
f(x > t_e) = 1 − Φ(x) which is graphically shown in Figure 3.12A.
Alternatively, this weighting function can be derived directly: If all
possible ages above the stage boundary have an equal chance of being
represented, then the probability that their measured age assumes a
specific value is proportional to the integral of the Gaussian density
function for the errors. In terms of the definitions given, any inconsistent
age t_y greater than t_e has x > 0 whereas consistent ages with t_y < t_e have
x < 0. It is assumed that standardization of an age t_yi or t_oi can be
achieved by dividing either (t_yi − t_e) or (t_oi − t_e) by its standard error s_i
yielding x_i = (t_yi − t_e)/s_i or x_i = (t_oi − t_e)/s_i.
(3.11)
(3.12)
(3.13)
where n > te because n, inconsistent ages are used only. This weighting
function is shown in Figure 3.12B. If the corresponding likelihood
function is written as L_a, it follows that E² = −log_e L_a.
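
A chronogram value can be computed for a series of trial boundary ages as sketched below. The sketch assumes that E² is the sum of squared standardized deviations of the inconsistent dates only, which is how the chronogram method of Harland et al. (1982) is understood here; the function name and all dates are hypothetical and do not reproduce any published data set.

```python
def chronogram_e2(trial_age, younger, older):
    """E-squared for one trial boundary age (ages in Ma, increasing with age).

    younger / older: lists of (age, standard_error) for rocks that lie
    stratigraphically above / below the boundary.
    """
    e2 = 0.0
    for age, se in younger:
        if age > trial_age:                  # younger rock dated as older than the boundary
            e2 += ((age - trial_age) / se) ** 2
    for age, se in older:
        if age < trial_age:                  # older rock dated as younger than the boundary
            e2 += ((trial_age - age) / se) ** 2
    return e2

# Hypothetical dates (Ma) with their standard errors.
younger = [(560.0, 5.0), (572.0, 4.0)]
older = [(575.0, 5.0), (590.0, 6.0)]

curve = {t: chronogram_e2(t, younger, older) for t in range(560, 592, 4)}
for t, e2 in curve.items():
    print(t, round(e2, 3))
```

Trial ages for which no date is inconsistent give E² = 0, so that the curve has a flat floor when the younger and older dates do not overlap; this is one reason why a parabolic shape is more readily obtained when consistent ages are used as well.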
For example, the quantity E² is plotted in the vertical direction of
Figure 3.13 for the Caerfai-St. David’s boundary example taken from
Harland et al. (1982, Fig. 3.7i). The data on which this chronogram is
based are shown along the top. Values of E² were calculated at intervals of
4 Ma and a parabola was fitted to the resulting values by using the method
[Figure 3.13: chronogram; horizontal axis, geologic time (Ma); positions m − s, m and m + s are marked.]
Fig. 3.13 Chronogram for Caerfai-St. David’s boundary example and parabola fitted by method of least
squares. E² = −log-likelihood is plotted in vertical direction. Dates belonging to stages which are older
and younger than boundary are indicated by o and y, respectively. Standard deviation follows from d
representing width of parabola for E² equal to its minimum value augmented by 2.
$$E^2 = -a - b\,t_e - c\,t_e^2 \tag{3.14}$$
The estimated ages of the Caerfai - St. David’s boundary and their
standard deviations obtained for L_a and L also are similar. This
conclusion will be corroborated by a more detailed comparison of the
weighting functions for L and L_a at the end of this section, and by
computer simulation experiments to be described in the next section.
However, L_a does not provide a good approximation of L when inconsistent
ages are missing.
A parabolic chronogram is more readily obtained when the consistent
ages are used together with the inconsistent ages as in the method
discussed here. A numerical example of the kinds of differences in results
obtained is as follows. An age estimate based on the chronogram of
Harland et al. (1982, Fig. 3.4h, p. 57) for the Norian-Rhaetian boundary
would be approximately 213 Ma. The corresponding standard error as
reported by Harland et al. (1982) is 9 Ma. The maximum likelihood
method using the same set of 6 data gives an estimated age of 215.5 Ma
with corresponding standard error of 4.2 Ma.
[Figure 3.14: plot for the Caerfai-St. David’s boundary example.]
Fig. 3.14 Caerfai-St. David’s boundary example. Age (m) estimated by maximum likelihood method
using L. Standard deviation (s) and width of 95 percent confidence interval are approximated closely by
results shown in Figure 3.13.
(3.17)
P (t) = I - @ ( + )
IY
rn
= f(5)
m (3.18)
Cox and Dalrymple (1967) next introduced the trial boundary age t ,
and defined a measure of dispersion of all inconsistent dates t, with
respect to t , satisfying:
(3.19)
(3.20)
log, {l - @ ( J ) } = - p r 2 +u (3.21)
[Figure 3.15: simulated dates for two runs plotted along scales (a) and (b) running from 0 to 10.]
Fig. 3.15 Two examples of runs (Runs No. 1 and No. 7) in computer simulation experiment. True dates
(a) were generated first, classified and increased (or decreased) by random amount. Younger and older
ages are shown above and below scale (b), respectively.
(see e.g. Rao, 1973); and (c) how do results derived from the chronograms
in Harland et al. (1982) compare to those obtained by the maximum
likelihood method.
P = Φ(z) for values of t on the interval [t_e − 3, t_e + 3] where Φ(z) denotes
cumulative frequency of the normal distribution in standard form. The
frequency corresponding to 3 is equal to 0.999 of which the natural
logarithm is equal to -0.001. For this reason, values outside the interval
t_e ± 3 yield probabilities which are approximately 1 (or 0 for the log-likelihood
function) and these were not used for further analysis. Thus a
natural window is provided screening out dates that are not in the vicinity
of the age of the chronostratigraphic boundary to be estimated. Most
probabilities are greater than 0.5. Only inconsistent dates (asterisks in
Table 3.2) give probabilities less than 0.5. The value of the log-likelihood
TABLE 3.2
Run 1 for computer simulation experiment. True dates T were classified as younger (A) or older (B) than
true age of stage boundary (= 5). Dates t with measurement error are compared to trial age (t_e = 4.6).
Inconsistent ages are indicated by asterisks. z = -x for younger rocks (A) and z = x for older rocks (B).
Standard normal z-value is fractile of probability P. Total of logs of P gives value of log-likelihood
function for t_e = 4.6.
T       Class   t        x (= t-4.6)   z        P        log_e P
4.587   A       4.380    -0.220        0.220    0.5871   -0.5325
7.800   B       8.048     3.448        3.448
2.124   A       2.193    -2.407        2.407    0.9920   -0.0081
0.668   A       2.239    -2.361        2.361    0.9909   -0.0092
6.225   B       5.802     1.202        1.202    0.8853   -0.1218
9.990   B       9.945     5.345        5.345
4.896   A       4.574    -0.026        0.026    0.5102   -0.6730
4.606   A*      6.487     1.887       -1.887    0.0296   -3.5211
0.796   A       0.553    -4.047        4.047
1.855   A       2.526    -2.074        2.074    0.9810   -0.0192
6.292   B       6.923     2.323        2.323    0.9899   -0.0101
3.280   A       1.998    -2.602        2.602    0.9954   -0.0046
2.422   A       1.435    -3.165        3.165
1.397   A       0.912    -3.688        3.688
4.538   A       4.365    -0.235        0.235    0.5928   -0.5230
0.830   A       0.803    -3.797        3.797
6.194   B*      4.033    -0.567       -0.567    0.2854   -1.2540
4.545   A       3.930    -0.670        0.670    0.7490   -0.2890
4.774   A*      4.814     0.214       -0.214    0.4154   -0.8786
0.905   A       0.713    -3.887        3.887
9.763   B       11.197
8.285   B       8.902     4.302        4.302
3.131   A       3.676    -0.924        0.924    0.8224   -0.1955
9.987   B       9.435     4.835        4.835
9.442   B       9.620     5.020        5.020
                                                Total =  -8.0397
TABLE 3.3
Values of log-likelihood functions estimated for Run 1 and predicted values for parabola fitted by method
of least squares. Initial guesses of extreme values are indicated by asterisks.
function for t_e is the sum of the logs of the probabilities as illustrated for
t_e = 4.6 in Table 3.2.
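The calculation behind Table 3.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used for the original experiment: the function names `phi` and `log_likelihood` and the list `RUN1` are hypothetical, unit measurement error is assumed for every date, and the dates are the rounded values of t printed in Table 3.2.

```python
import math

def phi(z):
    """Cumulative frequency of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(dates, t_e, window=3.0):
    """Sum of ln P over all dates within `window` of the trial age t_e.

    dates: iterable of (label, t) with label 'A' (younger) or 'B' (older).
    Unit measurement error is assumed, so z equals plus or minus x.
    """
    total = 0.0
    for label, t in dates:
        x = t - t_e
        if abs(x) > window:
            continue  # P is close to 1 here, so ln P is close to 0
        z = -x if label == "A" else x
        total += math.log(phi(z))
    return total

# Classes and dates t of Run 1 as listed in Table 3.2.
RUN1 = [("A", 4.380), ("B", 8.048), ("A", 2.193), ("A", 2.239), ("B", 5.802),
        ("B", 9.945), ("A", 4.574), ("A", 6.487), ("A", 0.553), ("A", 2.526),
        ("B", 6.923), ("A", 1.998), ("A", 1.435), ("A", 0.912), ("A", 4.365),
        ("A", 0.803), ("B", 4.033), ("A", 3.930), ("A", 4.814), ("A", 0.713),
        ("B", 11.197), ("B", 8.902), ("A", 3.676), ("B", 9.435), ("B", 9.620)]

# Close to the total of -8.0397 in Table 3.2 (small differences arise
# because the tabulated dates are rounded to three decimals).
print(round(log_likelihood(RUN1, 4.6), 2))
```

Evaluating `log_likelihood` for a series of trial ages gives a curve whose maximum can then be located.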
Log-likelihood values for Run No. 1 are shown in Table 3.3 with t_e
ranging from 3 to 7 in steps of 0.1. The largest log-likelihood value is
reached for t_e = 5.6 and this value was selected as the first approximation
t_e1 of the age of the stage boundary. In total, 21 values of t_e with
|t_e − t_e1| ≤ 1.0 were used for fitting a parabola as shown in Figure 3.16. The fitted
parabola is more or less independent of number of values used (= 21) and
width of neighborhood (= 2). However, the neighborhood should not be
made too wide because of random fluctuations (local minima or maxima)
near t_e = 3 or 7 (see e.g. Table 3.3). These edge effects should be avoided.
Fig. 3.16 Maximum-likelihood method used for estimating mean of age of stage boundary in Run 1 (data
as in Fig. 3.15). Standard deviation (s) and 95 percent confidence interval also are shown. A. Likelihood
function L was used. B. Chronogram for Run 1 (using La instead of L). Note similarity of s and 95
percent confidence interval in Figs. 3.16A and B.
They are due to the fact that the initial range of simulated time was
arbitrarily set equal to 10 in the computer simulation experiment. The
peak of this parabola provides the second approximation m = t_e2 of the
estimated age. The standard deviation (s) of the corresponding normal
distribution can be used to estimate the 95 percent confidence interval
m ± 1.96s also shown in Figure 3.16.
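A least-squares parabola of this kind can be sketched as follows. The sketch assumes that the fitted parabola a·t² + b·t + c is read as the logarithm of a normal curve, so that the peak lies at −b/(2a) and s = √(−1/(2a)); the function names `fit_parabola` and `peak_and_sd` are hypothetical and the code is not taken from the original program.

```python
import math

def fit_parabola(ts, ys):
    """Least-squares fit y = a*u**2 + b*u + c, with u = t - mean(t)."""
    n = len(ts)
    t0 = sum(ts) / n
    us = [t - t0 for t in ts]
    # Power sums for the 3x3 normal equations.
    s = [sum(u ** k for u in us) for k in range(5)]
    r = [sum(y * u ** k for u, y in zip(us, ys)) for k in range(3)]
    m = [[s[4], s[3], s[2], r[2]],
         [s[3], s[2], s[1], r[1]],
         [s[2], s[1], s[0], r[0]]]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(3):
        p = max(range(i, 3), key=lambda k: abs(m[k][i]))
        m[i], m[p] = m[p], m[i]
        for k in range(3):
            if k != i:
                f = m[k][i] / m[i][i]
                m[k] = [vk - f * vi for vk, vi in zip(m[k], m[i])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c, t0

def peak_and_sd(ts, logls):
    """Peak position m and standard deviation s of the fitted parabola."""
    a, b, _, t0 = fit_parabola(ts, logls)
    if a >= 0:
        raise ValueError("fitted parabola has no maximum")
    mean = t0 - b / (2 * a)
    sd = math.sqrt(-1 / (2 * a))
    return mean, sd

# Synthetic check: 21 trial ages around 5.6, exact parabola with m=5.5, s=0.5.
ts = [5.6 + 0.1 * k for k in range(-10, 11)]
ys = [-8.0 - (t - 5.5) ** 2 / (2 * 0.5 ** 2) for t in ts]
m, s = peak_and_sd(ts, ys)
print(round(m, 3), round(s, 3), round(m - 1.96 * s, 2), round(m + 1.96 * s, 2))
```

Centering the trial ages on their mean before fitting keeps the normal equations well conditioned; it does not change the fitted curve.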
The sum of squares E² for La, using inconsistent dates only, is also
shown in Table 3.3 as a function of t_e. The first approximation of the
location of its minimum is t_e = 5.3. The corresponding parabola is shown in
Figure 3.16. The mean age resulting from La is about 0.3 less than the
mean based on L and its standard deviation is nearly the same. It is
fortuitous that the mean based on La is closer to the population mean (= 5)
than that based on L. On the average, the original maximum likelihood
(L) method gives better results (see results for 50 runs given at the end of
this section).
Younger and older ages generated in each of the first 10 (unit
variance) computer simulation runs are shown in Figure 3.17 together
with their estimated mean and 95 percent confidence interval using L.
Theoretically, each population mean (= 5) is contained within the
95 percent confidence interval around the sampling mean with a
probability of 95 percent. The means and standard deviations used for
Fig. 3.17 Dates generated in first 10 runs of computer simulation experiment (cf. results for No. 1 and
No. 7 shown in Fig. 3.15). Mean and 95 percent confidence interval estimated by maximum-likelihood
method are shown for comparison with true mean (= 5).
Figure 3.17 are listed in Table 3.4 (Maximum likelihood method with
parabola). Also listed in Table 3.4 are the corresponding results for La
(Gaussian weighting function with parabola). The means based on La are
close to those for L. The estimated standard deviations tend to be either
TABLE 3.4
First 10 runs of computer simulation experiment. Comparison of results obtained by fitting parabola and
scoring method, respectively. Standard deviations marked by asterisks are too large (cf. Fig. 3.18B).
      L                                             La
      First    Parabola        Scoring              First    Parabola        Scoring
Run   approx.  Mean    SD      Mean    SD           approx.  Mean    SD      Mean    SD
1     5.6      5.582   0.479   5.554   0.481        5.3      5.269   0.470   5.260   0.500
2     5.7      5.632   0.481   5.663   0.489        6.3      6.190   0.480   6.264   0.500
3     5.1      5.153   0.420   5.142   0.423        4.8      4.884   0.335   4.828   0.316
4     4.5      4.506   0.447   4.507   0.452        4.2      4.321   0.395   4.216   0.354
5     5.1      5.070   0.461   5.089   0.466        5.3      5.217   0.482   5.293   0.408
6     4.4      4.419   0.502   4.448   0.505        4.6      4.625   0.749*
7     5.7      5.710   0.531   5.728   0.542        5.8      5.767   3.924*
8     5.2      5.205   0.406   5.200   0.411        5.0      5.025   0.364   5.017   0.408
9     5.0      5.022   0.417   5.018   0.419        5.0      4.966   0.614*
10    4.2      4.231   0.609   4.232   0.623        4.3      4.248   1.001*
slightly smaller or much greater. It can be seen from the results for
Run No. 7 shown in Figure 3.18 that the greater standard deviations are
due to a break-down of this particular method of estimation.
Results obtained by means of the method of scoring
(see e.g. Rao, 1973, p. 366-374) also are shown in Table 3.4. In our
application of this method, the following procedure was followed. As
before, the log-likelihood was calculated for 0.1 increments in t_e and the
largest of these values was used as the initial guess. Suppose that this
value is written as y. Two other values x and z were calculated
representing log-likelihood values close to y at small distances −10⁻⁴ and
+10⁻⁴ along the t_e-axis. The quantities D1 = 0.5(z − x)·10⁴ and
D2 = (x − 2y + z)·10⁸ were used to obtain a second approximation of the
mean by subtracting D1/D2 from the initial guess. The procedure was
repeated until the difference between successive approximations became
negligibly small. Then the standard deviation of the estimate is given by
SD = 1/√|D2|.
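The iteration can be sketched as a Newton-type search with finite-difference derivatives. The function name `scoring`, the tolerance and the iteration limit are assumptions made for illustration; the step h = 10⁻⁴ follows the text, and the log-likelihood function is passed in as an argument.

```python
import math

def scoring(loglik, t_start, h=1e-4, tol=1e-6, max_iter=50):
    """Refine t_start; returns (estimate, standard deviation)."""
    t = t_start
    for _ in range(max_iter):
        x, y, z = loglik(t - h), loglik(t), loglik(t + h)
        d1 = 0.5 * (z - x) / h          # first derivative (D1)
        d2 = (x - 2 * y + z) / h ** 2   # second derivative (D2)
        if d2 >= 0:
            # No local maximum nearby: the method gives no answer.
            raise ValueError("log-likelihood is not concave at this point")
        step = d1 / d2
        t -= step
        if abs(step) < tol:
            return t, 1 / math.sqrt(-d2)
    raise RuntimeError("no convergence")

# Synthetic check with an exact parabola (m = 5.5, s = 0.5).
est, sd = scoring(lambda t: -8.0 - (t - 5.5) ** 2 / (2 * 0.5 ** 2), 5.6)
print(round(est, 3), round(sd, 3))
```

For an exactly parabolic log-likelihood one step is enough; for other shapes the loop repeats until successive approximations agree to within `tol`.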
For L, the scoring method generally yields estimates of SD which are
slightly greater than those resulting from the parabola method. However,
the difference is negligibly small (Table 3.4). For La, the scoring method
provided an answer in only 6 of the 10 experiments of Table 3.4.
Similar results were obtained for runs in a second type of computer
simulation experiment using variable measurement error (see Agterberg,
1988, for details). In total, 50 runs were made for each of the two types of
Fig. 3.18 Maximum-likelihood method used for estimating mean age of stage boundary in Run 7 (data as
in Fig. 3.15). A. Likelihood function L was used. B. Likelihood function La did not give good result.
TABLE 3.5
Ages and estimated standard deviations used for fitting spline-curve No. 1 shown in Figure 3.19.
chronogram; and (d) the standard deviation was set proportional to the age
range listed in the summary time scale (Harland et al., 1982, pp. 52-55)
with constant of proportionality equal to 3 d 2.
The fourth modification (d) is based on the earlier considerations
corroborated by the computer simulation experiments proving that the
parabola for La provides an excellent approximation to the parabola for L.
A cubic spline-curve was fitted to the data in Figure 3.19 for the
following reasons. A spline-curve is very smooth because there are no
abrupt changes in the rate of change of its slope; the principle of least
squares is used; and deviations between observed values (crosses in
Fig. 3.19 Spline-curves fitted to ages of stage boundaries listed in Table 3.5. Spline-curve 1A was fitted
to data for stage boundaries numbered 7 to 27 only.
Fig. 3.19) and spline-curve are permitted to exist but the sum of squares of
these deviations can be regulated; a weight can be assigned to each
observed value. This weight is inversely proportional to the variance of
the observed value.
Let the vertical and horizontal axes in Figure 3.19 represent
observations written as x_i, y_i (i = 1,..., n), respectively. Then the
smoothing spline-function f(x) to be constructed minimizes
    ∫ [f''(x)]² dx                                        (3.22)
subject to the condition
    Σ [(f(x_i) − y_i) / s(y_i)]² ≤ S   (sum over i = 1,..., n)   (3.23)
Here the s(yi) are the standard deviations of the values yi. The sum of
standardized deviations S is a random variable approximately distributed
as chi-squared with n degrees of freedom and variance equal to 2n. The
expected value of S, which is equal to n, was used in the applications of this
section.
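The role of S can be illustrated with a short calculation. The sketch below only evaluates the sum of standardized deviations for a given set of fitted values and compares it with the chi-squared mean n and standard deviation √(2n); it does not fit a spline. The function names and the three data points are hypothetical.

```python
import math

def standardized_sum(fitted, observed, sds):
    """S = sum of squared deviations, each divided by its standard deviation."""
    return sum(((f - y) / s) ** 2 for f, y, s in zip(fitted, observed, sds))

def chi_squared_moments(n):
    """Mean and standard deviation of chi-squared with n degrees of freedom."""
    return n, math.sqrt(2 * n)

# Hypothetical ages (Ma): spline values, observed values, standard deviations.
fitted = [100.0, 120.0, 140.0]
observed = [101.0, 118.0, 140.0]
sds = [1.0, 2.0, 4.0]

S = standardized_sum(fitted, observed, sds)
mean_S, sd_S = chi_squared_moments(len(observed))
print(S, mean_S, round(sd_S, 2))  # 2.0 3 2.45
```

A value of S far above n would suggest a curve that is too smooth for the stated errors, and a value far below n a curve that follows the data too closely. Points with zero standard deviation have to be treated separately because the ratio is undefined for them.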
It can be seen in Figure 3.19 that the fitted spline-curve No. 1 tends to
follow the stage boundaries in the Cretaceous more closely because these
are relatively precise. In places where the uncertainty is great, the
spline-curve tends to become a straight line. Spline-curve No. 1A shown
also in Figure 3.19 was fitted to points for stage boundaries between the
Anisian and Cenomanian. It is nearly straight and closely approximates
Spline-curve 1.
TABLE 3.6
Ages used for fitting spline-curve No. 2 based on equal duration of Hallam's ammonite zones in the
Jurassic; without and with tie-points, respectively.
Fig. 3.20 Spline-curve fitted to ages of stage boundaries for Jurassic listed in Table 3.6. This cubic
smoothing spline passes exactly through two tie-points with SD = 0.
Fig. 3.21 Comparison of spline-curve ages (rounded off to nearest integer Ma values) for Jurassic to ages
estimated by Harland et al. (1982) and by Kent and Gradstein (1985). The asterisks in column 4 denote
key ages of tie-points through which the spline-curve solution was forced to pass. For further
information see Agterberg (1988).
The input for spline-curve fitting was further modified by using as
tie-points 156 Ma instead of 151 Ma for the Oxfordian-Kimmeridgian and
208 Ma instead of 212 Ma for the Triassic-Jurassic boundary, respectively,
setting the standard deviations of these ages equal to zero. As
demonstrated in Agterberg (1988, Appendix 2), the spline-curve has the
property of passing exactly through points of which the standard deviation
is zero. Spline-curve No. 2 with tie-points is shown in Figure 3.20. The
ages of stage boundaries (rounded off to 1 Ma) obtained by three methods
of cubic spline-fitting are shown in Figure 3.21 for comparison with the
other age estimates. Ages for the modified spline-curve (No. 2) for equal
duration of zones but without use of tie-points are shown between those
based on Figures 3.20 and 3.21. The spline-curves all gave 208 Ma for the
age of the Triassic-Jurassic boundary which is younger than the estimate of
213 Ma in Harland et al. (1982) although the same original age
determinations were used.
The spline-curves yield ages of 138 Ma and 140 Ma for the
Jurassic-Cretaceous boundary which are younger than the 144 Ma age in Harland
et al. (1982) and Kent and Gradstein (1985). This relatively young age is
mainly due to the effect of (a) a relatively young Oxfordian glauconite age
listed as 148.22 Ma in Harland et al. (1982) and as 145 ± 3 Ma in
Armstrong (1978) who, in turn, extracted it from Gygi and
McDowell (1970), and (b) 4 other relatively young glauconite ages listed in
Harland et al. (1982) for the Tithonian. If these 5 dates were not used,
the spline-curves would also give an age of approximately 144 Ma for the
top of the Jurassic. In the beginning of Section 3.9 it was pointed out that
Odin (Editor, 1982) using more glauconite dates estimated a much
younger age (130 Ma) for this boundary. The problem of estimating the
age of the Jurassic-Cretaceous boundary also will be considered in the next
section.
three sections it has been shown that statistical estimation of the ages of
chronostratigraphic boundaries in the geological time scale can be
improved in two ways: (a) the maximum likelihood method can be used for
estimation of the age of individual chronostratigraphic boundaries, and
(b) after estimating the ages of a set of successive boundaries by the
method of maximum likelihood, these can be further improved by using a
cubic spline-curve for smoothing. The resulting methodological
improvements, however, are small in comparison with changes that result
from changing the input data. Harland et al. (1982) used
high-temperature dates mainly. If low-temperature dates are used (cf. Odin,
Editor, 1982) significantly younger ages are obtained for some stages,
especially those near the Jurassic-Cretaceous and Proterozoic-Phanerozoic
boundaries.
Haq et al. (1987) provided a new sea level and sedimentary cycles
chart, calibrated to a new geological time scale for which they used
mixtures of low- and high-temperature dates. This procedure was
criticized by Gradstein et al. (1988) partly because it can be shown that the
low-temperature (glaucony) ages are systematically younger. Odin
(Editor, 1982) had pointed out for one sample (NDS2) that its glauconite
age of 39.6 ± 1.8 Ma is a minimum age and that 1.5 to 2 Ma should be
added to it “bearing in mind the long time necessary for the evolution of
the dated glaucony”. Similar corrections may have to be applied to other
glauconite dates as well.
The following statistical experiment performed by the author was
briefly described in Gradstein et al. (1988). In total, 19 low-temperature
and high-temperature dates listed by Harland et al. (1982; Table 3.1, p. 61)
were used to estimate three different ages of the Jurassic-Cretaceous
boundary. The 7 high-temperature dates in this group of 19 dates are
plotted along the top of Figure 3.22, and the 12 low-temperature dates
along the bottom. The maximum likelihood method was applied taking
the high- and low-temperature dates separately, and to the combined
group of 19 values. Best-fitting parabolas are shown in Figure 3.22. Trial
ages t_e at intervals of 4 Ma were used. Detailed calculations are shown in
Table 3.7 for t_e = 132 Ma for high-temperature dates only. The parabola
fitted to the log-likelihood values of the high-temperature dates shows a
relatively poor fit mainly because these values are determined, to a large
extent, by a single Jurassic date (153.32 ± 5.00 Ma). The other older date
Fig. 3.22 Maximum likelihood method used for estimating age of Jurassic-Cretaceous boundary. See
text for further explanation.
TABLE 3.7
Calculation of logs of probabilities ( P ) for trial age of 132 Ma using 7 high-temperature dates only. The
sum of these values is one of the values plotted in Figure 3.22 and used to fit the parabola for high-
temperature dates. Procedure is similar to the one followed in the example of Table 3.2. However, every
z-value for an age was obtained after dividing the deviation from the trial age by the measurement error
(s) which previously was equal to unity for all deviations in Table 3.2. A and B represent Cretaceous and
Jurassic material, respectively.
which is √2 times smaller than the errors of the individual ages. This
result is in agreement with the maximum likelihood approximation of L
by La.
Various authors have assigned different meanings to the error on the
Mesozoic and Paleozoic time scales of Harland et al. (1982). For example,
Carr et al. (1984) assumed that Harland et al. (1982), by stating that this
error is 2.5 Ma, estimated the age of the Jurassic-Cretaceous boundary and
95% confidence interval as 144 ± 2.5 Ma. On the other hand, Menning
(1989) quotes “confidence limits” for this boundary as 144 ± 5 Ma. The
standard error corresponding to the error of 2.5 Ma estimated by Harland
et al. is (2.5/√2 =) 1.77 Ma. Multiplication of this standard error by 2
gives a statistically-based estimate of 144 ± 3.5 Ma for the 95% confidence
interval. This width is between those of Carr et al. (1984) and Menning
(1989), respectively.
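The arithmetic of this paragraph can be checked directly. The variable names below are chosen for illustration only.

```python
import math

reported_error = 2.5                                # Ma, Harland et al. (1982)
standard_error = reported_error / math.sqrt(2)      # error divided by sqrt(2)
half_width_95 = 2 * standard_error                  # two standard errors

print(round(standard_error, 2), round(half_width_95, 1))  # 1.77 3.5
```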
In order to estimate the precision of the ages of chronostratigraphic
boundaries, it is important to have good estimates of the errors of the
isotopic dates on which these age estimates are based. Harland et al.
(1982) found that although most determinations quote an error, a
significant number do not. Errors for these determinations were
estimated by fitting a linear regression line to the available error/time
data.
For those isotopic ages that have published errors, it may not be
immediately obvious whether these are standard deviations or 95%
confidence limits. For example, Harland et al. (1982) used a number of
Ordovician and Silurian fission track ages from McKerrow et al. (1980)
with quoted errors of about 10 Ma. In Gale et al. (1980), these same ages
are tabulated with errors “at the 2σ level” that are twice as large (about 20
Ma). From this, it can be inferred that the age determination errors in
Harland et al. (1982) are indeed standard deviations, although they were
not identified as such in McKerrow et al. (1980).
If errors are standard deviations, it generally can be assumed that
there is 68 percent probability that the unknown true value occurs within
the error interval reported. By taking error limits that are twice as large
this probability is increased to 95 percent. It should be kept in mind that
statements of this type imply that the error distributions are Gaussian or
“normal”.
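Under the Gaussian assumption, the probabilities quoted above follow from the standard normal distribution. A minimal sketch, with `coverage` as a hypothetical helper name:

```python
import math

def coverage(k):
    """Probability that a normal variate lies within k standard deviations."""
    return math.erf(k / math.sqrt(2))

print(round(coverage(1), 4))     # 0.6827, about 68 percent
print(round(coverage(2), 4))     # 0.9545, about 95 percent
print(round(coverage(1.96), 4))  # 0.95
```

The factor 2 is therefore a convenient rounding of the factor 1.96 that gives exactly 95 percent.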
CHAPTER 4
CODING AND FILE MANAGEMENT OF STRATIGRAPHIC
INFORMATION
4.1 Introduction
During the past five years it has become common practice to use
microcomputers for the creation, updating and quantitative analysis of
stratigraphic information. Lists of fossils and stratigraphic events
observed in wells or outcrop sections can be coded and stored together with
measurements on their position. The resulting files can be readily
submitted to various types of data processing. In the Microsoft Disk
Operating System (DOS), for example, files are identified by filenames
which are from one to eight characters long. These filenames may be
followed by extensions consisting of a period followed by one, two or three
characters.
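The naming rule just described can be expressed as a simple pattern check. The sketch below tests only the lengths stated in the text (one to eight characters, optionally a period and one to three characters); the function name and the example filenames are hypothetical, and the further restrictions DOS places on individual characters are not modelled.

```python
import re

# One to eight characters, optionally followed by a period and one to
# three characters; periods and whitespace are not allowed inside a part.
_DOS_NAME = re.compile(r"[^.\s]{1,8}(\.[^.\s]{1,3})?")

def is_dos_filename(name):
    """True if `name` satisfies the length rule described in the text."""
    return _DOS_NAME.fullmatch(name) is not None

for name in ["HAY.SEQ", "SULLIVAN.DIC", "STRATIGRAPHY.DAT", "HAY.PARM"]:
    print(name, is_dos_filename(name))
```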
In order to illustrate data management in biostratigraphy, a number
of datasets ranging from small and simple, to large and complex will be
introduced in this chapter. Later, these same datasets will be used to
illustrate automated stratigraphic correlation techniques. The primary
purpose of the data management required is to create various types of
sequence files for different stratigraphic sections which can later be
systematically compared with one another in preparation of automated
stratigraphic correlation. Before presentation of the datasets, five types of
files are defined which will be used in the examples. For convenience, the
different types of files are indicated by three-letter extensions as in
Microsoft DOS.
DIC files
Dictionary (DIC) files contain lists of fossil names (or event names).
They include all names to be used for a regional study. The order of the
names in the DIC files is arbitrary when the file is created. The names
may be initially ordered according to a system selected by the user. For
example, the alphabetic order of taxa can be used, taxa can be grouped
according to families, with alphabetic order within families, or use can be
made of the order in which different taxa are identified in one or more
relatively complete stratigraphic sections for a region.
Microsoft DOS permits rapid alphabetic sorting of names. (It also is
possible to obtain alphabetic lists by means of RASC.) However, most
stratigraphers prefer other types of order for their lists. When a list of
fossil names, alphabetic or otherwise, is available for a region, the names
can be automatically numbered for the DIC files. The assigned sequence
numbers will later be used as codes for the taxa. It is convenient to enter
only one name per taxon in the original DIC file for a region. In
exploratory drilling, when well cuttings are used to determine highest
occurrences of taxa (and lowest occurrences are not used because of
downhole contamination), the DIC file initially created for taxa can be
used for the highest occurrences as well. If both highest and lowest
occurrences of taxa are used, it may be necessary to create a new DIC file
for events from the DIC file for taxa. A simple procedure for this is to
automatically replace each taxon dictionary number i (i = 1,2,...,n) by two
numbers (2i-1) and (2i). The odd numbers (2i-1) may be used for lowest
occurrences and even numbers (2i) for highest occurrences. In the RASC
computer program for this procedure the same taxon name is used for
highest and lowest occurrences. They are distinguished in the event
dictionary by preceding them with the indicators HI and LO, respectively.
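The renumbering rule can be sketched as follows. The function name `event_dictionary` and the in-memory representation (a Python dict keyed by event number) are assumptions for illustration and do not reproduce the file layout used by RASC.

```python
def event_dictionary(taxa):
    """Map taxon number i to events 2i-1 (LO) and 2i (HI)."""
    events = {}
    for i, name in enumerate(taxa, start=1):
        events[2 * i - 1] = "LO " + name   # lowest occurrence
        events[2 * i] = "HI " + name       # highest occurrence
    return events

taxa = ["DISCOASTER DISTINCTUS", "COCCOLITHUS CRIBELLUM"]
for number, label in event_dictionary(taxa).items():
    print(number, label)
```

With this numbering the taxon of any event number e can be recovered as (e + 1) // 2, and the parity of e tells whether the event is a lowest or a highest occurrence.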
DAT files
Data (DAT) files contain information on all events in all sections to be
used for the study of a region. Different formats can be used. These formats
may emulate data entry procedures of the paleontologist. DAT files consist
of separate lists of samples corresponding to the separate stratigraphic
sections or wells for a region. Examples of formats are as follows: For
exploratory wells, the paleontologist often works with cuttings which
successively become available while proceeding in the stratigraphically
downward direction. For each well, the depth of a sample, e.g. as measured
from sea level, can be entered, followed by the highest occurrences of all
taxa identified for this sample. For outcrop sections, the paleontologist
usually works in the stratigraphically upward direction. The distances
measured in the stratigraphic direction (perpendicular to bedding) may be
measured for each region from the base of each section upwards.
Consequently, every section has its own scale. The origins of these scales
which are set at the stratigraphically lowest points in the sections usually
do not occur in the same bed. A common procedure of coding the
information consists of entering the name of a taxon followed by its lowest
and highest occurrence measured along the scale for the section. This scale
may be in meters or feet, or may be a sequence of numbers representing
beds counted in the stratigraphically upward direction. If beds without
highest or lowest occurrences are skipped in the counting, the numbers
represent so-called “event levels”. DAT files can automatically be changed
into SEQ and preliminary DEP files. The depth files that can be created
from a DAT file are preliminary because information on probable depths of
events in wells (or probable locations of events in outcrop sections) which
is needed for automated stratigraphic correlation only can be added after
application of ranking and scaling to the SEQ file.
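The conversion from recorded depths to a sequence can be sketched as below. The sketch assumes the coding convention of Table 4.3 (a minus sign marks an event coeval with the preceding one, and -999 ends a section); the function name and the depths are hypothetical and the actual DAT format is not reproduced.

```python
from itertools import groupby

def seq_from_depths(events):
    """Build a downward SEQ line from (event_code, depth) pairs.

    Depth increases downward; events at equal depth are coded as coeval.
    """
    ordered = sorted(events, key=lambda e: e[1])  # stable: ties keep input order
    tokens = []
    for _, group in groupby(ordered, key=lambda e: e[1]):
        codes = [code for code, _ in group]
        tokens.append(str(codes[0]))
        tokens.extend("-" + str(c) for c in codes[1:])
    tokens.append("-999")
    return " ".join(tokens)

# Hypothetical depths in feet for five events in one section.
section = [(10, 30), (9, 70), (8, 90), (7, 90), (2, 150)]
print(seq_from_depths(section))  # 10 9 8 -7 2 -999
```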
SEQ files
Sequence (SEQ) files consist of sequences of all stratigraphic events in
all sections to be used for the study of a region. The events are positioned
according to their relative stratigraphic position, usually proceeding in the
stratigraphically downward direction. Normally, SEQ files are
automatically created from DAT files, replacing them by superpositional
or equipositional (coeval) relations. The relative event levels are used for
indicating order in the SEQ files. The information in a SEQ file is
sufficient to ascertain for any pair of events (A, B) in a section whether A
was observed to occur stratigraphically above or below B, or whether A
and B were observed to be coeval in this section. SEQ files will be used for
ranking and scaling of the events in the region. In the optimum sequence
for a region, each event will obtain a rank above or below other events. In
the scaled optimum sequence there will be different intervals between
successive events. Zero interval between successive events along the
RASC scale would indicate that the events are coeval on the average for
the study region.
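How a pairwise relation can be read from one section of a SEQ file is sketched below, assuming the coding of Table 4.3 (minus sign for an event coeval with its predecessor, -999 as terminator) and the stratigraphically downward direction. The function names are hypothetical; the example line is section G of Table 4.3.

```python
def parse_seq(line):
    """Split a SEQ line into levels; each level is a list of coeval events."""
    levels = []
    for token in line.split():
        code = int(token)
        if code == -999:
            break
        if code < 0:
            levels[-1].append(-code)   # coeval with the preceding event
        else:
            levels.append([code])      # start of a new, lower level
    return levels

def relation(levels, a, b):
    """Return 'above', 'below' or 'coeval' for event a relative to b."""
    position = {e: i for i, level in enumerate(levels) for e in level}
    if a not in position or b not in position:
        return None                    # at least one event not observed
    if position[a] == position[b]:
        return "coeval"
    return "above" if position[a] < position[b] else "below"

levels = parse_seq("9 8 -10 5 -2 -1 4 -3 7 -999")
print(levels)                  # [[9], [8, 10], [5, 2, 1], [4, 3], [7]]
print(relation(levels, 9, 7))  # above
print(relation(levels, 2, 5))  # coeval
```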
PAR files
Parameter (PAR) files contain the settings of switches and values of
parameters needed to run the RASC computer program. For example, the
user may decide to only use events that occur in k_c or more sections. The
value of the parameter k_c then has to be set in the PAR file. In some
versions of RASC (e.g. micro-RASC, see Chapter 10), the parameters have
default values which can be changed interactively by the user.
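The effect of the threshold k_c can be sketched as a simple filter. The function name and the data layout (a dict of event lists per section) are assumptions for illustration, and coeval markers are left out for simplicity; the three example sections use the event numbers of sections C, D and H of Table 4.3.

```python
from collections import Counter

def filter_events(sections, k_c):
    """Keep only events that occur in at least k_c sections."""
    counts = Counter(e for events in sections.values() for e in set(events))
    keep = {e for e, n in counts.items() if n >= k_c}
    return {name: [e for e in events if e in keep]
            for name, events in sections.items()}

sections = {
    "C": [9, 1, 5, 2],
    "D": [10, 9, 8, 5, 7, 1, 2],
    "H": [4, 9, 5, 1, 10, 7],
}
print(filter_events(sections, 2))
print(filter_events(sections, 3))
```

Raising k_c from 2 to 3 in this small example removes events 2, 7 and 10 in addition to events 4 and 8, which shows how strongly the setting can change the dataset.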
DEP files
Depth (DEP) files contain information on the depths (in meters or in
terms of event levels) of stratigraphic events measured in the
stratigraphically downward direction for single sections. This information
is compared to the average positions of the events expressed either as
A-Vaca Valley
B-Pacheco Syncline
C-Tres Pinos
D-Upper Reliz Creek
E-New Idria
F-Media Agua Creek
G-Upper Canada de Santa Anita
H-Las Cruces
I-Lodo Gulch
J-Simi Valley
TABLE 4.1
Dictionary (DIC file) for Hay example. LO and HI represent lowest and highest occurrences of
nannofossils, respectively.
1 LO DISCOASTER DISTINCTUS
2 LO COCCOLITHUS CRIBELLUM
3 LO DISCOASTER GERMANICUS
4 LO COCCOLITHUS SOLITUS
5 LO COCCOLITHUS GAMMATION
6 LO RHABDOSPHAERA SCABROSA
7 LO DISCOASTER MINIMUS
8 LO DISCOASTER CRUCIFORMIS
9 HI DISCOASTER TRIBRACHIATUS
10 LO DISCOLITHUS DISTINCTUS
TABLE 4.2
Fossil name file (preliminary DIC file) for Sullivan database coded by Davaud and Guex (1978) and
Agterberg et al. (1985). A RASC input DIC file was obtained automatically from this file (see text).
Fig. 4.2 Hay example. Highest and lowest occurrences of Lower Tertiary nannofossils selected by Hay
(1972) from the Sullivan database. The 10 events are represented by symbols (cf. Fig. 5.1) which
correspond to numbers in Tables 4.1 and 4.3. 6 = lowest occurrence of Coccolithus gammation; 0 = lowest
occurrence of Coccolithus cribellum; 0 = lowest occurrence of Coccolithus solitus; V = lowest occurrence of
Discoaster cruciformis; < = lowest occurrence of Discoaster distinctus; n = lowest occurrence of
Discoaster germanicus; U = lowest occurrence of Discoaster minimus; w = highest occurrence of Discoaster
tribrachiatus; A = lowest occurrence of Discolithus distinctus; 8 = lowest occurrence of Rhabdosphaera
scabrosa. See Fig. 4.1 for locations of the 9 sections (A-I). The columns on the right represent a subjective
ordering of the events and Hay's original optimum sequence, respectively.
TABLE 4.3
Two SEQ files for Hay example. Minus signs (or hyphens) denote coeval events (cf. Fig. 4.1). The last
entry for a section is followed by -999. Left side: SEQ file for stratigraphically downward direction.
Right side: SEQ file for stratigraphically upward direction.
Section   Downward direction                Upward direction
A         9 8 7 6 -5 -4 -3 -2 -1 -999       1 -2 -3 -4 -5 -6 7 8 9 -999
B         9 10 -6 -5 -4 -7 -3 -2 -999       2 -3 -7 -4 -5 -6 -10 9 -999
C         9 1 5 2 -999                      2 5 1 9 -999
D         10 9 8 5 7 1 2 -999               2 1 7 5 8 9 10 -999
E         9 6 4 8 7 3 1 5 -2 -999           2 -5 1 3 7 8 4 6 9 -999
F         10 9 8 -7 2 5 -4 3 -1 -999        1 -3 4 -5 2 7 -8 9 10 -999
G         9 8 -10 5 -2 -1 4 -3 7 -999       7 3 -4 1 -2 -5 10 -8 9 -999
H         4 9 5 -1 -10 7 -999               7 10 -1 -5 9 4 -999
I         10 9 6 4 5 1 -3 2 -999            2 3 -1 5 4 6 9 10 -999
Fig. 4.3 Original stratigraphic information for three sections (F-H) of Sullivan database with
stratigraphic correlation based on nannoplankton faunizones according to Sullivan (1965). Table 4.4
contains information on distribution of 9 taxa in samples from Media Agua Creek section.
Table 4.3 shows two possible SEQ files for the stratigraphic
information of Figure 4.2. They are for the stratigraphically downward
and upward directions, respectively. For reasons to be discussed in
Chapter 5, the RASC computer program may give slightly different results
for the upward and downward directions. It will be instructive to run the
program on both SEQ files of Table 4.3 in order to illustrate the minor
changes brought about by inverting the order. Such minor changes are
usually much smaller than those resulting from altering the dataset by
resetting switches or parameters in the PAR file (see later). Unless stated
otherwise, we will use SEQ files for the stratigraphically downward
direction which is also the direction in which results are printed out in
tables and graphical displays.
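Inverting the direction of a section can be sketched as follows, again assuming the coding of Table 4.3. The function name `reverse_seq` is hypothetical; the example is section G of Table 4.3, for which the result agrees with the upward file listed there.

```python
def reverse_seq(line):
    """Convert a SEQ line from downward to upward direction (or back)."""
    levels = []
    for token in line.split():
        code = int(token)
        if code == -999:
            break
        if code < 0:
            levels[-1].append(-code)   # coeval with the preceding event
        else:
            levels.append([code])
    tokens = []
    for level in reversed(levels):     # lowest level first
        codes = level[::-1]            # order within a level is also inverted
        tokens.append(str(codes[0]))
        tokens.extend("-" + str(c) for c in codes[1:])
    return " ".join(tokens + ["-999"])

down = "9 8 -10 5 -2 -1 4 -3 7 -999"
up = reverse_seq(down)
print(up)                        # 7 3 -4 1 -2 -5 10 -8 9 -999
print(reverse_seq(up) == down)   # True
```

The minus signs move with the levels rather than with the individual events, which is why a plain reversal of the tokens would not give a valid file.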
The SEQ files of Table 4.3 contain all information represented in
Figure 4.2. Coeval events are shown by hyphens in the SEQ files. The
RASC computer program reads these hyphens as minus signs. There is
one-to-one correspondence between the SEQ files of Table 4.3 and the
graphical representation of Figure 4.2 in that the latter can be
reconstructed from the former and vice versa. No use was made of a DAT
file in order to obtain the SEQ files from Figure 4.2. This stage can be
skipped for the Hay example because the stratigraphic information is of a
simple nature. Normally, the stratigrapher will wish to construct a DAT
file from which the SEQ file is extracted automatically. This procedure
will be illustrated in the next section.
TABLE 4.4
Stratigraphic distribution of nine taxa of fossil nannoplankton for individual samples in the Media Agua
Creek area, Kern County, California (according to Sullivan, 1964, Table 3, and Sullivan, 1965, Table 6).
Stratigraphic distance (D) in feet measured upward and downward from base of “Tejon” Formation;
Paleocene-Eocene boundary occurs between 103 and 118 feet. Fossil (F) numbers in first column as in
Table 4.2; A-abundant; C-common; 0-few; x-rare. Single bar indicates stratigraphic events E1 to E10
used in Table 4.1 and Figure 4.3 (as defined for samples extending up to 88 feet below base of “Tejon”
Formation); relative superpositional relations are changed by using lowest occurrences of four taxa in
Paleocene shown in lower part of the table (also see Table 4.5). Level (L) as in Guex (1987, p. 228).
depths in feet of highest and lowest occurrences. The second file (Table
4.5B) has different depths for the lowest occurrences of five taxa because
the data from the Paleocene also were used. Partial SEQ files
automatically constructed from the data in Table 4.5 are shown in the first
two rows of Table 4.6. The first row (Eocene only) duplicates the row for
Section F in Table 4.3 (stratigraphically downward direction). The SEQ
file in the second row is different from the initial result. It is more realistic
because events 1, 2, 5, and 8 already existed before the Eocene. As
mentioned before, continued use will be made of the original Hay example
of Figure 4.2 and Table 4.3 for historical reasons. The extended SEQ file
incorporating the Paleocene data shown in Table 4.6 will be employed as
well. Differences between the SEQ files of Tables 4.3 and 4.6 are restricted
TABLE 4.5
Examples of partial DAT files for Media Agua Creek section of Table 4.4. Distances (in
feet) measured downward from base of “Tejon” Formation. Guex Levels are shown as L in
bottom row of Table 4.4.
83 146 -522 5 15
17 257 2 2 14
91 88 57 7 9
11 86 -1080 7 17
19 257 -522 2 15
94 72 57 9 9
90 241 -514 2 15
89 257 48 2 9
86 34 -522 10 15
to sections F and G because these are the only sections with additional
data not used by Hay (1972).
Artificial truncation of the observed ranges of some of the
nannoplankton taxa may occur when the coding and analysis are
restricted to relatively narrow time intervals, e.g. for one or two ages. Such
artificial truncation effects should be avoided as much as possible in
practice. It is likely that the relatively large number of coeval events at
the base of sections A and B in Figure 4.2 is in part also due to artificial
truncation. It is noted that Hay (1972) ignored coeval events in his original
method of obtaining an optimum sequence thus counteracting the possible
truncation effect. In the RASC method, coeval events will always be
considered. Although some ranking methods give the same results
whether or not observed coeval events are considered, the scaling methods
make extensive use of coeval events and these should not be ignored. The
truncation drawback of the Hay example will be avoided in most other
datasets to be discussed later.
The lowest and highest occurrences in the DAT and SEQ files for the
Hay example are based on rare occurrences within samples. Sullivan
(1965) adopted the widely used semi-quantitative method of categorizing
abundance (rare, few, common, abundant) in order to improve upon coding
presences and absences only, without following the laborious, and possibly
counter-productive, route of actually counting large numbers of individual
fossils. His charts normally show uninterrupted sequences for the
“abundant” and “common” categories (A’s and C’s in Table 4.5), whereas
the sequences for the “rare” and “few” categories (x’s and 0’s in Table 4.5)
are interrupted. As pointed out by Hay (1972), the only reasonable
explanation for the gaps in the sequences of x’s and 0’s is that the presence
or absence of a rare taxon is the realization of a random variable (also see
Section 3.3). All taxa were rare when they first and last appeared in a
TABLE 4.6
Partial SEQ files in stratigraphically downward direction for Media Agua Creek section as
derived from partial DAT files of Table 4.5. Event code numbers as in Table 4.1.
Eocene 1 (Distances)       10 9 8 -7 2 5 -4 3 -1
Eocene 2 (Guex levels)     10 9 -8 -7 2 -5 -4 -3 -1
Eocene and Paleocene 1     10 9 7 4 3 1 8 -2 -5
Eocene and Paleocene 2     10 9 -7 4 -3 1 8 -2 -5
basin. Some taxa (e.g. F 17 in Table 4.4) never became abundant contrary
to others (e.g. F 89 in Table 4.4) which were abundant as well as rare.
Stratigraphic events can be defined on the basis of rare occurrences as
well as abundant occurrences of a taxon. For example, Doeven et al. (1982)
applied ranking to a mixture of events in order to construct a nannofossil
range chart for Cretaceous nannofossils along the Canadian Atlantic
margin. This mixture included subtops (last consistent occurrences) and
superbottoms (first consistent occurrences) as well as the tops (last observed
occurrences) and bottoms (first observed occurrences) for selected
nannofossils. Definition of more than two events for these taxa helped to
improve the range chart. In general, subtops and superbottoms are less
subject to random variability in time than first and last occurrences (also
see Doeven, 1983).
The eighteen levels “L” in Table 4.5 were based on maximal horizons
for all ( = 82) taxa occurring in the Media Agua Creek area. The 44
samples of this section were combined into 18 levels by Guex (1987) with
loss of information on the relative order of first and last occurrences.
Many pairs of events were made coeval during the coding, although they
had a distinct order in the section before the cliques were determined. For
Fig. 4.4 Example of interval assignment J(i), i = 1, 2, ... for undirected graph (after Roberts, 1976). If
applied to a single stratigraphic section, each clique represents a maximal horizon or Guex level.
1 Karlsefni H-13
2 Snorri J-90
3 Herjolf M-92
4 Bjarni H-81
5 Gudrid H-55
6 Cartier D-79
7 Leif E-38
8 Leif M-48
9 Indian Harbour M-52
10 Freydis B-87
11 Bonavista C-99
12 Cumberland B-55
13 Dominion O-23
14 Egret K-36
15 Egret N-46
16 Osprey H-84
17 Heron H-73
18 Brant P-87
19 Kittiwake P-11
20 Wenonah J-75
21 Triumph P-50
22 Mohican I-100
Fig. 4.5 Location of 22 wells along Eastern Canadian margin used for Cenozoic foraminiferal
stratigraphy by Gradstein and Agterberg (1982). Original samples were obtained from Eastcan and
others: Karlsefni H-13 (1760-12 990'), Snorri J-90 (1260-9950'), Herjolf M-92 (3030-7800'), Bjarni H-81
(2760-6060'), Gudrid H-55 (1660-8580'), Cartier D-79 (1950-6070'); Tenneco and others: Leif E-38 (1210-
3557'); Eastcan and others: Leif M-48 (1300-5620'); BP Columbia and others: Indian Harbour M-52
(1740-10 480'); Eastcan and others: Freydis B-87 (1000-5260'); BP Columbia and others: Bonavista C-99
(1860-11 940'); Mobil Gulf Cumberland B-55 (920-11 830'), Dominion O-23 (1380-10 260'); Amoco Imp
Skelly: Egret N-46 (1060-2070'), Egret K-36 (860-2270'), Osprey H-84 (1190-2660'), Brant P-87 (1050-
6270'); Amoco Imp: Heron H-73 (970-5800'), Kittiwake P-11 (970-5560'); PetroCanada Shell: Wenonah
J-75 (1000-4750'); Shell: Triumph P-50 (990-5490'), Mohican I-100 (1276-5320').
Agterberg, 1982). Figure 4.5 shows the locations of the 22 offshore wells
used. They were divided into two groups. Sixteen of these wells are located
on the Labrador Shelf and northwestern Grand Banks (northern region).
Six occur on the Scotian Shelf and southern Grand Banks (southern
region). In total, the highest occurrences (exits) of 206 benthonic and
planktonic Foraminifera were used. Of these, 150 and 157 occurred in the
northern and southern regions, respectively.
Initial biozonations for the northern and southern regions were based
on smaller sets of 41 and 60 data, respectively. The two regions had 14 of
these taxa in common. The southern biozonation had 32, mostly Eocene
and Miocene index planktonics and the northern zonation 6, essentially
Eocene ones. This difference reflects pronounced post-Middle Eocene
latitudinal water mass heterogeneity and differential post-Eocene
shallowing across the continental margin. The biozonation with relatively
many planktonics for the southern region helped to establish the initially
largely unknown biozonation for the northern region.
Later, data for 10 wells were added for the northern region, mainly in
the vicinity of the Hibernia oil field on the Grand Banks between wells 13
and 14 in Figure 4.5. New taxa were identified and the original dictionary
for the 22 wells of Figure 4.5 was updated. The enlarged dictionary is
given in Table 4.7 which is part of the Gradstein-Thomas database for 24
wells on the Labrador Shelf and Grand Banks, published in Gradstein et
al. (1985, pp. 515-520).
It is noted that not all events in Table 4.7 are highest occurrences of
Foraminifera. For example, four seismic events were included in the
database. Also, in total there are 238 events in Table 4.7 which is less than
the greatest number (=275) assigned to a taxon. Gaps in the numbering
are due to revisions made in the identification of taxa. For example, a
taxon with one name in Table 4.7 may be the composite of two taxa of
which one had a different name which became obsolete after the renaming.
In order to preserve the unique identifier of the name that was retained, a
dummy code (e.g. xxx) was assigned in the dictionary to the name that was
deleted. The advantage of this procedure is that other taxa retain their
original dictionary numbers in RASC input and output files regardless of
revisions applied to relatively few taxa.
Table 4.8 is a partial DAT file using 4 of the 24 wells. The depths of
the samples were measured in feet for earlier wells and in meters for wells
TABLE 4.7
DIC file of Cenozoic Foraminifera in Gradstein-Thomas database for Canadian Atlantic margin.
TABLE 4.8
Partial DAT file for Gradstein-Thomas database. Numbers in brackets below well names are for rotary
table height and water depth, respectively (M = meters; F = feet). Depths (first column) are followed by
highest occurrences.
TABLE 4.9
SEQ file for 24 wells of Gradstein-Thomas database for Labrador Shelf and Grand Banks.
BJARNI H-81
16 67 20 -21 18 -69 -70 -71 15 24 25 34 29-261 42 -74 -41 -32 30-264
-75 57 46 56-999
CARTIER D-70
16 18 15 21 -70 67 69 24-172 25 259 34 260-261 118 -85 -29-263 46 -42
-32 35 41 -51 54 56 175 -59-999
FREYDIS B-87
16 181 -67 -21 -18 20 69 -27 15 -70 25 190 -34-206 -42 -74 260 29-261 -45
33 -81 -41 -75-210 -32 211 -85 -94 57 -88 -86 -30 -46 -35 56 54 213 -55 59
-999
GUDRID H-55
10 -17 265 20 -21 -18 -16 24 15 -25 33 259 40 -34 84 -90 -36 37-260-261
29 35 45 -74 42 57 -88 -30 32 46 -50 56 -59 -54 55-999
INDIAN HARBOUR M-52
1 -3 -4 -5 -8 9 -10 269 2 -7 6 -18 15 -20 -16 17 24 -25 26 -27
-28 259 261 30 260 -32 33 34 -35 263 -36 -39 29 -40 -41 -42 86 37 -38 44
45 -46 -47 49 57 -54 -50 -52 55 -56 59 60 -61 -62-999
KARLSEFNI H-13
228 67 25 41-118 69 260-261 68 -39 53-206 29 86 -30 -63 -34 46-264 230
-44 -42 96 -36 164 -50 52 45 -54 56 55 -62 61-253 258-999
LEIF M-48
228 -77 -10 181 16 -67 15 20 -21 -18 70 69 85 -24 25-238 42 29 260 -34
57 -74-118-263 30 -41 46 -56 -54-999
LEIF E-38
228 -77-270 17 67 -16 18 -21 20-999
SNORRI J-90
77 228 16 67 15 -21 18 25 57-263 -32 -34 29-260 -53 -41 -30 -36 27 -46
118 264 230 86 -63 42 45 56 59 -54-999
HERJOLF M-92
67 18 -15 -20 -16 78 70 25-259 85-145 -71 -40 45 -35-263-261 -34 29 41
-53 -30 -32-264 86 57 54 46 190 47-154 -56 55 60 59-999
BONAVISTA C-99
76 -77 10 17 -16 21 25 -20 18 79 -15 259 24 -26 81 -33 82 83 40 84
-27 29-261 32-263 85 -86 -87-264 41 -34 57 88 -42 -90 89 159 -92 -93 -94
56 -50 -30 47 -96 -36 46-999
DOMINION O-23
177-109-169 11 -9 17 10-117 -78 112 18 179 -16 -15 -71 122 180 26-123-137
14-136 27 20 21-181 201 24 25 34 264-260 -38 259 142 -81 184 -82 -30-146
69-263 202 32 68 187 49-188-147-190-140 29 -40 191-156 151 250-226 36 -44
194 -90 -57 203 50 -47-158 161 -52 -46 37-159-162 196 45-230 164-999
EGRET K-36
17 26 16 20 -21 -18 -71 -15 24 27 -42 202 69 82-999
OSPREY H-84
17 18 -20 15 -16 26-181 81 82 84-147 -69-148 90 -89 -33-187-234 -34-244
52 -51-162-159-166 -50 -93-999
CUMBERLAND B-55
76 228 -1 17 10 -11 -9-109 -71 265 -16 -20 18 15-119 117 219 26 24 25
-259 132 42 261 41 84 29 32 226 144 49 57 -36 90 52 -54 161 -93 -96-151
-164-157 46 -50-159 55 -56-254-194-999
EGRET N-46
11 -16 -18 14 -27 -71 26 -20 202 15 -24 172-999
ADOLPHUS D-50
10 71 218 16-136 18 20 179 201 26 15 -81 -69 24 -33-202 259 -25 263 82
85-261 203 147-260 68 32 40 30 49 -29 144 -90-156 -37 -89 234 160 -93 36
161-164 50-230 54 57 -56 55 194 -95-999
HIBERNIA O-35
17 201 26 18 -20 16 275 24 -71 72 27 140 202 34 -81 203 259 -29 -25 15
-28 57-260-261 204 40 -32 91-999
FLYING FOAM I-13
9 -10 16 71 17 275-265 18-110 70 26 -15 -81 201 24 -20 -27 25 259 202
263 -32 -34 260-261 264 29 -57-203 54 46 36 41 230-999
BLUE H-28
77 1 4 267 269 110 -10 -64 266 124-125 -6-113 122 26 -71 268 -2 147 -27
29-261 -81-150 82 -15-118-138 146 -84 32 -79-172 -53 -68 164-190 42 86-151
33 -94 -57 37 90 -52-999
HARE BAY H-31
228-270 77 1 10 136 16 70 -15 24 18 -20 -25 260-263 259 29-233 -69-118
-32 -81 68 49 41 227 93 -42 -96 50 57 66 -54 55-161 -56 59 253-255 -46
-999
HIBERNIA K-18
201 16 -18 -20 -71 -72 24 -27 15 -34 81 202 259 147 25 -29-260 30 -57-203
32 263 36 -40 -63 45 -91-155-230 204-999
HIBERNIA B-08
17 26 18 -20 16 15 -27 -71 72 81 -25 24 146-259 32 -57-147-260-261-263
36 -40 45 63 47-144-194 -54 -91-230 56 55 -61 52 -59 -96-253-999
HIBERNIA P-15
17 18-265 16 20-100 26 201 15 71 72 69 202 81 27 147 24 25 -32 -57
-259-260 261 29 203 53-263 40 45 204-999
drilled more recently. Rotary table height and water depth are given
separately for each well. For the DEP files to be constructed later for the
purpose of automated stratigraphic correlation, rotary table height will be
subtracted so that all depths are measured from sea level downward. Feet
will be converted to meters.
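The depth conversion described above can be sketched as follows. This is only an illustration and not the RASC program's own routine: the function name `depth_below_sea_level_m` is hypothetical, the unit flags follow the M/F convention of Table 4.8, and the numbers in the usage lines are invented.

```python
FEET_TO_METERS = 0.3048  # exact definition of the international foot


def depth_below_sea_level_m(depth, rotary_table_height, unit):
    """Return a sample depth in meters below sea level.

    `depth` and `rotary_table_height` must be in the same unit:
    'F' for feet or 'M' for meters.
    """
    if unit not in ("F", "M"):
        raise ValueError("unit must be 'F' or 'M'")
    # Depths are logged from the rotary table, so its height is subtracted first.
    below_sea_level = depth - rotary_table_height
    if unit == "F":
        below_sea_level *= FEET_TO_METERS
    return below_sea_level


# Invented example values, for illustration only.
print(depth_below_sea_level_m(1000.0, 100.0, "F"))
print(depth_below_sea_level_m(500.0, 25.0, "M"))
```

Subtracting before converting keeps the function valid for wells logged in either unit.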
Only the relative depths of the samples with respect to one another
are used in ranking and scaling. For example, the Adolphus D-50 well has
32 distinct “event levels” for 50 exits. The majority (= 19 of 32) of these
levels have a single observed exit; there are 10 levels with 2, 2 with 3, and
1 with 5 exits, respectively. The total number of samples studied exceeded
the total number of event levels because only the highest occurrences of
microfossils were coded. The exits in Table 4.8 have the same
numbers as the Foraminifera in Table 4.7. The complete SEQ file for all 24
wells in the Gradstein-Thomas database is shown in Table 4.9.
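The counts of event levels quoted for the Adolphus well can be checked directly from its record in Table 4.9. The sketch below assumes that a negative code marks an event coeval with the one before it and that -999 ends the record; the helper name `event_levels` is hypothetical. Under that reading the record reproduces the figures given in the text.

```python
from collections import Counter

# Adolphus D-50 record transcribed from Table 4.9 (fused fields separated by hand).
ADOLPHUS = [
    10, 71, 218, 16, -136, 18, 20, 179, 201, 26,
    15, -81, -69, 24, -33, -202, 259, -25, 263, 82,
    85, -261, 203, 147, -260, 68, 32, 40, 30, 49,
    -29, 144, -90, -156, -37, -89, 234, 160, -93, 36,
    161, -164, 50, -230, 54, 57, -56, 55, 194, -95, -999,
]


def event_levels(codes, terminator=-999):
    """Group a SEQ record into levels; a negative code joins the previous level."""
    levels = []
    for code in codes:
        if code == terminator:
            break
        if code < 0 and levels:
            levels[-1].append(-code)
        else:
            levels.append([abs(code)])
    return levels


levels = event_levels(ADOLPHUS)
n_exits = sum(len(level) for level in levels)
level_sizes = Counter(len(level) for level in levels)
print(len(levels), n_exits, dict(sorted(level_sizes.items())))
# 32 50 {1: 19, 2: 10, 3: 2, 5: 1}
```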
(4) Many of the samples are small which limits the detection of species
represented by few specimens; this contributes to factor (3) and to the
erratic, incoherent geographic distribution pattern of some taxa.
Number of wells: 1 2 3 4 5 6
Number of events: 56 51 29 21 10 6
TABLE 4.10
RASC computer program preprocessing output for number of times that successive events occur in a well;
e.g. event 1 occurs in 2 wells and event 2 in 1 well.
Northern region:
Number of wells: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Southern region:
Number of wells: 1 2 3 4 5 6
These occur in fewer than kc sections. Although unique events are not used
for ranking and scaling, they are inserted later on the basis of their
superpositional relations with other events in the one or more sections
containing them.
The study of the frequency distribution of the events in a region,
selection of the threshold parameter kc, and definition of unique events
belong to the preprocessing module of the RASC computer program.
During this stage, the user should also identify possible “marker
horizons”. These are stratigraphic events with positions that can be coded
with certainty in the kc or more sections containing them. Marker horizons
(e.g. bentonite layers or seismic events) will receive more weight than
other events in the scaling part of RASC.
TABLE 4.11
Artificial sequences of events A, B and C created from random normal numbers with E(X) = 2 and Var
(X) = 1 taken from Table A-23 of Dixon and Massey (1957). Event “Distances” were obtained by
subtracting 1 from random normal numbers in column 1, maintaining column 2, and adding 0.5 to
random normal numbers in column 3.
1      2      3      A       B      C      Sequence
2.422  0.130  2.232  1.422   0.130  2.732  BAC
0.694  2.556  1.868  -0.306  2.556  2.368  ACB
1.875  2.273  0.655  0.875   2.273  1.155  ACB
1.017  0.757  1.288  0.017   0.757  1.788  ABC
2.453  4.199  1.403  1.453   4.199  1.903  ACB
ACB
CAB
CAB
ABC
ABC
TABLE 4.12
Sequences of artificial stratigraphic events A, B and C generated from random normal numbers for
subsamples 1 to 5. Sequences for subsample 1 are same as those shown in last column ofTable 4.11.
1    2    3    4    5
BAC  ACB  CBA  BAC  ABC
ACB  ACB  ACB  ACB  ABC
ACB  BAC  ABC  ACB  ABC
ABC  ABC  ACB  ACB  ACB
ACB  CAB  BAC  ACB  CAB
ABC  CAB  CBA  ABC  ACB
BAC  ABC  BAC  ACB  ABC
ACB  BCA  ACB  ABC  ABC
ABC  ACB  ACB  ACB  ACB
BAC  BAC  ACB  ABC  ABC
BAC  CBA  ACB  ABC  ACB
ACB  ACB  ABC  ACB  BAC
ABC  ABC  ACB  ABC  ABC
ABC  CBA  ACB  ABC  ABC
BCA  ACB  ACB  BAC  ABC
ABC  BAC  ABC  BAC  CBA
CAB  BCA  ABC  ABC  ABC
ABC  ABC  CAB  ABC  ACB
ABC  ABC  ABC  BAC  ABC
BAC  ACB  ACB  ABC  ACB
BAC  ACB  ABC  BAC  BAC
ABC  ABC  ABC  ACB  CAB
ACB  ABC  ACB  ACB  BAC
ACB  ABC  ACB  CBA  ABC
ABC  CAB  ACB  ACB  ABC
ACB  ABC  ACB  ABC  CAB
CAB  CAB  ABC  BAC  ABC
CAB  BAC  ABC  BAC  ACB
ABC  BAC  BAC  ABC  ABC
ABC  ACB  BCA  ACB  ABC
to the numbers in column 3, artificial “distances” along the real line were
created for the events A, B and C which are regarded as realizations of the
normal random variables XA, XB and XC, respectively.
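The construction of Table 4.11 can be retraced with a few lines of code. The sketch below applies the shifts stated in the table caption to the first five rows of random normal numbers and orders the events by their resulting distances. The function names are hypothetical, and `simulate_sections` draws fresh pseudo-random numbers, so it will not reproduce the Dixon and Massey values.

```python
import random


def shifted_distances(x1, x2, x3):
    """Table 4.11 construction: A = column 1 - 1, B = column 2, C = column 3 + 0.5."""
    return {"A": x1 - 1.0, "B": x2, "C": x3 + 0.5}


def observed_sequence(distances):
    """Order the event labels by increasing distance along the real line."""
    return "".join(sorted(distances, key=distances.get))


def simulate_sections(n_sections, seed=0):
    """Draw new sections from normal numbers with mean 2 and variance 1."""
    rng = random.Random(seed)
    return [
        observed_sequence(shifted_distances(*(rng.gauss(2.0, 1.0) for _ in range(3))))
        for _ in range(n_sections)
    ]


# First five rows of columns 1 to 3 in Table 4.11.
rows = [
    (2.422, 0.130, 2.232),
    (0.694, 2.556, 1.868),
    (1.875, 2.273, 0.655),
    (1.017, 0.757, 1.288),
    (2.453, 4.199, 1.403),
]
sequences = [observed_sequence(shifted_distances(*row)) for row in rows]
print(sequences)  # ['BAC', 'ACB', 'ACB', 'ABC', 'ACB']
```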
On the average, the random numbers for events A, B and C occupy the
positions E(XA) = 1.0, E(XB) = 2.0, and E(XC) = 2.5 which follow one
another along the real line. Consequently, their expected or average
“optimum” sequence is ABC. Each event, however, has variance equal to
one. This implies that, in the realizations simulating separate
stratigraphic sections, A may follow B or C instead of preceding
them. Thirty “observed” sequences for sections are shown in the last
column of Table 4.11. The artificial sequences are of five different types
with the following frequencies:
Sequence: ABC ACB BAC BCA CAB CBA
Frequency: 12 8 6 1 3 0
TABLE 4.13
Sequence file with artificially created superpositional relations for 20 events (numbered 1 to 20) in
25 sections. The interval between expected positions of the events along the linear scale was set equal
to 0.5.
1 2 5 4 3 6 10 8 9 11 13 14 12 15 7 17 16 18 19 20
1 4 3 2 7 8 9 6 11 5 12 13 10 15 18 19 16 14 17 20
3 1 2 4 5 6 10 8 7 9 12 11 13 15 16 14 17 18 19 20
5 3 1 2 4 7 6 8 9 10 12 11 13 14 18 19 16 15 17 20
2 1 3 5 6 4 7 8 9 12 10 13 11 14 15 16 19 17 20 18
3 4 5 2 1 6 11 9 7 10 12 8 16 15 14 13 17 18 20 19
2 3 4 1 7 6 9 10 5 12 8 13 14 15 11 16 18 17 19 20
1 3 5 4 9 6 2 7 11 12 8 10 13 16 15 14 17 19 18 20
1 8 3 2 4 6 9 5 12 7 10 11 14 13 15 16 18 17 20 19
2 3 4 1 8 7 6 5 10 12 14 16 11 13 9 15 17 18 19 20
1 5 6 2 3 4 8 7 9 13 10 14 16 11 12 15 17 18 19 20
1 4 6 2 3 5 8 7 9 13 11 14 10 12 15 17 18 16 19 20
2 4 1 5 3 11 6 7 9 8 10 13 14 12 16 15 17 18 19 20
6 3 1 4 2 5 7 8 14 9 11 12 15 16 10 13 17 18 19 20
3 4 2 1 5 7 6 8 9 12 10 11 14 13 16 17 15 19 18 20
3 1 7 6 2 5 4 8 10 15 12 9 13 14 11 17 16 20 19 18
1 2 4 5 7 3 8 6 14 10 9 11 16 12 13 19 18 17 15 20
2 1 4 3 8 6 5 7 9 11 15 14 12 13 10 16 17 20 18 19
1 2 4 7 3 5 6 9 10 11 8 18 13 12 14 15 16 17 19 20
2 1 4 3 6 5 7 11 10 9 8 14 15 16 12 13 18 17 19 20
3 1 5 4 10 6 2 7 8 11 9 12 14 16 13 17 15 18 19 20
1 2 5 3 4 6 8 7 9 11 10 15 14 13 12 16 19 17 18 20
1 5 4 3 6 2 8 7 11 9 12 10 16 14 17 15 18 13 19 20
2 1 7 3 6 5 4 8 13 12 9 10 11 16 18 20 14 15 19 17
4 1 3 2 8 6 5 7 11 9 13 10 12 16 14 15 17 18 20 19
TABLE 4.14
Sequence file with artificially created superpositional relations for 20 events (numbered 1 to 20) in
25 sections. The interval between expected positions of the events along the linear scale was set equal
to 0.3.
5 1 4 2 10 3 6 8 11 9 15 13 14 17 12 16 7 19 18 20
1 4 3 7 2 8 9 11 6 12 13 18 15 5 10 19 16 20 17 14
3 1 2 4 5 6 10 12 8 9 11 7 16 15 13 17 14 18 19 20
5 3 1 7 6 4 2 9 8 10 12 13 11 14 18 19 17 20 16 15
2 1 3 5 6 8 7 12 9 4 10 14 13 19 15 11 16 17 20 18
3 4 5 11 9 2 6 1 7 10 12 16 15 14 8 17 13 18 20 19
2 3 4 7 1 10 9 6 12 13 15 14 5 8 16 18 11 17 19 20
1 9 3 5 4 6 2 11 7 12 10 16 8 13 15 14 19 17 18 20
8 3 1 2 4 6 9 12 5 10 7 14 11 15 13 16 18 17 20 19
2 3 4 8 7 1 6 10 5 14 12 16 15 13 11 17 9 18 19 20
1 5 6 2 3 8 7 13 4 9 16 14 10 11 12 17 15 18 19 20
1 4 6 5 3 2 8 7 14 13 9 17 11 15 10 12 18 20 19 16
2 4 5 11 3 1 9 6 7 8 13 10 14 16 12 15 17 18 19 20
6 3 4 1 2 5 14 7 8 11 9 16 12 15 17 13 10 18 19 20
3 4 2 1 5 7 12 9 8 6 11 10 14 16 13 17 19 15 18 20
3 1 7 6 5 15 2 10 8 4 14 12 13 9 11 17 16 20 19 18
1 4 7 2 5 14 8 6 3 10 16 11 9 19 12 18 13 17 15 20
2 8 4 1 3 6 7 5 9 11 15 14 12 13 20 18 16 17 19 10
7 1 4 2 5 3 6 9 18 10 11 13 8 12 14 15 16 17 19 20
2 4 1 6 7 3 5 11 14 10 9 8 16 15 18 17 12 13 19 20
3 10 1 5 6 4 7 2 8 11 9 14 12 16 17 13 15 18 19 20
1 2 5 3 4 8 6 7 9 11 15 10 14 13 19 12 16 17 18 20
5 1 4 6 3 11 2 8 7 9 12 16 17 18 14 10 15 13 19 20
2 7 6 1 3 5 13 8 12 4 16 9 10 20 18 11 14 19 15 17
4 3 8 1 2 6 5 11 7 9 13 12 10 16 14 17 15 18 20 19
TABLE 4.15
Sequence file with artificially created superpositional relations for 20 events (numbered 1 to 20) in
25 sections. The interval between expected positions of the events along the linear scale was set equal
to 0.1.
5 10 4 2 1 11 17 15 14 8 9 13 6 3 16 12 19 20 18 7
1 4 7 18 11 19 9 13 8 12 3 15 20 2 6 16 17 10 14 5
3 4 1 2 6 10 12 5 16 11 15 8 9 13 7 17 18 19 20 14
5 7 3 6 9 1 4 8 18 10 19 2 12 13 14 20 11 17 16 15
2 5 12 1 3 19 8 6 7 9 10 15 14 20 16 13 17 4 11 18
11 16 9 5 4 3 10 12 6 15 7 17 2 14 18 1 20 13 19 8
10 15 9 3 7 12 4 2 13 14 6 16 18 1 17 8 5 11 19 20
9 1 5 6 3 4 11 16 12 19 7 15 2 10 13 17 14 18 8 20
8 12 3 9 6 1 15 14 4 2 16 10 13 18 11 7 17 5 20 19
2 8 3 4 7 16 14 10 6 12 15 1 17 13 5 19 18 20 11 9
5 6 1 13 16 8 7 14 9 2 3 4 10 17 12 18 19 11 15 20
4 1 6 5 3 8 2 17 14 13 20 15 19 18 11 7 9 16 12 10
11 4 5 2 9 13 8 7 3 6 10 1 14 16 17 18 19 15 12 20
6 3 14 4 16 11 17 5 15 2 8 1 7 12 9 19 18 20 13 10
3 4 12 5 7 2 9 14 8 1 16 19 17 11 6 10 13 15 18 20
3 1 15 7 6 10 14 8 13 5 12 20 17 2 4 16 11 19 9 18
14 7 16 4 1 19 8 5 2 10 6 18 11 12 17 3 9 13 20 15
2 8 15 20 7 4 6 11 14 9 19 5 18 3 17 1 13 16 12 10
18 7 4 5 1 9 10 11 2 6 13 3 12 14 16 17 15 8 20 19
2 4 14 1 11 6 7 16 10 15 9 5 3 8 18 17 19 20 13 12
10 3 5 6 7 1 4 8 11 16 14 17 12 9 2 19 18 15 13 20
5 1 2 3 4 8 15 14 11 6 7 19 13 9 10 16 18 17 12 20
5 11 6 4 1 3 8 18 16 17 9 7 12 2 14 15 19 20 10 13
7 13 2 20 6 16 12 18 5 8 3 1 19 10 9 4 11 14 15 17
8 4 11 13 3 6 16 5 17 9 1 2 18 12 7 15 14 10 20 19
Subsample: 1 2 3 4 5
Relative frequency: 0.633 0.533 0.433 0.600 0.633
The average relative frequency is 0.5667. One might suspect that the
average is a better estimate of the “true” population value because it is
based on a sample that is five times larger. For this example, this
assumption is not correct, because the true relative frequency is
Φ(0.5/√2) = 0.638. In the latter expression, Φ represents the fractile of the
normal distribution in standard form (see later). In general, if the interval
between the mean positions of two events along the real line is written as
D (D = 0.5 for the interval between B and C in the example), then the
population value is equal to Φ(D/√2).
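The population value quoted above can be verified numerically. The sketch below assumes that the two events are independent normal variables with equal variance, so that their difference has variance twice as large. The function names are hypothetical, and the counts per subsample are those implied by the relative frequencies listed above (e.g. 19 of 30 for subsample 1).

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_expected_order(d, variance=1.0):
    """P(first event precedes second) when their mean positions are d apart.

    Assumes independent normal variables with equal variance, so that the
    difference of the two has variance 2 * variance.
    """
    return normal_cdf(d / math.sqrt(2.0 * variance))


# Sections (out of 30) with B before C in subsamples 1 to 5.
counts = [19, 16, 13, 18, 19]
average = sum(counts) / (30 * len(counts))

print(round(average, 4))                   # 0.5667
print(round(prob_expected_order(0.5), 3))  # 0.638
```

The same function gives the population values for the spacings 0.3 and 0.1 used in Tables 4.14 and 4.15.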
Tables 4.13 to 4.15 form an artificial database consisting of three SEQ
files for 20 events in 25 sections. The same set of 20 x 25 = 500 normal
random numbers was used for each SEQ file. The events are numbered 1
to 20. Because their mean positions follow one another along the real line,
the optimum sequence is also 1 to 20 for each SEQ file. The 20 events were
given expected values that are equally spaced. The spacing along the real
line was 0.5, 0.3 and 0.1 for Tables 4.13, 4.14 and 4.15, respectively.
Relative frequencies for the order of pairs of consecutive events in
Table 4.13 are similar to those for B and C in Table 4.12, because the
interval D between mean positions is equal to 0.5 in both situations. For
example, the relative frequencies for the first five ordered pairs in Table
4.13 are
Sequence: 12 23 34 45 56
Relative frequency: 0.640 0.520 0.600 0.600 0.560
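The relative frequencies listed above can be recomputed from Table 4.13. The sketch below transcribes the 25 sections and counts, for each pair of consecutive events, the proportion of sections in which the lower-numbered event comes first. The function name `relative_frequency` is hypothetical.

```python
TABLE_4_13 = """\
1 2 5 4 3 6 10 8 9 11 13 14 12 15 7 17 16 18 19 20
1 4 3 2 7 8 9 6 11 5 12 13 10 15 18 19 16 14 17 20
3 1 2 4 5 6 10 8 7 9 12 11 13 15 16 14 17 18 19 20
5 3 1 2 4 7 6 8 9 10 12 11 13 14 18 19 16 15 17 20
2 1 3 5 6 4 7 8 9 12 10 13 11 14 15 16 19 17 20 18
3 4 5 2 1 6 11 9 7 10 12 8 16 15 14 13 17 18 20 19
2 3 4 1 7 6 9 10 5 12 8 13 14 15 11 16 18 17 19 20
1 3 5 4 9 6 2 7 11 12 8 10 13 16 15 14 17 19 18 20
1 8 3 2 4 6 9 5 12 7 10 11 14 13 15 16 18 17 20 19
2 3 4 1 8 7 6 5 10 12 14 16 11 13 9 15 17 18 19 20
1 5 6 2 3 4 8 7 9 13 10 14 16 11 12 15 17 18 19 20
1 4 6 2 3 5 8 7 9 13 11 14 10 12 15 17 18 16 19 20
2 4 1 5 3 11 6 7 9 8 10 13 14 12 16 15 17 18 19 20
6 3 1 4 2 5 7 8 14 9 11 12 15 16 10 13 17 18 19 20
3 4 2 1 5 7 6 8 9 12 10 11 14 13 16 17 15 19 18 20
3 1 7 6 2 5 4 8 10 15 12 9 13 14 11 17 16 20 19 18
1 2 4 5 7 3 8 6 14 10 9 11 16 12 13 19 18 17 15 20
2 1 4 3 8 6 5 7 9 11 15 14 12 13 10 16 17 20 18 19
1 2 4 7 3 5 6 9 10 11 8 18 13 12 14 15 16 17 19 20
2 1 4 3 6 5 7 11 10 9 8 14 15 16 12 13 18 17 19 20
3 1 5 4 10 6 2 7 8 11 9 12 14 16 13 17 15 18 19 20
1 2 5 3 4 6 8 7 9 11 10 15 14 13 12 16 19 17 18 20
1 5 4 3 6 2 8 7 11 9 12 10 16 14 17 15 18 13 19 20
2 1 7 3 6 5 4 8 13 12 9 10 11 16 18 20 14 15 19 17
4 1 3 2 8 6 5 7 11 9 13 10 12 16 14 15 17 18 20 19
"""

sections = [[int(tok) for tok in line.split()] for line in TABLE_4_13.splitlines()]


def relative_frequency(sections, first, second):
    """Proportion of sections in which `first` is listed before `second`."""
    hits = sum(1 for s in sections if s.index(first) < s.index(second))
    return hits / len(sections)


freqs = [round(relative_frequency(sections, e, e + 1), 3) for e in range(1, 6)]
print(freqs)  # [0.64, 0.52, 0.6, 0.6, 0.56]
```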
CHAPTER 5
RANKING OF BIOSTRATIGRAPHIC EVENTS
5.1 Introduction
In this chapter and the next, ranking and scaling techniques will be
illustrated using the Hay example introduced at the beginning of the
previous chapter. In this example, there are 10 stratigraphic events and 9
sections (see Fig. 4.2; Tables 4.1 and 4.3). The preprocessing of the RASC
computer program begins with a tabulation of the number of stratigraphic
sections in which each event occurs. For the Hay example, this gives:
Number of sections: 8 8 6 7 9 4 7 5 9 6
Number of sections: 1 2 3 4 5 6 7 8 9
Frequency of events: 0 0 0 1 1 2 2 2 2
Cumulative frequency: 10 10 10 10 9 8 6 4 2
As explained previously (Section 4.8), this frequency distribution is
helpful in selecting the threshold parameter kc, which is set to retain only
those events that occur in kc or more wells. For the Hay example, all
events occur in at least 4 sections. Initially, we will set kc = 1 (default
value for kc in micro-RASC, see Chapter 10) so that all events will be
retained for further analysis.
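The tabulation above is easy to reproduce. The sketch below is an illustration rather than the RASC preprocessing module itself: it assumes that the first row of counts is listed in the order of event codes 1 to 10, and the function names are hypothetical.

```python
from collections import Counter

N_SECTIONS = 9
# Number of sections containing each event (Hay example), keyed by event code.
SECTIONS_PER_EVENT = {1: 8, 2: 8, 3: 6, 4: 7, 5: 9, 6: 4, 7: 7, 8: 5, 9: 9, 10: 6}


def frequency_distribution(sections_per_event, n_sections):
    """Return (frequency, cumulative) lists indexed by 1..n_sections."""
    counts = Counter(sections_per_event.values())
    frequency = [counts.get(k, 0) for k in range(1, n_sections + 1)]
    # cumulative[k-1] is the number of events occurring in k or more sections
    cumulative = [sum(frequency[k - 1:]) for k in range(1, n_sections + 1)]
    return frequency, cumulative


def retained_events(sections_per_event, kc):
    """Events that occur in kc or more sections."""
    return sorted(e for e, n in sections_per_event.items() if n >= kc)


frequency, cumulative = frequency_distribution(SECTIONS_PER_EVENT, N_SECTIONS)
print(frequency)   # [0, 0, 0, 1, 1, 2, 2, 2, 2]
print(cumulative)  # [10, 10, 10, 10, 9, 8, 6, 4, 2]
print(retained_events(SECTIONS_PER_EVENT, kc=7))  # [1, 2, 4, 5, 7, 9]
```

The cumulative row shows how many events survive each choice of kc, which is why it is useful for selecting the threshold.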
upper left hand part of the matrix being 1/4, which is less than 0.5. After
making this correction it can be seen that S should come below both 9 and
V, these relationships being expressed by the fractions 0/5 and 1/4,
respectively. Finally, it is evident that the position of in the sequence
needs to be changed because its relation to 6 is 1/4, to V is 0/5, to q is 1/6,
and to < is 1/5. It must come below any of these symbols, and, in fact,
became the lowest event in Hay’s original optimum sequence shown in
column 2 of Figure 4.2. The revised matrix using Hay’s optimum sequence
is shown in Figure 5.1B. All values greater than 0.5 now are in the upper
left part of the matrix. Note that both the upper part and the lower part
contain fractions equal to 0.5. These occur in pairs and signify events that
are coeval “on the average”.
Before or after creation of the optimum sequence, every fraction in the
matrix can be tested for statistical significance by comparing it to 0.5
using the binomial frequency distribution model as explained in Section
3.2. Figure 5.2 shows the difference between 1 and the cumulative
probability Pc(k, R) that an event occurs k times above another one in a
sample of pairs of events with size R. If 1-Pc(k, R) exceeds 0.95, the
Fig. 5.1 (A) Matrix for the relations of biostratigraphic events in Fig. 4.2. The number (N) in the lower
right of each square is the number of sections in which the pair of events is separable. The number (n) in
the upper left of each square is the number of times the event on the bottom row occurs below the event
on the left side. The sequence from lowest to highest on the bottom and left side of the matrix is that
shown in column (1) on right side of Fig. 4.2. (B) Revised matrix in which the ratio n/N has been
rearranged so that all values greater than 0.5 are in the upper left part of the matrix. The lowest-highest
sequence along the bottom and left side of the matrix now represents Hay’s original optimum sequence
also shown as column (2) on right side of Fig. 4.2 (after Hay, 1972).
fraction k/R is greater than 0.5 with a probability of 95 percent. The
hypothesis of nonrandom average superpositional relationship can only be
accepted for 6 of 45 pairs of events. These are 6 of nine pairs involving the
event W which occurs at or near the top of all (9) sections (A to I in
Fig. 4.2). In total, two of the values in Figure 5.2 exceed 0.99. They
correspond to the facts that (1) W occurs above in eight sections, and
(2) W occurs above < in eight sections. These two superpositional
relations are statistically significant with a probability of 99 percent.
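The binomial test can be illustrated with a short calculation. The sketch below assumes a one-sided test in which the cumulative probability is the chance of observing k or more successes in R trials with p = 0.5. This reading of Eq. 3.2 is an assumption, because that equation lies outside the present section, and the function names are hypothetical.

```python
from math import comb


def upper_tail(k, r, p=0.5):
    """P(X >= k) for a binomial variable with r trials and success probability p."""
    return sum(comb(r, x) * p**x * (1.0 - p) ** (r - x) for x in range(k, r + 1))


def one_minus_p(k, r):
    """1 - P(X >= k), to be compared with 0.95 or 0.99."""
    return 1.0 - upper_tail(k, r)


print(one_minus_p(8, 8))  # 0.99609375, exceeds 0.99
print(one_minus_p(5, 5))  # 0.96875, exceeds 0.95 only
print(one_minus_p(4, 4))  # 0.9375, below 0.95
```

Under this assumption an event that lies above another in 8 of 8 sections passes the 99 percent level, whereas 4 of 4 sections is too small a sample to reach 95 percent.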
The binomial model has a drawback for testing whether or not the
observed superpositional relation of two events is random, because it
ignores the relations of these two events with all other events. For
example, the binomial test of Figure 5.2 suggests that W occurs above Φ.
On the other hand, the fact that A occurs above Φ in 4 out of 4 sections
would not be statistically significant, because the sample size is too small.
However, W and A occur near the top in all sections. In those sections
where they coexist, each occurs above the other one 3 out of 6 times. This
would suggest that, although the relation between W and A remains
undecided, both events probably occur above Φ. The relations between
these three events are shown graphically in Figure 5.3A. If in addition to
Fig. 5.2 Values of 1-P where P represents the probability that the sequential relation between two events
is nonrandom (cf. Eq. 3.2 for cumulative probability of binomial probability with p = 0.5; after Hay,
1972).
the relations between these three events (W, A and Φ), their relations with
a fourth event (V) are also considered, the probability that A occurs above
Φ is further increased (see Fig. 5.3B). A multivariate statistical test which
considers all pairs of events simultaneously and is not subject to the
drawback of the binomial test of considering pairs of events in isolation,
will be developed in the next chapter on scaling.
Fig. 5.3 Diagrams to illustrate superpositional relations between (A) three events and (B) four events in
the Hay example. Although A and Φ both occur in only 4 sections, their superpositional relation is
probably nonrandom because of their relations with other events.
TABLE 5.1
A. F-matrix of frequencies of events occurring above or below one another in the sections. The events for
the Hay example are labelled 1 to 10 as in Tables 4.1 and 4.3. B. R-matrix of frequencies of coexistence of
two events in the same section. Coeval events also were counted.
A    1  2  3  4  5  6  7  8  9 10
1    x  4  1  1  2  0  2  0  0  0
2    1  x  2  2  1  0  1  0  0  0
3    1  2  x  0  1  0  1  0  0  0
4    4  2  3  x  3  0  3  1  1  1
5    3  3  3  1  x  0  3  0  0  0
6    2  2  2  2  2  x  1  1  0  0
7    4  4  3  2  3  1  x  0  0  0
8    5  5  4  3  5  1  4  x  0  0
9    8  8  6  6  9  4  7  5  x  3
10   4  4  3  3  4  1  4  2  3  x
B    1  2  3  4  5  6  7  8  9 10
1    x  7  5  6  8  3  6  5  8  5
2    7  x  6  6  8  4  6  5  8  5
3    5  6  x  6  6  4  5  4  6  4
4    6  6  6  x  7  4  6  4  7  5
5    8  8  6  7  x  4  7  5  9  6
6    3  4  4  4  4  x  3  2  4  2
7    6  6  5  6  7  3  x  5  7  5
8    5  5  4  4  5  2  5  x  5  3
9    8  8  6  7  9  4  7  5  x  6
10   5  5  4  5  6  2  5  3  6  x
5.2 Matrix notation
While arranging the information in matrix form, it is customary to
number the rows from top to bottom and the columns from left to right.
Table 5.1A shows the so-called F-matrix of frequencies which are similar
to the counts shown previously in Figure 5.1. The corresponding sample
sizes for frequencies of co-existence of two events in the same section are
shown in Table 5.1B. Note that the main diagonal goes from the top left to
the bottom right side in Table 5.1.
As already stated in Section 4.3, SEQ files, such as the one shown in
Table 4.3A, normally are for the stratigraphically downward direction
TABLE 5.2
A. S-matrix of scores obtained by adding half of the frequencies of ties (shown in Table 5.2B) to the
frequencies of the F-matrix (see Table 5.1A). B. T-matrix of frequencies of ties.
A    1    2    3    4    5    6    7    8    9    10
1    x    5.0  2.5  1.5  3.5  0.5  2.0  0.0  0.0  0.5
2    2.0  x    3.0  3.0  3.0  1.0  1.5  0.0  0.0  0.5
3    2.5  3.0  x    1.5  2.0  1.0  1.5  0.0  0.0  0.5
4    4.5  3.0  4.5  x    4.5  1.0  3.5  1.0  1.0  1.5
5    4.5  5.0  4.0  2.5  x    1.0  3.5  0.0  0.0  1.0
6    2.5  3.0  3.0  3.0  3.0  x    1.5  1.0  0.0  0.5
7    4.0  4.5  3.5  2.5  3.5  1.5  x    0.5  0.0  0.5
8    5.0  5.0  4.0  3.0  5.0  1.0  4.5  x    0.0  0.5
9    8.0  8.0  6.0  6.0  9.0  4.0  7.0  5.0  x    3.0
10   4.5  4.5  3.5  3.5  5.0  1.5  4.5  2.5  3.0  x
B    1    2    3    4    5    6    7    8    9    10
1    x    2.0  3.0  1.0  3.0  1.0  0.0  0.0  0.0  1.0
2    2.0  x    2.0  2.0  4.0  2.0  1.0  0.0  0.0  1.0
3    3.0  2.0  x    3.0  2.0  2.0  1.0  0.0  0.0  1.0
4    1.0  2.0  3.0  x    3.0  2.0  1.0  0.0  0.0  1.0
5    3.0  4.0  2.0  3.0  x    2.0  1.0  0.0  0.0  2.0
6    1.0  2.0  2.0  2.0  2.0  x    1.0  0.0  0.0  1.0
7    0.0  1.0  1.0  1.0  1.0  1.0  x    1.0  0.0  1.0
8    0.0  0.0  0.0  0.0  0.0  0.0  1.0  x    0.0  1.0
9    0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  x    0.0
10   1.0  1.0  1.0  1.0  2.0  1.0  1.0  1.0  0.0  x
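The relations between the F-, T-, S- and R-matrices can be made concrete with a small example. The sketch below uses three invented sections for three events, not data from the Hay example. It assumes that sequences run in the stratigraphically downward direction, that a negative code marks an event coeval with the preceding one, and that F[i][j] counts the sections in which event i lies above event j. The function names are hypothetical.

```python
from itertools import combinations


def seq_to_levels(codes):
    """Group a sequence into levels; a negative code joins the previous level."""
    levels = []
    for code in codes:
        if code < 0 and levels:
            levels[-1].append(-code)
        else:
            levels.append([abs(code)])
    return levels


def score_matrices(sections, n_events):
    """Return F (above counts), T (ties), S = F + T/2 and R (coexistences)."""
    size = n_events + 1  # index 0 is unused so that indices equal event codes
    F = [[0] * size for _ in range(size)]
    T = [[0] * size for _ in range(size)]
    for codes in sections:
        position = {}
        for rank, level in enumerate(seq_to_levels(codes)):
            for event in level:
                position[event] = rank
        for a, b in combinations(sorted(position), 2):
            if position[a] < position[b]:
                F[a][b] += 1
            elif position[a] > position[b]:
                F[b][a] += 1
            else:
                T[a][b] += 1
                T[b][a] += 1
    S = [[F[i][j] + 0.5 * T[i][j] for j in range(size)] for i in range(size)]
    R = [[F[i][j] + F[j][i] + T[i][j] for j in range(size)] for i in range(size)]
    return F, T, S, R


# Invented sections: events 1 and 2 are coeval in the second one.
toy_sections = [[1, 2, 3], [2, -1, 3], [3, 1]]
F, T, S, R = score_matrices(toy_sections, 3)
print(S[1][2], S[2][1], R[1][2])  # 1.5 0.5 2
```

With this scoring, S[i][j] + S[j][i] always equals R[i][j], which is a convenient consistency check on tables such as 5.1 and 5.2.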
Coeval events were ignored in Figure 5.1 and Table 5.1A. Although
ranking by means of Hay's original method would not be influenced by this
modification, two events which are coeval in a section will be scored by
adding 0.5 to the two counts for the first event occurring above and below
the second event, respectively. Suppose that the elements of the F-matrix
of Table 5.1A are written as Fij (i = 1, 2, ..., n; j = 1, 2, ..., n) for n events
(n = 10 in the example). The subscripts i and j indicate rows and columns,
respectively. It is noted that these subscripts refer to positions of elements
in a matrix. They do not necessarily coincide with the original code
TABLE 5.3
A     1      2      3      4      5      6      7      8      9      10
1     x      5.0/7  2.5/5  1.5/6  3.5/8  0.5/3  2.0/6  0.0/5  0.0/8  0.5/5
2     2.0/7  x      3.0/6  3.0/6  3.0/8  1.0/4  1.5/6  0.0/5  0.0/8  0.5/5
3     2.5/5  3.0/6  x      1.5/6  2.0/6  1.0/4  1.5/5  0.0/4  0.0/6  0.5/4
4     4.5/6  3.0/6  4.5/6  x      4.5/7  1.0/4  3.5/6  1.0/4  1.0/7  1.5/5
5     4.5/8  5.0/8  4.0/6  2.5/7  x      1.0/4  3.5/7  0.0/5  0.0/9  1.0/6
6     2.5/3  3.0/4  3.0/4  3.0/4  3.0/4  x      1.5/3  1.0/2  0.0/4  0.5/2
7     4.0/6  4.5/6  3.5/5  2.5/6  3.5/7  1.5/3  x      0.5/5  0.0/7  0.5/5
8     5.0/5  5.0/5  4.0/4  3.0/4  5.0/5  1.0/2  4.5/5  x      0.0/5  0.5/3
9     8.0/8  8.0/8  6.0/6  6.0/7  9.0/9  4.0/4  7.0/7  5.0/5  x      3.0/6
10    4.5/5  4.5/5  3.5/4  3.5/5  5.0/6  1.5/2  4.5/5  2.5/3  3.0/6  x
B     1      2      3      4      5      6      7      8      9      10
1     x      4.0/5  1.0/2  1.0/5  2.0/5  0.0/2  2.0/6  0.0/5  0.0/8  0.0/4
2     1.0/5  x      2.0/4  2.0/4  1.0/4  0.0/2  1.0/5  0.0/5  0.0/8  0.0/4
3     1.0/2  2.0/4  x      0.0/3  1.0/4  0.0/2  1.0/4  0.0/4  0.0/6  0.0/3
4     4.0/5  2.0/4  3.0/3  x      3.0/4  0.0/2  3.0/5  1.0/4  1.0/7  1.0/4
5     3.0/5  3.0/4  3.0/4  1.0/4  x      0.0/2  3.0/6  0.0/5  0.0/9  0.0/4
6     2.0/2  2.0/2  2.0/2  2.0/2  2.0/2  x      1.0/2  1.0/2  0.0/4  0.0/1
7     4.0/6  4.0/5  3.0/4  2.0/5  3.0/6  1.0/2  x      0.0/4  0.0/7  0.0/4
8     5.0/5  5.0/5  4.0/4  3.0/4  5.0/5  1.0/2  4.0/4  x      0.0/5  0.0/2
9     8.0/8  8.0/8  6.0/6  6.0/7  9.0/9  4.0/4  7.0/7  5.0/5  x      3.0/6
10    4.0/4  4.0/4  3.0/3  3.0/4  4.0/4  1.0/1  4.0/4  2.0/2  3.0/6  x
TABLE 5.4
Illustration of algorithm for systematic checking of superpositional relations in Hay method for
constructing optimum sequence. A. Positions of events 1 and 4 were interchanged because in Table 5.2A
the element (= 1.5) in the fourth column of the first row is less than its counterpart (= 4.5) in the lower
triangle of the matrix. Original event code numbers are shown in parentheses. B. Positions of events 6
and 4 were interchanged during second iteration. C. Positions of events 9 and 6 were interchanged
during third iteration. D. Final order relation matrix after 22 iterations. This matrix has the property
that all its elements in the upper triangle are greater than or equal to their counterparts in the lower
triangle. Elements in the upper triangle equal to their counterparts are underlined in Table 5.4D. The
events corresponding to these elements are coeval on the average. Note that the final (optimum)
sequence is nearly the reverse of the original sequence in Table 5.2 because code numbers were assigned
to the events while moving in the stratigraphically upward direction (cf. Tables 4.1 and 4.3).
A 1 2 3 4 5 6 7 8 9 1 0
I41 12) (31 ill 151 161 171 181 19) 1101
-
1141 x 30 45 45 15 I0 35 10 10 15
2121 30 x 30 20 30 10 15 00 00 05
3131 I5 30 25 20 10 15 00 00 05
4111 15 50 25 x 35 05 20 00 00 05
5151 25 50 40 25 .i 10 35 00 00 10
6161 30 30 30 30 30 15 10 00 05
I(7) 25 45 35 25 35 15 x 05 00 05
8(81 30 50 40 30 50 10 45 x 00 05
9191 60 80 60 60 90 40 I0 50 x 30
101101 35 15 35 35 50 IS 45 25 30 x
8 1 2 3 4 5 6 7 8 9 1 0
161 121 131 Ill 151 (41 171 I81 191 (101
1161 x 30 30 25 30 30 15 10 00 05
2(21 10 x 30 20 30 30 I5 00 00 05
301 10 30 x 25 20 I5 15 00 00 05
4(1) 05 50 25 v 35 15 20 00 00 05
5(51 I0 50 40 45 x 25 35 00 00 10
6(41 10 30 45 45 45 x 35 10 10 15
I(7l 15 45 35 40 35 25 x 05 00 05
8(8) I0 50 40 50 50 30 45 x 00 05
91% 40 80 60 80 90 60 70 50 x 30
101101 I5 45 35 45 50 35 45 25 30 x
C I 2 3 4 5 6 7 8 9 1 0
191 121 131 Ill (51 (41 I71 18) (61 110)
1191 x 80 60 80 90 60 I0 50 40 30
2121 00 x 30 20 30 30 15 00 10 05
3(31 00 30 x 26 20 15 15 00 10 05
411) 00 50 25 x 35 I5 20 00 05 05
515) 00 50 40 45 x 25 35 00 10 10
6141 10 30 45 45 45 x 35 10 10 15
7171 00 45 35 40 35 25 x 05 15 05
8181 00 50 40 50 50 30 45 II 10 05
9(61 00 30 30 25 30 30 15 10 x 05
1011Ol 30 45 35 45 50 35 45 25 15 x
D 1 2 3 4 5 6 7 8 9 1 0
I91 1101 (61 181 14 171 151 11) (91 121
119) x 30 40 SO 60 70 90 80 60 80
21101 6 I5 26 35 45 50 45 36 41
3161 00 06 x Q 30 Is 30 25 30 30
4181 00 05 Q x 30 48 50 50 40 50
5(4) I0 I5 I0 10 x 35 45 45 45 30
8(7L 00 05 05 25 x 35 40 35 45
I151 00 LO 10 00 '25 x 45 40 30
8111 00 05 05 00 18 20 35 x Q 5G
9131 00 05 I0 00 15 15 20 2.6 x 9
lIll2) 00 05 10 00 30 15 30 20 30 (i
TABLE 5.5
Optimum sequence output of the RASC computer program. Order of events is same as in Table 5.4D.
TABLE 5.6
A. Transposed S-matrix (cf. Table 5.2A). B. Final order relation matrix obtained after 5 iterations
A 1 2 3 4 5 6 7 8 9 10
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
1(1) x 2.0 2.5 4.5 4.5 2.5 4.0 5.0 8.0 4.5
2(2) 5.0 x 3.0 3.0 5.0 3.0 4.5 5.0 8.0 4.5
3(3) 2.5 3.0 x 4.5 4.0 3.0 3.5 4.0 6.0 3.5
4(4) 1.5 3.0 1.5 x 2.5 3.0 2.5 3.0 6.0 3.5
5(5) 3.5 3.0 2.0 4.5 x 3.0 3.5 5.0 9.0 5.0
6(6) 0.5 1.0 1.0 1.0 1.0 x 1.5 1.0 4.0 1.5
7(7) 2.0 1.5 1.5 3.5 3.5 1.5 x 4.5 7.0 4.5
8(8) 0.0 0.0 0.0 1.0 0.0 1.0 0.5 x 5.0 2.5
9(9) 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 x 3.0
10(10) 0.5 0.5 0.5 1.5 1.0 0.5 0.5 0.5 3.0 x
B 1 2 3 4 5 6 7 8 9 10
(2) (1) (3) (5) (7) (4) (6) (8) (9) (10)
x 5.0 3.0 5.0 4.5 3.0 3.0 5.0 8.0 4.5
2.0 x 2.5 4.5 4.0 4.5 2.5 5.0 8.0 4.5
3.0 2.5 x 4.0 3.5 4.5 3.0 4.0 6.0 3.5
3.0 3.5 2.0 x 3.5 4.5 3.0 5.0 9.0 5.0
1.5 2.0 1.5 3.5 x 3.5 1.5 4.5 7.0 4.5
3.0 1.5 1.5 2.5 2.5 x 3.0 3.0 6.0 3.5
1.0 0.5 1.0 1.0 1.5 1.0 x 1.0 4.0 1.5
0.0 0.0 0.0 0.0 0.5 1.0 1.0 x 5.0 2.5
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 x 3.0
0.5 0.5 0.5 1.0 0.5 1.5 0.5 0.5 3.0 x
in the sequence of columns and rows in Table 5.6A. Table 5.6B shows the
final order relation matrix, which was now obtained after only 5 iterations.
Table 5.7A is RASC output for the optimum sequence of Table 5.6B.
The original SEQ file for this RASC run was shown in Table 4.3B.
Because proceeding from left to right in this SEQ file corresponds to
moving in the stratigraphically upward direction, the optimum sequence
of Table 5.7A is upside down. Table 5.7B is identical to Table 5.7A except
for a reversal of the sequence numbers. It is interesting to compare
Table 5.7B with the previous result (Table 5.5). The sequence order is
different in 4 places. In 3 of these, the order of a pair of events was
Probabilistic ranking
TABLE 5.7
A. Optimum sequence output of RASC computer program corresponding to Table 5.6B. This result was
obtained by using Table 4.3B as SEQ file instead of Table 4.3A. B. Reversed optimum sequence of
Table 5.7A. The sequence numbers 1 to 10 for the optimum sequence of Table 5.7A were replaced by new
sequence numbers 10 to 1.
TABLE 5.8
A-matrix to denote average superpositional and coeval relations. Method of probabilistic ranking (or
“presorting option”) applied to Hay example using S-matrix of Table 5.2A as starting point. F-matrix of
Table 5.1A gives same A-matrix. Events will be reordered on the basis of their row totals (A_i).
1 2 3 4 5 6 7 8 9 10 A_i
1 x 1.0 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.5
2 0.0 x 0.5 0.5 0.0 0.0 0.0 0.0 0.0 0.0 1.0
3 0.5 0.5 x 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
4 1.0 0.5 1.0 x 1.0 0.0 1.0 0.0 0.0 0.0 4.5
5 1.0 1.0 1.0 0.0 x 0.0 0.5 0.0 0.0 0.0 3.5
6 1.0 1.0 1.0 1.0 1.0 x 0.5 0.5 0.0 0.0 6.0
7 1.0 1.0 1.0 0.0 0.5 0.5 x 0.0 0.0 0.0 4.0
8 1.0 1.0 1.0 1.0 1.0 0.5 1.0 x 0.0 0.0 6.5
9 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 x 0.5 8.5
10 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.5 x 8.5
A_j 7.5 8.0 8.0 4.5 5.5 3.0 5.0 2.5 0.5 0.5
carry out similar tests for the second position. In Table 5.10 it is shown
that it took four iterations to bring event 9 to position 1, followed by five
iterations to bring event 10 to position 2. Continuation of the algorithm to
find the events for the third and subsequent positions gave the optimum
sequence of Table 5.9 after 31 iterations. The new result is identical to
that obtained before (Table 5.5). The uncertainty range of an optimum
sequence obtained by probabilistic ranking can be determined by using the
same method as before (see Section 5.4).
As a further experiment, probabilistic ranking was applied using the
SEQ file of Table 4.3B instead of the one of Table 4.3A. This is more or less
equivalent to ranking the events in ascending order using the column
totals A_j of Table 5.8. When the events were first ranked according to
descending order of magnitude of their column totals, reversal of the
resulting optimum sequence gave an optimum sequence identical to the
one shown in Table 5.7 except that event 10 was situated above event 9.
The uncertainty ranges resulting from this experiment were identical to
those given in Table 5.9.
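The presorting step can be sketched in a few lines of Python. The sketch below is an illustration only, not the RASC source code: the function names are hypothetical, the S-matrix of Table 5.2A is obtained by transposing the values printed in Table 5.6A, and the exchange rule (interchange the event at the current position with any later event that has a larger row total) is an assumption inferred from the interchanges listed in Table 5.10.

```python
# Transposed S-matrix as printed in Table 5.6A (diagonal stored as 0.0).
ST = [
    [0.0, 2.0, 2.5, 4.5, 4.5, 2.5, 4.0, 5.0, 8.0, 4.5],
    [5.0, 0.0, 3.0, 3.0, 5.0, 3.0, 4.5, 5.0, 8.0, 4.5],
    [2.5, 3.0, 0.0, 4.5, 4.0, 3.0, 3.5, 4.0, 6.0, 3.5],
    [1.5, 3.0, 1.5, 0.0, 2.5, 3.0, 2.5, 3.0, 6.0, 3.5],
    [3.5, 3.0, 2.0, 4.5, 0.0, 3.0, 3.5, 5.0, 9.0, 5.0],
    [0.5, 1.0, 1.0, 1.0, 1.0, 0.0, 1.5, 1.0, 4.0, 1.5],
    [2.0, 1.5, 1.5, 3.5, 3.5, 1.5, 0.0, 4.5, 7.0, 4.5],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.5, 0.0, 5.0, 2.5],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0],
    [0.5, 0.5, 0.5, 1.5, 1.0, 0.5, 0.5, 0.5, 3.0, 0.0],
]

def a_matrix(S):
    """A[i][j] = 1 if S[i][j] > S[j][i], 0.5 if they are equal, 0 otherwise."""
    n = len(S)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                if S[i][j] > S[j][i]:
                    A[i][j] = 1.0
                elif S[i][j] == S[j][i]:
                    A[i][j] = 0.5
    return A

def presort(totals):
    """Reorder event numbers 1..n by descending row total, counting interchanges."""
    n = len(totals)
    order = list(range(1, n + 1))
    swaps = 0
    for pos in range(n - 1):
        for j in range(pos + 1, n):
            # Interchange only when the later event has a strictly larger total.
            if totals[order[j] - 1] > totals[order[pos] - 1]:
                order[pos], order[j] = order[j], order[pos]
                swaps += 1
    return order, swaps

S = [list(col) for col in zip(*ST)]        # S-matrix of Table 5.2A
row_totals = [sum(row) for row in a_matrix(S)]
order, swaps = presort(row_totals)
print(row_totals)
print(order, swaps)
```

Under these assumptions the row totals agree with the last column of Table 5.8, and the number of interchanges equals the 31 iterations of Table 5.10. Events with equal row totals (9 and 10, or 2 and 3) are left in their original relative order.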
TABLE 5.9
Optimum sequence output of RASC computer program corresponding to Table 5.8. Events were
reordered on the basis of their row totals.
TABLE 5.10
Illustration of computer algorithm used in probabilistic ranking to reorder events on the basis of their
row totals in Table 5.8. Final result obtained after 31 iterations is identical to results previously
obtained by Hay method (cf. Tables 5.4 and 5.5).
Iteration 1 2 3 4 5 6 7 8 9 10
1 4 2 3 1 5 6 7 8 9 10
2 6 2 3 1 5 4 7 8 9 10
3 8 2 3 1 5 4 7 6 9 10
4 9 2 3 1 5 4 7 6 8 10
5 1 3 2 5 4 7 6 8 10
6 5 3 2 1 4 7 6 8 10
7 4 3 2 1 5 7 6 8 10
8 6 3 2 1 5 7 4 8 10
9 8 3 2 1 5 7 4 6 10
10 10 3 2 1 5 7 4 6 8
11 1 2 3 5 7 4 6 8
12 5 2 3 1 7 4 6 8
13 7 2 3 1 5 4 6 8
14 4 2 3 1 5 7 6 8
15 6 2 3 1 5 7 4 8
16 8 2 3 1 5 7 4 6
17 1 3 2 5 7 4 6
18 5 3 2 1 7 4 6
19 7 3 2 1 5 4 6
20 4 3 2 1 5 7 6
21 6 3 2 1 5 7 4
22 1 2 3 5 7 4
23 5 2 3 1 7 4
24 7 2 3 1 5 4
25 4 2 3 1 5 7
26 1 3 2 5 7
27 5 3 2 1 7
28 7 3 2 1 5
29 1 2 3 5
30 5 2 3 1
31 1 3 2
ranking. The ranking numbers of events 26 and 67 are revised row totals.
For this reason, they are not multiples of 0.5 like the other ranking
numbers in Table 5.11A. Reordering the 26 events on the basis of the
ranking numbers gives the optimum sequence of Table 5.11B.
Probabilistic ranking can be regarded as a primitive kind of scaling
method because the events are assigned values along an interval scale.
TABLE 5.11
TABLE 5.12
Ranking numbers obtained by averaging probabilities for the Hay example. See text for further
explanation.
Event (1) (2) (3) (4) (1)/(2) (3)/(4)
1 15.5 53 10 42 0.292 0.238
2 14.0 55 7 43 0.255 0.163
3 12.0 46 5 32 0.261 0.156
4 24.5 51 18 38 0.480 0.474
5 21.5 60 13 43 0.358 0.302
6 17.5 30 12 19 0.583 0.632
7 20.5 50 17 43 0.410 0.395
8 28.0 38 28 36 0.737 0.778
9 56.0 60 56 60 0.933 0.933
10 32.5 41 28 32 0.793 0.875
are row totals for the S-matrix (Table 5.2A) and the R-matrix (Table 5.1B),
respectively. The sum of the row totals in column 2 is twice as large as the
sum of the row totals in column 1. The numbers in column 3 of Table 5.12
are row totals for the F-matrix (Table 5.1A). These were divided by the
numbers of column 4 that represent sample sizes for pairs of events after
exclusion of ties (Table 5.2B). The sum for column 4 is twice the sum for
column 3.
Fig. 5.4 Method of ranking used by Blank and Ellis (1982). Left side: The design of the matrix used to
synthesize local range data found among a group of geological sections. All taxa range endpoints are
identified as being a top or base and are listed at the left and across the top of the matrix. The matrix
elements are the ratios n/N, and contain the empirical stratigraphic positionings of all endpoints found
for a region, taken two at a time. For example, n2/N2 is the second matrix element and shows that the
Top of taxon A and the Top of taxon B are found stratigraphically separated in N2 sections, and the Top
of A is found above the Top of B n2 times. A row represents an endpoint's total stratigraphic positioning
compared to all other endpoints with which it shows a preferred sequence, n/N > 1/2. Conversely, n/N < 1/2
also shows a preferred (reversed) stratigraphic sequence and was included in the row total as 1 - n/N. As
the total for a row approaches 1/2, an endpoint shows a more random stratigraphic positioning, and is not
useful in determining biostratigraphic sequence trends. The threshold at which an endpoint is
considered randomly distributed with respect to another or with respect to all endpoints with which it is
physically associated depends on the level of confidence one is willing to accept. Right side: Threshold
value determined for the North Atlantic Ocean database of Blank and Ellis (1982). The horizontal axis
represents the average n/N for a taxon as compared to all other taxa with which it occurs. The vertical
axis represents the taxa remaining in the database after successively deleting taxa that fall below a
certain value. The relationship defined for the North Atlantic Ocean database in the main body of the
figure reveals that at threshold value 0.85, the database maintains a minimum level of confidence and a
maximum number of taxa for further analysis. The implication is that taxa falling below the threshold
values are less useful in biostratigraphic classification based on sequential similarities (from Blank and
Ellis, 1982).
and Ellis determined a threshold value of 0.85 for their very large
database of DSDP data (see Fig. 5.4B).
9 10 11 12
5 6 7 8 9 10 11
3 4 5 6 7 8 9 10
2 3 5 6 7 9 10
1 3 5 9
TABLE 5.13
Rubel’s matrix of stratigraphic relations between 12 taxa in single section (example of local ranges
discussed in text). The row totals a, b and c are for +, 0 and -, respectively.
1 2 3 4 5 6 7 8 9 10 11 12 a b c
1 x + 0 + 0 + + + 0 + + + 8 3 0
2 - x 0 + 0 0 0 + 0 0 + + 4 6 1
3 0 0 x 0 0 0 0 0 0 0 + + 2 9 0
4 - - 0 x 0 0 0 0 0 0 + + 2 7 2
5 0 0 0 0 x 0 0 0 0 0 0 + 1 10 0
6 - 0 0 0 0 x 0 0 0 0 0 + 1 9 1
7 - 0 0 0 0 0 x 0 0 0 0 + 1 9 1
8 - - 0 0 0 0 0 x 0 0 0 + 1 8 2
9 0 0 0 0 0 0 0 0 x 0 0 0 0 11 0
10 - 0 0 0 0 0 0 0 0 x 0 0 0 10 1
11 - - - - 0 0 0 0 0 0 x 0 0 7 4
12 - - - - - - - - 0 0 0 x 0 3 8
Suppose that local ranges for the taxa are available for another
section. A table similar to Table 5.13 then can be constructed for this other
section. The tables for the two sections can be superimposed on one
another and combined into a single new table using the following algebra
(Rubel, 1978, p. 244): +&+ = +, -&- = -, 0&0 = 0 and -&+ = 0. It is
implied that 0&+ = 0 and 0&- = 0. If one or both taxa are missing in one of
the sections, the matrix element (+, - or 0) for their relation in this section
is unknown. Writing x for such an unknown element, the following
combinations can be added: +&x = +, -&x = -, 0&x = 0 and x&x = x.
It is possible to add more sections to a combination of two sections.
The matrix resulting from adding all available sections for a region is
independent of the order in which the sections are added to one another.
A + in this final matrix means that, of the two taxa compared, one occurs
above the other in all sections considered. The + is accompanied by a - as
its counterpart. A zero means that the two taxa coexisted in at least one
sample in at least one section. Great importance is given to coexistences
of taxa because the ranges in the composite standard are extended to cover
all observed coexistences of taxa. Obviously, this makes conservative
ranking methods sensitive to reworking and stratigraphic leaks. Such
effects should be eliminated before application of the method.
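The combination rule for superimposing sections can be illustrated with a small Python sketch. The function names are hypothetical, and the treatment of contradictory superpositions (a + in one section combined with a - in another) as 0 is an assumption made here for the purpose of illustration; the other cases follow the text: a + or - survives only if all sections in which both taxa occur agree, any observed coexistence gives 0, and x stands for a section in which the relation is unknown.

```python
from functools import reduce
from itertools import permutations

def combine(a, b):
    """Combine the relation of one pair of taxa observed in two sections.

    Symbols: '+', '-', '0' (coexistence) and 'x' (relation unknown).
    """
    if a == "x":
        return b          # an unknown element leaves the other value unchanged
    if b == "x":
        return a
    if a == b:
        return a          # +&+ = +, -&- = -, 0&0 = 0
    return "0"            # any disagreement is treated as coexistence (assumption)

def combine_sections(relations):
    """Combine the relations of one pair of taxa over any number of sections."""
    return reduce(combine, relations, "x")

observed = ["+", "x", "+", "0"]
print(combine_sections(observed))

# The result should not depend on the order in which sections are added.
results = {combine_sections(p) for p in permutations(observed)}
print(results)
```

Because x acts as a neutral element and 0 absorbs every known value, the operation in this sketch is commutative and associative, which is why the order of adding sections does not matter.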
TABLE 5.14
A-matrix for Rubel’s example of 12 local ranges. Each taxon was assigned separate code numbers for its
lowest and highest occurrence, respectively. See text for further explanation.
1 2 3 4 5 6 7 8 9 10 11 12
I 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 I 8 19 20 21 22 23 24 A,
l x l l l h l l l h l l l 1 1 1 1 h l 1 1 1 1 1 1 21.5
2 0 ~ 1 1 h l l l h l l l1 1 1 1 h l 1 1 1 1 1 1 20.5
3 0 0 ~ 1 0 1 1 1 0 1 h lh l 1 1 0 1 h 1 1 1 1 1 165
4 0 0 0 x O 1 1 1 0 1 h l h l 1 1 0 1 h l 1 1 1 1 15.5
5 h h l l x l l l h l l l 1 1 1 1 1 1 1 1 1 1 1 1 21.5
6 0 0 0 0 0 x h h 0 1 0 1 0 1 h l 0 1 0 1 I I 1 1 115
7 0 l ) 0 0 0 1 x 1 0 1 0 1 0 1 h l 0 1 0 1 I 1 1 1 125
8 0 U 0 U h h 0 x 0 1 0 1 0 1 h l 0 1 0 1 1 1 1 1 115
S h h l l h l l l x l l l 1 1 1 1 h l 1 1 1 1 1 1 210
~ ~ ~ ~ ~ ~ 1 n I ~ OO h 0 O 0 h 00 0
1 ~0 10 hh l 1 1 7.0
I I i l O h h O 1 1 I 0 I x 1 h I I I 0 1 h l 1 1 1 1 16.0
I ~ I I I ~ I I ~ I I I I I O ~ O C h ~ OI h~ X0 1 0 1 0 1 1 1 65
I : 1 0 I l h h O 1 1 1 0 1 h 1 X I I 1 0 1 h l 1 1 1 1 16.0
l l O U U O 0 0 0 0 0 h O h O x O h 0 1 0 1 h l 1 1 7.0
1 5 O U O O ~ h h h O 1 0 10 1 X I 0 1 0 1 I 1 1 1 11.5
I i i 0 0 0 0 0 0 0 0 0 h 0 h O h O x 0 1 0 1 h l 1 1 7.0
1 7 h h I I 0 I I I h 1 1 1 1 1 1 1 X I I 1 1 1 I 1 205
1 B 0 I I l J i l O l l 0 0 O ~ O 0 0 0 0 0 O x O h O h h h 20
1 9 0 0 I 1 I 1 0 1 1 I 0 I h I h l 1 1 0 1 X I 1 1 1 1 100
2 0 0 0 0 0 0 u 0 0 0 0 0 0 0 0 0 0 O h O x O h h h 20
2 1 0 0 0 0 0 0 0 0 0 h 0 1 O h O h 0 1 0 1 X I I I 75
2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 O h O h O x h h 20
2 J 0 U l J 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 X I 40
2 . i 0 I I 0 0 0 U 0 0 I ~ 0 0 0 0 0 0 0 h h h h h h O x 30
In terms of graph theory, Table 5.13 is the adjacency matrix for a local
range chart represented as an interval graph. However, after addition of
one or more other sections, using the preceding algebra, it may not be
Fig. 5.5 Graphical representation of all possible configurations of relations between the local ranges of
two taxa in Rubel’s (1978) example. Numbers of taxa used for example are same as in Tables 5.13 and
5.14. Each relation corresponds to a square block of four numbers (1, h = 0.5 or 0) in the upper triangle of
Table 5.14 and its counterpart in the lower triangle. All Harper’s (1981) possible relative age relations
between two taxa (H1 to H11 with numbers as in Fig. 2.5) are represented.
Fig. 5.6 Rubel’s (1978) possible explanations of potential inconsistencies for superpositional relations of
3 events in 3 or more sections. In both spatial distribution patterns (A and B), coexistence of the taxa (a1,
a2 and a3) cannot be observed in any of the sections (S1, S2 and S3).
Worsley and Jorgens (1977) have found that the algorithm of Section
5.3 does not necessarily yield an optimum sequence because cyclical
inconsistencies may occur in which more than two events are involved.
Their original example of cycling events is shown as the first matrix of
Table 5.15. When the algorithm is applied, the original S-matrix reoccurs
after every set of six consecutive iterations. Hence an optimum sequence
could never be determined by means of the preceding algorithm.
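Both the ordinary behaviour of the ordering algorithm and the cycling problem can be reproduced with a short Python sketch. It is an illustration written for this purpose, not the RASC code: the function name is hypothetical, and the scanning rule (test the elements of a row from left to right, interchange two events whenever an upper-triangle element is less than its counterpart, then rescan the same row) is an assumption based on the descriptions accompanying Tables 5.4 and 5.6. The three-event matrix at the end is a made-up minimal cycle, not the four-event example of Worsley and Jorgens.

```python
# Transposed S-matrix as printed in Table 5.6A (diagonal stored as 0.0).
ST = [
    [0.0, 2.0, 2.5, 4.5, 4.5, 2.5, 4.0, 5.0, 8.0, 4.5],
    [5.0, 0.0, 3.0, 3.0, 5.0, 3.0, 4.5, 5.0, 8.0, 4.5],
    [2.5, 3.0, 0.0, 4.5, 4.0, 3.0, 3.5, 4.0, 6.0, 3.5],
    [1.5, 3.0, 1.5, 0.0, 2.5, 3.0, 2.5, 3.0, 6.0, 3.5],
    [3.5, 3.0, 2.0, 4.5, 0.0, 3.0, 3.5, 5.0, 9.0, 5.0],
    [0.5, 1.0, 1.0, 1.0, 1.0, 0.0, 1.5, 1.0, 4.0, 1.5],
    [2.0, 1.5, 1.5, 3.5, 3.5, 1.5, 0.0, 4.5, 7.0, 4.5],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.5, 0.0, 5.0, 2.5],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0],
    [0.5, 0.5, 0.5, 1.5, 1.0, 0.5, 0.5, 0.5, 3.0, 0.0],
]

def hay_order(S):
    """Return (order, interchanges, cycled) for a square score matrix S.

    order holds 0-based event indices. cycled is True when an ordering
    that was seen before reappears, so that no optimum sequence is reached.
    """
    n = len(S)
    order = list(range(n))
    seen = {tuple(order)}
    swaps = 0
    for i in range(n - 1):
        j = i + 1
        while j < n:
            a, b = order[i], order[j]
            if S[a][b] < S[b][a]:            # upper element less than counterpart
                order[i], order[j] = b, a    # interchange rows and columns
                swaps += 1
                if tuple(order) in seen:
                    return order, swaps, True
                seen.add(tuple(order))
                j = i + 1                    # rescan the row for the new event
            else:
                j += 1
    return order, swaps, False

order, swaps, cycled = hay_order(ST)
final = [e + 1 for e in order]               # back to event code numbers
print(final, cycled)

# Hypothetical three-event cycle: A above B, B above C, C above A.
CYCLE = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]
_, cycle_swaps, cycle_found = hay_order(CYCLE)
print(cycle_swaps, cycle_found)
```

For the matrix of Table 5.6A this sketch ends with the event order shown in the heading of Table 5.6B. The number of interchanges it counts need not coincide with the iteration count reported by the RASC program, which may tally iterations differently. For the three-event cycle the starting order returns after six interchanges.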
TABLE 5.15
Example of cycling events (initial matrix from Worsley and Jorgens, 1977). Unlike the example of Table
5.4, the algorithm for ordering does not yield an optimum sequence because the initial matrix returns
after 6 iterations. Note that event D does not participate in the cycling.
Fig. 5.7 Three-event cycle (ABC) in set of four events is characterized by successive arrows pointing in
same direction at both sides of vertices (A, B and C). Arrow between two events indicates that one event
precedes other event.
For this example, the data of Table 4.10 were run setting the
threshold parameters equal to k_c = 7 and m_c1 = 5, respectively. For n = 26
events, it is possible to make n(n-1)/2 = 325 comparisons. However,
because of the threshold m_c1 = 5, forty pairs were not used. The presorting
option was used (see Table 5.11) and the 26 events were reordered by
means of the modified Hay method using the ranks in the last column of
Table 5.11. The final result is shown in Table 5.17. A three-event cycle
involving events 25, 27 and 69 was identified with the corresponding
output shown in Table 5.16. The event positions printed below the cycling
events are temporary and can be used to identify which pair of events (11
and 12) was ignored in order to break the cycle.
Fig. 5.8 Graphical illustration of algorithm developed to locate three-event cycle. Elements in
successive rows of upper triangle are tested proceeding from left to right. Row and column interchanges
only take place when element is less than its counterpart in lower triangle. In example, element circled
in margin C will be replaced by K which, in turn, will be followed by F. Cycle C K F will repeat
indefinitely.
TABLE 5.16
Selected output from RASC program including information on a single 3-event cycle encountered when
data of Table 4.10 are run with k_c = 7 and m_c1 = 5. See text for explanations.
CYCLING EVENTS: 27 25 69
EVENT POSITIONS: 11 13 12
MATRIX ELEMENTS:
RANKING SOLUTION OBTAINED WITH:
102 ITERATIONS OUT OF MAXIMUM 9000
TOLERANCE OF 0.0
TABLE 5.17
RASC program output of optimum sequence of data of Table 4.10 with k_c = 7 and m_c1 = 5.
1 17 0- 2 Asterigerina gurichi
2 16 1- 3 Ceratobulimina contraria
3 67 2- 4 Scaphopod sp1
4 18 3- 6 Spiroplectammina carinata
5 21 3- 6 Guttulina problema
6 20 5- 7 Gyroidina girardana
7 15 6- 8 Globigerina praebulloides
8 26 7-10 Uvigerina dumblei
9 70 7-12 Alabamina wolterstorffi
10 24 8-11 Turrilina alsatica
11 27 10-12 Eponides umbonatus
12 69 11-13 Nodosaria sp8
13 25 12-14 Coarse arenaceous spp.
14 31 13-16 Pteropod sp1
15 29 13-16 Cyclammina amplectens
16 34 15-17 Marginulina decorata
17 41 16-18 Plectofrondicularia sp1
18 42 17-19 Cibicidoides alleni
19 30 18-20 Cibicidoides blanpiedi
20 36 19-23 Pseudohastigerina wilcoxensis
21 45 19-22 Bulimina trigonalis
22 57 21-23 Spiroplectammina spectabilis
23 46 22-25 Megaspore sp1
24 50 22-25 Subbotina patagonica
25 54 24-26 Textularia plummerae
26 56 25-27 Glomospira corona
Fig. 5.9 Cycles of more than three events can occur when all events, except those involved in cycle, are
pairwise simultaneous (relative frequency P_ij is equal to 0.5). Pair of events that are coeval on average
have connecting lines without arrows in examples for 4-, 5- and 6-event cycles shown.
Fig. 5.10 Illustration of pseudo-cycle (ADCB) which initially develops when the algorithm is applied but
is automatically replaced by the three-event cycle (ADC). Events with hats are being observed at a
“window” and checked for periodicity in the algorithm.
TABLE 5.18
RASC program output of optimum sequence for Hay example after modifications of SEQ file of Table 5.3
(cf. Table 4.6). A. Additional information for Paleocene was used. B. Guex levels were used for data
reduction.
Kendall (1975), and Brunk (1960) who scored ties as 0.5 above and below
the principal diagonal of the matrix for frequencies. However, arguments
that ties should be ignored in some situations have been presented by
Hemelrijk (1952) and Tocher (1950). It has already been pointed out that,
in the absence of cycling (see Section 5.7), the modified Hay method
produces exactly the same optimum sequence as the original Hay method.
In the methods of Davaud and Guex (1978) and Rubel (1978), occurrences
of fossil species are considered to be coeval if they are observed to be
coeval at least once. For example, even if fossil A is observed to occur
above fossil B in several sections, their coexistence in a single section
causes the two fossils to co-occur in the standard constructed on the basis
of all sections. Clearly, more weight is then assigned to ties than in either
the Hay method or modified Hay method. Guex and Davaud (1984) have
made extensive use of graph theory in developing their technique. This
allowed them to construct an optimum sequence of multiple events which
may be subdivided into parts called “Unitary Associations” (see Section
3.5) that can be identified in the original sections and used for correlation.
CHAPTER 6
SCALING OF BIOSTRATIGRAPHIC EVENTS
6.1 Introduction
Artificial Example 1
Artificial Example 2
Artificial Example 3
Artificial Example 4
Fig. 6.1 Graphical illustration of RASC method for ranking and scaling of stratigraphic events in many
stratigraphic sections (shown as vertical lines). Ranking in the stratigraphically downward direction
provides optimum sequences AB (A stratigraphically above B) in Examples 1 and 3, A-B (undecided) in
Example 2, and ABC in Example 4. Scaling gives distance estimates of intervals between successive
events along a linear (RASC) scale. The distance between A and B is estimated as (1) 1.28, (2) 0.00,
(3) 0.32 and (4) 0.36 for Artificial Examples 1, 2, 3 and 4, respectively (from Gradstein et al., 1990).
TABLE 6.1
Example of Z-values for selected relative frequencies P . The Z*-values in last column are linearly
related to the frequencies and are approximate Z-values.
P Z Z*
0.00 -q_c -2.930
0.05 -1.645 -1.319
0.10 -1.282 -1.172
0.20 -0.842 -0.879
0.30 -0.524 -0.586
0.40 -0.253 -0.293
0.50 0.000 0.000
0.60 0.253 0.293
0.70 0.524 0.586
0.80 0.842 0.879
0.90 1.282 1.172
0.95 1.645 1.319
1.00 q_c 2.930
frequency density function of the interval between two events along the
RASC scale is uniform. This, in turn, would mean that frequency density
functions of individual events along the RASC scale would have different
shapes depending on the value of Z*; e.g. for Z*_AB = 0, A and B would
have U-shaped density functions with local minima at their mean
locations. It is more realistic to assume that the individual species have
density functions with maxima at or near their mean values. The mode
and mean coincide for the normal (Gaussian) curve model used in RASC.
This model is not satisfactory for small densities in the tails where
artificial truncation is applied when the cumulative frequency of the
sample is observed to be either 0 or 1 (see before). It is good to keep in
mind that the decrease in density away from the mode could be different for
different taxa. Also, for the same species it could be different in the
stratigraphically upward and downward directions (cf. Chapters 2 and 9).
The scaling algorithms presented in this chapter form the second part
of the RASC program for ranking and scaling of biostratigraphic events
and other events which can be uniquely identified. An optimum sequence
constructed by means of a ranking algorithm provides the starting point
Fig. 6.2 Scaled optimum sequence for 21 wells on Labrador Shelf and Grand Banks (k_c = 7, m_c1 = 2,
m_c2 = 4). Dendrogram values along horizontal axis are interfossil distances (= intervals between
successive exits) also given in numerical form in the vertical direction. Each distance represents
distance between an event and its successor of which the dictionary code number and name are printed
on the next line. The tenfold zonation is representative for the regional Cenozoic stratigraphy. There are
eleven unique events, shown with double asterisks. These unique events occurred in fewer than k_c = 7
sections so that they were not used for scaling. Their interfossil distances were estimated later, by re-
inserting them into the scaled optimum sequence on the basis of their relative stratigraphic positions
(with respect to events that were used) in the one or more sections containing them. A shading pattern
was used to enhance the stratigraphically most useful parts of the dendrogram. The large distances on
either side of the Eocene, Oligocene and Miocene assemblages are sedimentary cycle boundaries
(cf. Gradstein et al., 1985, pp. 146-151).
Figure 6.3 shows DENO output for the Hay example (cf. Fig. 4.2,
Table 5.5). All 10 events were used and the threshold parameters m_c1 and
m_c2 were set equal to 2. The relatively short intervals between events 1 to
7 in Figure 6.3b reflect the fact that these events tend to be coeval on the
average in the lower parts of the sections (see Fig. 4.2). On the other hand,
events 8, 9 and 10 tend to occur above the others. Clearly, the dendrogram
(scaled optimum sequence, Fig. 6.3b) contains more information than the
optimum sequence (Fig. 6.3a). As another example of this, it may be
considered that events 9 and 10 are coeval on the average according to
Figure 6.3a. This would imply that there is 50 percent probability that
event 9 occurs above 10. However, in Figure 6.3b, event 9 occurs above 10
with distance of D = 0.4354. It will be shown in the next section that the
estimated probability P_e corresponding to D satisfies P_e = Φ(D).
Consequently, event 9 would occur above 10 with probability P_e = Φ(0.4354)
= 0.67 or 67 percent, which is greater than 50 percent.
Although W (event 9) occurs three times above A (event 10), and A three
times above W in Figure 4.2, it also can be seen that if W occurs above A,
the latter event is coeval to six (Section B), one (Section G) and two
(Section H) other events, respectively. On the other hand, if A occurs
above W, the latter event is not coeval to any other events. Because all
possible pairwise comparisons are considered simultaneously in scaling,
event 9 (W) is placed above 10 (A) in the scaled optimum sequence instead
of at the same position.
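The conversion from RASC distance to probability used above can be checked with the standard normal distribution function from the Python standard library. The function name below is hypothetical; the only assumption is that D is expressed in units for which P_e = Φ(D), as stated in the text.

```python
from statistics import NormalDist

def crossover_probability(distance):
    """Probability that one event occurs above another, given their RASC distance."""
    return NormalDist().cdf(distance)   # standard normal distribution function

p = crossover_probability(0.4354)
print(round(p, 2))
```

A distance of zero gives a probability of exactly one half, which corresponds to two events that are coeval on the average.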
Fig. 6.3 DENO output for the Hay example (from Agterberg and Gradstein, 1988). The clustering of
events 1 to 6 in the dendrogram (b) reflects the relatively large number of cross-overs and many coeval
events near the base of most sections used (cf. Fig. 4.2).
final testing either verifies or negates the results obtained by means of the
statistical model.
Figure 6.4 shows the basic model initially adopted for the scaling
algorithms. Each event (e.g. A) would assume a position x_Ai in section i
where x_Ai is the distance to A from an origin with arbitrary location along
the relative time scale (x-axis in Fig. 6.4). The distance x_Ai is assumed to
be the realization of a random variable X_A whose probability distribution
is shown in Figure 6.4. Similar random variables are defined for the other
events B, C, ...
The random variable X_A satisfies the normal (Gaussian) probability
distribution N(EX_A, σ²) with expected (or mean) value EX_A and
variance σ². The mean values of the events differ from one another but the
standard deviations of all events are assumed to be equal to σ in the model
of Figure 6.4.
Fig. 6.4 Probabilistic model for clustering of biostratigraphic events (A, B, C, ...) along relative time
scale (x-axis). Relative position of event (for example, A) in section or well is random variable (X_A) which
is distributed normally around average location (EX_A) with standard deviation σ.
Fig. 6.5 Direct estimation of distance Δ_AB between events A and B from cross-over frequency P(D_AB < 0).
Random variable D_AB (= X_B - X_A) is negative only when order of A and B in section is reverse of order of
EX_A and EX_B. Variance of D_AB is twice as large as variance σ² of individual events A and B.
(6.1)
This formula follows from the fact that the difference D_AB = X_B - X_A has
a normal distribution N(Δ_AB, 2σ²) which is shown in the bottom part of
Figure 6.5. The distance between events A and B for a specific section can
be written as d_AB = x_B - x_A. The hatched area in Figure 6.5 is for
P(D_AB < 0) = 1 - P(D_AB > 0). If Φ represents fractile of the normal
distribution in standard form, it follows that
(6.2)
Consequently,
P(D_AB > 0) = Φ(Δ_AB / σ√2)    (6.3)
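Equation 6.3 can be inverted to turn an observed relative frequency into a distance estimate. The Python sketch below does this under two assumptions that go beyond the equation itself: distances are expressed in units of σ√2, so that Z = Φ⁻¹(P), and frequencies of 0 or 1 (and any Z beyond the threshold) are truncated at q_c = 1.645 in the manner of Table 6.1. The function name is hypothetical.

```python
from statistics import NormalDist

def z_value(p, q_c=1.645):
    """Z-value for relative frequency p, truncated to the range [-q_c, q_c]."""
    if p <= 0.0:
        return -q_c                      # inverse is undefined at 0; use threshold
    if p >= 1.0:
        return q_c                       # inverse is undefined at 1; use threshold
    z = NormalDist().inv_cdf(p)          # fractile of the standard normal distribution
    return max(-q_c, min(q_c, z))

for p in (0.0, 0.2, 0.5, 6 / 7, 0.9, 1.0):
    print(round(p, 3), round(z_value(p), 3))
```

The frequency 6/7 is included because it occurs in the Hay example (event 9 above event 4 in 6.0 of 7 comparisons).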
Fig. 6.6 Indirect estimation of distance Δ_AB between events A and B from cross-over frequencies with
event C. Indirect distance D_AB.C = D_AC - D_BC has variance which is four times as large as variance of
individual events A, B and C.
(6.4)
m_c2. However, later work has shown that better results can be obtained by
setting m_c2 > m_c1. For the example of Table 5.3, m_c2 = 3 and m_c1 = 1.
When an average distance between two events is estimated from Z-
values for 10 events, it could be based on as many as nine separate
estimates of the distance. The direct estimate of the distance between
events i and j follows from Z_ij and the indirect estimates involving other
events k follow from the differences Z_ik - Z_jk (k ≠ i, j) where i and j = i + 1
are successive rows. However, because Z_ij = -Z_ji, the differences
Z_kj - Z_ki (k ≠ i, j), where i and j = i + 1 are successive columns, also can be
used. For example, the direct estimate of distance between events 4 and 7
which occur in columns 5 and 6, respectively, satisfies
D(4-7) = Z_56 = 0.210. The corresponding indirect estimates are
Z_16 - Z_15 = 1.645 - 1.068 = 0.577, Z_26 - Z_25 = 1.282 - 0.524 = 0.758,
and six other, similar differences between Z-values in adjacent columns.
The differences for all pairs of events are shown in Table 6.2C.
In the RASC program, only Z-values in the upper triangle are used.
The lower triangle is used to retain information on sample sizes. Addition
of indirect and direct estimates yields the sum of the N* separate
estimates. For events 4 and 7, Sum = 1.56 (see Table 6.2C). The average of
all N* = 9 estimates of the interval between events 4 and 7 amounts to
Sum/9 = 0.174. This is called an unweighted estimate of distance between
successive events in the output of the RASC program. The complete set of
9 intervals is shown in Table 6.3. The cumulative RASC distance or
distance from the first event (No. 9) is shown in the last column of Table
6.3. Because of missing values (see Table 6.2) or pairs of cross-over
frequencies which both are equal to one (see later), distance estimates may
be based on fewer than N* (= 9 for the example) pairs of events.
Theoretically, the direct estimate of distance (cf. Fig. 6.5) has half the
variance of the indirect estimates (cf. Fig. 6.6). Thus it should be weighted
twice as heavily. This will be done in weighted distance estimation in
which errors in P_ij due to small sample sizes also will be considered.
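The averaging of direct and indirect estimates can be sketched as follows. This is a simplified illustration with hypothetical names, not the RASC implementation: it assumes a full antisymmetric Z-matrix (Z[j][i] = -Z[i][j]) arranged in the order of the optimum sequence, uses None for missing values, and skips any pair of identical Z-values at the threshold q_c, which is how the zero differences of Table 6.2C are interpreted here. The example matrix is synthetic; it is built from made-up event positions so that every estimate of an interval is the same.

```python
def unweighted_interval(Z, i, q_c=1.645):
    """Average the direct and indirect estimates of the interval between
    the events at positions i and i + 1 of the optimum sequence.

    Returns (average, number of estimates used); average is None if no
    estimate could be formed.
    """
    j = i + 1
    estimates = []
    if Z[i][j] is not None:
        estimates.append(Z[i][j])                 # direct estimate Z_ij
    for k in range(len(Z)):
        if k in (i, j):
            continue
        z_ki, z_kj = Z[k][i], Z[k][j]
        if z_ki is None or z_kj is None:
            continue                              # a missing value removes the estimate
        if abs(z_ki) >= q_c and z_ki == z_kj:
            continue                              # both truncated at q_c: not used
        estimates.append(z_kj - z_ki)             # indirect estimate Z_kj - Z_ki
    if not estimates:
        return None, 0
    return sum(estimates) / len(estimates), len(estimates)

# Synthetic check: made-up positions of four events along the scale.
positions = [0.0, 0.4, 1.0, 1.3]
Z = [[xj - xi for xj in positions] for xi in positions]
distance, n_used = unweighted_interval(Z, 0)
print(round(distance, 3), n_used)
```

For the synthetic matrix the direct estimate and the two indirect estimates all equal the true interval, so their average does too. With real data the estimates differ, as in the values 0.210, 0.577 and 0.758 quoted above for events 4 and 7, and the average of the N* usable estimates is reported.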
TABLE 6.2
Unweighted distance estimation to obtain intervals between successive events along RASC distance
scale for Hay example. A. P-matrix of relative frequencies for the 10 events in order of optimum
sequence. Values excluded because of threshold m_c2 = 3 are shown as 000. B. Z-values corresponding to
P-values. Note that threshold q_c is equal to 1.645. C. Values are differences between values in
successive columns of Table 6.2B. Zero differences for pairs of q_c-values are shown as 000 and were not
used. Bottom row shows sums for columns with number of values (N*) used for obtaining sum.
A 9 10 8 6 4 7 5 1 3 2
9 x 3.0/6 5.0/5 4.0/4 6.0/7 7.0/7 9.0/9 8.0/8 6.0/6 8.0/8
B 9 10 8 6 4 7 5 1 3 2
9 x 0.000 1.645 1.645 1.068 1.645 1.645 1.645 1.645 1.645
C 10 8 6 4 7 5 1 3 2
0000 I615 000 0577 0 5i7 000 000 000 000
Y 0967 000 000 0 758 0315 0315 0 132 0 132
0 678 Y 000 000 0 608 0 I63 0000 0000 0000
000 000 ‘L 0674 Ofii4 0674 0 293 0 293 0000
0544 0150 1lOOl1 \ 0 210 0 156 0308 0000 0674
0 363 0 1)” I2 S2 0210 Y 0000 0430 0094 0 150
0678 0678 0971 0308 03fiR \ 0157 0273 0112
I 0363 0 3 fil Ofii8 0293 0244 0273 Y 0000 0566
3 0495 0496 0971 0000 0 150 0091 0.130 ‘L 0 000
4 0363 0 3 F3 0971 0674 0674 0356 0248 0566 x
SullVV’ 3 9803 05618 4 8716 I 1617 I 5619 I fiOl8 1 6918 0 5118 006/8
TABLE 6.3
Unweighted distance analysis of values shown in Table 6.2 continued to obtain RASC distances of
events. The origin of the scale is set at the first event. Consequently, the distance for event 9 is equal to
zero. Event 10 has distance of 0.435. Event 2 has the largest cumulative RASC distance (= 2.140).
the sizes of the samples used to obtain the Z-values. The weight-corrected
equation for estimating the distance between events i and j is:
(6.5)
(6.6)
(6.7)
(6.10)
$$w_{ij} = \frac{1}{\sigma^2(Z_{ij})} = \frac{R_{ij}\, e^{-Z_{ij}^2}}{2\pi P_{ij}(1 - P_{ij})}$$
(6.11)
TABLE 6.4
Weighted distance analysis of values shown in Table 6.2. The Z-values were weighted according to
sample size (see Eq. 6.5 and 6.6 in text). Standard deviations were computed by using Eq. 6.13. Note that
the interval between events 3 and 2 (on bottom row) is negative. As a result, event 2 has RASC distance
(= 2.149) which is less than that of event 3 (= 2.155).
(6.12)
with
$\bar{x} = \Delta_{AB}$; $W = \sum_{i=1}^{N^*} w_i$
and
$x_1 = Z_{AB}$, $w_1 = w_{AB}$
$x_2 = Z_{AC} - Z_{BC}$, $w_2 = w_{AB.C}$
weight W and sum $\sum w_i x_i$ for the Hay example were given in Table 6.4.
The corresponding standard deviation $s(\bar{x})$ shown in the last column of
Table 6.4 is the positive square root of
(6.13)
When all possible comparisons can be made as for the pair of events 4
and 7, N* = N-1 where N denotes total number of events. However, in the
RASC computer program, N* may be less than N-1 for the following two
reasons: (1) The total number of comparisons is reduced by one for each
value x_i that cannot be computed because one of the Z-values needed is
missing (this includes the case that both Z-values are missing); (2) if
S_ij = R_ij, P_ij = 1 and the corresponding Z-value is set equal to the
threshold value q_c (= 1.645 in Table 6.2). Pairs of Z-values both equal to
q_c, and with zero difference, are not used for estimating the average
distance Δ_ij unless a pair of this type is contained within a cluster of
mutually inconsistent events. For this reason, pairs of values (Z_ik, Z_jk) in
successive columns (i, j = i + 1) are tested by letting k decrease from
k = i + 1. Suppose that, for a given value of k, Z_ik = Z_jk = q_c. This pair is
not used for the distance estimation unless a pair of Z-values, which are
not both equal to q_c, is found for a smaller value of k. In the RASC
program, it is assumed that this situation is encountered as soon as five
pairs of Z-values equal to q_c have been identified for decreasing k.
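The separate estimates x_i that survive these exclusions are combined as a weighted mean. The sketch below assumes the usual large-sample variance of a probit value, σ²(Z) = 2πP(1 - P)exp(Z²)/R, with R the number of sections in which both events occur. The function names and sample numbers are hypothetical, and the RASC program may differ in detail.

```python
import math
from statistics import NormalDist

def probit_weight(p, r):
    """Weight 1/var(Z) for Z = probit(p) estimated from r comparisons (0 < p < 1)."""
    z = NormalDist().inv_cdf(p)
    return r * math.exp(-z * z) / (2.0 * math.pi * p * (1.0 - p))

def weighted_mean(xs, ws):
    """Return (sum(w*x)/sum(w), sum(w)) for separate estimates xs with weights ws."""
    total_w = sum(ws)
    return sum(w * x for w, x in zip(ws, xs)) / total_w, total_w

# Hypothetical example: a direct estimate from 10 sections with P = 0.8,
# combined with one indirect estimate that carries half the weight.
w_direct = probit_weight(0.8, 10)
x_bar, W = weighted_mean([0.84, 0.76], [w_direct, 0.5 * w_direct])
```

Weights fall off quickly as P approaches 0 or 1, so estimates from nearly one-sided comparisons contribute little to the mean.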
The last interval estimated in Table 6.4 is negative. For this reason,
it is desirable to reorder the events before a dendrogram of successive
interfossil distances is constructed. The cumulative distance from the first
event (No. 9) in the original optimum sequence obtained by ranking can be
calculated for each event in weighted as well as unweighted distance
analysis. In Table 6.4, the distance between events 9 and 2 (2.149) is less
than that between 9 and 3 (2.155). If distances from event 9 are used, it
follows that event 2 should lie above 3 in the scaled optimum sequence.
The events always can be reordered on the basis of this cumulative
distance. This allows the clustering of successive distances as shown, for
example, in Figure 6.2.
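The reordering step can be illustrated with a short Python sketch. The event labels and interval values are hypothetical; the only idea taken from the text is that events are re-sorted by cumulative distance from the first event, which removes a negative interval.

```python
from itertools import accumulate

def reorder_by_cumulative_distance(sequence, intervals):
    """Sort events by cumulative distance from the first event.

    sequence  : events in the order of the optimum sequence
    intervals : estimated distances between successive events (len(sequence) - 1)
    Returns a list of (event, cumulative distance) pairs in the new order.
    """
    cumulative = [0.0] + list(accumulate(intervals))
    # sorted() is stable, so events with equal distance keep their old order
    return sorted(zip(sequence, cumulative), key=lambda pair: pair[1])

def successive_intervals(ordered):
    """Distances between neighbours after reordering (all non-negative)."""
    return [b[1] - a[1] for a, b in zip(ordered, ordered[1:])]

# Hypothetical sequence whose last interval is negative.
ordered = reorder_by_cumulative_distance(["A", "B", "C", "D"], [0.5, 0.4, -0.1])
print([event for event, _ in ordered])
```

After reordering, the successive intervals are non-negative and can be clustered into a dendrogram.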
Figure 6.7 illustrates that the preceding iterative process for final
reordering does not necessarily converge to a single solution. Suppose that
the numbers in Figure 6.7 represent estimated distances between pairs of
TABLE 6.5
Example of weighted distance analysis after reordering. The optimum sequence used as input for scaling
was not the ranking result used for Tables 6.2 to 6.4 but the scaled optimum sequence in the ranking of
events in last column of Table 6.4. Differences between Tables 6.4 and 6.5 are restricted to values in two
rows at the bottom only.
Fig. 6.7 Artificial example for demonstrating that the final reordering option of the RASC computer
program does not necessarily converge to a unique solution. See text for further explanation.
artificial example have been chosen in such a way that this new sequence
again has only one negative distance (between D and B) and reordering
ADBCE gives the original sequence ABCDE. Consequently, a unique
solution with positive distances between successive events does not exist.
Situations similar to the one illustrated in Figure 6.7 do occur in practice,
especially in situations where the estimated distances are not very precise.
TABLE 6.6
RASC method of scaling applied to data of artificial example. For meaning of column headings, see text.
The analysis shown in Table 6.6 was repeated for the 5 smaller
subsamples. In all instances, the weighted mean distance provided the
best estimate (see Table 6.7). It also can be seen, however, that in small
samples, the estimated distance may differ considerably from its expected
value.
TABLE 6.7
Subsample   D (direct)   D (indirect)   Do (Ave)   D (Ave)
columns for P and Z in Table 6.6, it readily is computed that w_AB = 77.593,
w_AC = 60.139, and w_BC = 94.549. On the other hand, w_AB.C = 36.758,
w_AC.B = 42.618, and w_BC.A = 33.880. The latter three weights are not
exactly half as large as the first three weights. The reason for this
discrepancy is that the values of P and Z are approximations only. They
were estimated from samples and used instead of the population values in
the RASC method. The RASC weighted distances become 1.149, 1.457 and
0.308, instead of 1.152, 1.480 and 0.327 shown in Table 6.6 for D(Ave).
Their corresponding SSD value becomes 0.061 indicating that the D(Ave)
values (with SSD = 0.053) of Table 6.6 are better in this artificial
example.
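The indirect weights quoted above are consistent with adding the variances of the two Z-values that enter each difference, that is, 1/w_AB.C = 1/w_AC + 1/w_BC. The following Python check reproduces the quoted figures under that assumption of independent Z-values; the function name is hypothetical.

```python
def indirect_weight(w_first, w_second):
    """Weight of a difference of two independent Z-values (their variances add)."""
    return 1.0 / (1.0 / w_first + 1.0 / w_second)

# Direct weights quoted in the text for the artificial example.
w_ab, w_ac, w_bc = 77.593, 60.139, 94.549

w_ab_c = indirect_weight(w_ac, w_bc)   # A-B estimated through C
w_ac_b = indirect_weight(w_ab, w_bc)   # A-C estimated through B
w_bc_a = indirect_weight(w_ab, w_ac)   # B-C estimated through A
print(round(w_ab_c, 3), round(w_ac_b, 3), round(w_bc_a, 3))
```

When two direct weights are equal, the indirect weight is exactly half of either one, which is the theoretical factor of two mentioned in the text.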
TABLE 6.8
First two artificial sequences used in complete set of computer simulation experiments (20 events in 50
sections) with E(D) equal to 1.0, 0.5, 0.3, 0.2, 0.1, and 0.0, respectively.
1  1.0   1  2  4  5  3  6  8 10  9  7 11 12 13 14 15 17 16 18 19 20
2  1.0   1  4  3  2  7  6  5  8  9 11 10 12 13 15 14 16 18 17 19 20
1  0.5   1  2  5  4  3  6 10  8  9 11 13 14 12 15  7 17 16 18 19 20
2  0.5   1  4  3  2  7  8  9  6 11  5 12 13 10 15 18 19 16 14 17 20
1  0.3   5  1  4  2 10  3  6  8 11  9 15 13 14 17 12 16  7 19 18 20
2  0.3   1  4  3  7  2  8  9 11  6 12 13 18 15  5 10 19 16 20 17 14
1  0.2   5  4  1  2 10 11  3  6  8  9 15 17 14 13 12 16 19 18 20  7
2  0.2   1  4  7  3 11  8  9 13 18 12  2 15  6 19 20 10 16  5 17 14
1  0.1   5 10  4  2  1 11 17 15 14  8  9 13  6  3 16 12 19 20 18  7
2  0.1   1  4  7 18 11 19  9 13  8 12  3 15 20  2  6 16 17 10 14  5
1  0.0   5 10 17 15 11  4 14 13 16 19  2  9 20  8  1 18  6 12  3  7
2  0.0   1 18 19  4 20 11 13 15  7 12  9  8 17 16  3 10  6 14  2  5
TABLE 6.9
[Dendrogram not reproduced. Columns: sequence position, fossil number, range, and interfossil distance.]
Fig. 6.8 Optimum sequence and dendrogram for sample drawn at random from population (theoretical
model of equally spaced events labelled 1 to 20 along RASC scale with E(D) = 0.5). Original SEQ file is
shown in Table 4.13.
The modified Hay method did not change the probabilistic ranking
result in the experiments with E(D) = 0.5 or 0.3. Probabilistic ranking
results were changed somewhat by subsequent application of the modified
Hay method for E(D) = 0.1. Eight 3-event cycles occurred and were broken
by temporarily zeroing the (first) pair of elements in the cycle with the
smallest difference as shown in Table 6.10. In total, seven (12, 13) pairs
and one (11, 14) pair were ignored in order to obtain the optimum sequence
of Figure 6.10. However, a detailed comparison of the probabilistic
ranking result (Table 6.11A) with the final optimum sequence (Table
6.11B) shows that the probabilistic ranking is closer to the true sequence
than the modified Hay ranking. The numbers in the bottom rows of Tables
6.11A and B are absolute values of differences between ranking results
and true order numbers. Their sum is 22 for Table 6.11A and 33 for Table
6.11B suggesting that the probabilistic ranking result is slightly better in
this type of application (also see Section 7.4). For further comparison,
Table 6.11C shows differences for the scaled optimum sequence of Figure
[Dendrogram not reproduced. Columns: sequence position, fossil number, range, and interfossil distance.]
Fig. 6.9 Optimum sequence and dendrogram for computer simulation experiment of Fig. 6.8 repeated
with E(D)=0.3. Original SEQ file is shown in Table 4.14.
[Dendrogram not reproduced. Columns: sequence position, fossil number, range, and interfossil distance.]
Fig. 6.10 Optimum sequence and dendrogram for computer simulation experiment of Fig. 6.8 repeated
with E(D)= 0.1. Original SEQ file is shown in Table 4.15.
6.10. These add to 28 which is about midway between the preceding two
sums.
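The comparison statistic used here, the sum of absolute differences between estimated and true positions, is simple to compute. The sketch below uses a hypothetical four-event sequence rather than the data of Table 6.11, and the function name is an assumption.

```python
def rank_displacement_penalty(estimated, true_order=None):
    """Sum of |estimated position - true position| over all events.

    estimated  : events in the order produced by ranking or scaling
    true_order : events in their true order (defaults to sorted event numbers)
    """
    if true_order is None:
        true_order = sorted(estimated)
    true_position = {event: i for i, event in enumerate(true_order)}
    return sum(abs(i - true_position[event]) for i, event in enumerate(estimated))

# Hypothetical example: events 2 and 3 are interchanged.
print(rank_displacement_penalty([1, 3, 2, 4]))
```

A perfectly recovered sequence scores zero, and each interchange of neighbours adds two points.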
TABLE 6.10
Eight 3-event cycles detected during application of modified Hay method (after probabilistic ranking) to
SEQ file shown in Table 4.15.
A           B           C           D
 4  3  5     2  1  6     7  6  2    10  9 11
 x 11 14     x 11 14     x 11 13     x  9 13
14  x 10    14  x 12    14  x 11    16  x  9
11 15  x    11 13  x    12 14  x    12 16  x
E           F           G           H
14  9 11    16 14 12    10 16 14    16 13 12
 x 11 13     x 10 16     x 10 13     x 12 16
14  x  9    15  x 12    15  x 10    13  x 12
12 16  x     9 13  x    12 15  x     9 13  x
TABLE 6.11
Comparison of true optimum sequence with optimum sequences resulting from (A) probabilistic ranking,
(B) modified Hay method after probabilistic ranking, and (C) scaling after probabilistic ranking and
modified Hay method. Absolute values of differences between estimated and true ranks can be regarded
as penalty points. In the RASC step model (Chapter 7) these penalty points will be added to obtain a
statistic from which Kendall's rank correlation can be computed.
TABLE 6.12
Unweighted (x̄_1) and weighted (x̄_2) estimates of intervals in scaled optimum sequences for computer
simulation experiment with E(D) = 0.5. Standard deviations s(x̄_2) of weighted estimates are shown in
last column.
Events   N*   x̄_1   x̄_2   s(x̄_2)
TABLE 6.13
Table 6.15 contains summary statistics for the three data sets.
Separate standard deviations (s and σ̂) were computed from the samples of
19 values with respect to the sample mean (unbiased estimate, 18 degrees
of freedom) and the population mean (unbiased estimate, 19 degrees of
freedom). For E(D) = 0.5, the pairs of estimates (s and σ̂) are nearly equal
to one another. For E(D) = 0.3 and 0.1, σ̂(x̄_1) and σ̂(x̄_2) are larger than s(x̄_1)
and s(x̄_2) because the order of the events in the corresponding optimum
sequences differs from the expected order (1, 2, ..., 20). Ordinary
product-moment correlation coefficients r(x̄_1, x̄_2) are shown at the bottom of Table
6.15. They indicate that the unweighted and weighted analysis results are
strongly correlated.
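The two kinds of standard deviation differ only in the reference mean and the degrees of freedom. A minimal Python sketch, with hypothetical interval values and function names:

```python
import math

def sd_about_sample_mean(values):
    """Standard deviation about the sample mean (n - 1 degrees of freedom)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def sd_about_known_mean(values, mu):
    """Standard deviation about a known population mean mu (n degrees of freedom)."""
    n = len(values)
    return math.sqrt(sum((v - mu) ** 2 for v in values) / n)

# Hypothetical estimated intervals whose average falls short of E(D) = 0.5.
intervals = [0.1, 0.3]
s = sd_about_sample_mean(intervals)
sigma_hat = sd_about_known_mean(intervals, 0.5)
```

When the sample mean drifts away from the population value, as it does here, the deviation about the known mean becomes the larger of the two.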
TABLE 6.14
Same as Table 6.12 for E(D) = 0.1
TABLE 6.15
Comparison of estimates of Tables 6.12 to 6.14 to population parameters. See text for explanations of
expressions in first column.
Mutual interdependence of x_i
The following considerations are helpful for understanding the
mutual interdependence of separate estimates x_i of the interval between
two events A and B. Let C and D be two other events which can be used for
indirect estimation of the distance between A and B.
Three events A, B and C are related by the six probabilities P_ABC,
P_ACB, P_CAB, P_BAC, P_BCA and P_CBA. It follows immediately that
Similar expressions can be written out for P_BA, P_CA and P_CB but it is
simpler to regard these probabilities as complementary to P_AB, P_AC and
P_BC, or
P_BA = 1 - P_AB; P_CA = 1 - P_AC; P_CB = 1 - P_BC
(6.15)
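For reference, the relation that "follows immediately" from the six ordering probabilities can be written out by summing, for each pair, the three orderings in which the first event precedes the second. This is a reconstruction from the definitions above rather than a quotation, with P_XYZ read as the probability of the order X, Y, Z:

```latex
\begin{aligned}
P_{AB} &= P_{ABC} + P_{ACB} + P_{CAB},\\
P_{AC} &= P_{ABC} + P_{ACB} + P_{BAC},\\
P_{BC} &= P_{ABC} + P_{BAC} + P_{BCA}.
\end{aligned}
```

Because the six ordering probabilities add to one, each complement such as P_BA = 1 - P_AB is the sum of the remaining three orderings.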
where Z*_ACB and Z*_BCA are linear functions of P_ACB and P_BCA,
respectively. Consequently, for the two events C and D, it follows that
(6.17)
(1) An occurrence table can be constructed with the final ranking plotted
in the vertical direction. Sections in this table are represented by
columns. If an event occurs in a section its presence is indicated by
an X.
(2) Each section can be compared to the optimum sequence by using a
system of scoring penalty points when an event is out of place. This
procedure is called the Step Model (cf. Section 7.3). The relative order
of every pair of events is checked against their order in the optimum
sequence. If the order is different, one penalty point is scored. Coeval
events each receive half a point. Obviously, an event with many
penalty points is likely to be either too high or too low in a section. A
drawback of the step model is that events which belong to clusters of
events with many internal inconsistencies are likely to accumulate
high total scores even if they occur in normal positions. Thus, it may
not be easy to distinguish between anomalous events which are out of
place and events which are part of a cluster.
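The scoring rule of item (2) can be sketched as follows. The section, the optimum sequence and the function name are hypothetical, and the sketch assumes that each event of a discordant pair receives one point, which is one reading of the rule stated above.

```python
from itertools import combinations

def step_model_penalties(section_levels, optimum_sequence):
    """Penalty points per event for one section.

    section_levels   : dict mapping event -> level in the section
                       (smaller level = stratigraphically higher; equal = coeval)
    optimum_sequence : list of events from top to bottom
    """
    rank = {event: i for i, event in enumerate(optimum_sequence)}
    penalties = {event: 0.0 for event in section_levels}
    for a, b in combinations(section_levels, 2):
        level_a, level_b = section_levels[a], section_levels[b]
        if level_a == level_b:
            penalties[a] += 0.5            # coeval events: half a point each
            penalties[b] += 0.5
        elif (level_a < level_b) != (rank[a] < rank[b]):
            penalties[a] += 1.0            # order differs from optimum sequence
            penalties[b] += 1.0
    return penalties

# Hypothetical section: C is observed above B, and B and D are coeval.
scores = step_model_penalties({"A": 1, "C": 2, "B": 3, "D": 3},
                              ["A", "B", "C", "D"])
```

In this example B collects the highest score, although C is equally involved in the single reversal. This illustrates why high totals need to be interpreted with care.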
TABLE 6.16
RASC normality test output for 9 sections of Hay example. E=event code number; L=event level
number in stratigraphically downward direction; X = cumulative RASC distance; and U = second order
difference. Single asterisk indicates that event is out of place with probability greater than 95%. Two
asterisks indicate that event is out of place with probability greater than 99%.
TABLE 6.17
Normality test applied to all (= 51) second-order differences of Table 6.16. The expected frequencies (E)
of the ten classes are all equal to 5.1. The chi-squared test is used for comparing observed (O) and
expected (E) frequencies with one another. Abnormally large values in the last column would be
indicated by asterisks (one and two asterisks for lack of fit with probability greater than 95% and 99%,
respectively).
Class   O    E     |O-E|   (O-E)^2/E
1       3    5.1   2.1     0.39
2       5    5.1   0.1     0.00
3       8    5.1   2.9     0.74
4       7    5.1   1.9     0.32
5       5    5.1   0.1     0.00
6       3    5.1   2.1     0.39
7       5    5.1   0.1     0.00
8       7    5.1   1.9     0.32
9       3    5.1   2.1     0.39
10      5    5.1   0.1     0.00
Chi-squared = 2.56
Fig. 6.11 Direct estimation of distance Δ_AB between events A and B from relative cross-over frequency
when A is a marker horizon with zero variance. The variance of D_AB is equal to the variance of event B.
TABLE 6.18
RASC method of scaling applied to data of artificial example. Event B is marker horizon. These results
should be compared to those shown in Table 6.6 where B, like A and C, had unit variance.
Fig. 6.12 Simple example to illustrate application of unique event option. A unique event was observed
in a single section simultaneous to the event S, stratigraphically below the event A and above the events
B, B' and B". The cumulative RASC distances of the latter five events are shown along the scale on the
left. The positions of S, A and B were averaged to obtain first approximation x̄_1 for the unique event. The
second approximation was based on the RASC distances of all events within the range R.
the position x̄_1 representing the arithmetic average of x_a, x_s, and x_b. In
practical applications, S may be missing. (The special situation that A or
B is missing would occur only if the unique event were to occupy the first
or last position in a section.)
More than a single event may be observed in the positions
immediately above, simultaneous to, or below the unique event. Then, x_a,
x_s, or x_b will be computed as averages for these events which, in turn, will
be averaged to estimate x̄_1.
A range of x̄_1 ± (1/2)R can be defined for all events encountered
within the vicinity of x̄_1 with a probability greater than 5 percent.
Because σ² = 0.5, (1/2)R = 1.96σ = 1.386. The events in the scaled
optimum sequence with locations in the interval x̄_1 ± (1/2)R can be
identified. For the simplified example of Figure 6.12, these are the events
A, S, B, and B' (but not B"). For each event above the unique event in the
section (A), a value is computed which is the average of its location (x_a)
and the value x̄_1 + (1/2)R. Similarly, for each event below the unique
event (B or B'), a value is computed which is the average of its location (x_b
or x'_b) and the value x̄_1 - (1/2)R. These average values which are shown as
arrows in the diagram on the right of Figure 6.12 are averaged together
with the values (x_s) for events observed to be simultaneous with the
unique event. This gives the second approximation x̄_2. If the unique event
occurs in more than one section, the preceding calculation is performed for
each section and the resulting values of x̄_2 are averaged.
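The two approximations can be sketched in Python as follows. The RASC distances in the example are hypothetical, as are the function and variable names; the half-range 1.96σ with σ² = 0.5 and the averaging rules are taken from the description above.

```python
from statistics import mean

HALF_RANGE = 1.96 * 0.5 ** 0.5   # (1/2)R = 1.96 * sigma with sigma^2 = 0.5

def first_approximation(above, same, below):
    """Average of the group means of the events immediately above,
    simultaneous to, and below the unique event (empty groups are skipped)."""
    return mean(mean(group) for group in (above, same, below) if group)

def second_approximation(section_events, x1, half_range=HALF_RANGE):
    """section_events: list of (relation, x) with relation in
    {'above', 'same', 'below'} and x the cumulative RASC distance."""
    low, high = x1 - half_range, x1 + half_range
    values = []
    for relation, x in section_events:
        if not low <= x <= high:
            continue                        # outside the range R
        if relation == "above":
            values.append((x + high) / 2)   # average with x1 + (1/2)R
        elif relation == "below":
            values.append((x + low) / 2)    # average with x1 - (1/2)R
        else:
            values.append(x)                # simultaneous events enter as is
    return mean(values)

# Hypothetical distances for A, S, B, B' and B" as in the layout of Fig. 6.12.
events = [("above", 1.0), ("same", 1.5), ("below", 2.0),
          ("below", 2.6), ("below", 3.5)]
x1 = first_approximation([1.0], [1.5], [2.0])
x2 = second_approximation(events, x1)
```

With these numbers the last event falls outside the range and is ignored, in the same way that B" is ignored in the simplified example.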
The choice of a range R in the method of Figure 6.12 is somewhat
arbitrary. However, the location of the second approximation x̄_2 is
independent of R when the number of events within the interval
x̄_1 ± (1/2)R remains constant. Although the unique event option
generally is used for the construction of biozonations from the scaled
optimum sequence, it also can be used in association with an optimum
sequence obtained by ranking. In that situation, the sequence numbers for
events in the optimum sequence are used as x-values and R is set equal to
a larger value (e.g. R = 3.0).
Examples of using the unique event option to include index fossils in
biozonations are given elsewhere (see e.g. Fig. 6.2). The following example
illustrates the concept of re-including an event that initially was excluded.
Event 6 in the Hay example occurs in 4 sections only. By setting the
threshold parameter k_c equal to 5, it can be excluded from the
computations required for ranking and scaling. Table 6.19 shows
optimum sequences obtained from 8 events with later re-insertion of event
6. In both sequences, event 6 is positioned between events 8 and 4 (but
closer to 4 than to 8) as in the results previously obtained for the Hay
example.
TABLE 6.19
Test of unique event option applied to Hay example. Event 6 which occurs in 4 sections only was
excluded by setting k_c = 5. Later it was re-inserted in the optimum sequence derived by ranking as well
as in the scaled optimum sequence.
k= I
N
INk!)
(6.18)
other sections, then a conservative method would place the upper limit of
the range for taxon 1 above those of taxa 2 and 3. On the other hand, this
point would fall below those of taxa 2 and 3 when the average location of
E1 is determined. From a statistical point of view, the estimation of an
average exit is more satisfactory because the position of the endpoint is
more susceptible to random fluctuations. Moreover, the average value is
more robust if events are locally out of place due to anomalous
circumstances such as sediment mixing or misidentification. In the RASC
computer program, individual sections can be compared to the “standard”
which consists of a set of average distance values along the linear scale L
(normality test; also see Gradstein, 1984).
Fig. 6.13 Probability of a tie as a function of “distance” (δ) between mean positions of events along linear
distance scale (after Glenn and David, 1960).
the RASC model. As a first step for calculating average distances between
events along this scale, the observed “cross-over” frequencies (P) are
converted to Z-values according to the transformation Φ⁻¹(P) = Z. This is
the inverse of P = Φ(Z) where Φ denotes the fractile (cumulative
frequency) of a normal distribution in standard form. The model without
ties can be extended to the model with ties as follows.
Suppose that the random variable D represents “distance” along the
linear scale L between two events in a single section. D is assumed to have
unit variance and its average value is δ. Glenn and David (1960) have
introduced a threshold parameter τ. A tie of the two events is assumed to
occur when D is less than τ and greater than -τ. The probability of a tie
(P_3) then depends on both τ and the mean distance (δ) between the two
events considered. This relationship is illustrated in Figure 6.13 for τ = 0.2
and τ = 0.4.
It is readily shown that Glenn and David’s model results in the
following three probabilities for A_1, A_2 and A_3:
$$P_1 = \Phi(\delta - \tau)$$
(6.19)
Consequently,
$$P_1 + P_3 = \Phi(\delta + \tau)$$
(6.20)
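The three probabilities follow directly from D being normal with mean δ and unit variance. The sketch below is an illustration under that stated model; the function name is hypothetical, and labelling the reversed order as the second outcome is an assumption.

```python
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal cumulative distribution

def trinomial_probabilities(delta, tau):
    """Return (P1, P2, P3) for mean distance delta and tie threshold tau.

    P1: D > tau (events in the expected order)
    P2: D < -tau (reversed order; this labelling is assumed)
    P3: -tau < D < tau (tie)
    """
    p1 = PHI(delta - tau)
    p2 = PHI(-delta - tau)
    p3 = PHI(delta + tau) - PHI(delta - tau)
    return p1, p2, p3

# Tie probability at zero mean distance for the two thresholds of Fig. 6.13.
for tau in (0.2, 0.4):
    print(tau, round(trinomial_probabilities(0.0, tau)[2], 3))
```

The tie probability is largest at δ = 0 and decreases as the two events move apart, which is the pattern plotted in Figure 6.13.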
This indicates that δ and τ can be estimated from P_1, P_2 and P_3. A set
of observed frequencies using the format (F, T/R) are shown in
Table 6.20A. This is the Hay example as used in Agterberg and Nel
(1982b) and Agterberg (1984). It is convenient to define
TABLE 6.20
Example of 10 biostratigraphic events forming optimum sequence as in Agterberg and Nel (1982b, Table
7, p. 74). A. Numbers F, T/R are for pairwise comparison using a trinomial model. If rows are labelled
by the index i and columns by j, then F denotes the number of times E_j follows E_i in the sequences, T
represents number of ties, and R is number of times E_i and E_j were observed in the same section.
Example: the first entry of the second column (4, 2/7) indicates that event 1 follows event 2 four times
while the two events were observed to be coeval in two sections. Because R = 7, this implies that event 1
precedes event 2 in SEQ file for one section. B. Matrix consisting of elements A = (F + T)/R
corresponding to Table 6.20A.
A
2 1 3 5 7 4 6 8 9 10
B
2 1 3 5 7 4 6 9 10
$$A_{ij} = (F_{ij} + T_{ij})/R_{ij}; \qquad A_{ji} = (F_{ji} + T_{ij})/R_{ij}$$
(6.21)
These values are shown in matrix form in Table 6.20B, with A_ij in the
upper triangle and A_ji in the lower triangle. The transformation Φ⁻¹(A_ij)
(6.22)
The d-values computed from the values of Table 6.21A are shown in
the upper triangle of Table 6.21B and the t-values in its lower triangle.
The d-values can be treated in exactly the same way as the Z-values were
treated in scaling for obtaining average distances between events along
the linear scale L. Each of the t-values can be regarded as an estimate of τ.
A frequency distribution of the 32 observed t-values of Table 6.21B is
shown in Table 6.22. Their average amounts to t̄ = 0.4520 which seems to
be a fairly precise estimate of τ. (The standard deviation of t̄ is 0.046.)
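A worked illustration of how d and t can be obtained for one pair of events is given below. Since A_ij estimates Φ(δ + τ) and A_ji estimates Φ(τ - δ), half the difference and half the sum of the two transformed values give d and t. This is a sketch derived from the model as stated, not a transcription of Eq. (6.22), and the function name is hypothetical. The counts are those of the example in the caption of Table 6.20 (F = 4, T = 2, R = 7).

```python
from statistics import NormalDist

INV_PHI = NormalDist().inv_cdf   # inverse of the standard normal distribution

def distance_and_tie_threshold(f_ij, f_ji, ties):
    """Estimate (d, t) from counts of successions and ties for one pair.

    f_ij : number of sections in which event j follows event i
    f_ji : number of sections with the reverse order
    ties : number of sections in which the two events are coeval
    """
    r = f_ij + f_ji + ties
    a_ij = (f_ij + ties) / r
    a_ji = (f_ji + ties) / r
    if not (0.0 < a_ij < 1.0 and 0.0 < a_ji < 1.0):
        raise ValueError("A = 0 or 1: the transformed value is undetermined")
    z_ij, z_ji = INV_PHI(a_ij), INV_PHI(a_ji)
    d = 0.5 * (z_ij - z_ji)      # estimate of the mean distance
    t = 0.5 * (z_ij + z_ji)      # estimate of the tie threshold
    return d, t

d, t = distance_and_tie_threshold(4, 1, 2)
print(round(d, 3), round(t, 3))
```

Pairs with A equal to 0 or 1 cannot be transformed directly, which is why a threshold value is substituted for them in the tables.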
Glenn and David (1960) have shown that the preceding simple
averaging method does not result in a least squares solution of τ and the
distances between events. They proposed a modified model replacing the
Gaussian curves along the distance scale L by cosine curves. Then the
preceding expressions for d and t represent the least squares solution when
Φ⁻¹(A_ij) is replaced by arcsin(2A_ij - 1). Application of the arcsin
transformation to the values of Table 6.20B yields Table 6.21C instead of
Table 6.21A. Table 6.21D was derived from Table 6.21C in the same way
as Table 6.21B from Table 6.21A and also can be used for estimating τ and
the distances. The modified average value now amounts to t̄ = 0.4080 as
shown in Table 6.22.
A more elaborate test of the preceding version of Glenn and David's
model consisted of its application to 48 events each occurring at least
5 times in the set of 18 wells used by Gradstein (1984). First an optimum
sequence was obtained (probabilistic ranking followed by modified Hay
method with m_c1 = 1). This sequence was split into two segments
consisting of 21 and 27 events, respectively. τ was estimated separately by
the two methods (Gaussian Model and Cosine Model) for these two groups
which contain 75 (Group 1 in Table 6.23) and 173 (Group 3) individual
t-values, respectively. Group 2 in Table 6.23 is for 39 t-values arising from
comparison of events in Group 1 to events in Group 3. The average values
of t (Gaussian Model) are 0.2419, 0.1914 and 0.2179, respectively. These
TABLE 6.21
A. Values Φ⁻¹(A) corresponding to Table 6.20B. Values for samples with R = 2 were not used and are
written as x. Values corresponding to 1 and 0 are written as a and -a, respectively. For some subsequent
calculations a was set equal to q_c = 1.645. B. Values d (in upper triangle) and t (in lower triangle)
obtained by Eq. (6.22). The values aa and aaa are undetermined. C. Same as Table 6.21A except that
the transformation arcsin(2A-1) was used. For some subsequent calculations a was set equal to
q_c = 1.571 (instead of 1.645). D. Same as Table 6.21B except that the transformation arcsin(2A-1) was
used.
2 I 3 5 7 4 6 8 9 10
8 -a -a -a -a -0.644 -0.524 X X a a
9 -a -a -a -a -a -0.796 -a -a X 0.000
10 -0.644 -0.644 -0.524 -0.340 -0.644 -0.20 1 X -0.340 0.000 X
values are not significantly different from each other at the 5 percent level
of significance when analysis of variance is applied. This demonstrates
TABLE 6.22
Frequency distribution of t-values shown in lower triangles of Tables 6.21B and 6.21D. G.M. denotes
Gaussian Model; C.M. - Cosine Model; N - sample size; S.D. - Standard Deviation.
0.000            4
0.001 - 0.200    1
0.201 - 0.400    5
0.401 - 0.600   11
0.601 - 0.800    5
0.801 - 1.000    6
N               32      32
Mean         0.4520  0.4080
S.D.         0.2603  0.2489
S.D./√N      0.0460  0.0440
TABLE 6.23
Glenn and David’s trinomial model applied to 48 exits of Cenozoic Foraminifera observed in 18 wells on
northwestern Atlantic Margin. Abbreviations as in Table 6.22. Groups resulted from splitting the
optimum sequence after 21 events. Group 1 (see Table 6.24 for original data) is for pairwise comparisons
of events belonging to first 21 events, Group 3 is same for last 27 events, and Group 2 is for comparison of
events of Group 1 to events of Group 3.
                 Group 1          Group 2          Group 3
                 G.M.    C.M.     G.M.    C.M.     G.M.    C.M.
0.000            28      28       20      20       73      73
0.001 - 0.200    5       14       1       2        13      21
0.201 - 0.400    22      14       14      8        47      43
0.401 - 0.600    11      11       8       6        27      25
0.601 - 0.800    6       6        3       3        10      10
0.801 - 1.000    1       1        0       0        2       2
1.001 - 1.200    1       0        0       0        1       0
N                75      75       39      39       173     173
Mean             0.2419  0.2242   0.1914  0.1854   0.2179  0.2008
S.D.             0.2402  0.2321   0.2196  0.2249   0.2288  0.2204
S.D./√N          0.0277  0.0268   0.0352  0.0360   0.0174  0.0168
[Dendrogram not reproduced. Each row lists event number, interfossil distance, and taxon name.]
Fig. 6.14 Dendrogram for distances between successive events estimated by Glenn and David's trinomial
model assuming Gaussian probability curves for events. Each event (except the last one) is followed by
estimate of distance connecting it to the event immediately below it. These distances were plotted
toward the left and clustered.
[Dendrogram not reproduced. Each row lists event number, interfossil distance, and taxon name.]
Fig. 6.15 Same as Fig. 6.14 except that cosine-shaped probability curves (instead of Gaussian curves)
were assumed for events. Note that differences between patterns of Figs. 6.14 and 6.15 are small,
indicating that choice of shape of probability curves for events probably is not of critical importance.
TABLE 6.24
Estimation of probabilities (P_f and P_t) and frequencies (f_e and t_e) corresponding to observed successions (f)
and ties (t). Trinomial model was applied to first 21 events (Group 1) of optimum sequence for 48 exits of
Cenozoic Foraminifera also used in Table 6.23 and Fig. 6.14. Last columns show estimated values for
scores (s) based on modified binomial model using RASC weighted distance analysis. See text for
explanations of other column headings. Event numbers of column 1 are explained in Fig. 6.14 (from
Agterberg, 1984).
10-17 2.215 0.320.53 2.70.180.9 1.0 0.190.65 3.3 I 8 20 7,5115 0.24 0.50 7.' 0.lY 2.8 Y.I 0.10 0.62 Y.l
10-16 1,116 0.47 0.59 1.5 0.17 1.0 1.5 0.94 0.83 5.0 18-11 10,31 I6 0.41 0.57 9.1 0.18 2.X I1.I 0.14 0.71 11.1
10-17 2.111 0.76 0.70 2.1 0.15 0.4 2.5 1.41 0.92 2.8 I 8 26 7,018 0.12 0.01 4.9 0.16 1.1 7.0 0.68 c1.7> b.'i
17~16 3,116 0.15 0.46 2.8 0.19 1.1 4.5 0.54 0.71 4.2 18-10 6.118 0.96 0.77 6.1 0.12 1.0 6.5 1.01 0.8'4 6 8
17-71 2.011 0.43 0.58 1.7 0.18 0.5 2.0 1.01 0.85 2.1 18 2 6 10.1112 I .02 0.78 9.4 0.12 1.4 10.I 1.16 0.88 10.1
17-18 6.117 0.62 0.61 4 . 5 0.16 1.1 6.5 1.19 0.88 6.2 18 2 I 11.0112 1.14 0.82 9.8 0.10 1.2 11.0 1.41 <>.Y2 11.1
17-20 6.117 C.86 0 . 7 1 5.1 0.13 0.9 6.5 1 - 9 9 0.91 6.5
17-15 6.117 1.03 0.79 1.5 0.11 0.8 6.1 1.71 0.96 6.7 2011 8.2114 0.17 0.47 6.h'O.IY 2.6 9 0 0.2I 0.60 X.1
65 16 4,115 -0.09 0 . 3 7 1.9 0.19 1.0 4.5 0 . 8 4 0.80 9.0 2026 1.118 0.27 0.11 4.1 0.18 1.1 5.5 0.18 Oh5 1.2
77-228 1,214 0.00 0.40 1.6 0.19 0.8 2.0 0.00 0.50 2.0 2070 6.017 0.72 0.68 4 . 8 0.15 1.0 0.0 0.71 0.76 1.1
228-22 2.013 0.58 0.63 1.9 0.16 0.5 2.0 1.21 0.89 2.7 2024 9.1111 0.77 0.70 7.7 0.14 1.6 9.5 0.86 L.81 8.Y
2025 9;1/10 0.YO 0.74 7.4 0.11 1.3 9.3 1.11 0.87 8.7
16-22 3,011 0.230.10 2.10.190.9 3.0 0.01 0.51 2.5 2027 5,017 0.95 0.76 1.3 0.12 0.9 1.0 1.13 0.87 6.1
16-67 3,216 0.140.46 2.80.19 /.I 2.0-0.320.89 2.9
16-71 1,116 0.28 0.12 1.1 0.18 1.1 l . > 0.49 0.69 4.1 I526 4,018 0.10 0.45 3.6 0.19 1.3 l.O 0,lb 0.16 4 4
16-18 Il.1llb 0.47 0.59 9.4 0.17 2 . 8 12.1 0 . 6 4 0.74 11.8 I17U 4.318 0.11 0.62 1.0 0.16 1.1 I.5 0.47 0.68 5.5
16-20 11.2114 0.71 0.68 9 . 1 0.15 2.1 12.0 0.95 0 . 8 3 11.6 I12u 9.1112 0.60 0.64 7.7 0.18 2.1 9.5 q.62 0.73 8.8
16-15 11,4115 0.88 0.74 1 1 . 1 0.11 2.0 11.0 1.18 0 . 8 8 13.2 1525 10.1112 0.73 0.69 8.2 0.15 1.8 10.5 0.8Y 0.81 9.8
16-26 7,018 0.980.77 6.2O.iZ 1.0 7.0 1.310.91 7.3 1527 5.017 0.78 0.71 4.9 0.14 1.0 1.0 0.89 0.81 1.7
1581 4.116 1.13 0.81 4.9 0.19 0.6 4.5 1.30 0.90 1.b
22-71 2.111 0.52 0.61 1.80.170.5 2.) 0.48 0.68 2.0
22-21 2,113 0.840.13 2.20.130.4 2.5 ..800.79 2.4 26 24 5,117 0.10 0.60 4.2 0.17 1.2 5.1 0.08 0.69 '4.8
22-18 1,011 0.700.68 1.4 0 . 1 5 0 . 8 1.0 0.610.74 1.7 26 25 I,014 0.63 0.65 2.6 0.16 0.6 1.0 0.71 0.77 11
22-20 4,115 0.940.76 1.80.120.6 4.5 0.930.82 4.1 26 27 Lli5 0.68 0.61 3.4 0.15 11.8 1.5 0.75 0.77 1.,J
22-15 4.015 1 . 1 1 0.81 4.00.100.5 4.0 1.170.88 4.4 70 2'4 4,016 0.05 0.43 2.6 0.19 1.1 4.0 0.15 0.56 1.4
61-21 U,ll5 0.710.70 3.50.150.7 4.5 0.840.80 4.0 70 25 6,118 0.18 0.48 3 . 8 0.19 I 1 6.1 0.42 0.66 5.1
67-18 1.116 0.61 0.64 1.9 0.16 1.0 5.1 0.68 0.71 4.1 70 27 2.011 0.21 0 . 5 0 1.1 0.19 0.6 2.0 0.U2 0.66 2.0
71-21 1,113 0.3) 0.54 .1.6 0.18 0.5 1.1 0.32 0.63 1.9 70 81 2,011 0.18 0 . 6 3 1.9 0.16 0.1 2.0 0.81 0.80 2.Q
71-18 2.216 0 . 1 8 0 . @ 8 2.90.19 1 . 1 3.C 0.150.56 3.4 70 11 3,014 0.81 0.72 2.9 ,O.I4 11.6 3.0 1.02 rr.81 l.b
71-20 i.I/b 0.410.17 3.40.18 1.1 4.5 0.460.68 4.1 24 2 5 6,119 0.12 0.41 4.1 0.19 1.7 6.5 0.26 0.66 3.4
71-15 4.216 0.600.64 1 . 8 0 . 1 6 1.0 5.0 0.700.76 4.5 24 27 4,016 0.18 0.48 2 . 9 0.19 1.1 '1.0 0.27 0.61 1.6
71-26 4.015 0.700.68 1.4 0 . 1 5 0 . 8 4.0 0.840.80 4.0 2'1 81 3,014 0.11 0.61 2.4 0.17 0.7 3.0 0.83 0.80 3.2
71-27 2.113 1.18 0.87 2.60.080.2 2.3 1.580.94 2.8
2127 1,015 0.06 0.41 2.1 0.19 1.0 3.0 0.'10 0.50 2.1
2148 1.1111 - 0 . 1 50.15 3.8 0.19 2.1 1.5 -0.17 0.43 4.8 2331 7.119 0.61 0.61 1.9 0.16 1.4 7.1 0.60 0.71 6.)
2120 1.619 0.10 0.44 4.0 0.19 l.'7 6.0 0.14 0.15 5.0 2781 2.011 0.31 0.14 1.6 0.18 0.1 2.0 0.41 5.66 2.0
11-15 9.2110 0.27 0.11 5 . 1 0.18 1.8 5.0 0.38 0.65 6.5 2711 5,017 0.18 0.63 4.4 0.16 1.1 1.0 0.60 0.73 5.1
21-26 2,014 0.170.55 2.20.180.7 2.0 0.120.70 2.8 2782 2,011 0.61 0.61 1.9 0.16 0.5 2.0 0.68 0.71 2.1
21-70 >,I16 0.820.72 4.10.14 0.8 5.5 0.810.80 4.8 81 11 l.Oi5 0.21 0.50 2.5 0.19 0.9 1.0 0.19 0.18 2.9
21-24 7.118 0.87 0.74 1.9 0.13 1.1 7.5 1.00 0.84 6.7 8182 1,114 0.27 0.51 2.0 0.19 0.7 3.1 0.27 0.61 2.4
2127 5,016 1.05 0.79 4.7 0.11 0.7 5.0 1.26 0.90 5.4 11-82 4.015 0.01 0.42 2.1 0.19 1.0 4.0 0.08 0.11 2.7
that Glenn and David's trinomial model indeed can be used for describing
the frequencies of observed ties.
TABLE 6.25
Comparison of observed and estimated frequencies for 75 pairwise comparisons of Table 6.24. First six
columns are for trinomial model and last three columns for binomial (RASC weighted scaling) model. If
model provides good fit, the U-values are approximately distributed as chi-squared with single degree of
freedom. Totals are shown in bottom line.
  Trinomial model                                  Binomial model
  Te      To    Ut       Fe       Fo    Uf         Fe       Fo      U
  9.09    13    1.69     33.31    39    0.97       45.85    45.5    0.00
 10.91    12    0.11     44.53    49    0.45       53.49    53      0.00
  9.14    12    0.89     40.30    41    0.01       45.77    47      0.03
  8.90    15    4.18     30.07    29    0.04       35.93    36.5    0.01
 10.52    10    0.02     46.72    51    0.39       54.95    56      0.02
  8.86     5    1.68     36.00    42    1.00       42.51    44.5    0.09
  8.35     6    0.66     34.28    36    0.09       39.58    39      0.01
 10.34     4    3.88     32.14    39    1.46       40.39    41      0.01
  7.15     2    3.71     22.60    29    1.81       26.29    30      0.52
 -----    --   -----    ------   ---    ----      ------   -----    ----
 83.25    79   16.83    319.95   355    6.22      384.76   392.5    0.70
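The U-values of Table 6.25 appear to be the usual chi-squared contributions (O - E)^2 / E; this reading is inferred from the printed numbers and is not stated in the caption. A minimal Python sketch under that assumption, using the trinomial tie frequencies (Te, To) of the table:

```python
# Expected (Te) and observed (To) tie frequencies copied from Table 6.25.
trinomial_ties = [
    (9.09, 13), (10.91, 12), (9.14, 12), (8.90, 15), (10.52, 10),
    (8.86, 5), (8.35, 6), (10.34, 4), (7.15, 2),
]


def u_value(expected, observed):
    """Chi-squared contribution of one cell (assumed form of the U-value)."""
    return (observed - expected) ** 2 / expected


u_ties = [u_value(e, o) for e, o in trinomial_ties]
total_u = sum(u_ties)

for (e, o), u in zip(trinomial_ties, u_ties):
    print(f"Te={e:6.2f}  To={o:3d}  U={u:5.2f}")
# The total agrees with the printed column total of 16.83 to within rounding.
print(f"total U = {total_u:.2f}")
```

Individual values can differ from the printed ones in the second decimal because the expected frequencies in the table are themselves rounded.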
CHAPTER 7
RANK CORRELATION AND PRECISION OF SCALED OPTIMUM
SEQUENCE
7.1 Introduction
Suppose that a number of objects has been ranked in two different
ways, e.g. by using different characteristics. One then may be interested
in the mutual agreement or disagreement of the two rankings. Rank
correlation methods are described in detail by Kendall (1975). Many
authors have applied these methods in biostratigraphy for comparing
sequences of events, e.g. as obtained by different methods, with one
another (see, for example, Brower, 1985, 1989; Harper, 1984). In the first
part of this chapter, rank correlation will be discussed in connection with
the RASC step model. Examples of application will be given. A method for
estimating the precision of the cumulative RASC distances of the scaled
optimum sequence will be presented in the second part of this chapter.
Kendall’s tau and Spearman’s rho can be written as
    \tau = \frac{2S}{n(n-1)}    (7.1)
    \rho = 1 - \frac{6\,\mathrm{SSD}}{n^3 - n}    (7.2)
where S is a total score of +1 for pairs of elements having the same order
in both series and -1 otherwise. The total number of elements is written as
n. Spearman’s rho is based on the sum of squared differences (SSD) of
rankings of the elements in the two series compared to one another.
A B C D E F G H I J
Rankingl: 7 4 3 10 6 2 9 8 1 5
Ranking2: 5 7 3 10 1 9 6 2 8 4
Listing of all 45 pairs and their scores for Kendall’s (1975) first example with 10 rank members A-J.
AB -1    BI -1    EF -1
AC +1    BJ -1    EG +1
AD +1    CD +1    EH +1
AE +1    CE -1    EI -1
AF -1    CF -1    EJ -1
AG +1    CG +1    FG -1
AH -1    CH -1    FH -1
AI -1    CI -1    FI +1
AJ +1    CJ +1    FJ -1
BC +1    DE +1    GH +1
BD +1    DF +1    GI -1
BE -1    DG +1    GJ +1
BF -1    DH +1    HI -1
BG -1    DI +1    HJ -1
BH -1    DJ +1    IJ -1
                    A    B    C    D    E    F    G    H    I    J
Ranking 1:          7    4    3   10    6    2    9    8    1    5
Ranking 2:          5    7    3   10    1    9    6    2    8    4
Differences d:      2   -3    0    0    5   -7    3    6   -7    1
Squared d^2:        4    9    0    0   25   49    9   36   49    1
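The two coefficients can be recomputed directly from the rankings listed above. The following Python sketch assumes the standard definitions tau = 2S / (n(n-1)) and rho = 1 - 6 SSD / (n^3 - n) for rankings without ties; the function name `kendall_score` is introduced here for illustration only.

```python
from itertools import combinations

# Rankings of the ten members A-J, as listed in the text.
ranking1 = [7, 4, 3, 10, 6, 2, 9, 8, 1, 5]
ranking2 = [5, 7, 3, 10, 1, 9, 6, 2, 8, 4]


def kendall_score(r1, r2):
    """Sum of +1 for pairs in the same order in both rankings, -1 otherwise."""
    score = 0
    for i, j in combinations(range(len(r1)), 2):
        same_order = (r1[i] - r1[j]) * (r2[i] - r2[j]) > 0
        score += 1 if same_order else -1
    return score


n = len(ranking1)
S = kendall_score(ranking1, ranking2)
tau = 2 * S / (n * (n - 1))

# Sum of squared rank differences and Spearman's rho.
ssd = sum((a - b) ** 2 for a, b in zip(ranking1, ranking2))
rho = 1 - 6 * ssd / (n ** 3 - n)

print(S, round(tau, 3), ssd, round(rho, 3))  # -3 -0.067 182 -0.103
```

Both coefficients are slightly negative for this example, so the two rankings show essentially no agreement.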
TABLE 7.2
Kendall’s tau and Spearman’s rho for optimum sequences of Table 6.9 correlated to underlying true
sequence consisting of integer numbers from 1 to 20.
1 2 3 4 5 6 7 8 9 10
Rankingl: 7 4 3 10 6 2 9 8 1 5
Ranking2: 5 7 3 10 1 9 6 2 8 4
Sequence2: 5 8 3 10 1 7 2 9 6 4
TABLE 7.3
  1      9     5     0     7     7
  2      6     8     0     7     7
  3      3     3     2     2     4
  4      2    10     2     5     7
  5     10     1     3     2     5
  6      5     7     5     0     5
  7      1     2     3     1     4
  8      8     9     6     0     6
  9      7     6     3     0     3
 10      4     4     0     0     0
Sum =               24    24    48
    \tau = 1 - \frac{2\,pp}{n(n-1)}    (7.3)
This equation, for example, can be used to evaluate the relative strength of
correlation of each of the three series in the previous example of Table
6.11. It already was pointed out that the total numbers of penalty points
(pp) amount to 22, 33 and 28 for the situations of Tables 6.11A, B and C,
respectively. Because n = 20, it follows from Equation (7.3) that the
corresponding tau-values are 0.884, 0.826 and 0.853.
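The three tau-values quoted above are reproduced by the relation tau = 1 - 2 pp / (n(n-1)), in which every discordant pair contributes two penalty points (one to each event of the pair). A short Python check; the function name is hypothetical:

```python
def tau_from_penalty_points(pp, n):
    """Kendall's tau from the total number of penalty points pp for n events."""
    return 1 - 2 * pp / (n * (n - 1))


# Penalty point totals quoted for Tables 6.11A, B and C (n = 20).
for pp in (22, 33, 28):
    print(pp, round(tau_from_penalty_points(pp, 20), 3))
# 22 0.884
# 33 0.826
# 28 0.853
```

Applied to the total of 48 penalty points in Table 7.3 (n = 10), the same function returns -0.067.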
TABLE 7.4
Kendall’s tau for 25 sequences of Table 4.15 correlated to scaled optimum sequence of Fig. 6.10. Values
probably different from zero are marked by one (α = 0.05) and two (α = 0.01) asterisks, respectively.
Table 7.4 shows another example of application. The 25 original
input sequences of Table 4.15 (cf. Sections 4.9 and 6.5) were correlated to
the scaled optimum sequence extracted from this dataset after final
reordering (see Fig. 6.10). All tau-values for rank correlation in Table 7.4
are positive but the differences between values are relatively large. The
smallest tau-value is 0.03 and the largest one is 0.61. Values that differ
significantly from 0 are marked by asterisks in Table 7.4. A single
asterisk indicates that a value exceeds the threshold value for level of
significance equal to α = 0.05; two asterisks mean that the significance
level for α = 0.01 is exceeded as well. Most computer programs for rank
correlation provide statistics for testing the significance of Kendall’s tau
and Spearman’s rho (also see Kendall, 1975, Chapter 4). It can be shown
that S in Equation (7.1) has variance equal to
    \operatorname{var} S = n(n-1)(2n+5)/18    (7.4)
(7.5)
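The variance of S in Equation (7.4) leads to a simple large-sample significance test. The sketch below assumes a normal approximation without continuity correction and without ties; this choice is an assumption of the example and not necessarily the test statistic used in the text. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def kendall_z(score, n):
    """Standardized Kendall score S, using var S = n(n-1)(2n+5)/18."""
    var_s = n * (n - 1) * (2 * n + 5) / 18
    return score / math.sqrt(var_s)


def two_sided_p(z):
    """Two-sided tail probability of a standard normal deviate."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Kendall's first example: S = -3 for n = 10, so var S = 125.
z = kendall_z(-3, 10)
print(round(z, 3), round(two_sided_p(z), 2))
```

For small n, exact tables of the distribution of S are preferable to the normal approximation.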
(3) the probability (p3) that two adjacent horizons correspond to the same
time interval.
Harper conducted 3 experiments (A, B and C) of which the parameters are
shown in Table 7.5. For each sample site, nt sets of stratigraphic
succession data were obtained, with nt representing the number of
iterations. Run, sample site, and sequence data were sent to the RASC
computer program in order to obtain three types of optimum sequences:
(a) probabilistic ranking (presorting only); (b) modified Hay method only;
and (c) scaled optimum sequence as derived from (b). The threshold
parameters employed are shown in Table 7.5. Harper (1984, Fig. 4-6)
compared experimentally obtained optimum sequences with the “true”
optimum sequence on the range chart by using Kendall’s rank correlation
coefficients. In total, 1950 tau-values were calculated, one for each
comparison; all turned out to be relatively close to +1, and significantly
greater than zero. This signifies that all rankings were good. However, by
comparing methods with one another, and looking at small differences
between average tau-values, it can be determined which one of a pair of
techniques is better. Average differences between tau-values for
comparing presorting with the modified Hay method are shown in the
bottom four rows of Table 7.5. Each of the values shown is the average of
50 differences between tau-values, except the two values in the last
column which were based on 100 differences; n.o. indicates that an average
for 100 runs was not obtained for Run C. A negative value signifies that
the modified Hay method gave poorer rankings than presorting. Except
for Run B (first run), the negative values are significantly different from
zero as determined by Student’s t-test (Harper, 1984, Tables 2-7). The
results for exits and entries are similar as can be expected, and the first
two values in the last two columns also duplicate one another.
TABLE 7.5
                                      A             B             C
Number of sites: ns                   22            16            6
Probability of presence: p1           0.20          0.20          0.10
Sampling probability: p2              0.55          0.80          ax5
Adjacency probability: p3             0.10          0.10          0.20
Number of datasets: nt                50 (or 100)   50 (or 100)   50
Minimum number of sites: kc           5             7             3
Minimum number of pairs: mc           4             5             3
Ratio: kc/mc                          1.25          1.40          1.00
Average difference between tau-values:
  exits                               -0.013        -0.003        -0.022
  entries                             -0.014        -0.003        -0.020
  both                                -0.004        -0.001        -0.007
  both (100)                          -0.005        -0.000        n.o.
Agterberg and Nel (1983a) and routinely has been used in RASC runs
after 1980. The results of presorting are independent of the choice of the
threshold parameters mc1 and mc2 which apply to the modified Hay
method and scaling, respectively. As a result of Harper’s experiments, the
RASC program was modified in 1983 to allow the choice of separate
threshold parameters for these two techniques. Before then, all runs
including those performed by Harper had mc1 = mc2.
Application of the modified Hay method after probabilistic ranking
can be regarded as a fine-tuning operation in situations when there are
many missing data. The presorting could yield poor results when many
frequencies are undetermined. Then it should be useful to compare the
ranking of each event with all others in order to find the optimum
permutation as is done in the modified Hay method. Ideally, the threshold
parameter mc1 should be set equal to 1 so that all frequencies are
considered. However, a decrease in mc1 frequently corresponds to an
increase in number of cycles (inconsistencies involving 3 or more events).
It then is necessary to use a value greater than 1 in order to reduce the
number of iterations.
Harper (1984) also found negative differences between tau-values
when the modified optimum sequence resulting from scaling was
compared to the optimum sequence resulting from the modified Hay
method only. However, the lower tau-values in this instance may have
been caused by the fact that Harper (1984, p. 16) regarded as tied
successive events which were less than 0.5 apart along the RASC scale. A
modified formula for estimating Kendall’s rank correlation coefficient was
used to accommodate tied events. On average, events preceding other
events along the RASC scale occur before those other events on the range
chart as well, even when distances between successive events are small.
Scoring them as tied, therefore, results in a somewhat smaller tau-value.
This may explain why the optimum sequence from the modified Hay
method, in which no ties were allowed, yielded somewhat higher
tau-values.
Finally, Harper (1984) showed that exits and entries, run separately,
gave somewhat higher tau-values than when both were mixed together.
This was to be expected (also see Edwards and Beaver, 1978) because, on
the average, exits will be moved downward, and entries upward, with
respect to their relative positions on the range chart when stratigraphic
succession data for sample sites are generated using probabilities of
occurrence (p1, p2 and p3). If exits or entries are considered on their own,
this bias will not show up. However, if they are mixed, some exits will
probably assume final positions, in any type of optimum sequence, below
entries of other taxa which occur above them on the range chart. Although
smaller tau-values are to be expected for sequences of mixed entries and
exits, these differences were almost negligibly small in the results of
Harper’s experiments. Harper’s experiments were limited to a single type
of artificial dataset. It may be expected that different specific conclusions
would result from other datasets. Nevertheless, the preceding discussions
illustrated that valid generalizations can be derived from computer
simulation experiments.
TABLE 7.6
Matrix of Z-values of computer simulation experiment of Table 4.15, Fig. 6.10 and Table 7.4. The 20 events in 25 sequences have expected
values which are closely spaced (at 0.1 intervals) along the RASC scale. The column averages provide estimates of these mean positions
(variant of unweighted scaling method; see text for further explanation). Successive values within any column are stochastically independent
because they deviate randomly from their mean values. The latter are for distances from the mean position of the event labelling the column.
The standard deviation of the column average, therefore, can be estimated, e.g. by the jackknife method, without distortion by autocorrelation
effects. This property is preserved when the jackknife method is applied to unweighted or weighted distance estimation as in the RASC
computer program.
4 3 1 5 6 2 7 9 11 16 14 10 12 15 13 17 18 19 20
8 --
4 x - 151 0 253 0 151 0.253 0.468 0 253 0.468 0468 1175 0 842 0.706 0706 0842 0 842 0994 1405 1405 1405 0 994
3 0 151 x 0.151 - 253 0.151 0.151 0 358 0.253 0.253 0.253 0 842 0.842 0.706 0 842 0.994 0583 1405 0842 1175 1405
I - 253 151 -253 -.050 0 151 0.151 0468 0358 0.106 0 583 0.468 0706 0583 0.842 0994 0 994 0 842 1751 1405
5 - 151 0 253 0.253 I 0 050 0 358 0.253 0.468 0 583 0.468 0 253 0358 0.468 0.583 0 583 0583 0842 0842 1175 1175
6 - 253 -.151 0.050 - 050 - 151 0 151 0 151 0.358 0253 0 994 0583 0.583 0706 0.106 0706 1405 1405 0 994 1175
2 -468 - 151 ~.151 - 358 0.151 x -.050 0.253 0.151 0 358 0 468 0.706 0.583 0.358 0.583 0842 0706 0994 0 994 1405
7 ~ 253 -.358 -.I51 453 -.I51 0.050 ,050 0.358 0050 0 358 0.583 0.706 0.468 0.253 0583 0842 0706 0 994 1175
8 468 - 253 -.468 ~.468 -.I51 -253 0 050 x 0.358 0.253 0 253 0.253 0.358 0.468 0.358 0468 0706 0706 0 842 1405
9 -468 - 253 - 358 - 583 -.358 -.I51 - 358 - 358 x -.358 0 050 0 151 0.358 0.253 0 253 0583 0358 0583 0 468 0 842
II - 1 18 253 - 106 -.468 -.253 - 358 -.050 -253 0.358 x 0 151 -.050 -.050 0.253 0.468 0358 0583 0253 0 468 0 583
16 -842 - 842 - 583 2.53 -.994 -.468 -.358 2.53 -050 - 151 x 253 0.253 0.358 0.358 050 0842 0706 0 706 0 583
I4 - 706 - 842 -.468 -.358 -.583 -.’I06 -.583 -253 - 151 0050 0 253 x -.050 -.050 0.050 0253 0468 0583 0 706 0 994
10 706 - 706 - 706 -.468 -.583 -.583 - 706 -.358 -.358 0.050 - 253 0.050 x 0.151 0.253 0358 0583 0358 0 151 0 706
12 -.a42 -.842 583 -.583 -.706 - 358 - 468 -.468 253 -253 358 0.050 -151 x 0.358 0050 0151 0 151 0 468 0 994
15 - 842 994 442 - 583 - 706 - 583 253 - 358 -253 -.468 358 -.050 -.253 -.358 x 0253 0050 0 253 0 253 0 994
13 -.994 - 583 - 994 - 583 - 706 - 842 -.583 -468 - 583 - 358 0 050 -253 -358 -.050 -253 x 0 151 0253 0 050 0 706
17 -1 41 -1 41 - 994 - 842 - 1 41 -.I06 - 842 -.I06 -.358 -.583 842 -468 ~583 -151 -.050 151 x 0 151 0 358 0 706
18 -1 41 - 842 842 - 842 ~1 41 994 - 706 -.I06 - 583 - 253 706 583 -358 - 151 -.253 253 151 x 0 050 0 706
19 -141 -1 18 - I 75 - 1 18 994 -.994 ~ 994 -842 -.468 -468 - 706 -706 - 151 -468 - 253 050 358 050 0 468
20 - 994 -1 41 -1 41 -1 18 - 1 18 -1 41 -1 18 -1 41 - 842 -.583 - 583 -.994 -706 -994 - 994 -706 706 706 - 468 x
Ave -709 -584 -528 ~473 ~482 -.345 -.277 -.184 -005 0093 0.128 0 140 0.215 0260 0318 0.381 0.599 0591 0.707 0.971
TABLE 7.7
Comparison of four scaling methods applied to example of Table 7.6. Ave represents column average of
Table 7.6 after addition of 0.709 (= minus first column average). X, and X are RASC computer program
unweighted and weighted scaling results. E(X) represents true mean value which is multiple of 0.0707.
Q and s(Q) are jackknife estimate and jackknife standard deviation using RASC weighted scaling
method. t(X) is studentized deviation of X from true mean value. Penalty points (pp) for event numbers
of column 1 are shown in last column.
TABLE 7.8
Comparison of differences between successive values for example of Table 7.7. D and s(D) are intervals
and their standard deviations estimated by weighted scaling in RASC computer program. D1 and s(D1)
are corresponding jackknife estimates.
TABLE 7.9
Jackknife method applied to computer simulation experiment of Table 4.13 and Fig. 6.8. The 20 events
in 25 sequences have expected values E(X) spaced at intervals which are 5 times wider than those used in
the previous example of Tables 7.6 to 7.8. X, E(X), Q and s(Q) as in Table 7.7. The weighted distance
results X and Q were based on N* and N differences between successive Z-values, respectively. t(Y) is
studentized deviation of Y = X - E(X) + 0.559.
I 0 000 0 000 0 000 0 000 I 0 559 ***
3 0 492 7 0 707 0 5.10 0 063 8 0 343 5 439**
TABLE 7.10
Jackknife method applied to Hay example. X, Q and s(Q) are weighted scaling results for cumulative
RASC distance, its jackknife estimate and jackknife standard deviation, respectively.
asterisks it is shown that some values of s(Q), especially those near the top
of Table 7.9, are too small. Although this indicates that, locally, there are
statistically significant discrepancies between X and E(X), these
differences are rather small in relative terms. In Table 7.7 the maximum
difference between X and E(X) is 0.254 or about 16 percent of the total
range ( = 1.598) of the RASC scale. In Table 7.9, the maximum difference
is 0.897 or 13 percent of total range (=6.718). It may be concluded that, on
the whole, the jackknife method yields good estimates of the positions of
the events in the scaled optimum sequence provided that the initial
ranking was good.
Table 7.10 shows Q and s(Q) in comparison with X for the Hay
example. The six events in the lower part of the scaled optimum sequence
are not only subject to strong clustering but also have relatively large
standard deviations. Events 8, 9 and 10 clearly are above the other events
with events 8 and 10 having relatively small standard deviations. Event 6
may be intermediate between the preceding two groups. Differences
between X and Q for the Hay example are larger than those in Tables 7.7
and 7.9. More research would be needed to determine which estimate (X or
Q) is better than the other. It is known that jackknife estimators in
parametric estimation frequently are superior because bias of order n^{-1}
(i.e. inversely proportional to sample size) tends to be eliminated (see e.g.
Miller, 1974). On the other hand, this advantage may be offset by the
introduction of bias related to lack of stochastic independence of the
pseudovalues.
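The jackknife estimate Q and its standard deviation s(Q) can be illustrated with a generic leave-one-out scheme based on pseudovalues. The Python fragment below is a sketch of that general technique only; it is not the RASC implementation, and the function name `jackknife` and the sample values are hypothetical.

```python
import math
import statistics


def jackknife(values, estimator):
    """Return the jackknife estimate Q and its standard deviation s(Q)."""
    n = len(values)
    full = estimator(values)
    pseudovalues = []
    for i in range(n):
        reduced = values[:i] + values[i + 1:]  # leave observation i out
        pseudovalues.append(n * full - (n - 1) * estimator(reduced))
    q = sum(pseudovalues) / n
    var_q = sum((p - q) ** 2 for p in pseudovalues) / (n * (n - 1))
    return q, math.sqrt(var_q)


# Hypothetical column of Z-values (illustration only, not taken from a table).
column = [0.151, 0.253, 0.050, 0.358, 0.151]
q, s_q = jackknife(column, statistics.mean)
print(round(q, 4), round(s_q, 4))
```

For the arithmetic mean the pseudovalues equal the original observations, so Q is the ordinary mean and s(Q) its usual standard error. The method becomes informative for nonlinear estimators such as weighted distance estimates, where the pseudovalues need not be stochastically independent.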
CHAPTER 8
8.1 Introduction
The normality test of the RASC computer program was briefly
described in Section 6.6. In this chapter, it will be explained in more
detail. The problem of estimating the autocorrelation of the second-order
differences used in this test will be discussed first. A simple method will
be introduced by which it is possible to determine statistically whether or
not anomalous events belong to the normal distribution of the second-order
differences. For comparison with results obtained by Guex and
Davaud (1984) for a reworked bed using the Unitary Associations method,
the normality test will be applied to Drobne’s (1977) alveolinids from
Yugoslavia. The RASC computer program with normality test also will be
applied to Palmer’s (1954) data for the fauna of the Riley Formation of the
Llano Uplift in central Texas. Earlier, Shaw (1964) had constructed a
composite standard from Palmer’s database which involved the
determination and elimination of what he considered to be anomalous
events. It will be seen that the majority of the events deleted by Shaw are
not anomalous when the normality test is applied and this difference in
conclusions will be discussed.
The modified RASC method will be presented using the Gradstein-
Thomas database for example. This procedure can be used to construct
conservative range charts. Various types of range charts constructed by
different methods will be compared with one another in the last two
sections of this chapter. The modified RASC method can be very useful for
defining marker events which have variances that are much smaller than
the variances of other events. Modified RASC also provides new
information on the shapes of the frequency distributions of stratigraphic
events.
TABLE 8.1
Normality test output from the original RASC program: Comparison of the observed frequencies (Oi) of
second-order difference values in each of the ten classes i = 1, 2, ..., 10, with the expected frequencies (Ei)
which are constant for each of the ten classes.
Gradstein (1984, Table 3)   39.8   50 36 32 41 43 31 39 42 38 46
TABLE 8.2
Normality test output for ten computer simulation experiments. Observed frequencies Oi are compared
to the expected frequency (=go) for each of the ten classes i = 1, 2, ..., 10. E(D) represents the expected
interval (or RASC distance) between event-positions along the RASC-scale in these experiments.
Original RASC   O1 O2 O3 O4 O5 O6 O7 O8 O9 O10
of successive distances X_k and X_{k+1} is written as ρ with
ρ = Cov(X_k, X_{k+1})/σ². The variance of the second-order difference
satisfies
    \sigma_2^2 = 6\sigma^2 - 8\,\mathrm{Cov}(X_k, X_{k+1}) + 2\,\mathrm{Cov}(X_{k-1}, X_{k+1})    (8.1)
It follows that
    \sigma_2^2 = 2\sigma^2(\rho^2 - 4\rho + 3)    (8.2)
if Cov(X_{k-1}, X_{k+1}) = ρ²σ², so that
    \rho = 2 - \sqrt{1 + \sigma_2^2/(2\sigma^2)}    (8.3)
    \frac{1}{n'} = \frac{1}{n} + \frac{2\rho}{n^2}\left\{\frac{n}{1-\rho} - \frac{1}{(1-\rho)^2}\right\}    (8.4)
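Equations (8.2) to (8.4) can be chained numerically. The Python sketch below follows the equations as reconstructed here from the garbled print; the function names and the trial values of σ, ρ and n are assumptions made for illustration, not output of the RASC program.

```python
import math


def sigma2_squared(sigma, rho):
    """Variance of second-order differences, Eq. (8.2)."""
    return 2 * sigma ** 2 * (rho ** 2 - 4 * rho + 3)


def rho_from_sigmas(sigma2, sigma):
    """Autocorrelation coefficient recovered from sigma2 and sigma, Eq. (8.3)."""
    return 2 - math.sqrt(1 + sigma2 ** 2 / (2 * sigma ** 2))


def equivalent_n(n, rho):
    """Equivalent number n' of independent values, Eq. (8.4)."""
    inverse = 1 / n + 2 * rho * (n / (1 - rho) - 1 / (1 - rho) ** 2) / n ** 2
    return 1 / inverse


sigma, rho, n = 0.7071, 0.3, 180          # hypothetical trial values
sigma2 = math.sqrt(sigma2_squared(sigma, rho))
print(round(rho_from_sigmas(sigma2, sigma), 6))   # recovers rho
print(round(equivalent_n(n, rho), 1))             # smaller than n for rho > 0
```

With ρ = 0 the equivalent number n' equals n, and positive autocorrelation reduces it.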
This allows us to estimate n' which is part of the output of the RASC
program. In the chi-squared test for goodness of fit, expected frequencies
Ei of stochastically independent data in p classes are related to the
corresponding observed frequencies Oi by
    \sum_{i=1}^{p} \frac{(O_i - E_i)^2}{E_i} = \chi^2(p - 3)    (8.5)
if two parameters of the fitted distribution were estimated. For
autocorrelated data, the sum on the left-hand side of this equation may be
multiplied by n'/n in order to obtain an approximate estimate of
chi-squared.
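The adjustment for autocorrelation amounts to one multiplication. The following Python sketch applies it to the observed frequencies of Table 8.1 (Gradstein, 1984, Table 3); the value of n' for that run is not given in this passage, so the second call uses a purely hypothetical n' to show the effect.

```python
def adjusted_chi_squared(observed, expected, n_equivalent, n):
    """Chi-squared sum of Eq. (8.5), multiplied by n'/n for autocorrelated data."""
    raw = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return raw * n_equivalent / n


observed = [50, 36, 32, 41, 43, 31, 39, 42, 38, 46]   # Table 8.1
n = sum(observed)                                      # 398 second-order differences
expected = [n / 10] * 10                               # 39.8 per class

# Without adjustment (n' = n).
print(round(adjusted_chi_squared(observed, expected, n, n), 2))        # 7.93
# With a hypothetical n' equal to half of n (assumption, for illustration).
print(round(adjusted_chi_squared(observed, expected, n / 2, n), 2))
```

With ten classes and two estimated parameters the result is compared with chi-squared for 10 - 3 = 7 degrees of freedom.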
The 10 classes of the normality test in the RASC program (cf. Section
6.6) were constructed by dividing the expected ordered sequence of second-order
differences into 10 equal parts in order to obtain 10 equal expected
frequencies for comparison to the corresponding observed frequencies. The
class limits are given by the Z-values of the relative frequencies 0.1, 0.2,
..., 0.9 multiplied by σ̂₂. This procedure provides a convenient normality
test. The individual second-order differences (top part of normality test
output as shown in Table 6.16) were compared to the 95% and 99%
confidence intervals ±1.960 σ̂₂ and ±2.576 σ̂₂, respectively.
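The class limits and confidence bounds described above follow from the standard normal quantile function. A minimal Python sketch using only the standard library; the value of σ̂₂ passed to the functions is a placeholder, not a value from any of the tables.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def class_limits(sigma2_hat, classes=10):
    """Limits dividing a zero-mean normal distribution into equal-probability classes."""
    return [STANDARD_NORMAL.inv_cdf(k / classes) * sigma2_hat
            for k in range(1, classes)]


def confidence_bound(sigma2_hat, level):
    """Half-width of the two-sided confidence interval at the given level."""
    return STANDARD_NORMAL.inv_cdf(0.5 + level / 2) * sigma2_hat


sigma2_hat = 1.0   # placeholder value (assumption)
print([round(limit, 3) for limit in class_limits(sigma2_hat)])
print(round(confidence_bound(sigma2_hat, 0.95), 3),
      round(confidence_bound(sigma2_hat, 0.99), 3))   # 1.96 2.576
```

Each of the ten classes then has expected frequency n/10, which is what makes the comparison with the observed frequencies straightforward.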
The preceding method generally yields sets of observed frequencies Oi
(i = 2, 3, ..., 9) which are equal to one another (and to Ei) except for random
fluctuations. The frequencies (O1 and O10) in the tails of the distribution
may be too high when anomalous events occur in several of the sections.
Results of applying the revised normality test for nine databases are
shown in Table 8.3 and for six computer simulation experiments in
Table 8.4. Other statistics for most of these computer runs are given in
Tables 8.5 and 8.6.
The normal distribution model provides a good fit for 13 of the 15 tests
in Table 8.3 according to the approximate chi-squared test (see last column
of Table 8.3). The 95 and 99 percent confidence limits of χ²(7) which should
not be exceeded if the normality assumption holds true (with levels of
TABLE 8.3
Revised normality test output for the nine databases in Agterberg et al. (1985) using RASC program.
Table 4.9 is slightly improved version of database 1; Tables 4.13, 4.14 and 4.15 are same as databases 9A,
9B and 9C, respectively.
TABLE 8.4
Normality test output for six computer simulation experiments. See text for further explanation.
TABLE 8.5
Some statistics for RASC results for 9 databases of Table 8.3. The equivalent number (n') of
stochastically independent values was derived from number of second-order differences (n), standard
deviation σ̂₂ of Gaussian curve fitted to second-order differences (large values were not used, see text),
and estimated autocorrelation coefficient (ρ).
TABLE 8.6
Autocorrelation statistics for RASC runs of five computer simulation experiments. If the original values
along the RASC-scale were stochastically independent, the ratio σ̂₂/σ would be equal to 1. Note extreme
reduction from n to n' for E(D) = 0.0. The negative autocorrelation coefficients ρ₁ apply to second-order
differences (see text).
It follows that
    \rho_1 = \frac{\rho^3 - 4\rho^2 + 7\rho - 4}{2\rho^2 - 8\rho + 6}
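The ratio above can be verified numerically as the autocorrelation of two successive second-order differences. The Python check below assumes that Cov(X_i, X_j) = ρ^|i-j| σ², which extends the assumption Cov(X_{k-1}, X_{k+1}) = ρ²σ² to lag 3; the symbol ρ₁ for the result and both function names are assumptions of this sketch.

```python
def rho1_closed_form(rho):
    """Autocorrelation of successive second-order differences (closed form)."""
    return (rho ** 3 - 4 * rho ** 2 + 7 * rho - 4) / (2 * rho ** 2 - 8 * rho + 6)


def rho1_from_covariances(rho):
    """Same quantity from the weights of the differences on X_{k-1}, ..., X_{k+2}."""
    first = [1, -2, 1, 0]    # X_{k-1} - 2 X_k + X_{k+1}
    second = [0, 1, -2, 1]   # X_k - 2 X_{k+1} + X_{k+2}

    def covariance(u, v):
        # Covariance in units of sigma^2 under the assumed lag structure.
        return sum(u[i] * v[j] * rho ** abs(i - j)
                   for i in range(4) for j in range(4))

    return covariance(first, second) / covariance(first, first)


for rho in (0.0, 0.3, -0.2):
    print(rho, round(rho1_closed_form(rho), 4), round(rho1_from_covariances(rho), 4))
```

For ρ = 0 both functions return -2/3, which shows that second-order differences of independent values are negatively autocorrelated.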
(1) A. moussoulensis     (9) A. montanarii
(2) A. aramaea          (10) A. aragonensis
(3) A. solida           (11) A. dedolia
(4) A. globosa          (12) A. subpyreneica
(5) A. avellana         (13) A. laxa
(6) A. pisiformis       (14) A. guidonis
(7) A. pasticillata     (15) A. decipiens
(8) A. leupoldi
Fig. 8.1 Occurrence of 15 alveolinids (1 to 15) from Yugoslavia (data from Drobne, 1977) in 11 sections
(I to XI). SAM: Sample numbers originally used by Drobne. Successive maximal horizons are numbered
in the stratigraphically upward direction for each section (see last column). Section XI is an isolated
occurrence described on page 92 of Drobne (1977). See Table 8.7 for names of sections.
TABLE 8.7
Fig. 8.2 Drobne's (1977) original stratigraphic data for Section 11 in Fig. 8.1 (Dane near Divača). Circled
cross indicates stratum typicum of new species. Samples 7, 16, 20 and 23 are for maximal horizons (Guex
levels).
The information of Table 8.1 was converted into RASC input by replacing
each fossil number i (= 1, 2, ..., 15) by two numbers: (2i-1) for highest
occurrences and 2i for lowest occurrences, respectively. RASC was run on
the resulting data set with kc = 4, mc1 = 1 and mc2 = 2. Setting kc = 4
ensured that no events were eliminated as in the U.A. computer program.
However, it became immediately apparent that 7 of the 15 species were
observed in one bed only in the sections containing them. Because the
highest and lowest occurrences of these 7 species coincided everywhere, I
decided to maintain a single number for each of these species indicating
occurrence only. (The odd numbers for these taxa indicate coinciding
highest and lowest occurrences.) Probabilistic ranking was applied and
followed by the modified Hay method. Three cycles occurred and each of
these involved the species 3 and 4. Based on mc2 = 2, 42 out of 253 pairs of
matrix elements were zeroed for scaling. Weighted distance analysis was
applied. From the results of the normality test (see Table 8.9), it may be
concluded that species 3 (A. solida) occurs too high in Section I (because of
reworking). In Table 8.9, A. solida has event number 5 for its lowest
occurrence (LO) which coincides with its highest occurrence (see before).
Its RASC distance (= 2.660) is larger than those of its neighbors in this
section. This discrepancy was brought out by computation of the second-order
difference (= -4.390**) in Table 8.9. The two asterisks indicate that
the event is out of place with a probability of more than 99 percent.
TABLE 8.8
Final Unitary Associations (U.A.) for Drobne's alveolinids as derived by Guex and Davaud (1984); upper
part of table is range chart with ones for taxa belonging to a particular Unitary Association; lower part of
table shows in which sections the final U.A.'s were identified.
U.A.   Taxa:      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 1                0  0  0  0  0  1  0  1  1  1  1  0  0  0  1
 2                0  0  1  1  0  1  0  1  1  1  1  1  1  1  0
 3                0  0  1  1  0  1  1  0  0  0  0  0  0  0  0
 4                1  0  1  1  1  0  0  0  0  0  0  0  0  0  0
 5                1  1  0  0  0  0  0  0  0  0  0  0  0  0  0
U.A.   Sections:  1  2  3  4  5  6  7  8  9 10 11
 1                0  1  1  1  0  1  0  1  0  0  1
 2                1  1  1  1  0  1  0  0  1  1  0
 3                0  1  1  0  0  1  1  1  0  0  0
 4                0  1  1  0  0  0  1  0  0  0  0
 5                1  0  1  0  0  0  1  1  0  0  0
Explanation of numbers used for taxa: (1) A. moussoulensis; (2) A. aramaea; (3) A. solida; (4) A. globosa; (5) A.
avellana; (6) A. pisiformis; (7) A. pasticillata; (8) A. leupoldi; (9) A. montanarii; (10) A. aragonensis; (11) A. dedolia;
(12) A. subpyreneica; (13) A. laxa; (14) A. guidonis; (15) A. decipiens.
TABLE 8.9
RASC normality test output for Drobne's Fatji hrib section with reworked bed at top (events 15 and 5
representing highest occurrences of fossils 8 and 3, respectively); the second-order differences were
tested for statistical significance; events with two asterisks are out of place with a probability of 99%;
those with one asterisk with a probability of 95%.
Fig. 8.3 Comparison of RASC results to Unitary Associations for Drobne's alveolinids. Fossils were
ordered according to increasing RASC distance of their highest occurrence (HO or LAD).
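The event coding used for the alveolinid data can be written out in a few lines. The Python sketch below follows the rule stated in the text (2i-1 for the highest and 2i for the lowest occurrence of fossil i, and the odd number alone for species seen in one bed only); the function name is hypothetical, and of the 7 single-bed species only fossils 3 and 8 are identified in this passage.

```python
from math import comb


def event_codes(fossil, single_bed=False):
    """RASC event numbers for fossil i: (highest, lowest), or one code if single-bed."""
    highest = 2 * fossil - 1
    lowest = 2 * fossil
    return (highest,) if single_bed else (highest, lowest)


print(event_codes(3, single_bed=True))   # (5,)  A. solida
print(event_codes(8, single_bed=True))   # (15,) A. leupoldi
print(event_codes(1))                    # (1, 2)

# 8 species with two events each plus 7 species with a single event.
n_events = 2 * (15 - 7) + 7
print(n_events, comb(n_events, 2))       # 23 253
```

The 23 resulting events give 253 pairs, which agrees with the number of pairs of matrix elements quoted in the text.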
Figure 8.3 shows a comparison of the 5 Unitary Associations of
Table 8.8 with the scaled optimum sequence used for obtaining Table 8.9.
The highest occurrences of the 15 fossils were ordered in Figure 8.3
according to their RASC distances. Because average highest and lowest
occurrences are estimated by scaling, the distances between them on the
RASC scale are less than their true stratigraphic ranges. According to the
original scaling model, events in sections are normally distributed about
their average position with standard deviations equal to σ = 0.7071.
Consequently, the observed highest occurrence of a fossil in a section
would occur with a probability of 95 percent below its RASC value
TABLE 8.10
Alphabetic DIC file for Palmer's database. Numbers are for highest occurrences. Subtraction of one
gives code numbers for corresponding lowest occurrences. For example, 99 LO Angulotreta triangularis
is lowest occurrence corresponding to first entry (= 100) listed.
TABLE 8.11
SEQ file for 7 sections of Palmer’s database. The event code numbers are explained in Table 8.10.
MORGAN CREEK
119 -120 -123 -124 84 -100 -108 -114 82 -105 -106 101 -102 -103 -104 -113 90 -99 -107 87
-88 -92 81 42 -68 -83 -85 -86 -89 -91 69 -70 -77 -78 -79 -80 24 -65 -66 -67
-73 -74 40 8 -54 60 -64 38 -56 59 -62 63 22 23 -61 34 -49 -50 -52 -55
30 -39 -43 -44 -51 -53 19 -20 -25 -26 -27 -28 -29 -31 -32 -33 -35 -36 -37 -41
13 -14 -15 -16 -17 -18 -21 7 -10 9 5 -6
WHITE CREEK
120 113 -114 -117 -118 -121 -122 119 100 107 -108 82 -115 -116 99 92 -98 89 -90 -91
-97 45 -46 -81 24 -40 42 -56 -65 -66 -67 -68 59 -60 54 8 -36 22 33 -34
-41 -47 -48 -53 -55 -57 -58 35 27 -28 -39 21 -7.3 7 -13 -14 4 2 -3 1
JAMES RIVER
117 -118 100 82 -108 90 -97 -98 -107 81 -89 -99 24 -47 -48 -56 -68 -70 40 42
-77 -78 55 -63 -64 -65 -66 -67 -69 -71 -72 60 -61 -62 23 -59 8 -22 -30 -34
-50 29 -33 -35 -36 -39 -41 -49 7 -15 -16 -17 -18 -19 -20 -21
LION MOUNTAIN
84 -114 -118 -119 -120 82 -100 -106 -108 -112 -117 102 -104 99 -101 -103 -105 -107 -111 -113
81 -83 -87 -88 -91 -92 42 -68 -69 -70 67 7 -8 -31 -32 -34 -47 -48 -49 -50
-53 -54 -55 -56 29 -30 -33 -35 -36 -39 -40 -41 -43 -44 -45 -46
PONTOTOC
82 -100 99 107 -108 -109 -110 45 -46 -91 -92 -97 -98 83 -84 87 -88 81 41 -42
-68 67 -70 -75 -76 64 39 -40 -63 -69 22 21 -33 -34 8 -10 7 11 -12 9
6 5 3 -4
STREETER
91 82 99 -100 92 81 -89 -90 40 -41 -42 -47 -48 -67 -68 -69 -70 -77 -78 24
-61 -62 33 -34 -53 -54 22 -23 -39 16 -18 -21 15 -17 14 7 -8 -13 9 -10
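A SEQ listing of this kind can be split into stratigraphic levels with a few lines of code. The Python sketch below assumes that a negative code marks an event that is coeval with the event preceding it; that convention is not stated in this passage and is an assumption of the example. The HI/LO decoding follows Table 8.10 (even numbers for highest occurrences, the preceding odd number for the corresponding lowest occurrence). Function names are hypothetical.

```python
def parse_seq_levels(text):
    """Group event codes into levels; a negative code joins the previous level."""
    levels = []
    for token in text.split():
        code = int(token)
        if code < 0 and levels:
            levels[-1].append(-code)      # assumed: coeval with preceding event
        else:
            levels.append([abs(code)])    # starts a new level
    return levels


def describe(code):
    """Return ('HI' or 'LO', taxon index) for an event code of Table 8.10."""
    kind = "HI" if code % 2 == 0 else "LO"
    return kind, (code + 1) // 2


# Last eleven codes of the STREETER section.
tail = "16 -18 -21 15 -17 14 7 -8 -13 9 -10"
print(parse_seq_levels(tail))
# [[16, 18, 21], [15, 17], [14], [7, 8, 13], [9, 10]]
print(describe(100), describe(99))   # ('HI', 50) ('LO', 50)
```

The taxon index returned by `describe` is only a pairing device for this sketch; the taxon names themselves are listed in Table 8.10.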
Fig. 8.4 Scaled optimum sequence (RASC 5/1/3 run) for Palmer’s database for the Riley Formation in
central Texas.
Table 8.13 shows results of the overall normality test applied to the
180 second-order differences for events occurring in 5, 6 or 7 sections. The
sum of the values in the last column is 3.163. This chi-squared value is not
statistically significant indicating that if there are anomalous events in
the sections, these are rare. Table 8.14 shows RASC normality test output
for the Morgan Creek, White Creek and Pontotoc sections.
TABLE 8.12
Kendall’s rank correlation coefficients for sequences of 7 sections correlated with scaled optimum
sequence of Fig. 8.4.
Section Tau
Morgan Creek 0.86
White Creek 0.81
James River 0.79
Little Llano River 0.80
Lion Mountain 0.74
Pontotoc 0.82
Streeter 0.75
TABLE 8.13
Overall normality test applied to Palmer’s database using taxa that occur in at least 5 of the 7 sections.
No significant departures from normality are indicated.
4 18 18 0 0.000
5 16 18 -2 0.104
6 16 18 -2 0.104
7 17 18 -1 0.026
8 22 18 4 0.415
9 14 18 -4 0.415
10 18 18 0 0.000
TABLE 8.14
RASC normality test output for 3 sections in Palmer’s database. Only the lowest occurrences of
Tricrepicephalus coria and Opisthotreta depressa would be “too high” in the Pontotoc section. (Note that
both fossils occur in single beds in this section). Within the context of the entire database, these events
are not anomalous because, on the average, 4 single star events and 1 double star event are expected to
occur in every set of 100 events.
TABLE 8.14 (continued)
HI APHELASPIS WALCOTTI               82   0.2285
HI ANGULOTRETA TRIANGULARIS        -100   0.0000    0.9959
LO ANGULOTRETA TRIANGULARIS          99   1.3420   -1.5233
LO LABIOSTRIA CONVEXIMARGINATA      107   1.1606   -0.1092
HI LABIOSTRIA CONVEXIMARGINATA     -108   0.2955    2.2396
LO RAASCHELLA ORNATA                 91   2.2445   -1.5685
HI RAASCHELLA ORNATA                -92   2.0504    0.3617
LO APHELASPIS WALCOTTI               81   2.7926    1.7199
LO TRICREPICEPHALUS CORIA            41   5.3148   -3.5439 **
HI TRICREPICEPHALUS CORIA           -42   3.7185    1.4412
HI MARYVILLIA CF. M. ARISTON        -68   3.5635    0.2288
LO MARYVILLIA CF. M. ARISTON         67   4.2118   -0.5914
HI COOSIA CF. C. ALBERTENSIS        -70   3.6942    2.0758
LO OPISTHOTRETA DEPRESSA             39   5.8268   -2.9945 *
HI OPISTHOTRETA DEPRESSA            -40   4.3905    1.0286
LO COOSIA CF. C. ALBERTENSIS        -69   3.9826    1.2991
HI KINSABIA VARIGATA                 22   5.4485   -0.4212
LO KINSABIA VARIGATA                 21   6.4933   -0.9993
LO COOSELLA BELTENSIS               -33   5.9641   -0.0920
HI COOSELLA BELTENSIS               -34   5.3429    0.2207
HI KORMAGNOSTUS SIMPLEX               8   5.5170    0.8579
LO KORMAGNOSTUS SIMPLEX               7   6.5489
To those who have read Shaw's (1964) book, the preceding evaluation
of Palmer's database may seem surprising in that during his construction
of the composite standard, Shaw frequently did not use events which were
deviating more than other events from the straight lines fitted by the
and (2) modified scaling are applied alternately until a stable solution is
reached upon convergence. In these two methods, the variances of the
events are not assumed to be equal to one another. Application of this
method to highest occurrences of Cenozoic foraminifers along the
northwestern Atlantic Margin (Gradstein-Thomas database) showed
(1) inequality of variances for different events; and (2) minor departures
from normality of the frequency distributions for separate events.
Changes in the scaled optimum sequence resulting from the iterative
procedure were negligibly small. The new approach allows identification
of small-variance events which disappeared approximately
simultaneously from different sections in the same study region.
The RASC method for ranking and scaling consists of (1) forming a
single, optimum sequence from mutually inconsistent sequences of
observed events for different stratigraphic sections, and (2) positioning
these events along a relative time interval scale. In modified RASC, the
scaling part of the RASC method is generalized to account for possible
differences in uncertainty associated with the positioning of different
events along the RASC interval scale. The original scaling model was
illustrated in Figure 6.4. Each of a group of biostratigraphic events (A, B,
..., G) was assumed to be a random variable (X_A, X_B, ..., X_G) with Gaussian
probability distribution along the RASC scale. These Gaussian curves
have different means (EX_A, EX_B, ..., EX_G) but their variances (σ²) are
assumed to be equal to one another. By means of this model it became
possible to estimate the intervals between the successive mean values
denoted as EX_A, EX_B, ..., EX_G. The model of Figure 6.4 can be generalized
by allowing the variances of the events to be different. Such an extension
of the method is possible only if the variances σ_A², σ_B², ..., σ_G² of the
frequency distributions f(x_A), f(x_B), ..., f(x_G) of the events can be estimated.
A possible estimation procedure is described here.
The original RASC method provides estimates x_i of EX_i, where i
denotes events. In each stratigraphic section, x_i can be plotted against u_i,
the relative position of event i in the so-called event level scale of
the section. New estimates x̂_i of EX_i in the section can be obtained by
fitting a cubic spline curve with u as the independent variable. The
differences (x̂_i - x_i) can be collected from all sections in which event i occurs
and plotted as a histogram that provides an approximation of f(x_i - EX_i).
The shape of the latter distribution is the same as that of f(x_i). The
standard deviation s_i of the differences provides an estimate of σ_i.
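The estimation procedure just described can be sketched in Python. This is only an illustration under stated assumptions, not the RASC program: a least-squares straight line stands in for the cubic smoothing spline, the differences are treated as having zero mean, and the data and the names `fit_line` and `event_std_devs` are hypothetical.

```python
import math
from collections import defaultdict

def fit_line(u, x):
    """Least-squares line x = a + b*u (assumed stand-in for the cubic spline)."""
    n = len(u)
    mean_u, mean_x = sum(u) / n, sum(x) / n
    suu = sum((ui - mean_u) ** 2 for ui in u)
    b = sum((ui - mean_u) * (xi - mean_x) for ui, xi in zip(u, x)) / suu
    return mean_x - b * mean_u, b

def event_std_devs(sections, rasc_distance):
    """Return {event: s_i}, the root mean square of (fitted - x_i) over sections.

    sections: list of {event: event level u_i}; rasc_distance: {event: x_i}.
    """
    diffs = defaultdict(list)
    for levels in sections:
        events = list(levels)
        u = [levels[e] for e in events]
        x = [rasc_distance[e] for e in events]
        a, b = fit_line(u, x)
        for e, ui in zip(events, u):
            diffs[e].append(a + b * ui - rasc_distance[e])
    # Zero-mean standard deviation per event (an assumption of this sketch).
    return {e: math.sqrt(sum(d * d for d in ds) / len(ds)) for e, ds in diffs.items()}

# Hypothetical data: three events observed in two sections.
x = {"A": 0.0, "B": 1.0, "C": 2.0}
sections = [{"A": 1, "B": 2, "C": 3}, {"A": 1, "B": 3, "C": 2}]
s = event_std_devs(sections, x)
```

In this toy case event C, which is out of order in the second section, receives the largest standard deviation.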
RASC distances and variances si2 estimated for 44 species (event numbers as in Gradstein et al., 1985)
before (First run) and after (Fifth and Sixth runs with refinement) convergence.
[Figure 8.5: panels A and B; horizontal axis: Level, -1 to 25]
Fig. 8.5 Results of fitting a spline-curve to data for Adolphus D-50 well before (A) and after (B) iteration.
For Fig. 8.5A, the smoothing factor (SF) was set equal to SF = 0.7071 and standard deviations for
individual data (s_i) were kept equal to 1.000. (This procedure provides results identical to setting
SF = 1.000 and s_i = 0.7071 for all i.) For Fig. 8.5B, the smoothing factor was set equal to SF = 1.000 and
use was made of s_i-values obtained after convergence. In both diagrams, SF exceeded the standard
deviation of the residuals so that the spline-curve became a best-fitting straight line.
At the beginning of the iterative process, the average variance for the
44 species is equal to 0.500. At the end of the process the overall variance
has become 0.351. This implies that the standard deviation σ = 0.70 was
reduced to 0.59. The total range for the species along the RASC scale was
reduced from 7.78 (original RASC output) to 7.16 after steps 5 and 6 (cf.
Table 8.15). This shrinking is related to the reduction in the standard
deviation.
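The arithmetic behind these figures can be checked with a few lines of Python. The variable names are illustrative only. The standard deviation falls by roughly 16 per cent, whereas the total range shrinks by roughly 8 per cent.

```python
import math

var_before, var_after = 0.500, 0.351      # average variances quoted in the text
sd_before = math.sqrt(var_before)         # about 0.707
sd_after = math.sqrt(var_after)           # about 0.592
sd_ratio = sd_after / sd_before           # relative change in standard deviation
range_ratio = 7.16 / 7.78                 # relative change in total RASC range

print(round(sd_before, 3), round(sd_after, 3))
print(round(sd_ratio, 3), round(range_ratio, 3))
```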
The mean deviation of the species in individual wells from their
spline-curves was computed at each step of the iterative process. In
Figure 8.6, this mean deviation is plotted against RASC distance at the
beginning (RASC output) and end of the iterative process (modified RASC
output). Clearly, there is a systematic departure from zero near the top
and bottom of the stratigraphic sequence. The average deviation of the
first 3 species amounts to -0.65 and that of the last 9 species is 0.28 in
Figure 8.6B. The discrepancies for these 12 events were not significantly
reduced during the iterative process. This indicates that, on the average, the
fitted spline-curves slightly underestimated RASC distances near the tops
of the sections and overestimated them near the bottoms. This effect
would be reduced if more weight were given to the 12 events, e.g. by
centering their variances with respect to the average deviations.
However, this also would result in a further decrease of the overall
variance with increased shrinking of the total range for the species along
the RASC scale.
[Figure 8.6: panels A and B; vertical axis: average difference; panel title: Foraminifera of the Grand Banks and Labrador shelf]
Fig. 8.6 Mean deviation from spline-curves per species plotted against RASC distance before (A) and
after (B) convergence. For further explanation see text.
patagonica passed almost exactly through the points for this taxon. It may
be concluded that S. patagonica is an excellent marker, whose position in
individual sections is everywhere close to its position in the scaled
optimum sequence. This property is enhanced when modified RASC is
used. On the other hand, Cibicidoides alleni, which is a rare benthonic
species, has a variance above 0.5, both before and after iteration. Its
histogram also has not changed significantly (see Fig. 8.7). This taxon
seems to have a bimodal frequency distribution. According to
F.M. Gradstein (personal communication, 1987), C. alleni is not well
defined taxonomically and may actually represent two different forms.
[Figure 8.7: histograms, panels A and B; horizontal axis: difference, class midpoints from -1.5 to 1.5+]
Fig. 8.7 Histograms of Cibicidoides alleni and Subbotina patagonica before (A) and after (B) iteration.
After iteration, the bimodal histogram of C. alleni has remained approximately the same, whereas the
histogram of S. patagonica has become very narrow.
TABLE 8.16
Selected statistics for the 44 species after convergence. Degrees of freedom f_i = n_i - 1, where n_i represents
sample size for event i. Skewness 1 and 2 are sample statistics per species using zero mean and sample
mean for deviations from spline-curves, respectively. The pooled variance s² is equal to 0.351.
Variance ratio s_i²/s² has asterisk if its value is below 0.005 fractile or above 0.995 fractile of
corresponding χ²/f distribution. Last column shows individual terms added to give Bartlett’s
χ² = 180.734 (see text). Constant C = 1.034 was computed by formula in Hald (1975, p. 291).
Event   f_i   Skewness 1   Skewness 2   s_i²/s²   f_i ln(s²/s_i²)/C
 10      9     -1.367       -0.059       3.009*     -9.589
 17     11     -1.678       -1.276       1.999      -7.367
 16     21     -1.392        0.205       0.745       5.983
 67      7     -2.375       -1.297       1.492      -2.710
 18     21     -1.140       -0.451       0.264      27.034
 21      9     -1.074       -0.507       0.025*     32.066
 20     19     -1.542       -1.108       0.198*     29.681
 71     12     -1.040       -0.617       1.061      -0.683
 26     12     -0.016        0.368       1.172      -1.838
 70      6     -0.479       -0.965       0.384       5.556
 15     21     -1.548       -1.284       1.192      -3.570
 24     16     -0.792       -0.469       0.512      10.370
 27     12     -1.313       -1.045       2.094      -8.575
 69     10     -1.139       -0.253       1.799      -5.680
 25     18     -0.586        0.233       0.677       6.778
 81     11     -1.652       -0.563       1.776      -6.109
202      6     -1.499       -1.153       0.266       7.689
259     13     -0.357        0.495       0.263*     16.782
147      6     -0.812        0.601       0.472       4.359
 34     14     -0.727        0.103       1.578      -6.172
 33      6     -0.404        0.148       3.251*     -6.841
260     14      1.681        1.442       0.431      11.399
261     14      1.920        0.809       0.199*     21.836
263     12      0.791        0.425       0.998       0.038
 29     18     -0.034       -0.027       0.385      16.633
 32     17     -0.481       -0.836       0.627       7.672
 40      9      1.207        0.651       1.232      -1.816
 42     12      1.356        0.859       2.399*    -10.157
264      6      2.403        1.808       1.023      -0.131
 41     11      0.358        0.429       1.029      -0.307
 30     11      0.600        0.229       1.185      -1.816
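The entries in the last column of Table 8.16 can be recomputed from the degrees of freedom and the variance ratio. The Python sketch below assumes that each term is f_i ln(s²/s_i²)/C and that C is the usual Bartlett correction factor; whether this is exactly the formula cited from Hald (1975) is an assumption. The function names are hypothetical.

```python
import math

def bartlett_correction(dfs):
    """Usual Bartlett factor C = 1 + (sum(1/f_i) - 1/sum(f_i)) / (3*(k - 1))."""
    k = len(dfs)
    return 1.0 + (sum(1.0 / f for f in dfs) - 1.0 / sum(dfs)) / (3.0 * (k - 1))

def bartlett_term(f_i, ratio, c):
    """Single term f_i * ln(s^2 / s_i^2) / C, where ratio = s_i^2 / s^2."""
    return f_i * math.log(1.0 / ratio) / c

C = 1.034                                # constant quoted in the table caption
term_17 = bartlett_term(11, 1.999, C)    # event 17: f_i = 11, ratio = 1.999
term_16 = bartlett_term(21, 0.745, C)    # event 16: f_i = 21, ratio = 0.745
c_demo = bartlett_correction([9, 9])     # two hypothetical groups, 9 d.f. each
print(round(term_17, 2), round(term_16, 2), round(c_demo, 4))
```

Under these assumptions the two terms agree with the tabulated values for events 17 and 16 to within rounding of the printed variance ratios.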
[Figure 8.8: histogram; horizontal axis: standardized deviations]
Fig. 8.8 Histogram of 550 standardized differences from all spline-curves for all species after
convergence. Standardization was achieved by dividing each difference by the standard deviation s_i for
its species.
[Figure 8.9: range chart; axis scale 1.0 to 7.0]
Fig. 8.9 Extended RASC ranges for Cenozoic Foraminifera in Gradstein-Thomas database. Letters for
taxon 59 on the right represent (A) estimated RASC distance, (B) mean deviation from spline-curve, and
(C) highest occurrence of species (i.e. maximum deviation from spline-curve). B is shown only if it differs
from A. Good markers such as taxon 50 (Subbotina patagonica) have approximately coinciding positions
for A, B and C. Note that as a first approximation it could be assumed that the highest occurrences (C)
have RASC distances which are about 1.16 units less than the average position (cf. Section 8.3). This
systematic difference in distance is equivalent to approximately 10 m.y. (cf. Fig. 9.2, see later).
the range extensions have their own variances and are subject to more
uncertainty than the RASC distances themselves. The subject of
conservative range charts also will be discussed in the next two sections
with applications to smaller datasets.
Modified RASC method applied to original Drobne example of Section 8.3. After 4 iterations, the RASC
distances (x̂_4) are close to the original RASC distances (x̂_1). The event variances (s²_4) are for zero mean
deviations and differ from one another. Degrees of freedom (d.f.) in last column are equal to 3 or 4 for
nearly all events. For 3 degrees of freedom the 95% confidence interval of the sample variance ranges
from 0.32σ² to 3.12σ². Here σ² is the expected value of the variance which is approximately equal to 0.5 in
this application. According to this single variance test, the variance of event 15 would be too large and
those of events 20, 27, 22, 2, 23, 1 and 3 would be smaller than average. However, modified RASC gives
results that are approximate if sample sizes are very small. It will be seen later (see Table 8.21) that
only the variances of events 27, 2 and 1 are again much smaller than average after enlarging the dataset
and re-running modified RASC.
is shown as x̂_1 in Table 8.17. It was the starting point for modified RASC
which, after four iterations, produced nearly the same scaled optimum
sequence (x̂_4 in Table 8.17).
It is noted that on the basis of the results by modified RASC described
in the previous section (also see D’Iorio, 1988) indicating that the order of
events does not change significantly when this method is applied, it was
TABLE 8.18
Deviations of observed relative positions of events from spline-curves after 4 iterations. Numbers along
top indicate the eight sections used. Event numbers are given in first column. Events 15,23,25,5 and 3
have asterisk for coinciding highest and lowest occurrences in all sections. The variances of Table 8.17
were based on these numbers. Largest deviations for even code numbers (=highest occurrences) and
lowest deviations for odd code numbers (=lowest occurrences) were used for range chart of Fig. 8.10.
These numbers are shown in bold print. Rows with asterisks have two bold numbers.
1 2 3 4 6 7 8 10
28 X -0.97 -0.23 -0.04 -0.47 -0.07 X
The variances of the events (s²_4) had not completely converged after 4
iterations. Because the number of degrees of freedom for s²_4 is small for
all events, ranging from 3 to 6, these results are subject to considerable
uncertainty. According to Table 8.17, events 2 and 3, corresponding to the
highest occurrence of species 1 (A. moussoulensis) and the lowest
occurrence of species 2 (A. aramaea), have variances closest to zero and
could be good marker horizons. However, these two events each occur in 4
sections only. The fact that their positions are on the fitted spline-curves
may not be significant because there are so few data. It should be kept in
mind that small variance events receive relatively more weight than other
events in spline-curve fitting. In fact, zero-variance events have the
property (cf. Section 3.11) that the best-fitting spline-curve is forced to
pass exactly through their points on the scattergram. The possibility,
therefore, exists that an event which happens to have a small variance
because it occurs in so few sections obtains zero-variance during the
convergence process which involves repeated spline-curve fitting for all
sections.
The final deviations of the 19 events from the 8 fitted spline-curves
are shown in Table 8.18. If all variances are assumed to be equal, numbers
with absolute value greater than 1.16 denote events out of position with
probability greater than 95%. The two events with this property are event
15 (species 8) and event 5 (species 3). The latter event occurs in a reworked
bed as discussed in Section 8.3. According to the preceding equal variance
test applied to Table 8.18, species 8 would occur too high in Section X.
However, this result would need confirmation by additional evidence or
other experiments because there are too few event levels per section in this
dataset for a fully convincing application of modified RASC.
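The threshold of 1.16 used in the equal variance test can be reproduced numerically. The sketch below assumes that it is the one-sided 95 per cent point of a normal distribution with standard deviation 1/√2 = 0.7071, the value adopted for all events in the original RASC model. This reading is an interpretation, not something the text states explicitly.

```python
from statistics import NormalDist

sigma = 0.7071   # 1/sqrt(2), common standard deviation of the original RASC model
# One-sided 95 per cent point of N(0, sigma^2) (assumed interpretation).
threshold = NormalDist(mu=0.0, sigma=sigma).inv_cdf(0.95)
print(round(threshold, 2))
```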
Fig. 8.10 Comparison of five types of ranges for Drobne’s alveolinids along relative time scale of Brower
(1990) who pointed out that RASC ranges are significantly shorter than Unitary Associations (U.A.) and
Seriation (SER) ranges. These results are compared to the modified RASC (MR) ranges and the average
highest occurrences (ave HO) and average lowest occurrences (ave LO) on which these MR ranges are
based. The relative time scales used for U.A., SER, RASC and MR, respectively, have different units and
are not completely comparable (cf. Brower, 1990). However, on the whole, the MR ranges are about as
wide as the U.A. and SER ranges.
Brower (1990) used his own computer algorithms for U.A. and RASC
which differ somewhat from those used by Davaud and Guex (1984) and in
Gradstein et al. (1985). Also, because different methods have different
time-scales, plotting all ranges along a single time-scale may distort some
Table 8.20 shows normality test results for the 3 sections with events
that are anomalous with a probability of 99% (2 asterisks for second-order
TABLE 8.19
SEQ file for recoded Drobne dataset. Most sections have more event levels than in Fig. 8.1. Section 2
(Dane near Divača, see Fig. 8.2) has 9 event levels which were reduced to 4 maximal horizons in Fig. 8.1.
The number -999 denotes end of section in SEQ file.
SECTION 1
15 -16 7 -8 -13 -14 -23 -24 11 -12 3 -4 -999 0 0 0 0 0 0 0
SECTION 2
28 18 -27 2 -14 -24 1 -12 -17 -21 -22 23 11 -25 -26 4 -6 15 -16 3
-5 -9 -10 -13 -999 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
SECTION 3
18 -20 28 19 30 27 17 21 -22 23 -24 -29 14 -26 12 -25 6 -11 5 -13
3 -4 -999 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
SECTION 4
20 -28 18 29 -30 7 -8 -19 -27 2 -15 -16 -22 -23 -24 1 -13 -14 -17 -21
-999 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
SECTION 5
7 -8 9 -10 -999
SECTION 6
19 -20 -27 -28 7 -8 -15 -16 1 -2 -14 13 -25 -26 -999 0 0 0 0 0
SECTION 7
14 -25 -26 -29 -30 5 -6 -13 9 -10 3 -4 -999 0 0 0 0 0 0 0
SECTION 8
20 -28 19 15 -16 -27 11 -12 -25 -26 13 -14 4 -10 3 9 -999 0 0 0
SECTION 10
23 15 -16 19 -20 24 1 -2 -11 -12 -21 -22 -999 0 0 0 0 0 0 0
SECTION 11
19 -20 -27 -28 7 -8 -17 -18 1 -2 -14 13 -25 -26 -999 0 0 0 0 0
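A SEQ listing of this kind is easy to read by program. The Python sketch below is a hypothetical reader, not part of RASC. It assumes that a line which is not purely numeric names a section, that a negative code places an event at the same level as the preceding event, that -999 ends the section and that trailing zeros are padding. The negative-code convention is consistent with the 9 event levels reported for Section 2.

```python
def parse_seq(lines):
    """Return {section name: list of event levels}; each level is a list of codes."""
    sections = {}
    name, finished = None, False
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        try:
            codes = [int(t) for t in tokens]
        except ValueError:
            # A line that is not all integers starts a new section.
            name, finished = " ".join(tokens), False
            sections[name] = []
            continue
        if name is None or finished:
            continue
        for code in codes:
            if code == -999:          # end-of-section marker
                finished = True
                break
            if code == 0:             # padding
                continue
            if code > 0 or not sections[name]:
                sections[name].append([abs(code)])    # event opens a new level
            else:
                sections[name][-1].append(-code)      # coeval with previous event
    return sections

sample = """SECTION 1
15 -16 7 -8 -13 -14 -23 -24 11 -12 3 -4 -999 0 0 0 0 0 0 0
SECTION 5
7 -8 9 -10 -999
""".splitlines()
levels = parse_seq(sample)
print(len(levels["SECTION 1"]), len(levels["SECTION 5"]))
```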
TABLE 8.20
RASC normality test output for the 3 sections in the recoded Drobne dataset with one or more events
with double asterisks.
LO A. LEUPOLDI          15   1.9144
HI A. LEUPOLDI         -16   1.9144   -1.3814
LO A. GLOBOSA            7   1.3920    1.3814
HI A. GLOBOSA           -8   1.3920    2.3005 *
LO A. PASTICILLATA     -13   3.6925   -3.4997 **
HI A. PASTICILLATA     -14   2.4935    0.6432
LO A. SUBPYRENEICA     -23   1.9377    0.4907
HI A. SUBPYRENEICA     -24   1.8722    0.5950
LO A. PISIFORMIS        11   3.2605   -0.9526
HI A. PISIFORMIS       -12   2.8374    1.7861
LO A. ARAMAEA            3   5.0596   -1.9317
HI A. ARAMAEA           -4   4.4912

HI A. GUIDONIS          28   0.0000
HI A. MONTANARII        18   0.5247    0.1032
LO A. GUIDONIS         -27   0.6910    0.6962
HI A. MOUSSOULENSIS      2   2.0151   -0.3842
HI A. PASTICILLATA     -14   2.4935   -1.0991
HI A. SUBPYRENEICA     -24   1.8722    0.8391
LO A. MOUSSOULENSIS      1   2.5517    0.0678
HI A. PISIFORMIS       -12   2.8374   -1.1310
LO A. MONTANARII       -17   1.9921    1.2539
LO A. DEDOLIA          -21   2.4006   -0.7461
HI A. DEDOLIA          -22   2.0631   -0.2494
LO A. SUBPYRENEICA      23   1.9377    1.4482
LO A. PISIFORMIS        11   3.2605   -0.9277
LO A. LAXA             -25   3.1941   -0.1121
HI A. LAXA             -26   3.0156    1.1926
HI A. ARAMAEA            4   4.4912   -4.0524 **
LO A. ARAMAEA            3   5.0596   -2.8790 *

HI A. ARAGONENSIS       20   0.1267
HI A. GUIDONIS         -28   0.0000    0.1094
LO A. ARAGONENSIS       19   0.6595    0.5955
LO A. LEUPOLDI          15   1.9144   -0.5782
HI A. LEUPOLDI         -16   1.9144   -1.2234
LO A. GUIDONIS         -27   0.6910    3.1161 **
LO A. PISIFORMIS        11   3.2605   -2.3158 *
HI A. PISIFORMIS       -12   2.8374    0.7798
LO A. LAXA             -25   3.1941   -0.5352
HI A. LAXA             -26   3.0156    0.1786
LO A. PASTICILLATA      13   3.6925   -1.1991
HI A. PASTICILLATA     -14   2.4935    2.5199 *
HI A. ARAMAEA            4   4.4912   -1.3594
HI A. AVELLANA         -10   4.4527   -0.0314
LO A. ARAMAEA            3   5.0596   -0.8023
LO A. AVELLANA           9   4.8642
TABLE 8.21
Modified RASC method applied to recoded Drobne dataset. n is number of sections in which event was
observed. x̂_1, x̂_3 and x̂_4 are RASC distances at beginning and after 3 and 4 iterations, respectively.
Variances after 3 and 4 iterations are for zero mean deviation and are only approximately equal to one
another. SK1 and SK2 are skewness statistics with and without zero means, respectively.
[Figure 8.11: range chart; axis scale -1.0 to 6.0; symbols: ave HO, ave LO; horizontal axis: species with code numbers]
Fig. 8.11 Extended modified RASC ranges for Drobne's dataset. As in Fig. 8.3, the species were ordered
on the basis of the RASC distances of their average highest occurrences. The sample sizes were small
and this is the main reason for the random fluctuations in the positions of the highest (HO) and lowest
(LO) occurrences. Deletion of events with double asterisks in normality test (see Table 8.20) would result
in shorter ranges for species 8 and 14 as shown by arrows in Fig. 8.11.
The RASC ranges plotted in Figure 8.13 are the final modified RASC
distances (x̂_7) of Table 8.22. Deviations for highest and lowest occurrences
were taken from Table 8.23. For most taxa in Figure 8.13, the three
ranges on the left side (modified RASC, Shaw and Edwards ranges) are
approximately equally wide. The same holds true for the two ranges on
the right (Hay and RASC ranges) which are considerably shorter than the
three ranges on the left. On the whole, modified RASC has the widest
ranges, partly because its ranges are clearly wider than the Shaw ranges
for taxa 4 and 36 (with highest occurrences 8 and 70, respectively). The
deviations in Table 8.22 corresponding to these two taxa are 1.27 (for event
8) in the Morgan Creek section and -1.07 (for event 69) in the Pontotoc
TABLE 8.22
Modified RASC method applied to Palmer’s database. Approximate convergence was reached after 7
iterations. See Table 8.21 for explanations of column headings. The average deviation (Ave) is
significantly less than zero for the first 6 events listed (see text for further discussion).
82  7  0.31  0.23  0.23  1.43  1.45  1.43  0.97   1.61   0.53
99  7  1.46  1.38  1.38  0.64  0.64  0.64  0.49  -2.05   0.54
90  5  2.02  1.95  1.95  0.55  0.55  0.55  0.54   2.13   0.59
92  6  2.17  2.40  2.40  0.02  0.02  0.02  0.08   2.06   0.32
89  5  2.82  2.83  2.82  0.07  0.07  0.07  0.06   0.44   0.80
91  6  2.85  2.89  2.88  0.06  0.07  0.07  0.04   0.90   0.28
81  7  3.08  3.11  3.10  0.07  0.07  0.07  0.08   1.28   0.16
68  7  3.86  3.63  3.62  0.08  0.08  0.08  0.05   0.72   0.02
70  6  3.99  3.80  3.80  0.14  0.15  0.15  0.13   0.11  -1.84
42  7  4.02  3.81  3.81  0.18  0.18  0.19  0.10   0.36   0.60
69  6  4.28  4.14  4.14  0.57  0.51  0.59  0.12   0.19   0.88
24  5  4.35  4.21  4.21  0.30  0.30  0.30  0.28   1.17   0.43
61  7  4.51  4.40  4.42  0.23  0.19  0.20  0.21   1.29   0.52
40  7  4.69  4.55  4.55  0.93  0.91  0.91  0.19   0.64  -1.55
56  5  4.71  4.64  4.64  0.66  0.63  0.63  0.10   1.16   0.53
54  5  5.05  4.91  4.91  0.20  0.19  0.19  0.11   0.96   0.20
48  5  5.21  5.07  5.07  1.44  1.40  1.39  0.23   1.27   0.36
47  5  5.21  5.07  5.07  1.44  1.40  1.39  0.23   1.27   0.36
55  5  5.46  5.37  5.36  0.12  0.12  0.12  0.05   0.66   1.39
53  5  5.47  5.33  5.33  0.20  0.20  0.20  0.02   0.25   0.03
41  7  5.62  5.45  5.44  1.27  1.24  1.23  0.13   1.48   1.02
34  7  5.64  5.49  5.46  0.08  0.07  0.07  0.10   1.14   0.33
22  6  5.75  5.61  5.61  0.17  0.17  0.17  0.02   0.53   0.26
 8  7  5.82  5.63  5.63  0.98  0.97  0.96  0.06   0.70   0.14
23  5  6.10  5.91  5.90  0.25  0.25  0.25  0.10   1.04   0.04
39  7  6.13  5.91  5.90  0.16  0.15  0.14  0.07   2.36   1.70
33  7  6.26  6.04  6.03  0.17  0.16  0.16  0.13   2.35   1.33
21  6  6.79  6.53  6.52  0.11  0.11  0.11  0.18   2.28   0.80
 7  7  6.85  6.68  6.69  0.41  0.46  0.47  0.21   1.96   0.94
[Figure 8.12: scattergram; vertical axis: RASC distance]
Fig. 8.12 Comparison of observed highest and lowest occurrences (shown as x-es) with best-fitting spline-
curve (=straight line) after iteration. The line shows a relatively poor fit at the first two event levels.
The RASC distances plotted in the vertical direction are close to those of the scaled optimum sequence
used in Fig. 9.23 (see later). The spline-curve in Fig. 9.23 was obtained by cross-validation and provides
a better fit than the straight-line fit of Fig. 8.12.
Deviations of observed relative positions of events after 7 iterations. Values were corrected for average
deviation from spline-curve (Ave in Table 8.22). Numbers 1 to 7 for columns correspond to the 7 sections.
Largest deviations in bold print were used to construct modified RASC range chart of Fig. 8.13.
Event n 1 2 3 4 5 6 7
this regards a lowest occurrence which would be situated too high, the
extended range chart is not affected by it. The modified RASC results had
Fig. 8.13 Comparison of range charts obtained by five different methods for Palmer's database. Modified
RASC and RASC results were added to ranges previously plotted by Edwards (1982). Lowest and highest
occurrences were ranked for each method and these ranks were used to display the ranges. The modified
RASC, Shaw (1964) and Edwards (1978) results are similar. The Hay (1972) and RASC ranges were
based on average highest and lowest occurrences. These generally are shorter than the other
(conservative) ranges.
(cf. Fig. 9.23 and later discussion in Section 9.10). It may be assumed that
the calibration problem of lack of fit near the tops and bottoms of some
sections is related to a slight overestimation of the smoothing factors
which, in turn, is equivalent to a slight overestimation of the event
variances in these sections.
The sample sizes (n) of the events in Palmer’s database are too
small to decide which events have variances that are significantly smaller
or larger than average. Neither can it be decided from the skewness
statistics in the last column of Table 8.22 which events have an
asymmetrical frequency distribution. It is interesting that the five largest
(positive) skewness values (events 39, 33, 55, 41 and 7) are for lowest
occurrences whereas the two smallest (negative) skewness values (events
70 and 40) are for highest occurrences. All remaining events have
skewness values which are less than 0.90 in absolute value.
The preceding observation would support the hypothesis that
Palmer’s trilobites satisfy the model advocated by Edwards (see Fig. 2.13)
and Baumgartner (see Fig. 2.14). For the latter model, a lowest occurrence
has its longest tail pointing in the stratigraphically upward direction
(positive skewness) whereas a highest occurrence has its longest tail in the
stratigraphically downward direction (negative skewness).
CHAPTER 9
EVENT-DEPTH CURVES AND MULTI-WELL COMPARISON
9.1 Introduction
[Figure 9.1: Adolphus D-50; line of observation (events versus depths)]
optimum sequence, and best fit curves (smoothing splines, see later) are
calculated. A spline fit yields a function such that, for each optimum
sequence position, the most likely stratigraphic equivalent position can be
found in the individual sequences. These normalized tiepoints then are
correlated.
Figure 9.1 graphically depicts the principal steps, executed for the
correlation of event 29 (top of Cyclammina amplectens) in the
Adolphus D-50 well on the Grand Banks, which is part of the Gradstein-
Thomas database. The y-axis is the optimum sequence in 21 of the wells
(k_c = 7, m_c1 = 2; probabilistic ranking followed by modified Hay method).
Instead of the optimum sequence, the scaled optimum sequence can be
used (see later). The x-axis is the observed sequence of events, whereas the
z-axis is the common depth scale of the well. The lower scattergram
expresses mismatch of the individual sequence and the optimum sequence.
The best fit line for the graph (here visually estimated) is the line of
correlation.
Working with event scales initially has the advantage that complications
due to different rates of sedimentation in different places which may
be hundreds of km apart are avoided. Moreover, equal spacing of values
for the independent (x-axis) variable in spline-curve fitting has the
considerable advantage that the possibility of unrealistic oscillations of
the fitted curve between irregularly spaced control points is avoided.
However, the number of levels in the event scale differs from section to
section in a “random” manner. For correlation between wells it is
necessary to replace the levels of the event scales by depths (in km). This
replacement is shown in the upper part of the scattergram of Figure 9.1.
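The chain from optimum sequence position to event level to depth can be sketched in Python as follows. This is a simplified illustration, not the CASC program: a least-squares straight line replaces the smoothing spline, depths between levels are obtained by linear interpolation, and all data values and function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares line y = a + b*x (assumed stand-in for the line of correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
        sum((xi - mean_x) ** 2 for xi in x)
    return mean_y - b * mean_x, b

def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of y at x for increasing xs."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the observed range")

def tiepoint_depth(levels, positions, depths_km, target_position):
    """Depth at which a given optimum sequence position is most likely found."""
    a, b = fit_line(levels, positions)        # position as a function of level
    level = (target_position - a) / b         # invert the fitted line
    return interpolate(level, levels, depths_km)

# Hypothetical well: event levels (x), optimum sequence positions (y), depths (z).
levels = [1, 2, 3, 4, 5]
positions = [2, 1, 4, 3, 5]
depths_km = [0.5, 0.8, 1.2, 1.5, 2.1]
depth = tiepoint_depth(levels, positions, depths_km, target_position=3.0)
print(round(depth, 3))
```

Repeating the calculation for the same optimum sequence position in several wells gives depths that can be connected as correlation tiepoints.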
the most likely correlation tiepoints and a line can be drawn to connect
them.
[Figure 9.2: RASC timescale; horizontal axis: 60 to 0 Ma]
Fig. 9.2 Plot of the Cenozoic scaled optimum sequence (21 wells; 7/2/4 run) versus linear time in Ma. The
inter-event distances are plotted cumulatively. For some selected events in the scaled optimum sequence
the numerical age is known (dots), and this allows the whole fossil sequence to be scaled in linear time.
[Figure 9.3: Sediment accumulation, Adolphus D-50; horizontal axis: biochronology, 60 to 0 Ma (Paleocene, Eocene, Oligocene, Miocene); vertical scale: 1000 to 9000; symbols: subjective age, RASC age]
Fig. 9.3 The RASC biochronology of Fig. 9.2 is used to estimate rate of sedimentation (dashed line) in the
Adolphus well. The solid line (subjective) shows approximately the same trend, using independent well
history data (from Gradstein and Agterberg, 1985).
this new RASC biochronology (horizontal axis) is used to estimate the rate
of sediment accumulation (dashed line) in the Adolphus D-50 well.
Several years earlier, prior to development of RASC and CASC, an
approximate chronostratigraphy of this well section had been given, in
system units from the Paleocene upward. As shown in Figure 9.3, there is
a close approximation of the two, independently arrived at, sediment
accumulations. The earlier interpretation obscured a possible late
Oligocene-early Miocene hiatus. Scaling in time of the scaled optimum
sequence is a practical way of erecting a regional time scale.
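Once events have ages and depths, average accumulation rates between successive tiepoints follow directly. The Python sketch below uses hypothetical age and depth values and a hypothetical function name. An interval with a markedly lower rate than its neighbours is the kind of feature that may point to a hiatus.

```python
def accumulation_rates(ages_ma, depths):
    """Average accumulation rate (depth units per m.y.) between successive tiepoints."""
    pairs = sorted(zip(ages_ma, depths))
    return [(d1 - d0) / (a1 - a0) for (a0, d0), (a1, d1) in zip(pairs, pairs[1:])]

# Hypothetical tiepoints: ages in Ma and depths in arbitrary depth units.
ages_ma = [10, 20, 30, 50]
depths = [1000, 1500, 1600, 3000]
rates = accumulation_rates(ages_ma, depths)
print(rates)
```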
Fig. 9.4 Schematic illustration of method followed in CASC mainframe computer program to establish
relation between RASC distance and age. (a) Two (or more) RASC distances for the same age are
averaged. (b) Cubic spline-curve is fitted using age as the dependent variable; smoothing factor (SF)
representing standard deviation of differences between event ages and curve is chosen in advance, before
curve-fitting. (c) Standard deviation (SD) for differences between original values and curve is computed
after curve fitting. (d) Fitted curve is used to convert any RASC distance into corresponding age.
minimizing the sum of squares of deviations between points and curve in
the vertical (age-) direction of Figure 9.4b. The smoothing factor SF can be
chosen beforehand by the user of the interactive CASC computer program.
It is equal to the square root of the mean square deviation between points
and curve. Because this standard error generally is not known
beforehand, the user can determine it by trial and error while
experimenting with different plots on the screen of the monitor. In Figure
9.4b a curve was fitted to 5 original values (o) and 2 averages of two values
(+). The standard deviation of the original data in relation to this curve is
also shown on the screen (SD in Fig. 9.4c). The fitted curve does not
extrapolate outside the range of the RASC distances used for the curve
fitting. Consequently, the circle with the highest RASC value is not
considered for estimating SD in this example. It is noted that a curve also
could be fitted directly through the 8 circles in Figures 9.4a and 9.4c. Then
SD would be equal to SF.
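The difference between SF and SD can be illustrated with a short Python sketch. It assumes a fixed hypothetical curve (a straight line, age = 10 × RASC distance) instead of a fitted spline, and the data points and function names are invented for the illustration. The root mean square deviation is computed once for the averaged points, which plays the role of SF, and once for the original points, which plays the role of SD.

```python
import math
from collections import defaultdict

def average_by_age(points):
    """Average the RASC distances of points that share the same age."""
    groups = defaultdict(list)
    for distance, age in points:
        groups[age].append(distance)
    return sorted((sum(ds) / len(ds), age) for age, ds in groups.items())

def rms_deviation(points, curve):
    """Square root of the mean squared age deviation between points and curve."""
    return math.sqrt(sum((age - curve(d)) ** 2 for d, age in points) / len(points))

def curve(distance):
    """Hypothetical age-distance relation standing in for the fitted spline."""
    return 10.0 * distance

# Hypothetical (RASC distance, age in Ma) pairs; two events share the age 20 Ma.
original = [(1.0, 10), (2.0, 20), (2.4, 20), (3.0, 30)]
averaged = average_by_age(original)

sf_like = rms_deviation(averaged, curve)   # deviation of the points used for fitting
sd = rms_deviation(original, curve)        # deviation of the original values
print(round(sf_like, 3), round(sd, 3))
```

Because averaging removes part of the scatter, the two values differ here, whereas they would coincide if the curve were evaluated against the same points in both cases.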
Fig. 9.5 Schematic illustration of preliminary computing and optional editing procedure at beginning of
CASC mainframe computer program. (a) Events found to be anomalous with a probability of over 99 per
cent (asterisk) may be omitted from spline-curve fitting and later plots; RASC distances of two (or more)
coeval events are averaged. (b) Cubic spline-curve is fitted using RASC distance as the dependent
variable; smoothing factor (SF) representing standard deviation of differences between RASC distances
assigned to levels and curve is chosen in advance. (c) Standard deviation (SD) is computed from
differences between original values and curve after curve-fitting; original values (e.g. those labelled R)
can be deleted. (d) New curve with new standard deviation (SD) is obtained without use of deleted
values.
Next, the CASC user can display and edit the RASC distances for any
well from the set of wells used. Editing options are schematically
shown in Figure 9.5, which displays preliminary data analysis. The scale
in the vertical direction is relative. It shows successive levels for the
stratigraphic events in the well considered. RASC distances of 2 or more
coeval events are averaged (see Fig. 9.5a) before cubic spline fitting (Fig.
9.5b). The user has the option of omitting events for which the
second-order differences were anomalously high (i.e. shown by two
asterisks in the RASC normality test). Such anomalous events are then
displayed by use of a special symbol (single asterisk in Fig. 9.5a) and are
not employed for curve fitting. The deviations are measured in the
horizontal direction. SF and SD serve the same purpose as in Figure 9.4.
The user may wish to remove other events considered to be anomalous, for
example, those labelled R in Figure 9.5c. Then a new cubic spline-curve
will be fitted for the reduced data set (Fig. 9.5d). If extreme values are
deleted, SD will probably decrease. The original RASC model is based on
the assumption that positions of events in a well are distributed around
their expected value, according to a normal probability distribution, with
standard deviation set equal to 1/√2 = 0.7071. One, therefore, would
expect SD to be approximately equal to 0.7 if the number of events in the
stratigraphic section is sufficiently large.
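The distinction between SF (computed from one averaged value per level) and SD (computed from all original values) can be illustrated with a minimal Python sketch. The levels, distances and fitted curve values below are hypothetical and are not taken from any well in this chapter; the curve values stand in for an already fitted spline.

```python
import math
from collections import defaultdict

def average_by_level(levels, distances):
    """Average the RASC distances of coeval events (events at the same level)."""
    groups = defaultdict(list)
    for lev, d in zip(levels, distances):
        groups[lev].append(d)
    return {lev: sum(v) / len(v) for lev, v in sorted(groups.items())}

def rms(residuals):
    """Square root of the mean squared residual."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Hypothetical data: level 3 carries two coeval events.
levels = [1, 2, 3, 3, 4]
distances = [0.2, 0.9, 1.4, 2.0, 2.6]
curve = {1: 0.3, 2: 0.9, 3: 1.7, 4: 2.5}  # hypothetical fitted value per level

averaged = average_by_level(levels, distances)
# SF: deviations of the averaged values (one per level) from the curve.
sf = rms([averaged[lev] - curve[lev] for lev in averaged])
# SD: deviations of all original values from the same curve.
sd = rms([d - curve[lev] for lev, d in zip(levels, distances)])
print(round(sf, 4), round(sd, 4))
```

In this toy case the two coeval events at level 3 cancel in the average, so SF is smaller than SD, which is the situation described for Figure 9.4.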
For further analysis in preparation of automated correlation, RASC
distance is replaced by age (see Fig. 9.6a) using the earlier derived
relationship between RASC distance and age (see Fig. 9.4d). In the
following discussion, the variables for event level, age and depth are
denoted as x, y and z, respectively. A spline-curve can be fitted to express y
as a function of x, as was done for distance in Figure 9.5d. It also is
possible to replace the levels by their depths and fit a spline-curve to
express y as a function of z using depth as the independent variable. This
leads directly to a plot similar to Figure 9.6f.
However, the rate of sedimentation may have changed significantly
during geologic time at a well site and this can result in irregular
distribution of the points along the z-axis. This, in turn, may make it
difficult to obtain a spline-curve that extrapolates in a satisfactory manner
across data gaps along the z-axis corresponding to short periods with
increased sedimentation rates (also see Section 3.6). For this reason, the
indirect method given in Figure 9.6 can be employed instead. Assume that
the spline-curve of Figure 9.6a is written as $y = f(\underline{x}) + e_y$ where $e_y$
represents a random deviation in the y-direction. The bar under x
indicates that y is regressed on x using data points which are regularly
spaced along the x-axis. Depth (z) is plotted against x in Figure 9.6b, and a
separate spline-curve with $z = g(\underline{x}) + e_z$ is obtained, using the same set of
regularly spaced data points along the x-axis. The deviation $e_z$ points in
Fig. 9.6 Schematic illustration of calculation of an event-depth curve from RASC output for a well.
(a) RASC distances have been replaced by ages using relation illustrated in Fig. 9.4d; new spline-curve
$f(\underline{x})$ is fitted; bar in $\underline{x}$ denotes use of regular sampling interval for x; smoothing factor (SF), which was
selected before curve-fitting using one age per level, is smaller than standard deviation (SD) for all
original values. (b) Spline-curve $g(\underline{x})$ is fitted to express depth as a function of level x; bar in $\underline{x}$ denotes use
of regular sampling interval for x; SF = SD is equal to some small value. (c) $\hat{x}$ represents spline-curve $g(\underline{x})$
in Fig. 9.6b now coded as set of values for x at regular interval of z. (d) $\hat{y}_{xz}$ denotes curve passing through
set of values of y at regular interval of z obtained by combining spline-curve of Fig. 9.6a with that of
Fig. 9.6c. (e) $\hat{y}$ is spline-curve fitted to values $\hat{y}_{xz}$ of Fig. 9.6d using new smoothing factor SF. (f) Standard
deviation SD is computed after curve-fitting, using one age per level.
The curve for z in Figure 9.6b again is shown in Figure 9.6c. It has
been rewritten in the form $\hat{x} = g^{-1}(\underline{z})$, to indicate that estimates $\hat{x}$ were
obtained at points which are regularly spaced along the z-axis. Assume
that $\hat{y}$ is obtained for the irregularly spaced values of $\hat{x}$ in Figure 9.6c using
$f(x)$ shown in Figure 9.6a. This results in a set of values of $\hat{y}_{xz} = f\{g^{-1}(\underline{z})\}$
for regularly spaced points along the z-axis (see Fig. 9.6d). The function
$f\{g^{-1}(\underline{z})\}$ is not a simple mathematical expression. For example, its first
derivative is not readily available. A cubic spline $\hat{y} = h(\underline{z})$ can be fitted to
the values $\hat{y}_{xz}$ (see Fig. 9.6e). In Figure 9.6e, $\hat{y}$ is considerably smoother
than $\hat{y}_{xz}$. By using a smaller smoothing factor (SF), the difference between
$\hat{y}$ and $\hat{y}_{xz}$ may be kept negligibly small (see curve to be used for example in
Fig. 9.7a). The standard deviation SD for points used for fitting in Figure
9.6a with respect to the curve $\hat{y}$ is provided in Figure 9.6f. The deviations
from $\hat{y}$ are measured in the y-direction. A similar age-depth diagram is
shown in Figure 9.7a where less smoothing was applied. The spline-curve
$\hat{y} = h(z)$ can be used to assign a probable age to any point along the well.
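The composition step of the indirect method can be sketched in Python as follows. This is only an illustration under stated assumptions: piecewise-linear interpolation stands in for the cubic smoothing splines f and g, the well data are hypothetical, and the final spline h fitted to the composed values is not shown.

```python
from bisect import bisect_right

def interp(xs, ys, x):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x) - 1
    t = (x - xs[j]) / (xs[j + 1] - xs[j])
    return ys[j] + t * (ys[j + 1] - ys[j])

# Hypothetical well: one age and one depth per event level.
levels = [1, 2, 3, 4, 5]                # x
ages = [10.0, 20.0, 25.0, 40.0, 50.0]   # y = f(x), in Ma
depths = [0.5, 0.6, 1.5, 1.6, 2.0]      # z = g(x), in km (irregular spacing)

def age_at_depth(z):
    """Compose the two curves: x_hat = g^-1(z), then y_xz = f(x_hat)."""
    x_hat = interp(depths, levels, z)   # invert g by swapping its axes (g is increasing)
    return interp(levels, ages, x_hat)

# Values of y at regular intervals of z, as in Fig. 9.6d.
z_grid = [0.5 + 0.25 * k for k in range(7)]
y_xz = [age_at_depth(z) for z in z_grid]
print([round(y, 2) for y in y_xz])
```

Because both curves are built on the regularly spaced levels, the wide depth gap between levels 2 and 3 is bridged by the level-depth curve rather than by a spline fitted across a data gap.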
The local error bar in Figure 1.2a was obtained by multiplying s(y)
(= SD) along the y-axis by rate of sedimentation to obtain a modified error
s(z) along the z-axis, as shown in Figure 9.7b. The rate of sedimentation
(Fig. 9.7c) is the first derivative dz/dy for z in $\hat{y} = h(z)$. In general, a cubic
spline curve y, fitted to n data points, consists of (n-1) successive cubic
polynomials
Fig. 9.7 Schematic illustration of estimation of local error bar and modified local error bar. (a) Standard
deviation SD was computed after curve-fitting, using one age per level. (b) Error bar of age value plus or
minus SD along Y-axis is transformed into error bar along Z-axis using first derivative (dz/dy) of
age-depth curve. (c) Rate of sedimentation (= dz/dy) can be displayed on screen during CASC interactive
session. (d) Modified local error bar is asymmetrical with respect to depth value for a given age.
Fig. 9.8 Schematic illustration of estimation of global error bar. Theoretical standard deviation
σ (= 0.7071) along RASC distance scale is assumed to remain constant. It is transformed into variable
SD along age scale (e.g. SD' and SD'').
$$y = y_i + c_{1i}\,d + c_{2i}\,d^2 + c_{3i}\,d^3 \qquad (9.1)$$
at any point. Inversion of this expression gives dz/dy. The new standard
error s(z) = (dz/dy) s(y) can be displayed for any z as the local error bar
z ± s(z) (see Fig. 1.2a). This propagation of error is based on the local rate
of sedimentation, which is assumed to remain approximately constant
over the interval y ± s(y). The latter condition frequently is not satisfied,
especially when $\hat{y}$ has many inflection points (between local maxima and
minima in sedimentation rate). Curvature of $\hat{y}$ is considered in the
construction of a modified local error bar as illustrated in Figure 1.2d. For
any point $z = h^{-1}(y)$, this bar extends from the point $h^{-1}\{y - s(y)\}$ to
$h^{-1}\{y + s(y)\}$. It is asymmetrical with respect to z and is significantly
shorter at places where the rate of sedimentation is high.
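The two kinds of local error bar can be compared in a short Python sketch. The age-depth curve h below is a hypothetical quadratic chosen only because it is monotone and has a simple derivative; it is not a CASC spline, and the inverse is obtained by plain bisection rather than by any routine from the program.

```python
def h(z):
    """Hypothetical monotone age-depth curve: age (Ma) at depth z (km)."""
    return 10.0 * z + 5.0 * z * z

def dh_dz(z):
    """First derivative dy/dz of the hypothetical curve."""
    return 10.0 + 10.0 * z

def h_inverse(y, lo=0.0, hi=10.0, tol=1e-10):
    """Invert the increasing function h by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def local_error_bar(z, s_y):
    """Symmetric bar z +/- s(z), with s(z) = (dz/dy) s(y)."""
    s_z = s_y / dh_dz(z)  # dz/dy is the reciprocal of dy/dz
    return z - s_z, z + s_z

def modified_local_error_bar(z, s_y):
    """Asymmetric bar from h^-1{y - s(y)} to h^-1{y + s(y)}."""
    y = h(z)
    return h_inverse(y - s_y), h_inverse(y + s_y)

z0, sd = 1.0, 2.0  # depth in km, SD in Ma (both hypothetical)
local = local_error_bar(z0, sd)
modified = modified_local_error_bar(z0, sd)
print(local, modified)
```

The local bar assumes a constant sedimentation rate over y ± s(y), whereas the modified bar follows the curvature of h, so its two arms differ in length.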
Finally, a global error bar (Fig. 1.2c) can be constructed as illustrated
in Figure 9.8. The standard deviation σ = 1/√2 of events along the RASC
linear scale for distance is changed into a variable standard error s(y)
along the age scale. This new, variable standard error is changed into s(z)
according to the method used for SD in Figure 9.7b. In global error bar
estimation, it is assumed that a single RASC distance error σ can be
applied to all wells. On the other hand, in local error bar estimation, use is
made of a constant SD value along the age scale which was estimated from
the deviations between the points used for spline-fitting and the spline
curve itself (cf. Fig. 9.6e). Because of possible elimination of anomalous
events and averaging of ages for events at the same levels, the local error
bar is likely to be narrower than the global error bar. It is possible that the
quality of the biostratigraphic information is not the same in all sections
considered. Such differences would be considered in local error bars but
not in global error bars.
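A first-order version of this two-step transformation can be written as a small Python sketch. It assumes that both steps use local slopes (the slope of the age-distance curve, then the sedimentation rate); this is an assumption made for illustration, and the function name and arguments are hypothetical rather than part of CASC.

```python
import math

SIGMA = 1.0 / math.sqrt(2.0)  # theoretical RASC standard deviation (0.7071)

def global_error_bar(z, age_per_distance, depth_per_age):
    """Return (z - s_z, z + s_z) for a first-order global error bar.

    age_per_distance: local slope of the age-distance curve (Ma per RASC unit)
    depth_per_age:    local sedimentation rate dz/dy (km per Ma)
    """
    s_y = abs(age_per_distance) * SIGMA  # variable standard error along the age scale
    s_z = abs(depth_per_age) * s_y       # same conversion as for SD in Fig. 9.7b
    return z - s_z, z + s_z

# Hypothetical slopes at a depth of 1.5 km.
bar = global_error_bar(1.5, age_per_distance=4.0, depth_per_age=0.05)
print(bar)
```

Unlike the local error bar, the only well-specific inputs here are the two slopes; the distance error σ itself is the same for every well.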
The purpose of the error bar is to quantify the uncertainty of the
observed depths of events with respect to their estimated depths in the
wells. Each local or global error bar extends from the estimated depth
Output from a 7/2/4 RASC run on 21 wells and a 5/2/3 run on 24 wells
was used as input for examples of actual CASC runs in the remainder of
this section. Table 9.1 shows the optimum sequence, modified optimum
sequence (after final reordering) and RASC distances for the 7/2/4 run on
21 wells (also see Fig. 6.2). Several events, occurring in fewer than
seven wells, were later inserted as unique events. Table 9.2 shows
estimated ages of 22 events, including these unique events. Average
RASC distances for events with the same age are shown in the last column
of Table 9.2. Figure 9.9a shows the ages plotted against the RASC
distances. The displays in Figure 9.9 (and Figs. 9.10-12) were redrafted
from hardcopy of displays on a Tektronix terminal. A cubic spline function
with smoothing factor SF = 2.0 was fitted to the 15 ages, using the
average distances shown in the last column of Table 9.2. The smoothing
factor SF is the standard deviation of differences between the 15 ages and
corresponding estimated ages on the spline-curve for the same RASC
TABLE 9.1
RASC output for 7/2/4 run on 21 wells (Grand Banks - Labrador Shelf) used as CASC input. Event
levels (sequence position numbers 1-40) (A), optimum sequence of events identified by their dictionary
numbers (B), modified optimum sequence after final reordering (C), and cumulative RASC distances for
events in last column.
A B C Distance
TABLE 9.2
Estimated ages for 22 events and calculation of average RASC distances for two or three events with
same estimated age.
Fig. 9.9 Example of CASC displays for Indian Harbour well based on 7/2/4 RASC results for 21 wells. (a)
Age-RASC distance relationship as derived from the 21 wells file. (b) Initial CASC plot for default
smoothing factor. (c) Age-level plot for default SF. (d) First derivative of (c). (e) Level-depth plot. (f)
Age-depth plot for default SF; spline-curve was fitted directly to the data, using irregularly spaced
depths.
further analysis. As mentioned before, one of two different routes can be
selected at the beginning of mainframe CASC. These consist of using
either optimum sequence data or RASC distances for the events. In both
subprograms, event levels for successive, non-coeval events are defined, as
illustrated for Indian Harbour M-52, in the second column of Table 9.3. In
the second subprogram, the RASC distances in a well are transformed into
ages using the spline-curve fitted in Figure 9.9a (see last column in Table
9.3). The methods used in the two subprograms are identical, except that
sequence position numbers instead of ages are used in the first
subprogram. Only the option that uses the ages (in Ma) will be illustrated
in detail here.
Mainframe CASC produces a number of successive plots. For each of
these plots the user is required to answer one or more questions. The plot
that comes after Figure 9.9a during a CASC session is shown in Figure
9.9b. It shows the RASC distances of Table 9.3 plotted against their event
levels. Before this plot is actually shown on the Tektronix screen, the user
is asked if he wishes to exercise the option of deleting anomalous events
which are out of place with a probability of 99 percent according to the
RASC normality test. Moreover, points can be deleted from Figure 9.10
TABLE 9.3
CASC input for Indian Harbour well; definition of 18 event levels; and transformation of RASC distances
into ages using spline-curve in Fig. 9.9a.
Fig. 9.10 Example of CASC displays for Indian Harbour well (continued from Fig. 9.9). (a) Spline-curve
for small (default) SF fitted to combination of Figs. 9.9c and 9.9e; indirect method explained in Fig. 9.6
was used. (b) Sedimentation rate in 0.1 km/my (= first derivative of spline-curve in Fig. 9.10a multiplied
by 10); local maximum and minimum are due to lack of smoothness of spline-curve as explained in text.
(c) Age-level plot for SF = 4.0 instead of default, used in Fig. 9.9c. (d) First derivative for Fig. 9.10c;
magnitude of peak in Fig. 9.9d has been reduced. (e) Spline-curve for small (default) SF fitted to
combination of Figs. 9.9e and 9.10c. (f) Sedimentation rate in 0.1 km/my corresponding to Fig. 9.10e.
TABLE 9.4
Data used for fitting spline-curves in Indian Harbour well example shown in Figs. 9.9 to 9.11
found for which the distance does not anywhere decrease with increasing
depth. The default solution is shown in Figure 9.9b. The smoothing factor
is the standard deviation of the differences between the 18 average RASC
distances and the fitted spline curve. The standard deviation of residuals
( = 0.5664) representing differences between original RASC distances and
fitted spline-curve is also given in Figure 9.9b. It is noted that this value is
only slightly less than σ = 0.7 representing the theoretical standard
deviation along the RASC scale.
Figure 9.9e shows the relationship between depth and event level
with fitted spline-curve for SF = 0.02. It passes almost exactly through
the observed values. After display of this plot, the CASC user has the
option of either using this spline-curve in conjunction with the age-event
level plot of Figure 9.9b, or of by-passing the indirect procedure by directly
fitting a curve to the event-depth diagram in which event levels have been
replaced by their depths. The default result for the direct method is shown
in Figure 9.9f.
Fig. 9.11 Example of CASC displays for Indian Harbour well (continued from Figs. 9.9 and 9.10). (a)
Unsmoothed combination of Figs. 9.9e and 9.10c; note similarity with spline-curve in Fig. 9.9e for
SF = 0.1. (b) Curve of Fig. 9.11a smoothed with SF = 0.1. (c) Sedimentation rate in 0.1 km/my
corresponding to Fig. 9.11b. (d) Level-depth plot for SF = 0.0. (e) Spline-curve for small (default) SF
fitted to combination of Figs. 9.10c and 9.11d; note similarity with spline-curve in Fig. 9.10e. (f)
Sedimentation rate in 0.1 km/my corresponding to Fig. 9.11e; local maxima and minima are due to lack
of smoothness of spline-curve as explained in text.
occur in Figure 9.12e which is the first derivative of the event-depth curve
(Fig. 9.12d), obtained by combining the spline-curves of Figures 9.12a and
9.12c with one another using the indirect method. The smaller peak in
Figure 9.12e, which occurs at a depth of about 1600 m, represents the place
(level 14) where the curve of Figure 9.12a has its steepest dip. The same
Fig. 9.12 Example of CASC displays for Adolphus well (5/2/3 RASC results using 24 wells). (a) Age-level
plot for SF = 2.2. (b) First derivative corresponding to (a); note small peak near level 14. (c) Event
level-depth plot; note relatively steep slopes at depths near 0.7 km and 2.2 km, respectively. (d) Spline-curve
with small (default) SF fitted to combination of (a) and (c). (e) Sedimentation rate in 0.1 km/my
corresponding to (d); two relatively high peaks correspond to steeper slopes in (c); intermediate small
peak corresponds to highest first derivative in (b). (f) Event-depth spline-curve fitted directly to the data
using irregularly spaced depths; note similarity with spline-curve of (d); direct method yields poorer
results than indirect method when one or more intervals between successive ages are much larger than
average, due to high sedimentation rate or relative lack of microfossils.
TABLE 9.5
Information for Adolphus D-50 well used for CASC experiments of Figs. 9.12 to 9.14; ID are identification
numbers of foraminifers (cf. Tables 4.7 and 4.8); rank gives position of event in scaled optimum sequence;
age was derived from RASC distance; level refers to successive samples taken at different depths.
The basic idea of the smoothing spline was explained in Section 3.11.
It was pointed out that S representing the sum of standardized residuals in
Equation (3.23) is distributed as chi-squared. This result is derived from
statistical theory for the distribution of the variance $s^2$ (see
e.g. Hald, 1957, p. 278) which has mean $E(s^2) = \sigma^2$ and variance
$\mathrm{Var}(s^2) = 2\sigma^4/f$ where $f = n-1$. Setting $S = ns^2/\sigma^2$, it follows that $E(S) = n$
and $\mathrm{Var}(S) = 2f$. Thus the preceding interval extends from one standard
deviation below the mean (= n) to one standard deviation above it. This
idea has led users of smoothing splines to the choice S = n ("Reinsch's
suggestion", see e.g. Wahba, 1975). Because the smoothing factor is
defined as $SF = (S/n)^{1/2}$, the use of Reinsch's suggestion is equivalent to
setting SF = 1.0 if all values of $s(y_i)$ are known. This is in fact the method
of spline-fitting previously used for constructing geological time scales (see
Section 3.11) and in modified RASC (Chapter 8).
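The relation between S and SF can be checked numerically with a minimal Python sketch. It assumes that S is the sum of squared standardized residuals (Equation 3.23 is not reproduced in this section), and the observed values, fitted values and standard errors are hypothetical.

```python
import math

def smoothing_factor(observed, fitted, std_errors):
    """SF = sqrt(S / n), with S the sum of squared standardized residuals."""
    n = len(observed)
    S = sum(((y - s) / e) ** 2 for y, s, e in zip(observed, fitted, std_errors))
    return math.sqrt(S / n)

# Hypothetical ages, spline values and known standard errors s(y_i).
observed = [10.0, 20.0, 30.0, 40.0]
fitted = [11.0, 19.0, 32.0, 38.0]
std_errors = [1.0, 1.0, 2.0, 2.0]

sf = smoothing_factor(observed, fitted, std_errors)
print(sf)
```

Here every residual equals one standard error in absolute value, so S = n and SF = 1.0, which is the situation that Reinsch's suggestion aims for.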
(9.3)
(9.4)
TABLE 9.6
CASC 2 output for Adolphus D-50 (age-level plot, cf. Fig. 9.13a); smoothing factors SFk range from
1.8158 for k = 1 (first spline-curve satisfying law of superposition of strata) to 3.2519 for k = 11 (best
fitting straight line); optimum smoothing factor has lowest cross-validation value cvk.
k SFk cvk
1 1.8158 13.2344
2 1.9594 12.1967
3 2.1030 11.3395
4 2.2466 10.4125
5 2.3902 9.4796
6 2.5338 9.0214
7 2.6775 9.2514
8 2.8211 9.7706
9 2.9647 10.3320
10 3.1083 10.9710
11 3.2519 11.3223
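Selecting the smoothing factor with the lowest cross-validation value from such a table is a simple grid search. The Python sketch below uses the values of Table 9.6; it only locates the grid minimum and does not reproduce how CASC 2 computes the cross-validation values or whether it refines the result further.

```python
# (k, SFk, cvk) rows copied from Table 9.6.
table_9_6 = [
    (1, 1.8158, 13.2344),
    (2, 1.9594, 12.1967),
    (3, 2.1030, 11.3395),
    (4, 2.2466, 10.4125),
    (5, 2.3902, 9.4796),
    (6, 2.5338, 9.0214),
    (7, 2.6775, 9.2514),
    (8, 2.8211, 9.7706),
    (9, 2.9647, 10.3320),
    (10, 3.1083, 10.9710),
    (11, 3.2519, 11.3223),
]

# Row with the lowest cross-validation value.
k_opt, sf_opt, cv_opt = min(table_9_6, key=lambda row: row[2])
print(k_opt, sf_opt, cv_opt)
```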
Fig. 9.13 Analysis on example of Fig. 9.12 for Adolphus D-50 repeated using optimum smoothing factors
obtained by cross-validation for spline-curves in Figs. 9.13a and 9.13f. Largest differences in fitted
curves occur in Figs. 9.13b (cf. 9.12b) and 9.13f (cf. 9.12f). For further explanation see text.
(9.5)
,=l (9.6)
(9.7)
q, = ( n - 2 ) s L- ( n - 3 ) s
(9.8)
n- 1
(9.9)
Consequently:
(9.10)
Jackknife values can be obtained for all four coefficients which determine
a cubic curve for each of the (n-1) intervals between successive values $x_i$ and
$x_{i+1}$ (i = 1, 2, ..., n-1). Use of all coefficients results in a jackknife spline
which interpolates between these successive values.
For example, Table 9.7 shows the values $s_i$, $q_i$ and $s(q_i)$ for the
spline-curve of Figure 9.13a. The corresponding jackknife spline based on
complete sets of four coefficients is shown in Figure 9.14a together with
95 percent confidence intervals for the values $q_i$. Comparison of the values
$s(q_i)$ in Table 9.7 indicates that spline and jackknife spline for SF = 2.456
are close to one another. Nearly all standard deviations $s(q_i)$ are less
than SF. Only from level 10 to 14, the $s(q_i)$ values are relatively large as
can also be seen in Figure 9.14a. It would be possible to transfer the error
bars of Figure 9.14a to the data points in Figure 9.13d, and to project them
along the depth scale by one of the methods illustrated in Figure 9.7.
Instead of expressing the uncertainty of the observed events with respect
to their most likely positions, these new error bars would give the
uncertainty of the estimated ages themselves.
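The general idea of jackknife estimation can be shown with a short Python sketch. It uses the standard leave-one-out (Quenouille-Tukey) pseudo-values, which is an assumption made here for illustration: Equations (9.5) to (9.10) are not legible in this section and the weights used for the jackknife spline may differ. The estimator is a simple mean and the ages are hypothetical.

```python
import math

def jackknife(data, estimator):
    """Return the jackknife estimate and its standard deviation."""
    n = len(data)
    full = estimator(data)
    pseudo = []
    for i in range(n):
        reduced = data[:i] + data[i + 1:]  # leave observation i out
        pseudo.append(n * full - (n - 1) * estimator(reduced))
    q = sum(pseudo) / n
    variance = sum((p - q) ** 2 for p in pseudo) / (n - 1)
    return q, math.sqrt(variance / n)

def mean(values):
    return sum(values) / len(values)

ages = [52.0, 49.0, 40.0, 38.0, 37.0]  # hypothetical ages in Ma
q, s_q = jackknife(ages, mean)
print(round(q, 3), round(s_q, 3))
```

For the mean, the pseudo-values coincide with the observations, so the jackknife reproduces the ordinary mean and its standard error; for a spline value the same recipe is applied to the fitted value (or to each polynomial coefficient) instead.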
TABLE 9.7
CASC 2 output for Adolphus D-50 (age-level plot, cf. Figs. 9.13a and 9.14a); the values $s_i$ are situated on
the spline-curve with optimum smoothing factor (SF = 2.456); the values $q_i$ with standard deviations
$s(q_i)$ belong to the corresponding jackknife spline.
spline for SF = 2.2 is not as smooth as the spline for SF = 2.2 that was
originally selected in a subjective manner.
Fig. 9.14 Jackknife spline-curves with approximate 95% confidence limits for Adolphus D-50 results
previously shown in: a. Fig. 9.13a; b. Fig. 9.13f; c. Fig. 9.12a; d. Fig. 9.12f. The optimum smoothing
factor patterns of Figs. 9.13a and 9.13f are relatively closely approximated by their jackknife splines,
contrary to the subjectively selected spline-curves of Figs. 9.12a and 9.12f. The latter two jackknife
splines (Figs. 9.14c and 9.14d) show violations of the law of superposition of strata. In general, the
indirect method (Figs. 9.14a and 9.14c) yields results which are superior to those of the direct method
(Figs. 9.14b and 9.14d).
are characterized by lack of data along the depth axis due to relatively
high sedimentation rates. Although the subjectively derived event-depth
curve for SF = 2.1 (Fig. 9.12e) is relatively close to the "optimum"
event-depth curve shown in Figure 9.13d, it obviously could not be duplicated by
its jackknife estimator. This confirms that it may be dangerous to fit
splines to irregularly spaced data. The results of Figure 9.14 clearly
demonstrate that the indirect method of constructing event-depth curves
illustrated in Figure 9.6 is to be preferred to the direct method. The
discrepancy between the patterns of Figures 9.13d and 9.13f also can be
explained now. Although the pattern of Figure 9.13f is for an optimum
smoothing factor and was reasonably well duplicated by its jackknife
spline (Fig. 9.14b), the irregular spacing of control points along the depth
Fig. 9.15 Computer simulation experiment. Random normal deviates were added to theoretical curve
(A). Cross-validated smoothing spline (B) was approximated by its jackknife estimate (C). First
derivative of spline-curve (B) gave sediment accumulation rate curve (D, solid line) which is compared
with first derivative of theoretical curve (D, broken line).
TABLE 9.8
CASC 2 output for computer simulation experiment of Fig. 9.15. See Table 9.6 for explanation of column
headings.
k SFk cvk
1 0.828 3.607
2 0.897 3.314
3 0.967 2.968
4 1.036 2.565
5 1.106 2.219
6 1.175 2.131
7 1.245 2.192
8 1.314 2.290
9 1.384 2.395
10 1.453 2.445
11 1.523 2.448
observations used for estimating the spline-curve $s_i$, Table 9.9 also shows
residuals $E_{2i} = y_{2i} - q_i$ for new observations $y_{2i}$ obtained by adding 21
other random numbers to $t_i$. These new observations have wider
confidence belts with widths controlled by $s(E_{2i}) = SF\,\sqrt{1 + h_{ii}^2}$, also
shown in Table 9.9. This second type of confidence belt would, for example,
apply to test suspected outliers not used for calculating the smoothing
spline.
TABLE 9.9
Random normal deviates ($E_{0i}$) were added to theoretical values ($t_i$) on curve of Fig. 9.15a to give observed
values $y_i$. Cross-validated smoothing spline values ($s_i$) on curve of Fig. 9.15b were approximated by
their jackknife estimates ($q_i$) of which standard deviations $s(q_i)$ could be computed. Standard deviations
$s(E_{1i})$ and $s(E_{2i})$ are for residuals of data used ($E_{1i}$) and not used ($E_{2i}$) for estimating $q_i$, respectively.
1 10.00 0.54 10.54 9.69 9.79 1.02 0.75 0.59 0.48 1.55
Firstly, ten selected zone markers were traced through six wells.
Starting point was the Cenozoic scaled optimum sequence (Fig. 6.2, 7/2/4
run for 21 wells). For the interactive spline fitting of the bivariate plots,
all CASC defaults were accepted, unless otherwise specified. The first
CASC default is the smoothing factor (SF) that defines the spline-curve for
which an increase in position or depth along one axis does not anywhere
correspond to a decrease in position (or time) along the other axis. This
default is obtained by means of an algorithm that calculates spline-curve
fits with SF increasing or decreasing according to a binary search method.
The default satisfies the condition that the observed sedimentation rate is
never negative.
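A binary search of this kind can be sketched in Python as follows. The smoother used here is a deliberately simple stand-in (the data are shrunk toward their least-squares straight line until the RMS deviation equals SF), not the cubic smoothing spline of CASC; the function names and the data are hypothetical. The sketch also assumes that the best-fitting straight line is increasing, so that the upper end of the search interval is monotone.

```python
def rms(values):
    return (sum(v * v for v in values) / len(values)) ** 0.5

def line_fit(xs, ys):
    """Least-squares straight line evaluated at xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return [my + b * (x - mx) for x in xs]

def toy_smoother(xs, ys, sf):
    """Stand-in for a smoothing spline: blend data and straight line so that
    the RMS deviation between data and curve equals sf."""
    line = line_fit(xs, ys)
    sf_max = rms([y - l for y, l in zip(ys, line)])  # SF of the straight line
    w = min(max(sf / sf_max, 0.0), 1.0)
    return [(1 - w) * y + w * l for y, l in zip(ys, line)]

def is_monotone(curve):
    """True if the curve never decreases (no negative sedimentation rate)."""
    return all(b >= a for a, b in zip(curve, curve[1:]))

def default_sf(xs, ys, tol=1e-6):
    """Binary search for the smallest SF whose fitted curve never decreases."""
    lo = 0.0                                                  # interpolating curve
    hi = rms([y - l for y, l in zip(ys, line_fit(xs, ys))])   # straight-line limit
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_monotone(toy_smoother(xs, ys, mid)):
            hi = mid
        else:
            lo = mid
    return hi

xs = [1, 2, 3, 4, 5]                    # event levels
ys = [10.0, 22.0, 20.0, 35.0, 41.0]     # hypothetical ages with one reversal
sf = default_sf(xs, ys)
print(round(sf, 4))
```

The search interval runs from SF = 0 (curve through all points) to the SF of the best-fitting straight line, which mirrors the range of smoothing factors listed in Table 9.6.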
In three wells the mainframe CASC cursor option was used to delete
aberrantly positioned events: in Hibernia P-15, one point was deleted on
event level 12; in Bonavista C-99, three points were deleted (on levels 4, 11
and 15), and SF for the events versus depth graph was changed from its
default (= 0.02) to 0.15; and in Snorri J-90, one point was deleted on
level 6.
The results of the CASC multi-well comparison are shown in
Table 9.10, listing both the observed and the most likely depths of the ten
selected Paleocene through Miocene zone markers 50 (56), 90, 32, 29, 261
(260), 259, 24, 26 (15), 18 and 16. In two wells substitute taxa were
correlated rather than the three designated events 50, 261 and 26. The
substitutes 56, 260 and 15 are neighbors of the original events in the
optimum sequence.
In most instances, the observed and the most likely depth values are
within half the length of the error bar (68% probability) around the most
likely value. As pointed out in the previous section, the actual precision of
the estimated depth of an event in a well is probably greater than that
indicated by the local error bar for single event positions along the spline
curve. Also, the local error bar at any depth is initially calculated over the
time interval along the (scaled) optimum sequence scale (y), as defined by
twice the standard deviation (SD) in that (y) direction. It is directly
proportional to the fitted average sedimentation rate for each point (cf.
Fig. 9.7).
TABLE 9.10
Observed (above line) and most likely depth (in m) of ten Eocene through Miocene zone markers in six
wells. The fossil numbers are the RASC dictionary numbers. Results are based on optimum sequence
CASC (21 wells; $k_c = 7$, $m_{c1} = 2$, $m_{c2} = 4$); * means that at that site substitute fossils (neighbors in the
optimum sequence) were used.
Fig. 9.16 Tracing of ten foraminiferal events through six wells, using the CASC (optimum sequence)
method to calculate the most likely depths. Black bars show the deviations of these depths from the
observed depths. The chronostratigraphic segmentation is based on observed depths only.
Figure 9.16 graphically correlates the ten events through the six
wells. The conventional chronostratigraphic segmentation, which is
shown for comparison, only uses the observed depth of events. The new,
most likely, zone marker depths would lead to slight up or down
adjustments of the age boundaries. It could be assumed that such a change
might violate stratigraphic boundaries as adjusted for major lithology
changes as determined from well logs. However, using sonic and gamma
logs, no evidence for this was found. In the Snorri well there is no direct
micropaleontological evidence for the presence of events 259, 24 and 26,
associated with Oligocene-Early Miocene strata, although the CASC
method predicts their likely depth in this well. These depths are not
unreasonable given that Oligocene strata were thought to be present at
that depth, based on palynology.
57 Ma - event 194 - Planorotalites chapmani; disappears in standard zone P6. Specimens are often
transitional between P. chapmani and Pseudohastigerina. The latter is thought to appear at the
boundary of P5 and P6, or ± 57 Ma ago.
52 Ma - event 93 - Acarinina broedermanni; the species has its top well below A. densa, probably
in the A. pentacamerata - Hantkenina aragonensis Zone, near the Early-Middle Eocene boundary
at 52 Ma. In some RASC runs, A. broedermanni falls between Early and Middle Eocene zones.
49 Ma - event 90 - Acarinina densa; this is the time of the optimum climatic warming in the
Labrador Sea, in early Middle Eocene time. Less common at this time are A. senni, A. aff.
pentacamerata, A. aff. broedermanni, Morozovella caucasica, M. spinulosa, and M. aff.
aragonensis. The event probably falls in the Hantkenina aragonensis - Globigerinatheka
subconglobata Zone at Anomaly 21 time or 52-46 Ma (average 49 Ma).
(7) 40 Ma - event 29 - Cyclammina amplectens; in RASC runs this event falls below Turborotalia
pomeroli and Globigerina yeguaensis and above Acarinina densa. In Poland its peak occurrence is
so-called Middle Eocene; it is less frequent in upper Eocene strata (Gradstein and Berggren, 1981).
The event was tentatively placed at 40 Ma.
(8) 38 Ma - event 85 - Pseudohastigerina micra; same reasoning as for Turborotalia pomeroli (see
below), but often disappears in slightly older beds, as also shown in the scaled optimum sequence
(Fig. 6.2).
(9) 37 Ma - event 33 - Turborotalia pomeroli; co-occurs in southern wells with Subbotina linaperta,
Globigerina yeguaensis and Pseudohastigerina micra, of the Turborotalia cerroazulensis Zone, late
Late Eocene. The top was placed just below the inferred Eocene/Oligocene boundary.
(10) 28 Ma - event 24 - Turrilina alsatica; the top of this distinctive Oligocene taxon roughly equates
with the top of the Boom Clay in Belgium and the top of the Globorotalia opima opima Zone, at ±
28 Ma.
15 Ma - event 179 - Globorotalia scitula praescitula; probably occurs in the late Early to early
Middle Miocene warming event, as observed from the northern incursion of warmer water
planktonic taxa.
3.5 Ma - events 266, 4, 269 and 5 - Globorotalia puncticulata, G. inflata, G. crassaformis, and
Neogloboquadrina atlantica are thought to disappear with the onset of major glaciation in the
Labrador Sea, dated at approximately 3.5 Ma.
Four other events occur at or near significant breaks in the 5-2-3 and 7-2-4 scaling solutions for
21 and 24 wells. These breaks were equated with zonal boundaries and series breaks as follows:
Figure 9.9a was a plot of the ages of the previously listed events in a
RASC distance scale (21 wells, 7/2/4 run) versus linear time scale.
Smoothing of the spline-curve function diminishes some of the uncertainty
in subjective assignment. The spline function now can be used to convert
the RASC distance scale into an age scale.
Next, the question can be asked what is the most likely depth in the
wells of the principal boundaries between RASC zones, expressed in Ma.
Gradstein and Agterberg (1985) have traced the boundaries between the
successive Cenozoic RASC zones (Fig. 6.2), which are close approximations
to the boundaries between Paleocene and Eocene (~56 Ma), Early Eocene
and early Middle Eocene (~52 Ma), early Middle Eocene and Middle Eocene
(~49 Ma), Middle and Late Eocene (~44 Ma), Late Eocene and Oligocene
(~36 Ma), Oligocene and Miocene (~24 Ma), Early and Middle Miocene
(~16 Ma) and top of Middle Miocene (~12 Ma).
TABLE 9.11
Observed (above line) and most likely depth (in m) of the 56, 52, 49, 44, 36, 24, 16 and 12 Ma isochrons in
10 wells on the Grand Banks and Labrador Shelf. Results are based on scaled optimum sequence or
distance-CASC (21 wells; $k_c = 7$, $m_{c1} = 2$, $m_{c2} = 4$).
error bar estimate was deleted. In one well, Karlsefni H-13, both
foraminifers and palynomorphs agree on the absence of Oligocene beds
(Turrilina alsatica Zone). Batch CASC calculates a thin Oligocene
interval (24-36 Ma). Above the Eocene, the well has only a few data points
and results are crude.
The local error estimates of the most likely depths for the isochrons
are within 1 to 10% of the actual depth values, and more frequently 2 to
5%. In about ten cases the subjectively assigned depths for the zonal
boundaries as converted to isochrons are outside the 68% confidence limits
(±1 SD). For geological interpretations, it should be borne in mind that
the error in most likely depth is an upper limit, and the SD is probably
smaller by a factor that, amongst others, is related to the number of
observations per spline-curve, as explained earlier. Palynologically
determined depths for these stage boundaries often are outside the depth
interval (most likely depth ± 1 SD), calculated by CASC. The errors in
this independent biostratigraphic correlation are unknown, but the
comparison suggests that multiple biostratigraphy uncertainties exceed
the CASC-type of errors using one fossil discipline only. The conclusion
may be drawn that the CASC program is able to predict reliable and
objective well to well isochrons. The error expression, that remains vague
in conventional, subjective correlation schemes, is conservatively large
when one fossil discipline only is used.
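The comparison between subjective picks and CASC error bars can be sketched as a simple interval test. The function names and the depth values below are hypothetical; only the ±1 SD criterion comes from the text.

```python
def within_error_bar(observed_depth, most_likely_depth, sd, k=1.0):
    """True if the observed depth lies inside most_likely_depth +/- k * sd."""
    return abs(observed_depth - most_likely_depth) <= k * sd


def count_outside(picks, k=1.0):
    """Count (observed, most_likely, sd) triples that fall outside the +/- k SD interval."""
    return sum(1 for obs, ml, sd in picks if not within_error_bar(obs, ml, sd, k))


# Hypothetical isochron picks in metres: (subjective depth, most likely depth, SD).
PICKS = [(1500.0, 1520.0, 40.0), (2100.0, 2000.0, 60.0), (2750.0, 2760.0, 5.0)]
```

With these made-up values, the second and third picks lie outside their ±1 SD intervals, the third because its error bar is very narrow.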
Fig. 9.17 Correlation of 8 Cenozoic isochrons, according to their most likely depths in 10 wells on the
Grand Banks and Labrador Shelf. The depths were computed by means of the RASC-CASC method
explained in the text. Subjective estimates for the depths of these isochrons are shown with x.
Fig. 9.18 Locations of 13 boreholes used by Williamson (1987) for RASC/CASC application on northern
Grand Banks.
the same zonation. For this reason, the concise account of Williamson
(1987) is followed with minor changes, using his original illustrations.
Figure 9.19 is the scaled optimum sequence with
chronostratigraphically useful average interval zones highlighted through
shading. Based on the original RASC run with 54 events, to which
9 unique events were added for (further) chronostratigraphic calibration,
eleven RASC zones stand out, numbered from XI through I,
Kimmeridgian-Cenomanian. This zonation considerably expands the
stratigraphic resolution previously available. There is good
correspondence of the average position of the disappearance levels of the
taxa in the wells and the upper part of stratigraphic ranges reported in the
literature. Some taxa that are longer ranging in the literature have
relatively short ranges on the Grand Banks, as is the case with L. nodosa (no. 10)
and D. gradata (no. 111). N. varsoviensis (no. 64) was not previously
reported so young. The tight clustering of events in the Albian zones III
and II reflects considerable uncertainty on their exact disappearance
Fig. 9.19 Williamson’s (1987) eleven-fold average interval zonation, using ranking and scaling for the
Upper Jurassic and Lower Cretaceous foraminiferal record, northern Grand Banks. Asterisks indicate
unique events.
Fig. 9.20 Upper part: Depth values of RASC zones in northern Grand Banks wells. Numbers above
each boundary are based on subjective interpretation. Below each boundary are most likely depths using
the CASC method with error bars in meters in parentheses. Lower part: Comparison of subjective
(solid line) and most likely (dashed) depths for Cretaceous isochrons in northern Grand Banks wells
(after Williamson, 1987).
several good marker events. Each (CASC) age versus depth plot per well
was executed with isochron boundaries for zones and the result is
displayed in the lower half of Figure 9.20. The dashed lines are based on
the CASC method and the solid lines are a subjective interpretation. An
advantage of the CASC type of interpolation method is that it can be used
for isochron cross-sections at, for example, 1 m.y. intervals. Such cross-sections
as constructed by Williamson (cf. Williamson and Agterberg,
1990) have realistic geological properties and are of use in relating seismic
cross-sections to geochronologic results and in detection of hiatuses in one
or more wells. This type of application considerably enhances the role of
biochronology in regional basin studies.
Fig. 9.21 Burial history of Hibernia O-35 well accounting for CASC derived error limits (horizontal axis: age in Ma). Curve A has
minimum associated error, Curve B has maximum error.
as input into the program producing the two observed subsidence curves
shown. Such an approach provides an error envelope of burial curves
within which maturity calculations can be made which would help
determine the effect of chronology on the timing of peak generation and
expulsion of hydrocarbons.
Fig. 9.22 Isochron correlation between West Flying Foam and Flying Foam wells showing unconformity
and interpolations between known stratigraphic sections.
Fig. 9.23 RASC distance-event level plot for Morgan Creek section (horizontal axis: RASC distance). Spline-curve is for optimum (cross-validation)
smoothing factor SF = 0.382.
TABLE 9.12
Smoothing factor (SF) and cross-validation value (CV) for RASC distance versus event level plot of
Morgan Creek section (Fig. 9.23). A. Minimum and maximum SF values correspond to first
monotonically increasing spline and best-fitting straight line, respectively. B. Zooming in on window
provided optimum value SF = 0.38.
The curve of Figure 9.23 was combined with its line of observation
(depth-versus-level curve) to produce the RASC distance-depth plot of
Figure 9.24A. The lowest event (LO Kormagnostus simplex) with RASC
distance equal to 6.55 in the Morgan Creek Section is not shown in this
diagram which was redrafted from CASC 2 output. (The fitted curve does
not extend to 6.55 because some information was lost at the edges due to
use of cross-plots). Figures 9.24B and C show similar plots for the White
Creek and Pontotoc sections. The standard deviations for the three curves
are equal to 0.39, 0.36 and 0.61, respectively, and nearly equal to the
optimum smoothing factors (see before). The three fitted curves become
steeper in the downward stratigraphic direction reflecting higher
sedimentation rate (cf. Shaw, 1964). Figure 9.24 can be used to determine
the probable depths of specific RASC distances in the three sections for
automated stratigraphic correlation. Figure 9.25 shows the results of the
CASC comparison together with Palmer’s zones and Shaw’s R.S.T. values.
The modified local error bars (±1 SD) shown for RASC distances 2.0, 5.0,
and 6.0 illustrate that the uncertainty increases in the downward
direction due to the higher sedimentation rate. The three sets of lines of
correlation agree closely with one another near the tops of the sections
where biostratigraphic control is relatively good. It is noted that the lines
for Palmer’s zones were drawn through the locations of the collections with
the highest stratigraphic position classified as belonging to a particular
zone by Palmer (1955).
Fig. 9.24 Spline-curves for positions of RASC distance values in three sections obtained by means of
indirect method. A. Morgan Creek section. Curve of Fig. 9.23 was combined with curve for positions of
event levels according to method of Fig. 9.6. Second (cf. Fig. 9.6b) and third (cf. Fig. 9.6e) smoothing
factors used were equal to 0.02 and 0.2, respectively. Final standard deviation of deviations from curve is
SD = 0.390. B. White Creek section (SD = 0.357). C. Pontotoc section (SD = 0.615).
Fig. 9.25 Stratigraphic correlation of three sections by 3 methods. Palmer's (1955) zones and Shaw's
(1964) R.S.T. value correlation lines are superimposed on CASC results using spline-curves of Fig. 9.24.
Modified error bars extend one standard deviation on either side of probable positions for RASC distance
values equal to 2.0, 5.0 and 6.0, respectively. The uncertainty of the correlation lines increases in the
stratigraphically downward direction due to higher sedimentation rate.
Grabens, North Sea. Although CASC applications have not yet been
published, this case-history study is interesting because it involves
integration of biostratigraphic and lithostratigraphic information, seismic
stratigraphy and correlation of Cenozoic hiatuses across the Atlantic
Ocean.
Following the widespread deposition of Danian chalk, south of about
60°N, the North Sea Basin underwent rapid subsidence (Sclater and
Christie, 1980; Gradstein and Berggren, 1981; Wood, 1981). As a result,
terrigenous clastic sediments in excess of 3 km thick accumulated in the
central portion of the basin. Thickest sediments are found in the Central
Graben, whereas the Viking Graben received between 2 and 3 km of
sediment. Mudstones predominate, with deep marine clastic fans, like
those of the Forties and Frigg oil fields developing during the early stage
of Tertiary subsidence. In the Ekofisk area post-Danian olistostromes
occur. By Middle Miocene time the North Sea trough had been filled,
leaving a neritic environment with a predominantly calcareous benthic
microfauna dominated by Cassidulina, Elphidium, Fursenkoina, and
Cibicidoides.
The post-Danian, Paleocene through Early-Middle Miocene
mudstones harbour a rich and diversified flysch-type agglutinated benthic
fauna (Gradstein and Berggren, 1981), which includes over 60 taxa. Many
benthic taxa show minor and some major inconsistencies in relative
stratigraphic position of highest occurrence events as sampled in 29 wells
(Fig. 9.26). Over 2000 cuttings samples, sidewall cores, and some core
samples were analyzed, and the final analysis involves the tops of
147 benthics and relatively few planktonic taxa. The microfossil
distribution data were augmented by the relative positions in the wells of
physical log markers A through G as defined by A.C. Morton and
R.B. Knox (personal communication, 1984).
A close look at the North Sea analytical data shows that the southern
wells (blocks 21-38) contain more Oligocene-Miocene calcareous taxa,
including several species of planktonics, than the northern wells which
contain a more diversified Paleocene agglutinated record. This pattern of
geographic differentiation was further confirmed using correspondence
analysis (G.F. Bonham-Carter, personal communication, 1985). This
method clarifies the spatial distribution of co-occurring taxa. There may
be several reasons for this biogeographic trend, one of them being the fact
that the principal deep water connection was to the north in the
Norwegian Sea. The latter region does not have much of an indigenous
planktonic record. Another reason is that the post-Danian, Late
Paleocene-Eocene bathyal mudstone facies did not preserve much of a
carbonate record, owing to diagenetic effects (Gradstein and Berggren,
1981). A third reason is climatic; apparently the transition from carbonate
rich to carbonate poor rocks in Cenozoic time can be traced from south to
north over the central North Sea (Ziegler, 1981). The biogeographic
analysis indicates that for detailed regional studies two zonations are
required, one emphasizing the northern Paleogene record and the other
the southern Oligocene-Miocene record. In this section, emphasis is on
the generalized zonation which combines features from both the Central
Graben and Viking Graben deep water troughs.
The generalized Cenozoic North Sea zonation uses the RASC
thresholds k_c = 8, m_c1 = 1 and m_c2 = 5, which means that zonal taxa must
occur in 8 or more out of 29 wells and each pair of taxa in the scaled
optimum sequence in 5 or more wells. The threshold k_c reduces the
original data set of 147 events to 49 (Fig. 9.27), including 8 planktonic and
25 agglutinated taxa and the log markers A-G of Knox and Morton as
found in the majority of the wells studied. The dendrograms that display
the interfossil distances between the ranked taxa (Fig. 9.27) are stable
when RASC is run with k_c = 9 and 10 and m_c2 = 6 and 7, which
incorporate 45 and 41 taxa, respectively. In each situation the same zones
are recognized.
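The effect of the two thresholds can be illustrated with a small sketch. It assumes each well is given as a list of event codes; the function names and the toy data are hypothetical, and the code only counts occurrences; it does not reproduce the RASC program itself.

```python
from collections import Counter
from itertools import combinations


def events_meeting_kc(wells, kc):
    """Return the set of events that occur in at least kc wells."""
    counts = Counter(e for seq in wells.values() for e in set(seq))
    return {e for e, n in counts.items() if n >= kc}


def pairs_meeting_mc2(wells, events, mc2):
    """Return event pairs (from `events`) that co-occur in at least mc2 wells."""
    pair_counts = Counter()
    for seq in wells.values():
        present = sorted(set(seq) & events)
        pair_counts.update(combinations(present, 2))
    return {pair for pair, n in pair_counts.items() if n >= mc2}


# Hypothetical toy data set: three wells with event codes listed top-down.
WELLS = {"W1": ["A", "B", "C"], "W2": ["B", "A"], "W3": ["A", "C", "D"]}
```

In this toy case, a threshold of 2 wells drops event D, and a pair threshold of 2 wells keeps the pairs (A, B) and (A, C) but not (B, C).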
In order to enhance the zonation with index taxa that are rare or
other taxa that are thought to be potentially of such use, the RASC method
allows introduction of special or unique events (UE) occurring in one or a
few wells only. Twelve events were selected that occur in less than k_c = 8
wells, but are worth noting. These events are the highest occurrences of
(from old to young) Ammodiscus planus, Reticulophragmium garcilassoi,
Bulimina trigonalis, Turrilina robertsi, Haplophragmoides (aff.) jarvisi,
Adercotryma sp. 1 (formally described as Adercotryma agterbergi, n.sp. by
Gradstein and Kaminski, 1989), Globigerinatheka index, Turrilina
alsatica, Globigerina ex gr. officinalis, G. angustiumbilicata and Neogene
radiolarian flood. In the final RASC calculations, stratigraphic neighbors
of these events are identified. A neighbor is a species that occurs in the
scaled optimum sequence and also in the wells with the UE, and
stratigraphically as close as possible to it. Each UE is positioned between
these neighboring events in the scaled optimum sequence (cf. Section 6.8).
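The neighbor search for a unique event can be sketched as follows. The text states only that a UE is positioned between its neighboring events; placing it at the midpoint of their RASC distances is an assumption made here for illustration, and the function name and data are hypothetical.

```python
def place_unique_event(well_sequence, unique_event, rasc_distance):
    """Find the nearest scaled neighbors of a unique event in one well.

    well_sequence: event codes listed in stratigraphically downward order.
    rasc_distance: dict mapping events of the scaled optimum sequence to distances.
    Returns (neighbor_above, neighbor_below, assumed_position), where the position
    is the midpoint of the two neighbor distances (an illustrative assumption).
    """
    idx = well_sequence.index(unique_event)
    above = next((e for e in reversed(well_sequence[:idx]) if e in rasc_distance), None)
    below = next((e for e in well_sequence[idx + 1:] if e in rasc_distance), None)
    if above is None or below is None:
        raise ValueError("unique event needs a scaled neighbor on both sides")
    position = 0.5 * (rasc_distance[above] + rasc_distance[below])
    return above, below, position


# Hypothetical example: U is the unique event; X is not in the scaled sequence.
SEQUENCE = ["A", "U", "X", "B"]
DISTANCES = {"A": 1.0, "B": 3.0}
```

When a UE occurs in several wells, the neighbors found in each well would have to be reconciled; that step is not shown.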
Eleven interval zones are recognized (Figs. 9.27 and 9.28), with the
characteristic taxa listed stratigraphically in order of average
Fig. 9.27 Biozonation primarily based on agglutinated benthic foraminifers, Cenozoic, central North
Sea. The scaled optimum sequence is for the average tops of 54 foraminifers and siliceous microfossils
and physical log markers A-G in 29 wells. Dendrogram values are distances between events in relative
time. Scaling is stratigraphically downward, in line with the study of the wells. The generalized 10-fold
zonation is representative for the regional Cenozoic stratigraphy (see text). There are 11 unique
events (= rare events) shown with **. A shading pattern has been used to enhance the
stratigraphically most useful parts of the dendrograms. The large interfossil distances at the top of the
Danian, Late Selandian-Early Ypresian, Middle-Late Eocene, Late Oligocene-Early Miocene and
Middle Miocene are sedimentary cycle boundaries (from Agterberg and Gradstein, 1988).
Fig. 9.28 Relation between global model for (seismic) sequence stratigraphy (Vail, Hardenbol and
coworkers, pers. commun., 1986) and hiatuses based on scaling in time of the RASC zonations for the
central North Sea shown in Fig. 9.27 and the Canadian Atlantic margin shown in Fig. 6.2. Age tiepoints
for scaling are shown on each side of zonation in time. For explanation see text.
marker F top Upper Eocene; marker E top Lower Eocene; marker D top
Sele Formation (or equivalent); marker C base Sele Formation (top
Paleocene); marker B top Ekofisk Formation (top Lower Paleocene); and
marker A top Cretaceous. The log picks were expected to vary slightly in
stratigraphic position relative to the foraminiferal events in the wells and
were treated as “fossil events” in the calculations. Figure 9.27 shows the
calculated average stratigraphic position of these events. There is good
agreement between the ages assigned by Knox and Morton to the log picks
and the ages assigned to the accompanying zones.
Log marker A is always found at the level with Globotruncana below
the Danian zone (not shown in Fig. 9.27). Log marker B on average is in
the Danian, rather than at the top as suggested. Log markers C and D are
in the Coscinodiscus zone that delineates the ash-series that straddles the
Paleocene-Eocene boundary.
The top of log marker E is given as top of Lower Eocene in agreement
with its average occurrence slightly above the Subbotina patagonica zone,
Ypresian. The only serious exception to this average position was found to
be in well 23/22-1 where E occurs with Danian planktonics. The latter
may be reworked.
TABLE 9.13
Interpolated ages of the events in the central North Sea zonation of Fig. 9.27, using cubic spline fitting
for the age-RASC distance relationship of a subset of events (shown as *) for which age estimates (in
parentheses) are available in the literature.
(63 Ma), S. patagonica (55 Ma), T. robertsi (49 Ma), R. amplectens (40 Ma),
and T. alsatica (30 Ma) are in common with the central North Sea
zonation. Again, large Eocene, Oligocene and Miocene hiatuses stand out.
Haq et al. (1987) have related a global seismic-sequence stratigraphy
to chronostratigraphy. The sequences are composed of periods of offlap
(basinward movement of the shoreline) and onlap (landwards movement of
the shoreline). These sequences are thought to reflect global changes in
sealevel. If rate of sealevel fall exceeds rate of basin subsidence, such
events can exert considerable influence on shallow deep marine clastic or
carbonate deposition. A relative shift seaward of the shoreline may
disrupt sedimentation in shallow basins, and lead to a hiatus. In deeper
water, more mass-flow sediments may occur causing local deposition or
erosion. The sequences were adjusted to conform to the linear time scale
used for the tiepoints (Berggren et al., 1985), and the North Sea and
Canadian Atlantic margin zones and the seismic sequence stratigraphy
were placed side by side (Fig. 9.28). Not unexpectedly, the more prominent
basinward shifts of the shoreline, for convenience numbered 1 through 7,
approximately coincide in time with breaks in the zonations. As discussed
earlier, large breaks in the scaled optimum sequence of fossil events are
likely to match hiatuses or sudden changes in facies. Major shifts in
position of shorelines influence the sediment supply as well as erosion and
can be expected to exert control over the sedimentary sequences in the
Canadian offshore and Central and Viking Grabens. The latter, in turn,
influence the zonal boundaries of fossil assemblages.
Shift 1 in the North Sea may have coincided with replacement of the
Danian carbonates (S. pseudobulloides zone) by clastics (R. paupera - T.
ruthven murrayi zone). Shift 2 also is seen on the Rockall and Grand
Banks and may tie to a late Ypresian hiatus. Shifts 3 and 4 appear
associated with breaks in the uppermost Eocene and Oligocene, which
caused major disruptions in the fossil sequence both in Labrador and
North Sea wells. It is not easy to explain why in the deep central North
Sea a Late Eocene hiatus occurs. The mid-Oligocene shift 4 event appears
to have affected the deeper North Sea less than the shallower beds offshore
Canada. This is to be expected. Shift 5 does not match an Early Miocene
hiatus but events 6 and 7 bracket a Late Miocene break. In general, as
expected, the extent of the hiatuses and the presumed influence of sea
level changes increases stratigraphically upward with decreasing rate of
subsidence and sedimentation.
Fig. 9.29 Location map of the Labrador Shelf and Grand Banks wells used by D'Iorio and Agterberg
(1989).
TABLE 9.14
Names and ages of biozones of Fig. 9.30 and list of boundary events used to trace RASC biozones in CASC
multi-well comparison.
Labrador Shelf
Eleven wells were included in the Labrador Shelf group, the
southernmost one being Freydis. RASC biozones were correlated between
wells by tracing the depths of zone boundary events. These events were
chosen from Figure 9.30 and are listed in Table 9.14. When an event is not
found in a well, its expected depth was estimated from its RASC position.
The depths of the zone boundaries are listed in Table 9.15 and plotted in
Figure 9.31 (left side) for the Labrador Shelf wells.
The zone boundaries in the youngest or oldest parts of the wells may
not always be shown because of either the scarcity of data points, or the
specific shapes of the spline curves. The Bjarni, Cartier, Leif M-48 and
Freydis wells show more closely spaced zone boundaries, probably
indicating a lower sediment accumulation rate. This is in contrast with
the northern wells, which appear to have greater sediment accumulation
rates.
Fig. 9.30 Biozonation model of the Cenozoic of the Labrador Shelf and Grand Banks based on an
integrated databank of foraminifers, dinoflagellates, and spores and pollen.
TABLE 9.15
CASC depths of biozone boundaries of Table 9.14. Errors are standard deviations.
Well name  I-II  II-III  III-IV  IV-V  V-VI
Rut H-11  *3.16 ± 0.63  *3.00 ± 0.63
Karlsefni H-13  3.71 ± 0.90  *2.94 ± 0.20  *2.91 ± 0.15  2.76 ± 0.09  *2.74 ± 0.10
Snorri J-90  *2.95 ± 0.06  *2.53 ± 0.51  *2.48 ± 0.21  2.15 ± 0.05  2.14 ± 0.03
Herjolf M-92  *1.99 ± 0.19  *1.97 ± 0.18  1.76 ± 0.21  1.72 ± 0.22
Bjarni H-81  *1.99 ± 0.12  *1.83 ± 0.13  *1.82 ± 0.12  1.67 ± 0.08  *1.65 ± 0.08
Gudrid H-55  *2.37 ± 0.39  2.10 ± 0.16  2.08 ± 0.16  1.90 ± 0.11  *1.88 ± 0.09
Cartier D-70  *1.77 ± 0.08  *1.76 ± 0.08  1.55 ± 0.26  1.50 ± 0.17
Indian Harbour M-52  2.99 ± 0.07  2.53 ± 0.27  *2.47 ± 0.35  2.28 ± 0.32  2.16 ± 0.47
Leif M-48  *1.69 ± 0.07  *1.67 ± 0.09  1.59 ± 0.03  1.58 ± 0.04
Leif E-38
Freydis B-87  *1.54 ± 0.10  *1.39 ± 0.12  *1.38 ± 0.11  1.28 ± 0.05  *1.27 ± 0.07
Hare Bay E-21  *3.06 ± 0.12  *2.88 ± 0.11  *2.86 ± 0.10  2.27 ± 0.67  2.18 ± 0.26
Blue H-28  4.74 ± 0.05  4.73 ± 0.05  4.43 ± 0.64  *4.29 ± 0.55
Bonavista C-99  *3.46 ± 0.13  3.44 ± 0.13  3.21 ± 0.37  3.12 ± 0.25
Cumberland B-55  3.58 ± 0.18  *3.31 ± 0.10  3.29 ± 0.12  2.89 ± 0.28  *2.81 ± 0.28
Bonanza M-71  3.31 ± 0.18  3.28 ± 0.18  2.90 ± 0.40  2.78 ± 0.63
Dominion O-23  2.50 ± 0.44  2.44 ± 0.29  1.94 ± 0.21  1.89 ± 0.22
South Tempest G-88  2.30 ± 0.11  2.27 ± 0.13  1.71 ± 0.25  1.59 ± 0.44
Flying Foam I-13  *1.96 ± 0.15  *1.94 ± 0.18  1.67 ± 0.16  1.62 ± 0.23
Adolphus D-50  *2.63 ± 0.10  2.28 ± 0.45  2.16 ± 0.46  1.78 ± 0.13  1.69 ± 0.34
Hibernia P-15  1.25 ± 0.19  1.20 ± 0.10
Egret K-36
Osprey H-84  *0.76 ± 0.06  *0.75 ± 0.06
Grand Banks
The Grand Banks group consists of twelve wells, the northernmost
one being Hare Bay. The Egret K-36 well is not included in the correlation
chart because it is shallow and has a very condensed section.
The zone boundary events listed in Table 9.14 also were traced in the
Grand Banks wells and plotted in Figure 9.31 (right side). The depth of
the boundaries and their respective local error estimates are presented in
Table 9.15.
Fig. 9.31 Biozone correlation chart of the Labrador Shelf wells (left side) and the Grand Banks wells
(right side). The zone boundaries are given in Table 9.14.
rate. The Osprey well exhibits relatively more closely spaced zone
boundaries than other wells; this is presumably due to its more distant
position from the terrigenous sediment supply (see Fig. 9.29). The Blue
well shows all zones at greater depths to the sea floor.
CHAPTER 10
COMPUTER PROGRAMS FOR RANKING, SCALING
AND REGIONAL CORRELATION OF STRATIGRAPHIC EVENTS
10.1 Introduction
The RASC computer program for ranking and scaling of
biostratigraphic events was originally written between 1978 and 1981 for
mainframe computers. It was followed by the CASC program for
correlation and scaling in time. In 1985, it became possible, after
relatively minor modification, to compile the FORTRAN code of the RASC
and CASC computer programs on IBM compatible microcomputers.
At present, several versions of these programs are in existence in
different languages (primarily FORTRAN, C and BASIC). A brief history
of the development of RASC and CASC with references is given at the end
of this chapter. The existing programs are only slightly different from one
another. As a rule, later versions are more user-friendly than earlier ones.
The reader wishing to use RASC on a microcomputer (or mainframe) may
obtain a copy of Program RASC (Ranking and Scaling), version 12, which
at the time of writing (1990) is distributed free of charge by the Committee
on Quantitative Stratigraphy (CQS). (Please send 360 KB floppy diskette
to F.M. Gradstein, Chairman, CQS, Atlantic Geoscience Centre, Bedford
Institute of Oceanography, Dartmouth, N.S., Canada, B2Y 4A2). This
enhanced batch version of RASC in FORTRAN 77 by Agterberg et al.
(1989) contains source code, executable (EXE) files and test data files. It
can be executed on a PC with math co-processor. CASC is available as a
mainframe program (Agterberg et al., 1985). Agterberg and Byron (1990)
are preparing micro-RASC for release as a Geological Survey of Canada
Open File.
The micro-RASC system consists of 12 separate program modules. It
makes use of the characteristic features of microcomputers. Except for
Module 1 which can be used to create new input files, each module reads
one or more input files and creates one or more output files. This allows
flexibility for program development because separate modules can be
revised and replaced without changing the remainder of the system.
The RASC method requires as input a sequence (SEQ) file with coded
sequences of stratigraphic events for individual sections, a dictionary with
event names (DIC file), and a parameter (PAR) file with settings of
switches and values of parameters. The CASC method requires depth
(DEP) files for individual sections. Module 1 allows preparation of data
(DAT) files from which SEQ files and preliminary DEP files are generated
automatically. Examples of DAT file formats are:
in the step model. Penalty points are assigned for each position that an
event is out of place in a section. Kendall’s rank correlation coefficient
can be computed from the total number of penalty points per section. The
relative order of events in each section is compared to that in the optimum
sequence in the scattergrams.
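The link between penalty points and Kendall's rank correlation coefficient can be sketched as follows. The sketch assumes that the penalty total for a section equals the number of event pairs whose order in the section is reversed relative to the optimum sequence, and that there are no coeval (tied) events; the function name is hypothetical. Under these assumptions tau = 1 - 2Q / (n(n-1)/2), where Q is the number of reversed pairs and n the number of events in common.

```python
from itertools import combinations


def penalty_and_tau(section, optimum):
    """Return (reversed_pairs, kendall_tau) for one section.

    section: events as observed in the section, in order, without ties.
    optimum: events in optimum-sequence order.
    Only events present in both lists are compared.
    """
    rank = {event: i for i, event in enumerate(optimum)}
    common = [e for e in section if e in rank]
    n = len(common)
    if n < 2:
        raise ValueError("at least two events in common are required")
    # A pair is reversed when the event listed first in the section
    # ranks later in the optimum sequence.
    reversed_pairs = sum(1 for a, b in combinations(common, 2) if rank[a] > rank[b])
    total_pairs = n * (n - 1) // 2
    return reversed_pairs, 1.0 - 2.0 * reversed_pairs / total_pairs
```

For example, the section A, C, B, D compared with the optimum sequence A, B, C, D has one reversed pair out of six, which gives tau = 1 - 2/6, or about 0.67.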
distribution in standard form. It is assumed that all events have the same
variance for deviations between their regional mean positions and
observed positions within individual sections. In modified RASC, the
variances of the events can be different. They are estimated by means of
an iterative procedure. Firstly, spline-curves are fitted to the events in
common between the scaled optimum sequence and individual sections in
order to project the regional mean positions onto the sections, and to collect
all deviations for each event. Secondly, the variance of the deviations for
each event is used for scaling which yields a new set of cumulative RASC
distances. These two steps are repeated until approximate convergence is
reached. Modified RASC allows identification of low-variance events
which can be used as marker horizons. In addition to different event
variances, this procedure provides frequency distributions of individual
events which may be positively or negatively skewed. Maximum
deviations can be used for constructing a conservative range chart in
which the ranges are based on regional highest and lowest observed
occurrences of fossils (cf. Sections 8.7 to 8.9).
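The second step of the iteration, estimating a separate variance for each event, can be sketched in a few lines. The sketch assumes that the deviations have already been collected per event and takes the variance as the mean squared deviation about zero; the spline fitting, the rescaling and the convergence loop are not shown. Function names, the cutoff value and the data are hypothetical.

```python
def event_variances(deviations):
    """Mean squared deviation per event.

    deviations: dict mapping event -> list of (observed - projected) positions.
    Events with fewer than two deviations are skipped.
    """
    return {
        event: sum(d * d for d in values) / len(values)
        for event, values in deviations.items()
        if len(values) >= 2
    }


def marker_candidates(variances, max_variance):
    """Events whose variance does not exceed the chosen (hypothetical) cutoff."""
    return sorted(e for e, v in variances.items() if v <= max_variance)


# Hypothetical deviations in RASC distance units.
DEVIATIONS = {"A": [0.1, -0.1], "B": [1.0, -1.0, 0.5], "C": [0.2]}
```

With these values, event A has a variance of 0.01 and would qualify as a marker candidate under a cutoff of 0.05, whereas event B (0.75) would not.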
obtained. These include the local and modified local error bars for
deviations between observed depth of events and the probable depths used
for correlation. Local and modified local error bars basically are error bars
along the time axis which have been projected along the depth axis by
assuming locally constant and variable rates of sediment accumulation,
respectively.
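The two kinds of projection can be sketched as follows, assuming the fitted relation is available as a function `curve` that returns depth for a given position on the time (or RASC distance) axis. The local bar multiplies the error along the time axis by the slope at that point (constant rate); the modified local bar evaluates the curve at both ends of the time interval (variable rate), so it may be asymmetric. Function names and the example curve are hypothetical.

```python
def local_error_bar(curve, t, sd, h=1e-3):
    """Depth interval assuming a locally constant sediment accumulation rate."""
    slope = (curve(t + h) - curve(t - h)) / (2.0 * h)  # central difference
    half_width = abs(slope) * sd
    depth = curve(t)
    return depth - half_width, depth + half_width


def modified_local_error_bar(curve, t, sd):
    """Depth interval assuming a variable rate: project both ends of t +/- sd."""
    low, high = sorted((curve(t - sd), curve(t + sd)))
    return low, high


def example_curve(t):
    """Hypothetical depth curve (m) that steepens downward."""
    return 100.0 * t * t
```

For the example curve at t = 2.0 with SD = 0.5, the local bar is symmetric about 400 m (200 to 600 m), whereas the modified local bar runs from 225 to 625 m, wider on the deeper side where the curve is steeper.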
1.6 Do you wish to enter rotary table height and water depth?
Condition: Switch 1.5 is on.
1.12 Do you wish to subtract a constant from all dictionary numbers that
are read in?
Parameter name: NSTART (Default value NSTART = 0).
Default: As usual, no changes are made in the dictionary numbers.
Module 2: PREPROCESSING
2.1 Do you wish to set the threshold parameter for minimum number of
sections in which an event should occur?
Parameter name: IOCR (Default value: IOCR = 3)
Default: The minimum number of sections in which an event should occur is equal to 3.
2.2 Are you dealing with two separate groups of fossils which should have
different threshold parameters?
Condition: Switch 1.12 is on.
Parameter name: IOCR2 (Default value: IOCR2 = 0)
Default: As usual, you wish to use a single threshold parameter for minimum number of sections.
2.3 Do you wish to define unique events? (i.e. special rare events that
occur fewer than IOCR times)
2.4 Do you wish to define marker horizons?
2.5 Do you wish to see intermediate tabulations?
Default: Intermediate tabulations (e.g. recoded sequence data) will not be shown in the output.
Module 3: RANKING
3.1 Do you wish to perform presorting?
3.2 Do you wish to apply the modified Hay method?
3.3 Do you wish to set the threshold parameter for minimum number of
sections in which a pair of events should occur?
Parameter name: CRIT1 (Default value: CRIT1 = 1.0)
Default: All frequencies will be used for the modified Hay method.
3.10 Do you wish to add ranking results to depth files for use in CASC?
Condition: Switch 1.9 is on.
Default: As usual, CASC will not be applied to ranking results.
3.11 Do you wish to re-insert unique events into the optimum sequence?
Default: Unique events will not be re-inserted into the optimum sequence.
Module 4: SCALING
4.1 Do you wish to set the threshold parameter for minimum number of
sections in which a pair of events should occur?
Parameter name: CRIT2 (Default value: CRIT2 = 2.0)
Default: All frequencies for pairs occurring in two or more sections will be used for scaling.
4.8 Do you wish to suppress re-insertion of unique events into the scaled
optimum sequence?
Default: As usual, unique events will be re-inserted into the scaled optimum sequence.
4.13 Are you planning to construct a regional time scale using ages (in Ma)
of selected events?
Default: Regional time scale (Module 9) will not be constructed.
4.14 Do you wish to add scaling results to depth files for use in CASC?
Condition: Switch 1.9 is on; Switch 3.10 is off.
Default: No use will be made of observed superpositional relations between events that are above
one another in the original scaled optimum sequence with a probability of approximately 95
percent.
7.2 Do you wish to use the jackknife standard deviations for construction
of a regional time scale?
Condition: Switch 4.13 is on.
9.3 Do you want to substitute ages for RASC distances in depth files?
Condition: Switch 4.14 is on.
Default: CASC will be based on the RASC distances.
Module 10: CASC 1: EVENT-DEPTH CURVES
10.1 Are you using an optimum sequence with ranks only?
Condition: Switch 3.10 or Switch 4.14 is on.
Default: You are using the scaled optimum sequence supplemented by RASC distances or ages (in
Ma).
10.2 If some events are observed to be coeval, do you wish to work with
separate events at approximately the same event levels?
Default: Events observed to be coeval at a given level will be averaged.
10.3 Should each average for an event level be weighted according to the
numbers of coeval events on which it is based?
Condition: Switch 10.1 is off.
10.4 Do your depth files contain standard deviations for separate events
which are not equal to one another?
Condition: Switch 10.1 is off.
Default: All events will be weighted equally.
10.7 Will you use the indirect method for estimating event-depth
relations?
Default: The direct method will be used for estimation.
10.8 Do you want to study the first derivatives and sediment accumulation
curves?
10.9 Do you wish to use defaults except for the age-level relation?
Condition: Switch 10.7 is on.
Default: You will have to select smoothing factors for the event-depth and age-interpolated depth
relations in each section.
10.10 Do you wish to use the minimum smoothing factor and other defaults
in all sections?
Condition: Switch 10.6 is on; Switch 10.7 is off.
Default: Sections will be analyzed separately one after another.
10.11 Do you wish to use plot axes defined during analysis of the first depth
file later, for the other depth files?
Default: You can let the program define default plot axes or define new plot axes for any section.
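Prompts 10.2 and 10.3 concern how coeval events are combined into event levels. The following is a minimal sketch in Python of one way to read them, assuming each observation is a (depth, value) pair in which the value is a RASC distance or an age. The function name `average_coeval_events`, the data layout, and the numbers are hypothetical and are not taken from the CASC program.

```python
from collections import defaultdict


def average_coeval_events(observations, weight_by_count=False):
    """Average the values of events observed at the same depth level.

    observations: iterable of (depth, value) pairs (hypothetical layout).
    Returns a list of (depth, mean_value, weight) tuples sorted by depth.
    The weight is the number of coeval events when weight_by_count is True
    (the choice offered by prompt 10.3); otherwise every level gets weight 1.
    """
    by_depth = defaultdict(list)
    for depth, value in observations:
        by_depth[depth].append(value)

    levels = []
    for depth in sorted(by_depth):
        values = by_depth[depth]
        mean = sum(values) / len(values)
        weight = len(values) if weight_by_count else 1
        levels.append((depth, mean, weight))
    return levels


# Three coeval events at 1200 m and a single event at 1350 m (made-up data).
obs = [(1200.0, 2.0), (1200.0, 3.0), (1200.0, 4.0), (1350.0, 5.5)]
levels = average_coeval_events(obs, weight_by_count=True)
```

With these made-up data the level at 1200 m receives the mean value 3.0 and weight 3, while the level at 1350 m keeps its single value with weight 1. How CASC uses such weights in the subsequent curve fitting is not shown here.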
11.5 Do you wish to use the beam deformation analogue method for cubic
spline smoothing?
Condition: Switch 10.2 is on; Switch 10.4 is off.
Default: As usual, a modification of De Boor’s program for cubic spline smoothing will be used.
Remark: The next prompt asks for the name of the first depth file to be analyzed by means of
Module 11.
Module 12: CASC 3: MULTI-WELL COMPARISON
12.1 Do you wish to specify the sections to be used for correlation?
Default: All sections analyzed by means of Module 10 or Module 11 will be used for correlation.
12.6 Do you wish to define a new t-value for the error bars?
Condition: Switch 12.5 is on.
Parameter name: TVALUE (Default value: TVALUE = 2.0)
Default: As usual, the approximation t = 2.0 for 95 per cent confidence intervals will be used.
12.7 Do you want statistical analysis results for spline-curve values and
studentized residuals as well?
Default: As usual, statistical analysis will be restricted to deviations between observed and
calculated values.
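The role of TVALUE in prompt 12.6 can be illustrated with a short sketch. It assumes that an error bar is a symmetric interval, estimate plus or minus t times a standard error; the function name `error_bar` and the numbers are hypothetical, and the actual CASC computation may differ in detail.

```python
TVALUE = 2.0  # default t-value quoted in prompt 12.6


def error_bar(estimate, std_error, t_value=TVALUE):
    """Return (lower, upper) limits: estimate -/+ t_value * std_error."""
    half_width = t_value * std_error
    return estimate - half_width, estimate + half_width


# Hypothetical estimated depth of 1500 m with a standard error of 25 m.
low, high = error_bar(1500.0, 25.0)
narrow = error_bar(1500.0, 25.0, t_value=1.5)
```

With the default t = 2.0 the illustrative interval runs from 1450 to 1550 m; defining a smaller t-value narrows the bar in proportion.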
Cross, T.A. (Editor), 1990. Quantitative Dynamic Stratigraphy. Prentice Hall, Englewood Cliffs, New
Jersey, 625 pp.
Cubitt, J.M. (Editor), 1978. Quantitative Stratigraphic Correlation. Comput. Geosci., 4 (3): 215-318.
Cubitt, J.M. and Reyment, R.A. (Editors), 1982. Quantitative Stratigraphic Correlation. Wiley,
Chichester, U.K., 320 pp.
Davaud, E., 1982. The automation of biochronological correlation. In: J.M. Cubitt and R.A. Reyment
(Editors), Quantitative Stratigraphic Correlation, Wiley, Chichester, pp. 85-99.
Davaud, E. and Guex, J., 1978. Traitement analytique ‘manuel’ et algorithmique de problèmes
complexes de corrélations biochronologiques. Eclogae Geol. Helv., 71: 581-610.
David, H.A., 1988. The Method of Paired Comparisons (Second Edition). Oxford Univ. Press, New York,
N.Y., 200 pp.
David, M., 1977. Geostatistical Ore Reserve Estimation. Elsevier, Amsterdam, 364 pp.
Davidson, R.R., 1970. On extending the Bradley-Terry model to accommodate ties in paired comparison
experiments. J. Amer. Statist. Assoc., 65: 317-328.
Davis, J.C., 1986. Statistics and Data Analysis in Geology, 2nd Edition. Wiley, New York, N.Y., 646 pp.
De Boor, C., 1978. A Practical Guide to Splines. Springer Verlag, New York, 392 pp.
Dienes, I., 1974. General formulation of the correlation problem and its solution in two special
situations. Math. Geol., 6: 73-81.
Dienes, I., 1982. Formalized Eocene stratigraphy of Dorog Basin, Transdanubia, Hungary, and related
areas. In: J.M. Cubitt and R.A. Reyment (Editors), Quantitative Stratigraphic Correlation,
Wiley, Chichester, pp. 19-42.
Dienes, I. and Mann, C.J., 1977. Mathematical formalization of stratigraphic terminology. Math. Geol.,
9: 587-603.
D’Iorio, M.A., 1986. Integration of foraminiferal and dinoflagellate data sets in quantitative
stratigraphy of the Grand Banks and Labrador Shelf. Bull. Canadian Petroleum Geology, 34:
277-283.
D’Iorio, M.A., 1987. Quantitative biostratigraphic analysis of the Cenozoic of 23 Canadian Atlantic
offshore wells. The Compass, 64: 264-277.
D’Iorio, M.A., 1988. Quantitative biostratigraphic analysis of the Cenozoic of the Labrador Shelf and
Grand Banks: Unpublished Ph.D. thesis, Univ. of Ottawa, 404 p.
D’Iorio, M.A., 1990. Sensitivity of the RASC model to its critical probit value. In: F.P. Agterberg and
G.F. Bonham-Carter (Editors), Statistical Applications in the Earth Sciences, Geol. Surv. Can.
Paper 89-9.
D’Iorio, M.A. and Agterberg, F.P., 1989. Marker event identification technique and correlation of
Cenozoic biozones on the Labrador Shelf and Grand Banks. Bull. Canadian Petroleum Geol., 37:
346-357.
Dixon, W.J. and Massey, F.J., 1957. Introduction to Statistical Analysis. McGraw-Hill, New York, N.Y.,
488 pp.
Doeven, P.H., 1983. Cretaceous nannofossil stratigraphy and paleoecology of the Canadian Atlantic
Margin. Bull. Geol. Surv. Can. no. 356, 70 pp.
Doeven, P.H., Gradstein, F.M., Jackson, A., Agterberg, F.P. and Nel, L.D., 1982. A quantitative
nannofossil range chart. Micropal., 28: 85-92.
Doveton, J.H., 1986. Log Analysis of Subsurface Geology: Concepts and Computer Methods. Wiley, New
York, N.Y., 273 p.
Drobne, K., 1977. Alvéolines paléogènes de la Slovénie et de l’Istrie. Mém. Suisses Paléont., 99, 175 pp.
Drooger, C.W., 1974. The boundaries and limits of stratigraphy. Proc. Kon. Ned. Akad. Wet. Ser. B,
17: 159-176.
Duris, C.S., 1980. Algorithm 547, FORTRAN routines for discrete cubic spline interpolation and
smoothing. ACM Transact. Math. Softw., 6: 92-103.
Edwards, L.E., 1978. Range charts and no-space graphs. Computers and Geosc., 4: 247-258.
Edwards, L.E., 1982. Numerical and semi-objective biostratigraphy: Review and predictions. Proc. 3rd
North Am. Pal. Conv., Montreal, August 1982, 1: 147-152.
Edwards, L.E., 1984. Insights on why graphic correlation (Shaw’s method) works. J. Geology, 92:
583-597.
Edwards, L.E., 1989. Supplemented graphic correlation: A powerful tool for paleontologists and
nonpaleontologists. Soc. Econ. Paleontologists and Mineralogists, Research Reports, pp. 127-143.
Edwards, L.E. and Beaver, R.J., 1978. The use of paired comparison models in ordering stratigraphic
events. J. Math. Geol. 10: 261-272.
Efron, B., 1982. The Jackknife, the Bootstrap and Other Resampling Plans. Soc. for Industrial and
Applied Mathematics, Philadelphia, Pennsylvania, 92 pp.
Eubank, R.L., 1984. The hat matrix for smoothing splines. Statist. and Prob. Letters, 2: 9-14.
Eubank, R.L., 1988. Spline Smoothing and Nonparametric Regression. Dekker, New York, N.Y.,
438 pp.
Finney, D.J., 1971. Probit Analysis (3rd Edition). Cambridge Univ. Press, 333 pp.
Fisher, R.A. and Yates, F., 1964. Statistical Tables for Biological, Agricultural and Medical Research
(6th Edition). Oliver and Boyd, Edinburgh, 146 pp.
Foster, N.H., 1966. Stratigraphic leak. Am. Assoc. Pet. Geol. Bull., 50: 2604-2606.
Fulkerson, D.R. and Gross, O.A., 1965. Incidence matrices and interval graphs. Pacific J. Math. 15:
835-855.
Gale, N.H., Beckinsale, R.D. and Wadge, A.J., 1980. Discussion of a paper by McKerrow, Lambert and
Chamberlain on the Ordovician, Silurian and Devonian time scales. Earth Plan. Sc. L., 51: 9-17.
Gill, D. and Merriam, D.F. (Editors), 1979. Geomathematical and Petrophysical Studies in
Sedimentology. Pergamon, Oxford, 266 pp.
Gilmore, P.C. and Hoffman, A.J., 1964. A characterization of comparability graphs and interval graphs.
Can. J. Math. 16: 539-548.
Glenn, W.A. and David, H.A., 1960. Ties in paired-comparison experiments using a modified
Thurstone-Mosteller model. Biometrics, 16: 86-109.
Golub, G.H., Heath, M. and Wahba, G., 1979. Generalized cross-validation as a method for choosing a
good ridge parameter. Technometrics, 21: 215-223.
Gordon, A.D., 1982. An investigation of two sequence-comparison statistics. Austral. J. Statistics, 24:
332-342.
Gordon, A.D., Clark, A.M. and Thomson, R., 1988. The use of constraints in sequence slotting. In:
E. Diday (Editor), Data Analysis and Informatics V, North Holland Publishing Co., Amsterdam,
pp. 353-364.
Gradstein, F.M., 1984. On stratigraphic normality. Computers and Geosciences, 10: 43-57.
Gradstein, F.M., 1985. Ranking and scaling in exploration micropaleontology. In: F.M. Gradstein et al.,
Quantitative Stratigraphy, Unesco, Paris and Reidel, Dordrecht, pp. 109-160.
Gradstein, F.M. and Agterberg, F.P., 1982. Models of Cenozoic foraminiferal stratigraphy -
Northwestern Atlantic Margin. In: J.M. Cubitt, and R.A. Reyment (Editors), Quantitative
Stratigraphic Correlation, Wiley, Chichester, pp. 119-173.
Gradstein, F.M., and Agterberg, F.P., 1985. Quantitative correlation in exploration micropaleontology.
In: F.M. Gradstein et al., Quantitative Stratigraphy, UNESCO, Paris and Reidel, Dordrecht,
pp. 309-360.
Gradstein, F.M. and Berggren, W.A., 1981. Flysch-type agglutinated foraminifera and the
Maestrichtian to Paleogene history of the Labrador and North Seas. Marine Micropal., 6: 211-268.
Gradstein, F.M. and Fearon, M., 1990. STRATCOR, a new method for biozonation and correlation with
applications to exploration micropaleontology (Summary). In: F.P. Agterberg and G.F. Bonham-
Carter (Editors), Statistical Applications in the Earth Sciences, Geol. Surv. Can. Paper 89-9.
Gradstein, F.M. and Kaminski, M.A., 1989. Taxonomy and biostratigraphy of new and emended species
of Cenozoic deep-water agglutinated Foraminifera from the Labrador and North Seas. Micropal.
35: 72-92.
Gradstein, F.M. and Srivastava, S.P., 1980. Aspects of Cenozoic stratigraphy and paleogeography of the
Labrador Sea and Baffin Bay. Palaeogeogr., Palaeoclimatol., Palaeoecol., 30: 261-295.
Gradstein, F.M. and Williams, G.L., 1976. Biostratigraphy of the Labrador Shelf, I. Geol. Surv. Canada
Rept. 349, 40 pp.
Gradstein, F.M., Agterberg, F.P., Aubry, M.-P., Berggren, W.A., Flynn, J.J., Hewitt, R., Kent, D.V.,
Klitgord, K.D., Miller, K.G., Obradovitch, J., Ogg, J.G., Prothero, D.R. and Westerman, G.E.G.,
1988. Sea level history. Science 241: 599-605.
Gradstein, F.M., Agterberg, F.P., Brower, J.C. and Schwarzacher, W.S., 1985. Quantitative
Stratigraphy. Unesco, Paris, and Reidel, Dordrecht, 598 p.
Gradstein, F.M., Agterberg, F.P. and D'Iorio, M.A., 1990. Time in quantitative stratigraphy. In:
T.A. Cross (Editor), Quantitative Dynamic Stratigraphy. Prentice-Hall, Englewood Cliffs, N.J.,
pp. 519-542.
Gradstein, F.M., Fearon, J.M. and Huang, Z., 1989. BURSUB and DEPOR version 3.50 - Two FORTRAN
77 programs for porosity and subsidence analysis. Geol. Surv. Can. Open File 1283.
Gradstein, F.M., Kaminsky, M. and Berggren, W.A., 1988. Cenozoic foraminiferal stratigraphy of the
Central North Sea. In: F. Rogl and F.M. Gradstein (Editors), Proc. 2nd Agglutinated Foraminifera
Workshop, Vienna, 1986, Abhandlungen der Geologischen Bundesanstalt, 41: 97-108.
Gradstein, F.M., Williams, G.L., Jenkins, W.A.M. and Ascoli, P., 1975. Mesozoic and Cenozoic
stratigraphy of the Atlantic continental margin, eastern Canada. In: G.T. Yorath et al. (Editors),
Canada's Continental Margins and Offshore Petroleum Exploration, Can. Soc. Petroleum Geol.
Mem. 4, pp. 103-121.
Grimm, E.C., 1987. CONISS: A FORTRAN 77 program for stratigraphically constrained cluster
analysis by the method of incremental sum of squares. Computers and Geosciences, 13: 13-35.
Guex, J., 1977. Une nouvelle méthode d'analyse biochronologique, note préliminaire. Bull. Soc. Vaud.
Sci. Nat., 73: 309-321.
Guex, J., 1980. Calcul, caractérisation et identification des associations unitaires en biochronologie.
Bull. Soc. Vaud. Sci. Nat., 75: 111-126.
Guex, J., 1981. Associations virtuelles et discontinuités dans la distribution des espèces fossiles: un
exemple intéressant. Bull. Soc. Vaud. Sci. Nat., 75: 179-197.
Guex, J., 1987. Corrélations biochronologiques et Associations unitaires. Presses Polytechniques
Romandes, Lausanne, Switzerland, 264 pp.
Guex, J., 1988. Utilisation des horizons maximaux résiduels en biochronologie. Bull. Soc. Vaud. Sci.
Nat., 79.2: 135-142.
Guex, J. and Davaud, E., 1984. Unitary associations method: Use of graph theory and computer
algorithm. Computers and Geosciences, 10: 69-96.
Gygi, R.A. and McDowell, F.W., 1970. Potassium argon ages of glauconites from a biochronologically
dated Upper Jurassic sequence of northern Switzerland. Eclogae Geol. Helvetiae, 63: 111-118.
Hald, A., 1957. Statistical Theory with Engineering Application. Wiley, New York, N.Y., 783 pp.
Hald, A., 1960. Statistical Tables and Formulas. Wiley, New York, 97 pp.
Hallam, A., 1975. Jurassic Environments. Cambridge Univ. Press, Cambridge, 269 pp.
Haq, B.U., Hardenbol, J. and Vail, P.R., 1987. Chronology of fluctuating sea levels since the Triassic.
Science, 235: 1156-1166.
Hardenbol, J., Vail, P.R. and Ferrer, J., 1981. Interpreting paleoenvironments; subsidence history and
sea-level changes of passive margins from seismics and biostratigraphy. Oceanologica Acta 1981,
sp., pp. 33-44.
Harland, W.B., Cox, A.V., Llewellyn, Pickton, C.A.G., Smith, A.G. and Walters, R., 1982. A Geologic
Time Scale. Cambridge Univ. Press, 131 pp.
Harper, C.W., Jr., 1981. Inferring succession of fossils in time: The need for a quantitative and
statistical approach. J. Paleont., 55: 442-452.
Harper, C.W., Jr., 1984. A Fortran IV program for comparing ranking algorithms in quantitative
biostratigraphy. Computers and Geosciences, 10: 3-29.
Hay, W.W., 1972. Probabilistic stratigraphy. Eclogae Geol. Helv., 65: 255-266.
Hay, W.W. and Southam, J.R., 1978. Quantifying biostratigraphic correlation. Annual Review of Earth
and Planet Sc., 6: 353-375.
Hazel, J.E., 1977. Use of certain multivariate and other techniques in assemblage zonal biostratigraphy,
examples utilizing Cambrian, Cretaceous, and Tertiary benthic invertebrates. In: E.G. Kauffman
and J.E. Hazel (Editors), Concepts and Methods of Biostratigraphy, Dowden, Hutchinson and
Ross, Stroudsburg, Pennsylvania, pp. 187-212.
Hedberg, H.D. (Editor), 1976. International Stratigraphic Guide. Wiley, New York, N.Y., 200 pp.
Heller, M., Gradstein, W.S., Gradstein, F.M. and Agterberg, F.P., 1983. RASC FORTRAN IV computer
program for ranking and scaling of biostratigraphic events. Geological Survey of Canada Open
File 922.
Heller, M., Gradstein, W.S., Gradstein, F.M., Agterberg, F.P. and Lew, S.N., 1985. RASC Fortran 77
computer program for ranking and scaling of biostratigraphic events. Geological Survey of
Canada Open File 1203.
Hemelrijk, J., 1952. A theorem on the sign test when ties are present. Kon. Nederl. Akad. Wetensch.,
Proc., 55: 322.
Hibbert, P., 1990. Spline smoothing by means of an analogy to structural beams. In: F.P. Agterberg and
G.F. Bonham-Carter (Editors), Statistical Applications in the Earth Sciences, Geol. Surv. Can.
Paper 89-9.
Hill, M.O., 1979. DECORANA - a FORTRAN program for detrended correspondence analysis and
reciprocal averaging: Ecology and Systematics. Cornell Univ. Ithaca, New York, 52 pp.
Hohn, M.E., 1978. Stratigraphic correlation by principal components: effects of missing data. J. Geol.,
86: 524-532.
Hohn, M.E., 1985. SAS program for quantitative stratigraphic correlation by principal components.
Computers and Geosciences, 11: 471-477.
Howell, J.A., 1983. A FORTRAN 77 Program for automatic stratigraphic correlation: Computers and
Geosciences, 9: 311-327.
Hudson, C.B. and Agterberg, F.P., 1982. Paired comparison models in biostratigraphy. J. Math. Geol.
14: 141-159.
Jackson, A., Lew, S.N. and Agterberg, F.P., 1984. DISSPLA program for display of dendrograms from
RASC output. Computers and Geosciences, 10: 159-165.
Jasko, T., 1984. The first find; estimation of the precision of range zone boundaries. Computers and
Geosc., 10: 133-136.
Jeletzky, J.A., 1965. Is it possible to quantify biochronological correlation? J. Paleont., 39: 135-140.
Jenkins, G.M. and Watts, D.G., 1968. Spectral Analysis and its Application. Holden-Day, San
Francisco, 525 pp.
Johnson, N.I. and Kotz, S., 1969. Discrete Distributions. Houghton Mifflin Company, Boston,
Massachusetts, 328 pp.
Jones, D.J., 1958. Displacement of microfossils. J. Sediment. Petrol., 28: 453-467.
Kemp, F., 1982. An algorithm for the stratigraphic correlation of well logs. J. Math. Geol., 14: 271-285.
Kemple, W.G., Sadler, P.M. and Strauss, D., 1990. A prototype constrained optimization solution to the time
correlation problem. In: F.P. Agterberg and G.F. Bonham-Carter (Editors), Statistical
Applications in the Earth Sciences, Geol. Surv. Can. Paper 89-9.
Kendall, M.G., 1975a. Rank Correlation Methods. Griffin, London, 202 pp.
Kendall, M.G., 1975b. Multivariate Analysis. Hafner, New York, N.Y., 210 pp.
Kendall, M.G. and Stuart, A., 1961. The Advanced Theory of Statistics, Volume 2. Hafner, New York,
676 pp.
Kent, D.V. and Gradstein, F.M., 1985. A Jurassic and Cretaceous geochronology. Geol. Soc. America
Bull., 96: 1419-1427.
Kent, D.V., and Gradstein, F.M., 1986. A Jurassic to Recent Chronology. In: P.R. Vogt and
B.E. Tucholke (Editors), The western Atlantic region, Vol. M, The Geology of North America,
Geol. Soc. Am., pp. 45-50.
King, C., 1983. Cainozoic micropaleontological biostratigraphy of the North Sea. Rept. Inst. Geol.
Sciences No. 82/7, 40 pp.
Kwon, B.D. and Rudman, A.J., 1979. Correlation of geologic logs with spectral methods. Math. Geology,
11: 373-390.
Lapin, L.L., 1982. Statistics for Modern Business Decisions, 3rd Edition. Harcourt, Brace, and
Jovanovich, Inc., New York, N.Y., 887 pp.
Lerche, I., 1990. Philosophies and strategies of model building. In: T.A. Cross (Editor), Quantitative
Dynamic Stratigraphy, Prentice Hall, Englewood Cliffs, New Jersey, pp. 21-44.
McKenzie, R.M., 1981. The Hibernia - a classic structure. Oil and Gas J., September, 1981, pp. 243-247.
McKerrow, W.S., Lambert, R.St.J. and Chamberlain, V.E., 1980. The Ordovician, Silurian and
Devonian time scales. Earth Plan. Sc. L., 51: 1-8.
McLaren, D.J., 1978. Dating and correlation, a review. In: G.V. Cohee, and others (Editors),
Contributions to the geologic time scale. American Ass. Petroleum Geologists, Studies in
Geology 6, pp. 1-7.
Macellari, C.E., 1986. Late Campanian-Maastrichtian ammonite fauna from Seymour Island (Antarctic
Peninsula). J. Paleont., 60: 1-55.
Magara, K., 1976. Thickness of removed sedimentary rocks, paleopressure, and paleotemperature,
southwestern part of western Canada Basin. Am. Assoc. Petroleum Geologists Bull., 60: 554-565.
Maher, L.J., 1972. Nomograms for computing 0.95 confidence limits of pollen data. Rev. Palaeobotany
Palynology, 23: 85-93.
Maher, L.J., 1981. Statistics for microfossil concentration measurements employing samples spiked with
marker grains. Rev. Palaeobotany Palynology, 32: 153-191.
Mann, C.J. and Dowell, T.P.L., Jr., 1979. Quantitative lithostratigraphic correlation of subsurface
sequences. Computers and Geosciences, 4: 295-306.
Menning, M., 1989. A synopsis of numerical time scales, 1917-1986. Episodes, 12(1): 3-5.
Millendorf, S.A., Brower, J.C. and Dyman, T.S., 1978. A comparison of methods for the quantification of
assemblage zones. Computers and Geosciences, 4: 229-242.
Miller, F.X., 1977. The graphic correlation method in biostratigraphy. In: E.G. Kauffman and
J.E. Hazel (Editors), Concepts and methods of biostratigraphy, Dowden, Hutchinson and Ross, Inc.,
Stroudsburg, USA, pp. 165-186.
Miller, K.G. and Fairbanks, R.G., 1985. Cainozoic δ18O record of climate and sealevel. S. Afr. J. Sci., 81:
248-249.
Miller, R.G., 1974. The Jackknife - a review. Biometrika, 61: 1-17.
Mohan, M., 1985. Geohistory analysis of Bombay High region. Marine and Petroleum Geology, 2:
350-360.
Mosteller, F., 1951. Remarks on the method of paired comparisons, I, The least squares solution
assuming equal standard deviations and equal correlations. Psychometrika, 16: 3-9.
Mouterde, R., Ruget, C. and Tintant, H., 1973. Le passage Oxfordien - Kimmeridgien au Portugal
(regions de Torres-Vedras et du Montejunto). Com. Ren. Acad. Sc. Paris, 277 (Sér. D): 2645-2648.
Muller, C. and Willems, W., 1981. Nannoplankton en planktonische foraminiferen uit de Ieper-Formatie
(Onder-Eoceen) in Vlaanderen (Belgie). Natuurw. Tijdschr., 62: 64-71.
Nazli, K., 1988. Geostatistical modelling of microfossil abundance data in upper Jurassic shale, Tojeira
sections, central Portugal. Unpublished M.Sc. thesis, Univ. Ottawa, 369 pp.
Nowlan, G.S., 1986. Paleontology: ancient and modern. Geoscience Canada, 13 (2): 67-72.
Odin, G.S. (Editor), 1982. Numerical Dating in Stratigraphy, Parts I and II. Wiley-Interscience,
Chichester, 1040 pp.
Olea, R.A., 1988. Correlator - an interactive computer system for lithostratigraphic correlation of
wireline logs. Kansas Geol. Survey, Lawrence, Kansas, Petrophysical Ser. 4, 85 pp.
Oleynikov, N.A. and Rubel, M. (Editors), 1988. Quantitative Stratigraphy - Retrospective Evaluation
and Future Development. Institute of Geology, Acad. Sc. Estonian SSR, Tallinn, U.S.S.R., 167 pp.
Palmer, A.R., 1954. The faunas of the Riley formation in central Texas. J. Paleont., 28: 709-786.
Postuma, J.A., 1971. Manual of Planktonic Foraminifera. Elsevier, Amsterdam, 420 pp.
Quenouille, M., 1949. Approximate tests of correlation in time series. J. Royal Statist. Soc. Ser. B., 11:
18-84.
Rao, C.R., 1973. Linear Statistical Inference and its Applications. Wiley, New York, N.Y., 625 p.
Reinsch, C.H., 1967. Smoothing by spline functions. Numerische Mathematik, 10: 177-183.
Reinsch, C.H., 1971. Smoothing by spline functions. II. Numerische Mathematik, 16: 451-454.
Reyment, R.A., 1980. Morphometrical Methods in Biostratigraphy. Academic Press, London, 175 pp.
Reyment, R. and Sturesson, U., 1987. Correlation of chemical and physical environmental fluctuations
in a late Cretaceous borehole sequence - A multivariate study. Sed. Geol. 53: 311-325.
Riedel, W.R., 1979. Recent and potential advances in DSDP biostratigraphy. Am. Ass. Petr. Geol. Bull.,
63: 516.
Roberts, F., 1976. Discrete Mathematical Models. Prentice-Hall, Englewood Cliffs, N.J., 559 p.
Roberts, F., 1978. Graph Theory and its Applications to Problems of Society. Regional Conference Series
in Applied Mathematics 29, SIAM, Philadelphia, Penn., 122 pp.
Royden, L., Sclater, J.G. and Von Herzen, R.P., 1980. Continental margin subsidence and heat flow:
Important parameters in formation of petroleum hydrocarbons. Bull. Am. Assoc. Petr. Geol., 64:
173-187.
Russell, D.A., 1975. Reptilian diversity and the Cretaceous-Tertiary transition in North America. Geol.
Ass. Can. Spec. Paper 13: 119-136.
Russell, D.A., 1977. The biotic crisis at the end of the Cretaceous period. National Museums of Canada,
Syllogeus, no. 12, pp. 11-23.
Rubel, M., 1978. Principles of construction and use of biostratigraphical scales for correlation.
Computers and Geosciences, 4: 243-246.
Rubel, M. and Pak, D.N., 1984. Theory of stratigraphic correlation by means of ordinal scales.
Computers and Geosciences, 10: 97-105.
Salin, Yu.S., 1989. Computerized stratigraphic correlation by means of a geochronological scale. In: A.
Oleynikov and M. Rubel (Editors), Quantitative Stratigraphy-Retrospective Evaluation and
Future Development, Acad. Sciences Estonian S.S.R., Institute of Geology, Tallinn, pp. 73-80.
Sankoff, D. and Kruskal, J.B. (Editors), 1983. Time Warps, String Edits, and Macromolecules: The
Theory and Practice of Sequence Comparison. Addison Wesley, London, 382 p.
Schindewolf, O.H., 1950. Grundlagen und Methoden der palaontologischen Chronologie, 3rd Ed.
Borntraeger, Berlin, 152 pp.
Schlumberger, 1979. Log Interpretation Charts. Schlumberger Ltd., New York, 92 p.
Schoenberg, I.J., 1964. Spline functions and the problem of graduation. Proc. National Academy of
Sciences of the U.S.A., 52: 947-950.
Schwarzacher, W., 1985a. Principles of quantitative lithostratigraphy - the treatment of single sections.
In: Quantitative Stratigraphy, UNESCO, Paris and Reidel, Dordrecht, pp. 361-386.
Schwarzacher, W., 1985b. Lithostratigraphic correlation and sedimentation models. In:
F.M. Gradstein et al., Quantitative Stratigraphy, UNESCO, Paris, and Reidel, Dordrecht,
pp. 387-418.
Sclater, J.G., and Christie, P.A.F., 1980. Continental stretching: an explanation of the post
mid-Cretaceous subsidence of the central North Sea basin. J. Geophys. Res., 85: 371-379.
Shaw, A.B., 1964. Time in Stratigraphy. McGraw-Hill, New York, 365 pp.
Shaw, B.R., 1978. Parametric interpolation of digitized log segments. Computers and Geosciences, 4:
277-283.
Signor, P.W. and Lipps, J.H., 1982. Sampling bias, gradual extinction patterns and catastrophes in the
fossil record. In: L.T. Silver and P.H. Schulz (Editors), Geological Implications of Impacts of Large
Asteroids and Comets on the Earth. Geol. Soc. Am., Special Pap. 190, pp. 291-296.
Silverman, B.W., 1984. A fast and efficient cross-validation method for smoothing parameter choice in
spline regression. J. American Statistical Ass., 79: 584-589.
Smith, D.G. and Fewtrell, M.D., 1979. A use of network diagrams in depicting stratigraphic time
correlation. Geol. Soc. London J., 136: 21-28.
Smith, T.F. and Waterman, M.S., 1980. New stratigraphic correlation techniques. J. Geol., 88: 451-457.
Southam, J.R., Hay, W.W. and Worsley, T.R., 1975. Quantitative formulation of reliability in
stratigraphic correlation. Science, 188: 357-359.
Springer, M. and Lilje, A., 1988. Biostratigraphy and gap analysis: the expected sequence of
biostratigraphic events. J. Geol., 96: 228-236.
Srivastava, S.P. (Editor), 1986. Geophysical maps and geological sections of the Labrador Sea. Geol.
Survey Canada, Paper 85-16, 11 pp.
Stainforth, R.M., Lamb, J.L., Luterbacher, H., Beard, J.H. and Jeffords, R.M., 1975. Cenozoic planktonic
foraminifera zonation and characteristics of index forms. Univ. Kansas Paleont. Contr., no. 62,
pp. 1-162.
Stam, B., Gradstein, F.M., Lloyd, P. and Gillis, D., 1987. Algorithms for porosity and subsidence history.
Computers and Geosciences, 13 (2).
Stam, B., 1987. Quantitative Analysis of Middle and Late Jurassic Foraminifera from Portugal and its
Implications for the Grand Banks of Newfoundland. Utrecht Micropaleontological Bull. 34,
167 pp.
Strauss, D. and Sadler, P.M., 1989. Classical confidence intervals and Bayesian probability estimation
for ends and local taxon ranges. Math. Geol., 21: 411-427.
Sullivan, F.R., 1965. Lower Tertiary nannoplankton from the California Coast Ranges; II. Eocene.
Univ. Calif. Publ. Geol. Sc., 53: 1-52.
Thomas, F.C., Gradstein, F.M. and Griffiths, C.M., 1988. Bibliography and Index of Quantitative
Biostratigraphy. Special Publ. No. 1, Comm. Quantitative Stratigraphy, Bedford Inst. Oceanogr.,
Dartmouth, N.S., Canada, 58 pp.
Tipper, J.C., 1988. Techniques for quantitative stratigraphic correlation: a review and annotated
bibliography. Geol. Mag., 125 (5): 475-494.
Tjalsma, R.C. and Lohmann, G.P., 1983. Paleocene - Eocene bathyal and abyssal benthic Foraminifera
from the Atlantic Ocean. Micropal., Spec. Publ., no. 4, 76 pp.
Tocher, K.D., 1950. Extension of the Neyman-Pearson theory of tests to discontinuous variates.
Biometrika, 37: 130.
Tukey, J., 1958. Bias and confidence in not quite large samples. Annals Math. Statist., 29: 614.
Tukey, J.W., 1977. Exploratory Data Analysis. Addison-Wesley, Reading, Massachusetts, 688 pp.
Utreras, F., 1981. Optimal smoothing of noisy data using spline functions. SIAM J. Stat. Comput., 2:
349-362.
Vail, P.R. and Mitchum, R.M., Jr., 1979. Global cycles of relative changes of sea-level from seismic
stratigraphy. Am. Ass. Petr. Geol. Mem. 29: 469-472.
Vail, P.R., Mitchum, R.M., Jr. and Thompson, S., III, 1977. Seismic stratigraphy and global changes of
sealevel. Part 4. Mem. Am. Assoc. Pet. Geol. 26: 83-97.
Van Hinte, J.E., 1978. Geohistory analysis, application of micropaleontology in exploration geology.
Am. Assoc. Petrol. Geol. Bull., 62: 201-227.
Van Hinte, J.E., 1984. Synthetic seismic sections from biostratigraphy. Am. Ass. Petr. Geol. Mem. 34:
674-685.
Van Valen, L. and Sloan, R.E., 1977. Ecology and the extinction of the dinosaurs. Evolutionary Theory,
2: 37-64.
Vrbik, J., 1985. Statistical properties of the number of runs of matches between two random
stratigraphic sections: Mathematical Geology, 17: 29-40.
Watts, A.B., and Steckler, M.S., 1981. Subsidence and tectonics of Atlantic-type continental margins.
Oceanologica Acta, vol. 4, suppl. 1981, no. SP, pp. 143-153.
Wahba, G., 1975. Smoothing noisy data with spline functions. Numerische Mathematik, 24: 383-393.
Waterman, M.S. and Raymond, R., Jr., 1987. The match game: new stratigraphic correlation
algorithms. Math. Geol. 19: 109-127.
Waterman, M.S., Smith, T.F. and Beyer, W.A., 1976. Some biological sequence metrics. Adv. Math., 20:
367-387.
Wegman, E.J. and Wright, I.W., 1983. Splines in statistics. J. American Statistical Ass., 78: 351-365.
White, J.M., 1990. Exploration of a practical technique to estimate the relative abundance of rare
palynomorphs using an exotic spike. In: F.P. Agterberg and G.F. Bonham-Carter (Editors),
Statistical Applications in the Earth Sciences, Geol. Surv. Can. Paper 89-9.
Whittaker, E.T., 1923. On a new method of graduation. Proc. Edinburgh Math. Soc., 41: 63-75.
Wilkinson, E.M., 1974. Techniques of data analysis - seriation theory: Archaeo-Physika, 5: 1-142.
Williams, D.F., 1990. Selected approaches of chemical stratigraphy to time-scale resolution and
quantitative dynamic stratigraphy. In: T. A. Cross (Editor), Quantitative Dynamic Stratigraphy,
Prentice Hall, Englewood Cliffs, New Jersey, pp. 543-565.
Williams, D.F., Lerche, I. and Full, W.E., 1988. Isotope chronostratigraphy: Theory and Methods.
Academic Press, San Diego, 352 pp.
Williamson, M.J., 1987. Quantitative biozonation of the Late Jurassic and Early Cretaceous of the East
Newfoundland Basin. Micropaleontology, 33: 37-65.
Williamson, M.A. and Agterberg, F.P., 1990. A quantitative foraminiferal correlation of the late
Jurassic and early Cretaceous offshore Newfoundland. In: F.P. Agterberg and G.F. Bonham-
Carter (Editors), Statistical Applications in the Earth Sciences, Geol. Surv. Can. Paper 89-9.
Wilson, L.R., 1964. Recycling, stratigraphic leakage and faulty techniques in palynology. Grana
Palynologica, 5: 427-436.
Wold, S., 1974. Spline functions in data analysis. Technometrics 16 (1): 1-11.
Wood, R.I., 1981. The subsidence history of the Conoco well 15/30-1, Central North Sea, Earth and
Planetary Sci. Lett., 54: 306-312.
Worsley, T.R. and Jorgens, M.L., 1977. Automated biostratigraphy. In: A.T.S. Ramsay (Editor),
Oceanic Micropaleontology, Academic Press, London, 2: 1201-1229.
Ziegler, P.A., 1981. Evolution of Sedimentary basins in Northwest Europe. In: L.V. Illing and
G.D. Hobson (Editors), Petroleum Geology of the Continental Shelf of Northwest Europe. Inst. of
Petroleum, London, pp. 3-39.