
The Toumba Building at Lefkandi: A Statistical Method for Detecting a Design-Unit
Author: Jari Pakkanen
Source: The Annual of the British School at Athens, Vol. 99 (2004), pp. 257-71
Published by: British School at Athens

THE TOUMBA BUILDING AT LEFKANDI: A STATISTICAL METHOD FOR DETECTING A DESIGN-UNIT1


INTRODUCTION

BESIDES being an archaeological and architectural problem, deriving the length of foot-standards or of any other design-units from building dimensions is a statistical one. It is very difficult, though hopefully not entirely impossible, to address issues related to the original design of an ancient building based on a set of modern measurements: the execution of the building might already have been quite different from the design,2 the buildings are often poorly preserved, and modern measurement errors are common. However, using a statistical method might help us overcome some of these problems. For example, if the building was laid out using multiples of a certain design-unit, can we detect that? The answer depends mainly on the accuracy of both the execution and our measurements: if the errors are significantly smaller than the unit and we have a sufficient number of relatively reliable measurements, then we do have a chance of solving the problem. In statistical analyses the error factor can be taken into account, and we may also tackle questions such as how significant our results are and how accurate the conclusions derived from the data can be. It is, therefore, quite inexplicable that statistical methods are not regularly employed in metrological analyses of ancient buildings.3

It has recently been suggested that a foot-unit of c. 0.30 m was used in the design of the monumental Early Iron Age building at Lefkandi.4 The aim of this paper is to study, based on a set of measurements published by the excavators,5 whether either one of the following alternatives can be regarded as more reasonable: with no a priori information on Protogeometric metrology (if such a phenomenon indeed existed), could it be that the recorded building dimensions arise from a smooth measurement distribution, or is there, more likely, a single underlying design-unit or, in statistical terms, a quantum or a basic
1 I wish to dedicate this article to the memory of my father Ahti Pakkanen for all his support over the years; especially, I should have liked to have had the opportunity to thank him also in print for locating and sending a copy of von Mises's article published in 1918, the first study employing the sum of cosine values for determining the size of an unknown unit in a data set (for the full reference, see n. 10). I am grateful to Michael Baxter for pointing me towards D. G. Kendall's work on unit detection, and I also wish to thank the anonymous referee for the comments I received on the manuscript of the paper. The following short titles are used in this article: 'Archaeostatistics' = N. R. J. Fieller, 'Archaeostatistics: old statistics in ancient contexts', Statistician, 42 (1993), 279-95. 'Hunting quanta' = D. G. Kendall, 'Hunting quanta', Philosophical Transactions of the Royal Society of London. Mathematical and Physical Sciences, A 276 (1974), 231-66. 'Layout' = J. A. K. E. de Waele, 'The layout of the Lefkandi "Heroon"', BSA 93 (1998), 379-84. 'Methodological reflections' = J. and P. Pakkanen, 'The Toumba building at Lefkandi: some methodological reflections on its plan and function', BSA 95 (2000), 239-52.

Statistics in Archaeology = M. Baxter, Statistics in Archaeology (London, 2003).

2 J. J. Coulton, 'Towards understanding Greek temple design: general considerations', BSA 70 (1975), 93-8. 3 The work of R. C. A. Rottländer is an exception; see e.g. 'Studien zur Verwendung des Rasters in der Antike II', ÖJh 65 (1996), 1-86. 4 See 'Layout'; the analysis presented in this paper supersedes the brief metrological study in 'Methodological reflections', 242-5, where de Waele's view is shown to be problematic. 5 For the measurements, see M. R. Popham, P. G. Calligas, and L. H. Sackett (eds) with J. Coulton and H. W. Catling, The Protogeometric Building at Toumba, Part 2: The Excavation, Architecture and Finds (BSA Supp. 23; London, 1992), 33-49; 'Layout', 380-3.


quantity hidden behind the data?6 The Toumba building does not provide an ideal data set for metrological analysis because of the limited number of accurate measurements, but since the issue has been raised, it obviously deserves a thorough study. The emphasis of this paper is on demonstrating how metrological issues can be approached in a statistically valid way, and for these purposes this case study is as illuminating as any other within the field of Greek architecture. I shall try to keep the technical descriptions mainly in the footnotes, but the reader will notice some mathematical formulae in the middle of the text as well: I think a metrological study can only gain from not over-simplifying the fairly complex issue of detecting unit lengths in architectural data.
METHOD

In architectural studies the standard metrological method of deriving the length of the
possibly used foot-standard is roughly the following: measurements of some arbitrarily chosen main elements, such as the width and length of the building or column dimensions, are expressed in terms of the ancient foot-unit that seems to give the best fit. Then the rest of the dimensions are expressed in terms of this unit, and it is often necessary to introduce small fractions of the unit to obtain a successful outcome. The overall result of the method is the confusion observable, for example, in the case of the Parthenon. Different scholars

have suggested a whole array of foot-standards, modules or cubits: 29.366-29.436 cm, 30.5-30.7 cm, 32.6-32.8 cm, 49.02857 cm, and 61.2857 cm.7 If the standard metrological

method were not used as widely as it is in studies of Greek architecture, it would be needless to point out that it is not a valid method for conducting research. The methodological approach adapted in this article is quite different. It was originally published as early as 1974 by D. G. Kendall in his paper 'Hunting quanta'.8 Basically, it is a statistical study of whether the suggested measurement unit of 1.66 m could have
been in general use in the diameter design of British megalithic circles;9 I should emphasize that Kendall's contribution is especially a statistical one and that the considerations on the archaeological significance of the original question are quite separate from the proposed method. He draws on earlier studies by R. von Mises and S. R. Broadbent on detecting a quantum of an unknown size in a set of measurements, but he suggests a modified approach in which the cosine quantogram method is used to analyse the data and the validity of the results is assessed by Monte Carlo computer simulations.10 What this means in the context of Lefkandi is discussed in the following.
6 The Bayesian approach assumes that a quantum exists, an assumption I do not think we can make in this case; cf. P. R. Freeman, 'A Bayesian analysis of the Megalithic yard', Journal of the Royal Statistical Society, A 139 (1976), 20-35; Freeman's posterior distributions are actually very closely connected to Kendall's method used in this paper and presented in his article 'Hunting quanta', as has been demonstrated by B. W. Silverman in 'Discussion of Dr Freeman's paper', Journal of the Royal Statistical Society, A 139 (1976), 44-5; see also Statistics in Archaeology, 233-4. For an overview of, and further references on, posterior distributions, see ibid., 176-8. 7 W. Sonntagbauer, 'Zum Grundriß des Parthenon', ÖJh 67 (1998), 136. 8 Unfortunately, I was unaware of its existence at the time of writing my contribution to 'Methodological reflections'. 9 Kendall did detect a 'real quantum' in the data, but he demonstrated that it could equally well be a result of laying out the relevant dimensions by pacing ('Hunting quanta', 258); a synopsis of the discussion is presented in C. Renfrew and P. Bahn, Archaeology: Theories, Methods, and Practice (3rd edn., London, 2000), 401, and Baxter sums up the argument in Statistics in Archaeology, 235: 'Although the megalithic yard may be dead, the methodology that some regard as having buried it lives on.' Several more relevant examples of quantal problems in archaeology are discussed in 'Archaeostatistics', 282-6. 10 R. von Mises, 'Über die "Ganzzahligkeit" der Atomgewichte und verwandte Fragen', Physikalische


The 'quantum hypothesis' in the case of the Toumba building is that a building dimension X can be expressed in terms of an integral multiple M times an underlying fixed quantity q plus a small error component ε. Mathematically this can be formulated as

X = Mq + ε.   (1)

The error may be due to the original laying out of the dimension or to modern uncertainty about the size of the building element, and it should be significantly smaller than any suggested quantum.11 To test whether a certain dimension X can usefully be given in terms of quantum q we simply divide X by q and analyse the remainder (or error) ε, which will be between 0 and q: the closer to 0 or q it is, the better candidate q is for the quantum; in the worst case ε is halfway between 0 and q. Kendall suggests that taking the cosine of ε divided by q is the most useful function for assessing how well X can be expressed in terms of q: for the well-fitting dimensions the cosine is 1 (or close to it) and for the worst half-way cases -1. To evaluate how well q qualifies as the overall candidate we have to calculate the cosine for the complete set of measurements and the full range of possible quanta. The performance of each q can be assessed from a plot of the sum of cosine values (Kendall appropriately named the plot a cosine quantogram): the higher the peak, the higher the plausibility of q being the right candidate for the quantum (see FIGS. 1-3, which will be discussed in detail in the next sections). To sum up, the mathematical formula for calculating the amount of clustering around q is

φ(q) = √(2/N) Σ_{i=1}^{N} cos(2πε_i / q),   (2)

where N is the number of observations: the first term √(2/N) is needed to scale the sum to the number of measurements; without the scaling factor, presenting more measurements would always result in a higher value for the 'quantum score' φ(q).12 How is it possible to evaluate whether the function peaks high enough to suggest a quantum which is not due merely to coincidence? Here the second part of Kendall's method steps in, namely Monte Carlo simulations. It consists of creating random replica data sets, with the same overall statistical properties as the original data, but from a non-quantal distribution. The new data sets are then subjected to the same test as the original data, and if peaks as high or higher regularly occur in the analysis of random data, we should reject the quantum hypothesis: not even the most prominent peak of the original data is likely to indicate a 'real quantum'.13
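The quantum score of formula (2) is straightforward to compute. The following minimal Python sketch (not the author's Survo MM implementation mentioned in n. 13) scans a range of candidate quanta; the function name and the example values at the bottom are illustrative placeholders, not the published Table 1 data.

```python
import numpy as np

def cosine_quantogram(x_mm, quanta_mm):
    """Kendall's quantum score phi(q) = sqrt(2/N) * sum_i cos(2*pi*x_i/q).

    x_mm      : measurements (here in mm)
    quanta_mm : candidate quanta q to test
    Because x_i = M_i*q + eps_i with integral M_i, cos(2*pi*x_i/q) equals
    cos(2*pi*eps_i/q), so the remainders need not be computed separately.
    """
    x = np.asarray(x_mm, dtype=float)
    n = len(x)
    return np.array([np.sqrt(2.0 / n) * np.cos(2.0 * np.pi * x / q).sum()
                     for q in quanta_mm])

if __name__ == "__main__":
    data = np.array([600.0, 1470.0, 1500.0, 2980.0, 8770.0])  # placeholder values, mm
    qs = np.arange(200.0, 600.0, 0.5)                          # candidate quanta, mm
    phi = cosine_quantogram(data, qs)
    print("highest peak at q = %.1f mm, phi = %.2f" % (qs[phi.argmax()], phi.max()))
```

Scanning q in small steps (here 0.5 mm) is a practical choice; the published analyses report peak positions to a tenth of a millimetre.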

Zeitschrift, 19 (1918), 490-500; S. R. Broadbent, 'Quantum hypotheses', Biometrika, 42 (1955), 45-57; id., 'Examination of quantum hypothesis based on a single set of data', Biometrika, 43 (1956), 32-44; 'Hunting quanta', 234. A review of Kendall's method with a larger number of executed simulations is presented in Statistics in Archaeology, 228-35. On the use of Monte Carlo methods in archaeology, see J. Pakkanen, The Temple of Athena Alea at Tegea: A Reconstruction of the Peristyle Column (Publications of the Department of Art History at the University of Helsinki, 18; Helsinki, 1998), 54-5, esp. n. 18; Statistics in Archaeology, 147-58. 11 'Hunting quanta', 233-4. 12 Ibid., 235-9; 'Archaeostatistics', 282; Statistics in Archaeology, 231. 13 'Hunting quanta', 241-9; 'Archaeostatistics', 282-3; Statistics in Archaeology, 232-3. I have implemented the computer programs used in the cosine quantogram analyses, Monte Carlo simulations, and kernel density estimations on the basis of the statistical program Survo MM.

TABLE 1. Toumba building dimensions (m). Col. 1 gives the reported dimensions used in the first analyses (N = 27); col. 2 the same data with multiple occurrences of the same dimension replaced by a single value (N = 21).

wall thickness
doorway: central room, south side
0.6
1.47  1.5
0.6
1.47

doorway: central room to west corridor
doorway: from east to central room
porch, blocking wall thickness
porch, interior length
2.3  0.9
1.5
1.5  2.3  0.9
1.5

east room, transv. wall thickness
east room, interior length
central room, transv. wall thickness
central room, interior length
0.6  8.3  0.6
22.0
0.6  8.3
22.0

west room, transv. wall thickness
west room, interior length
west room, transv. wall thickness
overall length without apse and veranda
east room, width (interior)
central room, width (east side)
0.6  3.3  0.6  48.0  8.8
9.02
3.3  0.6  48.0  8.8
9.02

central room, width (west side)
west rooms, north room interior N-S length
west rooms, west corridor width with walls
west rooms, south room interior N-S length
interaxial, porch
interaxial 1
interaxial 2
8.77  3.0  3.0  2.6
1.5  3.0  3.0
8.77  3.0  3.0  2.6
1.5  2.98

interaxial 3
interaxial 4
interaxial 5
apse room, east interaxial
2.9  3.0  3.0
3.0  2.4  3.0  2.4

apse room, middle interaxial
apse room, west bay
3.3
3.3

N
27
21

COSINE QUANTOGRAM OF LEFKANDI DATA

The set of measurements from Lefkandi is presented in col. 1 of TABLE 1. I have intentionally retained the same data as in the previous studies on the building,14 even though it does create some problems: the question of multiple occurrences of the same dimension is addressed later in this section. The first comment a statistician might make is that the number of measurements is very small, consisting of only 27 observations. However, no more can be obtained from the Toumba building, as is often the case in archaeological contexts.15 Kendall demonstrated that the smaller the sample, the less likely it is that we

14 'Layout', 380-3; 'Methodological reflections', 244-5. The question of measurement accuracy and significant digits is discussed in 'Methodological reflections', 243. 15 The issue of the small number of observations is also addressed in J. Pakkanen, 'Deriving ancient foot-units from building dimensions: a statistical approach employing cosine quantogram analysis', in G. Burenhult and J. Arvidsson (eds), Archaeological Informatics: Pushing the Envelope. CAA 2001 (BAR S1016; Oxford, 2002), 501-6.


shall be able to detect a 'true quantum' from the background noise.16 Even though we are in a situation where it is possible to determine by means of computer simulations whether the peaks are significantly higher than the noise level, the size of the sample naturally has an effect on the reliability of the presented conclusions.

The starting point of the quantum analysis is to decide what is the appropriate unit range to be used in the hunt for q. The best candidate for the upper limit of the range is the smallest measured dimension, in this case the wall thickness (0.6 m). The lower limit is slightly more problematic. Most of the dimensions in TABLE 1 can be measured only to the nearest 100 mm, so we have a minimum estimate of ±50 mm for the error component in formula (1); as we noted in connection with this formula, ε should be significantly less than the smallest considered q. Therefore, the beginning of the analysed range could reasonably be set at 200 mm, but since quite interesting phenomena can be observed at the lower end of the first cosine quantogram plotted on the basis of the raw data (FIG. 1), I shall start the first two analyses from 60 mm.

The quantum score φ(q) for the Toumba building is calculated from the measurements presented in col. 1 of TABLE 1, and the results of the initial analysis can be drawn as a curve where the score φ(q) is plotted against q (FIG. 1). There are several peaks which have a score of nearly 4 or even higher: from the left, these are at a = 60.1, b = 74.9, c = 100.0, d = 150.5, e = 301.7, and f = 489.7 mm. It is quite significant that the fifth peak e can approximately be expressed in terms of all four peaks to the left of it: e = 5a = 4b = 3c = 2d. The overall effect is that the four peaks reinforce the height of peak e.17 This is especially problematic in the case of peak c, which obviously arises from rounding most of the measurements to the nearest 100 mm. Interestingly, none of the above considerations apply to peak f at 489.7 mm. Experts in Greek metrology will no doubt readily recognize that the last peak curiously corresponds to a cubit of the so-called 'Doric-Pheidonian' foot of c. 326-7 mm.18

The factitious peaks a, b, and c and their effect on other peaks to their right can easily be eliminated by unrounding the data:19 a random fraction of ±50 mm is added to all measurements quoted to the nearest 100 mm and a fraction of ±5 mm to the few 'precise' dimensions measured to the nearest 10 mm in col. 1 of TABLE 1.20 The effect of unrounding is clear from FIG. 2. It is quite disastrous for peaks a to c: their height collapses and they cannot be separated from the background noise. Peak d fares only slightly better, and e is significantly lower, suggesting that, as expected, rounding the dimensions in the raw data did make a contribution to its height; its summit is also shifted to the left and is now at 297.1 mm, a displacement of almost 5 mm. Quite interestingly, this new value is fairly close to the so-called 'Ionic-Attic' foot of c. 294 mm.21 Peak f remains almost as prominent as in the first analysis.

16 'Hunting quanta', 249, 254-7; the altitude of the peak varies as the square root of N. 17 Ibid., 252-3. 18 One foot is two thirds of the length of a cubit, so 489.7 mm × 2/3 ≈ 326.5 mm. The classical general account on the length of the 'Doric' foot, though without any detailed analyses, is W. B. Dinsmoor, 'The basis of Greek temple design: Asia Minor, Greece, Italy', Atti del settimo congresso internazionale di archeologia classica (Rome, 1961), i. 358-60. The unit is also called 'Pheidonian' after the legendary king of Argos, who, according to Hdt. vi. 127, introduced a new system of weights and measures in his kingdom; see W. Dörpfeld, 'Metrologische Beiträge', AM 15 (1890), 177 for connecting the name of Pheidon with the foot-unit. 19 'Hunting quanta', 241. 20 Measurement X_i is replaced by X_i + mU, where m is 5 for accurate measurements and 50 for all the others, U a random number between -1 and 1 selected from a uniform distribution, and i = 1, 2, 3, ..., N. 21 For this length of the unit, see e.g. Dinsmoor (n. 18), 357-8.
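The unrounding rule of n. 20 can be sketched as follows; the function name and the boolean 'precise' flag are illustrative choices, the latter simply marking the dimensions quoted to the nearest 10 mm.

```python
import numpy as np

def unround(x_mm, precise_mask, rng=None):
    """Unround measurements as in n. 20: X_i -> X_i + m*U, with m = 50 mm for
    values quoted to the nearest 100 mm and m = 5 mm for the 'precise' values
    quoted to the nearest 10 mm; U is uniform on (-1, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x_mm, dtype=float)
    m = np.where(np.asarray(precise_mask), 5.0, 50.0)
    return x + m * rng.uniform(-1.0, 1.0, size=x.shape)

# Hypothetical example: 1470 mm is treated as 'precise', 1500 mm is not.
print(unround([1470.0, 1500.0], [True, False]))
```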

262

JARI

PAKKANEN

FIG. 1. Cosine quantogram for Lefkandi raw data (N = 27).

However, peak e still towers above the others: should we therefore accept it as the quantum? That would be far too hasty. The significant shift in its position produced by the unrounding suggests that the inaccuracy of the original data affects peak e but not peak f. Also, if we take a new look at col. 1 of TABLE 1, it is evident that there are multiple occurrences of the same dimension: the wall thickness of 0.6 m appears three times and the interaxial of c. 3.0 m five times. In col. 2 of TABLE 1 these are cleared and replaced by a single value (the number of observations N is now 21),22 and the cosine quantogram of the new data, naturally first unrounded, is plotted in FIG. 3: peak e sinks clearly below f, which remains as conspicuous as ever with a quantum score of 3.65 and almost exactly at

" On measurement selection and metrology, see


'Archaeostatistics', 285-6.


FIG. 2. Cosine quantogram for unrounded Lefkandi raw data (N = 27).

the same place, 489.5 mm, as in the beginning. The summit of e is again shifted to the left, to 296.3 mm, and it has a score of only 3.28. In the next two sections I shall try to find an answer to the following questions: are the heights of peaks e and f sufficient for them to be considered real quanta, and, if they are, how accurately is it possible to define the length of the better estimate?
MONTE CARLO SIMULATIONS

For the computer simulations we need random non-quantal data sets 'similar in all statistical respects to the actual data';23 the prerequisite for the production of such data sets is a
23 'Hunting quanta', 245.


mathematical model of the data. Kendall used the right half of a normal distribution as his data model.24 Since the effect of simulation distributions on the Monte Carlo analysis of cosine quantograms has been questioned by P. R. Freeman,25 I shall test several different models, some intentionally very different from the original distribution and some with varying parameters, to find out to what degree the results are dependent on data modelling. In the analyses I shall use runs of at least 1000 simulations for each model, which is usually enough for a test at the 5% level of significance.26

The first Monte Carlo simulation is not actually based on creating a replica data model but on a statistical method which has gained increasing popularity in the past twenty years: Efron published his first paper on bootstrap methods in 1979, and it is now among the most important computer-intensive methods.27 The rationale behind the bootstrap is that the existing sample, such as the building measurements in this study, is the best guide to a larger distribution. Technically, this means taking several random resamples of the sample with replacement: once an observation is selected, it is placed back in the pool and can be selected again. The name bootstrap was given because it 'is supposed to be analogous to someone pulling themselves out of mud with their bootstraps'.28 The resampling of the data was repeated 1000 times and a cosine quantogram was calculated for each simulation over the full range 200 ≤ q ≤ 600.

TABLE 2. Results of the Monte Carlo simulations (n = 1000 for each run).

                                          1: S > 3.65   2: S > 3.28   3: S, α = 5%
 1  Bootstrap                                81.4%          94.4%          5.06
 2  Uniform distribution I                    6.5%          26.8%          3.73
 3  Uniform distribution II                   1.4%           7.1%          3.36
 4  'Histogram' distribution, 1st run         2.1%           9.8%          3.46
 5  'Histogram' distribution, 2nd run         2.7%           9.6%          3.45
 6  Normal distribution, 1st run              1.9%           8.7%          3.42
 7  Normal distribution, 2nd run              2.7%           7.6%          3.40
 8  Normal distribution, 3rd run              2.1%           7.9%          3.41
 9  χ² distribution, a = 2600                 2.3%           8.5%          3.45
10  χ² distribution, a = 2800                 1.8%           9.4%          3.43
11  χ² distribution, a = 3000                 2.8%           9.0%          3.45
12  F distribution, 1st run                   2.7%           8.3%          3.49
13  F distribution, 2nd run                   2.9%           8.5%          3.46
14  KDE, h = 500, 1st run                     2.3%          10.5%          3.47
15  KDE, h = 500, 2nd run                     2.8%          10.8%          3.52
16  KDE, h = 1000, 1st run                    2.0%          10.7%          3.45
17  KDE, h = 1000, 2nd run                    2.4%          11.3%          3.53
18  KDE, h = 1500, 1st run                    2.9%          13.8%          3.54
19  KDE, h = 1500, 2nd run                    3.1%          12.8%          3.53
20  KDE, h = 1800, 1st run                    3.9%          12.8%          3.57
21  KDE, h = 1800, 2nd run                    3.2%          13.6%          3.57
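A bootstrap run of the kind reported on line 1 of TABLE 2 can be sketched along the following lines, reusing the quantogram score defined earlier; the data array is a placeholder, not the published measurements, and the 200-600 mm scan range follows the text.

```python
import numpy as np

def max_quantum_score(x_mm, q_lo=200.0, q_hi=600.0, step=0.5):
    """S = sup{phi(q): q_lo <= q <= q_hi} for one data set (values in mm)."""
    qs = np.arange(q_lo, q_hi + step, step)
    x = np.asarray(x_mm, dtype=float)
    phi = np.sqrt(2.0 / len(x)) * np.cos(2.0 * np.pi * x[:, None] / qs).sum(axis=0)
    return phi.max()

def bootstrap_test(x_mm, n_sim=1000, rng=None):
    """Resample the data with replacement n_sim times and return the simulated
    maxima S; their 95th percentile is compared with the observed S."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x_mm, dtype=float)
    return np.array([max_quantum_score(rng.choice(x, size=len(x), replace=True))
                     for _ in range(n_sim)])

# Hypothetical data in mm (placeholders only):
data = [600.0, 900.0, 1470.0, 2980.0, 8770.0, 22000.0]
print(max_quantum_score(data), np.percentile(bootstrap_test(data, n_sim=200), 95))
```

As the paper goes on to show, resampling the observed values tends to reproduce their quantal clustering, so very high simulated maxima are to be expected from this model.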

24 Ibid., 245-6. 25 Freeman (n. 6), 23. 26 On the number of Monte Carlo simulations, see B. F. J. Manly, Randomization, Bootstrap and Monte Carlo Methods in Biology (2nd edn., London and Weinheim, 1997), 80-4. 27 B. Efron, 'Bootstrap methods: another look at the jackknife', Annals of Statistics, 7 (1979), 1-26; on bootstrap methods in general, see e.g. B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap (Boca Raton and London, 1993) and A. C. Davison and D. V. Hinkley, Bootstrap Methods and Their Application (Cambridge, 1997); a summary of bootstrap in an archaeological context is presented in Statistics in Archaeology, 148-53. 28 Manly (n. 26), 34.


The results are summarized on line 1 of TABLE 2: cols. 1-2 give as a percentage the number of times peaks f and e were exceeded by the maximum peak value S in the 1000 simulations,29 and col. 3 the simulated maximum value S for the commonly used statistical significance level of α = 5%. In other words, the last column indicates the value which exceeds 95% of the simulated maxima; if a peak is higher than this score S, we can reasonably claim that the quantum hypothesis is supported by the data. The results are quite disastrous: 81% of the simulations produced a peak higher than f and 94% higher than e, and the 5% significance level score is 5.1, well beyond the reach of peaks e and f (cf. FIG. 3). The reason why the bootstrap does not work in this case is quite

FIG. 3. Cosine quantogram for unrounded Lefkandi data without multiple observations (N = 21); the solid line marks the α = 5% significance level.

29 S = sup{φ(q): 200 ≤ q ≤ 600}.


self-evident: the possibility of an observation being replicated in the resampled data set emphasizes the quantum peak, quite the opposite of what is needed in the Monte Carlo simulations.

The second set of simulations and the first data models are based on the uniform distribution: all the simulated observations X are dispersed over the full range of observed dimensions so that they have an equal probability of getting any value between 0.6 and 48.0 m (uniform distribution I in FIG. 4). The uniform distribution can be considered as the minimum data model: the only respect in which it is similar to the original data is the range of created values. If the results obtained from the other models are not significantly different from the uniform model, they are completely independent of the selection of distribution. Initially, the danger of using solely the uniform model would seem to be that it may give too low values for the peaks and, therefore, the results of the quantogram analysis could be accepted as significant even if they were not. The results of the first 1000 simulations are given on line 2 of TABLE 2: neither peak e nor f is statistically significant. However, too high a proportion of long dimensions in the uniform distribution causes the same effect as using too small a quantum in data analysis: for example, testing q = 1 cm would necessarily result in discovering a quantum with the maximum theoretical height30 because that is the maximum measurement accuracy of the data. The uniform distribution can be modified by simply excluding the long dimensions. Reducing the range to 0.60-9.02 m (uniform distribution II in FIG. 4) results in notably lower peaks (line 3 of TABLE 2): peak f is topped in less than 2% of the simulation runs, but peak e is still clearly below the 5% significance level score of 3.36. The right cut-off point for the 95% reliability is most probably somewhere between the two extreme scores of 3.36 and 3.73 discovered in the uniform distribution simulations, and in order to discover the location more exactly, more simulations with different distributions are required.
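A minimal sketch of such a simulation test, here with a non-quantal uniform data model on 600-9020 mm (roughly uniform distribution II), might look as follows; the scan range and N = 21 follow the text, while the function and parameter names are illustrative.

```python
import numpy as np

def simulate_significance(n_obs, n_sim=1000, lo=600.0, hi=9020.0,
                          q_lo=200.0, q_hi=600.0, step=0.5, rng=None):
    """Monte Carlo estimate of the 5% significance level for the maximum
    quantum score S, using a uniform (non-quantal) data model on [lo, hi] mm."""
    rng = np.random.default_rng() if rng is None else rng
    qs = np.arange(q_lo, q_hi + step, step)
    maxima = np.empty(n_sim)
    for k in range(n_sim):
        x = rng.uniform(lo, hi, size=n_obs)            # one replica data set
        phi = np.sqrt(2.0 / n_obs) * np.cos(2.0 * np.pi * x[:, None] / qs).sum(axis=0)
        maxima[k] = phi.max()                          # S for this replica
    return np.percentile(maxima, 95)

print("simulated 5%% level: %.2f" % simulate_significance(n_obs=21, n_sim=1000))
```

An observed peak would then be accepted at the 5% level only if its score exceeds the returned percentile.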
FIG. 4. Two uniform distributions (I and II) used as simulation data models.

30 In formula (2) the cosine value would be 1 for all measurements, and the maximum peak height could be calculated as √(2/N) × N = √(2/21) × 21 ≈ 6.48.

FIG. 5. Histogram distribution used as a simulation data model; based on Lefkandi data without multiple observations (N = 21).

The second data model is based on the histogram of the original data (FIG. 5). The simplest way of turning a histogram into a model for the replica data sets is to use its class widths and numbers of observations as the basis of a probability distribution; it is not a very advanced model and its stepped form may introduce a slight unintended quantal effect into the data. However, when the number of simulated observations N is small (here N = 21), the results should not be too different from the other data models presented in the next paragraphs, and with a good choice of class-width the quantal effect can be reduced. Here the width is selected as 1.2 m, twice the largest tested q. The probability of the simulated observation falling into the first class [500, 1700) is 6/21, the second class [1700, 2900) 3/21, etc., as in FIG. 5. The histogram distribution was used to produce two separate simulation runs of 1000 data sets of 21 observations, and the cosine quantogram was computed for each of them. The second run is used to check the degree of result fluctuation: if the runs give approximately the same results, it is possible to be quite confident that 1000 runs are enough to reach reliable conclusions. A summary of the results of the two 1000-simulation runs is on lines 4-5 of TABLE 2: as we see, peak f was submerged in less than 3% of the simulations, peak e in nearly 10%, and the 5% significance level is 3.45 in the first simulation run and virtually the same, 3.46, in the second: therefore, based on the histogram model we can accept peak f at the 5% significance level, but reject peak e at the same level.

The next tested distributions are three continuous statistical distributions:31 normal,32 chi-squared,33 and F distributions (FIG. 6).34 Comparison with the histogram distribution
31 All the used continuous probability distributions satisfy the condition ∫_a^b f(x) dx = 1, where [a, b] = [600, 52000]; cf. FIGS. 6-7. 32 The simulated measurements were created using the formula X = 6583|Z| + 600; 6583 mm is the mean of the 21 measurements in col. 2 of TABLE 1, and the distribution is moved to the right by 600 mm; cf. 'Hunting quanta', 245-6. 33 The formula used was X = aχ²₂ + 600, where the chi-squared distribution had 2 degrees of freedom and the multiplier a three different values of 2600, 2800 and 3000 in the simulation runs (see lines 9-11 of TABLE 2). The χ² distribution in FIG. 6 was produced with a = 2800. 34 The measurements were created from the formula X = 2600F + 580, where the degrees of freedom were v₁ = 4 and v₂ = 4.
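The continuous data models of nn. 32-34 can be sampled along the following lines; the constants are taken from those footnotes, while the function name and structure are only an illustrative sketch. The resulting replica sets would then be fed to the quantogram and maximum-score routines sketched above; a histogram-based model could be built in the same way by drawing class membership with the probabilities 6/21, 3/21, etc. and a uniform value within the chosen class.

```python
import numpy as np

def replica_data(model, n_obs=21, rng=None, a=2800.0):
    """Draw one non-quantal replica data set (in mm) from one of the continuous
    models of nn. 32-34: normal |Z|, chi-squared (2 d.f.), or F(4, 4)."""
    rng = np.random.default_rng() if rng is None else rng
    if model == "normal":                  # right half of a normal distribution
        return 6583.0 * np.abs(rng.standard_normal(n_obs)) + 600.0
    if model == "chi2":                    # X = a * chi2(2 d.f.) + 600
        return a * rng.chisquare(2, size=n_obs) + 600.0
    if model == "f":                       # X = 2600 * F(4, 4) + 580
        return 2600.0 * rng.f(4, 4, size=n_obs) + 580.0
    raise ValueError("unknown model: %s" % model)

for name in ("normal", "chi2", "f"):
    print(name, round(replica_data(name).mean()), "mm mean of one replica")
```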

FIG. 6. Continuous statistical distributions used as simulation data models (normal, χ², and F).

shows that none of them fits precisely with the measurements, simply because the original data is not continuous: there are clear gaps between the different clusters of dimensions (cf. FIGS. 5 and 6). However, the simulation results presented on lines 6-13 of TABLE 2 are quite well in line with the previous analyses: there is minor fluctuation in the values in cols. 1 and 2, but the 5% significance level of 3.4-3.5 (col. 3) is the same as the one determined by the histogram model, so peak f can be accepted as statistically significant while peak e cannot.

The final distributions are based on a more advanced method of density estimation than the histogram: in kernel density estimation (KDE) a 'bump' is placed at each observation and these are summed to form a smooth curve instead of the rectangular shapes of the histogram (FIG. 7).35 The shape of an individual bump is clearly visible at the centre and the right of the figure. Since the choice of the origin and class-width of the histogram may have an effect on data interpretation, the use of kernel density estimation has recently been advocated as preferable also in archaeological contexts.36 The concept of using KDE as a distribution model emphasizes the fact that the observations we have are the best guide to the 'underlying' distribution.37 However, if the model produces too exact replicas of the original data set it will also produce similar quantal effects. The key is to smooth the KDE distribution appropriately, which is done by controlling the band- or window-width h corresponding to the histogram class-width (a minimal sampling sketch follows FIG. 7). There are several different methods for choosing the 'optimal' width,38 and different h values are tested in this study in order to analyse the effects of band-widths on the results: with small band-widths the distribution
35 On univariate kernel density estimation in general, see B. W. Silverman, Density Estimation for Statistics and Data Analysis (London and New York, 1986), 7-74. 36 M. J. Baxter and C. C. Beardah, 'Beyond the histogram: improved approaches to simple data display in archaeology using kernel density estimates', Archeologia e calcolatori, 7 (1996), 397-408; for a recent bibliography on the use of kernel density estimates in archaeology, see C. C. Beardah and M. J. Baxter, 'Three-dimensional data display using kernel density estimates', in J. A. Barceló et al. (eds), New Techniques for Old Times: Computer Applications and Quantitative Methods in Archaeology. Proceedings of the 26th Conference, Barcelona, March 1998 (BAR S757; Oxford, 1999), 163-9; see also Statistics in Archaeology, 29-37. 37 The parallel with bootstrapping techniques is evident, but the problems with bootstrap demonstrated in the beginning of this section can be avoided; cf. Manly (n. 26), 34. 38 For evaluating the band-widths I have employed the MATLAB routines programmed by C. C. Beardah; see Baxter and Beardah (n. 36), 405-8.

FIG. 7. Three kernel density estimation distributions used as simulation data models (band-widths h = 500, 1000, and 1500).
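Sampling from a Gaussian KDE with band-width h amounts to picking one of the observed values at random and adding Gaussian noise with standard deviation h, as in the sketch below. This is only an illustration, not the author's Survo MM implementation; the redraw below 600 mm is one possible way of avoiding implausibly small simulated lengths (the exact treatment used in the paper is not specified), and the data values shown are placeholders.

```python
import numpy as np

def kde_replica(data_mm, h, n_obs=21, lower=600.0, rng=None):
    """One replica data set from a Gaussian KDE with band-width h (mm):
    pick an observation at random and add N(0, h^2) noise; values below
    `lower` are redrawn so the model cannot return negative lengths."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data_mm, dtype=float)
    out = np.empty(n_obs)
    for i in range(n_obs):
        x = -1.0
        while x < lower:
            x = rng.choice(data) + h * rng.standard_normal()
        out[i] = x
    return out

# Hypothetical data in mm (placeholders, not the published Table 1 values):
print(np.round(kde_replica([600, 1470, 2980, 8770, 22000, 48000], h=1000.0)))
```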

displays more details of the underlying data structure and with large values the curve becomes very smooth (FIG. 7). The band-widths h of the simulations vary from 500 to 1800 (lines 14-21 in TABLE 2),39 while the 'objective' values for h have a range of 945-2049.40 The first KDE distribution is based on a quite small window-width; it is introduced to test if too small h values have an effect on the quantogram analysis. The results of the different h values are consistent with the previous analyses, with peak f statistically significant at the 5% level and peak e not (lines 4-21 in TABLE 2). The trend for the percentage of quantum scores topping peak f is slightly rising, from less than 3% (h = 500 and 1000) to more than 3% (h = 1800), and the same can be observed for peak e: the smallest percentages are less than 11% and the largest over 13% (cols. 1-2 in TABLE 2). The smoother the data model is, the less likely it is that the highest observed peaks would be considered significant, but the 5% significance level remains constant at 3.5-3.6. The most probable explanation for the trend observable in cols. 1 and 2 is that in the smoother models the number of large measurements is proportionally slightly higher than in the more 'rugged' models (FIG. 7): this is mostly due to the clear cluster of dimensions below 5 m slowly eroding to the right. The phenomenon is unavoidable, since it would hardly be sensible if the data models produced dimensions with negative simulated values, but at least in this case the lower end of the window-width range (h = 500 and 1000) is preferable to the larger values.

As we have seen, the results of the Monte Carlo simulations are not critically dependent on the data model used in the case of the Lefkandi building (TABLE 2). The solid line at the height of 3.48 in FIG. 3 indicates the 5% significance level (α = 5%) calculated from the 13,000 simulations based on the histogram, chi-squared, F, and the first three KDE distributions. The determined significance level is unique for this data set, and a new
39 In order to keep FIG. 7 intelligible, I have not illustrated the KDE with h = 1800 in the plot: it continues the trend and is still smoother than the KDE with h = 1500. 40 The band-width h calculated using the Solve-The-Equation method (STE) is 945.3; the one-, two- and three-stage Direct-Plug-In (DPI) methods give 2049.1, 1488.2 and 1132.5 respectively, and Smooth Cross-Validation (SCV) 1383.1. For the methods, see Baxter and Beardah (n. 36), 397-408.


value has to be determined by simulation for each set of data one wishes to analyse. The conclusion of the different simulations of the Toumba building data is that a possible design-unit of 489.5 mm (peak f) can be supported by statistical analysis while the shorter unit of 296.3 mm (peak e) cannot. In general, the results of this study endorse the view that as long as the simulated non-quantal measurements produced from the data model even coarsely imitate the original measurement distribution, the role of the data model is not a critical factor in the simulations. However, it is advisable to continue employing several different data models in metrological analyses so that the general question of the dependence of the simulation results on the models can be given a more thorough answer. This is especially important in cases where the maximum peak height is close to the determined statistical significance level; otherwise there remains the danger of accepting a false conclusion or rejecting a probable one.
ACCURACY

The next question is how accurately the quantum 489.5 mm can be determined. Kendall suggested the following procedure for calculating 'the best available estimate of the uncertainty in our knowledge of the value of quantum':41 a random sample of measurements X is first created from a data model, for each X the nearest integer L to X/q is calculated, and X is replaced by a new value

X' = Lq + σε,   (3)

where σ is the standard deviation and ε a standardized Gaussian random variable: Lq is therefore the 'target length' which is disturbed by the error σε. Kendall demonstrated that an expectation value for the standard deviation when the number of measurements N is large (√(2N) > S) can be calculated as

σ = [q² × ln(S / √(2N)) / (-2π²)]^(1/2),   (4)

where S is the maximum peak height;42 with q = 489.5 mm and S = 3.65, σ is 83.4 mm. Since N is small in the case of the Toumba building, the sample standard deviation for the error ε of formula (1) should also be calculated: here s is 55.2 mm, which is appreciably smaller than σ. However, I would rather opt for the side of caution and use the larger value σ in the simulations. The model used in the second stage of the analysis was the KDE distribution with h = 1000. The cosine quantograms calculated for the 100 new sets of simulated X'-values have their maximum peaks lying in the range 483.4-495.8 mm, with a mean value of 489.3 mm. As we see, the mean is not exactly the same as q, owing to the perturbing factor σε in formula (3). The standard deviation is 1.98 mm, and the 95% confidence interval for the mean is 485.4-493.1 mm. Therefore, we may conclude that, based on the preserved building measurements, the quantum length cannot be determined any more accurately than as c. 485-93 mm.
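The accuracy procedure of formulas (3)-(4) can be sketched as follows: each replica measurement is snapped to the nearest multiple of q, perturbed by σε, and the spread of the resulting quantogram peak positions is recorded. The data model (simple resampling of placeholder values), the narrow 450-530 mm search window, and all names below are illustrative assumptions; q = 489.5 mm and σ = 83.4 mm follow the text.

```python
import numpy as np

def peak_position(x_mm, q_lo=450.0, q_hi=530.0, step=0.1):
    """Location of the highest quantogram peak in a narrow window around q."""
    qs = np.arange(q_lo, q_hi, step)
    x = np.asarray(x_mm, dtype=float)
    phi = np.sqrt(2.0 / len(x)) * np.cos(2.0 * np.pi * x[:, None] / qs).sum(axis=0)
    return qs[phi.argmax()]

def quantum_uncertainty(model_draw, q=489.5, sigma=83.4, n_rep=100, rng=None):
    """Kendall-style accuracy check: replace each simulated X by X' = L*q + sigma*eps
    (formula (3)) and record where the quantogram of each replica peaks."""
    rng = np.random.default_rng() if rng is None else rng
    peaks = np.empty(n_rep)
    for k in range(n_rep):
        x = model_draw(rng)                            # one replica data set (mm)
        L = np.rint(x / q)                             # nearest integer multiples
        x_prime = L * q + sigma * rng.standard_normal(len(x))
        peaks[k] = peak_position(x_prime)
    return peaks.mean(), peaks.std(ddof=1)

# Hypothetical data model: resample a placeholder data set with replacement.
base = np.array([900.0, 1470.0, 2980.0, 8770.0, 22000.0, 48000.0])
draw = lambda rng: rng.choice(base, size=len(base), replace=True)
print(quantum_uncertainty(draw))
```

The mean and spread of the recorded peak positions then play the role of the 489.3 mm mean and c. 485-93 mm interval reported above.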

41 'Hunting quanta', 260.

42 Ibid., 253.


CONCLUSIONS

Cosine quantogram analyses and computer simulations based on the reported measurements of the Toumba building at Lefkandi would seem to indicate that a unit of c. 49 cm could have been employed in its design. A possible 'physical' explanation for this unit could be that it is the cubit, the distance between the elbow and the tip of the fingers, of a fairly tall Early Iron Age master builder, but that remains pure speculation. Moreover, it should at this point be stressed again that the metrological analysis is based on a very limited number of precise dimensions and, even though the results are statistically significant, they should only be accepted with caution. More dimensions would be needed to settle with a higher degree of certainty the question whether a unit of c. 0.49 m could indeed have been used by Early Iron Age builders. Since no more observations can be obtained from the Toumba building, it would be necessary to discover other accurately laid-out Protogeometric buildings which had been designed using multiples of the same unit as at Lefkandi. Taking into consideration how unique the building is in almost every respect, it is very unlikely that such buildings will be identified. Even if no certain conclusion regarding the Lefkandi design-unit can be reached, the more significant contribution of this paper probably lies in the field of testing and suggesting refinements to the cosine quantogram method for detecting possible design-units in Greek building measurements. It is not possible to overemphasize that statistical methods can and should be used in metrological studies of Greek architecture. If appropriate methods were used more widely in architectural studies, one possible result could be a serious drop in the number of hypothetical ancient feet, cubits, design-units, and modules being put forward.

Royal Holloway, University of London
JARI PAKKANEN
