Chapter 6: Order Statistics
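Suppose that X1, X2, ..., Xn are independent, and define
\[
\begin{aligned}
X_{(1)} &= \min\{X_1, X_2, \ldots, X_n\},\\
&\;\;\vdots\\
X_{(n-1)} &= \min\bigl(\{X_1, X_2, \ldots, X_n\} \setminus \{X_{(1)}, X_{(2)}, \ldots, X_{(n-2)}\}\bigr),\\
X_{(n)} &= \max\{X_1, X_2, \ldots, X_n\}.
\end{aligned}
\]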
Then
\[
X_{(1)} < X_{(2)} < \cdots < X_{(n)}
\]
Statistical Inference, August 21, 2020
denotes the original random sample after arrangement in increasing order of magnitude,
and these are collectively termed the order statistics of the random sample X1 , X2 , . . . , X n .
The rth smallest, 1 ≤ r ≤ n, of the ordered X's, X(r), is called the rth-order statistic. Some
authors use an alternative notation
to denote the order statistics. Further, by taking Yi = X(i), i = 1, ..., n, one can also define
order statistics as
Y1 < Y2 < · · · < Yn .
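As an illustration of the definition, the order statistics of an observed sample can be obtained simply by sorting. The following is a minimal sketch; the function names and the four sample values are hypothetical and chosen only for illustration.

```python
def order_statistics(sample):
    """Return the sample in increasing order: [X(1), X(2), ..., X(n)]."""
    return sorted(sample)


def rth_order_statistic(sample, r):
    """Return X(r), the r-th smallest observation, for 1 <= r <= n."""
    if not 1 <= r <= len(sample):
        raise ValueError("r must satisfy 1 <= r <= n")
    return sorted(sample)[r - 1]  # shift from 1-based r to a 0-based index


sample = [2.7, 0.4, 3.1, 1.9]          # hypothetical observations x1, ..., x4
print(order_statistics(sample))        # [0.4, 1.9, 2.7, 3.1]
print(rth_order_statistic(sample, 2))  # 1.9
```

Sorting the whole sample costs more than is needed for a single X(r), but it keeps the sketch close to the definition.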
Some familiar applications of order statistics, which are obvious on reflection, are as
follows:
1. X(n), the maximum (largest) value in the sample, is of interest in the study of floods
and other extreme meteorological phenomena.
2. X(1) , the minimum (smallest) value, is useful for phenomena where, for example,
the strength of a chain depends on the weakest link.
3. The sample median, defined as X([n+1]/2) for n odd and any number between X(n/2)
and X(n/2+1) , for n even, is a measure of location and an estimate of the population
central tendency.
4. The sample midrange, defined as (X(1) +X(n) )/2, is also a measure of central tendency.
6. In some experiments, the sampling process ceases after collecting r of the observations.
For example, in life-testing electric light bulbs, one may start with a group of
n bulbs but stop taking observations after the rth bulb burns out. Then information
is available only on the first r ordered "lifetimes" X(1) < X(2) < · · · < X(r), where r ≤ n.
This type of data is often referred to as censored data.
7. Order statistics are used to study outliers or extreme observations, e.g., when so-called
dirty data are suspected.
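Several of the quantities in the list above are simple functions of the sorted sample. The sketch below computes the sample median, the midrange, and a censored sample consisting of the first r order statistics. The function names and the data are hypothetical, and for even n the midpoint of X(n/2) and X(n/2+1) is used as one conventional choice among the admissible values.

```python
def sample_median(sample):
    """X((n+1)/2) for odd n; midpoint of X(n/2) and X(n/2+1) for even n."""
    y = sorted(sample)
    n = len(y)
    if n % 2 == 1:
        return y[(n + 1) // 2 - 1]          # 1-based position (n+1)/2
    # Any value between the two middle order statistics qualifies;
    # the midpoint is an assumed convention, not the only valid choice.
    return (y[n // 2 - 1] + y[n // 2]) / 2


def midrange(sample):
    """(X(1) + X(n)) / 2."""
    return (min(sample) + max(sample)) / 2


def censored(sample, r):
    """First r order statistics X(1), ..., X(r), as seen when sampling stops early."""
    return sorted(sample)[:r]


lifetimes = [12, 7, 30, 9, 21, 15]   # hypothetical data
print(sample_median(lifetimes))      # 13.5
print(midrange(lifetimes))           # 18.5
print(censored(lifetimes, 3))        # [7, 9, 12]
```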
For example, if a random sample of five light bulbs is tested, the observed failure times
might be (in months) (x1, ..., x5) = (5, 11, 4, 100, 17). Now, the actual observations would
have taken place in the order x3 = 4, x1 = 5, x2 = 11, x5 = 17, and x4 = 100. But the
"ordered" random sample in this case is (y1, ..., y5) = (4, 5, 11, 17, 100).
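The joint distribution of the ordered variables (that is, Y1, ..., Yn) is not the same as the
joint distribution of the unordered variables (that is, X1, X2, ..., Xn). Note that, because of the
ordering, each of the following inverse transformations is possible when n = 3:
\[
\begin{aligned}
x_1 &= y_1, & x_2 &= y_2, & x_3 &= y_3,\\
x_1 &= y_2, & x_2 &= y_1, & x_3 &= y_3,\\
x_1 &= y_1, & x_2 &= y_3, & x_3 &= y_2,\\
x_1 &= y_3, & x_2 &= y_2, & x_3 &= y_1,\\
x_1 &= y_2, & x_2 &= y_3, & x_3 &= y_1,\\
x_1 &= y_3, & x_2 &= y_1, & x_3 &= y_2.
\end{aligned}
\]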
One can check that for the transformation of n variables there are n! inverse transformations.
Further, each inverse transformation is a permutation of the coordinates, so its Jacobian
determinant is +1 or −1, and one can check that in absolute value
\[
\begin{aligned}
|J(x_1, x_2, x_3 \to y_1, y_2, y_3)| &= 1,\\
|J(x_1, x_2, x_3 \to y_2, y_1, y_3)| &= 1,\\
|J(x_1, x_2, x_3 \to y_1, y_3, y_2)| &= 1,\\
|J(x_1, x_2, x_3 \to y_3, y_2, y_1)| &= 1,\\
|J(x_1, x_2, x_3 \to y_2, y_3, y_1)| &= 1,\\
|J(x_1, x_2, x_3 \to y_3, y_1, y_2)| &= 1.
\end{aligned}
\]
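The Jacobian check for n = 3 can be carried out mechanically. The sketch below builds, for each of the 3! inverse transformations, the matrix of partial derivatives ∂x_i/∂y_j (a permutation matrix) and evaluates its determinant. The helper names `det3` and `jacobian_matrix` are hypothetical. The determinant itself is +1 or −1 depending on the permutation; it is the absolute value, which enters the change-of-variables formula, that equals 1 in every case.

```python
from itertools import permutations


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def jacobian_matrix(perm):
    """Partial derivatives dx_i/dy_j for the inverse map x_i = y_{perm[i]} (0-based)."""
    return [[1 if perm[i] == j else 0 for j in range(3)] for i in range(3)]


# One determinant per inverse transformation; there are 3! = 6 of them.
determinants = {perm: det3(jacobian_matrix(perm)) for perm in permutations(range(3))}

for perm, d in determinants.items():
    print(perm, d, abs(d))
```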
Now, the joint density of Y1, Y2, Y3 is derived as
\[
\begin{aligned}
f_{Y_1,Y_2,Y_3}(y_1, y_2, y_3)
&= f_X(y_1) f_X(y_2) f_X(y_3)(1) + f_X(y_2) f_X(y_1) f_X(y_3)(1)\\
&\quad + f_X(y_1) f_X(y_3) f_X(y_2)(1) + f_X(y_3) f_X(y_2) f_X(y_1)(1)\\
&\quad + f_X(y_2) f_X(y_3) f_X(y_1)(1) + f_X(y_3) f_X(y_1) f_X(y_2)(1)\\
&= 6\, f_X(y_1) f_X(y_2) f_X(y_3)\\
&= 3!\, f_X(y_1) f_X(y_2) f_X(y_3), \qquad a < y_1 < y_2 < y_3 < b,
\end{aligned}
\]
where the last line has been obtained because the product is commutative.
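The derivation can be checked numerically by summing the product of marginal densities over all permutations and comparing the result with the closed form 3! f_X(y1) f_X(y2) f_X(y3). The sketch below assumes, purely for illustration, that f_X is an exponential density with rate 1 (so a = 0 and b = ∞); the function names are hypothetical.

```python
import math
from itertools import permutations


def f_X(x, rate=1.0):
    """Assumed population pdf: exponential with the given rate (illustrative choice)."""
    return rate * math.exp(-rate * x) if x > 0 else 0.0


def is_ordered(y):
    """True when y1 < y2 < ... < yn, i.e. y lies in the support of the order statistics."""
    return all(a < b for a, b in zip(y, y[1:]))


def joint_density_by_sum(y):
    """Sum of f_X products over all n! inverse transformations, each with |J| = 1."""
    if not is_ordered(y):
        return 0.0
    return sum(math.prod(f_X(y[i]) for i in perm)
               for perm in permutations(range(len(y))))


def joint_density_closed_form(y):
    """n! * f_X(y1) * ... * f_X(yn) on the ordered region, 0 elsewhere."""
    if not is_ordered(y):
        return 0.0
    return math.factorial(len(y)) * math.prod(f_X(v) for v in y)


y = (0.2, 0.9, 1.5)
print(math.isclose(joint_density_by_sum(y), joint_density_closed_form(y)))  # True
print(joint_density_by_sum((0.9, 0.2, 1.5)))  # 0.0, the point is outside the support
```

The two functions agree only up to floating-point rounding, which is why the comparison uses `math.isclose` rather than exact equality.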
Note that we can write the joint density of Y1, Y2, Y3 as a product of factors, each depending
on only one of y1, y2, y3. But factorization of the joint density alone is not sufficient for
independence: you also have to look at the support set of fY1,Y2,Y3(y1, y2, y3). If the support
set of the joint density is not a Cartesian product, the random variables are not independent.
In this case the support set a < y1 < y2 < y3 < b cannot be written as a Cartesian product, and
therefore Y1, Y2, Y3 are not independent.
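The lack of independence can also be seen empirically. The following sketch simulates the order statistics of three draws and estimates the correlation between Y1 and Y3. It assumes a Uniform(0, 1) population and a fixed seed, both chosen only for illustration, and the function names are hypothetical. Independent variables would have correlation zero, whereas here the estimate should come out clearly positive (the theoretical value for this uniform case is 1/3).

```python
import random


def simulate_order_stats(n_draws, seed=0):
    """Return n_draws sorted triples (y1, y2, y3) from an assumed Uniform(0, 1) population."""
    rng = random.Random(seed)
    return [sorted(rng.random() for _ in range(3)) for _ in range(n_draws)]


def correlation(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


draws = simulate_order_stats(20000)
y1 = [d[0] for d in draws]
y3 = [d[2] for d in draws]

# The ordering constraint holds in every draw, which is what ties the variables together.
print(all(d[0] <= d[1] <= d[2] for d in draws))
# A clearly positive value indicates dependence between Y1 and Y3.
print(correlation(y1, y3))
```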
Now you can easily state the joint density of Y1 , . . . , Yn . This is Theorem 6.5.1 of the
book.
If X1 , X2 , . . . , X n is a random sample from a population with continuous pdf f X (x),
then the joint pdf of the order statistics Y1 , . . . , Yn is