
Review of Basic Concepts

Some abbreviations widely used in statistics.^


Sample statistic                         Preferred   Acceptable          Population
                                         symbol      symbol
Arithmetic mean                          $\bar{X}$                       $\mu$
Chi-square                               $\chi^2$
Correlation coefficient                  r
Coefficient of multiple determination    $R^2$
Coefficient of simple determination      $r^2$
Coefficient of variation                 CV
Degrees of freedom                       df          DF
Least significant difference             LSD
Multiple correlation coefficient         R
Not significant                          NS
Probability of type I error              $\alpha$
Probability of type II error             $\beta$
Regression coefficient                   b                               $\beta$
Sample size                              n           N
Standard error of mean                   SE          $s_{\bar{X}}$       $\sigma_{\bar{X}}$
Standard deviation of sample             SD          s                   $\sigma$
Student's t                              t
Variance                                 $s^2$                           $\sigma^2$
Variance ratio                           F
^The symbols *, **, and *** are used to show significance at the P = 0.05, 0.01, and 0.001 levels,
respectively. Significance at other levels should be designated by a supplemental note.
From: Publications Handbook and Style Manual. 1988. Amer. Soc. Agron. Inc., Crop Sci. Soc.
Amer. Inc., Soil Sci. Soc. Amer., Madison, WI.


Review of Basic Concepts - Example Problem


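Comparison of the speed (time minus 2 minutes, in seconds) of two calculating machines for
computing sums of squares (modified from Cochran and Cox)

                       Machine A                      Machine B
Replication   Time    Deviation    Dev.^2    Time    Deviation    Dev.^2    A-B    Rep. or
              (sec)   from mean              (sec)   from mean                     pair total
1             30*       8           64       14        0            0        16     44
2             21       -1            1       21*       7           49         0     42
3             22*       0            0        5       -9           81        17     27
4             22        0            0       13*      -1            1         9     35
5             19       -3            9       14*       0            0         5     33
6             29*       7           49       17        3            9        12     46
7             17       -5           25        8*      -6           36         9     25
8             14*      -8           64       16        2            4        -2     30
9             23*       1            1        8       -6           36        15     31
10            23        1            1       24*      10          100        -1     47
Total         220                  214      140                   316

In the Total row, the Time columns give $\sum X$ (220 for machine A, 140 for machine B) and the
Dev.^2 columns give $\sum x^2$ (214 for machine A, 316 for machine B),

where $x = X - \bar{X}$

Sums of Squares = SS = $\sum x^2 = \sum (X - \bar{X})^2$

Mean = $\bar{X} = \frac{\sum X}{n}$

Standard Deviation = $s = \sqrt{s^2}$

Variance = $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$

Standard Error = $s_{\bar{X}} = \sqrt{\frac{s^2}{n}}$

Coefficient of Variation = CV = $\frac{s}{\bar{X}} \times 100\%$

Confidence Limits = CL = $\bar{X} \pm t\,s_{\bar{X}}$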
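
The definitions above can be checked against the example table with a short Python sketch. It assumes the ten times per machine are as read from the table; the function name `summarize` and the dictionary keys are illustrative choices, not part of the original handout.

```python
import math

# Times in seconds (minus 2 minutes), as read from the example table.
machine_a = [30, 21, 22, 22, 19, 29, 17, 14, 23, 23]
machine_b = [14, 21, 5, 13, 14, 17, 8, 16, 8, 24]


def summarize(values):
    """Return the basic statistics defined above for one sample."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)  # sum of squares of deviations
    variance = ss / (n - 1)
    sd = math.sqrt(variance)
    se = math.sqrt(variance / n)               # standard error of the mean
    cv = sd / mean * 100                       # coefficient of variation, percent
    return {"n": n, "mean": mean, "ss": ss, "variance": variance,
            "sd": sd, "se": se, "cv": cv}


stats_a = summarize(machine_a)
stats_b = summarize(machine_b)
print(stats_a)
print(stats_b)
```

The means (22 and 14) and sums of squares (214 and 316) agree with the totals in the table, and the coefficients of variation come out at roughly 22% for machine A and 42% for machine B.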


t Distribution
The t distribution was first presented by William S. Gosset, who published under the
pseudonym "Student" in 1908; hence the term Student's t test. The t test compares the deviation
of the sample mean from the population mean against the standard error of the mean. It
is also used to compare the difference between two means against the standard error of
the difference. The t distribution is symmetric and bell-shaped like the normal distribution, but
its shape varies with the df and approaches the normal distribution as the df increase. The
standard t tables are two-tailed tables in which the probability, e.g. 5%, is divided between both
ends of the distribution. They apply when the difference between the results of a standard
treatment and a new treatment may be either positive or negative, i.e. the result of the new
treatment may be either larger or smaller than that of the standard treatment. There are also
one-tailed t tables, for cases in which the difference between the result of a standard treatment
and the new treatment can be only positive or only negative.
The t test for the difference between the sample mean and the population mean is
calculated in the following manner.
$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$ where $s_{\bar{X}} = \sqrt{\frac{s^2}{n}}$
The t test for the difference between means from two different treatments is calculated in
the following manner.
$t = \frac{\bar{X}_a - \bar{X}_b}{s_d}$ where $s_d = \sqrt{\frac{s_a^2}{n_a} + \frac{s_b^2}{n_b}}$
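
A minimal Python sketch of the two-treatment formula follows. It treats the machine A and machine B times from the example problem as two independent samples purely for illustration (the example data are actually paired by replication); `two_sample_t` is an illustrative name.

```python
import math


def sample_variance(values):
    """Variance with n - 1 in the denominator."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def two_sample_t(a, b):
    """t for the difference between the means of two treatments."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Standard error of the difference between the two means.
    s_d = math.sqrt(sample_variance(a) / len(a) + sample_variance(b) / len(b))
    return (mean_a - mean_b) / s_d


machine_a = [30, 21, 22, 22, 19, 29, 17, 14, 23, 23]
machine_b = [14, 21, 5, 13, 14, 17, 8, 16, 8, 24]

t_value = two_sample_t(machine_a, machine_b)
print(round(t_value, 3))
```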

Degrees of freedom (df) for looking up t value in t table:


1. If samples are from two populations, the df is the sum of the df for the two
populations.
2. If pairs of values or replicated comparisons are being compared, the df is the number
of pairs - 1.
3. In an ANOVA, the df is the df for mean square for error.
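
Rules 1 and 2 above can be sketched in Python as follows, together with a paired t computed on the A-B differences from the example problem. The function names are illustrative, and the paired calculation simply applies the single-mean formula to the differences with a population mean of zero, which is an assumption of this sketch rather than a procedure spelled out in the text. Rule 3 depends on the ANOVA design and is not coded here.

```python
import math


def df_two_samples(n_a, n_b):
    """Rule 1: sum of the df of the two samples."""
    return (n_a - 1) + (n_b - 1)


def df_paired(n_pairs):
    """Rule 2: number of pairs minus 1."""
    return n_pairs - 1


def paired_t(a, b):
    """t for paired values: mean difference over its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n)


machine_a = [30, 21, 22, 22, 19, 29, 17, 14, 23, 23]
machine_b = [14, 21, 5, 13, 14, 17, 8, 16, 8, 24]

print(df_two_samples(len(machine_a), len(machine_b)))  # 18
print(df_paired(len(machine_a)))                       # 9
t_paired = paired_t(machine_a, machine_b)
print(round(t_paired, 3))
```

Treating the same data as pairs halves the df (9 instead of 18), so the tabulated t to compare against is larger.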

Confidence Limits
The confidence limits of a mean may be calculated by the formula
CL = $\bar{X} \pm t\,s_{\bar{X}}$
as shown in the example.
When the t value for the 0.05 probability level is used in the CL, the limits are expected to
include the true mean with a confidence of 95%; they fail to include it only when a 1-in-20
chance has occurred.
The 95% confidence limits may be calculated for the means of each of two treatments. If
the confidence limits of the two means do not overlap, it may be concluded that the two means
are significantly different at the 0.05 probability level.
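
The overlap rule can be tried on the two calculating machines from the earlier example problem. This Python sketch assumes the summary values from that table (means 22 and 14, sums of squares 214 and 316, n = 10 each) and the two-tailed table value t = 2.262 for P = 0.05 and 9 df; the function names are illustrative.

```python
import math


def confidence_limits(mean, variance, n, t_value):
    """Lower and upper confidence limits of a mean."""
    half_width = t_value * math.sqrt(variance / n)
    return mean - half_width, mean + half_width


def limits_overlap(cl_1, cl_2):
    """True if two (lower, upper) intervals share any values."""
    return cl_1[0] <= cl_2[1] and cl_2[0] <= cl_1[1]


T_05_9DF = 2.262  # two-tailed t for P = 0.05 with 9 df, from a t table

cl_a = confidence_limits(22.0, 214 / 9, 10, T_05_9DF)
cl_b = confidence_limits(14.0, 316 / 9, 10, T_05_9DF)

print([round(x, 1) for x in cl_a])
print([round(x, 1) for x in cl_b])
print(limits_overlap(cl_a, cl_b))
```

Here the lower limit for machine A (about 18.5) lies just above the upper limit for machine B (about 18.2), so the two sets of limits do not overlap.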
Confidence Limits - Example Problem
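Determine the confidence limits for calculating machine A, given:

$\sum X = 220$, $n = 10$, $\sum x^2 = 214$

$\bar{X} = \frac{\sum X}{n} = \frac{220}{10} = 22.0$

$s^2 = \frac{\sum x^2}{n - 1} = \frac{214}{9} = 23.78$

$s_{\bar{X}} = \sqrt{\frac{s^2}{n}} = \sqrt{\frac{23.78}{10}} = 1.54$

CL = $22 \pm t\,s_{\bar{X}}$

For P = 0.80
t(.20,9) = 1.383
CL = $22 \pm (1.383)(1.54)$
= $22 \pm 2.1$
= 19.9, 24.1

For P = 0.95
t(.05,9) = 2.262
CL = $22 \pm (2.262)(1.54)$
= $22 \pm 3.5$
= 18.5, 25.5

For P = 0.99
t(.01,9) = 3.250
CL = $22 \pm (3.250)(1.54)$
= $22 \pm 5.0$
= 17.0, 27.0

[Figure: confidence limits for the mean of machine A on a time axis running from 16 to 28 sec,
centered on $\bar{X}$ = 22: P = 0.80, 19.9 to 24.1; P = 0.95, 18.5 to 25.5; P = 0.99, 17.0 to 27.0.]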

Thus, with a confidence of 95%, we can say that the true mean, $\mu$, is included in the
range 18.5 to 25.5. Or, to state this another way, there is 1 chance in 20 or 5 chances in 100 that
the true mean for machine A lies outside this range.
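
The arithmetic of this example can be reproduced with a short Python sketch. It assumes the summary values given for machine A and two-tailed t values for 9 df taken from a standard t table (1.383, 2.262 and 3.250); the variable names are illustrative.

```python
import math

# Summary values for machine A from the example problem.
sum_x = 220
n = 10
sum_sq_dev = 214

mean = sum_x / n
variance = sum_sq_dev / (n - 1)
se = math.sqrt(variance / n)  # standard error of the mean

# Two-tailed t values for 9 df, keyed by confidence level (from a t table).
T_TABLE_9DF = {0.80: 1.383, 0.95: 2.262, 0.99: 3.250}

limits = {}
for level, t_value in T_TABLE_9DF.items():
    half_width = t_value * se
    limits[level] = (mean - half_width, mean + half_width)
    print(f"P = {level:.2f}: {limits[level][0]:.1f}, {limits[level][1]:.1f}")
```

The limits widen as the confidence level rises, which is the pattern the three intervals in the example show.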
