Stats 2
open end class interval: it is a series in which the lower limit of the first class interval and the upper limit of the last class interval are not specified
ex:
less than 10
10-20
20-30
more than 30
close end series: it is a series in which all the class intervals are clearly specified
ex:
10-20
20-30
30-40
TABULATION OF DATA:
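A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers, with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings, and notes to make clear the full meaning of the data and their origin.
Table number: the table number is given in a logical sequence for proper identification and for easy and ready reference in future. The table number should be placed at the top of the table, either in the centre, or above the title, or in the ...
Title: every table must be given a suitable title, which appears at the top of the table (below the table no or next to the table no). The title should be self-explanatory. It should precisely describe the nature of the data.
Captions and stubs: captions are the headings for the vertical columns and stubs are the headings for the horizontal rows. They should be brief, concise and self-explanatory.
Body of the table: the arrangement of the data according to the descriptions given in the captions and stubs forms the body of the table. It contains the numerical info. Totals must be given for each separate class/category below the columns or against the rows; the totals of the rows/columns should also be given.
Footnote: when some characteristic or feature or item of the table has not been adequately explained and needs further elaboration, or when some additional or extra info is required for its complete description, footnotes are used.
The footnote is placed at the bottom of the table, directly below the body of the table. Footnotes are identified by symbols.
Source note: the source note is required if secondary data is used. It must be given at the bottom of the table, below the footnote.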
footnote:
sourcenote:
in a sample study about coffee habits in two towns, the following info was received
town a
females were 40%, total coffee drinkers were 45%, and male non coffee drinkers were 20%
town b
males were 55%, male non coffee drinkers were 30% and female coffee drinkers were 15%
present the data in a tabular form.
Table no. 1
title: coffee drinking habits in two towns
                  Town A (%)                Town B (%)
              male  female  total       male  female  total
coffee          40       5     45         25      15     40
non coffee      20      35     55         30      30     60
total           60      40    100         55      45    100
footnote
sourcenote:
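The missing cells of such a table follow from the row and column totals. A minimal sketch of that bookkeeping, assuming a 3x3 grid whose last row and last column hold totals; the function name `fill_table` and the grid layout are illustrative choices, not part of the notes:

```python
def fill_table(grid):
    """Fill None cells of a 3x3 grid whose last row/column hold the totals."""
    grid = [row[:] for row in grid]
    # Every row and every column obeys: first + second = third.
    lines = [[(r, c) for c in range(3)] for r in range(3)]
    lines += [[(r, c) for r in range(3)] for c in range(3)]
    changed = True
    while changed:
        changed = False
        for line in lines:
            vals = [grid[r][c] for r, c in line]
            if vals.count(None) != 1:
                continue  # nothing to do, or not enough information yet
            i = vals.index(None)
            r, c = line[i]
            if i == 2:
                grid[r][c] = vals[0] + vals[1]
            else:
                grid[r][c] = vals[2] - vals[1 - i]
            changed = True
    return grid

# Rows: coffee, non coffee, total. Columns: male, female, total (all in %).
town_a = [[None, None, 45], [20, None, None], [None, 40, 100]]
town_b = [[None, 15, None], [30, None, None], [55, None, 100]]

for row in fill_table(town_a):
    print(row)
for row in fill_table(town_b):
    print(row)
```

Only the figures stated in the problem are entered; every other cell is derived by subtraction or addition.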
the no of workers in a large factory in 2006 was 540, of which 30% were females and the rest males
in 2008 the strength of the workers increased by 100 females and 200 males
in 2010 the total no of workers increased by 25% on its value in 2008
and the female workers were 340. tabulate the info
table no 2
no of workers in a large factory in different years
footnote:
source note: problem discussed in the class
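The figures for table no 2 are not preserved in these notes, but they follow from the problem statement. A small sketch of the arithmetic; the function name `workers_table` and the (males, females, total) ordering are illustrative assumptions:

```python
def workers_table():
    """Return {year: (males, females, total)} derived from the problem text."""
    total_2006 = 540
    females_2006 = total_2006 * 30 // 100        # 30% of 540
    males_2006 = total_2006 - females_2006

    females_2008 = females_2006 + 100
    males_2008 = males_2006 + 200
    total_2008 = females_2008 + males_2008

    total_2010 = total_2008 * 125 // 100         # 25% more than in 2008
    females_2010 = 340                           # given
    males_2010 = total_2010 - females_2010

    return {
        2006: (males_2006, females_2006, total_2006),
        2008: (males_2008, females_2008, total_2008),
        2010: (males_2010, females_2010, total_2010),
    }

table = workers_table()
for year, (males, females, total) in table.items():
    print(year, males, females, total)
```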
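present the following info in a suitable tabular form supplying the figures not directly given:
in 1995, out of a total of 2000 workers in a factory, 1550 were members of a trade union. The number of women employed was 250, out of which 200 did not belong to any trade union. In 2000, the number of union workers was 1725, of which 1600 were men. The no of non union workers was 380, among which 155 were women.
table no: 3
                   1995                          2000
          union  non union  total       union  non union  total
men        1500        250   1750        1600        225   1825
women        50        200    250         125        155    280
total      1550        450   2000        1725        380   2105
foot note:
source note: problem discussed in class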
in 1990, out of a total of 2000 students in a college, 1400 were for graduation and the rest for post graduation.
out of 1400 graduate students 100 were girls. However, in all there were 600 girls in the college.
in 1995 the no of graduate students increased to 1700, out of which 250 were girls, but the no of pg students fell to 500, of which only 50 ...
in 2000, out of 800 girls, 650 were for graduation, whereas the total no of graduates was 2200. the no of boys and girls in p ...
also calculate the % increase in the no of graduate students in 2000 as compared to 1990.
table 4
no of students in the college
       1990                    1995                    2000
graduate  p.g  total    graduate  p.g  total    graduate  p.g  total
% increase = (2200-1400)/1400*100 = 57.14
out of a total no of 2807 women who were interviewed for employment in a textile factory,
912 were from textile areas and the rest from non textile areas. Amongst the married women
who belong to textile areas, 347 had some work experience and 173 did not have work experience,
while for non textile areas the corresponding figures were 199 and 670 respectively.
the total no of women having no experience was 1841, of whom 311 resided in textile areas.
of the total no of women 1418 were unmarried, and of these the no of women having experience in the textile and non textile areas was 254 and 166 respectively.
tabulate the info.
table no 5
table showing the no of workers in a textile factory.
                  textile                  non textile                    total
             exp  non-exp  total       exp  non-exp  total       exp  non-exp  total
married      347      173    520       199      670    869       546      843   1389
unmarried    254      138    392       166      860   1026       420      998   1418
total        601      311    912       365     1530   1895       966     1841   2807
footnote:
* exp- experienced
sourcenote: copied from vidhur
DIAGRAM:
MEANING: a diagram is a visual form for the presentation of statistical data, highlighting their basic facts and relationships.
if we draw diagrams on the basis of the data collected, they will easily be understood and appreciated by all. It is readily intelligible ...
SIGNIFICANCE OF DIAGRAMS:
types of diagrams:
* one dimensional diagrams: simple bar, multiple bar, sub divided, sub divided percentage bar.
* two dimensional: rectangles, squares, circles.
* three dimensional: cubes, cylinders, blocks
* others: pictograms and cartograms
[Figure: bar chart of no of students (1 class, 2 class, 3 class, fail) by years 2006, 2007, 2008]
[Figure: bar chart by years 1994-95 to 1998-99]
[Figure: percentage bar chart, humanities and science, by years 2003-4, 2004-5, 2005-6]
Headnote: it is given just below the title in a prominent type, usually centered and enclosed in brackets, for further description of the contents of the table.
boys:
75 = sumx/70
sumx = 75*70
sumx = 5250
girls:
mean = sumx/30
sumx = 30x
boys+girls:
7200 = 5250+30x
7200-5250 = 30x
1950/30 = x
x(girls) = 65
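The working above can be checked with a short sketch. The combined total of 7200 is taken as given from the working itself (the problem statement is not preserved here), and the helper name `missing_group_mean` is hypothetical:

```python
def missing_group_mean(combined_total, known_mean, known_n, other_n):
    """Mean of the second group, given the combined sum of both groups."""
    known_sum = known_mean * known_n          # sum x for the known group
    return (combined_total - known_sum) / other_n

# Boys: mean 75, n = 70. Girls: n = 30. Combined sum of marks: 7200.
girls_mean = missing_group_mean(combined_total=7200, known_mean=75,
                                known_n=70, other_n=30)
print(girls_mean)  # 65.0
```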
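properties of arithmetic mean:
x        (x - xbar)   (x - xbar)^2
10       -20          400
30       0            0
20       -10          100
50       20           400
40       10           100
total 150   0         1000
xbar = 150/5
xbar = 30
the sum of the deviations of the items from the mean is zero.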
merits of arithmetic mean:
* it is simple to understand
* it is easy to calculate
* it is based on all observations
* it is suitable for further mathematical treatment
* it is rigidly defined.
demerits:
* extreme items affect the average figure disproportionately
* it may not be an item among the given values of the variable. It may represent a fractional figure
* it may give a wrong impression if proper weights are not given
* in the case of open end class intervals, midpoints are arrived at by making assumptions, which may be wrong
* it cannot be determined by inspection, nor can it be located graphically.
MEDIAN:
MEDIAN IS A POSITIONAL AVERAGE
* median is the value of that item in a series which divides the array into 2 equal parts, one consisting of all the values less than it and the other of all the values greater than it
* it is denoted by the symbol Md
in order: 7 8
md = size of ((7+1)/2)th item = 4th item = 14
20 45 32
1 2 3
16 18
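IN A CLASS OF 15 STUDENTS 5 STUDENTS FAILED IN A TEST. THE MARKS OF THE 10 STUDENTS WHO PASSED WERE (IN ORDER) 4, 5, 6, 6, 7, 7, 8, 8, 9, 9.
WHAT IS THE MEDIAN MARK OF ALL STUDENTS?
THE 5 STUDENTS WHO FAILED TAKE POSITIONS 1 TO 5 OF THE ARRAY, SO THE STUDENTS WHO PASSED TAKE POSITIONS 6 TO 15:
OBS:    6  7  8  9  10  11  12  13  14  15
MARKS:  4  5  6  6   7   7   8   8   9   9
MD = SIZE OF ((15+1)/2)TH ITEM = 8TH ITEM
MD = 6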
X      F    CF
10     10   10
20     16   26
30     18   44
40     13   57
50     6    63
60     3    66
70     8    74
80     4    78
90     6    84
100    6    90
TOTAL  90
MD = SIZE OF ((90+1)/2)TH ITEM = 45.5TH ITEM
MD = 40
MD= SIZE(30+1)/2=15.5
MD=25
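$Md = l + \frac{N/2 - m}{f} \times c$
where l = lower limit of the median class, N = total frequency, m = cumulative frequency of the class before the median class, f = frequency of the median class, c = class width
40 + (215-195)/22 * 10 = 49.09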
age           no of persons
below 10      2
below 20      5
below 30      9
below 40      12
below 50      14
below 60      15
below 70      15.5
70 and over   15.6
age         cf     f
0-10        2      2
10-20       5      3
20-30       9      4
30-40       12     3
40-50       14     2
50-60       15     1
60-70       15.5   0.5
70 & over   15.6   0.1
MEDIAN IS A SUITABLE MEASURE WHEN THE CLASS INTERVALS ARE OPEN END.
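IN A GROUP OF THOUSAND WAGE EARNERS THE MONTHLY WAGES OF 4% ARE BELOW RS 60 AND THOSE OF 15% ARE UNDER RS 62.50. 15% EARNED RS 95 AND OVER AND 5%
GOT RS 100 AND OVER. FIND THE MEDIAN WAGES.
WAGES          NO OF WORKERS   CF
0-60           40              40
60-62.5        110             150
62.5-95        700             850
95-100         100             950
100 AND OVER   50              1000
TOTAL          1000
N/2 = 500, SO THE MEDIAN CLASS IS 62.5-95
MEDIAN = 62.5 + (500-150)/700 * 32.5 = 78.75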
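The interpolation formula for the median of a continuous series can be sketched as follows. The function name `grouped_median`, the lower limit 0 for the first class and the unbounded upper limit for the last class are assumptions made for illustration; neither open end affects the result because the median class is closed.

```python
def grouped_median(classes):
    """classes: list of (lower, upper, frequency) in ascending order."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cum = 0  # cumulative frequency before the current class (m)
    for lower, upper, f in classes:
        if f > 0 and cum + f >= half:
            # Md = l + (N/2 - m) / f * c
            return lower + (half - cum) / f * (upper - lower)
        cum += f
    raise ValueError("empty distribution")

wages = [
    (0, 60, 40),
    (60, 62.5, 110),
    (62.5, 95, 700),
    (95, 100, 100),
    (100, float("inf"), 50),
]
print(grouped_median(wages))  # 78.75
```

This is why the median suits open end class intervals: only the median class and the cumulative frequency before it enter the formula.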
PROPERTIES OF MEDIAN:
THE SUM OF DEVIATIONS OF THE ITEMS FROM THE MEDIAN (IGNORING THE -VE SIGNS) WILL BE THE LEAST, I.E. SMALLER THAN THE ONE OBTAINED FROM ANY OTHER VALUE.
IT IS ONLY A POSITIONAL AVERAGE CALCULATED WITH THE HELP OF A MATHEMATICAL FORMULA BASED ON INTERPOLATION.
MERITS OF MEDIAN:
IT IS SIMPLE TO UNDERSTAND AND EASY TO CALCULATE
IT ELIMINATES THE EFFECT OF EXTREME VALUES OF THE VARIABLE, BY WHICH IT IS NOT AFFECTED
IT IS CAPABLE OF FURTHER ALGEBRAIC TREATMENT FOR ANALYZING ORDER MEASURES
IT CAN BE DETERMINED JUST BY INSPECTION OF THE ARRAYED DATA.
IT CAN BE DETERMINED GRAPHICALLY WITH THE HELP OF OGIVES
LIMITATIONS:
IT MAY NOT ALWAYS BE REPRESENTATIVE OF THE ITEMS AS IT IGNORES THE EXTREME VALUES
IT CANNOT BE DETERMINED PRECISELY WHEN ITS SIZE FALLS BETWEEN 2 VALUES
IT REQUIRES THE DATA TO BE ARRAYED BEFORE IT CAN BE DETERMINED, WHICH INVOLVES A CONSIDERABLE AMOUNT OF WORK
MODE:
EX :10 20 30
THERE IS NO MODE
*Mode is said to be ill- defined when the data is bi modal or multi modal.
*the modal value for discrete freq dist can be obtained just by inspection.
marks   no of students
10      6
20      5
30      15
40      11
50      8
mode: 30
weight  no of stu
60      5
62      7
64      15
66      21
68      36
70      15
mode: 68
∆1= f1-f0
87-52=35
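calculate the mode for the following data:
income         freq
less than 50   97
less than 45   95
less than 40   90
less than 35   80
less than 30   60
less than 25   30
less than 20   12
less than 15   4
income   freq
0-15     4
15-20    8
20-25    18   f0
25-30    30   f1
30-35    20   f2
35-40    10
40-45    5
45-50    2
total    97
∆1 = 30-18 = 12
∆2 = 30-20 = 10
mode = 25 + 12/(12+10) * 5 = 27.73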
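A sketch of the same steps in code: the "less than" cumulative figures are first converted to class frequencies, then the modal class formula is applied. The function name `grouped_mode` and the lower limit 0 for the first class are assumptions for illustration.

```python
def grouped_mode(classes):
    """classes: list of (lower, upper, frequency) in ascending order."""
    i = max(range(len(classes)), key=lambda k: classes[k][2])  # modal class
    lower, upper, f1 = classes[i]
    f0 = classes[i - 1][2] if i > 0 else 0
    f2 = classes[i + 1][2] if i + 1 < len(classes) else 0
    d1, d2 = f1 - f0, f1 - f2
    # mode = l + d1 / (d1 + d2) * c
    return lower + d1 / (d1 + d2) * (upper - lower)

# (upper limit, "less than" cumulative frequency), ascending
less_than = [(15, 4), (20, 12), (25, 30), (30, 60),
             (35, 80), (40, 90), (45, 95), (50, 97)]

classes = []
prev_limit, prev_cf = 0, 0
for limit, cf in less_than:
    classes.append((prev_limit, limit, cf - prev_cf))
    prev_limit, prev_cf = limit, cf

mode = grouped_mode(classes)
print(round(mode, 2))  # 27.73
```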
ASYMMETRICAL DATA
MODE = 3 MEDIAN - 2 MEAN
YOU NEED TO FIND OUT THE MODE VALUE WHEN MEAN AND MEDIAN VALUES ARE GIVEN:
IF THE MEAN AND MEDIAN ARE 26.8 AND 27.9
MODE = 3 MEDIAN - 2 MEAN
MODE = 3(27.9) - 2(26.8) = 83.7 - 53.6 = 30.1
FINDING THE MEDIAN WHEN THE MODE (32.1) AND THE MEAN (35.4) ARE GIVEN:
32.1 = 3 MEDIAN - 2 MEAN
32.1 = 3 MEDIAN - 2(35.4)
32.1 = 3 MEDIAN - 70.8
32.1 + 70.8 = 3 MEDIAN
102.9 = 3 MEDIAN
MEDIAN = 34.3
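The empirical relation can be rearranged for whichever average is unknown. A minimal sketch, with hypothetical helper names:

```python
def mode_from(mean, median):
    """Empirical relation for moderately asymmetrical data."""
    return 3 * median - 2 * mean

def median_from(mode, mean):
    """Same relation solved for the median."""
    return (mode + 2 * mean) / 3

print(round(mode_from(mean=26.8, median=27.9), 1))    # 30.1
print(round(median_from(mode=32.1, mean=35.4), 1))    # 34.3
```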
AN AVERAGE IS A SINGLE VALUE AROUND WHICH ALL THE VALUES LIE.
the term dispersion or variation is studied with reference to two measures: absolute and relative
* absolute measures are expressed in terms of the original units and they are not suitable for comparative studies
* relative measures are expressed in ratios or % and they are suitable for comparative studies.
RANGE:
range for raw data
range is denoted by R
R = largest value - smallest value
R = L-S
R = 11-4 = 7
COEFF OF R = (L-S)/(L+S)
COEFF OF R = 7/15 = 0.47
MARKS STU
10 15
20 18
30 25
40 30
50 16
60 10
70 9
MARKS STU
10_20 4
20-30 10
30-40 16
40-50 22
50-60 20
60-70 28
70-80 8
80-90 2
90-100 5
R = L-S
R = 100-10 = 90
CO-EFF OF R = (100-10)/(100+10)
= 90/110 = 0.818
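Both measures depend only on the two extreme values, as this sketch shows. The list `raw` is an illustrative set whose extremes match the raw-data example (L = 11, S = 4); for a continuous series the lowest and highest class limits are passed instead.

```python
def value_range(values):
    """Absolute measure: R = L - S."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """Relative measure: (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

raw = [8, 11, 10, 4]                 # illustrative raw data, L = 11 and S = 4
print(value_range(raw), round(coefficient_of_range(raw), 2))       # 7 0.47

limits = [10, 100]                   # lowest and highest class limits
print(value_range(limits), round(coefficient_of_range(limits), 3)) # 90 0.818
```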
MERITS OF RANGE:
* IT IS THE SIMPLEST MEASURE OF DISPERSION
* IT IS RIGIDLY DEFINED
* IT IS USEFUL IN STATISTICAL METHODS OF QUALITY CONTROL
* IT IS USEFUL IN STUDYING THE VARIATIONS IN THE PRICES OF SHARES AND STOCKS
* IT IS USEFUL IN STUDYING WEATHER CONDITIONS WHERE MINIMUM AND MAX TEMP IS IDENTIFIED
DEMERITS OF RANGE:
* IT IS NOT A STABLE MEASURE OF DISPERSION BECAUSE IT DEPENDS ONLY ON THE EXTREME VALUES
* IT IS NOT A SUITABLE MEASURE OF DISPERSION WHEN CLASS INTERVALS ARE OPEN ENDED
* IN FINDING THE RANGE, FREQUENCIES ARE NEVER TAKEN INTO ACCOUNT
QUARTILE DEVIATION:
* IT IS ALSO KNOWN AS SEMI INTER QUARTILE RANGE.
IT IS DENOTED BY QD
$QD = \frac{Q_3 - Q_1}{2}$ (ABSOLUTE MEASURE)
Q1 = 30 + (37.5-34)/30 * 15
Q1 = 31.75
3N/4 = 3*37.5 = 112.5
Q3 = 62.625
Q1 = 162.5
Q3 = 179.72
QD = (179.72-162.5)/2 = 8.61
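The last set of figures (Q1 = 162.5, Q3 = 179.72, QD = 8.61) can be reproduced from the 140-150 to 190-200 class table that appears later in these notes. A sketch under that assumption, with a hypothetical helper `grouped_quantile` that applies the same interpolation as the median formula at N/4 and 3N/4:

```python
def grouped_quantile(classes, p):
    """Interpolated value below which a fraction p of the frequency lies."""
    n = sum(f for _, _, f in classes)
    target = p * n
    cum = 0  # cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f
    raise ValueError("empty distribution")

classes = [(140, 150, 4), (150, 160, 6), (160, 170, 10),
           (170, 180, 18), (180, 190, 9), (190, 200, 3)]

q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
qd = (q3 - q1) / 2
print(q1, round(q3, 2), round(qd, 2))  # 162.5 179.72 8.61
```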
MERITS OF QD
* it is easy to calculate
* it is not affected by extreme values of the variable as it is concerned with the central half portion of the distribution.
* it is not at all affected by open end intervals
demerits of QD:
* it is only a positional average, not a mathematical average
* it completely ignores the portions below the lower quartile and above the upper quartile
* it is not capable of further mathematical treatment.
MEAN DEVIATION:
* mean deviation is the average difference of the items in a series from the mean or mode or median
* median is supposed to be the most suitable central tendency for calculating deviations because the sum of deviations from the median is less than the sum of deviations from the mean.
$MD = \frac{\sum |d|}{n}$
(absolute measure)
calculate mean deviation and its coeff for the following data from mean:
x        |d| = |x - 369|
100      269
150      219
200      169
250      119
360      9
490      121
500      131
600      231
671      302
total: 3321   1570
mean = 3321/9 = 369
MD = 1570/9 = 174.44
coeff of MD = 174.44/369 = 0.47
x        59  32  67  43  22  17  64  55  47  80  25
median = 47
|x-47|   12  15  20   4  25  30  17   8   0  33  22     total = 186
MD = 186/11 = 16.91
$\text{coeff of } MD = \frac{MD}{\text{mean / mode / median}}$
X      F    |X-30|   F|X-30|
10     4    20       80
20     7    10       70
30     15   0        0
40     8    10       80
50     7    20       140
60     2    30       60
total  43            430
MODE = 30
MD = 430/43 = 10
AGE    NO OF PERSONS   CF    |X-24|   f|d|
18     12              12    6        72
20     15              27    4        60
22     20              47    2        40
24     25              72    0        0
26     18              90    2        36
28     10              100   4        40
30     8               108   6        48
TOTAL  108                            296
SINCE NOTHING IS SPECIFIED IN THE QUESTION ABOUT CENTRAL TENDENCIES, CALCULATE THE MD FROM MEDIAN
MEDIAN = SIZE OF ((108+1)/2)TH ITEM = 54.5TH ITEM = 24
MD = 296/108 = 2.74
COEFF OF MD = 2.74/24 = 0.114
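A sketch of the same calculation for a discrete frequency distribution, taking deviations from the median. The function names are illustrative. The coefficient printed here uses the unrounded MD, so it differs slightly from the figure above, which divides the rounded 2.74 by 24.

```python
def median_discrete(values, freqs):
    """Median of a discrete frequency distribution (values ascending)."""
    position = (sum(freqs) + 1) / 2
    cum = 0
    for x, f in zip(values, freqs):
        cum += f
        if cum >= position:
            return x
    raise ValueError("empty distribution")

def mean_deviation(values, freqs, centre):
    """MD = sum(f * |x - centre|) / N."""
    n = sum(freqs)
    return sum(f * abs(x - centre) for x, f in zip(values, freqs)) / n

ages = [18, 20, 22, 24, 26, 28, 30]
persons = [12, 15, 20, 25, 18, 10, 8]

med = median_discrete(ages, persons)
md = mean_deviation(ages, persons, med)
print(med, round(md, 2), round(md / med, 4))  # 24 2.74 0.1142
```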
$MD = \frac{\sum f|d|}{N}$
$\text{coeff of } MD = \frac{MD}{\text{mean / mode / median}}$
marks    frequency   x
10-20    5           15
20-30    10          25
30-40    16          35
40-50    20          45
50-60    25          55
60-70    20          65
70-80    18          75
80-90    12          85
90-100   6           95
total    132
mean = 56.06
md = 16.68
coeff = 0.2975
ci f cf
140-150 4 4
150-160 6 10
160-170 10 20
170-180 18 38
180-190 9 47
190-200 3 50
50
SINCE NOTHING IS SPECIFIED IN THE QUESTION ABOUT CENTRAL TENDENCIES, CALCULATE THE MD FROM MEDIAN
$SD = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$
MEAN = 12.0806
SD = 1.5272
CV = SD/MEAN*100
CV = 12.6413
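The coefficient of variation is a relative measure, so it can be used to compare two series. A minimal sketch using the mean and SD figures quoted in these notes; the function name is illustrative:

```python
def coefficient_of_variation(mean, sd):
    """CV = SD / mean * 100 (a relative measure, in %)."""
    return sd / mean * 100

cv_first = coefficient_of_variation(mean=12.0806451613, sd=1.52715632848)
cv_second = coefficient_of_variation(mean=11.77192982456, sd=1.451164612581)
print(round(cv_first, 2), round(cv_second, 2))  # 12.64 12.33

# The series with the smaller CV is the more consistent one.
more_consistent = "second" if cv_second < cv_first else "first"
print(more_consistent)
```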
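SD (CONTINUOUS FREQ)
TO CHECK THE QUALITY OF 2 BRANDS OF LIGHT BULBS, THEIR LIFE IN BURNING HOURS WAS ESTIMATED AS UNDER FOR 100 BULBS OF EACH BRAND
first brand:   mean = 134.5   SD = 68.8095   CV = 51.1595
second brand:  mean = 136.5   SD = 37.3196   CV = 27.3403
the second brand has the smaller CV, so its life is the more consistent.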
an analysis of the monthly wages paid to workers in two firms A and B, belonging to the same factory. Give the ...
                                   firm a   firm b
no of wage earners                 550      650
av monthly wages in 00 rs          50       45
sd of dist of wages in 00 rs.
SKEWNESS
* skewness means lack of symmetry
* it is denoted by Sk
* a distribution is said to be skewed if:
* mean, mode and median fall at different points, i.e. mean is not equal to median is not equal to mode
* quartiles are not equidistant from the median
* the curve drawn with the help of the given data is not symmetrical but stretched more to one side than to the other
$Sk = \frac{\text{mean} - \text{mode}}{\sigma}$
skewness:
* but quite often, mode is ill-defined and quite difficult to locate
* in such a situation, the empirical relationship between the mean, mode and median for a moderately asymmetrical distribution is used: mode = 3 median - 2 mean
mean= 19
mode= 12
sd= 7.2938
sk= 0.9597
it is positively skewed data.
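The figures above (mean 19, mode 12, sd 7.2938) are reproduced by the raw list 10, 12, 12, 12, 15, 20, 25, 26, 28, 30 that appears further down in these notes. A sketch of the mean-minus-mode coefficient on that list, assuming the population form of the standard deviation (dividing by n):

```python
from statistics import mean, mode, pstdev

data = [10, 12, 12, 12, 15, 20, 25, 26, 28, 30]

# Sk = (mean - mode) / sigma, with sigma computed over all n items
sk = (mean(data) - mode(data)) / pstdev(data)
print(mean(data), mode(data), round(pstdev(data), 4), round(sk, 4))

# A positive coefficient means the data are positively skewed.
print("positively skewed" if sk > 0 else "negatively skewed or symmetrical")
```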
mean= 12.4333
mode= 12
sd= 1.2023
sk= 0.3604
mean= 24.5
$\text{mode} = l + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c$
∆1 = f1-f0 = 10
∆2 = f1-f2 = 5
mode = 26.6667
sd = 12.0312
10,12,12,12,15,20,25,26,28,30
SK= 0.2414
wages    no of workers   cf
0-10     15              15    M1
10-20    20              35    Q1
20-30    30              65    Q2
30-40    25              90
40-50    10              100
MARKS    NO OF STUDENTS   CF
0-10     10               10
10-20    15               25
20-30    24               49
30-40    25               74
40-50    10               84
50-60    10               94
60-70    6                100
100
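The quartiles of the marks table give a quartile-based skewness coefficient. The value SK = 0.0095 quoted elsewhere in these notes is consistent with the formula (Q3 + Q1 - 2 Q2)/(Q3 - Q1), usually called Bowley's coefficient; that name and the helper `grouped_quantile` are assumptions of this sketch, not taken from the notes.

```python
def grouped_quantile(classes, p):
    """Interpolated value below which a fraction p of the frequency lies."""
    n = sum(f for _, _, f in classes)
    target = p * n
    cum = 0  # cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f
    raise ValueError("empty distribution")

marks = [(0, 10, 10), (10, 20, 15), (20, 30, 24), (30, 40, 25),
         (40, 50, 10), (50, 60, 10), (60, 70, 6)]

q1 = grouped_quantile(marks, 0.25)
q2 = grouped_quantile(marks, 0.50)
q3 = grouped_quantile(marks, 0.75)
sk_quartile = (q3 + q1 - 2 * q2) / (q3 - q1)
print(q1, q2, q3, round(sk_quartile, 4))  # 20.0 30.4 41.0 0.0095
```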
WAGES   NO OF STUDENTS   fx
50      1                50
60      3                180
70      ?(A)             70A
80      ?(B)             80B
90      6                540
100     2                200
110     1                110
TOTAL   25               1080+70A+80B
INCOMPLETE
$\bar{x} = \frac{\sum fx}{n}$
size freq x fx
0-25 10 12.5 125
25-50 15 37.5 562.5
50-75 14 62.5 875
75-100 11 87.5 962.5
50 2525
mean: 50.5
CI      FREQ
50-59   1
40-49   3
30-39   8
20-29   10
10-19   15
0-9     3
TOTAL   40
CONVERT THE INCLUSIVE CLASS INTERVALS TO EXCLUSIVE FORM AND WRITE THEM IN ASCENDING ORDER
CI          F    X      FX
-0.5-9.5    3    4.5    13.5
9.5-19.5    15   14.5   217.5
19.5-29.5   10   24.5   245
29.5-39.5   8    34.5   276
39.5-49.5   3    44.5   133.5
49.5-59.5   1    54.5   54.5
TOTAL       40          940
MEAN = 940/40 = 23.5
MEAN = 25
CI      FREQ     X    FX
0-10    5        5    25
10-20   ?(A)     15   15A
20-30   15       25   375
30-40   ?(B)     35   35B
40-50   5        45   225
TOTAL   25+A+B        625+15A+35B
25 + A + B = 45
A + B = 20 (A = 20-B)
25 = (625 + 15A + 35B)/45
1125-625 = 15A + 35B
500 = 15A + 35B
100 = 3A + 7B (DIVIDE BY 5)
100 = 3(20-B) + 7B
100 = 60 - 3B + 7B
100-60 = 4B
40 = 4B
B = 10
A = 20-B
A = 20-10
A = 10
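The two conditions (total frequency and total of fx) give two equations in A and B. A sketch of that solution; the mean of 25 and total frequency of 45 are the values implied by the working (1125 = 25 x 45), and the function name `solve_missing` is hypothetical:

```python
def solve_missing(n_total, mean, known, mid_a, mid_b):
    """Find two missing frequencies A and B.

    known: list of (midpoint, frequency) for the classes that are given.
    mid_a, mid_b: midpoints of the two classes with missing frequency.
    """
    known_n = sum(f for _, f in known)
    known_fx = sum(x * f for x, f in known)
    rest_n = n_total - known_n                # A + B
    rest_fx = mean * n_total - known_fx       # mid_a*A + mid_b*B
    b = (rest_fx - mid_a * rest_n) / (mid_b - mid_a)
    a = rest_n - b
    return a, b

a, b = solve_missing(n_total=45, mean=25,
                     known=[(5, 5), (25, 15), (45, 5)],
                     mid_a=15, mid_b=35)
print(a, b)  # 10.0 10.0
```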
MARKS   NO OF STU   X      XF
5-10    2           7.5    15
10-15   2           12.5   25
15-20   A           17.5   17.5A
20-25   4           22.5   90
25-30   4           27.5   110
TOTAL   12+A               240+17.5A
19 = (240+17.5A)/(12+A)
19(12+A) = 240 + 17.5A
228 + 19A = 240 + 17.5A
19A - 17.5A = 240-228
1.5A = 12
A = 12/1.5
A = 8
WEIGHTED ARITHMETIC MEAN
* WEIGHTS ARE NUMBERS OR % WHICH STAND FOR THE RELATIVE IMPORTANCE OF ITEMS
* MEAN (RAW DATA)
$\bar{x}_w = \frac{\sum wx}{\sum w}$
A CANDIDATE OBTAINS THE FOLLOWING MARKS. ENGLISH-60, KANNADA-50, PHYSICS-80, CHEM-70, MATHS-90, BIO-70. ADMIS ... WHERE WEIGHTS FOR DIFF SUBJECTS
ARE GIVEN AS 1,2,3,3,3 AND 2 RESPECTIVELY. FIND OUT THE WEIGHTED MEAN MARKS and also the mean marks.
x       W    wx
60      1    60
50      2    100
80      3    240
70      3    210
90      3    270
70      2    140
total 420   14   1020
mean = 420/6 = 70
weighted mean = 1020/14 = 72.86
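a contractor employs 3 types of workers: male, female and children. To a male worker he pays rupees 30 per day, to a female worker rupees 20 per day and to a child worker rupees 10 per day. Calculate the average wage as well as the weighted average wage assuming the no of workers employed are male=20, female=15, and children=5
x       w    wx
30      20   600
20      15   300
10      5    50
total 60   40   950
mean = 60/3 = 20
weighted mean = 950/40
= 23.75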
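Both weighted mean examples can be checked with a few lines. A minimal sketch; the function name `weighted_mean` is illustrative:

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Contractor: daily wage per type of worker, weighted by no of workers
wages = [30, 20, 10]
workers = [20, 15, 5]
simple = sum(wages) / len(wages)
weighted = weighted_mean(wages, workers)
print(simple, weighted)  # 20.0 23.75

# Candidate: marks per subject, weighted by subject weight
marks = [60, 50, 80, 70, 90, 70]
weights = [1, 2, 3, 3, 3, 2]
print(sum(marks) / len(marks), round(weighted_mean(marks, weights), 2))  # 70.0 72.86
```

The weighted figure is higher in both cases because the larger values carry the larger weights.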
month   cost   no of shares   total
1       10     120            1200
2       12     100            1200
3       15     80             1200
4       20     60             1200
5       24     50             1200
total          410            6000
Mean = 6000/410 = 14.63
median
7 14 19 45 14
4 5 6 7
12 14 19 36
median
14 16 18 26 58 23
4 5 6 7 8
20 26 32 45 58
40 90 5
25 20 30 20 60 20
20 10 40 10 20
20 15 10 5 40 5 15 10
8 11 10 4
mean= 11.77192982456
SD 1.451164612581
CV= 12.32732979391
f(x - xbar)^2
11.8419
8.2174
1.8775
2.5692
12.2727
6.5879
total 43.3667
(x - xbar)^2   f(x - xbar)^2
380.25         5703.75
90.25          1805
0.25           7.5
110.25         2756.25
420.25         4202.5
total 1001.25  14475
... THE CENTRAL 50% OF THE DATA AND IGNORES THE REMAINING 50% TOWARDS THE EXTREMES.
N/4 = 100/4 = 25
Q1 = 15
2N/4 = 2*25 = 50
Q2 = 25
3N/4 = 3*25 = 75
N/4 = 25
Q1 = 20
2N/4 = 50
Q2 = 30.4
3N/4 = 75
Q3 = 41
SK = (Q3 + Q1 - 2 Q2)/(Q3 - Q1) = (41 + 20 - 60.8)/(41 - 20) = 0.0095