
Statistics Assignment 4

BURT IQ TEST

Group 8
Pia Bakshi
Shruti Shukla
Srilakshmi Anumolu

Vikas Vimal

Introduction
The question presents two distinct data sets on Intelligence Quotient (IQ): one
on parents, determining their professional competence, and one on their
children, determining their professional competence.
We have to ascertain whether the given data sets are genuine and, if they are
not, which of the two contains fraudulent data.
As stated by Burt himself, he standardized the sample to a normal distribution
containing 1000 data points, with a mean of 100 and a standard deviation of 15,
for both sets.

Executive Summary
To judge whether the data may be fraudulent, we perform a chi-squared test on
each data set. The chi-squared test is a method for assessing the goodness of
fit between a set of observed values and those expected theoretically from a
distribution.
The observed values are denoted Sn, and the expected values, denoted np, are
derived from the probability distribution of a bell curve.
The weighted sum of squared differences between the observed and expected
values is the chi-squared statistic, denoted X^2.
For a cut-off confidence of 95%, we need the area under the two tails of the
chi-squared distribution, specifically the first 2.5% and the last 2.5%. This
tells us whether the data fit the expected values TOO CLOSELY or TOO POORLY.
We fixed our confidence level at 95% with 9 degrees of freedom. The resulting
chi-squared range is between 2.700 and 19.023.
If the chi-squared value determined for a data set falls within this range,
the data set stands as valid and genuine; otherwise we can deduce that it has
probably been tampered with.
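
The two cut-off values quoted above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (chi2_cdf, chi2_quantile, looks_genuine) are illustrative assumptions and not part of the original analysis, which appears to have been done in a spreadsheet.

```python
import math

def chi2_cdf(x, dof):
    """Lower-tail probability P(X^2 <= x) for a chi-squared variable."""
    if x <= 0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function.
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))

def chi2_quantile(p, dof, lo=0.0, hi=200.0):
    """Value x whose lower-tail probability is p, found by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

DOF = 9  # degrees of freedom used in the report
lower = chi2_quantile(0.025, DOF)  # below this, the fit is TOO CLOSE
upper = chi2_quantile(0.975, DOF)  # above this, the fit is TOO POOR

def looks_genuine(x2):
    """Decision rule from the text: accept if X^2 lies inside the range."""
    return lower <= x2 <= upper

print(round(lower, 3), round(upper, 3))
```

Under these assumptions, lower and upper should agree with the 2.700 and 19.023 quoted in the text to three decimal places.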

Procedure
We use the IQ ranges listed in the leftmost column of the provided data to form
boxes of fixed width. Each box contains the data between, say, 50 and 60 IQ
points, 60 and 70 IQ points, and so on.
By standardizing the limits of each box, we can derive the probability of a
value falling within those limits. For example, the expected probability that
a value falls between the limits of 50 and 60 is 0.0034. Multiplying this by
the total number of data points, 1000, gives 3.4. This is the expected number
of data points in the range 50-60 for a normal distribution with a mean of 100
and a standard deviation of 15.
We thus find the expected values for each box.
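
The expected counts described above can be reproduced with a short calculation. This is a sketch under stated assumptions: the box edges run from 50 to 150 in steps of 10 (so the last box is treated as 140 to 150), and the names normal_cdf, edges and expected are illustrative, not taken from the original.

```python
import math

MEAN, STD_DEV, N_POINTS = 100.0, 15.0, 1000

def normal_cdf(x, mean=MEAN, std_dev=STD_DEV):
    """P(X <= x) for a normal distribution, via the error function."""
    z = (x - mean) / std_dev  # standardize the box limit
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumed box edges: 50, 60, ..., 150 (ten boxes of width 10).
edges = list(range(50, 151, 10))

# Expected count per box = N * (probability of falling inside the box).
expected = [
    N_POINTS * (normal_cdf(hi) - normal_cdf(lo))
    for lo, hi in zip(edges, edges[1:])
]

for lo, hi, e in zip(edges, edges[1:], expected):
    print(f"{lo}-{hi}: {e:.4f}")
```

The first entry corresponds to the 3.4 worked out in the text for the 50-60 box.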
We then calculate the weighted sum of squared differences between the given
and expected values using the formula:
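$$X^2 = \frac{(\mathrm{Sn}_1 - \mathrm{np}_1)^2}{\mathrm{np}_1} + \frac{(\mathrm{Sn}_2 - \mathrm{np}_2)^2}{\mathrm{np}_2} + \frac{(\mathrm{Sn}_3 - \mathrm{np}_3)^2}{\mathrm{np}_3} + \dots + \frac{(\mathrm{Sn}_j - \mathrm{np}_j)^2}{\mathrm{np}_j}$$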
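
The weighted sum can be sketched in a few lines of Python. The function name chi_squared and the two-box counts used to exercise it are hypothetical, chosen only to make the arithmetic easy to follow; they are not Burt's data.

```python
def chi_squared(observed, expected):
    """Sum of (Sn - np)^2 / np over all boxes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((s - e) ** 2 / e for s, e in zip(observed, expected))

# Hypothetical two-box example (not from the original data):
# (12 - 10)^2 / 10 + (18 - 20)^2 / 20 = 0.4 + 0.2
toy_observed = [12, 18]
toy_expected = [10.0, 20.0]
print(chi_squared(toy_observed, toy_expected))
```

Each term is divided by its own expected count, so a deviation of a given size weighs more heavily in a sparsely populated box than in a heavily populated one.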

[Figure: PARENTS counts and expected counts (np) for boxes 1 to 10; vertical axis from 0 to 300.]

Using the formula, we arrived at a chi-squared value of 3.43 in the first case
and 3.109 in the second case.

[Figure: CHILDREN counts and expected counts (np) for boxes 1 to 10; vertical axis from 0 to 300.]

Both chi-squared values fall within the previously determined range of 2.700 to
19.023. The data therefore fit the expected values neither too closely nor too
poorly, and we regard both data sets as genuine.

We generated a set of random data using the RAND function, following a
distribution with a mean of 100 and a standard deviation of 15. This data shows
that the given values are well within the deviation expected of a random data
set.
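
A comparable simulation can be sketched in Python. The report used the spreadsheet RAND function; here random.gauss from the standard library is swapped in, and the function name simulate_counts, the seed, and the box edges (50 to 150 in steps of 10) are assumptions made for illustration.

```python
import random

def simulate_counts(n_points=1000, mean=100.0, std_dev=15.0, seed=8):
    """Draw normal random IQ values and count how many fall in each box."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * 10          # boxes 50-60, 60-70, ..., 140-150
    for _ in range(n_points):
        iq = rng.gauss(mean, std_dev)
        if 50 <= iq < 150:     # values outside the boxes are not counted
            counts[int((iq - 50) // 10)] += 1
    return counts

random_counts = simulate_counts()
print(random_counts)
```

Because draws outside the 50 to 150 range are dropped in this sketch, the counts may sum to slightly less than 1000.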

Range      RANDOM DATA    np
50-60                     3.4013202
60-70      14             18.919751
70-80      71             68.461088
80-90      122            161.28132
90-100     264            247.50746
100-110    247            247.50746
110-120    175            161.28132
120-130    76             68.461088
130-140    20             18.919751
140+       6              3.4013202
Total      1000
Mean IQ    100
std dev    15
X^2

DOF            9
2.5% value     19.023
97.5% value    2.700

Conclusion
Therefore, we deduce that the data originally presented was authentic.
The graphs indicate that the expected values correspond to the given data, and
the simulated random data deviates from the expected values to a similar
extent, which supports the conclusion that the information is genuine and
authentic.
