
EXPLORATORY DATA ANALYSIS

WEEK 3
GOODNESS-OF-FIT TESTS
FOR THE NORMAL DISTRIBUTION

ASSESSING NORMALITY AND DATA TRANSFORMATIONS
Many statistical methods and analyses require that the variable or data used
follow a particular distribution (normal, Poisson, etc.).
Examples include the t-test, the F-test, and regression analysis.
Figure: standardized normal distribution with empirical rule percentages.
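
The empirical rule percentages for the standardized normal distribution can be recomputed directly. The following is a minimal sketch, assuming SciPy is available; the helper name `within_k_sd` is hypothetical and not part of the slides.

```python
from scipy.stats import norm


def within_k_sd(k: float) -> float:
    """Probability that a standard normal value lies within k SDs of the mean."""
    return norm.cdf(k) - norm.cdf(-k)


# Empirical rule: coverage within 1, 2 and 3 standard deviations
for k in (1, 2, 3):
    print(f"within {k} SD: {100 * within_k_sd(k):.2f}%")
```

The three values are the familiar 68%, 95% and 99.7% figures, to rounding.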

ASSESSING NORMALITY

Exploratory data methods:
Histogram and Boxplot
Normal Quantile Plot (also called a Normal Probability Plot)
Goodness-of-Fit Tests, such as
Anderson-Darling Test (MINITAB)
Kolmogorov-Smirnov Test (SPSS)
Lilliefors Test
Shapiro-Wilk Test
Problem: they are not all the same

TESTING FOR NORMALITY
Conventional checks using descriptive statistics:
histogram with a normal curve overlaid,
normal scores plot (normal probability plot).
If the data are "normal", the nonlinear vertical axis of the probability plot
should make the scatter plot of the raw data approximately linear.
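
The linearity check on a normal probability plot can be sketched numerically. This is an illustrative example only: it assumes NumPy and SciPy, and it uses simulated samples rather than the data from the slides. `scipy.stats.probplot` pairs the sorted data with theoretical normal quantiles and reports the correlation `r` of the fitted straight line.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # simulated data, not taken from the slides
normal_sample = rng.normal(loc=40.0, scale=2.0, size=500)
skewed_sample = rng.exponential(scale=1.0, size=500)

# probplot returns ((theoretical quantiles, ordered data), (slope, intercept, r))
(theoretical_q, ordered_data), (slope, intercept, r_normal) = stats.probplot(
    normal_sample, dist="norm"
)
_, (_, _, r_skewed) = stats.probplot(skewed_sample, dist="norm")

# r close to 1 means the points lie close to a straight line
print(f"normal sample: r = {r_normal:.4f}")
print(f"skewed sample: r = {r_skewed:.4f}")
```

A correlation near 1 corresponds to the nearly linear scatter described above; the skewed sample gives a visibly lower value.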

TESTING FOR NORMALITY
[Figure: four panels for the variable Chest: Histogram of Chest (with normal curve), Empirical CDF of Chest (normal), Boxplot of Chest, and Probability Plot of Chest (normal, 95% CI). Mean 39.83, StDev 2.050, N 5738, AD 55.693, P-Value <0.005.]

TESTING FOR NORMALITY
When the data are plotted against the expected z-scores (data on the horizontal
axis, probability on the vertical axis), the normal probability plot shows
right skewness as a curve that bends downward.
In the same plot, left skewness shows up as a curve that bends upward.

TESTING FOR NORMALITY
[Figure: four panels for the variable QRAT: Histogram of QRAT (with normal curve), Empirical CDF of QRAT (normal), Boxplot of QRAT, and Probability Plot of QRAT (normal, 95% CI). Mean 158.6, StDev 97.75, N 50, AD 4.552, P-Value <0.005.]

Assessing Normality and Data Transformations
[Figure: four panels for the variable Speed: Histogram of Speed (with normal curve), Empirical CDF of Speed (normal), Boxplot of Speed, and Probability Plot of Speed (normal, 95% CI). Mean 26.21, StDev 10.75, N 66, AD 5.884, P-Value <0.005.]

Histograms and Boxplots

x̄ = 62.25, s = 12.84

The red curve is the normal distribution fitted to the data, and the blue
curve is the estimated density function; this curve ...
Histograms and Boxplots

Outliers are
not consistent
with normality.

x̄ = 79.42, s = 39.95

The red curve is the normal distribution fitted to the data, and the blue
curve is the estimated density of the data, which does not agree with the
normal fit.
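
The comparison between a fitted normal curve and an estimated density, together with the boxplot outlier check, can be sketched as follows. This is a hedged illustration under stated assumptions: the sample is simulated (a hypothetical right-skewed variable, not the data behind the slides), the density estimate uses `scipy.stats.gaussian_kde`, and outliers are flagged with the usual 1.5 × IQR boxplot rule.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical right-skewed sample with a long upper tail (simulated)
data = rng.lognormal(mean=4.0, sigma=0.5, size=300)

# Normal curve fitted by the sample mean and standard deviation ("red curve")
mu, sigma = data.mean(), data.std(ddof=1)
grid = np.linspace(data.min(), data.max(), 200)
normal_pdf = stats.norm.pdf(grid, loc=mu, scale=sigma)

# Kernel density estimate of the same data ("blue curve")
kde_pdf = stats.gaussian_kde(data)(grid)

# Largest vertical gap between the two curves on the grid
max_gap = np.max(np.abs(normal_pdf - kde_pdf))

# Boxplot rule: points beyond 1.5 * IQR from the quartiles are flagged
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
outliers = data[(data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)]

print(f"max gap between normal fit and density estimate: {max_gap:.5f}")
print(f"number of boxplot outliers: {outliers.size}")
```

When the two curves differ noticeably and the boxplot flags points in one tail only, the histogram and boxplot both point away from normality.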

Normal Quantile Plot


THE IDEAL PLOT:
Here is an example where the data are perfectly normal. The plot on the right
is a normal quantile plot, with the data on the vertical axis and, on the
horizontal axis, the z-scores we would expect if the data were normal.
When the data are approximately normal, the spacing of the two agrees, so the
observations lie on the reference line of the normal quantile plot. The points
should lie within the dashed lines.

Normal Quantile Plot


(right skewness)
The systolic volumes of the male heart patients are clearly right skewed.
When the data are plotted vs. the expected z-scores, the normal quantile plot
shows right skewness by an upward bending curve.

Normal Quantile Plot


(left skewness)
The distribution of birthweights from this study of very low birthweight
infants is skewed left.
When the data are plotted vs. the expected z-scores, the normal quantile plot
shows left skewness by a downward bending curve.
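
The bending direction in a normal quantile plot with the data on the vertical axis can be checked numerically. The sketch below is an assumption-laden illustration: it uses simulated samples, and the helper `bend_direction` is hypothetical. It fits a quadratic to the plotted points and reads the sign of the curvature term.

```python
import numpy as np
from scipy import stats


def bend_direction(data) -> str:
    """Curvature of the normal quantile plot (data on the vertical axis)."""
    # z: expected normal quantiles (horizontal), y: sorted data (vertical)
    (z, y), _ = stats.probplot(data, dist="norm")
    quad_coef = np.polyfit(z, y, deg=2)[0]  # coefficient of z**2
    return "upward" if quad_coef > 0 else "downward"


rng = np.random.default_rng(2)
right_skewed = rng.exponential(scale=1.0, size=400)  # simulated, long right tail
left_skewed = -right_skewed                          # mirror image, long left tail

print("right-skewed sample bends", bend_direction(right_skewed))
print("left-skewed sample bends", bend_direction(left_skewed))
```

If the axes are swapped (data on the horizontal axis), the apparent bending direction is reversed, so the axis orientation should be checked before reading a plot.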

Normal Quantile Plot


(leptokurtosis)

Tests of Normality
There are several different tests that
can be used to test the following
hypotheses:
Ho: The distribution is normal
HA: The distribution is NOT normal
Common tests of normality include:
Shapiro-Wilk
Kolmogorov-Smirnov
Anderson-Darling
Lilliefors
Problem: THEY DON'T ALWAYS
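
Several of these tests can be run side by side on the same sample. The following is a minimal sketch, assuming NumPy and SciPy, with a simulated non-normal sample rather than data from the slides. The 5% level used for the Anderson-Darling comparison is an assumption. The Lilliefors test is not in SciPy; it is available as `statsmodels.stats.diagnostic.lilliefors` and is left out here to avoid an extra dependency.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.exponential(scale=100.0, size=50)  # simulated, clearly non-normal

# Shapiro-Wilk test
sw_stat, sw_p = stats.shapiro(sample)

# Kolmogorov-Smirnov against a normal with parameters estimated from the sample.
# Estimating the parameters from the same data makes this p-value too large;
# the Lilliefors test is the version that corrects for that.
ks_stat, ks_p = stats.kstest(
    sample, "norm", args=(sample.mean(), sample.std(ddof=1))
)

# Anderson-Darling: SciPy reports critical values instead of a p-value
ad = stats.anderson(sample, dist="norm")
idx_5pct = int(np.argmin(np.abs(ad.significance_level - 5.0)))  # assumed 5% level
ad_reject = ad.statistic > ad.critical_values[idx_5pct]

print(f"Shapiro-Wilk:       W = {sw_stat:.4f}, p = {sw_p:.4g}")
print(f"Kolmogorov-Smirnov: D = {ks_stat:.4f}, p = {ks_p:.4g}")
print(f"Anderson-Darling:   A2 = {ad.statistic:.4f}, reject at 5%: {ad_reject}")
```

Running the tests together on one sample makes it easy to see whether their conclusions coincide for that sample.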

Tests of Normality
Ho: The distribution of systolic volume is
normal
HA: The distribution of systolic volume is NOT
normal
Because p < .0001 we
have strong evidence
against normality for
the systolic volume
population distribution
using the Shapiro-Wilk
test.

Tests of Normality
Ho: The distribution of systolic volume is
normal
HA: The distribution of systolic volume is NOT
normal

We do not have
evidence at the
level against the
normality of the
population systolic
volume distribution
when using the

Tests of Normality
Ho: The distribution of cholesterol level is
normal
HA: The distribution of cholesterol level is NOT
normal
We have no
evidence against
the normality of the
population
distribution of
cholesterol levels
for male heart
patients (p = .2184).
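
The decision rule used in these slides can be written as a small helper. This is a sketch only: the function name `normality_decision` is hypothetical, and the default significance level of 0.05 is an assumption, since the slides do not state the level used.

```python
def normality_decision(p_value: float, alpha: float = 0.05) -> str:
    """Compare a normality-test p-value with the significance level alpha."""
    # alpha = 0.05 is an assumed default, not a value given in the slides
    if p_value < alpha:
        return "reject Ho: evidence against normality"
    return "fail to reject Ho: no evidence against normality"


# p-values quoted in the slides: .2184 (cholesterol) and an upper bound of .0001
print(normality_decision(0.2184))
print(normality_decision(0.0001))
```

Failing to reject Ho does not prove normality; it only means the test found no evidence against it.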
