Significance of P-Value, Box-Whisker Plots in Statistical Testing 260811
Hypothesis Test
Hypothesis: A claim that cannot be proved directly and can only be supported or contradicted by data. For example, claiming that a new algorithm is better than the current algorithm when both are tested on the same set of data. A test involves two statements: the null hypothesis, denoted H0, and the alternative hypothesis, denoted H1. The experiment is carried out in an attempt to disprove or reject the null hypothesis. We give H0 priority, so it cannot be rejected unless the evidence against it is sufficiently strong. For example, H0: There is no difference in mean overlapping fraction between algorithm A1 and algorithm A2, against H1: There is a difference.
The outcome of a hypothesis test is "Reject H0 in favour of H1" or "Do not reject H0".
P-value
Assume the null hypothesis H0 is true. The probability value (p-value) of a statistical hypothesis test is the probability of getting a value of the test statistic as extreme as or more extreme than that observed, by chance alone. Equivalently, it is the smallest significance level at which the null hypothesis would be rejected for the observed data. The probability of wrongly rejecting a null hypothesis that is in fact true is the significance level itself, not the p-value. The p-value is compared with the chosen significance level of the test and, if it is smaller, the result is statistically significant. For example, if the null hypothesis is rejected at the 5% significance level, this is reported as "p < 0.05".
Example: Significance level, alpha = 5%
If p < 0.05 ==> little overlap between the distributions of mean overlapping fraction of algorithms A1 and A2 ==> Reject the Null Hypothesis, H0, in favor of the Alternate Hypothesis, H1
If p >= 0.05 ==> high overlap between the distributions of mean overlapping fraction of algorithms A1 and A2 ==> Do not reject the Null Hypothesis, H0 (this is not the same as proving H0 true)
Where,
Null Hypothesis, H0 = There is no difference between the mean overlapping fraction of algorithms A1 and A2
Alternate Hypothesis, H1 = There is a difference between the mean overlapping fraction of algorithms A1 and A2
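To make the role of the significance level concrete, here is a minimal Python sketch (standard library only) that repeats an experiment many times while H0 is actually true and counts how often p < alpha. The population values (mean 0.9, sigma 0.02), the sample size, the number of trials, and the choice of a right-tail z-test with known sigma are all assumptions made for illustration; they do not come from the algorithm comparison above.

```python
import random
from statistics import NormalDist, mean


def right_tail_p_value(sample, mu0, sigma):
    """P-value of a right-tail z-test with known population sigma."""
    n = len(sample)
    z0 = (mean(sample) - mu0) / (sigma / n ** 0.5)
    return 1.0 - NormalDist().cdf(z0)


def false_rejection_rate(alpha=0.05, trials=4000, n=10, seed=0):
    """Fraction of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    mu0, sigma = 0.9, 0.02  # hypothetical population values
    rejections = 0
    for _ in range(trials):
        # Data are drawn with true mean mu0, so H0 holds in every trial.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if right_tail_p_value(sample, mu0, sigma) < alpha:
            rejections += 1
    return rejections / trials


rate = false_rejection_rate()
print(f"false rejection rate: {rate:.3f}")
```

The printed rate should land close to alpha, which is the sense in which a 5% significance level bounds the chance of rejecting a true null hypothesis.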
P-Value Approach
Assume that the null hypothesis is true. The P-Value is the probability of observing a sample mean that is as extreme as or more extreme than the one observed.
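$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
[Figure: the P-Value shown as the tail area under the standard normal curve for the Two-tail, Right Tail, and Left Tail cases]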
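The test statistic and the three tail areas can be sketched in a few lines of Python using only the standard library. The function names and the numbers in the usage lines are hypothetical; the sketch assumes a z-test with known population standard deviation sigma.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(x_bar, mu0, sigma, n):
    """z0 = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (x_bar - mu0) / (sigma / sqrt(n))


def p_value(z0, tail):
    """Tail area under the standard normal curve for the chosen alternative."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z0)
    if tail == "right":
        return 1.0 - std_normal.cdf(z0)
    if tail == "two":
        # Both tails beyond |z0|, which is twice the one-tail area.
        return 2.0 * (1.0 - std_normal.cdf(abs(z0)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Hypothetical numbers, chosen only to exercise the functions.
z0 = z_statistic(x_bar=0.91, mu0=0.90, sigma=0.02, n=10)
for tail in ("left", "right", "two"):
    print(tail, round(p_value(z0, tail), 4))
```

For a positive z0 the right-tail area is the smallest of the three and the two-tail P-Value is exactly twice it, which is why the choice of alternative hypothesis must be made before looking at the data.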
If the P-Value is greater than the significance level α, do not reject the null hypothesis. If the P-Value is smaller than the significance level α, reject the null hypothesis.
Cases where the null hypothesis is rejected:
Left-tail test
If the test statistic is smaller than $-z_\alpha$, then the area to the left of the test statistic (P-Value) is smaller than α.
Right-tail test
If the test statistic is greater than $z_\alpha$, then the area to the right of the test statistic (P-Value) is smaller than α.
Two-tail test
If the test statistic is smaller than $-z_{\alpha/2}$ or larger than $z_{\alpha/2}$, then the combined area in the two tails beyond the test statistic (P-Value) is smaller than α.
Cases where the null hypothesis is not rejected:
Left-tail test
If the test statistic is greater than $-z_\alpha$, then the area to the left of the test statistic (P-Value) is greater than α.
Right-tail test
If the test statistic is less than $z_\alpha$, then the area to the right of the test statistic (P-Value) is greater than α.
Two-tail test
If the test statistic is greater than $-z_{\alpha/2}$ and smaller than $z_{\alpha/2}$, then the combined area in the two tails beyond the test statistic (P-Value) is greater than α.
$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
z0 = 1.11
P(Z > z0) = P(Z > 1.11) = 0.1335
Since 0.1335 > 0.05, do not reject the null hypothesis. There is not sufficient evidence at the α = 0.05 level of significance to support the researcher's claim that the farm sizes are larger.
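The right-tail probability quoted for this example can be checked with a short Python sketch (standard library only). Only the reported z0 = 1.11 and the 0.05 level are taken from the example; the variable names are illustrative.

```python
from statistics import NormalDist

z0 = 1.11      # test statistic reported in the example
alpha = 0.05   # significance level used in the example

# Right-tail test: the P-Value is the area to the right of z0.
p = 1.0 - NormalDist().cdf(z0)
decision = "reject H0" if p < alpha else "do not reject H0"

print(f"P(Z > {z0}) is about {p:.4f}: {decision}")
```

The computed area agrees with the table value of about 0.1335 and is well above 0.05, so the decision is not sensitive to rounding in the table.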
$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
z0 = -2.18
Step 2: Find the P-Value
P(Z < z0) = P(Z < -2.18) = 0.0146
Since 0.0146 < 0.05, reject the null hypothesis. There is sufficient evidence at the α = 0.05 level of significance to support the researcher's claim that the oil output per well has declined.
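For this left-tail example, the P-Value rule and the critical-value rule can be compared side by side. The following Python sketch (standard library only) uses the reported z0 = -2.18 and the 0.05 level; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05
z0 = -2.18  # test statistic reported in the example

# P-Value rule: area to the left of z0 compared with alpha.
p = std_normal.cdf(z0)
reject_by_p_value = p < alpha

# Critical-value rule: z0 compared with the alpha quantile (about -1.645).
z_crit = std_normal.inv_cdf(alpha)
reject_by_critical_value = z0 < z_crit

print(f"p is about {p:.4f}, critical value is about {z_crit:.3f}")
print(reject_by_p_value, reject_by_critical_value)
```

Both rules give the same decision because the normal CDF is increasing: z0 lies to the left of the critical value exactly when the area to its left is smaller than alpha.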
$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
Step 2: Find the P-Value
Reject the null hypothesis. There is sufficient evidence at the α = 0.05 level of significance to conclude that the volume of the Dell stock was different in 2004.
Example
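x = normrnd(.9, .02, 1, 10);   % normrnd = random arrays from the normal distribution; Mean = .9, Sigma = .02, size of x is 1x10
[h, p] = ttest(x, .89)         % one-sample t-test of H0: mean of x = .89
h = 1
p = 0.0289
% x is random, so h and p change from run to run; the values shown are from one run.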
As p < 0.05, we reject the null hypothesis that the mean is 0.89. There is sufficient evidence at the α = 0.05 level of significance to conclude that the mean value differs from 0.89 (ttest is two-sided by default), so the difference is statistically significant.
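The statistic behind a one-sample t-test can be sketched in Python with the standard library alone. The sample below is hypothetical (it is not the output of normrnd), and because the standard library has no t-distribution CDF, the sketch compares the statistic with the tabulated two-sided 5% critical value for 9 degrees of freedom (2.262) instead of computing a p-value. For a two-sided test, |t0| > 2.262 is equivalent to p < 0.05.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """t0 = (sample mean - mu0) / (s / sqrt(n)), with s the sample std dev."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


# Hypothetical sample of 10 measurements, for illustration only.
sample = [0.91, 0.93, 0.89, 0.92, 0.90, 0.94, 0.88, 0.92, 0.91, 0.93]
mu0 = 0.89

# Two-sided 5% critical value for n - 1 = 9 degrees of freedom (t table).
T_CRIT_DF9 = 2.262

t0 = t_statistic(sample, mu0)
reject = abs(t0) > T_CRIT_DF9
print(f"t0 = {t0:.3f}, reject H0: {reject}")
```

Unlike the z-test, the t-test estimates the standard deviation from the sample itself, which is why the reference distribution depends on the sample size through the degrees of freedom.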
Example
x = normrnd(.9, .02, 1, 10);   % normrnd = random arrays from the normal distribution; Mean = .9, Sigma = .02, size of x is 1x10
boxplot(x);
The maximum whisker length is w. The default is a w of 1.5. Points are drawn as outliers if they are larger than q3 + w(q3 - q1) or smaller than q1 - w(q3 - q1), where q1 and q3 are the 25th and 75th percentiles, respectively.
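The outlier rule can be written out directly. The Python sketch below (standard library only) computes the two fences q1 - w(q3 - q1) and q3 + w(q3 - q1) and lists the points outside them. The data are hypothetical, and the quartile convention used here (the "inclusive" method of statistics.quantiles) is an assumption that may give slightly different q1 and q3 than a plotting package does on small samples.

```python
from statistics import quantiles


def outlier_fences(data, w=1.5):
    """Return (lower, upper) fences: q1 - w*(q3 - q1) and q3 + w*(q3 - q1)."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - w * iqr, q3 + w * iqr


def find_outliers(data, w=1.5):
    """Points that a box plot would draw individually beyond the whiskers."""
    lower, upper = outlier_fences(data, w)
    return [x for x in data if x < lower or x > upper]


# Hypothetical measurements with one suspiciously large value.
data = [0.88, 0.89, 0.90, 0.90, 0.91, 0.91, 0.92, 0.93, 1.05]
lower, upper = outlier_fences(data)
outliers = find_outliers(data)
print(f"fences: ({lower:.3f}, {upper:.3f}), outliers: {outliers}")
```

Raising w widens the fences, so fewer points are flagged; with a large enough w the whiskers simply extend to the minimum and maximum of the data.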
Box plot