
STATISTICS FOR SCIENCE AND ENGINEERING

STA408

GROUP PROJECT:
INFERENTIAL STATISTICS

PREPARED BY:

1. AISHAH BINTI MOHD HAZRIL MUNIR @ UMAR 2021462312


2. ELIYANA NATASHA BINTI AZMAN 2022842448
3. NOR AIDA EWANI BINTI AMRAN 2021477216
4. WAN NORIZWANI BINTI WAN MOHAMMAD 'AKTHAILLAH 2021461602

PREPARED FOR:
SIR MUHAMMAD HASBULLAH BIN MOHD RAZALI

DATE OF SUBMISSION:
7 JULY 2023
TABLE OF CONTENTS

ACKNOWLEDGEMENT
1.0 INTRODUCTION
1.1 BACKGROUND OF STUDY
1.2 OBJECTIVE
2.0 METHODOLOGY
2.1 DESCRIPTION OF DATA
2.2 METHOD OF ANALYSIS
3.0 FINDING
3.1 DESCRIPTIVE TEST
3.2 T-TEST TWO SAMPLE ASSUMING EQUAL VARIANCES
3.3 F-TEST TWO-SAMPLE FOR VARIANCES
3.4 ANOVA: SINGLE FACTOR
4.0 CONCLUSION
5.0 REFLECTIVE ESSAY
6.0 REFERENCES
ACKNOWLEDGEMENT

First and foremost, praise and thanks to Allah, the Almighty, for His showers of blessings
throughout the journey to complete the STA408 project assignment successfully. We, the members
of this project group, Aishah Binti Mohd Hazril Munir @ Umar, Wan Norizwani Binti Wan
Mohammad ‘Akthaillah, Nor Aida Ewani Binti Amran and Eliyana Natasha Binti Azman, would
like to thank our STA408 lecturer, Sir Muhamad Hasbullah Bin Mohd Razali, for giving
us the opportunity to carry out this project assignment this semester.

We want to express our deepest gratitude to Sir Muhamad Hasbullah Bin Mohd Razali for
providing guidelines for completing the assignment through numerous consultations.
We are grateful that, with his support and guidance, we managed to complete this assignment
within the time given.

Most importantly, we want to express our sincere gratitude to each member of the group. Everyone's
dedication, professionalism, and commitment to excellence were exemplary. Despite the
inevitable obstacles and time constraints, we remained united and supportive of one another,
which truly made a difference in the project's outcome. We also sincerely thank everyone who
helped, directly or indirectly, with our project assignment.
1.0 INTRODUCTION
1.1 BACKGROUND OF STUDY

The main objective of statistical inference is to draw justified conclusions about a population
from data obtained from a sample of that population. When the value of a population parameter
is unknown, it is estimated using either a point estimate or an interval
estimate. If a hypothesis about the parameter’s value is made, we have to test whether the
hypothesis is supported by the data or not. A hypothesis is a statement about the value of a
population parameter. The method that tells us whether or not to reject the hypothesis is called
hypothesis testing, also known as a test of significance.

There are five important steps involved in hypothesis testing. The first step is stating the null
and alternative hypotheses from the research question. The null hypothesis (Ho) is a statement that
the population parameter has a specific value. The alternative hypothesis (H1) is the opposite
of the null hypothesis. It is accepted when the null hypothesis is rejected.

The second step is selecting the appropriate test statistic. The test statistic is a single number
calculated from the sample data that serves as the basis for deciding whether to reject the null
hypothesis or not.

Besides, a t-test is an inferential statistical test used to determine whether there is a
significant difference between the means of two groups. Three key values are needed to calculate
the t statistic: the difference between the mean values of the two data sets, the standard
deviation of each group and the number of data values in each group.
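
As a minimal sketch of how those three inputs combine, the pooled (equal-variance) t statistic can be computed as below. The function name and the summary values are hypothetical and are not taken from the report's data set.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming equal population variances."""
    # Pooled variance: average of the two sample variances,
    # weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    # Standard error of the difference between the two sample means.
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    degrees_of_freedom = n1 + n2 - 2
    return (mean1 - mean2) / std_error, degrees_of_freedom


# Hypothetical summary values, for illustration only.
t_stat, df = pooled_t_statistic(mean1=12.0, sd1=2.0, n1=10,
                                mean2=10.0, sd2=2.0, n2=10)
print(f"t = {t_stat:.4f}, df = {df}")
```

The degrees of freedom returned alongside the statistic are what a t table or Excel would use to look up the critical value.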

On the other hand, the F-test is the statistical analysis used to determine whether the variances
of the two populations from which two samples are drawn are equal or not. In this study, the
F-test is used to determine whether there is a difference in the variance of gross income between
Branch A and Branch B. The ratio of the two sample variances of gross income gives the F
statistic, which is then compared with a critical value.
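
In symbols, the division described above can be sketched as follows. The notation is an assumption introduced here for illustration: $s_1^2$ and $s_2^2$ are the sample variances of gross income for Branch A and Branch B, and $n_1$ and $n_2$ are the corresponding sample sizes.

```latex
F = \frac{s_1^2}{s_2^2}, \qquad
\mathrm{df}_1 = n_1 - 1, \qquad
\mathrm{df}_2 = n_2 - 1
```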

Finally, the ANOVA test is a statistical test of differences in means. According to
Kenton (2023), analysis of variance is an analysis tool used in statistics that splits the
aggregate variability observed in a data set into two parts: systematic factors
and random factors. In this test, the variability of all observations in an experiment is divided into
two sources. One source is variability between groups and the other is variability within each
group. If the variability between groups is sufficiently larger than the within-group variability,
the null hypothesis is rejected. In this study, the ANOVA test is used to compare whether there is
a difference in the means of gross income between the three branches.
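
The split into between-group and within-group variability can be illustrated with a short Python sketch. The function name and the three small samples are hypothetical; the report itself produced its ANOVA output in Excel.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: how far each group mean
    # lies from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical samples, for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f_stat:.4f}, df = ({df_b}, {df_w})")
```

A large F means the group means differ by more than the within-group spread would suggest.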

1.2 OBJECTIVE

The objectives of the study are:

a. To estimate the mean and standard deviation of rating between the three supermarket
branches.
b. To determine whether there is a difference in the mean of gross income between female
and male buyers.
c. To determine whether there is a difference in the variance of gross income between
Branch A and Branch B.
d. To compare whether there is a difference in the means of gross income between three
branches.
2.0 METHODOLOGY
2.1 DESCRIPTION OF DATA

For this study, we used secondary data as our observations. The data were taken
directly from a data set given by Sir Muhamad Hasbullah Bin Mohd Razali. Overall, the data set
consists of 10 variables related to supermarket sales in three cities, including
city, customer type, gender, product line, unit price, quantity, tax, payment type, gross income
and rating. However, only three variables were used to perform the selected statistical tests in
this study, namely rating, gender, and gross income from supermarkets in three cities, with a
sample size of 29. The following table describes each variable involved.

No.   Variables involved   Type of variable   Scale of measurement

1.    Rating               Independent        -

2.    Gross income         Independent        Myanmar Kyat

3.    Gender               Independent        -

2.2 METHOD OF ANALYSIS

Objective                                              Statistical Analysis

To estimate the mean and standard deviation            Descriptive Statistics
of rating between the three supermarket
branches.

To determine whether there is a difference in          Independent sample t-test
the mean of gross income between female
and male buyers.

To determine whether there is a difference in          F-test
the variance of gross income between Branch
A and Branch B.

To compare whether there is a difference in            One-way ANOVA
the means of gross income between three
branches.
3.0 FINDING
3.1 DESCRIPTIVE TEST

Objective: To estimate the mean and standard deviation of rating between the three
supermarket branches.

Descriptive statistics are brief informative coefficients used to summarise a particular data
set, which may be a sample of a population or the complete population.
Measures of central tendency and measures of variability (spread) are the two main categories
of descriptive statistics. The mean, median, and mode are measures of central tendency,
while the standard deviation, variance, minimum and maximum values, kurtosis, and
skewness are measures of variability.

The table below shows the Excel output for the mean, standard deviation and sample variance of
rating for the different supermarket branches.

Table 1 Descriptive Analysis of Rating Between Three Different Supermarket Branches


Figure 1: Mean, Standard Deviation and Variance of Rating Between Three Different Supermarket
Branches

According to the figure above, the bar graphs show the mean, standard deviation and variance
of rating between different supermarkets. The graph shows three different supermarket
branches namely Branch A, Branch B and Branch C.

Based on the graph, Branch C has the highest mean rating, 7.310344828, followed by Branch A
with 6.944827586. Branch B recorded the lowest mean rating, 6.493103448. The mean is the
average level of the data observed. The analysis shows that Branch C has the highest average
rating among the branches.

Standard deviation describes how dispersed the data observed for a variable are around the
mean; it is the square root of the variance. In other words, the lower the standard deviation
(and hence the variance), the more consistent the data observed will be. According to the graph,
Branch C has the lowest variance, followed by Branch A, and Branch B has the highest variance.
The analysis shows that the rating for Branch C is more consistent than for the other branches.
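
For readers who want to reproduce this kind of summary outside Excel, a minimal Python sketch is given below. The ratings are hypothetical placeholders, not the project's data; `stdev` and `variance` from the standard library use the sample (n - 1) formulas.

```python
from statistics import mean, stdev, variance

# Hypothetical ratings per branch, for illustration only.
ratings = {
    "Branch A": [6.0, 7.0, 8.0],
    "Branch B": [5.0, 6.5, 8.0],
    "Branch C": [7.0, 7.5, 8.0],
}

# Sample mean, sample standard deviation and sample variance per branch.
summary = {
    branch: {"mean": mean(v), "sd": stdev(v), "variance": variance(v)}
    for branch, v in ratings.items()
}

for branch, s in summary.items():
    print(f"{branch}: mean={s['mean']:.3f}, sd={s['sd']:.3f}, "
          f"variance={s['variance']:.3f}")

# The branch with the smallest standard deviation has the most consistent ratings.
most_consistent = min(summary, key=lambda b: summary[b]["sd"])
print("Most consistent:", most_consistent)
```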
3.2 T-TEST TWO SAMPLE ASSUMING EQUAL VARIANCES

Objective: To determine whether there is a difference in the mean of gross income between
female and male buyers.

Table 2 t-Test: Two-Sample Assuming Equal Variances

Assumptions: Populations are normal; the small samples are independent of each other

The 5 steps in conducting hypothesis testing for the t-test are:

1. Stating hypothesis

Null hypothesis, Ho: $\mu_F = \mu_M$
Alternative hypothesis, H1: $\mu_F \neq \mu_M$ (two-tailed test)

2. Appropriate test statistic: T stat

T stat value obtained from the t-test table
T stat = -0.82642404

3. Determining critical value
T critical value obtained from the t-test table
α = 0.05, T Critical two-tail = ±2.003240719

4. Decision rule
Reject Ho if T stat < - T critical or T stat > T critical
Or
Reject Ho if p-value < α

5. Decision and conclusion

Since -T critical = -2.003240719 < T stat = -0.82642404 < T critical = 2.003240719, we fail to
reject Ho.

Or
Since p-value (obtained from the t-test table) = 0.412070932 > α = 0.05, we fail to reject Ho.

$\mu_F = \mu_M$

Thus, there is no significant difference in the mean of gross income between female and male
buyers.
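
The decision rule in step 4 can be checked with a short Python sketch. The three numbers are the values reported above from the Excel output; the function name `two_tailed_decision` is an illustrative assumption.

```python
def two_tailed_decision(t_stat, t_crit, p_value, alpha=0.05):
    """Apply both forms of the two-tailed decision rule.

    Returns a pair of booleans: reject Ho by the critical-value rule,
    and reject Ho by the p-value rule.
    """
    # Reject Ho if T stat < -T critical or T stat > T critical.
    reject_by_critical = abs(t_stat) > t_crit
    # Reject Ho if p-value < alpha.
    reject_by_p = p_value < alpha
    return reject_by_critical, reject_by_p


# Values reported in the t-test table above.
T_STAT = -0.82642404
T_CRIT = 2.003240719
P_VALUE = 0.412070932

by_crit, by_p = two_tailed_decision(T_STAT, T_CRIT, P_VALUE)
print("Critical-value rule:", "reject Ho" if by_crit else "fail to reject Ho")
print("p-value rule:", "reject Ho" if by_p else "fail to reject Ho")
```

Both rules should always agree for the same test, which is a useful sanity check on values copied from a table.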
3.3 F-TEST TWO-SAMPLE FOR VARIANCES

Objective: To determine whether there is difference in the variance of gross income between
Branch A and Branch B.

To determine whether there is a difference in the population variance of gross income
between Branch A and Branch B, the F-test was chosen as the statistical analysis.

The assumption of the F-test two-sample for variances, also known as the hypothesis test for
the ratio of two population variances, is that both populations are normally
distributed.

Since the statement made is about population parameters, the F-test was done by taking 58
random samples from the two populations (29 samples from each population) and calculating the
sample statistics.

Excel was used to produce the output for the F-test analysis. The summary and the F-test table
from the output are shown below:

Table 3: F-Test Two-Sample for Variances.


The 5 steps in conducting hypothesis testing for said F-test are:

1. Stating hypothesis
Null hypothesis, Ho: $\sigma_1^2 = \sigma_2^2$
Alternative hypothesis, H1: $\sigma_1^2 < \sigma_2^2$ (left-tailed)

2. Appropriate test statistic – F distribution


Fcal value obtained from the F-test table
Fcal = 0.885766968

3. Determining critical value


Fcrit value obtained from the F-test table
Fcrit = 0.531327202

4. Decision rule
Reject Ho if Fcal < Fcrit

5. Decision and conclusion

Since Fcal = 0.885766968 > Fcrit = 0.531327202, we fail to reject Ho.

$\sigma_1^2 = \sigma_2^2$

Thus, there is no significant difference in the variance of gross income between Branch A and
Branch B.
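
A minimal Python sketch of the same procedure is shown below. The helper names and the two small samples are hypothetical; only `F_CAL` and `F_CRIT` are the values reported in the F-test table above.

```python
from statistics import variance


def variance_ratio(sample_1, sample_2):
    """F statistic: ratio of the two sample variances."""
    return variance(sample_1) / variance(sample_2)


def reject_left_tailed(f_cal, f_crit):
    """Left-tailed rule: reject Ho if Fcal < Fcrit."""
    return f_cal < f_crit


# Hypothetical samples, only to show how the ratio is formed.
example_f = variance_ratio([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print("Example F ratio:", example_f)

# Values reported in the F-test table above.
F_CAL = 0.885766968
F_CRIT = 0.531327202
reject_h0 = reject_left_tailed(F_CAL, F_CRIT)
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```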
3.4 ANOVA: SINGLE FACTOR

Objective: To compare whether there is a difference in the means of gross income between
three branches.

ANOVA, short for analysis of variance, is a test of differences in means and is the statistical
analysis chosen for the objective stated above. Only one factor influencing the observations
is considered in the selected data, namely the supermarket branch (Branch A, Branch B and
Branch C). Hence, ANOVA: Single Factor was chosen from the Excel data analysis options to
indicate whether there is a difference in the means of gross income between the three branches.
The assumptions that should be satisfied when applying ANOVA: Single Factor, also known as
one-way ANOVA, are listed below:

1. Each of the groups must be random samples from normal populations.


2. Population variances must be equal.
3. Observations are independent.

By using Excel to provide the output for ANOVA analysis, the tabulated summary and
the ANOVA table from the output are shown below:

Table 4 Summary of Three Branches & ANOVA Single Factor


The 5 steps in conducting hypothesis testing for said ANOVA test are:

1. Stating hypothesis
Let subscripts 1, 2 and 3 represent Branch A, Branch B and Branch C of the supermarket:
Null hypothesis, Ho: $\mu_1 = \mu_2 = \mu_3$
Alternative hypothesis, H1: At least two means are different

2. Appropriate test statistic – F distribution


The F statistic is obtained by dividing the mean square between groups (treatment) by the mean
square within groups (error).

Fcal = 101.9757229 / 151.3050197 = 0.673974486 (as stated in the above ANOVA table)

3. Determining critical value


Obtaining Fcrit value from the ANOVA table
Fcrit = 3.10515661

4. Decision rule
Reject Ho if Fcal > Fcrit

5. Decision and conclusion


Fail to reject Ho since Fcal = 0.673974486 < Fcrit = 3.10515661.

Thus, there is no significant difference in the means of gross income between three branches of
the supermarkets.
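
The arithmetic in steps 2 to 5 can be re-checked with a few lines of Python. The mean squares and critical value are the figures reported above; the variable names are illustrative assumptions.

```python
# Values reported in the ANOVA table above.
MS_BETWEEN = 101.9757229   # mean square between groups (treatment)
MS_WITHIN = 151.3050197    # mean square within groups (error)
F_CRIT = 3.10515661        # critical value at alpha = 0.05

# F statistic: ratio of between-group to within-group mean square.
f_cal = MS_BETWEEN / MS_WITHIN

# Right-tailed rule: reject Ho if Fcal > Fcrit.
reject_h0 = f_cal > F_CRIT

print(f"Fcal = {f_cal:.6f}")
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```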
4.0 CONCLUSION

All four objectives of the study have been achieved. For descriptive statistics, the mean and
standard deviation of rating for the three supermarket branches, A, B and C, have been
estimated. Next, the t-test showed no significant difference in the mean gross income between
female and male buyers. The F-test showed no significant difference in the variance of gross
income between Branch A and Branch B. Last but not least, the ANOVA test showed no significant
difference between the mean gross income of the three supermarket branches because we failed to
reject Ho.
5.0 REFLECTIVE ESSAY

The project experience provided us with valuable insights, including some entirely new and
unprecedented ones. To compile the report, our first task was to identify suitable datasets for
extraction, which we obtained from several available databases. Additionally, we gained
hands-on experience in conducting analysis using Excel, specifically utilizing its data analysis
option to generate outputs. Through the process of writing this report, we developed a deeper
understanding of the practical applications of four distinct statistical tests employed in this study:
descriptive statistics, T-Test, F-Test, and ANOVA. It is evident that different statistical tests are
utilized to analyze various functions, involving different variables and parameters, ultimately
allowing us to effectively determine whether our objectives were fully accomplished.
6.0 REFERENCES

Bansal, S. (n.d.). How to Get Descriptive Statistics in Excel. Retrieved from TrumpEXCEL:
https://trumpexcel.com/descriptive-statistics-excel/

Kenton, W. (2023, June 12). What is the definition of ANOVA. Retrieved from Investopedia:
https://www.investopedia.com/terms/a/anova.asp

Rohana, A., Nor Azriani, M. N., & Balkiah, M. (2020). Statistics for Science & Engineering (3rd
ed.). Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA
Cawangan Perlis.

Zach. (2021, November 30). How to Interpret ANOVA Results in Excel. Retrieved from Statology:
https://www.statology.org/interpret-anova-results-in-excel/
