Assignment STA
STA408
GROUP PROJECT:
INFERENTIAL STATISTICS
PREPARED BY:
PREPARED FOR:
SIR MUHAMMAD HASBULLAH BIN MOHD RAZALI
DATE OF SUBMISSION:
7 JULY 2023
TABLE OF CONTENTS
ACKNOWLEDGEMENT
1.0 INTRODUCTION
1.1 BACKGROUND OF STUDY
1.2 OBJECTIVE
2.0 METHODOLOGY
2.1 DESCRIPTION OF DATA
2.2 METHOD OF ANALYSIS
3.0 FINDING
3.1 DESCRIPTIVE TEST
3.2 T-TEST TWO SAMPLE ASSUMING EQUAL VARIANCES
3.3 F-TEST TWO-SAMPLE FOR VARIANCES
3.4 ANOVA: SINGLE FACTOR
4.0 CONCLUSION
5.0 REFLECTIVE ESSAY
6.0 REFERENCES
ACKNOWLEDGEMENT
First and foremost, praise and thanks to Allah, the Almighty, for His showers of blessings
throughout the journey to complete the STA408 project assignment successfully. We, the group
members for this project, Aishah Binti Mohd Hazril Munir @ Umar, Wan Norizwani Binti Wan
Mohammad ‘Akthaillah, Nor Aida Ewani Binti Amran and Eliyana Natasha Binti Azman, would
like to thank our Statistics 408 lecturer, Sir Muhamad Hasbullah Bin Mohd Razali, for giving
us the opportunity to do this project assignment this semester.
We want to express our deepest gratitude to Sir Muhamad Hasbullah Bin Mohd Razali for
providing the guidelines for completing the assignment through numerous consultations.
We are grateful that, with his kind support and guidance, we managed to complete this
assignment within the time given.
Most importantly, I want to express my sincere gratitude to each member of the group. Your
dedication, professionalism, and commitment to excellence were exemplary. Despite the
inevitable obstacles and time constraints, we remained united and supportive of one another,
which truly made a difference in the project's outcome. We also sincerely thank everyone who
helped, directly or indirectly, with our project assignment.
1.0 INTRODUCTION
1.1 BACKGROUND OF STUDY
The main objective of statistical inference is to draw conclusions about a population from the
data obtained from a sample of that population. When the value of a population parameter is
unknown, it is estimated using either a point estimate or an interval estimate. If a
hypothesis about the parameter's value is made, we have to test whether that hypothesis should
be rejected or not. A hypothesis is a statement about the value of the population parameter.
The method that tells us whether or not to reject the hypothesis is called hypothesis testing,
also known as a test of significance.
There are five important steps involved in hypothesis testing. The first step is stating the
null and alternative hypotheses from the research question. The null hypothesis (Ho) is a
statement that the population parameter has a specific value. The alternative hypothesis (H1)
is the opposite of the null hypothesis; it is accepted when the null hypothesis is rejected.
The next step is computing the test statistic. The test statistic is a single number
calculated from the sample data that serves as the basis for deciding whether to reject the
null hypothesis or not.
In addition, the t-test is an inferential statistical test used to determine whether there is
a significant difference between the means of two groups. Three key quantities are needed to
calculate the t-test value: the difference between the mean values of the two data sets, the
standard deviation of each group and the number of data values in each group.
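As a minimal sketch of how these three quantities combine, the function below computes the two-sample t statistic under the equal-variance assumption used later in this report. The function name and the summary values are hypothetical and are not taken from the supermarket data.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Return the equal-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two sample means.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Hypothetical summary values for illustration only.
t_stat, df = pooled_t_statistic(mean1=12.0, sd1=2.0, n1=10, mean2=10.0, sd2=2.0, n2=10)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value is then compared with a critical value from the t distribution with the returned degrees of freedom.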
On the other hand, the F-test is the statistical analysis used to determine whether the
variances of two populations are equal, based on a sample from each. In this study, the F-test
is used to determine whether there is a difference in the variance of gross income between
Branch A and Branch B. The two sample variances of gross income are compared through their
ratio, the F statistic.
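The ratio described above can be sketched as follows. This is an illustrative example only: the function name is hypothetical and the two lists are made-up values, not the gross income figures analysed in this study.

```python
import statistics


def f_ratio(sample1, sample2):
    """Return the ratio of two sample variances and both degrees of freedom."""
    var1 = statistics.variance(sample1)  # sample variance (n - 1 in the denominator)
    var2 = statistics.variance(sample2)
    return var1 / var2, len(sample1) - 1, len(sample2) - 1


# Hypothetical values for illustration only.
branch_a = [2.0, 4.0, 6.0, 8.0]
branch_b = [1.0, 5.0, 9.0, 13.0]
f_stat, df1, df2 = f_ratio(branch_a, branch_b)
print(f"F = {f_stat:.3f}, df = ({df1}, {df2})")
```

An F value close to 1 suggests similar variances, while a value far from 1 in the direction of the alternative hypothesis is evidence against equal variances.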
1.2 OBJECTIVE
a. To estimate the mean and standard deviation of rating between the three supermarket
branches.
b. To determine whether there is a difference in the mean of gross income between female
and male buyers.
c. To determine whether there is a difference in the variance of gross income between
Branch A and Branch B.
d. To compare whether there is a difference in the means of gross income between three
branches.
2.0 METHODOLOGY
2.1 DESCRIPTION OF DATA
For this study, we used secondary data as our observations. The information was taken
directly from a data set given by Sir Muhamad Hasbullah Bin Mohd Razali. Overall, the data
set consists of 10 variables that influence supermarket sales in three cities, including
city, customer type, gender, product line, unit price, quantity, tax, payment type and
rating. However, only three variables were used to perform the selected statistical tests in
this study: rating, gender, and gross income from supermarkets in three cities, with a
sample size of 29. The following table describes each variable involved.
1. Rating Independent -
3. Gender Independent -
3.0 FINDING
3.1 DESCRIPTIVE TEST
Objective: To estimate the mean and standard deviation of rating between the three
supermarket branches.
Descriptive statistics are brief informative coefficients that summarize a particular data
set, which may be a sample of a population or a representation of the complete population.
Measures of central tendency and measures of variability (spread) are the two main categories
of descriptive statistics. The mean, median, and mode are measures of central tendency,
while the standard deviation, variance, and minimum and maximum values are measures of
variability. Kurtosis and skewness describe the shape of the distribution.
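The measures named above can be computed with a short sketch like the one below, which uses Python's standard `statistics` module as a stand-in for the Excel output used in this report. The `describe` helper and the rating values are hypothetical.

```python
import statistics


def describe(values):
    """Summarize a sample with measures of central tendency and variability."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std_dev": statistics.stdev(values),      # sample standard deviation
        "variance": statistics.variance(values),  # sample variance
        "minimum": min(values),
        "maximum": max(values),
    }


# Hypothetical ratings for illustration only.
ratings = [6.0, 7.0, 7.0, 8.0, 9.0]
summary = describe(ratings)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```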
The table below shows the Excel output for the mean, standard deviation and sample variance
of rating for each supermarket branch.
According to the figure above, the bar graphs show the mean, standard deviation and variance
of rating for the three supermarket branches, namely Branch A, Branch B and Branch C.
Based on the graph, Branch C has the highest mean rating among the supermarket branches,
at 7.310344828. Branch A has the second highest mean rating, at 6.944827586, and Branch B
recorded the lowest mean rating, at 6.493103448. The mean is the average level of the data
observed. The analysis therefore shows that Branch C has the highest rating among the
branches.
Standard deviation describes how dispersed the observed data are around their mean, and the
variance is its square. In other words, the lower the variance, the more consistent the
observed data. According to the graph, Branch C has the lowest variance, followed by
Branch A, while Branch B has the highest variance. The analysis therefore shows that the
ratings for Branch C are more consistent than those of the other branches.
3.2 T-TEST TWO SAMPLE ASSUMING EQUAL VARIANCES
Objective: To determine whether there is a difference in the mean of gross income between
female and male buyers.
Assumptions: The populations are normal, the population variances are equal, and the small
samples are independent of each other.
1. Stating hypothesis
Ho: $\mu_F = \mu_M$
H1: $\mu_F \neq \mu_M$ (two-tailed test)
4. Decision rule
Reject Ho if T stat < - T critical or T stat > T critical
Or
Reject Ho if p-value < α
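The two equivalent forms of the decision rule above can be sketched as small functions. The function names, the default significance level of 0.05 and the numbers passed in are assumptions made for illustration; the actual T stat, T critical and p-value come from the Excel output.

```python
def reject_by_critical_value(t_stat, t_critical):
    """Two-tailed rule: reject Ho if T stat < -T critical or T stat > T critical."""
    return abs(t_stat) > t_critical


def reject_by_p_value(p_value, alpha=0.05):
    """Reject Ho if the p-value is smaller than the significance level (assumed 0.05)."""
    return p_value < alpha


# Hypothetical values for illustration only.
print(reject_by_critical_value(t_stat=1.2, t_critical=2.0))   # inside the critical bounds
print(reject_by_critical_value(t_stat=-2.5, t_critical=2.0))  # beyond the lower bound
print(reject_by_p_value(p_value=0.23))
```

For a given sample and significance level, both forms lead to the same decision.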
Conclusion: Ho: $\mu_F = \mu_M$ is not rejected.
Thus, there is no significant difference in the mean of gross income between female and male
buyers.
3.3 F-TEST TWO-SAMPLE FOR VARIANCES
Objective: To determine whether there is a difference in the variance of gross income between
Branch A and Branch B.
To determine whether there is a difference in the variance of gross income between the
populations of Branch A and Branch B, the F-Test was chosen as the statistical analysis.
The assumption of the F-Test two-sample for variances, also known as the hypothesis test for
the ratio of two population variances, is that the populations are normally distributed.
Since the statement made is about population parameters, the F-Test was done by taking 58
random samples from the two populations (29 samples for each population) and calculating the
sample statistics.
Excel was used to produce the output for the F-Test analysis. The summary and the F-Test
table from the output are shown below:
1. Stating hypothesis
Null hypothesis, Ho: $\sigma_1^2 = \sigma_2^2$
Alternative hypothesis, H1: $\sigma_1^2 < \sigma_2^2$ (left-tailed)
4. Decision rule
Reject Ho if Fcal < Fcrit
Thus, there is no significant difference in the variance of gross income between Branch A and
Branch B.
3.4 ANOVA: SINGLE FACTOR
Objective: To compare whether there is a difference in the means of gross income between
three branches.
ANOVA, short for analysis of variance, is a test of differences in means and is the
statistical analysis chosen for the objective stated above. Only one factor influencing the
observations has been considered for the selected data, namely the branch of the
supermarket: Branch A, Branch B or Branch C. Hence, ANOVA: Single Factor was chosen from the
data analysis options to indicate whether there is a difference in the means of gross income
among the three branches. Several assumptions for hypothesis testing that should be satisfied
when applying ANOVA: Single Factor, also known as one-way ANOVA, are listed below:
Excel was used to produce the output for the ANOVA analysis. The tabulated summary and the
ANOVA table from the output are shown below:
1. Stating hypothesis
Let subscripts 1, 2 and 3 represent Branch A, Branch B and Branch C of the supermarket:
Null hypothesis, Ho: $\mu_1 = \mu_2 = \mu_3$
Alternative hypothesis, H1: At least two means are different
4. Decision rule
Reject Ho if Fcal > Fcrit
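As a rough sketch of where Fcal comes from, the function below computes the one-way ANOVA F statistic from the between-group and within-group sums of squares. It is an illustration only, not the Excel procedure used in this report: the function name and the three groups are hypothetical, and the critical value is the F table value for the stated degrees of freedom at an assumed significance level of 0.05.

```python
import statistics


def one_way_anova_f(groups):
    """Return Fcal and the between-group and within-group degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_cal = (ss_between / df_between) / (ss_within / df_within)
    return f_cal, df_between, df_within


# Hypothetical groups for illustration only.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_cal, df_between, df_within = one_way_anova_f(groups)
f_crit = 5.14  # F table value for alpha = 0.05 (assumed) with df = (2, 6)
reject_ho = f_cal > f_crit
print(f"Fcal = {f_cal:.3f}, df = ({df_between}, {df_within}), reject Ho: {reject_ho}")
```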
Thus, there is no significant difference in the means of gross income among the three
branches of the supermarket.
4.0 CONCLUSION
All four objectives of the study have been achieved. For the descriptive statistics, the mean
and standard deviation of rating for the three supermarket branches, A, B and C, have been
estimated. Next, the t-test showed no significant difference in the mean gross income of male
and female buyers. The F-test likewise showed no significant difference in the variance of
gross income between Branch A and Branch B. Last but not least, the ANOVA test showed no
significant difference between the mean gross income of the three supermarket branches,
because we failed to reject Ho.
5.0 REFLECTIVE ESSAY
The project experience provided us with valuable insights, including some entirely new and
unprecedented ones. To compile the report, our first task was to identify suitable datasets for
extraction, which we obtained from several available databases. Additionally, we gained
hands-on experience in conducting analysis using Excel, specifically utilizing its data analysis
option to generate outputs. Through the process of writing this report, we developed a deeper
understanding of the practical applications of four distinct statistical tests employed in this study:
descriptive statistics, T-Test, F-Test, and ANOVA. It is evident that different statistical
tests serve different purposes and involve different variables and parameters, and together
they allowed us to determine whether our objectives were fully accomplished.
6.0 REFERENCES
Bansal, S. (n.d.). How to Get Descriptive Statistics in Excel. Retrieved from TrumpExcel:
https://trumpexcel.com/descriptive-statistics-excel/
Kenton, W. (2023, June 12). What is the definition of ANOVA. Retrieved from Investopedia:
https://www.investopedia.com/terms/a/anova.asp
Rohana, A., Nor Azriani, M. N., & Balkiah, M. (2020). Statistics for Science & Engineering (3rd
ed.). Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA
Cawangan Perlis.
Zach. (2021, November 30). How to Interpret ANOVA Results in Excel. Retrieved from Statology:
https://www.statology.org/interpret-anova-results-in-excel/