Statistics No 1

Assignment No. 1
STATISTICS

Topic:
Statistical Tests & Statistical Software

Submitted by:

M. Saad Saeed (BB-23-20)
Mahad Hassan (BB-23-66)
Rao Ahsan Saleem (BB-22-72)
Daniyal Ahmad (BB-23-21)
Sharjeel Altaf (BB-23-24)

BBA 2nd Semester (Morning)

Date:
09-05-2024

Submitted to:
Mam Zainab Rehman

INSTITUTE OF MANAGEMENT SCIENCES BZU, MULTAN


1. Statistical Tests

Statistical tests are used in hypothesis testing. They can be used to:

▪ Determine whether a predictor variable has a statistically significant relationship with an outcome
variable.
▪ Estimate the difference between two or more groups.

Statistical tests assume a null hypothesis of no relationship or no difference between groups. They
then determine whether the observed data fall outside the range of values predicted by the null
hypothesis.

The five tests discussed below are:

1. Z-Test:

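➢ Formula:
z = (x̄ – μ0) / (σ / sqrt(n))
Where:
▪ x̄ is the sample mean and μ0 is the population mean under the null hypothesis.
▪ σ is the known population standard deviation and n is the sample size.
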
➢ Application:
▪ Z-Test is used in hypothesis testing to evaluate whether a finding or association is
statistically significant or not.
▪ In particular, it tests whether two means are the same (the null hypothesis).
▪ A Z-Test should only be used if the population standard deviation is known and the sample
size is 30 data points or more.
➢ Table:

➢ Advantages of the Z-Test:
▪ It is suitable for large sample sizes.
▪ It provides a straightforward and easy-to-understand method for hypothesis testing.
▪ The Z-Test is widely recognized and utilized in various fields, including research, quality
control, and business analytics.
➢ Disadvantages of Z-Test:
▪ Limited to large sample sizes: the Z-Test is not appropriate for small samples, because its
assumptions and calculations are based on large-sample theory.
▪ The population standard deviation is usually unknown in practice, which limits when the
Z-Test can be applied.
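As a minimal sketch of the one-sample Z-Test described above, the statistic z = (sample mean – hypothesized mean) / (σ / sqrt(n)) can be computed with Python's standard library. The function name `one_sample_z_test` and the data are illustrative assumptions, not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided one-sample z-test; sigma is the known population std. dev."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative (made-up) data: 36 observations with a sample mean of 52.
sample = [50.0, 54.0] * 18
z, p_value = one_sample_z_test(sample, mu0=50.0, sigma=6.0)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

With these made-up numbers z is 2.0 and the two-sided p-value is about 0.0455, so the null hypothesis would be rejected at the 5% level but not at the 1% level.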

2. Welch’s T-Test, or Unequal Variances T-Test:

➢ Formula:
Formula for Welch’s t-test:
t = (mean1 – mean2) / sqrt(var1/n1 + var2/n2)
Where:
▪ mean1 and mean2 are the sample means of the two groups.
▪ var1 and var2 are the sample variances of the two groups.
▪ n1 and n2 are the sample sizes of the two groups.

➢ Application:
▪ Welch’s t-test, also known as the unequal variances t-test, is used to test whether the
means of two populations are equal.
▪ It is generally applied when the variances of the two populations differ, and also when
their sample sizes are unequal.
➢ Table:

➢ Advantages:
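▪ Welch’s approach can be generalized to more than two samples (Welch’s ANOVA), which is
more robust to unequal variances than the classical one-way analysis of variance (ANOVA).
▪ Because it does not assume equal variances, it gives more accurate results when the
variances differ and helps prevent false conclusions.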
▪ The power of Welch’s t-test comes close to that of Student’s t-test, even when the
population variances are equal and sample sizes are balanced.
➢ Disadvantages:
Welch’s t-test has some limitations that one needs to be aware of.
▪ First, the test assumes that both populations are approximately normally distributed.
▪ Secondly, the test may not be suitable for small sample sizes, as it may not have enough
power to detect a significant difference.
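The sketch below computes the Welch statistic t = (mean1 – mean2) / sqrt(var1/n1 + var2/n2) together with the Welch–Satterthwaite approximation for the degrees of freedom. The function name `welch_t` and the data are illustrative assumptions; the p-value is left out because the Python standard library has no t-distribution, so t would be compared against a t-table at the computed degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = variance(sample1), variance(sample2)  # sample variances
    se2 = v1 / n1 + v2 / n2  # squared standard error of the mean difference
    t = (mean(sample1) - mean(sample2)) / sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Illustrative (made-up) groups with unequal sizes and unequal variances.
group_a = [10.0, 12.0, 14.0, 16.0, 18.0]
group_b = [9.0, 10.0, 11.0]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

Note that the degrees of freedom are generally not an integer, unlike in Student's t-test.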

3. Median Test:

➢ Formula:
Assume that we have a random sample of size n1 from population 1 and a random sample
of size n2 from population 2. The median test can be summarized as follows.
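H0: m1 = m2 versus one of the following alternatives, where m1 and m2 are the medians of
population 1 and population 2:
▪ Ha: m1 > m2 (upper-tailed test)
▪ Ha: m1 < m2 (lower-tailed test)
▪ Ha: m1 ≠ m2 (two-tailed test)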
➢ Application:
▪ The median test is a non-parametric test that is used to test whether two (or more)
independent groups differ in central tendency, specifically whether the groups have
been drawn from populations with the same median.
▪ The null hypothesis is that the groups are drawn from populations with the same
median.
➢ Table:

➢ Advantages:
▪ One advantage of using median tests is that they are robust to outliers.
▪ Unlike mean-based tests, which can be heavily influenced by extreme values in the data,
median-based tests are less affected by these outliers. This makes them a good choice
for data sets that are skewed or have extreme values.
➢ Disadvantages:
▪ A potential drawback is that median tests may have lower statistical power than mean-based
tests. This means that they may be less likely to detect a significant difference between
groups in some cases, particularly when the sample size is small.
▪ Another disadvantage of median tests is that they can be more difficult to interpret than
mean-based tests. This is because the median represents the middle value in a data set,
rather than the average value.
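One common form of the median test (Mood's median test) can be sketched as follows: pool the two samples, find the grand median, count how many values in each group lie above it, and apply a chi-square test to the resulting 2×2 table. The function name and data are illustrative assumptions, no continuity correction is applied, and the sketch assumes at least one value lies strictly above the grand median.

```python
from math import erfc, sqrt
from statistics import median


def moods_median_test(sample1, sample2):
    """Two-sample median test: chi-square statistic and p-value (1 df)."""
    groups = (list(sample1), list(sample2))
    grand_median = median(groups[0] + groups[1])
    # Counts above the grand median, and at or below it, for each group.
    above = [sum(x > grand_median for x in g) for g in groups]
    not_above = [len(g) - a for g, a in zip(groups, above)]
    n = len(groups[0]) + len(groups[1])
    row_totals = [sum(above), sum(not_above)]
    col_totals = [len(groups[0]), len(groups[1])]
    chi2 = 0.0
    for i, row in enumerate((above, not_above)):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Illustrative (made-up) data in which the two groups do not overlap.
chi2, p_value = moods_median_test([1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12])
print(f"chi2 = {chi2:.2f}, p = {p_value:.5f}")
```

Because only the position of each value relative to the grand median is used, replacing 12 with 1200 in the second group would leave the result unchanged, which illustrates the robustness to outliers.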
4. Fisher’s Exact Test:

➢ Formula:
p = ((a + b)! (c + d)! (a + c)! (b + d)!) / (a! b! c! d! N!)
Where:
▪ a, b, c, d are the cell counts of the 2×2 contingency table.
▪ N = a + b + c + d is the total frequency.

➢ Application:
▪ Fisher’s Exact Test is generally preferred when dealing with small datasets or when the
expected frequency in any cell of the 2×2 table is less than 5.
▪ In these cases it helps ensure the robustness and reliability of the inferential
conclusions drawn from the analysis.
➢ Table:
▪ Fisher’s exact test makes use of contingency tables to calculate the probability of
observing the data as it is, considering all other possible arrangements of the observed
data while maintaining the row and column totals fixed.
➢ Advantages:
▪ Fisher’s exact test is more accurate than the chi-square test or G–test of independence
when the expected numbers are small.
▪ A common recommendation is to use Fisher’s exact test when the total sample size is less
than 1000, and the chi-square or G–test for larger sample sizes.
➢ Disadvantages:
▪ As its name implies, Fisher’s exact test gives an exact p-value no matter what sample size
you use.
▪ However, some statisticians argue that it gives the exact answer to the wrong question, so
in practice its result is also only an approximation to the answer you really want.
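The procedure described under "Table" can be sketched directly: keep the row and column totals fixed, compute the hypergeometric probability of every possible table, and add up the probabilities of all tables that are no more likely than the observed one. The function name and the counts are illustrative assumptions; summing the "no more likely" tables is one common definition of the two-sided p-value.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Return (probability of the observed table, two-sided p-value)."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    p_two_sided = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return p_observed, p_two_sided


# Illustrative (made-up) 2x2 table with rows (3, 1) and (1, 3).
p_observed, p_two_sided = fisher_exact_2x2(3, 1, 1, 3)
print(f"P(observed table) = {p_observed:.4f}, two-sided p = {p_two_sided:.4f}")
```

For this table the observed arrangement has probability 16/70 and the two-sided p-value is 34/70, so there is no evidence of an association in such a small sample.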

5. F-Test:

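➢ Formula:
F = var1 / var2
Where:
▪ var1 and var2 are the sample variances of the two samples.
▪ The degrees of freedom are n1 – 1 and n2 – 1, where n1 and n2 are the sample sizes.
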
➢ Application:
▪ The F-test is used to test the equality of two population variances.
▪ If a researcher wants to test whether two independent samples have been drawn from
normal populations with the same variability, the F-test is generally employed.
➢ Table:

➢ Advantages:
▪ F-tests are surprisingly flexible because you can include different variances in the ratio
to test a wide variety of properties.
▪ F-tests can compare the fits of different models, test the overall significance in
regression models, test specific terms in linear models, and determine whether a set of
means are all equal.
➢ Disadvantages:
▪ While the F-test is a powerful statistical tool, it does have some limitations.
▪ One of the main limitations is that it assumes that the data is normally distributed. If
the data is not normally distributed, the results of the F-test may not be accurate.
▪ Another limitation is that the F-test is sensitive to outliers.
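For the two-sample variance comparison described under "Application", the test statistic is simply the ratio of the two sample variances. The sketch below is a minimal illustration with an assumed function name and made-up data. It returns the statistic and its degrees of freedom only, since the Python standard library has no F-distribution; the value would be compared against an F-table.

```python
from statistics import variance


def f_statistic(sample1, sample2):
    """Return F = var1 / var2 and the degrees of freedom (n1 - 1, n2 - 1)."""
    f_value = variance(sample1) / variance(sample2)  # ratio of sample variances
    return f_value, len(sample1) - 1, len(sample2) - 1


# Illustrative (made-up) samples: the first is far more spread out.
f_value, df1, df2 = f_statistic([10.0, 12.0, 14.0, 16.0, 18.0], [9.0, 10.0, 11.0])
print(f"F = {f_value:.2f} with ({df1}, {df2}) degrees of freedom")
```

A ratio close to 1 is consistent with equal population variances, while a ratio far from 1 suggests that the variances differ.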

REFERENCES
https://en.wikipedia.org/wiki/List_of_statistical_tests
2. Statistical Software

1. Minitab:

➢ Application:
▪ Minitab is a software package for data analysis. It is widely used in a variety of
industries, including healthcare, manufacturing, and education.
▪ Minitab provides users with tools to perform statistical analysis, including hypothesis
testing, regression analysis, and ANOVA.
➢ Pros & Cons
▪ Minitab Statistical Software can look at current and past data to discover trends, find
and predict patterns, uncover hidden relationships between variables, and create
visualizations.
▪ Key statistical tests available in Minitab include t tests, one and two proportions,
normality test, chi-square, and equivalence tests.

2. MATLAB:

➢ Application:
▪ MATLAB is a programming platform designed specifically for engineers and scientists to
analyze and design systems and products that transform our world.
▪ The heart of MATLAB is the MATLAB language, a matrix-based language allowing the
most natural expression of computational mathematics.
➢ Pros & Cons:
▪ Powerful numerical computation.
▪ Extensive functionality.
▪ Interactive development environment.
▪ Data visualisation.
▪ Simulink integration.
▪ Algorithm development and prototyping.
▪ Community support and resources.

3. XLSTAT:

➢ Application:
▪ XLSTAT works as an Excel add-on and allows users to analyze, customize, and share data
in an easily digestible format.
▪ Its counterpart, NVivo, specifically helps researchers organize qualitative data in a
more structured way that allows for deeper insights.
➢ Pros & Cons:
▪ It includes more than 100 key statistical tools to help you gain in-depth insights into
your data. You will benefit from data preparation and visualization tools, parametric and
nonparametric tests, and modelling methods, such as ANOVA, regression, and
generalized linear or nonlinear models.
▪ The main downside of XLSTAT is its lack of flexibility: it offers few data manipulation
functions.
▪ Statistical output can be misused.
▪ Errors become more likely when the statistical methods are not applied by experts.

4. SAS Viya:

➢ Application:
▪ SAS Viya offers rapid AI and analytics.
▪ Users benefit from faster data integration, effective model development, and reduced
cloud costs.
➢ Pros & Cons:
▪ It makes data handling easier.
▪ It is robust and reliable in a variety of sectors.
▪ It has a steep learning curve.
▪ It is not very flexible and provides limited customization.
5. JMP:

➢ Application:
▪ JMP empowers users to explore and analyze data visually, solve critical problems and
then share those insights to make stronger data-driven decisions.
▪ Unlike spreadsheets or other statistical software, JMP is designed for the way you solve
problems across the entire analytic workflow.
➢ Pros & Cons:
▪ JMP brings your data analysis to a whole new level, letting you tackle routine and
difficult statistical problems more easily and communicate your findings more
effectively.
▪ Your data comes in many forms. Fortunately for you, JMP is hungry for data.
▪ Variable value designation is a big problem in JMP: the software can fail to recognize the
data type of numeric values.
▪ Error and bug fixing requires exhaustive customer support and is very time-consuming.
REFERENCES
https://www.google.com/search?q=Statistical+software+for+data+analysis&oq=statistica&gs_lcrp=EgZja
HJvbWUqDggBEEUYJxg7GIAEGIoFMgYIABBFGDkyDggBEEUYJxg7GIAEGIoFMgYIAhAjGCcyEwgDEAAYkQIYi
wMYsQMYgAQYigUyEAgEEAAYkQIYiwMYgAQYigUyBggFEEUYPDIGCAYQRRg8MgYIBxBFGDwyEAgIEAAYkQI
YiwMYgAQYigUyDAgJEAAYQxiABBiKBTIUCAoQLhgKGEMYxwEY0QMYgAQYigUyDAgLEAAYQxiABBiKBTIHCA
wQABiABDIHCA0QABiABDIHCA4QABiABNIBCDY5MzlqMWo3qAIUsAIB&client=ms-
unknown&sourceid=chrome-mobile&ie=UTF-8
