
Faculty Name with Code: MANEESHA. P A
Subject Code: 22BST4AA
Subject Title: Inferential Statistics

Part – A (5 Mark Questions)

All the questions should be equally divided into all cognitive levels (UG 1 to 4 and PG 5 & 6).
Minimum 5 questions from each unit and each cognitive level.

S. No | Unit Number (1/2/3/4/5) | Section | Mark | Question | Question difficulty (Low/Medium/High)
1 | 1 | A | 5 | Explain power of a test and critical region | Medium
2 | 1 | A | 5 | Define random sample, parameter and statistic | High
3 | 1 | A | 5 | What is Hypothesis? What are the types of hypothesis | Low
4 | 1 | A | 5 | Explain null and alternative hypothesis | High
5 | 1 | A | 5 | What is Neyman Pearson Lemma explain with derivation | Medium
6 | 2 | A | 5 | Explain T test for one sample mean and two sample mean | Low
7 | 2 | A | 5 | Give explanation about confidence interval for population arithmetic mean | High
8 | 2 | A | 5 | What is parametric testing and its types | High
9 | 2 | A | 5 | Discuss F test for equality of two variances. | Low
10 | 2 | A | 5 | Explain confidence interval for population variance | High
11 | 3 | A | 5 | What is Sign test explain with its types | Medium
12 | 3 | A | 5 | Explain briefly Median test give proper example | Low
13 | 3 | A | 5 | Discuss Signed rank test in detail with example | High
14 | 3 | A | 5 | Explain Wilcoxon signed Rank Test | High
15 | 3 | A | 5 | Briefly explain Kruskal Wallis test | Low
16 | 4 | A | 5 | What are the properties of method of maximum likelihood estimator | High
17 | 4 | A | 5 | What are assumptions of ordinary least square method | Medium
18 | 4 | A | 5 | What are the properties of the OLS method. | Low
19 | 4 | A | 5 | What are the properties of good estimator | Low
20 | 4 | A | 5 | Discuss briefly what is estimation? | High
21 | 5 | A | 5 | What are the important terms related to Bayesian statis… | Medium
22 | 5 | A | 5 | Give an Introduction to Bayes inference | Low
23 | 5 | A | 5 | Discuss Bayesian Interval estimation | Medium
24 | 5 | A | 5 | Explain Prior and posterior distributions | Low
25 | 5 | A | 5 | Explain properly Bayesian Procedure | High
Part – B (12 Mark Questions)

All the questions should be equally divided into all cognitive levels (UG 1 to 4 and PG 5 & 6).
Minimum 5 questions from each unit and each cognitive level.

S. No | Unit Number (1/2/3/4/5) | Section | Mark | Question | Question difficulty level (Low/Medium/High)
1 | 1 | B | 12 | Explain in detail Level of significance and probabilities of Type I and Type II errors | Low
2 | 1 | B | 12 | Discuss in detail Hypothesis and types of hypothesis | High
3 | 1 | B | 12 | What are the Steps involved in Hypothesis testing | Medium
4 | 1 | B | 12 | Explain unbiased test and unbiased critical region | Low
5 | 1 | B | 12 | Explain application of Neyman Pearson Lemma | High
6 | 2 | B | 12 | Explain in detail the T and F testing | High
7 | 2 | B | 12 | Discuss in detail about test for the equality of two or more than two normal distributions | Low
8 | 2 | B | 12 | Explain confidence interval for population arithmetic mean, & confidence interval for population variance. | High
9 | 2 | B | 12 | Explain single mean, two mean, single proportion and two proportions | Medium
10 | 2 | B | 12 | Describe Parametric test and its properties | Low
11 | 3 | B | 12 | Explain in detail about Goodness of fit test and independence of attributes using test | High
12 | 3 | B | 12 | Discuss in detail about testing of equality of more than two variances using test | High
13 | 3 | B | 12 | Explain non parametric test, describe its properties and applications | Low
14 | 3 | B | 12 | Explain a) Wilcoxon signed rank test b) Mann-Whitney U test | High
15 | 3 | B | 12 | When to use Friedman test explain with example | High
16 | 4 | B | 12 | Explain Estimation, Types of estimation, properties of good estimator | Low
17 | 4 | B | 12 | Give Introduction and assumptions of ordinary least square method | High
18 | 4 | B | 12 | Explain unbiasedness, consistency, efficiency and sufficiency | Medium
19 | 4 | B | 12 | Discuss estimation of parameters in multiple linear regression coefficients | Low
20 | 4 | B | 12 | List the properties of the OLS method. | High
21 | 5 | B | 12 | Explain Prior and posterior distributions | High
22 | 5 | B | 12 | Discuss point estimation of Bayesian statistic | Low
23 | 5 | B | 12 | Explain important terms related to Bayesian inference. | High
24 | 5 | B | 12 | What are the Bayesian testing procedures | High
25 | 5 | B | 12 | Explain Bayesian Theorem in detail | Medium

Part – C (15 Mark Questions)

(COMPULSORY QUESTION)
Minimum 1 question from each unit and each cognitive level.

S. No | Unit Number (1/2/3/4/5) | Section | Mark | Question | Question difficulty level
1 | 1 | C | 15 | Define Neyman Pearson Lemma, application and… | High
2 | 2 | C | 15 | Explain small sample test in detail with its types | High
3 | 3 | C | 15 | Discuss the steps of the following testing strategies: a) Wilcoxon signed Rank Test, b) Mann-Whitney U test, c) Sign test, d) Signed rank test | High
4 | 4 | C | 15 | Discuss estimation of mean and variance of norma… | High
5 | 5 | C | 15 | Explain Bayesian Statistical Inference and Estimat… | High

Part – A (continued)

S. No | Cognitive Level | Course Outcome
1 | Understanding | CO1
2 | Understanding | CO1
3 | Remembering | CO1
4 | Understanding | CO1
5 | Understanding | CO1
6 | Remembering | CO2
7 | Understanding | CO2
8 | Understanding | CO2
9 | Remembering | CO2
10 | Understanding | CO2
11 | Understanding | CO3
12 | Understanding | CO3
13 | Understanding | CO3
14 | Understanding | CO3
15 | Remembering | CO3
16 | Understanding | CO4
17 | Understanding | CO4
18 | Remembering | CO4
19 | Remembering | CO4
20 | Understanding | CO4
21 | Understanding | CO5
22 | Remembering | CO5
23 | Understanding | CO5
24 | Remembering | CO5
25 | Understanding | CO5

Part – B (continued)

S. No | Cognitive Level | Course Outcome
1 | Remembering | CO1
2 | Understanding | CO1
3 | Understanding | CO1
4 | Remembering | CO1
5 | Understanding | CO1
6 | Understanding | CO2
7 | Remembering | CO2
8 | Understanding | CO2
9 | Understanding | CO2
10 | Understanding | CO2
11 | Understanding | CO3
12 | Understanding | CO3
13 | Remembering | CO3
14 | Understanding | CO3
15 | Understanding | CO3
16 | Remembering | CO4
17 | Understanding | CO4
18 | Understanding | CO4
19 | Understanding | CO4
20 | Understanding | CO4
21 | Understanding | CO5
22 | Remembering | CO5
23 | Understanding | CO5
24 | Understanding | CO5
25 | Understanding | CO5

Part – C (continued)

S. No | Cognitive Level | Course Outcome
1 | Understanding |
2 | Understanding |
3 | Applying |
4 | Remembering |
5 | Understanding |

Answer Key
(Maximum of 4 to 5 points)

The power of a statistical test is the probability that the test will correctly reject a false null hypothesis, that is,
1 − β, where β is the probability of a Type II error. In other words, it is the ability of a test to detect a true effect
or difference when it exists. The critical region (also known as the rejection region) is the set of all possible
sample outcomes that would lead to the rejection of the null hypothesis.
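As a rough illustration of these two definitions, the sketch below computes the power of a right-tailed z-test for a normal mean. The function name `z_test_power`, the assumption of a known standard deviation, and all numeric values are chosen only for this example.

```python
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the right-tailed z-test of H0: mu = mu0 against H1: mu = mu1 > mu0.

    Assumes a normal population with known standard deviation sigma.
    """
    se = sigma / n ** 0.5
    # Critical region: reject H0 when the sample mean exceeds this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Power = P(sample mean falls in the critical region | mu = mu1).
    return 1 - NormalDist(mu1, se).cdf(cutoff)

# Hypothetical values: H0 mean 100, true mean 105, sigma 15, sample size 36.
power = z_test_power(mu0=100, mu1=105, sigma=15, n=36)
```

With these assumed values the power is roughly 0.64. When `mu1` equals `mu0`, the same function returns the significance level, since rejecting a true null hypothesis is a Type I error.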
Random Sample: A random sample is a subset of a population chosen in such a way that each member of the
population has an equal chance of being included.
Parameter: A parameter is a numerical characteristic of a population, often denoted by Greek letters, that
summarizes or describes a specific aspect of the entire group.
Statistic: A statistic is a numerical measure or summary calculated from a sample, used to estimate or infer
information about a corresponding parameter of the population.

A hypothesis in inferential statistics is a statement or assumption about a population parameter, often framed
to be tested through statistical analysis to determine its validity. The two types are the null hypothesis and the
alternative hypothesis.
Null Hypothesis (H0): The null hypothesis is a statement suggesting that there is no significant difference, effect,
or relationship in the population; any observed difference in the sample is due to random chance.
Alternative Hypothesis (Ha or H1): The alternative hypothesis is a statement proposing that there is a significant
difference, effect, or relationship in the population, contradicting the null hypothesis. It represents what
researchers aim to support with their data.
The Neyman-Pearson Lemma states that among all possible tests for a simple versus simple hypothesis testing
problem (where each hypothesis specifies a unique probability distribution), the likelihood ratio test is the most
powerful test for a given level of significance.
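In symbols, the statement can be sketched as follows. The notation is assumed here rather than taken from the answer key: L_0 and L_1 are the likelihoods under H_0 and H_1, W is the critical region, and k is a constant chosen to give size α.

```latex
W = \left\{ x : \frac{L_1(x)}{L_0(x)} \ge k \right\},
\qquad P(X \in W \mid H_0) = \alpha .

\text{For any other region } W' \text{ with } P(X \in W' \mid H_0) \le \alpha :
\qquad P(X \in W \mid H_1) \ \ge\ P(X \in W' \mid H_1).
```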
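The one-sample t-test is used to determine if the mean of a single sample is significantly different from a known
or hypothesized population mean. The two-sample t-test compares the means of two independent samples to
assess whether they differ significantly.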
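A minimal sketch of the one-sample t statistic, t = (x̄ − μ0) / (s / √n), is given below. The function name `one_sample_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean = mu0."""
    n = len(sample)
    # stdev uses the n - 1 divisor, as the t-test requires.
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical sample tested against a hypothesized mean of 50.
t_stat, df = one_sample_t([52, 48, 55, 51, 49, 53], mu0=50)
```

Here t is about 1.26 on 5 degrees of freedom. Since the two-sided 5% critical value for 5 degrees of freedom is 2.571, this sample would not lead to rejection of the null hypothesis.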
A confidence interval for the population arithmetic mean is a range of values constructed from sample data that is
likely to contain the true population mean with a certain level of confidence. It provides a way to quantify the
uncertainty associated with estimating a population parameter based on a sample.
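The sketch below builds such an interval as x̄ ± z·σ/√n. It assumes the population standard deviation σ is known, so the normal quantile applies; when σ is unknown and the sample is small, the t quantile and the sample standard deviation would be used instead. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def mean_confidence_interval(sample, sigma, confidence=0.95):
    """z-based confidence interval for the population mean, sigma assumed known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(len(sample))
    centre = mean(sample)
    return centre - margin, centre + margin

# Hypothetical data with an assumed known sigma of 3.
low, high = mean_confidence_interval([52, 48, 55, 51, 49, 53, 50, 54, 47], sigma=3)
```

For these values the sample mean is 51 and the 95% interval is approximately (49.04, 52.96).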

Parametric testing refers to a category of statistical tests that make certain assumptions about the distribution of
the underlying population from which the sample is drawn. Its types include:
One-Sample t-Test: Compares the mean of a single sample to a known or hypothesized population mean.
Two-Sample t-Test: Compares the means of two independent samples to assess if they are significantly different
from each other.
Paired Sample t-Test: Compares the means of two related or paired samples, such as repeated measurements
on the same subjects.
The F-test for equality of two variances is a statistical test used to assess whether the variances of two
populations are equal, based on independent samples drawn from each.
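The test statistic is the ratio of the two sample variances. The sketch below places the larger variance in the numerator, which is one common convention; the function name and data are hypothetical.

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical samples with variances 10 and 3.5.
f_stat, df1, df2 = f_statistic([10, 12, 14, 16, 18], [11, 12, 13, 14, 15, 16])
```

The resulting F is about 2.86 on (4, 5) degrees of freedom, which would then be compared with the tabulated F critical value.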

A confidence interval for the population variance is a statistical range of values that is constructed from sample
data and is used to estimate the true variance of a population with a certain level of confidence. This interval
provides a measure of the uncertainty associated with estimating the population variance based on a sample.
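For a normal population, the interval can be written as below. The notation is assumed: s² is the sample variance, n the sample size, and χ²_{p, n−1} the point of the chi-square distribution with n − 1 degrees of freedom that has upper-tail probability p.

```latex
\left( \frac{(n-1)\,s^2}{\chi^2_{\alpha/2,\;n-1}} ,\;
       \frac{(n-1)\,s^2}{\chi^2_{1-\alpha/2,\;n-1}} \right)
\quad \text{is a } 100(1-\alpha)\% \text{ confidence interval for } \sigma^2 .
```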
The sign test is a non-parametric statistical test used to determine whether the median of a population is equal to a
hypothesized value. It has two forms: the one-sample sign test and the paired-sample sign test.
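A minimal sketch of the one-sample sign test follows. Under the null hypothesis the number of observations above the hypothesized median is Binomial(n, 1/2), so an exact p-value can be computed directly. The function name and data are hypothetical.

```python
from math import comb

def sign_test_p_value(sample, median0):
    """Two-sided exact sign test of H0: population median = median0."""
    plus = sum(x > median0 for x in sample)
    minus = sum(x < median0 for x in sample)
    n = plus + minus  # observations equal to median0 are dropped
    k = min(plus, minus)
    # Under H0 the number of plus signs is Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical data: 8 values above 13 and 2 below.
p_value = sign_test_p_value([12, 15, 11, 18, 20, 17, 19, 16, 14, 21], median0=13)
```

With 8 plus signs and 2 minus signs the two-sided p-value is 0.109375, so the hypothesized median of 13 would not be rejected at the 5% level.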
The Median Test is a non-parametric statistical test used to determine whether there is a significant difference
between the medians of two or more independent groups. It is particularly useful when the assumption of
normality is not met or when dealing with ordinal or skewed interval data.

The Signed Rank Test, also known as the Wilcoxon Signed Rank Test, is a non-parametric statistical test used to
determine whether the median of a single sample is different from a hypothesized median. It is particularly useful
when the assumption of normality is not met or when dealing with ordinal or skewed interval data.
The Wilcoxon Signed Rank Test is a non-parametric statistical test used to determine whether the median of the
differences within paired observations is different from zero (or another hypothesized value). It's commonly applied
when the data do not meet the assumptions of normality or when working with ordinal or skewed interval data. This
test is particularly useful for paired samples or repeated measures designs.
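The statistic can be sketched as follows: take the paired differences, rank their absolute values, and sum the ranks separately for positive and negative differences. The function name and the before/after data are hypothetical.

```python
def signed_rank_statistic(before, after):
    """Return (W_plus, W_minus) for paired samples; zero differences are dropped."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Rank the absolute differences, giving tied values their average rank.
    ordered = sorted(abs(d) for d in diffs)

    def avg_rank(value):
        first = ordered.index(value) + 1
        last = first + ordered.count(value) - 1
        return (first + last) / 2

    w_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus

# Hypothetical paired measurements on six subjects.
w_plus, w_minus = signed_rank_statistic(
    before=[125, 130, 118, 140, 135, 128],
    after=[120, 131, 112, 136, 137, 121],
)
```

Here the rank sums are 3 and 18; the smaller of the two is compared with the tabulated critical value for n = 6.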
The Kruskal-Wallis H test (sometimes also called the "one-way ANOVA on ranks") is a rank-based nonparametric
test that can be used to determine if there are statistically significant differences between two or more groups of
an independent variable on a continuous or ordinal dependent variable.
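A minimal sketch of the H statistic, H = 12 / (N(N + 1)) · Σ R_i² / n_i − 3(N + 1), is shown below, where R_i is the rank sum of group i in the pooled sample. The function name and data are hypothetical, and no correction for ties is applied.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction) for two or more groups."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)

    def avg_rank(value):
        first = pooled.index(value) + 1
        return first + (pooled.count(value) - 1) / 2

    # Sum over groups of (rank sum squared) / (group size).
    total = sum(sum(avg_rank(x) for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Hypothetical scores for three independent groups.
h = kruskal_wallis_h([27, 31, 29], [35, 38, 40], [22, 25, 24])
```

For these groups H is 7.2. Under the usual large-sample approximation it is compared with a chi-square distribution on k − 1 = 2 degrees of freedom (5% critical value 5.991); with groups this small, exact tables would normally be preferred.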

Consistency, Asymptotic Normality, Asymptotic Efficiency, Invariance

Linearity
Independence
Homoscedasticity
Normality of Residuals

Unbiased Estimation
Efficiency
BLUE (Best Linear Unbiased Estimator)
Minimum Variance

Unbiasedness
Efficiency
Consistency
Sufficiency
Minimum Variance
Estimation, in statistics, refers to the process of making educated guesses or approximations about certain
characteristics or parameters of a population based on information derived from a sample of that population. The
goal of estimation is to infer or predict unknown values, such as population means, proportions, variances, or
other parameters, using observed data from a subset of the population.

Bayesian inference
Prior probability
Likelihood
Posterior probability
Bayes' theorem
Bayes factor
Bayesian inference is a fundamental concept in statistics that provides a framework for updating our beliefs about
uncertain quantities based on evidence or data.
Bayesian interval estimation, also known as Bayesian credible interval estimation, is a method used in Bayesian
statistics to estimate a range of plausible values for an unknown parameter of interest. Unlike classical
frequentist confidence intervals, which are constructed based on the sampling distribution of the estimator,
Bayesian credible intervals directly incorporate prior knowledge and uncertainties about the parameter.
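As one concrete case, the sketch below computes an equal-tailed credible interval for a normal mean using the conjugate normal prior. The known data standard deviation, the prior settings, and the function name are all assumptions made for the example.

```python
from statistics import NormalDist, mean

def normal_credible_interval(data, sigma, prior_mean, prior_sd, level=0.95):
    """Equal-tailed credible interval for a normal mean with known sigma.

    Uses the conjugate normal prior N(prior_mean, prior_sd**2).
    """
    n = len(data)
    prior_precision = 1 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_var = 1 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * mean(data))
    posterior = NormalDist(post_mean, post_var ** 0.5)
    tail = (1 - level) / 2
    return posterior.inv_cdf(tail), posterior.inv_cdf(1 - tail)

# Hypothetical data and prior: sigma = 2, prior N(8, 2**2).
low, high = normal_credible_interval([9, 11, 10, 12], sigma=2, prior_mean=8, prior_sd=2)
```

The posterior mean is 10, pulled from the sample mean of 10.5 toward the prior mean of 8, and the 95% credible interval is approximately (8.25, 11.75).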

Prior Distribution:
The prior distribution in Bayesian statistics represents our beliefs or uncertainty about a parameter before
observing any data. It encapsulates our subjective knowledge, information from previous studies, expert
opinions, or assumptions about the parameter's possible values.
Posterior Distribution:
The posterior distribution in Bayesian statistics represents our updated beliefs or uncertainty about a parameter
after observing data. It combines the prior distribution with the likelihood function, which quantifies the probability
of observing the data given different values of the parameter.
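The move from prior to posterior can be sketched with the Beta-Binomial model, where a Beta(a, b) prior on a success probability becomes a Beta(a + successes, b + failures) posterior. The prior parameters and the data below are hypothetical.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data under a Beta prior."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

prior = (2, 2)  # assumed prior: Beta(2, 2), centred on 0.5
posterior = beta_binomial_update(*prior, successes=7, failures=3)
prior_mean = beta_mean(*prior)
posterior_mean = beta_mean(*posterior)
```

After 7 successes and 3 failures the posterior is Beta(9, 5), and the mean shifts from 0.5 to 9/14, about 0.64.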
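The Bayesian procedure is a statistical framework for making inferences, predictions, and decisions based on
probability theory, specifically Bayesian probability. It involves several steps: specify a prior distribution for the
parameter, define the likelihood of the observed data, apply Bayes' theorem to obtain the posterior distribution,
and base inferences or decisions on that posterior.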

Answer Key
(Maximum of 6 to 7 points)
The level of significance, denoted by α, represents the threshold for rejecting the null hypothesis in
hypothesis testing. A Type I error occurs when the null hypothesis is rejected although it is actually
true. In other words, we detect an effect or difference when there is none in reality. The probability
of committing a Type I error is precisely the level of significance α chosen for the test.
A Type II error occurs when the null hypothesis is not rejected although it is actually false. In other
words, we fail to detect an effect or difference when one truly exists. Its probability is denoted by β,
and 1 − β is the power of the test.

A hypothesis is a statement or assumption made about a population parameter or the relationship
between two or more variables. In statistics, hypotheses are used to make informed decisions based
on available data. There are two main types of hypotheses: null hypothesis and alternative hypothesis.

Formulate the null and alternative hypotheses.
Choose the appropriate statistical test.
Determine the level of significance (α).
Collect data.

An unbiased test is one whose power, the probability of rejecting the null hypothesis when it is false, is
at least the significance level α under every alternative, while its probability of a Type I error (rejecting
the null hypothesis when it's true) is no greater than α.
An unbiased critical region is the critical region of such a test: the sample falls in it with probability at
most α when the null hypothesis is true and with probability at least α when the alternative is true.

Neyman-Pearson Lemma provides a systematic method for constructing the most powerful hypothesis
test for a given significance level.
It helps in determining the critical region that maximizes the power of the test while maintaining a
fixed Type I error rate.
It is widely used in various fields, including signal detection, quality control, medical diagnosis, and
communication theory, to design optimal hypothesis tests with specific performance criteria.
T-testing:
T-tests are used to determine whether the means of two groups are significantly different from each other. They are typically
applied when comparing the means of two independent groups or when comparing the mean of a sample to a known
population mean.
F-testing:
F-tests, on the other hand, are used for comparing variances or testing the equality of means across multiple
groups. One common application of F-tests is in analysis of variance (ANOVA), where they are used to assess whether there are
significant differences in the means of three or more groups.

A test for the equality of two or more normal distributions typically compares their means and variances: the two-sample
t-test compares the means of two groups, ANOVA compares means across multiple groups, and Levene's test compares
variances between groups. These tests rely on assumptions such as normality and, for the comparison of means,
homogeneity of variances.
Confidence Interval for Population Arithmetic Mean:
A confidence interval for the population arithmetic mean provides a range of plausible values for the true mean of a
population, based on sample data. It is constructed using the sample mean and the standard error of the mean, and it
quantifies the uncertainty associated with estimating the population mean.

Single Mean: A single mean hypothesis test or confidence interval is used to make inferences about the population mean
based on a single sample.
Two Means: A two means hypothesis test or confidence interval compares the means of two independent
samples to determine if they are significantly different from each other or constructs a range of plausible differences between
them.
Single Proportion: A single proportion hypothesis test or confidence interval is used to make inferences about the population
proportion based on a single sample.
Two Proportions: A two proportions hypothesis test or confidence interval compares the proportions of two
independent samples to determine if they are significantly different from each other or constructs a range of plausible
differences between them.

Parametric test: A statistical test that assumes specific parameters of the population distribution, such as mean and variance,
and relies on these assumptions for inference.

Goodness of Fit Test:
A goodness of fit test assesses whether observed data fit a specified probability distribution (for example, the normal
distribution), using tests such as the chi-square, Kolmogorov-Smirnov, or Anderson-Darling test.

Independence of Attributes Test:
An independence of attributes test, such as the chi-square test for independence, evaluates whether two categorical variables
are independent of each other or if there's a significant association between them.
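The chi-square statistic for independence compares each observed count with the count expected under independence, (row total × column total) / grand total. A minimal sketch follows; the function name and the 2 × 2 table are hypothetical.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two attributes were independent.
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    return chi2, (len(table) - 1) * (len(col_totals) - 1)

# Hypothetical 2 x 2 table of observed counts.
chi2, df = chi_square_independence([[30, 20], [20, 30]])
```

Every expected count here is 25, giving a statistic of 4.0 on 1 degree of freedom. This exceeds the 5% critical value of 3.841, so independence would be rejected for this table.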

Testing Equality of More Than Two Variances:
This involves using Bartlett's test or Levene's test to assess whether the variances of multiple groups are equal, a prerequisite
for certain statistical analyses like ANOVA, by comparing the variability within groups across different samples.

Nonparametric test: A statistical test that does not make assumptions about the underlying population
distribution, making it robust to violations of normality and suitable for ordinal or non-normally distributed data,
with applications in comparing medians, testing independence, and analyzing ranked data.

a) Wilcoxon signed-rank test: A nonparametric test used to assess whether the median difference between paired samples is
significantly different from zero.
b) Mann-Whitney U test: A nonparametric test used to determine if there is a significant difference between the medians of
two independent groups.
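The Mann-Whitney U statistic can be sketched by direct counting: U for the first sample is the number of pairs (x, y) in which x exceeds y, with ties counted as one half. The function name and data are hypothetical.

```python
def mann_whitney_u(sample_x, sample_y):
    """Return (U_x, U_y); ties between the samples count one half."""
    u_x = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in sample_x
        for y in sample_y
    )
    u_y = len(sample_x) * len(sample_y) - u_x
    return u_x, u_y

# Hypothetical independent samples of sizes 4 and 5.
u_x, u_y = mann_whitney_u([14, 18, 21, 25], [10, 12, 15, 16, 19])
```

The two values always sum to the product of the sample sizes (here 16 + 4 = 20); the smaller one, 4, is the value compared with the tabulated critical value.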
Friedman test: A non-parametric test used to determine whether there are statistically significant differences among multiple
paired groups, typically when comparing three or more related samples or treatments, as an alternative to repeated measures
ANOVA.
Example: Assessing if there's a difference in the performance of three different teaching methods across multiple classrooms.
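For the teaching-methods example, the Friedman statistic can be sketched as below: rank the k treatments within each block, sum the ranks per treatment, and compute 12 / (nk(k + 1)) · Σ R_j² − 3n(k + 1). The scores are invented for illustration, and the sketch assumes no ties within a block.

```python
def friedman_statistic(blocks):
    """Friedman chi-square statistic; each row holds one block's k treatment scores.

    Assumes no ties within a block.
    """
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0] * k
    for row in blocks:
        # Rank the k treatments within this block (1 = smallest score).
        order = sorted(range(k), key=lambda col: row[col])
        for rank, col in enumerate(order, start=1):
            rank_sums[col] += rank
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

# Hypothetical scores: rows are classrooms, columns are three teaching methods.
q = friedman_statistic([[72, 78, 85], [65, 70, 80], [80, 79, 90], [60, 68, 75]])
```

The rank sums are 5, 7 and 12, giving a statistic of 6.5. Under the large-sample approximation this is compared with a chi-square distribution on k − 1 = 2 degrees of freedom; with only four blocks, exact tables would normally be consulted.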

Estimation: The process of inferring population parameters from sample data.
Types of Estimation: Point estimation (estimating a single value) and interval estimation (estimating a range of values).
Properties of a Good Estimator: Unbiasedness, efficiency, and consistency.

Introduction to Ordinary Least Squares (OLS) Method: OLS is a statistical technique used to estimate the parameters of a
linear regression model by minimizing the sum of squared differences between observed and predicted values.

Assumptions of Ordinary Least Squares (OLS) Method: The key assumptions include linearity, independence of errors,
homoscedasticity, normality of errors, and absence of perfect multicollinearity.
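For a single predictor, the least squares estimates have a closed form: slope = S_xy / S_xx and intercept = ȳ − slope · x̄. A minimal sketch follows; the function name and data are hypothetical.

```python
from statistics import mean

def ols_simple(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Hypothetical data lying close to the line y = 2x.
intercept, slope = ols_simple([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.0])
```

For these points the fitted line is approximately y = 0.09 + 1.97x.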

Unbiasedness: An estimator is unbiased if, on average, it provides estimates that are equal to the true parameter value.
Consistency: An estimator is consistent if it converges to the true parameter value as the sample size increases indefinitely.
Efficiency: An estimator is efficient if it has the smallest possible variance among all unbiased estimators.
Sufficiency: A statistic is sufficient if it contains all the information in the sample about the parameter, so that no other
statistic computed from the same sample adds further information about it.

Estimation of parameters in multiple linear regression involves using methods like ordinary least squares to find coefficients
that minimize the sum of squared differences between observed and predicted values.
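In matrix form this can be sketched as follows. The notation is assumed: y is the n × 1 response vector, X the n × (p + 1) design matrix including a column of ones, and β the coefficient vector; X'X is taken to be invertible.

```latex
\hat{\beta} = \arg\min_{\beta}\, (y - X\beta)^{\top}(y - X\beta)
\quad\Longrightarrow\quad
X^{\top}X\,\hat{\beta} = X^{\top}y
\quad\Longrightarrow\quad
\hat{\beta} = (X^{\top}X)^{-1}X^{\top}y .
```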

Best linear unbiased estimator (BLUE): OLS estimators have the smallest variance among all linear unbiased estimators.
Efficient: When the errors are normally distributed, OLS estimators achieve the Cramér-Rao lower bound, making them statistically efficient.
Consistent: OLS estimators converge to the true parameter values as sample size increases.
Unbiased: OLS estimators have zero bias under the assumptions of the linear regression model.
Gauss-Markov theorem: OLS estimators are the best linear unbiased estimators under the classical linear regression model
assumptions.
Prior Distribution: Represents initial beliefs about a parameter before observing data.
Posterior Distribution: Represents updated beliefs about a parameter after incorporating observed data.

Point estimation in Bayesian statistics involves deriving a single value (e.g., posterior mean, median) as the best estimate of a
parameter, considering both prior information and observed data.

Prior probability: Initial belief about the parameter before observing data.
Likelihood: Probability of observing the data given different parameter values.
Posterior probability: Updated belief about the parameter after incorporating observed data.
Bayes' theorem: Formula to update prior beliefs using observed data.
Bayes factor: Ratio of the likelihoods of the observed data under two competing hypotheses.
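These terms can be tied together with a small worked example of Bayes' theorem, P(H | data) = P(data | H) · P(H) / P(data). The screening-test figures below are hypothetical.

```python
def posterior_probability(prior, likelihood_h, likelihood_not_h):
    """P(H | data) from Bayes' theorem for a hypothesis H and its complement."""
    evidence = prior * likelihood_h + (1 - prior) * likelihood_not_h
    return prior * likelihood_h / evidence

# Hypothetical screening test: 1% prevalence, 95% sensitivity, 5% false-positive rate.
post = posterior_probability(prior=0.01, likelihood_h=0.95, likelihood_not_h=0.05)
bayes_factor = 0.95 / 0.05  # ratio of the likelihoods of a positive result
```

The prior of 0.01 is updated to a posterior of about 0.16 after a positive result, and the Bayes factor of 19 measures how strongly that result favours the hypothesis over its complement.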

Bayesian testing procedures involve comparing hypotheses by assessing the posterior probabilities of competing models or
hypotheses.

Bayesian procedures involve updating beliefs about uncertain quantities using Bayes' theorem, incorporating prior knowledge
with observed data to derive posterior distributions for inference and decision-making.

Answer Key
(Maximum of 8 to 9 points)
Neyman-Pearson Lemma provides a systematic method for constructing the most powerful hypothesis
test for a given significance level.
It helps in determining the critical region that maximizes the power of the test while maintaining a
fixed Type I error rate.
It is widely used in various fields, including signal detection, quality control, medical diagnosis, and
communication theory, to design optimal hypothesis tests with specific performance criteria.
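As an illustration of such an application, the sketch below tests two simple hypotheses about a normal mean with known standard deviation. In that model the likelihood ratio increases with the sample mean, so the most powerful critical region has the form "sample mean ≥ cutoff". The function name, the hypotheses, and the data are assumptions made for the example.

```python
from math import exp, sqrt
from statistics import NormalDist, fmean

def np_test_normal_mean(sample, mu0, mu1, sigma, alpha=0.05):
    """Most powerful level-alpha test of H0: mu = mu0 vs H1: mu = mu1 (mu1 > mu0).

    Returns (likelihood ratio L1/L0, whether H0 is rejected).
    """
    n = len(sample)
    x_bar = fmean(sample)
    # Cutoff chosen so that P(sample mean >= cutoff | H0) = alpha.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * sigma / sqrt(n)
    # Likelihood ratio L1 / L0, written in closed form for the normal model.
    ratio = exp(n * (mu1 - mu0) * (x_bar - (mu0 + mu1) / 2) / sigma ** 2)
    return ratio, x_bar >= cutoff

# Hypothetical data: H0 mean 0, H1 mean 1, known sigma 1.
ratio, reject = np_test_normal_mean([0.9, 1.4, 0.2, 1.1], mu0=0, mu1=1, sigma=1)
```

The sample mean of 0.9 exceeds the cutoff of about 0.82, so H0 is rejected, and the data are roughly five times as likely under H1 as under H0.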

Small sample tests are statistical tests designed for use with small sample sizes, often when population parameters are
unknown or when assumptions of large sample tests are violated.
Types include t-tests for comparing means, chi-square tests for independence, and Fisher's exact test for small contingency
tables.
a) Wilcoxon Signed Rank Test: Non-parametric test for paired data, assessing if median differences between paired
observations differ significantly from zero.
b) Mann-Whitney U Test: Non-parametric test for independent samples, determining if there's a significant difference
between their medians.
c) Sign Test: Non-parametric test assessing whether the median of a paired difference is significantly different from zero,
based on the signs of the differences.
d) Signed Rank Test: Another name for the Wilcoxon Signed Rank Test; unlike the sign test, it uses the ranks of the magnitudes of the differences as well as their signs.

Estimation of mean and variance of a normal distribution using maximum likelihood estimator involves finding parameter
values that maximize the likelihood function given the observed data.
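For a normal sample, maximizing the likelihood gives the sample mean for μ and the average squared deviation (divisor n) for σ². A minimal sketch follows; the function name and data are hypothetical.

```python
from statistics import fmean

def normal_mle(sample):
    """Maximum likelihood estimates (mu_hat, sigma2_hat) for a normal sample."""
    n = len(sample)
    mu_hat = fmean(sample)
    # The MLE of the variance divides by n, not n - 1, so it is biased downward.
    sigma2_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, sigma2_hat

# Hypothetical sample of five observations.
mu_hat, sigma2_hat = normal_mle([4, 7, 5, 9, 5])
```

For this sample the estimates are 6.0 for the mean and 3.2 for the variance; the unbiased sample variance (divisor n − 1) would be 4.0.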

Bayesian statistical inference involves updating beliefs about uncertain quantities using Bayes' theorem, incorporating prior
knowledge with observed data to derive posterior distributions for inference and decision-making.
