
NAME: SRISHTI SRIVASTAVA

REGISTRATION NUMBER: 18MSI0053


COURSE CODE: MAT1012

STATISTICAL APPLICATION
DIGITAL ASSIGNMENT 2
Question 1: Explain regression in detail, with a suitable application,
and give valid conclusions.
Answer 1: Regression literally means stepping back towards the average.
Regression analysis is a mathematical measure of the average relationship
between two or more variables in terms of the original units of the data.
In regression analysis there are two types of variables. The variable
whose value is influenced or is to be predicted is called the dependent
variable, and the variable which influences the values or is used for
prediction is called the independent variable.
The independent variable is also called the regressor, predictor or
explanatory variable. The dependent variable is also known as the
regressed or explained variable.
This technique is used for forecasting, time series modelling and finding
cause-and-effect relationships between variables. It is an important tool
for modelling and analysing data. For example, the relationship between
rash driving and the number of road accidents by a driver is best studied
through regression. There are multiple benefits of using regression
analysis. They are as follows:
1. It indicates the significant relationships between the dependent
variable and the independent variables.
2. It indicates the strength of impact of multiple independent
variables on a dependent variable.
Regression analysis also allows us to compare the effects of variables
measured on different scales, such as the effect of price changes and the
number of promotional activities. These benefits help data scientists to
evaluate candidate variables and select the best set for building
predictive models.
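
As a minimal sketch of how a regression line is obtained, the Python snippet below fits y = a + b*x by least squares. The data (promotional activities and units sold) are hypothetical and only illustrate the calculation; the function name fit_line is likewise an assumption made for this example.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: promotional activities per month (independent variable)
# and units sold (dependent variable).
promotions = [1, 2, 3, 4, 5]
units_sold = [12, 15, 19, 22, 27]

intercept, slope = fit_line(promotions, units_sold)
forecast = intercept + slope * 6  # predicted units sold for 6 promotions
print(f"units_sold = {intercept:.2f} + {slope:.2f} * promotions")
print(f"forecast for 6 promotions: {forecast:.2f}")
```

The slope is the average change in the dependent variable per unit change in the independent variable, which is the "strength of impact" referred to in the list above.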

APPLICATIONS OF REGRESSION
Regression is a ubiquitous statistical and machine learning technique,
used everywhere from scientific research to stock markets. Some
applications are as follows:
1. Studying engine performance from test data in automobiles
2. Least squares regression is used to model causal relationships
between parameters in biological systems
3. OLS regression can be used in weather data analysis
4. Linear regression can be used in market research studies and
customer survey results analysis
5. Linear regression is commonly used in observational astronomy. A
number of statistical tools and methods are used in astronomical data
analysis, and there are entire libraries in languages like Python
meant for data analysis in astrophysics.

CONCLUSION
The conclusion can be drawn with an example:
Regression is mainly used for measuring the relationship between a
dependent variable and one or more independent variables. For example,
one would like to know not just whether patients have high blood
pressure, but also whether the likelihood of
having high blood pressure is influenced by factors such as age and
weight. The variable to be explained (blood pressure) is called the
dependent variable, or, alternatively, the response variable; the
variables that explain it (age, weight) are called independent variables
or predictor variables.
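
The blood pressure example involves two predictors, so it is a case of multiple regression. The sketch below is one possible way to fit such a model in plain Python by solving the normal equations. The patient data are synthetic: the pressures are generated from an assumed linear rule (80 + 0.5*age + 0.3*weight) purely so that the recovered coefficients can be checked, and the helper names solve and fit_multiple are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple(predictors, y):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + b1*x1 + b2*x2 + ..."""
    rows = [[1.0] + list(p) for p in predictors]  # add the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic patients: (age in years, weight in kg).
patients = [(30, 60), (40, 72), (50, 65), (60, 80), (45, 90), (35, 70)]
# Synthetic response generated from an assumed rule, not real measurements.
pressure = [80 + 0.5 * age + 0.3 * weight for age, weight in patients]

coefficients = fit_multiple(patients, pressure)
print("intercept, age effect, weight effect:",
      [round(c, 3) for c in coefficients])
```

Each fitted coefficient estimates how much the dependent variable changes per unit of one predictor while the other is held fixed.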

QUESTION 2: What is the difference between control charts for
variables and control charts for attributes? Explain through examples.
ANSWER 2: Control charts are graphs used to study how a process changes
over time. Data are plotted in time order. A control chart always has a
central line for the average, an upper line for the upper control limit
(UCL) and a lower line for the lower control limit (LCL). Control charts
are divided into two types: control charts for variables and control
charts for attributes.

CONTROL CHARTS FOR VARIABLES


A number of samples of the component coming out of the process are taken
over a period of time. Each sample must be taken at random, and the sample
size is generally kept at 5. For each sample, the average value $\bar{X}$
of all the measurements and the range $R$ are calculated. The grand
average $\bar{\bar{X}}$ (equal to the average of all the sample averages
$\bar{X}$) and $\bar{R}$ (equal to the average of all the sample ranges
$R$) are found, and from these we can calculate the control limits for
the $\bar{X}$ and $R$ charts.
Therefore,

$UCL_{\bar{X}} = \bar{\bar{X}} + A_2 \bar{R}$
$LCL_{\bar{X}} = \bar{\bar{X}} - A_2 \bar{R}$
$UCL_R = D_4 \bar{R}$
$LCL_R = D_3 \bar{R}$

Here the factors $A_2$, $D_4$ and $D_3$ depend on the number of units per
sample. The larger the number, the closer the limits.
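
The procedure above can be sketched in Python as follows. The factor values are the commonly tabulated Shewhart constants for sample sizes 2 to 5 (some tables differ in the last digit); the measurements and the function name xbar_r_limits are hypothetical and serve only to illustrate the steps.

```python
# Commonly tabulated control-chart factors by sample size n
# (assumed here; check against the table used in your course).
FACTORS = {
    2: {"A2": 1.880, "D3": 0.0, "D4": 3.267},
    3: {"A2": 1.023, "D3": 0.0, "D4": 2.574},
    4: {"A2": 0.729, "D3": 0.0, "D4": 2.282},
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114},
}


def xbar_r_limits(samples):
    """Return (LCL, centre line, UCL) for the X-bar chart and the R chart."""
    n = len(samples[0])
    f = FACTORS[n]
    means = [sum(s) / n for s in samples]        # sample averages
    ranges = [max(s) - min(s) for s in samples]  # sample ranges
    grand_mean = sum(means) / len(means)         # average of the averages
    r_bar = sum(ranges) / len(ranges)            # average range
    return {
        "xbar": (grand_mean - f["A2"] * r_bar, grand_mean,
                 grand_mean + f["A2"] * r_bar),
        "r": (f["D3"] * r_bar, r_bar, f["D4"] * r_bar),
    }


# Hypothetical measurements: 3 samples of 5 parts each.
samples = [
    [10.0, 10.2, 9.9, 10.1, 9.8],
    [10.1, 10.0, 10.3, 9.9, 10.2],
    [9.9, 10.0, 10.1, 10.2, 9.8],
]

limits = xbar_r_limits(samples)
print("X-bar chart (LCL, CL, UCL):", [round(v, 4) for v in limits["xbar"]])
print("R chart     (LCL, CL, UCL):", [round(v, 4) for v in limits["r"]])
```

A sample whose average or range falls outside these limits would be a signal to investigate the process.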
Variable control charts are used for measurable quantities such
as length, temperature, weight, volume and time.
Variable control charts require that the quality characteristic can be
measured in numbers. They may be impractical and uneconomical when
many characteristics must be tracked, for example in a manufacturing
plant responsible for 100000 dimensions.
EXAMPLE: A quality engineer establishes that a process is set up per the
standard operating conditions. She takes five samples of 4 observations
each from the process at random intervals. She verifies the
measurement system, measures the thickness of each part and records
the observations.

She then calculates the mean and the range of each sample, which give:

Average of the sample means = 0.5027

Average range = 0.0021

She also calculates the upper and lower control limits for the R chart:

UCL_R = D4 x (average range) = 2.282 x 0.0021 = 0.0047922

LCL_R = D3 x (average range) = 0 x 0.0021 = 0
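
The same two averages also give limits for the chart of sample means. As a worked sketch, assuming the commonly tabulated factor A2 = 0.729 for samples of 4 observations (this value is not stated in the example itself):

$$
UCL_{\bar{X}} = \bar{\bar{X}} + A_2 \bar{R} = 0.5027 + 0.729 \times 0.0021 \approx 0.5042
$$

$$
LCL_{\bar{X}} = \bar{\bar{X}} - A_2 \bar{R} = 0.5027 - 0.729 \times 0.0021 \approx 0.5012
$$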

CONTROL CHARTS FOR ATTRIBUTES


There are instances in industrial practice where direct measurements are
not required or not possible. Under such circumstances, the inspection
results are based on classifying products as defective or not defective,
good or bad, according to whether the product conforms or fails to
conform to the specification.

In manufacturing, it is sometimes required to control burns, cracks, voids,
dents, scratches, missing and wrong components, rust etc. Here, we
inspect products only as good or bad, not how good or how bad.

1. Attribute Charts for Defective Items (P-Chart):

This is the control chart for percent defective or fraction defective.
It is used whenever the quality characteristic is expressed as the number
of units conforming or not conforming to the specification, for example
as judged by visual inspection.

The fraction defective is the number of defectives expressed as a decimal
proportion of the units inspected, while the percent defective is the
fraction defective expressed as a percentage. For example, 15 defectives
in a sample of 200 give a fraction defective of 15/200 = 0.075 and a
percent defective of 0.075 x 100 = 7.5%.

EXAMPLE:
Data     Sample size   Defectives   Percent defective
11       120           3            2.50
12       100           1            1.00
13       45            2            4.44
14       60            2            3.33
15       130           2            1.54
16       90            1            1.11
17       105           4            3.81
18       80            3            3.75
19       75            1            1.33
20       105           2            1.90
Total    910           21
Here the average sample size = 910/10 = 91

The average fraction defective = 21/910 = 0.023
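
To turn this table into a P-chart, control limits are still needed. The sketch below recomputes the totals from the ten listed rows and applies the usual 3-sigma limits for a fraction defective, using the average sample size. The 3-sigma formula is the standard textbook choice and is an assumption here, since the example itself stops at the average fraction defective.

```python
import math

# (sample size, number of defectives) for data rows 11 to 20 of the table.
rows = [(120, 3), (100, 1), (45, 2), (60, 2), (130, 2),
        (90, 1), (105, 4), (80, 3), (75, 1), (105, 2)]

total_inspected = sum(n for n, _ in rows)
total_defective = sum(d for _, d in rows)
p_bar = total_defective / total_inspected  # average fraction defective
n_avg = total_inspected / len(rows)        # average sample size

# Standard deviation of a sample fraction defective (binomial model)
sigma = math.sqrt(p_bar * (1 - p_bar) / n_avg)
ucl = p_bar + 3 * sigma
lcl = max(0.0, p_bar - 3 * sigma)  # a fraction cannot be negative

# Rows whose own fraction defective falls outside the limits
out_of_control = [label for label, (n, d) in enumerate(rows, start=11)
                  if not lcl <= d / n <= ucl]

print(f"p-bar = {p_bar:.4f}, LCL = {lcl:.4f}, UCL = {ucl:.4f}")
print("out-of-control rows:", out_of_control)
```

Because the sample sizes vary, an alternative design is to compute separate limits for each row from its own sample size; the average-size version is shown only because it is the simpler one.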



2. Attribute Charts for Number of Defects per Unit (C-Chart):

This is another method of plotting attribute characteristics. In this
case, the sample taken is a single unit of fixed size, such as a fixed
length, breadth, area or time. In some cases it is required to find the
number of defects per unit rather than the percent defective.

For example, take a case in which a large number of small components
form a large unit, say a transistor set. The set may have defects at
various points. In this case, it seems natural to count the number of
defects per set, rather than to determine all points at which the unit is
defective.
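
A minimal sketch of C-chart limits in Python is given below. It uses the standard 3-sigma limits for a count of defects (based on the Poisson assumption that the variance equals the mean); this formula is not stated above and is included as an assumption. The defect counts are hypothetical.

```python
import math

# Hypothetical number of defects counted in each of 10 inspected sets.
defects_per_set = [4, 2, 5, 3, 6, 1, 4, 3, 5, 7]

c_bar = sum(defects_per_set) / len(defects_per_set)  # average defects per set
sigma = math.sqrt(c_bar)                             # Poisson: variance = mean
ucl = c_bar + 3 * sigma
lcl = max(0.0, c_bar - 3 * sigma)  # a count cannot be negative

# 1-based positions of sets whose defect count lies outside the limits
flagged = [i for i, c in enumerate(defects_per_set, start=1)
           if not lcl <= c <= ucl]

print(f"c-bar = {c_bar}, LCL = {lcl}, UCL = {ucl}")
print("sets outside the limits:", flagged)
```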

QUESTION 4: What is a Chi-square test? Explain with a social problem.

ANSWER 3: The Chi-Square statistic is commonly used for testing
relationships between categorical variables. The null hypothesis of
the Chi-Square test is that no relationship exists between the categorical
variables in the population; they are independent. The calculation of the
Chi-Square statistic is quite straightforward; it is given by the formula:

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

where $f_o$ = the observed frequency (the observed counts in the cells) and
$f_e$ = the expected frequency if NO relationship existed between the
variables.
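
As an illustration of the calculation, the sketch below computes the Chi-Square statistic for a small cross-tabulation in Python. The expected frequency of each cell is taken as (row total x column total) / grand total, which is what independence implies. The counts in the table and the function name chi_square are hypothetical.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            # Expected count if the two variables were independent
            f_e = row_totals[i] * col_totals[j] / grand_total
            statistic += (f_o - f_e) ** 2 / f_e
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical 2 x 2 table of counts (rows: two groups, columns: two opinions).
table = [[10, 20],
         [30, 40]]

statistic, dof = chi_square(table)
print(f"chi-square = {statistic:.4f} with {dof} degree(s) of freedom")
```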

EXAMPLE OF THE CHI-SQUARE METHOD USING SPSS


The Chi-Square statistic appears as an option when requesting a
crosstabulation in SPSS. The output is labelled Chi-Square Tests; the
Chi-Square statistic used in the Test of Independence is labelled Pearson
Chi-Square. This statistic can be evaluated by comparing the actual value
against a critical value found in a Chi-Square distribution, but it is
easier to simply examine the p-value provided by SPSS. To reject the null
hypothesis with 95% confidence, the value labelled Asymp. Sig.
(which is the p-value of the Chi-Square statistic) should be less than .05
(which is the alpha level associated with a 95% confidence level).
If so, we can conclude that the variables are not independent of each
other and that there is a statistical relationship between the categorical
variables.
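
The decision rule can be sketched in Python for the simplest case. The snippet below applies only to a statistic with 1 degree of freedom (a 2 x 2 table), where the p-value has a closed form through the complementary error function; it is an illustrative assumption and not a description of how SPSS computes Asymp. Sig. The function names are hypothetical.

```python
import math


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For 1 degree of freedom, chi-square is the square of a standard normal,
    # so P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def is_significant(chi2, alpha=0.05):
    """True if independence is rejected at the given alpha level."""
    return p_value_df1(chi2) < alpha


for value in (0.79, 3.841, 6.0):
    print(f"chi-square = {value}: p = {p_value_df1(value):.4f}, "
          f"reject independence: {is_significant(value)}")
```

For tables with more degrees of freedom, a statistical package or a Chi-Square table is needed instead of this shortcut.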

In this example, there is an association between fundamentalism and
views on teaching sex education in public schools. While 17.2% of
fundamentalists oppose teaching sex education, only 6.5% of liberals are
opposed. The p-value indicates that these variables are not independent
of each other and that there is a statistically significant relationship
between the categorical variables.
