Statistical Application Da 2 Srishti
STATISTICAL APPLICATION
DIGITAL ASSIGNMENT 2
Question 1: Explain regression in detail, with some suitable
applications, and give valid conclusions.
Answer 1: The word regression literally means stepping back towards the
average. Regression analysis is a mathematical measure of the average
relationship between two or more variables in terms of the original units
of the data.
In regression analysis, there are two types of variables. The variable
whose value is influenced or is to be predicted is called the dependent
variable, and the variable which influences the values or is used for
prediction is called the independent variable.
The independent variable is also called the regressor, predictor or
explanatory variable. The dependent variable is also known as the
regressed or explained variable.
This technique is used for forecasting, time series modelling and finding
cause-and-effect relationships between variables. It is an important
tool for modelling and analysing data. For example, the relationship
between rash driving and the number of road accidents by a driver is best
studied through regression. Regression analysis also allows us to compare
the effects of variables measured on different scales, such as the effect
of price changes and the number of promotional activities. This helps
data scientists evaluate candidate variables and select the best set for
building predictive models. Regression analysis has multiple benefits.
They are as follows:
1. It indicates the significant relationships between the dependent
variable and the independent variables.
2. It indicates the strength of the impact of multiple independent
variables on a dependent variable.
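As a minimal sketch of how such a relationship can be estimated, the following fits a straight line by ordinary least squares using only the Python standard library. The promotion and sales figures are hypothetical illustration values, not data from this assignment, and the function name fit_line is likewise an assumption made for the example.

```python
# Simple linear regression by ordinary least squares (standard library only).
# All data values below are hypothetical and serve only as an illustration.

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Independent variable: number of promotional activities (hypothetical).
promotions = [1, 2, 3, 4, 5]
# Dependent variable: units sold (hypothetical).
sales = [12, 15, 19, 22, 27]

intercept, slope = fit_line(promotions, sales)
# Predict the dependent variable for a new value of the independent variable.
predicted_sales = intercept + slope * 6
print(f"sales = {intercept:.2f} + {slope:.2f} * promotions")
print(f"predicted sales for 6 promotions: {predicted_sales:.2f}")
```

The slope estimates the average change in the dependent variable per unit change in the independent variable, which is the "strength of impact" described in the list above.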
APPLICATIONS OF REGRESSION
Regression is a ubiquitous statistical and machine learning technique,
used everywhere from scientific research to stock markets. Some
applications are as follows:
1. Studying engine performance from test data in automobiles
NAME: SRISHTI SRIVASTAVA
REGISTRATION NUMBER: 18MSI0053
COURSE CODE: MAT1012
CONCLUSION
The conclusion can be illustrated with an example.
Regression is mainly used for measuring the relationship between two or
more variables. For example, one would like to know not just whether
patients have high blood pressure, but also whether the likelihood of
having high blood pressure is influenced by factors such as age and
weight. The variable to be explained (blood pressure) is called the
dependent variable, or, alternatively, the response variable; the
variables that explain it (age, weight) are called independent variables
or predictor variables.
equal to the average of all the sample ranges R) are found and from these
we can calculate the control limits for the X-bar and R charts.
Therefore,
LCLR = D3 × R-bar
Here the factors A2, D4 and D3 depend on the number of units per sample.
The larger the number, the closer the limits.
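For reference, the standard textbook control limits that use the factors A2, D3 and D4 can be written as below. This is a sketch under the assumption that the charts here are the usual Shewhart X-bar and R charts; the symbols for the grand mean (the average of the sample means) and the average range are notation chosen for this block.

```latex
\begin{aligned}
UCL_{\bar{X}} &= \bar{\bar{X}} + A_2 \bar{R}, &
LCL_{\bar{X}} &= \bar{\bar{X}} - A_2 \bar{R}, \\
UCL_{R} &= D_4 \bar{R}, &
LCL_{R} &= D_3 \bar{R}.
\end{aligned}
```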
Variable control charts are used for measured quantities such as length,
temperature, weight, volume and time.
Variable control charts require that the quality characteristic can be
measured as a number. They may be impractical and uneconomical, for
example in a manufacturing plant responsible for 100,000 dimensions.
EXAMPLE: A quality engineer establishes that a process is set up per the
standard operating conditions. She takes five samples consisting of 4
observations each from the process at random intervals. She verifies the
measurement system, measures the thickness of each part and records
the observations.
She then calculated the range and the mean of the following data.
She also calculated the upper and lower control limits for the R chart:
UCLR = D4 × (average range) = 2.282 × 0.0021 = 0.0047922
LCLR = D3 × (average range) = 0 × 0.0021 = 0
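The R chart arithmetic above can be reproduced with a short sketch. The average range (0.0021) and the factors D3 = 0 and D4 = 2.282 for samples of 4 observations are taken from the example; the function names are assumptions made for illustration, and the thickness measurements themselves are not available in the text, so none are invented here.

```python
# R chart control limits from the average range and the tabulated factors.

def average_range(samples):
    """Average of the per-sample ranges (max - min) for a list of samples."""
    ranges = [max(s) - min(s) for s in samples]
    return sum(ranges) / len(ranges)

def r_chart_limits(r_bar, d3, d4):
    """Return (LCL, UCL) for the R chart: LCL = D3 * R-bar, UCL = D4 * R-bar."""
    return d3 * r_bar, d4 * r_bar

# Values quoted in the example: R-bar = 0.0021, D3 = 0, D4 = 2.282 (n = 4).
lcl_r, ucl_r = r_chart_limits(0.0021, d3=0.0, d4=2.282)
print(f"LCL_R = {lcl_r:.7f}, UCL_R = {ucl_r:.7f}")
```

Given the recorded observations as a list of samples, average_range would supply the R-bar value that is passed to r_chart_limits.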
EXAMPLE:
Data    Sample size   Defectives   Percent defective
11      120           3            2.5
12      100           1            1
13      45            2            4.44
14      60            2            3.33
15      130           2            1.24
16      90            1            1.1
17      105           4            3.8
18      80            3            3.75
19      75            1            1.33
20      105           2            1.9
Total   900           21
Here the average sample size is 900/10 = 90.
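The percent defective column and the average sample size can be recomputed from the listed rows with a sketch like the one below. The variable names are assumptions made for illustration. Recomputing from the rows as printed gives a sample-size total of 910 (average 91) and about 1.54 percent for row 15, which differ from the printed total of 900 and the printed 1.24; the text does not make clear which figures match the original data, so the sketch simply reports what the listed rows imply.

```python
# Recompute percent defective and average sample size from the listed rows.
# Each tuple is (data label, sample size, number defective) as printed in the table.
rows = [
    (11, 120, 3), (12, 100, 1), (13, 45, 2), (14, 60, 2), (15, 130, 2),
    (16, 90, 1), (17, 105, 4), (18, 80, 3), (19, 75, 1), (20, 105, 2),
]

# Percent defective per row = 100 * defectives / sample size.
percent_defective = {label: 100 * d / n for label, n, d in rows}

total_sampled = sum(n for _, n, _ in rows)
total_defective = sum(d for _, _, d in rows)
average_sample_size = total_sampled / len(rows)
# Overall fraction defective across all samples.
overall_fraction = total_defective / total_sampled

for label, n, d in rows:
    print(f"{label}: {percent_defective[label]:.2f}%")
print(f"total sampled = {total_sampled}, total defective = {total_defective}")
print(f"average sample size = {average_sample_size:.1f}")
print(f"overall fraction defective = {overall_fraction:.4f}")
```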
where fo is the observed frequency (the observed counts in the cells) and
fe is the expected frequency if no relationship existed between the
variables.
If so, we can conclude that the variables are not independent of each
other and that there is a statistical relationship between the categorical
variables.
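As a hedged illustration, and assuming the statistic referred to above is the usual Pearson chi-square statistic (the sum over all cells of (fo - fe)^2 / fe), the test of independence can be sketched as follows. The 2 x 2 table of counts is hypothetical, the function name is an assumption, and 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
# Pearson chi-square test of independence for a contingency table.
# The observed counts below are hypothetical illustration values.

def chi_square_statistic(observed):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            # Expected frequency if the two variables were independent.
            fe = row_totals[i] * col_totals[j] / grand_total
            statistic += (fo - fe) ** 2 / fe
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof

observed = [
    [30, 10],
    [20, 40],
]
statistic, dof = chi_square_statistic(observed)

# Standard critical value of chi-square at the 5% level with 1 degree of freedom.
CRITICAL_5_PERCENT_DOF_1 = 3.841
reject_independence = statistic > CRITICAL_5_PERCENT_DOF_1
print(f"chi-square = {statistic:.4f}, dof = {dof}, reject independence: {reject_independence}")
```

When the statistic exceeds the critical value, the hypothesis of independence is rejected, which corresponds to the conclusion that the categorical variables are related.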