Stochastic Research Paper
This paper presents data collected from a student survey covering
25 students and their weights. The data set describes each student's
gender, whether they are vegetarian, their height, the food they like,
and their weight. First, we draw histograms of the 25 students' weights
and a graph of both random variables. Second, we plot the normal
distribution of the weight random variable. Third, we perform linear
regression. Fourth, we apply a hypothesis test to the data set. Fifth,
we apply least mean squares to the data. Finally, we plot the ROC graph
and calculate the precision of the computed results.
The normal distribution is the most important probability
distribution in statistics because it fits many natural phenomena. For
example, heights, blood pressure, measurement error, and IQ scores
approximately follow the normal distribution. It is also known as the
Gaussian distribution or the bell curve.
Fig. 1. Histogram of data X
II. PROCEDURE
A. Histogram
A histogram is a graphical display
of data using bars of different heights. In
a histogram, each bar groups numbers into ranges.
Taller bars show that more data falls in that range.
A histogram displays the shape and spread of
continuous sample data. The major difference is that
a histogram is only used to plot the frequency of
Fig. 2. Histogram of data Y
Fig. 4 shows the MATLAB code for the histograms of the X and Y
parameters.
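The binning step described in the Histogram subsection can be sketched in a few lines. This is an illustrative Python sketch, not the MATLAB code of Fig. 4; the function name `histogram_counts` and the sample values are hypothetical and are not the survey data.

```python
def histogram_counts(values, num_bins):
    """Group values into num_bins equal-width ranges and count each range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; bin width would be zero")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Hypothetical weights for illustration only; NOT the survey data.
sample_weights = [130, 135, 138, 142, 145, 147, 149, 150, 152, 155, 158, 163, 170]
edges, counts = histogram_counts(sample_weights, 4)

# Text rendering: a longer bar means more values fall in that range.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:6.1f} to {right:6.1f}: {'#' * count}")
```

Each bar corresponds to one range of values, and its length is the number of observations in that range.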
B. Normal Distribution
The normal distribution is a probability distribution that is
symmetric about the mean, showing that data near the mean occur more
frequently than data far from the mean. In graph form, the normal
distribution appears as a bell curve.
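The statement that data near the mean are more frequent than data far from it can be made concrete by computing the probability that a normally distributed value lies within k standard deviations of the mean. The following is a minimal Python sketch using only the standard library; the helper name `prob_within` is introduced here for illustration.

```python
import math


def prob_within(k):
    """P(|X - mean| <= k * sd) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s) of the mean: {prob_within(k):.4f}")
```

The result does not depend on the particular mean or standard deviation, only on the number of standard deviations k.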
Fig. 3. Graph between data X and data Y
Variance is the expectation of the squared deviation of a random
variable from its mean $\mu$, that is, $\operatorname{Var}(X) = E[(X - \mu)^2]$.
In probability theory, a normal distribution is a type of
continuous probability distribution for a real-valued random variable.
The general form of its probability density function is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
The parameter $\mu$ is the mean or expectation of the distribution, and
$\sigma$ is its standard deviation.
A random variable with a Gaussian distribution is said to be
normally distributed and is called a normal deviate. Normal
distributions are important in statistics and are often used in the
natural and social sciences to represent real-valued random variables
whose distributions are not known. Their importance is partly due to
the central limit theorem. It states that, under some conditions, the
average of many samples (observations) of a random variable with finite
mean and variance is itself a random variable whose distribution
converges to a normal distribution as the number of samples increases.
Therefore, physical quantities that are expected to be the sum
of many independent processes (such as measurement errors) often have
distributions that are nearly normal. Moreover, Gaussian distributions
have some unique properties that are valuable in analytic studies. For
instance, any linear combination of a fixed collection of independent
normal deviates is a normal deviate. Many results and methods (such as
propagation of uncertainty and least squares parameter fitting) can be
derived analytically in explicit form when the relevant variables are
normally distributed.
The normal distribution is the only distribution whose
cumulants beyond the first two (i.e., other than the mean and variance)
are zero. It is also the continuous distribution with the maximum
entropy for a specified mean and variance. Assuming that the mean and
variance are finite, the normal distribution is the only distribution
for which the mean and variance calculated from a set of independent
draws are independent of each other.
[...] person or the price of a share. Such variables may be
better described by other distributions, such as the log-normal
distribution or the Pareto distribution.
The density of the normal distribution is practically zero
when the value lies more than a few standard deviations away from the
mean. Therefore, it may not be an appropriate model when one expects a
significant fraction of outliers (values that lie many standard
deviations away from the mean). Least squares and other statistical
inference methods that are optimal for normally distributed variables
often become highly unreliable when applied to such data. In those
cases, a more heavy-tailed distribution should be assumed and the
appropriate robust statistical inference methods applied. The Gaussian
distribution belongs to the family of stable distributions, which are
the attractors of sums of independent, identically distributed random
variables whether or not the mean or variance is finite. Except for the
Gaussian, which is a limiting case, all stable distributions have heavy
tails and infinite variance.
In our dataset, the mean and standard deviation of the
students' weights are 149.5 kg and 11.5 kg, respectively. The minimum
weight is 130 kg and the maximum weight is 170 kg.
Fig. 5 gives the MATLAB result for the normal distribution,
Fig. 6 shows the MATLAB code for the normal distribution, and Fig. 7
gives the MATLAB output (mean and variance).
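Using the summary statistics reported for the student weights (mean 149.5 and standard deviation 11.5, with minimum 130 and maximum 170), the normal density and the distance of each value from the mean in standard deviations can be evaluated directly. The following is a minimal Python sketch, not the MATLAB code of Fig. 6; the function name `normal_pdf` is introduced here for illustration, and only the four reported numbers are taken from the text.

```python
import math

MEAN = 149.5  # mean weight reported in the text
SD = 11.5     # standard deviation reported in the text


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


# Evaluate the density at the reported minimum, mean, and maximum weights.
for label, x in (("minimum", 130.0), ("mean", MEAN), ("maximum", 170.0)):
    z = (x - MEAN) / SD  # distance from the mean in standard deviations
    print(f"{label}: x = {x}, z = {z:+.2f}, density = {normal_pdf(x, MEAN, SD):.5f}")
```

Both the reported minimum and maximum lie within two standard deviations of the mean, so neither would be considered an extreme value under this model.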
Fig. 7. MATLAB result of output (mean, variance).
C. Linear Regression
In statistics, linear regression is a linear approach to
modeling the relationship between a scalar response and one or more
explanatory variables. The case of one explanatory variable is called
simple linear regression. For more than one explanatory variable, the
process is called multiple linear regression. Fig. 8 shows the linear
regression graph produced in MATLAB, and Fig. 9 shows the corresponding
MATLAB code.
The goal is to find the linear relation that best matches a
set of data pairs, in the sense that it minimizes the sum of the
squares of the discrepancies between the model and the data.
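For a single explanatory variable, the line that minimizes the sum of squared discrepancies has a closed-form solution. The following is a minimal Python sketch of that least squares fit, not the MATLAB code of Fig. 9; the function names `fit_line` and `sum_squared_error` and the data pairs are hypothetical and are not the survey data.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for the model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_error(xs, ys, slope, intercept):
    """Sum of squared discrepancies between the line and the data."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical (x, y) pairs for illustration only; NOT the survey data.
xs = [60, 62, 64, 66, 68, 70]
ys = [132, 138, 145, 151, 158, 164]

slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"sum of squared errors = {sum_squared_error(xs, ys, slope, intercept):.3f}")
```

Any other choice of slope or intercept gives a larger sum of squared errors on the same data, which is the sense in which the fitted line matches the data best.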
Linear regression is a
methodology for building a model of the relation
between two or more variables of interest on the
basis of available data. An interesting feature of this
methodology is that it may be explained and
developed simply as a least squares approximation