

INTRODUCTION

Statistics
Statistics is a mathematical body of science that pertains to the collection, analysis, interpretation or
explanation, and presentation of data. Some consider statistics to be a distinct mathematical science
rather than a branch of mathematics. While many scientific investigations make use of data, statistics
is concerned with the use of data in the context of uncertainty and with decision making in the face of
uncertainty.
In applying statistics to a problem, it is common practice to start with a population or process to be
studied. Populations can be diverse topics such as "all people living in a country" or "every atom
composing a crystal". Ideally, statisticians compile data about the entire population (an operation
called census). This may be organized by governmental statistical institutes. Descriptive statistics can
be used to summarize the population data. Numerical descriptors include mean and standard
deviation for continuous data (like income), while frequency and percentage are more useful in terms
of describing categorical data (like education).
When a census is not feasible, a chosen subset of the population called a sample is studied. Once a
sample that is representative of the population is determined, data is collected for the sample
members in an observational or experimental setting. Again, descriptive statistics can be used to
summarize the sample data. However, drawing the sample contains an element of randomness; hence,
the numerical descriptors from the sample are also prone to uncertainty. To draw meaningful
conclusions about the entire population, inferential statistics is needed. It uses patterns in the sample
data to draw inferences about the population represented while accounting for randomness. These
inferences may take the form of answering yes/no questions about the data (hypothesis testing),
estimating numerical characteristics of the data (estimation), describing associations within the data
(correlation), and modeling relationships within the data (for example, using regression analysis).
Inference can extend to forecasting, prediction, and estimation of unobserved values either in or
associated with the population being studied. It can include extrapolation and interpolation of time
series or spatial data, and data mining.

The Standard Deviation


The concept of Standard Deviation was introduced by Karl Pearson in 1893. It is by far the most
important and widely used measure of dispersion. Its significance lies in the fact that it is free from
those defects which afflicted earlier methods and satisfies most of the properties of a good measure of
dispersion. Standard deviation is also known as root-mean-square deviation, as it is the square root of
the mean of the squared deviations from the arithmetic mean.
In financial terms, standard deviation is used to measure the risk involved in an investment
instrument. It provides investors with a mathematical basis for decisions to be made regarding their
investments in financial markets. Standard deviation is a common term in deals involving stocks,
mutual funds, ETFs and others, where it is also known as volatility.

Variance
In statistics, variance refers to the spread of a data set. It’s a measurement used to identify how far
each number in the data set is from the mean.

While performing market research, variance is particularly useful when calculating probabilities of
future events. Variance is a great way to find all of the possible values and likelihoods that a random
variable can take within a given range.
A variance of zero indicates that all of the values within a data set are identical, while any variance
that is not zero is a positive number.
The larger the variance, the more spread in the data set.
A large variance means that the numbers in a set are far from the mean and each other. A small
variance means that the numbers are closer together in value.
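As a quick illustration (a hypothetical sketch using Python's standard-library `statistics` module), two data sets can share the same mean yet have very different variances:

```python
import statistics

# Two hypothetical data sets, both with mean 10 but with different spread.
tight = [9, 10, 10, 11]    # values close to the mean and to each other
spread = [2, 6, 14, 18]    # values far from the mean and from each other

print(statistics.pvariance(tight))   # 0.5  (small variance: tightly clustered)
print(statistics.pvariance(spread))  # 40.0 (large variance: widely spread)
```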

HISTORY
The earliest writings on probability and statistics date back to Arab
mathematicians and cryptographers, during the Islamic Golden Age between the 8th and 13th
centuries. Al-Khalil (717–786) wrote the Book of Cryptographic Messages, which contains the first
use of permutations and combinations to list all possible Arabic words with and without vowels. The
earliest book on statistics is the 9th-century treatise Manuscript on Deciphering Cryptographic
Messages, written by the Arab scholar Al-Kindi (801–873). In his book, Al-Kindi gave a detailed
description of how to use statistics and frequency analysis to decipher encrypted messages. This text
laid the foundations for statistics and cryptanalysis. Al-Kindi also made the earliest known use
of statistical inference, while he and later Arab cryptographers developed the early statistical methods
for decoding encrypted messages. Ibn Adlan (1187–1268) later made an important contribution on the
use of sample size in frequency analysis.
The earliest European writing on statistics dates back to 1663, with the publication of Natural and
Political Observations upon the Bills of Mortality by John Graunt. Early applications of statistical
thinking revolved around the needs of states to base policy on demographic and economic data, hence
its stat- etymology. The scope of the discipline of statistics broadened in the early 19th century to
include the collection and analysis of data in general. Today, statistics is widely employed in
government, business, and natural and social sciences.
Statistics, in the modern sense of the word, began evolving in the 18th century in response to the
novel needs of industrializing sovereign states. The evolution of statistics was, in particular,
intimately connected with the development of European states following the Peace of Westphalia
(1648), and with the development of probability theory, which put statistics on a firm theoretical
basis.
In early times, the meaning was restricted to information about states, particularly demographics such
as population. This was later extended to include all collections of information of all types, and later
still it was extended to include the analysis and interpretation of such data. In modern terms,
"statistics" means both sets of collected information, as in national accounts and temperature records,
and analytical work which requires statistical inference. Statistical activities are often associated with
models expressed using probability, hence the connection with probability theory. The large
requirements of data processing have made statistics a key application of computing. A number of
statistical concepts have an important impact on a wide range of sciences. These include the design of
experiments and approaches to statistical inference such as Bayesian inference, each of which can be
considered to have their own sequence in the development of the ideas underlying modern statistics.

PURPOSE

Standard Deviation
In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of
values. A low standard deviation indicates that the values tend to be close to the mean (also called
the expected value) of the set, while a high standard deviation indicates that the values are spread out
over a wider range.
Standard deviation may be abbreviated SD, and is most commonly represented in mathematical texts
and equations by the lower-case Greek letter sigma σ, for the population standard deviation, or
the Latin letter s, for the sample standard deviation.
The standard deviation of a random variable, sample, statistical population, data set, or probability
distribution is the square root of its variance. It is algebraically simpler, though in practice less robust,
than the average absolute deviation. A useful property of the standard deviation is that unlike the
variance, it is expressed in the same unit as the data.

Variance
Variance is the expectation of the squared deviation of a random variable from its mean. Informally,
it measures how far a set of numbers is spread out from their average value. Variance has a central
role in statistics, where some ideas that use it include descriptive statistics, statistical
inference, hypothesis testing, goodness of fit, and Monte Carlo sampling. Variance is an
important tool in the sciences, where statistical analysis of data is common. The variance is the square
of the standard deviation, the second central moment of a distribution, and the covariance of the
random variable with itself, and it is often represented by σ², s², or Var(X).
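In symbols, the definition above can be written compactly (standard notation, with X a random variable and μ its mean):

```latex
\operatorname{Var}(X) = \sigma^2 = \mathbb{E}\!\left[(X-\mu)^2\right]
 = \mathbb{E}[X^2] - \mu^2,
\qquad
\sigma = \sqrt{\operatorname{Var}(X)}
```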

METHODS
Standard Deviation
Step 1 – Find the mean.
Step 2 – For each data point, find the square of its distance to the mean.
Step 3 – Sum the values from step 2.
Step 4 – Divide by the number of data points.
Step 5 – Take the square root.
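The steps above can be sketched directly in Python (a minimal illustration; `population_sd` is a hypothetical helper name, and dividing by the full number of data points in step 4 yields the population standard deviation):

```python
import math

def population_sd(data):
    """Population standard deviation, following the five steps above."""
    mean = sum(data) / len(data)                     # Step 1: find the mean
    squared_dists = [(x - mean) ** 2 for x in data]  # Step 2: squared distance to the mean
    total = sum(squared_dists)                       # Step 3: sum the values
    variance = total / len(data)                     # Step 4: divide by the number of points
    return math.sqrt(variance)                       # Step 5: take the square root

print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))  # prints 2.0
```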

Variance
Step 1 – Find the mean.
To find the mean, add up all the scores, then divide them by the number of scores.
Step 2 – Find each score's deviation from the mean.
Subtract the mean from each score to get the deviations from the mean.
Step 3 – Square each deviation from the mean.
Multiply each deviation from the mean by itself. This will result in positive numbers.
Step 4 – Find the sum of squares.
Add up all of the squared deviations. This is called the sum of squares.
Step 5 – Find the variance.
Divide the sum of squares by n – 1 (for a sample) or N (for a population); this is the variance.
Step 6 – Find the square root of the variance.
To find the standard deviation, take the square root of the variance.
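Sketched in Python with hypothetical scores (note the n − 1 divisor in step 5, which makes this the sample variance):

```python
scores = [46, 69, 32, 60, 52, 41]  # hypothetical sample of six scores

mean = sum(scores) / len(scores)                      # Step 1: find the mean (50.0)
deviations = [x - mean for x in scores]               # Step 2: each score's deviation
squared = [d ** 2 for d in deviations]                # Step 3: square each deviation
sum_of_squares = sum(squared)                         # Step 4: sum of squares (886.0)
sample_variance = sum_of_squares / (len(scores) - 1)  # Step 5: divide by n - 1 (sample)
sample_sd = sample_variance ** 0.5                    # Step 6: square root

print(sample_variance)  # 177.2
print(sample_sd)
```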

USES

Standard Deviation
Standard deviation is a number used to tell how measurements for a group are spread out from the
average. A low standard deviation means that most of the numbers are close to the average, while a
high standard deviation means that the numbers are more spread out.

Variance
In statistics, variance measures variability from the average or mean. It is calculated by taking the
differences between each number in the data set and the mean, then squaring the differences to make
them positive, and finally dividing the sum of the squares by the number of values in the data set.
A large variance indicates that numbers in the set are far from the mean and far from each other. A
small variance, on the other hand, indicates the opposite. A variance value of zero, though, indicates
that all values within a set of numbers are identical. Every variance that isn't zero is a positive
number. A variance cannot be negative. That's because it's mathematically impossible since you can't
have a negative value resulting from a square.

DIFFERENCE BETWEEN VARIANCE AND STANDARD DEVIATION

Here, the list of comparative differences between the variance and the standard deviation is given
below in detail:

Difference between Variance and Standard Deviation

Variance                                         Standard Deviation

It can simply be defined as the numerical        It can simply be defined as the dispersion
value which describes how variable the           of the observations, as measured within a
observations are.                                data set.

Variance is nothing but the average of the       Standard deviation is defined as the root
squared deviations.                              of the mean square deviation.

Variance is expressed in squared units.          Standard deviation is expressed in the
                                                 same units as the data.

It is mathematically denoted as σ².              It is mathematically denoted as σ.

Variance indicates how the individuals in        Standard deviation indicates the dispersion
a group are spread out.                          of the observations in a data set.

BENEFITS AND DRAWBACKS

Advantages And Disadvantages Of Standard Deviation


Main advantages and disadvantages of standard deviation can be expressed as follows:
Advantages/Merits Of Standard Deviation
1. Rigidly Defined
Standard deviation is a rigidly defined measure and its value is always fixed.
2. Best Measure
Standard deviation is based on all the items in the series. So, it is the best measure of dispersion.
3. Less Affected
Standard deviation is less affected by sampling fluctuations than other measures (mean deviation
and quartile deviation).
4. Suitable For Algebraic Operation
Standard deviation can be used for mathematical operations and algebraic treatments. It is also
applicable in statistical analysis.

Disadvantages/Demerits Of Standard Deviation


1. Complex Method
Standard deviation is complex to compute and difficult to understand as compared to other measures
of dispersion.
2. High Effect
Standard deviation is highly affected by the extreme values in the series.
3. Open-End Distributions
Standard deviation cannot be obtained for an open-end class frequency distribution.

Advantages and Disadvantages of Variance


Advantages of Variance
Statisticians use variance to see how individual numbers relate to each other within a data set, rather
than using broader mathematical techniques such as arranging numbers into quartiles. The advantage
of variance is that it treats all deviations from the mean the same regardless of their direction.
Because the squared deviations cannot sum to zero, variance never gives the false appearance of no
variability at all in the data.
Disadvantages of Variance
One drawback to variance, though, is that it gives added weight to outliers. These are the numbers that
are far from the mean. Squaring these numbers can skew the data. Another pitfall of using variance is
that it is not easily interpreted. Users often employ it primarily to take the square root of its value,
which indicates the standard deviation of the data set. As noted above, investors can use standard
deviation to assess how consistent returns are over time.
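The outlier effect is easy to demonstrate (a hypothetical sketch using Python's `statistics` module, with made-up salary figures):

```python
import statistics

# Hypothetical salaries in thousands; then one extreme value is appended.
salaries = [40, 42, 45, 47, 46]
with_outlier = salaries + [250]

print(statistics.pvariance(salaries))      # 6.8
print(statistics.pvariance(with_outlier))  # roughly 5900: the squared outlier dominates
```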

STATISTICAL ANALYSIS
Standard Deviation
Statistical analysis is the collection and interpretation of data in order to uncover patterns and trends.
It is a component of data analysis. Statistical analysis can be used in situations such as gathering
research interpretations, building statistical models, or designing surveys and studies. It can also be
useful for business
intelligence organizations that have to work with large data volumes.
In the context of business intelligence (BI), statistical analysis involves collecting and scrutinizing
every data sample in a set of items from which samples can be drawn. A sample, in statistics, is a
representative selection drawn from a total population.
The goal of statistical analysis is to identify trends. A retail business, for example, might use statistical
analysis to find patterns in unstructured and semi-structured customer data that can be used to create a
more positive customer experience and increase sales.

Steps of statistical analysis


Statistical analysis can be broken down into five discrete steps, as follows:
- Describe the nature of the data to be analyzed.
- Explore the relation of the data to the underlying population.
- Create a model to summarize an understanding of how the data relates to the underlying population.
- Prove (or disprove) the validity of the model.
- Employ predictive analytics to run scenarios that will help guide future actions.

Variance
Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation
procedures (such as the "variation" among and between groups) used to analyze the differences
among means. ANOVA was developed by the statistician Ronald Fisher. The ANOVA is based on
the law of total variance, where the observed variance in a particular variable is partitioned into
components attributable to different sources of variation. In its simplest form, ANOVA provides
a statistical test of whether two or more population means are equal, and therefore generalizes the t-
test beyond two means.
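As a sketch of the idea (pure Python, with made-up data; `f_statistic` is a hypothetical helper that follows the law of total variance by partitioning variation into between-group and within-group components):

```python
import statistics

def f_statistic(*groups):
    """One-way ANOVA F statistic: between-group vs within-group variation."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations
    # Between-group sum of squares: variation of group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: variation of observations inside each group.
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical measurements from three groups; a large F suggests unequal means.
f = f_statistic([6, 8, 4, 5, 3, 4], [8, 12, 9, 11, 6, 8], [13, 9, 11, 8, 7, 12])
print(round(f, 3))  # 9.265
```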

INVESTING

Standard Deviation and Variance in Investing


For traders and analysts, these two concepts are of paramount importance as they are used to measure
security and market volatility, which in turn plays a large role in creating a profitable trading strategy.
Standard deviation is one of the key methods that analysts, portfolio managers, and advisors use to
determine risk. When the group of numbers is closer to the mean, the investment is less risky; when
the group of numbers is further from the mean, the investment is of greater risk to a potential
purchaser.
Securities that are close to their means are seen as less risky, as they are more likely to continue
behaving as such. Securities with large trading ranges that tend to spike or change direction are
riskier. In investing, risk in itself is not a bad thing, as the riskier the security, the greater potential for
a payout.
Variance is neither good nor bad for investors in and of itself. However, high variance in a stock is
associated with higher risk, along with a higher return. Low variance is associated with lower risk and
a lower return. High-variance stocks tend to be good for aggressive investors who are less risk-averse,
while low-variance stocks tend to be good for conservative investors who have less risk tolerance.
Variance is a measurement of the degree of risk in an investment. Risk reflects the chance that an
investment's actual return, or its gain or loss over a specific period, is higher or lower than expected.
There is a possibility that some, or all, of the investment will be lost.
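A toy comparison (hypothetical monthly returns, again using Python's `statistics` module) makes the risk interpretation concrete:

```python
import statistics

# Hypothetical monthly returns (%) for two stocks with the same mean return (1%).
steady = [1.0, 1.2, 0.8, 1.1, 0.9]
volatile = [4.0, -2.0, 3.5, -1.5, 1.0]

print(statistics.stdev(steady))    # low volatility: less risky
print(statistics.stdev(volatile))  # high volatility: riskier, larger potential payout
```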

RESULTS

Standard deviation can be difficult to interpret as a single number on its own. Basically, a small
standard deviation means that the values in a statistical data set are close to the mean of the data set,
on average, and a large standard deviation means that the values in the data set are farther away from
the mean, on average.
A small standard deviation can be a goal in certain situations where the results are restricted, for
example, in product manufacturing and quality control. A particular type of car part that has to be 2
centimeters in diameter to fit properly had better not have a big standard deviation during the
manufacturing process.
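For instance (with hypothetical measurements), a well-controlled process for that 2-centimeter part should show a tiny standard deviation:

```python
import statistics

# Hypothetical diameter measurements (cm) for a part that must be 2 cm.
measured = [2.001, 1.999, 2.000, 2.002, 1.998]
sd = statistics.pstdev(measured)
print(sd)  # about 0.0014 cm: the process is tightly controlled
```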
But in situations where you just observe and record data, a large standard deviation isn’t necessarily a
bad thing; it just reflects a large amount of variation in the group that is being studied. For example, if
you look at salaries for everyone in a certain company, including everyone from the student intern to
the CEO, the standard deviation may be very large. On the other hand, if you narrow the group down
by looking only at the student interns, the standard deviation is smaller, because the individuals within
this group have salaries that are less variable. The second data set isn’t better, it’s just less variable.

CONCLUSIONS
In addition to expressing the variability of a population, the standard deviation is commonly used
to measure confidence in statistical conclusions. Statistical conclusion validity is the degree to
which conclusions about the relationship among variables based on the data are correct or
"reasonable". This began as being solely about whether the statistical conclusion about the
relationship of the variables was correct, but there is now a movement towards "reasonable"
conclusions that draw on quantitative, statistical, and qualitative data. Fundamentally, two
types of errors can occur: type I (finding a difference or correlation when none exists) and type
II (finding no difference or correlation when one exists). Statistical conclusion validity concerns the
qualities of the study that make these types of errors more likely. Statistical conclusion validity
involves ensuring the use of adequate sampling procedures, appropriate statistical tests, and reliable
measurement procedures.
