
Statistics 1

2 marks
1. What are the functions of Statistics?
Statistics helps in providing a better understanding and exact description of a phenomenon of nature.

Statistics helps in proper and efficient planning of a statistical inquiry in any field of study.

Statistics helps in collecting appropriate quantitative data.

Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for an
easy and clear comprehension of the data.

2. List various kinds of data


1. Categorical (Qualitative)
Qualitative data is a categorical measurement expressed not in terms of numbers, but rather by
means of a natural language description. In statistics, it is often used interchangeably with "categorical"
data.
Eg: favorite color = "blue"
height = "tall"
2. Numerical (Quantitative)
Quantitative data is a numerical measurement expressed not by means of a natural language
description, but rather in terms of numbers. However, not all numbers are continuous and measurable.
Eg: molecule length = "450 nm"
height = "1.8 m"
3. Purpose of Computing harmonic mean
The harmonic mean is an average suited to rates and ratios. It is the number of elements
divided by the sum of the reciprocals of the elements. For positive values, the harmonic mean is never
greater than the geometric or arithmetic mean; the three are equal only when all values are equal.
Harmonic Mean = N/(1/a1+1/a2+1/a3+1/a4+.......+1/aN)
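As a minimal sketch of the formula above, the following Python function divides the number of values by the sum of their reciprocals. The function name and the speed example are illustrative assumptions, not part of the original notes, and the sketch assumes all values are positive.

```python
def harmonic_mean(values):
    """Return N / (1/a1 + 1/a2 + ... + 1/aN) for positive numbers."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive numbers")
    return len(values) / sum(1 / v for v in values)


# Two legs of equal distance driven at speeds 40 and 60:
# the average speed over the whole trip is the harmonic mean of the two speeds.
speeds = [40, 60]
hm = harmonic_mean(speeds)
am = sum(speeds) / len(speeds)   # arithmetic mean, for comparison
print(hm, am)
```

In this example the harmonic mean is smaller than the arithmetic mean, which is consistent with the ordering of the means stated above.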
4. Properties of Standard Deviation
1. The standard deviation of the first n natural numbers 1, 2, 3, …, n is
√((n^2 − 1)/12).
2. The variance and consequently standard deviation of a distribution is independent of change of origin.
3. The variance and standard deviation are not independent of change of scale.
4. If all the values of the variable are same, then standard deviation is zero.
5. The standard deviation does not alter when a constant quantity k is added to or subtracted from each
value of the variable of the series.
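The properties above can be checked numerically. This is a small sketch using Python's standard `statistics.pstdev` (population standard deviation); the choices n = 10, k = 7 and a scale factor of 3 are arbitrary assumptions made for illustration.

```python
import math
import statistics

n = 10
data = list(range(1, n + 1))            # first n natural numbers
sd = statistics.pstdev(data)            # population standard deviation

# Property 1: SD of 1..n equals sqrt((n^2 - 1) / 12)
formula_sd = math.sqrt((n ** 2 - 1) / 12)

# Properties 2 and 5: adding a constant k (change of origin) leaves SD unchanged
shifted_sd = statistics.pstdev([x + 7 for x in data])

# Property 3: multiplying by a constant c (change of scale) multiplies SD by |c|
scaled_sd = statistics.pstdev([3 * x for x in data])

# Property 4: identical values give a standard deviation of zero
constant_sd = statistics.pstdev([5, 5, 5, 5])

print(sd, formula_sd, shifted_sd, scaled_sd, constant_sd)
```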

By Yenamanram Ramasubbu Venkataraghavan (Software Developer)


5. Distinguish between mutually exclusive events and independent events
Mutually exclusive events: two events are mutually exclusive when they cannot occur at the same
time, e.g. if we flip a coin it can only show a head OR a tail, not both.

Independent events: two events are independent when the occurrence of one does not affect the
probability of the other, e.g. if we flip a coin two times, the first flip may show a head, and the
second flip is still equally likely to show a head or a tail. The first event does not affect the occurrence
of the next event.

6. Scatter Diagram
Also called: scatter plot, X–Y graph.
The scatter diagram graphs pairs of numerical data, with one variable on each axis, to look for a
relationship between them. If the variables are correlated, the points will fall along a line or curve. The
better the correlation, the tighter the points will hug the line.
7. Two Natures of statistical methods
1. Statistical methods are mathematical formulas, models, and techniques that are used in statistical
analysis of raw research data.
2. The application of statistical methods extracts information from research data and provides different
ways to assess the robustness of research outputs.
8. How to draw the Ogive curves?
First construct a frequency table.
Cumulate the frequencies.
Calculate the percent cumulative frequency.
Next mark off class boundaries along the horizontal axis of the ogive and mark off percent cumulative
frequencies along the vertical axis of the ogive. It is important to plot the percent cumulative frequencies
above the upper class boundary of each class.
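The steps above can be sketched in Python as follows. The frequency table is hypothetical, and the sketch only computes the points to plot; drawing the curve itself is left to any plotting tool.

```python
from itertools import accumulate

# Hypothetical frequency table: upper class boundaries and class frequencies
upper_boundaries = [10, 20, 30, 40, 50]
frequencies = [4, 6, 10, 7, 3]

cumulative = list(accumulate(frequencies))             # cumulate the frequencies
total = cumulative[-1]
percent_cumulative = [100 * c / total for c in cumulative]

# Points to plot: (upper class boundary, percent cumulative frequency)
ogive_points = list(zip(upper_boundaries, percent_cumulative))
for boundary, pct in ogive_points:
    print(f"{boundary:>3}  {pct:6.1f}%")
```

Each percent cumulative frequency is paired with the upper boundary of its class, which is the plotting rule stated above.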
9. What is meant by classification of data?
The process of arranging data into homogeneous groups or classes according to some common
characteristics present in the data is called classification. The types of classification are:
1. One way classification
2. Two way classification
3. Multi way classification
10. What do you mean by stratified random sampling?
Stratified random sampling is a method of sampling in which the population is first divided into smaller,
non-overlapping groups called 'strata'. A random sample is then taken from each stratum, or group.
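A minimal sketch of the idea, assuming proportional allocation (the same sampling fraction in every stratum), which the notes do not specify. The function name, the household population and the 10% fraction are all hypothetical.

```python
import random


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)            # fixed seed so the draw is reproducible
    strata = {}
    for unit in population:              # divide the population into strata
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))   # at least one unit per stratum
        sample.extend(rng.sample(members, k))        # random draw within the stratum
    return sample


# Hypothetical population: 60 "urban" and 40 "rural" households
population = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
sample = stratified_sample(population, stratum_of=lambda u: u[0], fraction=0.1)
print(sample)
```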
11. Distinguish between sampling and non-sampling errors.
Sampling error is the error that arises in a data collection process as a result of taking a sample from a
population rather than using the whole population.
Non-sampling error is the error that arises in a data collection process as a result of factors other than
taking a sample.

12. List out various measures of location
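Mean, Median, Mode. Quartiles and percentiles also describe location. Range, Interquartile Range and Standard deviation are measures of dispersion (spread), not of location.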
13. Define coefficient of variation.
The coefficient of variation (CV) is a statistical measure of the relative dispersion of data points in a data series
around the mean. It is calculated as follows: CV = (standard deviation) / (mean), and it is often expressed as a percentage.
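A short sketch of the calculation in Python. It assumes the population standard deviation (`statistics.pstdev`); the function name and the data are illustrative only.

```python
import statistics


def coefficient_of_variation(values):
    """Return standard deviation / mean (population SD is assumed here)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is zero")
    return statistics.pstdev(values) / mean


data = [2, 4, 4, 4, 5, 5, 7, 9]       # mean 5, population SD 2
cv = coefficient_of_variation(data)
print(f"CV = {cv:.2f}")
```

Because the result is a ratio, it has no units, which is what makes it useful for comparing the spread of series measured on different scales.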
14. What is meant by skewness?
Skewness is asymmetry in a statistical distribution, in which the curve appears distorted or skewed either
to the left or to the right. Skewness can be quantified to define the extent to which a distribution differs
from a normal distribution.
15. What is conditional probability?
The probability of an event occurring, given that another event has already occurred, is called a
conditional probability.
The probability that event B occurs, given that event A has already occurred (with P(A) > 0), is

P(B|A) = P(A and B) / P(A)
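As a worked illustration of the formula, the sketch below uses one roll of a fair six-sided die. The die and the two events are assumptions chosen for the example, not part of the original notes.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die
outcomes = set(range(1, 7))
A = {x for x in outcomes if x % 2 == 0}   # event A: roll is even -> {2, 4, 6}
B = {x for x in outcomes if x > 3}        # event B: roll is greater than 3 -> {4, 5, 6}


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


p_b_given_a = prob(A & B) / prob(A)       # P(B|A) = P(A and B) / P(A)
print(p_b_given_a)                        # 2/3
```

Knowing that the roll is even raises the probability of "greater than 3" from 1/2 to 2/3, so these two events are not independent.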

16. State any two uses of regression equation.


Regression equations can help you figure out whether your data can be fit to an equation.
They can be used to predict future values, or to estimate past behavior, from your data.
17. State any two limitations of statistics.
1. Statistical laws are true only on average. Statistics deals with aggregates of facts, so a single observation is not a
statistic; it deals with groups and aggregates only.
2. Statistical methods are best applicable to quantitative data.
3. Statistics cannot be applied to heterogeneous data.
18. Define tabulation of data.
The process of placing classified data into tabular form is known as tabulation. A table is a systematic
arrangement of statistical data in rows and columns. Rows are horizontal arrangements whereas columns
are vertical arrangements. It may be simple, double or complex depending upon the type of
classification.
19. What is the importance of Lorenz curve?
A Lorenz curve is a geometric representation of the distribution of income among families in a given
country at a given time. It plots the cumulative percentage of families, arranged from poorest to
richest, on the horizontal axis against the cumulative percentage of family income on the vertical axis. Lorenz curves are
most useful in visual comparisons of income distribution over time and between countries.
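A minimal sketch of how the points of a Lorenz curve could be computed. The five family incomes are hypothetical, and the curve is assumed to start at the origin (0% of families, 0% of income).

```python
from itertools import accumulate

# Hypothetical family incomes, arranged from poorest to richest
incomes = sorted([10, 20, 30, 40, 100])
total_income = sum(incomes)
n = len(incomes)

# Cumulative percentage of families (horizontal axis)
cum_families = [100 * (i + 1) / n for i in range(n)]
# Cumulative percentage of income (vertical axis)
cum_income = [100 * c / total_income for c in accumulate(incomes)]

lorenz_points = [(0.0, 0.0)] + list(zip(cum_families, cum_income))
for families_pct, income_pct in lorenz_points:
    print(f"{families_pct:5.1f}% of families hold {income_pct:5.1f}% of income")
```

With perfectly equal incomes every point would lie on the diagonal; here the poorest 20% of families hold only 5% of income, so the curve sags below it.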
20. Define simple random sampling.
Simple random sampling is the basic sampling technique where we select a group of subjects (a sample)
for study from a larger group (a population). Each individual is chosen entirely by chance and each
member of the population has an equal chance of being included in the sample.
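In Python this kind of selection can be sketched with the standard library's `random.sample`, which draws without replacement so that every unit has the same chance of inclusion. The population of 100 numbered units, the sample size and the seed are assumptions for illustration.

```python
import random

population = list(range(1, 101))      # hypothetical population of 100 numbered units
rng = random.Random(42)               # fixed seed so the draw is reproducible
sample = rng.sample(population, 10)   # 10 units chosen entirely by chance, no repeats
print(sample)
```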

21. Define Harmonic mean.
The harmonic mean can be expressed as the reciprocal of the arithmetic mean of the reciprocals. As a
simple example, the harmonic mean of 1, 2, and 4 is 3 / (1/1 + 1/2 + 1/4) = 3 / 1.75 = 12/7 ≈ 1.714.

22. What is linear prediction?


A linear predictor function is a linear function of a set of coefficients and explanatory variables
(independent variables), whose value is used to predict the outcome of a dependent variable.
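As a rough sketch, a linear predictor can be written as an intercept plus a weighted sum of the explanatory variables. The function name, the inclusion of an intercept term and the coefficient values below are hypothetical; in practice the coefficients would be estimated from data.

```python
def linear_predictor(intercept, coefficients, x):
    """Return intercept + sum(coefficient_i * x_i)."""
    if len(coefficients) != len(x):
        raise ValueError("need exactly one coefficient per explanatory variable")
    return intercept + sum(b * xi for b, xi in zip(coefficients, x))


# Hypothetical model: y = 2.0 + 0.5 * x1 - 1.0 * x2
prediction = linear_predictor(2.0, [0.5, -1.0], [4.0, 3.0])
print(prediction)   # 1.0
```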

23. What is a frequency curve?


A frequency curve is obtained by joining the points of a frequency polygon with a freehand smoothed curve.
Unlike a frequency polygon, where the points are joined by straight lines, the points are joined
freehand in order to get a smoothed frequency curve.
24. Define a population.
A population is any complete group with at least one characteristic in common. Populations are not just
people. Populations may consist of, but are not limited to, people, animals, businesses, buildings, motor
vehicles, farms, objects or events.
25. Define Mode
The mode is the value that occurs most frequently in a set of numbers. The
mode is found by collecting and organizing the data in order to count the frequency of each result. The
result with the highest occurrences is the mode of the set.
26. Define Standard Deviation
The standard deviation (SD) is a measure that is used to quantify the amount of variation or
dispersion of a set of data values.
σ = sqrt[ Σ (Xi - X̄)^2 / N ]
where Xi is each data value, X̄ is the mean of the values and N is the number of values.
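The formula can be followed step by step in a few lines of Python. This is a sketch of the population form (dividing by N); the function name and the data set are illustrative.

```python
import math


def population_sd(values):
    """Square root of the mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)


print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))   # 2.0
```

For a sample rather than a whole population, the sum of squared deviations is commonly divided by N - 1 instead of N.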

27. What is probability?


The probability of an event refers to the likelihood that the event will occur.
The probability that an event will occur is expressed as a number between 0 and 1. Notationally, the
probability of event A is represented by P(A).
28. State addition theorem of probability.
The addition theorem of probability gives the probability that
either event ‘A’ or event ‘B’ occurs, or both occur. The event "A or B" is written with the
symbol '∪', pronounced "union".
The theorem is generally written in set notation as
P (A ∪ B) = P(A) + P(B) – P(A ∩ B)
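The theorem can be verified by counting outcomes. The sketch below assumes a standard 52-card deck with one card drawn at random; the two events (drawing a heart, drawing a face card) are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))                           # 52 equally likely cards

A = {card for card in deck if card[1] == "hearts"}          # event A: a heart
B = {card for card in deck if card[0] in {"J", "Q", "K"}}   # event B: a face card


def prob(event):
    """Probability of an event when all cards are equally likely."""
    return Fraction(len(event), len(deck))


lhs = prob(A | B)                         # P(A ∪ B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)     # P(A) + P(B) - P(A ∩ B)
print(lhs, rhs)
```

Subtracting P(A ∩ B) removes the three face cards of hearts, which would otherwise be counted twice.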
29. Define Correlation
Correlation is a statistical measure that indicates the extent to which two or more variables fluctuate
together. A positive correlation indicates the extent to which those variables increase or decrease in
parallel; a negative correlation indicates the extent to which one variable increases as the other decreases.
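One common way to quantify this is Pearson's correlation coefficient r, which ranges from -1 to +1. The notes do not name a specific measure, so the choice of Pearson's r here is an assumption, and the data are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]      # increases with x: positive correlation
falling = [10, 8, 6, 4, 2]     # decreases as x increases: negative correlation
print(pearson_r(x, rising), pearson_r(x, falling))
```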

30. What is Regression?
Regression is a statistical technique used in finance, investing and other disciplines that attempts to
determine the strength of the relationship between one dependent variable (usually denoted by Y) and a
series of other changing variables (known as independent variables).
31. What are the methods of sampling?
Simple random sampling
Systematic sampling
Stratified sampling
Cluster sampling
Quota sampling
Accidental sampling

32. Write the formula for Karl Pearson’s measures of skewness


Pearson’s Coefficient of Skewness #1 uses the mode. The formula is:
Sk1 = (X̄ - Mo) / s
where X̄ = the mean, Mo = the mode and s = the standard deviation for the sample.
Pearson’s Coefficient of Skewness #2 uses the median. The formula is:
Sk2 = 3(X̄ - Md) / s
where X̄ = the mean, Md = the median and s = the standard deviation for the sample.
It is generally used when you don’t know the mode.
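A small sketch of both coefficients in Python, assuming the standard forms Sk1 = (mean - mode) / s and Sk2 = 3(mean - median) / s with s taken as the sample standard deviation. The function names and the data set are hypothetical.

```python
import statistics


def pearson_skewness_mode(values):
    """Pearson's first coefficient: (mean - mode) / s."""
    s = statistics.stdev(values)          # sample standard deviation
    return (statistics.mean(values) - statistics.mode(values)) / s


def pearson_skewness_median(values):
    """Pearson's second coefficient: 3 * (mean - median) / s."""
    s = statistics.stdev(values)
    return 3 * (statistics.mean(values) - statistics.median(values)) / s


data = [1, 2, 2, 3, 4, 5, 11]             # mean 4, median 3, mode 2
sk1 = pearson_skewness_mode(data)
sk2 = pearson_skewness_median(data)
print(round(sk1, 3), round(sk2, 3))
```

Both values are positive for this data set because the single large value 11 pulls the mean above the median and the mode, indicating a skew to the right.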
