0334-5515779,0344,5515779,0345-7308411
Course: Educational Statistics (8674)
Level: B.Ed (1.5 Years) Semester: Autumn, 2020
ASSIGNMENT No. 1
(Units: 1-5)
Q.1 Describe level of measurement. Give five examples of each level and explain the role of level of measurement in decision-making.

ANS: Levels of Measurement

The level of measurement refers to the relationship among the values that are assigned to the attributes for a variable. What does that mean? Begin with the idea of the variable, in this example "party affiliation." That variable has a number of attributes. Let's assume that in this particular election context the only relevant attributes are "republican", "democrat", and "independent". For purposes of analyzing the results of this variable, we arbitrarily assign the values 1, 2 and 3 to the three attributes. The level of measurement describes the relationship among these three values. In this case, we simply are using the numbers as shorter placeholders for the lengthier text terms. We don't assume that higher values mean "more" of something and lower numbers signify "less". We don't assume that the value of 2 means that democrats are twice something that republicans are. We don't assume that republicans are in first place or have the highest priority just because they have the value of 1. In this case, we only use the values as a shorter name for the attribute. Here, we would describe the level of measurement as "nominal".

Why is Level of Measurement Important?

First, knowing the level of measurement helps you decide how to interpret the data from that variable. When you know that a measure is nominal (like the one just described), then you know that the numerical values are just short codes for the longer names. Second, knowing the level of measurement helps you decide what statistical analysis is appropriate on the values that were assigned. If a measure is nominal, then you know that you would never average the data values or do a t-test on the data.

There are typically four levels of measurement that are defined: Nominal, Ordinal, Interval and Ratio.

In nominal measurement the numerical values just "name" the attribute uniquely. No ordering of the cases is implied. For example, jersey numbers in basketball are measures at the nominal level. A player with number 30 is not more of anything than a player with number 15, and is certainly not twice whatever number 15 is.

In ordinal measurement the attributes can be rank-ordered. Here, distances between attributes do not have any meaning. For example, on a survey you might code Educational Attainment as 0=less than high school; 1=some high school; 2=high school degree; 3=some college; 4=college degree; 5=post
college. In this measure, higher numbers mean more education. But is the distance from 0 to 1 the same as from 3 to 4? Of course not. The interval between values is not interpretable in an ordinal measure.

In interval measurement the distance between attributes does have meaning. For example, when we measure temperature (in Fahrenheit), the distance from 30-40 is the same as the distance from 70-80. The interval between values is interpretable. Because of this, it makes sense to compute an average of an interval variable, where it doesn't make sense to do so for ordinal scales. But note that in interval measurement ratios don't make any sense: 80 degrees is not twice as hot as 40 degrees (although the attribute value is twice as large).

Finally, in ratio measurement there is always an absolute zero that is meaningful. This means that you can construct a meaningful fraction (or ratio) with a ratio variable. Weight is a ratio variable. In applied social research most "count" variables are ratio, for example, the number of clients in the past six months. Why? Because you can have zero clients and because it is meaningful to say that "...we had twice as many clients in the past six months as we did in the previous six months."

It's important to recognize that there is a hierarchy implied in the level of measurement idea. At lower levels of measurement, assumptions tend to be less restrictive and data analyses tend to be less sensitive. At each level up the hierarchy, the current level includes all of the qualities of the one below it and adds something new. In general, it is desirable to have a higher level of measurement (e.g., interval or ratio) rather than a lower one (nominal or ordinal).

Data will always take a back seat in the drive towards better decision-making. As companies spend
increasing amounts of time and effort in capturing organizational and market data, the return on these investments will depend on our ability to transform the data into impactful decisions. Data alone doesn't produce pertinent decisions, for decision-making is constantly handicapped by uncertainty, ambiguity, and complexity. Measuring the quality of our decision-making may well prove more important than improving the quality of our data. Let's look both at why measuring decision-making is so difficult, and why it is so potentially rewarding.

One of the obstacles in improving decision-making today comes from the diversity of managerial challenges. On an "objective" level, we refer to a problem as simple when the data at hand is sufficient to identify the best way forward, and complex when the data provides nothing more than the best answer in a given context. On a "subjective" level, a manager gauges risk when he or she understands the probabilities and outcomes of the choices before them. A manager confronts uncertainty when for one reason or another the probabilities and outcomes cannot be precisely determined. The factor of "ambiguity" weighs into the equation when the decision-maker questions the clarity of the problem itself. As a result, the goal of management is rarely about finding the right answer, and largely about helping managers and customers take better decisions through reducing the sources of risk, uncertainty, and ambiguity.

Measuring decision-making is complicated by the fact that the "best" choice depends as much on the manager's state of mind as on the nature of the problem itself. In Decision Science, we refer to four mindsets that condition human decision-making. [i] The optimist, like the lotto player, is always betting on the largest payoff possible regardless of the probabilities. Inversely, visions of the worst haunt the pessimist, who will religiously try to minimize potential losses.
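To make the hierarchy of levels described in this answer more concrete, here is a minimal Python sketch. The names `LEVELS`, `NEW_AT_LEVEL` and `allowed_summaries` are hypothetical helpers invented for this illustration, and the small data lists are made up; only the coding schemes (1/2/3 for party, 0 to 5 for education) come from the text.

```python
from statistics import median, mode

# Levels from lowest to highest; each level keeps what the lower ones allow
# and adds something new (hypothetical summary labels for this sketch).
LEVELS = ["nominal", "ordinal", "interval", "ratio"]
NEW_AT_LEVEL = {
    "nominal": ["mode"],
    "ordinal": ["median"],
    "interval": ["mean"],
    "ratio": ["ratio of two values"],
}


def allowed_summaries(level):
    """Return the summaries that are meaningful at the given level."""
    idx = LEVELS.index(level)
    allowed = []
    for lower in LEVELS[: idx + 1]:
        allowed.extend(NEW_AT_LEVEL[lower])
    return allowed


# Made-up illustrative data.
party_codes = [1, 2, 2, 3, 2, 1]         # nominal: 1=republican, 2=democrat, 3=independent
education_codes = [0, 1, 2, 2, 3, 5, 4]  # ordinal: 0=less than high school ... 5=post college
clients = [12, 24]                       # ratio: client counts in two periods

print(allowed_summaries("nominal"))      # only the mode is meaningful
print(mode(party_codes))                 # most common party code
print(median(education_codes))           # middle education category
print(clients[1] / clients[0])           # "twice as many clients" is meaningful
```

The point of the sketch is that averaging `party_codes` would run without error but would mean nothing, which is why the level of measurement has to guide the choice of analysis.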
PRIMARY DATA
A common example of primary data is the data collected by organizations during market research, product research, and competitive analysis. This data is collected directly from its original source, which in most cases is the existing and potential customers. Most of those who collect primary data are government-authorized agencies, investigators, research-based private institutions, etc.
SECONDARY DATA
For example, when conducting a research thesis, researchers need to consult past works done in this field and add findings to the literature review. Other things, like definitions and theorems, are secondary data that are added to the thesis and must be properly referenced and cited. Some common sources of secondary data include trade publications, government statistics, journals, etc. In most cases, these sources cannot be trusted as authentic.
Pros: Secondary data is easily accessible compared to primary data. It is available on different platforms that can be accessed by the researcher.
Scatter Plot
A Scatter Plot is a straightforward yet powerful tool for visualizing data, including for those who are new to the field of statistics and data science. So what is a Scatter Plot? "A Scatter Plot is a graphical tool for visualizing the relation between two different variables of the same or different data groups, by plotting the data values along a two-dimensional Cartesian system." The above definition will become more precise with the Scatter Graph below. Scatter Plots are also known as Scatter Charts or Scatter Graphs.
[Figure: scatter plot of tree height (in meters) against tree diameter (in centimeters)]
The above graph is made with two different variables: diameter (in centimeters) and height (in meters) for a group of trees. The horizontal X-axis depicts the diameter and the vertical Y-axis represents the height, with each dot specifying a tree. We can derive various correlations between the variables using such plots.
Correlation & Correlation coefficient: The term correlation is defined as the nature of the relationship between two variables (in this case, discrete variables) in any statistical study or survey. A correlation coefficient is a statistical measure of the extent or degree of this correlation. Positive, negative, and no correlation are the three types. Thus one can say that a correlation coefficient will be positive, negative, or 0. We will look into these shortly.

Line of Best Fit: The Line of Best Fit is drawn according to previously collected data and is used to predict the ideal correlation between two given variables. It acts as a reference while plotting a Scatter Graph.

Types of correlations:
A. Positive Correlation: When the value of the dependent variable increases with an increase in the value of the independent variable, we say there is a positive correlation between the two.
B. Negative Correlation: When the value of the dependent variable decreases with an increase in the value of the independent variable and vice-versa, we say that the two variables have a negative correlation.
C. No Correlation: In case we don't find any apparent relationship between the variables under study, we say there is no correlation between them.

Scatter Plot Examples

Eg. I: Positive Correlation
Problem: To find the relation between electricity bill and temperature.
Solution: The data is gathered and tabulated, and the values are plotted in a Scatter Chart as follows:
[Figure: "Electricity Bill vs Temperature" scatter chart; the bill axis runs from 0.00 to 8,000.00]
From the above Scatter Plot, we can see that the electricity bill is lower when the temperature is comparatively lower, and that it rises with a rise in temperature. Other factors are involved as well, so the relation is not perfectly linear. Still, we can infer that there is a positive correlation between the rise in temperature and electricity bills.

Eg. II: Negative Correlation
Problem: To find the relation between age and hours of sleep needed.
Solution: Once again, the data is gathered after a survey, and a Scatter Graph is created as follows:
[Figure: scatter graph of Hours (of sleep needed) against Age]
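The two examples above can also be checked numerically. The following is a minimal sketch of the Pearson correlation coefficient in plain Python. The function names `pearson_r` and `direction` are hypothetical, and the temperature, bill, age and sleep figures are made-up values chosen only to mimic the shape of the two examples; they are not the data behind the original charts.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def direction(r, tol=1e-9):
    """Classify the sign of a correlation coefficient."""
    if r > tol:
        return "positive"
    if r < -tol:
        return "negative"
    return "none"


# Made-up illustrative data (assumptions, not taken from the charts).
temperature = [20, 24, 28, 32, 36, 40]
bill = [1500, 2100, 3200, 4100, 5600, 7200]
age = [1, 5, 10, 20, 40, 60]
sleep_hours = [14, 11, 10, 8, 7.5, 7]

print(direction(pearson_r(temperature, bill)))   # bill rises with temperature
print(direction(pearson_r(age, sleep_hours)))    # sleep falls as age rises
```

A coefficient close to +1 or -1 means the dots lie close to a straight line, while a value near 0 means no linear pattern is visible in the scatter plot.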
How to create a Scatter Plot with Edraw Max Online?
Nowadays, creating a Scatter Chart has become very easy. You no longer need to do it with pen and paper, even though that is how we learn. At a professional level, the best results are seen when you use a diagramming tool like Edraw Max Online to create Scatter Plots. It is a great tool to have in your inventory. Moreover, being an online tool, you don't need to download it on your computer.

Before drawing a Scatter Graph you need to understand the different correlations and correlation coefficients as described above:
"+1" means perfect positive linear correlation;
"0" means no linear correlation;
"-1" means perfect negative linear correlation;
If the value of the coefficient is 0<x<+1, then there is a positive correlation but not a perfectly linear one;
If the value of the coefficient is -1<x<0, then there is a negative correlation but not a perfectly linear one.

Secondly, get familiar with the interface of Edraw Max Online. With that taken care of, let us see how we can create a Scatter Plot using Edraw Max Online:
Step 1: In your web browser, open the home page and log in with your credentials.
Step 2: From the "Graphs & Chart" menu, select the "Scatter" option, and a drawing window opens.
In this window, you can create your wiring diagram by choosing different wiring diagram symbols from the symbol library. There are various symbols available, such as transmission paths, qualifying symbols, semiconductor devices, switches and relays, and other necessary electrical symbols.
Q.4 Explain normal distribution. How does normality of data affect the analysis of data?

ANS: What is Normal Distribution?
Normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean. In graph form, normal distribution will appear as a bell curve.

Understanding Normal Distribution
The normal distribution is the most common type of distribution assumed in technical stock market analysis and in other types of statistical analyses. The normal distribution has two parameters: the mean and the standard deviation (the standard normal distribution is the special case with mean 0 and standard deviation 1). For a normal distribution, about 68% of the observations are within +/- one standard deviation of the mean, 95% are within +/- two standard deviations, and 99.7% are within +/- three standard deviations.
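The 68%, 95% and 99.7% figures can be reproduced from the normal distribution itself. The sketch below uses the standard identity that, for a normal variable, the probability of falling within k standard deviations of the mean is erf(k/√2); the helper name `share_within` is hypothetical.

```python
from math import erf, sqrt


def share_within(k):
    """Probability that a normal variable lies within k SDs of its mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within +/- {k} SD: {share_within(k):.1%}")
```

Run as written, this gives roughly 68.3%, 95.4% and 99.7%, which agrees with the rounded values quoted above.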
Table 1: Mean arterial pressure (MAP) and sex of 15 patients
Patient no.:  1   2   3   4   5   6   7   8   9   10   11   12   13   14   15
MAP (mmHg):   82  84  85  88  92  93  94  95  98  100  102  107  110  116  116
Sex:          M   F   F   M   M   F   F   M   M   F    M    F    M    F    M
MAP: Mean arterial pressure, M: Male, F: Female

Descriptive Statistics
There are three major types of descriptive statistics: measures of frequency (frequency, percent), measures of central tendency (mean, median and mode), and measures of dispersion or variation (variance, SD, standard error, quartile, interquartile range, percentile, range, and coefficient of variation), which provide simple summaries about the sample and the measures. A measure of frequency is usually used for categorical data while the others are used for quantitative data.

Measures of Frequency
Frequency statistics simply count the number of times that each value of a variable occurs, such as the number of males and females within the sample or population. Frequency analysis is an important area of statistics that deals with the number of occurrences (frequency) and percentage. For example, according to Table 1, out of the 15 patients, the frequencies of males and females were 8 (53.3%) and 7 (46.7%), respectively.
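As a small illustration of a frequency count, the sketch below tallies the Sex row of Table 1 with Python's `collections.Counter`. The variable names are chosen for this example only.

```python
from collections import Counter

# Sex of the 15 patients, in the order given in Table 1.
sex = list("MFFMMFFMMFMFMFM")

counts = Counter(sex)
n = len(sex)

for category in ("M", "F"):
    freq = counts[category]
    percent = 100 * freq / n
    print(f"{category}: {freq} ({percent:.1f}%)")
```

This reproduces the 8 (53.3%) males and 7 (46.7%) females reported in the text.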
For example, in the above, SD is 11.01 mmHg (n<30), which shows that the approximate average deviation between the mean value and individual values is 11.01. Similarly, variance is 121.22 [i.e., (11.01)^2], which shows that the average squared deviation between the mean value and individual values is 121.22 [Table 2].

Standard error
Standard error is the approximate difference between the sample mean and the population mean. When we draw many samples from the same population with the same sample size through a random sampling technique, the SD among the sample means is called the standard error. If the sample SD and sample size are given, we can calculate the standard error for this sample by using the formula: Standard error = sample SD / √(sample size). For example, the standard error is 2.84 mmHg, which shows that the average difference between sample means and the population mean is 2.84 mmHg [Table 2].
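The SD, variance and standard error quoted above can be recomputed with Python's `statistics` module. This is a sketch under one assumption: the MAP values are those of the 15 patients in Table 1, with the fourth value taken as 88 because the quartile example in the text identifies the 4th observation as 88 mmHg.

```python
from math import sqrt
from statistics import mean, stdev, variance

# MAP (mmHg) of the 15 patients; the fourth value (88) is inferred from
# the first-quartile example in the text.
map_mmhg = [82, 84, 85, 88, 92, 93, 94, 95, 98, 100, 102, 107, 110, 116, 116]

n = len(map_mmhg)
avg = mean(map_mmhg)
sd = stdev(map_mmhg)        # sample SD (divides by n - 1)
var = variance(map_mmhg)    # sample variance (divides by n - 1)
se = sd / sqrt(n)           # standard error = sample SD / sqrt(sample size)

print(f"mean = {avg:.2f} mmHg")
print(f"SD = {sd:.2f} mmHg")
print(f"variance = {var:.2f}")
print(f"standard error = {se:.2f} mmHg")
```

The SD and standard error round to 11.01 and 2.84 mmHg as in the text. The variance computed directly from the data comes out slightly below 121.22, because the text obtains 121.22 by squaring the already rounded SD of 11.01.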
Quartiles and interquartile range
The quartiles are the three points that divide the data set into four equal groups, each group comprising a quarter of the data, for a set of data values arranged in either ascending or descending order. Q1, Q2, and Q3 represent the first, second, and third quartile values.

ith quartile = [i × (n + 1)/4]th observation, where i = 1, 2, 3.

For example, in the above, first quartile (Q1) = (n + 1)/4 = 4th observation from initial = 88 mmHg (i.e., the first 25% of the observations are ≤88 and the remaining 75% are ≥88). Q2 (also called median) = [2 × (n + 1)/4] = 8th observation from initial = 95 mmHg, that is, the first 50% of the observations are ≤95 and the remaining 50% are ≥95. Similarly, Q3 = [3 × (n + 1)/4] = 12th observation from initial = 107 mmHg, that is, the first 75% of the observations are ≤107 and the remaining 25% are ≥107.

The interquartile range (IQR) is a measure of variability, also called the midspread or middle 50%. It is a measure of statistical dispersion equal to the difference between the 75th (Q3 or third quartile) and 25th (Q1 or first quartile) percentiles. For example, in the above, the three quartiles Q1, Q2, and Q3 are 88, 95, and 107, respectively. As the first and third quartiles of the data are 88 and 107, the IQR of the data is 19 mmHg (it can also be written as 88-107) [Table 2].
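The quartile rule above can be sketched in a few lines of Python. The function names `value_at_position` and `quartile` are hypothetical, and the MAP list assumes the Table 1 values with the fourth observation equal to 88, as stated in the first-quartile example.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position (linear interpolation)."""
    if position < 1 or position > len(sorted_values):
        raise ValueError("position falls outside the data")
    lower = int(position)        # whole part of the position
    frac = position - lower      # fractional part of the position
    if frac == 0:
        return sorted_values[lower - 1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + frac * (above - below)


def quartile(values, i):
    """i-th quartile using the [i * (n + 1) / 4]th observation rule."""
    data = sorted(values)
    position = i * (len(data) + 1) / 4
    return value_at_position(data, position)


# MAP (mmHg) of the 15 patients (fourth value inferred from the text).
map_mmhg = [82, 84, 85, 88, 92, 93, 94, 95, 98, 100, 102, 107, 110, 116, 116]

q1, q2, q3 = (quartile(map_mmhg, i) for i in (1, 2, 3))
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

With n = 15 the positions are the 4th, 8th and 12th observations, so the sketch returns 88, 95 and 107 and an IQR of 19 mmHg, matching the worked example.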
Percentile
The percentiles are the 99 points that divide the data set into 100 equal groups, each group comprising 1% of the data, for a set of data values arranged in either ascending or descending order. The 25th percentile is the first quartile, the 50th percentile is the second quartile (also called the median), and the 75th percentile is the third quartile of the data.

ith percentile = [i × (n + 1)/100]th observation, where i = 1, 2, 3, ..., 99.

Example: In the above, 10th percentile = [10 × (n + 1)/100] = 1.6th observation from initial, which falls between the first and second observations from the initial = 1st observation + 0.6 × (difference between the second and first observation) = 83.20 mmHg, which indicates that 10% of the data are ≤83.20 and the remaining 90% of the observations are ≥83.20.

Coefficient of Variation
Interpretation of SD without considering the magnitude of the mean of the sample or population may be misleading. To overcome this problem, the coefficient of variation (CV) expresses the SD as a ratio of the mean value, in %: CV = 100 × (SD/mean). For example, in the above, the coefficient of variation is 11.3%, which indicates that SD is 11.3% of its mean value [i.e., 100 × (11.01/97.47)] [Table 2].

Range
The difference between the largest and smallest observation is called the range. If A and B are the smallest and largest observations in a data set, then the range R is equal to the difference of the largest and smallest observation, that is, R = B - A. For example, in the above, the minimum and maximum observations in the data are 82 mmHg and 116 mmHg. Hence, the range of the data is 34 mmHg (it can also be written as 82-116) [Table 2].

Descriptive statistics can be calculated in the statistical software "SPSS" (Analyze → Descriptive Statistics → Frequencies or Descriptives).

Normality of data and testing
The normal distribution is the most important continuous probability distribution. It has a bell-shaped density curve described by its mean and SD, and extreme values in the data set have no significant impact on the mean value. If continuous data follow a normal distribution, then 68.2%, 95.4%, and 99.7% of observations lie between mean ± 1 SD, mean ± 2 SD, and mean ± 3 SD, respectively.

Why test the normality of data?
Various statistical methods used for data analysis make assumptions about normality, including correlation, regression, t-tests, and analysis of variance. The central limit theorem states that when the sample size has 100 or more observations, violation of normality is not a major issue. For meaningful conclusions, however, the assumption of normality should be followed irrespective of the sample size. If continuous data follow a normal distribution, then we present
[Table: normality test results with Value, SE and Z columns, the K-S test with Lilliefors correction, and the Shapiro-Wilk test; the values are not recoverable]
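Returning to the percentile, coefficient of variation and range examples earlier in this answer, the following sketch recomputes them in Python. The helper name `percentile` is hypothetical, and the MAP list assumes the Table 1 values with the fourth observation equal to 88, as given in the first-quartile example.

```python
from statistics import mean, stdev


def percentile(values, i):
    """i-th percentile using the [i * (n + 1) / 100]th observation rule."""
    data = sorted(values)
    position = i * (len(data) + 1) / 100
    if position < 1 or position > len(data):
        raise ValueError("position falls outside the data")
    lower = int(position)        # whole part of the position
    frac = position - lower      # fractional part of the position
    if frac == 0:
        return data[lower - 1]
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])


# MAP (mmHg) of the 15 patients (fourth value inferred from the text).
map_mmhg = [82, 84, 85, 88, 92, 93, 94, 95, 98, 100, 102, 107, 110, 116, 116]

p10 = percentile(map_mmhg, 10)                 # 1.6th observation
cv = 100 * stdev(map_mmhg) / mean(map_mmhg)    # CV = 100 * (SD / mean)
data_range = max(map_mmhg) - min(map_mmhg)     # largest minus smallest

print(f"10th percentile = {p10:.2f} mmHg")
print(f"CV = {cv:.1f}%")
print(f"range = {data_range} mmHg")
```

The results agree with the worked values in the text: a 10th percentile of 83.20 mmHg, a CV of 11.3% and a range of 34 mmHg.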
Q.5 How is mean different from median? Explain the role of level of measurement in
measure of central tendency.
Mean: It defines the central value of the data set.
Median: It defines the centre of gravity of the midpoint of the data set.
Thus, these are the major differences between Mean and Median. It is essential to know the major differences between the two.
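One practical way to see the difference between the two measures is to add a single extreme value to a small data set. The sketch below uses made-up scores (an assumption for illustration, not data from the assignment) and Python's `statistics` module.

```python
from statistics import mean, median

# Made-up illustrative scores.
scores = [20, 22, 25, 27, 30]
scores_with_outlier = scores + [300]

print(mean(scores), median(scores))
print(mean(scores_with_outlier), median(scores_with_outlier))
```

In this sketch the mean jumps from 24.8 to about 70.7 when the extreme value is added, while the median only moves from 25 to 26. This is why the median is often preferred for skewed data and the mean for roughly symmetric data.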
Part b: Measures of central tendency tell us what is common or typical about our variable.
Three measures of central tendency are the mode, the median and the mean. The mode is
used