Data Type

CHAP. 1] INTRODUCTION
EXAMPLE 1.12 Table 1.6 gives several qualitative variables and a set of possible nominal level data values.
The data values are often encoded for recording in a computer data file. Blood type might be recorded as 1, 2, 3,
or 4; state of residence might be recorded as 1, 2, . . . , or 50; and type of crime might be recorded as 0 or 1, or 1
or 2, etc. Similarly, color of road sign could be recorded as 1, 2, 3, 4, or 5 and religion could be recorded as 1,
2, or 3. There is no order associated with these data and arithmetic operations are not performed. For example,
adding Christian and Moslem (1 + 2) does not give other (3).
Table 1.6

Qualitative variable                            Possible nominal level data values associated with the variable
Blood type                                      A, B, AB, O
State of residence                              Alabama, . . . , Wyoming
Type of crime                                   Misdemeanor, felony
Color of road signs in the state of Nebraska    Red, white, blue, brown, green
Religion                                        Christian, Moslem, other
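The point of Example 1.12, that nominal codes are labels rather than quantities, can be sketched in Python. The codes for religion (1, 2, 3) follow the example; the dictionary name and the sample of recorded values are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Codes from Example 1.12; the dictionary name is an illustrative choice.
RELIGION_CODES = {1: "Christian", 2: "Moslem", 3: "other"}

# A hypothetical sample of recorded codes (not data from the text).
recorded = [1, 3, 2, 1, 1, 2]

# Counting how often each category occurs is meaningful for nominal data.
counts = Counter(RELIGION_CODES[code] for code in recorded)

# Arithmetic on the codes runs without error, but the result has no
# interpretation: 1 + 2 == 3 does not mean Christian + Moslem = other.
meaningless_sum = 1 + 2

print(counts)
```

Frequency counts are the natural summary here; a sum or an average of the codes would depend entirely on how the labels happened to be numbered.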
The ordinal scale applies to data that can be arranged in some order, but differences between
data values either cannot be determined or are meaningless. The ordinal level of measurement is
characterized by data that applies to categories that can be ranked. Ordinal scale data can be
arranged in an ordering scheme.
EXAMPLE 1.13 Table 1.7 gives several qualitative variables and a set of possible ordinal level data values.
The data values for ordinal level data are often encoded for inclusion in computer data files. Arithmetic
operations are not performed on ordinal level data, but an ordering scheme exists. A full-size automobile is
larger than a subcompact, a tire rated excellent is better than one rated poor, no pain is preferable to any level
of pain, the level of play in major league baseball is better than the level of play in class AA, and so forth.
Table 1.7

Qualitative variable              Possible ordinal level data values associated with the variable
Automobile size description       Subcompact, compact, intermediate, full-size
Product rating                    Poor, good, excellent
Socioeconomic class               Lower, middle, upper
Pain level                        None, low, moderate, severe
Baseball team classification      Class A, class AA, class AAA, major league
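A minimal Python sketch of how ordinal values might be handled is given below. The pain levels and their order come from Table 1.7; the list of responses and the helper names are hypothetical.

```python
# Pain levels from Table 1.7, listed from lowest to highest.
PAIN_LEVELS = ["none", "low", "moderate", "severe"]

# Map each label to its rank (its position in the ordering).
rank = {label: position for position, label in enumerate(PAIN_LEVELS)}

# A hypothetical set of recorded responses (not data from the text).
responses = ["moderate", "none", "severe", "low", "moderate"]

# Sorting and comparing by rank are meaningful for ordinal data.
ordered = sorted(responses, key=rank.__getitem__)
worst = max(responses, key=rank.__getitem__)

# Differences between ranks are not interpretable: the step from "low" to
# "moderate" need not equal the step from "moderate" to "severe".
print(ordered)
print(worst)
```

The ranks are used only for ordering; no sums or differences of ranks are computed, in keeping with the description of the ordinal scale.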
The interval scale applies to data that can be arranged in some order and for which differences in
data values are meaningful. The interval level of measurement results from counting or measuring.
Interval scale data can be arranged in an ordering scheme and differences can be calculated and
interpreted. The value zero is arbitrarily chosen for interval data and does not imply an absence of
the characteristic being measured. Ratios are not meaningful for interval data.
EXAMPLE 1.14 Stanford-Binet IQ scores represent interval level data. Joe’s IQ score equals 100 and John’s
IQ score equals 150. John has a higher IQ than Joe; that is, IQ scores can be arranged in order. John’s IQ score
is 50 points higher than Joe’s IQ score; that is, differences can be calculated and interpreted. However, we
cannot conclude that John is 1.5 times (150/100 = 1.5) more intelligent than Joe. An IQ score of zero does not
indicate a complete lack of intelligence.
EXAMPLE 1.15 Temperatures represent interval level data. The high temperature on February 1 equaled 25°F
and the high temperature on March 1 equaled 50°F. It was warmer on March 1 than it was on February 1. That
is, temperatures can be arranged in order. It was 25° warmer on March 1 than on February 1. That is, differences
may be calculated and interpreted. We cannot conclude that it was twice as warm on March 1 as it was on
February 1. That is, ratios are not readily interpretable. A temperature of 0°F does not indicate an absence of
warmth.
EXAMPLE 1.16 Test scores represent interval level data. Lana scored 80 on a test and Christine scored 40 on
a test. Lana scored higher than Christine did on the test; that is, the test scores can be arranged in order. Lana
scored 40 points higher than Christine did on the test; that is, differences can be calculated and interpreted. We
cannot conclude that Lana knows twice as much as Christine about the subject matter. A test score of 0 does not
indicate an absence of knowledge concerning the subject matter.
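One way to see why ratios are not meaningful on an interval scale is to re-express the temperatures of Example 1.15 in Celsius. The sketch below uses the standard Fahrenheit-to-Celsius conversion; the function and variable names are illustrative choices.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (temp_f - 32) * 5 / 9

feb_f, mar_f = 25.0, 50.0             # high temperatures from Example 1.15
feb_c = fahrenheit_to_celsius(feb_f)  # about -3.89
mar_c = fahrenheit_to_celsius(mar_f)  # 10.0

# Order is preserved on both scales.
assert mar_f > feb_f and mar_c > feb_c

# Differences agree up to the fixed factor 5/9 between the degree sizes.
diff_f = mar_f - feb_f                # 25.0 Fahrenheit degrees
diff_c = mar_c - feb_c                # about 13.89 Celsius degrees

# The ratio changes with the choice of scale, so "twice as warm" is not
# a property of the two days themselves.
ratio_f = mar_f / feb_f               # 2.0
ratio_c = mar_c / feb_c               # about -2.57

print(ratio_f, ratio_c)
```

Because the zero point of each scale is arbitrary, the ratio is 2 in Fahrenheit but negative in Celsius, while the ordering and the (rescaled) difference survive the change of scale.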
The ratio scale applies to data that can be ranked and for which all arithmetic operations
including division can be performed. Division by zero is, of course, excluded. The ratio level of
measurement results from counting or measuring. Ratio scale data can be arranged in an ordering
scheme and differences and ratios can be calculated and interpreted. Ratio level data has an absolute
zero and a value of zero indicates a complete absence of the characteristic of interest.
EXAMPLE 1.17 The grams of fat consumed per day for adults in the United States is ratio scale data. Joe
consumes 50 grams of fat per day and John consumes 25 grams per day. Joe consumes twice as much fat as
John per day, since 50/25 = 2. For an individual who consumes 0 grams of fat on a given day, there is a
complete absence of fat consumed on that day. Notice that a ratio is interpretable and an absolute zero exists.
EXAMPLE 1.18 The number of 911 emergency calls in a sample of 50 such calls selected from a 24-hour
period involving a domestic disturbance is ratio scale data. The number found on May 1 equals 5 and the
number found on June 1 equals 10. Since 10/5 = 2, we say that twice as many were found on June 1 as were
found on May 1. For a 24-hour period in which no domestic disturbance calls were found, there is a complete
absence of such calls. Notice that a ratio is interpretable and an absolute zero exists.
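By contrast with interval data, a ratio computed from ratio scale data does not depend on the unit of measurement. The sketch below re-expresses the fat consumption figures of Example 1.17 in ounces; the conversion to ounces is an added illustration, not part of the example, and the names are hypothetical.

```python
GRAMS_PER_OUNCE = 28.349523125        # one avoirdupois ounce in grams

joe_g, john_g = 50.0, 25.0            # grams of fat per day, Example 1.17

# Changing units multiplies every value by the same constant and keeps
# zero at zero, because the scale has an absolute zero.
joe_oz = joe_g / GRAMS_PER_OUNCE
john_oz = john_g / GRAMS_PER_OUNCE
zero_oz = 0.0 / GRAMS_PER_OUNCE       # still 0.0: no fat consumed

ratio_g = joe_g / john_g              # 2.0
ratio_oz = joe_oz / john_oz           # 2 again, up to rounding

print(ratio_g, ratio_oz)
```

Joe consumes twice as much fat as John whether the amounts are stated in grams or in ounces, which is what makes the ratio interpretable.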
SUMMATION NOTATION
Many of the statistical measures discussed in the following chapters involve sums of various
types. Suppose the numbers of 911 emergency calls received on four days were 411, 375, 400, and
478. If we let x represent the number of calls received per day, then the values of the variable for the
four days are represented as follows: $x_1 = 411$, $x_2 = 375$, $x_3 = 400$, and $x_4 = 478$. The sum of calls for
the four days is represented as $x_1 + x_2 + x_3 + x_4$, which equals 411 + 375 + 400 + 478 or 1664. The
symbol $\sum x$, read as “the summation of x,” is used to represent $x_1 + x_2 + x_3 + x_4$. The uppercase Greek
letter $\Sigma$ (pronounced sigma) corresponds to the English letter S and stands for the phrase “the sum
of.” Using the summation notation, the total number of 911 calls for the four days would be written
as $\sum x = 1664$.
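The same total can be checked in a few lines of Python. This is only a sketch: the list name `calls` is an illustrative choice, and the built-in `sum` plays the role of the summation symbol.

```python
# Number of 911 calls received on each of the four days.
calls = [411, 375, 400, 478]

# The summation of x: add up every observed value of the variable.
total = sum(calls)

print(total)  # 1664
```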
EXAMPLE 1.19 The following five values were observed for the variable x: $x_1 = 4$, $x_2 = 5$, $x_3 = 0$, $x_4 = 6$, and
$x_5 = 10$. The following computations illustrate the usage of the summation notation.

$$\sum x = x_1 + x_2 + x_3 + x_4 + x_5 = 4 + 5 + 0 + 6 + 10 = 25$$

$$(\sum x)^2 = (x_1 + x_2 + x_3 + x_4 + x_5)^2 = (25)^2 = 625$$

$$\sum x^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 = 4^2 + 5^2 + 0^2 + 6^2 + 10^2 = 177$$

$$\sum (x - 5) = (x_1 - 5) + (x_2 - 5) + (x_3 - 5) + (x_4 - 5) + (x_5 - 5)$$

$$\sum (x - 5) = (4 - 5) + (5 - 5) + (0 - 5) + (6 - 5) + (10 - 5) = -1 + 0 - 5 + 1 + 5 = 0$$
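The four sums in Example 1.19 can be reproduced with a short Python sketch. The variable names are illustrative; the comments give the values obtained in the example.

```python
# The five observed values of x from Example 1.19.
x = [4, 5, 0, 6, 10]

sum_x = sum(x)                             # summation of x: 25
square_of_sum = sum_x ** 2                 # (summation of x) squared: 625
sum_of_squares = sum(v ** 2 for v in x)    # summation of x squared: 177
sum_of_deviations = sum(v - 5 for v in x)  # summation of (x - 5): 0

print(sum_x, square_of_sum, sum_of_squares, sum_of_deviations)
```

Note that the square of the sum (625) and the sum of the squares (177) are different quantities. The last sum is zero because 5 happens to equal the average of the five values, 25/5.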
EXAMPLE 1.20 The following values were observed for the variables x and y: $x_1 = 1$, $x_2 = 2$, $x_3 = 0$, $x_4 = 4$,
$y_1 = 2$, $y_2 = 1$, $y_3 = 4$, and $y_4 = 5$. The following computations show how the summation notation is used for two
variables.

$$\sum xy = x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4 = 1 \times 2 + 2 \times 1 + 0 \times 4 + 4 \times 5 = 24$$

$$(\sum x)(\sum y) = (x_1 + x_2 + x_3 + x_4)(y_1 + y_2 + y_3 + y_4) = (1 + 2 + 0 + 4)(2 + 1 + 4 + 5) = 7 \times 12 = 84$$

$$\left(\sum x^2 - (\sum x)^2/4\right) \times \left(\sum y^2 - (\sum y)^2/4\right) = (1 + 4 + 0 + 16 - 7^2/4) \times (4 + 1 + 16 + 25 - 12^2/4) = (8.75) \times (10) = 87.5$$
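The two-variable computations of Example 1.20 can be checked with the Python sketch below. The names `sxx` and `syy` are hypothetical labels for the two bracketed factors and do not appear in the text.

```python
# Paired observations from Example 1.20.
x = [1, 2, 0, 4]
y = [2, 1, 4, 5]
n = len(x)  # 4 pairs

# Summation of xy: multiply the pairs first, then add.
sum_xy = sum(a * b for a, b in zip(x, y))        # 24

# (Summation of x)(summation of y): add first, then multiply.
product_of_sums = sum(x) * sum(y)                # 7 * 12 = 84

# The two factors of the last computation in the example.
sxx = sum(a ** 2 for a in x) - sum(x) ** 2 / n   # 21 - 49/4 = 8.75
syy = sum(b ** 2 for b in y) - sum(y) ** 2 / n   # 46 - 144/4 = 10.0
product = sxx * syy                              # 87.5

print(sum_xy, product_of_sums, product)
```

The contrast between 24 and 84 shows that the sum of the products is generally not the same as the product of the sums.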
COMPUTERS AND STATISTICS
The techniques of descriptive and inferential statistics involve lengthy repetitive computations as
well as the construction of various graphical constructs. These computations and graphical
constructions have been simplified by the development of computer software. These computer
software programs are referred to as statistical software packages, or simply statistical packages.
These statistical packages are large computer programs which perform the various computations and
graphical constructions discussed in this outline plus many other ones beyond the scope of the
outline. Statistical packages are currently available for use on mainframes, minicomputers, and
microcomputers.
Numerous statistical packages are currently available. Four widely used statistical
packages are MINITAB, BMDP, SPSS, and SAS. Many of the figures found in the following
chapters are MINITAB generated. MINITAB is a registered trademark of Minitab, Inc., 3081
Enterprise Drive, State College, PA 16801. Phone: 814-238-3280; fax: 814-238-4383; telex: 881612.
The author would like to thank Minitab Inc. for their permission to use output from MINITAB
throughout the outline.
Solved Problems
DESCRIPTIVE STATISTICS AND INFERENTIAL STATISTICS: POPULATION AND SAMPLE
1.1 Classify each of the following as descriptive statistics or inferential statistics.
(a) The average points per game, percent of free throws made, average number of rebounds
per game, and average number of fouls per game as well as several other measures
for players in the NBA are computed.
(b) Ten percent of the boxes of cereal sampled by a quality technician are found to be under
the labeled weight. Based on this finding, the filling machine is adjusted to increase the
amount of fill.
(c) USA Today gives several pages of numerical quantities concerning stocks listed in AMEX,
NASDAQ, and NYSE as well as mutual funds listed in MUTUALS.
(d) Based on a study of 500 single parent households by a social researcher, a magazine
reports that 25% of all single parent households are headed by a high school dropout.
Ans. (a) The measurements given organize and summarize information concerning the players and are
therefore considered descriptive statistics.
(b) Because of the high percent of boxes of cereal which are under the labeled weight in the sample, a
decision is made to increase the weight per box for each box in the population. This is an example of
inferential statistics.
(c) The tables of measurements such as stock prices and change in stock prices are descriptive in nature
and therefore represent descriptive statistics.
(d) The magazine is stating a conclusion about the population based upon a sample and therefore this is
an example of inferential statistics.
1.2 Identify the sample and the population in each of the following scenarios.
(a) In order to study the response times for emergency 911 calls in Chicago, fifty “robbery in
progress” calls are selected randomly over a six-month period and the response times are
recorded.
(b) In order to study a new medical charting system at Saint Anthony’s Hospital, a
representative group of nurses is asked to use the charting system. Recording times and
error rates are recorded for the group.
(c) Fifteen hundred individuals who listen to talk radio programs of various types are selected
and information concerning their education level, income level, and so forth is recorded.
Ans. (a) The 50 “robbery in progress” calls are the sample, and all “robbery in progress” calls in Chicago
during the six-month period are the population.
(b) The representative group of nurses who use the medical charting system is the sample and all nurses
who use the medical charting system at Saint Anthony’s are the population.
(c) The 1500 selected individuals who listen to talk radio programs are the sample and the millions who
listen nationally are the population.
VARIABLE, OBSERVATION, AND DATA SET
1.3 In a sociological study involving 35 low-income households, the number of children per
household was recorded for each household. What is the variable? How many observations are
in the data set?
Ans. The variable is the number of children per household. The data set contains 35 observations.
1.4 A national survey was mailed to 5000 households and one question asked for the number of
handguns per household. Three thousand of the surveys were completed and returned. What is
the variable and how large is the data set?
Ans. The variable is the number of handguns per household and there are 3000 observations in the data
set.
1.5 The number of hours spent per week on paper work was determined for 200 middle level
managers. The minimum was 0 hours and the maximum was 27 hours. What is the variable?
How many observations are in the data set?
Ans. The variable is the number of hours spent per week on paper work and the number of observations
equals 200.
QUANTITATIVE VARIABLE: DISCRETE AND CONTINUOUS VARIABLE
1.6 Classify the variables in problems 1.3, 1.4, and 1.5 as continuous or discrete.
