In Talking About Variables


In talking about variables, sometimes you hear variables being described as categorical (or
sometimes nominal), or ordinal, or interval. Below we will define these terms and explain why
they are important.

Categorical

A categorical variable (sometimes called a nominal variable) is one that has two or more
categories, but there is no intrinsic ordering to the categories. For example, gender is a
categorical variable having two categories (male and female) and there is no intrinsic ordering to
the categories. Hair color is also a categorical variable having a number of categories (blonde,
brown, brunette, red, etc.) and again, there is no agreed way to order these from highest to
lowest. A purely categorical variable is one that simply allows you to assign categories but you
cannot clearly order the categories. If the variable has a clear ordering, then that variable would
be an ordinal variable, as described below.
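
As a small illustration of the point above, the sketch below uses made-up hair color data (the list and both numeric codings are hypothetical, not from the text). Counting categories and finding the most common one are meaningful, while the "average" of numeric codes changes whenever the arbitrary coding changes.

```python
from collections import Counter
from statistics import mean, mode

# Hypothetical sample of a categorical variable.
hair = ["brown", "blonde", "red", "brown", "brunette", "brown", "blonde"]

# Frequencies and the mode only rely on category membership.
counts = Counter(hair)
most_common = mode(hair)

# Two equally arbitrary numeric codings of the same categories.
coding_a = {"blonde": 1, "brown": 2, "brunette": 3, "red": 4}
coding_b = {"red": 1, "blonde": 2, "brunette": 3, "brown": 4}

# The "average hair color" depends entirely on which coding was picked.
mean_a = mean(coding_a[h] for h in hair)
mean_b = mean(coding_b[h] for h in hair)

print(counts)
print(most_common)
print(mean_a, mean_b)
```

Because no coding is more correct than another, neither mean carries any information about the data.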

Ordinal

An ordinal variable is similar to a categorical variable. The difference between the two is that
there is a clear ordering of the categories. For example, suppose you have a variable, economic
status, with three categories (low, medium and high). In addition to being able to classify people
into these three categories, you can order the categories as low, medium and high. Now consider
a variable like educational experience (with values such as elementary school graduate, high
school graduate, some college and college graduate). These also can be ordered as elementary
school, high school, some college, and college graduate. Even though we can order these from
lowest to highest, the spacing between the values may not be the same across the levels of the
variable. Say we assign scores 1, 2, 3 and 4 to these four levels of educational experience and
we compare the difference in education between categories one and two with the difference in
educational experience between categories two and three, or the difference between categories
three and four. The difference between categories one and two (elementary and high school) is
probably much bigger than the difference between categories two and three (high school and
some college). In this example, we can order the people by level of educational experience but
the size of the difference between categories is inconsistent (because the spacing between
categories one and two is bigger than the spacing between categories two and three). If these
categories were equally spaced, then the variable would be an interval variable.
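
The educational experience example can be sketched in code as follows. The list of people is hypothetical; the scores 1 to 4 follow the text. Sorting and taking a median use only the order of the categories, so both are safe for an ordinal variable.

```python
from statistics import median_low

# Levels listed from lowest to highest, scored 1..4 as in the text.
levels = ["elementary school", "high school", "some college", "college graduate"]
rank = {level: score for score, level in enumerate(levels, start=1)}

# Hypothetical sample of five people.
people = ["some college", "elementary school", "college graduate",
          "high school", "high school"]

# Ordering people by level is meaningful for an ordinal variable.
ordered = sorted(people, key=lambda level: rank[level])

# The median depends only on order, not on the spacing between scores.
median_level = levels[median_low([rank[p] for p in people]) - 1]

print(ordered)
print(median_level)
```

Differences between scores (for example 2 - 1 versus 3 - 2) are deliberately not computed here, since the spacing between levels is not assumed to be equal.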

Interval

An interval variable is similar to an ordinal variable, except that the intervals between the values
of the interval variable are equally spaced. For example, suppose you have a variable such as
annual income that is measured in dollars, and we have three people who make $10,000, $15,000
and $20,000. The second person makes $5,000 more than the first person and $5,000 less than
the third person, and the size of these intervals is the same. If there were two other people who
make $90,000 and $95,000, the size of that interval between these two people is also the same
($5,000).
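
The income arithmetic above can be checked directly. This is only a sketch using the five dollar amounts from the text; the variable names are illustrative.

```python
# Annual incomes from the example, in dollars.
incomes = [10_000, 15_000, 20_000, 90_000, 95_000]

# On an interval scale, equal differences mean the same thing
# wherever they occur on the scale.
gap_low = incomes[1] - incomes[0]    # second person vs. first
gap_mid = incomes[2] - incomes[1]    # third person vs. second
gap_high = incomes[4] - incomes[3]   # the two higher earners

print(gap_low, gap_mid, gap_high)
```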

Why does it matter whether a variable is categorical, ordinal or interval?


Statistical computations and analyses assume that the variables have a specific level of
measurement. For example, it would not make sense to compute an average hair color. An
average of a categorical variable does not make much sense because there is no intrinsic ordering
of the levels of the categories. Moreover, if you tried to compute the average of educational
experience as defined in the ordinal section above, you would also obtain a nonsensical
result. Because the spacing between the four levels of educational experience is very uneven, the
meaning of this average would be very questionable. In short, an average requires a variable to
be interval. Sometimes you have variables that are "in between" ordinal and interval, for
example, a five-point Likert scale with values "strongly agree", "agree", "neutral", "disagree" and
"strongly disagree". If we cannot be sure that the intervals between each of these five values are
the same, then we would not be able to say that this is an interval variable, but we would say that
it is an ordinal variable. However, in order to be able to use statistics that assume the variable is
interval, we will assume that the intervals are equally spaced.
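
One way to picture this section is a helper that picks a summary matching the level of measurement. The function name `summarize`, its `level` labels, and the sample responses are all hypothetical; this is a sketch of the idea, not a standard API.

```python
from statistics import mean, median_low, mode

def summarize(values, level, order=None):
    """Return a summary suited to the level of measurement (illustrative only)."""
    if level == "categorical":
        return mode(values)                  # most frequent category
    if level == "ordinal":
        if order is None:
            raise ValueError("ordinal data needs the category order")
        positions = [order.index(v) for v in values]
        return order[median_low(positions)]  # middle category by rank
    if level == "interval":
        return mean(values)                  # average needs equal spacing
    raise ValueError(f"unknown level: {level}")

# Five-point Likert scale, listed from lowest to highest agreement.
likert = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
responses = ["agree", "neutral", "strongly agree", "agree", "disagree"]

# Treated as ordinal: report the median category.
median_response = summarize(responses, "ordinal", order=likert)

# Treated as interval: assume codes 1..5 are equally spaced, then average.
codes = [likert.index(r) + 1 for r in responses]
mean_code = summarize(codes, "interval")

print(median_response, mean_code)
```

The second summary is only as trustworthy as the assumption that the five response options are equally spaced.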

Does it matter if my dependent variable is normally distributed?

When you are doing a t-test or ANOVA, the assumption is that the distribution of the sample
means is normal. One way to guarantee this is for the distribution of the
individual observations from the sample to be normal. However, even if the distribution of the
individual observations is not normal, the distribution of the sample means will be approximately
normal if your sample size is about 30 or larger. This is due to the "central limit theorem",
which shows that even when a population is non-normally distributed, the distribution of the
"sample means" approaches a normal distribution as the sample size grows. A sample size of 30 is a
common rule of thumb, and strongly skewed populations may need more. For an example, see
Central limit theorem demonstration.
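
A rough simulation can make the central limit theorem concrete. The sketch below is not the demonstration referred to above; it assumes an exponential population (chosen because it is clearly skewed), a fixed random seed, and a hand-written `skewness` helper, and compares the skewness of raw observations with the skewness of means of samples of size 30.

```python
import random
from statistics import fmean, pstdev

def skewness(xs):
    """Sample skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = fmean(xs)
    s = pstdev(xs)
    return fmean(((x - m) / s) ** 3 for x in xs)

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 30                   # observations per sample
replications = 2000      # number of samples drawn

# Draw from a skewed (exponential) population with mean 1.
draws = [rng.expovariate(1.0) for _ in range(n * replications)]

# Mean of each consecutive block of n observations.
sample_means = [fmean(draws[i * n:(i + 1) * n]) for i in range(replications)]

raw_skew = skewness(draws)            # skewness of individual observations
means_skew = skewness(sample_means)   # skewness of the sample means

print(raw_skew, means_skew)
```

The sample means should come out much closer to symmetric than the raw observations, though with a population this skewed some skewness typically remains at n = 30.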

If you are doing a regression analysis, then the assumption is that your residuals are normally
distributed. One way to make it very likely to have normal residuals is to have a dependent
variable that is normally distributed and predictors that are all normally distributed. However, this
is not necessary for your residuals to be normally distributed. For more, see Regression with SAS:
Chapter 2 - Regression Diagnostics.
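
To illustrate that a normal dependent variable is not required, here is a minimal simulated sketch. Everything in it is an assumption made for illustration: a skewed (exponential) predictor, a true line y = 2 + 3x, normally distributed errors, and an ordinary least squares fit written out by hand rather than taken from any statistics package.

```python
import random
from statistics import fmean, pstdev

def skewness(xs):
    """Sample skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = fmean(xs)
    s = pstdev(xs)
    return fmean(((x - m) / s) ** 3 for x in xs)

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 5000

# Skewed predictor, normal errors: y inherits the skew of x.
x = [rng.expovariate(1.0) for _ in range(n)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Simple least squares fit for one predictor.
x_bar, y_bar = fmean(x), fmean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

# Residuals are what the normality assumption is about.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(skewness(y), skewness(residuals))
```

In this setup the dependent variable is clearly skewed while the residuals are close to symmetric, which is the situation the paragraph above describes.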

Categorical. Categorical variables take on values that are names or labels. The color of a
ball (e.g., red, green, blue) or the breed of a dog (e.g., collie, shepherd, terrier) would be
examples of categorical variables.

Quantitative. Quantitative variables are numerical. They represent a measurable quantity.


For example, when we speak of the population of a city, we are talking about the number
of people in the city - a measurable attribute of the city. Therefore, population would be a
quantitative variable.

In algebraic equations, quantitative variables are represented by symbols (e.g., x, y, or z).


As discussed in the section on variables in Chapter 1, quantitative variables are variables
measured on a numeric scale. Height, weight, response time, subjective rating of pain,
temperature, and score on an exam are all examples of quantitative variables. Quantitative
variables are distinguished from categorical (sometimes called qualitative) variables such as
favorite color, religion, city of birth, and favorite sport in which there is no ordering or
measuring involved.

There are many types of graphs that can be used to portray distributions of quantitative variables.
The upcoming sections cover the following types of graphs: (1) stem and leaf displays, (2)
histograms, (3) frequency polygons, (4) box plots, (5) bar charts, (6) line graphs, (7) scatter plots
(discussed in a different chapter), and (8) dot plots. Some graph types such as stem and leaf
displays are best-suited for small to moderate amounts of data, whereas others such as
histograms are best-suited for large amounts of data. Graph types such as box plots are good at
depicting differences between distributions. Scatter plots are used to show the relationship
between two variables.
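
As a taste of the first graph type in the list, a stem and leaf display can be built in a few lines. This sketch assumes a non-empty list of non-negative whole numbers, uses the tens digit as the stem and the ones digit as the leaf, and runs on hypothetical exam scores; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem and leaf display (stem = tens, leaf = ones)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

scores = [62, 67, 71, 74, 74, 78, 85, 88, 93]  # hypothetical exam scores

for line in stem_and_leaf(scores):
    print(line)
```

Each row keeps the individual values readable, which is why this display suits small to moderate amounts of data.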

Qualitative Variable: Definition and Examples


Qualitative Variable: What is it?


A qualitative variable, also called a categorical variable, is a variable that is not numerical. It
describes data that fits into categories. For example:

Eye colors (values include: blue, green, brown, hazel).
States (values include: Florida, New Jersey, Washington).
Dog breeds (values include: Alaskan Malamute, German Shepherd, Siberian Husky, Shih Tzu).

These are all qualitative variables as they have no natural order. On the other hand, quantitative
variables have numeric values that can be added, subtracted, divided or multiplied.

Qualitative variables aren't ordered on a numerical scale, so in statistics they are measured on a nominal
scale. The word nominal means name, which is exactly what the values of a qualitative variable are. A nominal
scale is a scale where no ordering is possible or implied (except for alphabetical ordering like New York,
Washington, West Virginia or Chelsea, Edinburgh, London). In other words, the nominal scale is where
data is assigned to a category.
