
UNIVERSIDAD POLITÉCNICA DE LA REGIÓN RIBEREÑA

PROBABILITY AND STATISTICS


ING. GILBERTO ANUAR GARCÍA GARCÍA

UNIT 1, DESCRIPTIVE STATISTICS

TOPIC: Populations, Samples, and Branches of Statistics.

Engineers and scientists are constantly exposed to collections of facts, or data, both in their
professional capacities and in everyday activities. The discipline of statistics provides methods
for organizing and summarizing data and for drawing conclusions based on information contained
in the data.

An investigation will typically focus on a well-defined collection of objects constituting a
population of interest. However, the basic idea behind all statistical methods of data analysis is
to make inferences about a population by studying a relatively small sample chosen from it. The
best sampling methods involve random sampling. There are many different random sampling
methods, the most basic of which is simple random sampling.

BASIC CONCEPTS
• A population is the entire collection of objects or outcomes about which information
is sought.
• A sample is a subset of a population, containing the objects or outcomes that are
actually observed.
• A simple random sample of size n is a sample chosen by a method in which each
collection of n population items is equally likely to make up the sample, just as in a
lottery.
• Parameters are numerical characteristics of a population, computed as functions of the
study variable values (for example, a population mean or proportion). They are usually
unknown quantitative measures for the entire population or for specified domains that
are of interest to the investigator.
• A variable is a characteristic of the unit being observed that may take more than one
value from a set of values, each of which is either a numerical measure or a category
from a classification.
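
The definition of a simple random sample can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the population of 20 numbered items, the helper name simple_random_sample, and the seed are all assumptions made for the example. The standard-library call random.sample draws n distinct items without replacement, so every collection of n items is equally likely to be chosen.

```python
import random

# Hypothetical population: 20 items labeled 1..20 (illustrative values only).
population = list(range(1, 21))

def simple_random_sample(items, n, seed=None):
    """Draw n distinct items so that every subset of size n is equally likely."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.sample(items, n)    # sampling without replacement

sample = simple_random_sample(population, n=5, seed=1)
print("sample:", sample)
```

Changing or omitting the seed gives a different sample, which is the lottery-like behavior described in the definition.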

Some people think that a simple random sample is guaranteed to reflect its population
perfectly. This is not true. Simple random samples always differ from their populations in some
ways, and occasionally may be substantially different. Two different samples from the same
population will differ from each other as well. This phenomenon is known as sampling variation.
Sampling variation is one of the reasons that scientific experiments produce somewhat different
results when repeated, even when the conditions appear to be identical.
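
Sampling variation can be observed directly with a small simulation. The sketch below assumes a made-up population of 1000 measurements and draws five simple random samples of size 25 from it; the population values, the sample size, and the seed are illustrative assumptions only. The sample means generally differ from one another and from the population mean.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so that the run is repeatable

# Hypothetical population of 1000 measurements (made-up values).
population = [rng.gauss(50, 10) for _ in range(1000)]
population_mean = statistics.mean(population)

# Draw several simple random samples of the same size and compare their means.
sample_means = [statistics.mean(rng.sample(population, 25)) for _ in range(5)]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean:   {m:.2f}")
```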

When observations are carried out, we are usually interested only in certain characteristics of
the objects in a population. A variable is any characteristic whose value may change from one
object to another in the population. Variables can be classified as qualitative (categorical) or
quantitative (numerical).

Qualitative variables take on values that are names or labels, for example the color of an
object or the breed of a dog. In contrast, quantitative variables are numeric and represent a
measurable quantity, such as temperature, distance, or weight.

Data results from making observations either on a single variable or simultaneously on two or
more variables. A univariate data set consists of observations on a single variable. We have
bivariate data when observations are made on each of two variables. Multivariate data arises
when observations are made on more than one variable (so bivariate is a special case of
multivariate).

An investigator who has collected data may wish simply to summarize and describe important
features of the data. This entails using methods from descriptive statistics. Some of these
methods are graphical in nature; histograms, boxplots, and scatter plots are
primary examples. Other descriptive methods involve calculation of numerical summary
measures, such as means, standard deviations, and correlation coefficients.
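
As an illustration of the numerical summaries just mentioned, the following sketch computes a mean, a sample standard deviation, and a Pearson correlation coefficient for a small bivariate data set. The data values and the helper name pearson_r are assumptions made for the example; the correlation is computed from its definition so that the calculation is visible.

```python
import math
import statistics

# Hypothetical bivariate data: paired observations (made-up values).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

mean_x = statistics.mean(x)
sd_x = statistics.stdev(x)   # sample standard deviation (divides by n - 1)

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)

r = pearson_r(x, y)
print(f"mean of x: {mean_x:.2f}, sd of x: {sd_x:.2f}, correlation: {r:.3f}")
```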

Having obtained a sample from a population, an investigator would frequently like to use
sample information to draw some type of conclusion (make an inference of some sort) about the
population. That is, the sample is a means to an end rather than an end in itself. Techniques for
generalizing from a sample to a population are gathered within the branch of our discipline called
inferential statistics.

In a probability problem, properties of the population under study are assumed known and
questions regarding a sample taken from the population are posed and answered. In a statistics
problem, characteristics of a sample are available to the experimenter, and this information
enables the experimenter to draw conclusions about the population. The relationship between the
two disciplines can be summarized by saying that probability reasons from the population to the
sample (deductive reasoning), whereas inferential statistics reasons from the sample to the
population (inductive reasoning). This is illustrated in Figure 1.1.

Figure 1.1: Fundamental relationship between probability and inferential statistics.
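
The two directions of reasoning can be contrasted with a small numerical sketch. All numbers here are hypothetical: a population in which 10% of items are defective, and an observed sample of 50 items containing 6 defectives. The binomial formula is used on the assumption that the items in the sample are independent, which is only approximately true when sampling without replacement from a large population.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability (population -> sample): the population proportion is assumed
# known, and we ask how likely a particular sample outcome is.
p_known = 0.10
prob_two_defectives = binomial_pmf(2, 10, p_known)

# Inferential statistics (sample -> population): only the sample is observed,
# and the sample proportion serves as an estimate of the unknown parameter.
n_observed, defectives_observed = 50, 6
p_hat = defectives_observed / n_observed

print(f"P(exactly 2 defectives in a sample of 10): {prob_two_defectives:.4f}")
print(f"estimated population proportion: {p_hat:.2f}")
```

The first calculation takes the parameter as given and reasons about samples; the second takes a sample as given and reasons about the parameter.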
