Unit 1. Theory


Statistics I

Topic 1: Introduction and basic concepts



Contents

1. What is Statistics? Examples of applications

2. Databases: elements, variables and observations

3. Types of variables

4. Data sources: population and sample

5. Statistical analysis with R Commander

Recommended readings
• Peña, D., Romo, J. Introducción a la Estadística para las Ciencias Sociales (1997)
– Ch. 1, 2 and 3
• Newbold, P. et al. Statistics for Business and Economics (2009)
– Ch. 1
– Sections 2.1 and 2.4
• Triola, M.F. Estadística, 12ª ed.
– Ch. 1
• Triola, M.F. Essentials of Statistics, 5th ed.
– Ch. 1
What is Statistics?
In everyday language, the term statistics is used to refer to numbers
that describe some aspect of the world

• Economic statistics: number of unemployed, inflation rate, …


• Demographic statistics: birth rate, life expectancy, …
• Sports statistics: goals scored, number of red cards in a football match
• Meteorological statistics: temperature, rain, …
But statistics is much more than mere numbers:
• It is the discipline that addresses how to collect, summarize, analyze,
and interpret data, to draw conclusions and make better decisions

Applications of Statistics
• In Accounting: audits, …
• In Finance: analysis and prediction of the value of a firm, …

• In Marketing: information about consumption habits, design of ad campaigns, …

• In Economics: predictions of economic indicators, analysis of the effects of a policy, …

• In Politics: voting intention polls, analysis of electoral results, social indicators, …


• In Sustainability: UN 2030 Sustainable Development Goals (SDG). Indicators to increase
visibility of vulnerable groups and to detect the degree of attainment of 17 goals (No
poverty, Zero hunger, Good health and well-being, Quality education, Gender equality,
etc.)
• … and many more: in sports, medicine, engineering, …

Databases (DB): elements, variables and observations
• Data: collected features about a phenomenon under study
• Source: SDG Index & Dashboards Report 2017, http://www.sdgindex.org/
– Elements (individuals): each of the 157 UN members considered
– Variables: mean scores for each country on the 17 SDGs
– Observations: data (metrics) recorded for each element
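As a rough illustration of these three terms, the sketch below stores a tiny data table in Python (the course itself works with R Commander). The country names, variable names and scores are invented placeholders, not values from the SDG report.

```python
# Each row is one element (individual); each key other than "country"
# is one variable. All names and values are made-up placeholders.
data = [
    {"country": "Country A", "sdg1_score": 90.0, "sdg2_score": 55.5},
    {"country": "Country B", "sdg1_score": 72.5, "sdg2_score": 61.0},
    {"country": "Country C", "sdg1_score": 40.0, "sdg2_score": 48.5},
]

elements = [row["country"] for row in data]          # the individuals
variables = [k for k in data[0] if k != "country"]   # the recorded features

# An observation is the set of values recorded for one element.
observation_b = {v: data[1][v] for v in variables}

print(len(elements), "elements,", len(variables), "variables")
print("Observation for", elements[1], "->", observation_b)
```

Rows correspond to elements and columns to variables, which is also how a data set appears once loaded into a statistical package.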

Examples of statistical variables

• Vote of Madrid residents: Cs, IU, PP, PSOE, UP, Vox, …
• Employment status of Getafe residents: unemployed, part time, full time, …
• Customer purchase satisfaction
• Number of newspapers bought by Madrid residents in a day
• Number of employees of Madrid firms
• Expenses of Spanish city councils

Different types of variables require different treatments

Types of statistical variables

VARIABLES
• Categorical (qualitative)
– Nominal: no natural ordering. Example: most voted party in last elections
– Ordinal: naturally ordered classes. Example: purchase satisfaction
• Numerical (quantitative)
– Discrete: integer. Example: number of employees in Madrid firms
– Continuous: not necessarily integer. Example: expenses of Spanish city councils

Notation: variables are typically denoted by the letters X, Y, Z. Example:
X = Number of employees in Madrid firms (upper case for the variable itself)
x1 = 55; x2 = 3000 (lower case for specific values; subscripts indicate individuals)
NOTE: Numerical codes for categorical variables DO NOT make them numerical (e.g., Male = 1, Female = 2)
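The note about numerical codes can be made concrete with a short Python sketch. The coded responses below are invented for illustration. The employee counts reuse x1 = 55 and x2 = 3000 from the notation example.

```python
from statistics import mean, mode

# Hypothetical coding for a categorical variable: 1 = Male, 2 = Female.
sex_codes = [1, 2, 2, 1, 2]
labels = {1: "Male", 2: "Female"}

# The codes can be averaged, but the result has no interpretation:
# the codes are arbitrary labels, not quantities.
meaningless_mean = mean(sex_codes)

# Summaries that do make sense for a nominal variable: counts and the mode.
sex = [labels[c] for c in sex_codes]
counts = {label: sex.count(label) for label in labels.values()}
most_frequent = mode(sex)

# For a genuinely numerical variable the mean is meaningful.
employees = [55, 3000]   # x1 = 55, x2 = 3000
mean_employees = mean(employees)

print("Mean of codes (not interpretable):", meaningless_mean)
print("Counts:", counts, "| most frequent:", most_frequent)
print("Mean number of employees:", mean_employees)
```

Swapping the codes (Male = 2, Female = 1) would change the mean of the codes but leave the counts unchanged, which is one way to see that the codes carry no numerical meaning.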
Population and sample; data sources

• Population: complete collection of individuals
In practice it is unusual to study all the individuals of a population:
– It may be economically infeasible to study the entire population
– The study might take so long that it becomes infeasible and, moreover, the population might change over the time span of the study
– The study may involve destroying the individuals examined
• Sample: a subset of individuals drawn from the population

– To draw valid conclusions, the sample must be representative of the population
– The sample selection method (sampling method) is very important
• Data sources:
– Available historical information
– Observations (observational studies)
– Experiments (experimental studies)
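The slides do not name a specific sampling method. As an illustrative assumption, the Python sketch below draws a simple random sample, in which every individual has the same chance of being selected. The population of 1,000 numbered individuals, the sample size of 50 and the seed are all hypothetical.

```python
import random

# Hypothetical population: identifiers for 1,000 individuals.
population = list(range(1, 1001))

# Fixed seed so that the same sample is drawn on every run.
rng = random.Random(2024)

# Simple random sample without replacement: no individual appears twice.
sample = rng.sample(population, k=50)

print(len(sample), "individuals drawn out of", len(population))
```

Random selection does not guarantee a representative sample, but it avoids the systematic bias that arises when individuals are chosen by convenience.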
Statistical software: R Commander

• We'll use the software R Commander
• R is a powerful, widely used, free statistical software package, which is command-based
• We'll run R with the R Commander package (Rcmdr), which provides a user-friendly GUI
• Available for Windows, Mac, …
• See the installation tutorial:
https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html
[Screenshots: the R console; the R Commander console; importing a data set (three steps); partial view of the data set Nations.txt]
The data set Nations.txt

• UN data on 207 countries

• Variables:

– Country

– TFR: Total fertility rate (births per woman)

– contraception (rate in %)

– infant.mortality (rate per 1,000 births)

– GDP (in US $)

– region: Africa, Americas, Asia, Europe, Oceania
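To show what a data set with these variables looks like once it is read in, here is a minimal Python sketch (in the course the import is done through the R Commander menus). The two rows are invented placeholders, not UN data. The whitespace-separated layout and the use of NA for missing values are assumptions about the file, and `to_number` is a hypothetical helper.

```python
# Made-up rows in an assumed whitespace-separated layout.
raw = """\
Country TFR contraception infant.mortality GDP region
CountryA 5.1 20 95 400 Africa
CountryB 1.6 NA 6 25000 Europe
"""

# Numerical variables; Country and region are categorical (nominal).
NUMERIC = ["TFR", "contraception", "infant.mortality", "GDP"]

def to_number(text):
    """Convert a field to float; the assumed missing marker NA becomes None."""
    return None if text == "NA" else float(text)

lines = raw.splitlines()
header = lines[0].split()
rows = []
for line in lines[1:]:
    record = dict(zip(header, line.split()))
    for name in NUMERIC:
        record[name] = to_number(record[name])
    rows.append(record)

for record in rows:
    print(record)
```

Each record is one element (a country) and each key is one variable, so the numerical and categorical variables can be treated differently in later analyses.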
