COMPSCI 5590-f23-DS-rr-lecture1-3
Recap
▪ Some methods and techniques are available to summarize and interpret the data.
• Point estimates: these are the single values of the sample statistics/estimates
that are used to estimate the population parameters.
• Graphical presentations: commonly used graphical representation of data
(e.g., boxplots, histogram, bar chart)
Descriptive Statistics: Contents
▪ Point estimates
▪ Measures of central tendency: arithmetic mean, median, mode
▪ Measures of variability: variance, standard deviation, range
▪ Measures of position: quartiles, interquartile range, percentile
▪ Measures of distribution: skewness, kurtosis
▪ Measures of association: covariance, correlation
▪ Graphical representation
* Properties of estimators
Measures of Association: Covariance
▪ If we consider two variables X and Y, covariance between them tells us how they
change together.
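▪ Population: $\sigma_{XY} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_X)(y_i - \mu_Y)$
▪ Sample: $s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$
▪ A positive covariance means X and Y tend to increase together; a negative covariance means one tends to decrease as the other increases.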
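As a minimal sketch of the covariance calculation, the function below uses the usual conventions: divide by the number of pairs for a population and by one less than the number of pairs for a sample. The speed and travel-time values are hypothetical numbers made up for illustration, not data from the lecture.

```python
def covariance(x, y, sample=True):
    """Covariance of paired observations x and y.

    sample=True divides by n - 1 (sample covariance);
    sample=False divides by n (population covariance).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if sample and n < 2:
        raise ValueError("sample covariance needs at least two pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the two means.
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross / (n - 1 if sample else n)


# Hypothetical data: speed (km/h) and travel time (min) on a fixed route.
speed = [40, 50, 60, 70, 80]
travel_time = [30, 24, 20, 17, 15]

print(covariance(speed, travel_time, sample=False))
print(covariance(speed, travel_time, sample=True))
```

Both values are negative here because travel time falls as speed rises. The magnitude depends on the units of X and Y, which is why covariance alone is hard to compare across variable pairs.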
Measures of Association: Correlation
▪ Correlation measures the standardized information about the degree of linear
association between two variables.
▪ The correlation coefficient lies within the interval [-1, 1].
▪ Population: $\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$
▪ Sample: $r_{XY} = \frac{s_{XY}}{s_X s_Y}$
[Figure: two panels, Travel time vs. Speed and Accident vs. Speed]
Fig. source: Washington, S. et al. (2020) Stat. and Econ. Methods for Transportation Data Analysis
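The standardization can be sketched as follows, assuming the usual Pearson form: the covariance divided by the product of the two standard deviations. The function name `pearson_r` and the example values are illustrative assumptions, not part of the lecture.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of paired observations x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    # The 1/(n - 1) factors in the covariance and the standard
    # deviations cancel, so the raw sums can be used directly.
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, chosen only to show the sign and the bounds.
print(pearson_r([1, 2, 3, 4], [3, 5, 7, 9]))               # exact increasing line
print(pearson_r([40, 50, 60, 70, 80], [30, 24, 20, 17, 15]))  # decreasing trend
```

Unlike covariance, the result has no units, so values from different variable pairs can be compared directly.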
Correlation: Examples
[Figure: scatter plots illustrating positive correlation and negative correlation]
Fig. source: Washington, S. et al. (2020) Stat. and Econ. Methods for Transportation Data Analysis
Correlation: Examples
[Figure: scatter plot illustrating no correlation]
Measures of Association: Example
Properties of estimators
▪ In practice, we typically do not know population parameters such as the mean $\mu$ or the variance $\sigma^2$, so we collect sample data to estimate them.
▪ An estimator is a rule that combines sample data to give an estimated value of a
population parameter.
▪ Some desirable properties of an estimator:
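▪ Unbiasedness: An estimator $\hat{\theta}$ of a population parameter $\theta$ is unbiased if its expected value equals the true value of the unknown parameter, $E[\hat{\theta}] = \theta$. Individual estimates may still fall above or below $\theta$; unbiasedness means they are correct on average over repeated samples.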
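Unbiasedness can be checked exactly on a toy case. The sketch below assumes a hypothetical three-value population and enumerates every equally likely sample of size 2 drawn with replacement, so the average of an estimator over all samples is its expected value. It compares the sample mean, the sample variance with divisor n - 1, and the variance with divisor n.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population; Fractions keep the arithmetic exact.
population = [Fraction(1), Fraction(2), Fraction(3)]
N = len(population)
mu = sum(population) / N
sigma2 = sum((v - mu) ** 2 for v in population) / N


def sample_mean(s):
    return sum(s) / len(s)


def var_n_minus_1(s):
    m = sample_mean(s)
    return sum((v - m) ** 2 for v in s) / (len(s) - 1)


def var_n(s):
    m = sample_mean(s)
    return sum((v - m) ** 2 for v in s) / len(s)


def expected_value(estimator, n=2):
    """Average of the estimator over all equally likely samples of size n."""
    samples = list(product(population, repeat=n))
    return sum(estimator(s) for s in samples) / len(samples)


print("mu =", mu, " E[sample mean] =", expected_value(sample_mean))
print("sigma^2 =", sigma2, " E[var, n-1] =", expected_value(var_n_minus_1))
print("sigma^2 =", sigma2, " E[var, n]   =", expected_value(var_n))
```

In this setup the sample mean and the n - 1 variance average out to the population values, while the divisor-n variance averages to a smaller number, which is the bias the n - 1 divisor corrects.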
Properties of estimators
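▪ Efficiency: There is typically more than one unbiased estimator for a given population parameter. An estimator $\hat{\theta}_1$ is more efficient than another estimator $\hat{\theta}_2$ (both unbiased) if the variance of $\hat{\theta}_1$ is lower than the variance of $\hat{\theta}_2$: $\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$.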
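A small simulation can illustrate efficiency. This sketch assumes a standard normal population, for which both the sample mean and the sample median are unbiased estimators of the center, and compares their variances across repeated samples. The sample size, number of repetitions, and seed are arbitrary choices.

```python
import random
import statistics


def sampling_variance(estimator, n=25, reps=2000, seed=0):
    """Variance of an estimator across repeated samples of size n.

    Samples are drawn from a standard normal population (an assumption
    of this sketch). A fixed seed gives both estimators the same samples.
    """
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.pvariance(estimates)


var_mean = sampling_variance(statistics.fmean)
var_median = sampling_variance(statistics.median)

print("variance of sample mean:  ", var_mean)
print("variance of sample median:", var_median)
```

For normal data the sample mean shows the smaller variance, so it is the more efficient of the two. The ranking can change for other populations, for example heavy-tailed ones.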
Properties of estimators
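▪ Consistency: An estimator $\hat{\theta}$ is said to be consistent if the probability that it lies close to the true value $\theta$ of the parameter increases toward 1 with increasing sample size $n$.
▪ This property indicates that $\hat{\theta}$ will not differ from $\theta$ as $n \to \infty$; formally, $P(|\hat{\theta} - \theta| < \varepsilon) \to 1$ for any $\varepsilon > 0$.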
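Consistency can be illustrated by simulation. The sketch below assumes a normal population with hypothetical mean 5 and standard deviation 1, and estimates the probability that the sample mean falls within a tolerance of 0.1 of the true mean for several sample sizes. All parameter values are arbitrary choices made for illustration.

```python
import random


def prob_within(n, eps=0.1, reps=500, seed=0, mu=5.0, sigma=1.0):
    """Share of repeated samples whose mean lies within eps of mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(xbar - mu) < eps:
            hits += 1
    return hits / reps


probs = {n: prob_within(n) for n in (10, 100, 1000)}
for n, p in probs.items():
    print(f"n = {n:5d}  estimated P(|xbar - mu| < 0.1) = {p:.3f}")
```

The estimated probability rises with the sample size, which is the behavior the consistency property describes for the sample mean.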