LIV-STATS 2
OBSERVATION/PATTERN
HYPOTHESIS
PREDICTION
TEST THE PREDICTION (study/experimental design, data collection, data analysis, statistics)
SUPPORT HYPOTHESIS (interpret results)
COMMUNICATE FINDINGS (report)
REVISE HYPOTHESIS OR TEST OTHER PREDICTIONS (do it all over again)
18/3/24
SAMPLING, STATISTICAL ANALYSIS AND GRAPHING
1) Learn how to determine what analytical approach is appropriate for different types of data.
2) Learn how to compute descriptive statistics for a sample. These statistics include the mean,
standard deviation, standard error, and 95% confidence intervals.
3) Learn to compare 2 continuous variables with correlation and regression.
4) Learn to create bar graphs (with error bars), scatterplots (with best-fit lines), and other types
of graphs.
SAMPLING
• Biologists can rarely determine the true mean of a variable, because that would require
measuring ALL of the individuals in the population.
• Instead, we measure only a fraction of the total population, compute a mean from those
individuals (SAMPLE MEAN), and base our inferences and interpretations on it.
• Assumption: the sample mean is a good predictor of the true mean for the population.
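The idea of estimating a true mean from a sample can be illustrated with a small simulation. This is only a sketch: the "population" below is randomly generated, and its size, mean, and spread (10,000 individuals, mean 50, sd 10) as well as the sample size of 30 are arbitrary assumptions, not values from these slides.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# HYPOTHETICAL population: 10,000 simulated measurements (assumed values)
population = [random.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

# In practice we can only measure a fraction of the population
sample = random.sample(population, 30)
sample_mean = statistics.mean(sample)

print(f"true mean:   {true_mean:.2f}")
print(f"sample mean: {sample_mean:.2f}")
```

The two values will usually be close but not identical, which is why the measures of variability that follow are needed.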
DESCRIPTIVE STATISTICS
• Measures of central tendency: describe the center of the distribution of the data
(response/dependent variable)
• SAMPLE MEAN (average)
• MEDIAN (middle one)
• MODE (most common)
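As a quick illustration, the three measures above can be computed with Python's standard `statistics` module. The seed counts are made-up values chosen only for the example.

```python
import statistics

# HYPOTHETICAL seed counts from seven plants (illustrative values only)
seed_counts = [12, 15, 15, 18, 20, 22, 31]

mean = statistics.mean(seed_counts)      # sum of the values divided by n
median = statistics.median(seed_counts)  # middle value of the sorted data
mode = statistics.mode(seed_counts)      # most frequent value

print(mean, median, mode)  # 19 18 15
```

Here the single large value (31) pulls the mean above the median, which is one reason to look at more than one measure.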
MEASURES OF VARIABILITY
• If the means are different, is it because of the treatment, because of normal
differences due to other variables, or simply because of the natural variability of
the data?
MEASURES OF VARIABILITY:
STANDARD DEVIATION
Represents a measure of variability in a data set. The standard deviation is the
square root of the variance.
1) Compute the deviation: the difference between the value for each replicate and the mean (Xi - mean).
Do this for each sample.
-- These values can be + or -, but we only care about their magnitude.
2) Take the square of each deviation value (this gets rid of the negative signs): (Xi - mean)².
This is a measure of how different each replicate is from the mean.
3) Sum the squared deviations (statisticians call this the sum of squares).
Now we have one value that measures the variability within a sample, but it depends on
how many replicates you have in your sample (few replicates: small; many replicates: larger).
4) Divide the sum of squares by the number of replicates minus one (n - 1) to remove that dependence. The result is the variance.
5) Take the square root of the variance to get back to the original units of the sample. The result is the standard deviation.
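The calculation can be sketched step by step in Python. The five replicate values are hypothetical, and the sum of squares is divided by n - 1 (the usual sample-variance convention) before taking the square root. The result is checked against `statistics.stdev`, which uses the same convention.

```python
import math
import statistics

# HYPOTHETICAL replicate values (illustrative only)
values = [4, 6, 8, 10, 12]
n = len(values)
mean = sum(values) / n                       # 8.0

deviations = [x - mean for x in values]      # can be positive or negative
squared = [d ** 2 for d in deviations]       # squaring removes the signs
sum_of_squares = sum(squared)                # 40.0
variance = sum_of_squares / (n - 1)          # 10.0
sd = math.sqrt(variance)                     # back to the original units

print(round(sd, 4))                          # 3.1623
print(math.isclose(sd, statistics.stdev(values)))  # True
```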
MEASURES OF VARIABILITY:
STANDARD ERROR
A statistic that indicates how accurately the sample mean
represents the true mean of the whole population. It is
computed from the standard deviation (sd) and the number of
replicates (n):
se = sd/√n
https://www.youtube.com/watch?v=A82brFpdr9g
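A minimal sketch of the se = sd/√n formula in Python, using hypothetical replicate values:

```python
import math
import statistics

# HYPOTHETICAL replicate values (illustrative only)
values = [4, 6, 8, 10, 12]

sd = statistics.stdev(values)        # sample standard deviation
se = sd / math.sqrt(len(values))     # standard error of the mean

print(round(se, 4))  # 1.4142
```

Because n is in the denominator, collecting more replicates with the same spread gives a smaller standard error.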
MEASURES OF VARIABILITY:
CONFIDENCE INTERVAL
• The reliability of a sample mean, as a predictor of the true mean, can be
quantified by computing confidence intervals (CI).
• A CI is a range surrounding the sample mean.
• You define the % confidence you will accept in your data; most commonly
we use 95%.
• It is derived from the standard error.
• The 95% CI of the mean is approximately the sample mean ± 2 times the standard error (2*se).
• How to interpret this: we can be 95% confident that the TRUE MEAN for
the population is somewhere between these two values.
https://www.youtube.com/watch?app=desktop&v=w3tM-PMThXk
https://www.youtube.com/watch?v=yDEvXB6ApWc
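The ± 2*se rule can be sketched as a small Python function. The function name `ci95` and the data are hypothetical. The multiplier of 2 is the approximation used in these slides; for small samples the exact multiplier comes from the t distribution and is somewhat larger.

```python
import math
import statistics

def ci95(values):
    """Approximate 95% CI of the mean as mean ± 2 * se (rule of thumb)."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    half_width = 2 * se
    return mean - half_width, mean + half_width

# HYPOTHETICAL replicate values (illustrative only)
lower, upper = ci95([4, 6, 8, 10, 12])
print(round(lower, 2), round(upper, 2))  # 5.17 10.83
```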
https://meta-calculator.com/blog/how-to-interpret-t-test-results/
And if not...
We have alternatives...
• Data transformation (log, z, etc.)
• Nonparametric analyses (more flexible, less precise)
https://medium.com/mytake/understanding-different-types-of-distributions-you-will-encounter-as-a-data-scientist-27ea4c375eec
SE= sd/√n
• t table
HYPOTHESIS TESTING
(6) outcome and interpretation.
https://www.tdistributiontable.com/
https://invasivespecies.idaho.gov/leafy-spurge-factsheet
https://www.youtube.com/watch?app=desktop&v=qhC8ihrdxHs
How we go about it
Once we have our objectives, hypothesis and predictions stated, we can start:
• First, we must design a study to test our predictions: how do we manipulate or model the
independent variable in order to detect changes in the dependent variable (response variable) for
our predicted outcome?
• Experimental design: Compare big bluestem seed production on 5 plants from plots with
and without leafy spurge.
• Second, we must determine the location and apply treatments.
• Application of treatments: two 25 m² plots of land; one with leafy spurge removed
(treatment plot) and one with leafy spurge present (control plot).
• Third, we must collect data.
• Sample: Collect five plants from each plot and count the number of seeds on each plant.
• Fourth, we must analyze data using statistical procedures and generate graphs and tables (focus
of the next chapter).
• Statistical test: Are seed counts per stem different in the two plots (t-test and bar graph)?
• Fifth, based on the statistical results, we can answer whether the plots with leafy spurge did or
did not produce more big bluestem seeds than the plots without leafy spurge (whether or not the
results support our hypothesis) and, of course, explain why.
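The statistical test in the fourth step can be sketched as a two-sample t-test computed by hand. This is a minimal sketch under stated assumptions: the seed counts are invented for illustration (they are not results from any real plots), the helper name `pooled_t` is hypothetical, and the test shown is the equal-variance (pooled) Student's t-test. The critical value 2.306 is the two-tailed 5% value for 8 degrees of freedom, as read from a standard t table.

```python
import math
import statistics

# HYPOTHETICAL seed counts per plant, five plants per plot (invented data)
removed = [30, 34, 28, 36, 32]   # leafy spurge removed (treatment plot)
present = [22, 26, 20, 24, 28]   # leafy spurge present (control plot)

def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, df

t, df = pooled_t(removed, present)
T_CRITICAL = 2.306  # two-tailed, alpha = 0.05, df = 8 (from a t table)

print(f"t = {t:.2f}, df = {df}")
print("means differ" if abs(t) > T_CRITICAL else "no detectable difference")
```

With these invented numbers, t is about 4 with 8 degrees of freedom, which exceeds the critical value, so the two plot means would be judged different at the 5% level.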