• Sorting: the process of arranging data into a meaningful order so that you can analyze it more
effectively.
• Mean: the average of a collection of numbers (the sum of the values divided by how many there are).
• Median: the middle value of a data set once it is sorted. With an even number of values, it is
the average of the two middle values.
• Mode: the value that appears most often in a set of numbers.
• Histogram: a graphical representation of continuous data, where the x-axis represents the range
of values and the y-axis represents the number or percentage of occurrences in the data.
• Frequency distribution: a graph or data set organized to show the frequency of occurrence of
each possible outcome.
• Standard deviation: a measure of how dispersed the data is in relation to the mean.
• Variance: the degree of spread in the data, measured as the average squared deviation from the
mean. It shows how individual numbers relate to each other within a data set.
• Use of statistics: to summarize existing data and to make predictions about the future from it.
• Null hypothesis: the statement or claim that there is no difference in the true means or
proportions of the groups being compared.
• p-value: the probability of obtaining results at least as extreme as those observed if the null
hypothesis is true. It is used to test a hypothesis against observed data.
p < 0.05 = statistically significant difference; the null hypothesis is rejected.
p > 0.05 = no statistically significant difference; the null hypothesis is not rejected (this
does not prove that there is no effect).
• α-value: the threshold for statistical significance. In most cases, researchers use an alpha of
0.05, which means they accept a 5% chance of rejecting the null hypothesis when it is actually
true. A result is called significant when its p-value falls below alpha.
• t-distribution: a way of describing a set of observations in which most observations fall close
to the mean and the remaining observations make up the tails on either side. It resembles the
normal distribution but has heavier tails.
• Chi-square distribution: describes the distribution of a sum of squared independent standard
normal random variables. It starts at zero and continues to infinity.
• F-distribution: used to test the equality of the variances of two normal populations.
• Test of significance: a formal procedure for comparing observed data with a claim (also called a
hypothesis).
• Regression: a statistical technique that relates a dependent variable to one or more
independent variables.
• Correlation: a statistical measure that expresses the extent to which two variables are linearly
related. It is denoted by "r" and lies between -1 and +1. The closer the value of r is to ±1, the
stronger the linear relationship between the variables.
• ANOVA test: helps you find out whether the differences between the means of groups of data are
statistically significant.
• How to do ANOVA:
1) Find the mean for each of the groups.
2) Find the overall mean.
3) Find the within-group variation.
4) Find the between-group variation.
• F factor (F ratio): variation between the sample means divided by the variation within the
samples, with each variation first divided by its degrees of freedom.
• Level of significance at which the ANOVA test is performed: usually α = 0.05, or 5%. (This means
that you reject the null hypothesis only when results at least as extreme as yours would have a
5% chance of occurring, or less, if the null hypothesis is actually true.)
• Time series data: data that is recorded at consistent intervals of time. (Example: temperature
readings used to predict future temperatures.)
• Control charts: used to routinely monitor the quality of a process, such as baking a cake.
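
The ANOVA steps and the F factor listed above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function name one_way_anova_f and the sample numbers are made up for this example and are not part of the notes.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F ratio for a list of groups (each group is a list of numbers).

    Hypothetical helper written for illustration only.
    """
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # total number of observations

    # 1) Find the mean for each of the groups.
    group_means = [mean(g) for g in groups]

    # 2) Find the overall mean of all observations.
    overall_mean = mean([x for g in groups for x in g])

    # 3) Within-group variation: squared distance of each value
    #    from the mean of its own group.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    # 4) Between-group variation: squared distance of each group mean
    #    from the overall mean, weighted by the group size.
    ss_between = sum(
        len(g) * (m - overall_mean) ** 2 for g, m in zip(groups, group_means)
    )

    # F factor: each variation is divided by its degrees of freedom first.
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up sample data: three groups of three observations each.
sample_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio = one_way_anova_f(sample_groups)
print(round(f_ratio, 2))  # 13.0
```

For this sample the F ratio has 2 and 6 degrees of freedom. The tabulated critical value at α = 0.05 is about 5.14, so an F of 13 would be judged statistically significant.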
