Notes 1 For Week 1 PDF


Random Phenomenon or Experiments

A Random Phenomenon is a condition that generates variation in outcomes when observed.

With a random phenomenon, each attempt (or trial) that is observed produces an outcome that usually
varies from one attempt to another; the same outcome is not guaranteed to occur every time.

This is the opposite of Certainty, where outcomes repeat exactly when observed under the same conditions.

In real life, all phenomena are random. Certainty is observed only through laws that assume that certain
factors have no effect on the condition that is subject to those factors.

Examples:
From real life:
Mobile app size in MB
Time in minutes it takes you to get back home from work or study on a given day
A friend answering your call at a given try
Number of new cases per day testing positive for Covid19 in Lebanon

From engineering and scientific background:


The outcome of the slump test of a concrete mix
The number of defects produced in one batch run
The amount of CO2 emitted per 1 milliliter of a gasoline grade when combusted
The number of bacterial clusters obtained in a plate culturing experiment
The measured voltage potential read on a certain electronic device

Outcomes and Sample Spaces
A random Outcome is the result of the random phenomenon when observed on a subject or event.
The Sample Space is the set of all possible outcomes of a random phenomenon or random experiment.

Back to our examples:

1- Mobile App size in MB


Just observe the size of any App in the App store or Play store. Record its size.
Most recently observed from Play Store Microsoft Outlook: 67 MB, Google Earth:8.3 MB

Sample space: it is a range from the smallest size to the largest. [0.0024 MB , 500 MB] ?

2- Time in minutes it takes you to get back home from work or study on a given day
It depends on you. The sample space is [5 min, 100 min] ?

3- A friend answering your call at a given try


Here, only two outcomes are possible: answered or not answered. You may use 0 if not and 1 if yes; thus the
sample space is {0,1}, a set of two values rather than an interval.

4- Number of new cases per day testing positive for Covid19 in Lebanon
Just check the updated chart today. The sample space is {0,1,2,3,…,27,…}. You may want to set an
upper limit. You may, but state what that limit is!

The other examples depend on the experimentation and observation. In the end, however, a specific sample
space must be identified.
Remember, the sample space comprises all possible outcomes of the random phenomenon.
Those outcomes are expected when the members of the population that are subject to the phenomenon
are observed.

The Probability Measure (or simply Probability)
Definition: the probability is a real number between 0 and 1 that measures the likelihood, or chance,
of a set of possible outcomes, which is a subset of the sample space.
The set of possible outcomes is usually characterized by a property, and this set is called an Event.
So, an Event is a specific condition of the random phenomenon that can be described by a subset of
outcomes from the sample space.

Example:
You are attempting to answer a True/False question.
The outcome is either Correct or Wrong (assume that giving no answer counts as a wrong outcome).
An event is, for example, {a correct answer}=C. The other event is obviously {a wrong answer}=W.
With an uneducated guess, you measure the likelihood of either C or W as 50% (the 50/50 guess).
Hence, the probability of C, and likewise of W, is 0.5.
This probability measure can change if you have some knowledge of the likely correct answer.
So, if you studied for the exam, you may measure the likelihood of a correct answer for a given T/F
question to be something like 0.95 (i.e. a 95% chance that it is correct).

Example:
The famous die experiment. Throw a die

Source: MATerials MATemàtics, Volum 2015, treball no. 3, 42 pp. ISSN: 1887-1097, Publicació
electrònica de divulgació del Departament de Matemàtiques de la Universitat Autònoma de
Barcelona, www.mat.uab.cat/matmat

An outcome can be obtaining the up-showing face (the face) with 3 dots, say 3.
The sample space is S={1,2,3,4,5,6}
Assume the die is fair in the outcome chances (all faces are equally likely to show up).
Then the probability for any outcome is $\frac{1}{6}$.

Consider the event A={obtaining an odd number of dots on the die face}={1,3,5}
The probability for A is therefore 3 chances out of 6, or $\frac{3}{6} = \frac{1}{2}$.
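
The die calculation above can be sketched in a few lines of Python. This is only an illustration under the stated assumption of a fair die (equally likely faces); the helper name `probability` is hypothetical and not part of the notes.

```python
from fractions import Fraction

# Sample space of a six-sided die; all outcomes are assumed equally likely.
sample_space = {1, 2, 3, 4, 5, 6}

def probability(event, space):
    """Probability of an event under equally likely outcomes: |event| / |space|."""
    favourable = event & space  # keep only outcomes that belong to the sample space
    return Fraction(len(favourable), len(space))

# Event A: an odd number of dots shows on the up-facing side.
A = {outcome for outcome in sample_space if outcome % 2 == 1}

print(probability({3}, sample_space))  # 1/6
print(probability(A, sample_space))    # 1/2
```

Counting outcomes this way is valid only when every outcome is equally likely; for a loaded die each face would need its own probability.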

The Population
The population consists of the subjects or events on which the random phenomenon is observed.
The population must be clearly identified when a statistical analysis is conducted.

Example: Students’ cumulative GPAs (CGPA) in BAU at the end of the Spring semester.
An outcome is any value between 0 (theoretically) and 4.0. So, the sample space for the CGPA is the
interval of real numbers [0.0 , 4.0]
The population consists of all BAU active students where each student will end up with a given
CGPA…

Example: Mobile App size in MB


The outcome is any value within [0.0024 MB , 500 MB]
The population consists of the category of Apps you want to observe for your analysis. i.e. News Apps,
Entertainment Apps, or even all Apps. Be specific about the population that you are concerned about
in your analysis.

Example: A friend answering your call at a given try


The sample space is {0,1}
The population consists of all the times you tried to call a friend. Obviously, some outcomes are
successes (given value of 1) and others are failures (given value of 0). Then if you record these trials
starting now, they may look like 0,1,1,1,1,0,1,0,0,1,1,….
However, your population should also include all previous trials.

Example: Number of new cases per day testing positive for Covid19 in Lebanon
The sample space is {0,1,2,3,….,27,…}
The population consists of all observed daily counts of new cases from the beginning of the pandemic
onward.
You may get many observations having the same outcome, e.g. “No new case” i.e. the value of 0.

How to determine the probability of an outcome

The occurrence of the given outcome when observed by trying many members (subjects or events)
from the population can be counted and a relative frequency of its occurrence can be computed using
the rule:
$$\text{Relative frequency of occurrence} = \frac{\text{Number of occurrences of the event}}{\text{Number of trials or experiments performed}}$$

As the number of trials or experiments performed gets larger and larger, the relative frequency of
the outcome converges to a fixed value.

Try it yourselves: Toss a coin and find the relative frequency of getting “heads” through subsequent
trials and record the relative frequency of heads every time. On the first flip say you get “tail”, the
proportion of heads is 0. Say on the next toss you get a “head”, then the proportion of heads goes up
to 0.5. Suppose you get on the next two flips just “tails”, now the proportion of heads in the four flips
is down to 0.25. Continue flipping the coin – 50 times, 100 times, 1000 times. Over the long term a
pattern emerges – the proportions hover around 0.5 – as can be seen in the following Figure.

Source: https://www.learner.org/wp-content/uploads/2019/03/AgainstAllOdds_StudentGuide_Unit18-Introduction-to-Probability.pdf

The above is called:

The Law of Large Numbers: the long-run relative frequency of an event over repeated independent trials
settles toward a fixed value as the number of trials increases.
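
The coin-tossing experiment can also be tried on a computer. The sketch below is a simulation, not a proof: it assumes a fair coin, uses Python's pseudo-random generator with a fixed seed so that runs are repeatable, and the function names are hypothetical.

```python
import random

def running_proportion(outcomes, target="H"):
    """Relative frequency of `target` after each trial in a sequence of outcomes."""
    count = 0
    proportions = []
    for trial, outcome in enumerate(outcomes, start=1):
        if outcome == target:
            count += 1
        proportions.append(count / trial)
    return proportions

# The hand example from the text: tail, head, tail, tail.
print(running_proportion("THTT")[-1])  # 0.25

def simulate_tosses(num_tosses, seed=0):
    """Simulate fair coin tosses as a list of 'H' and 'T' (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.choice("HT") for _ in range(num_tosses)]

proportions = running_proportion(simulate_tosses(10_000))
for n in (10, 100, 1_000, 10_000):
    # Proportion of heads after the first n tosses.
    print(n, round(proportions[n - 1], 3))
```

Changing the seed changes the early proportions noticeably, while the proportions after many tosses should stay close to 0.5.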


Sampling
Sampling is the process of selecting random members from a population to observe whether a specified
event is detected. Remember the event can be a single outcome or a range of outcomes that fulfill a
given characteristic or requirement.

Sampling must be random to be fair and unbiased.
A biased sample does not reflect the truth and is misleading, and thus non-scientific.
Sampling techniques fall into two groups, of which only the first relies on random selection:
• Probability Sampling Techniques: simple random, stratified random, cluster, systematic, and
multi-stage sampling.
• Non-probability Sampling Techniques: quota, snowball, judgment, and convenience sampling.
Simple random sampling is where every member in the population has an equal chance and likelihood
of being selected in the sample.
Stratified random sampling requires dividing the population into smaller sub-groups, known as strata,
that share a common descriptive characteristic and form horizontal layers inside the
population. Then, observations are selected equally but randomly from each stratum.
Cluster sampling creates multiple clusters or groups of people from a population clearly separated
through indicative and homogeneous characteristics such as geographic, demographic, ethnic, habit,
or background. Selected observations from each cluster must have an equal chance of being a part of
the sample.
Systematic sampling selects the sample members from a larger population according to a random
starting point but with a fixed, periodic interval. This interval, called the sampling interval, is
calculated by dividing the population size by the desired sample size.
Multistage sampling is the taking of samples in stages using smaller and smaller sampling units at each
stage. Multistage sampling can be a complex form of cluster sampling.
Quota sampling creates a tailored sample according to a prespecified proportion, called the quota, of
some characteristic or trait of the population. The population is first stratified, then observations are
taken from each stratum up to the quota; being a non-probability technique, the selection within a
stratum need not be random.
Snowball sampling (a.k.a. chain sampling, chain-referral sampling, or referral sampling) collects
observations in a way that existing study subjects “recruit” future subjects from among their
acquaintances.
Judgment sampling, also called expert, purposive, or authoritative sampling, selects
the sample observations based on the opinion of an expert.
Convenience sampling selects observations from "convenient" sources of data for the researcher.
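
Three of the probability techniques described above can be sketched with Python's standard `random` module. This is a minimal illustration, not a survey-grade implementation: the population of 100 numbered members, the two strata, and all function names are hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has the same chance of selection (drawn without replacement)."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Random starting point, then every k-th member, with k = population size // n."""
    k = len(population) // n   # the sampling interval
    start = rng.randrange(k)   # random start within the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw the same number of members at random from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: members numbered 1 to 100.
population = list(range(1, 101))
rng = random.Random(42)  # fixed seed so the example is repeatable

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
# Hypothetical strata: the first 40 members and the remaining 60.
strata = {"stratum_A": population[:40], "stratum_B": population[40:]}
stratified = stratified_sample(strata, 5, rng)

print(srs)
print(systematic)
print(stratified)
```

In the systematic sample, consecutive members are always exactly one sampling interval apart (here 100 // 10 = 10); only the starting point is random.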

Describing the sample
The sample can be described either numerically or graphically.
Numeric description of the sample data:
• Measures of location
• Measures of variability

Measures of location
Also called measures of central tendency.

Example:
The following measurements were recorded for the drying time, in hours, of a certain brand of latex
paint. 3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6, 3.7, 2.8, 4.4, 4.0, 5.2, 3.0, 4.8. Assume that the measurements
are a simple random sample. What is the sample size for the above sample?

Solving manually:

However, use a scientific calculator to make your life easy! Using such a calculator is a must. The
procedure varies from one calculator brand or type to another. Get one and learn how to use it for
statistical calculations.
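
As an alternative to a calculator, the drying-time example can be checked with a short Python sketch. It assumes that the measure of location intended here is the sample mean (the sum of the observations divided by the sample size); the variable names are illustrative.

```python
# Drying times in hours, as listed in the example.
drying_times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6, 3.7, 2.8,
                4.4, 4.0, 5.2, 3.0, 4.8]

n = len(drying_times)                 # sample size
sample_mean = sum(drying_times) / n   # sum of observations divided by n

print(n)                      # 15
print(round(sample_mean, 3))  # 3.787
```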

Other measures of location

The sample Median.

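The data must first be ordered from smallest to largest; then, based on the sample size n, the appropriate
formula is used: for odd n the median is the middle value, at position (n+1)/2, and for even n it is the
average of the two middle values, at positions n/2 and n/2 + 1.
Example: continuing with the drying-time data from before.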
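
A minimal Python sketch of the median for the drying-time data follows. It uses the usual convention (middle value for an odd sample size, average of the two middle values for an even one); the function name `sample_median` is hypothetical.

```python
def sample_median(data):
    """Median of a sample: order the data, then pick or average the middle value(s)."""
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                         # odd n: the single middle value
    return (ordered[middle - 1] + ordered[middle]) / 2  # even n: average of the two

drying_times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6, 3.7, 2.8,
                4.4, 4.0, 5.2, 3.0, 4.8]

print(sample_median(drying_times))  # 3.6
```

With 15 observations the median is the 8th value in the ordered list.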

The Mode
The mode is the observation that occurs most often in the sample.

Example:
The following is the number of calls making orders from a supermarket for the last 20 days.
4, 2, 3, 4, 5, 3, 5, 6, 5, 5, 3, 4, 5, 6, 2, 3, 4, 3, 6, 3
Counting the occurrences, the value 3 appears 6 times and the value 5 appears 5 times, so the mode is 3.
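
A frequency tally makes the mode easy to verify. The sketch below counts the supermarket-call data exactly as listed, using `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

# Number of order calls on each of the last 20 days, as listed in the example.
calls = [4, 2, 3, 4, 5, 3, 5, 6, 5, 5, 3, 4, 5, 6, 2, 3, 4, 3, 6, 3]

counts = Counter(calls)
for value in sorted(counts):
    print(value, counts[value])  # each value with its frequency

highest = max(counts.values())
# A sample can have more than one mode, so collect every value with the top count.
modes = sorted(value for value, count in counts.items() if count == highest)
print(modes)  # [3]
```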

Measures of variability

Sample Range
The sample range is the difference between the largest and the smallest observations in the sample.
$\text{Range} = X_{\max} - X_{\min}$

Sample Variance and Sample Standard Deviation

Example:
The data is: 3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6, 3.7, 2.8, 4.4, 4.0, 5.2, 3.0, 4.8
The use of a calculator is essential.
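
The measures of variability for this data can also be computed in Python. The sketch assumes the usual definition of the sample variance, with divisor n - 1, and takes the sample standard deviation as its square root; the variable names are illustrative.

```python
import math

drying_times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6, 3.7, 2.8,
                4.4, 4.0, 5.2, 3.0, 4.8]

# Sample range: largest observation minus smallest observation.
sample_range = max(drying_times) - min(drying_times)

n = len(drying_times)
mean = sum(drying_times) / n
# Sample variance: sum of squared deviations from the mean, divided by n - 1 (assumed).
sample_variance = sum((x - mean) ** 2 for x in drying_times) / (n - 1)
sample_std = math.sqrt(sample_variance)

print(round(sample_range, 1))     # 3.1
print(round(sample_variance, 3))  # 0.943
print(round(sample_std, 3))       # 0.971
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and can serve as a cross-check.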

Graphical methods to describe the sample
1- The scatter plot
2- The histogram
3- The stem-and-leaf plot
4- The box plot
Note: drawing such plots by hand usually takes time, so computer packages are normally used instead.
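
As a small illustration of letting the computer do the drawing, the sketch below builds a text stem-and-leaf plot for the drying-time data using only the Python standard library. It assumes non-negative data recorded to one decimal place, with the integer part as the stem and the first decimal as the leaf; the function name is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return {stem: [leaves]} for non-negative data recorded to one decimal place."""
    plot = defaultdict(list)
    for value in sorted(data):
        # 3.4 -> 34 -> stem 3, leaf 4; rounding guards against float error.
        stem, leaf = divmod(round(value * 10), 10)
        plot[stem].append(leaf)
    return dict(plot)

drying_times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6, 3.7, 2.8,
                4.4, 4.0, 5.2, 3.0, 4.8]

for stem, leaves in stem_and_leaf(drying_times).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
# 2 | 5 8 8 9
# 3 | 0 3 4 6 7
# 4 | 0 4 8 8
# 5 | 2 6
```

Scatter plots, histograms, and box plots are normally produced with a plotting package rather than as text.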
