Data Analysis


Numerical Methods for Describing Data

1) Average = arithmetic mean = mean

$\text{Average} = \frac{\text{sum of } n \text{ numbers}}{n}$

Sum of n numbers = 𝑛 × 𝐴𝑣𝑒𝑟𝑎𝑔𝑒

For {5, 9, 1, 7, 0, 9, 2}: $\text{Average} = \frac{5 + 9 + 1 + 7 + 0 + 9 + 2}{7} = \frac{33}{7} \approx 4.71$

To determine the mean number of children per
household in a community, Tabitha surveyed 20
families at a playground. For the 20 families surveyed,
the mean number of children per household was 2.4.
Which of the following statements must be true?

A) The mean number of children per household in the community is 2.4.

B) A determination about the mean number of children per household in the
community should not be made because the sample size is too small.

C) The sampling method is flawed and may produce a biased estimate of the
mean number of children per household in the community.

D) The sampling method is not flawed and is likely to produce an unbiased
estimate of the mean number of children per household in the community.

The average of a set of n numbers is x. If each number is increased or
decreased by c, then what is the average of the new set of numbers?

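$\frac{x_1 + \cdots + x_n}{n} = x \;\Rightarrow\; \frac{(x_1 + c) + \cdots + (x_n + c)}{n} = \frac{x_1 + \cdots + x_n + nc}{n} = x + c$

(If each number is decreased by c, the same argument gives x − c.)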

The average of a set of n numbers is x. If each number is multiplied by
c, then what is the average of the new set of numbers?

$\frac{x_1 + \cdots + x_n}{n} = x \;\Rightarrow\; \frac{cx_1 + \cdots + cx_n}{n} = \frac{c(x_1 + \cdots + x_n)}{n} = cx$
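
As a quick numerical check of the two rules above (adding c shifts the mean by c, multiplying by c scales it by c), here is a minimal sketch. The sample data and the constant `c = 3` are illustrative assumptions, not values taken from the problems.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean as an exact fraction."""
    return Fraction(sum(values), len(values))


data = [5, 9, 1, 7, 0, 9, 2]  # example set; its mean is 33/7
c = 3                         # hypothetical constant, chosen for illustration

shifted = [v + c for v in data]  # every number increased by c
scaled = [v * c for v in data]   # every number multiplied by c

print(mean(shifted) == mean(data) + c)  # shifting the data shifts the mean
print(mean(scaled) == mean(data) * c)   # scaling the data scales the mean
```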
Sam earned a $2,000 commission on a big sale, raising his average commission by
$100. If Sam's new average commission is $900, how many sales has he made?

Let n = number of old sales and S = sum of the old commissions.

Old average = 900 − 100 = 800, so $\frac{S}{n} = 800$ and $S = 800n$.

New average = $\frac{S + 2000}{n + 1} = 900 \;\Rightarrow\; \frac{800n + 2000}{n + 1} = 900$

→ 800n + 2000 = 900(n + 1) → 100n = 1100 → n = 11

→ total sales = 11 + 1 = 12
2) Median

In an ordered (sorted) set:
If n is odd: the middle number
If n is even: the average of the 2 middle numbers

{5, 9, 1, 7, 0, 9, 2} → {0, 1, 2, 5, 7, 9, 9}, M = 5

{6, 2, 4, 5, 1, 5} → {1, 2, 4, 5, 5, 6}, $M = \frac{4 + 5}{2} = 4.5$
Consider the set {4, x, 9, 15, 15, 27, 32}.
Here n = 7, and no matter what x is, the median is 15.

Now consider the set {x, 2, 5, 11, 12, 12, 33}.

Again n = 7, but:
If x ≤ 11 → median = 11.
If 11 < x < 12 → median = x.
If x ≥ 12 → median = 12.
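
The case analysis above can be checked with Python's `statistics.median`. This is only a sketch: the helper name `median_with` and the trial values of x are assumptions made for illustration.

```python
from statistics import median


def median_with(x):
    """Median of the second set from the text with a value filled in for x."""
    return median([x, 2, 5, 11, 12, 12, 33])


print(median_with(3))     # a value with x <= 11
print(median_with(11.5))  # a value with 11 < x < 12
print(median_with(40))    # a value with x >= 12

# In the first set the median does not depend on x at all.
always_15 = all(median([4, x, 9, 15, 15, 27, 32]) == 15 for x in range(-50, 51))
print(always_15)
```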
R is a list of 15 consecutive integers, and T is a list of 21 consecutive
integers. The median of the integers in list R is equal to the least integer in
list T. If the two lists are combined into one list of 36 integers, how many
different integers are on the combined list?

The median of the numbers in list R is the middle number when the numbers are listed in order
from least to greatest, that is, the 8th number. Since the median of the numbers in list R is equal to
the least integer in list T, the 8 greatest integers in R are the 8 least integers in T, and the number
of different integers in the combined list is 15 + 21 – 8, or 28.
In evenly spaced sets: mean = median.

The mean and median of the set are equal to the average of the FIRST and LAST terms.
(Or any two terms that are symmetric about the center of the set)

The average of the set {5, 10, 15, 20, 25, 30} is $\frac{5 + 30}{2} = \frac{10 + 25}{2} = \frac{15 + 20}{2} = 17.5$

The average of the set {101, 111, 121, …, 581, 591, 601} is equal to $\frac{601 + 101}{2} = 351$

What is the value of 3 + 7 + 11 + ⋯ + 83?

The set 3, 7, 11, …, 83 is evenly spaced with $\frac{83 - 3}{4} + 1 = 21$ members.

$\text{mean} = \frac{83 + 3}{2} = 43 \;\rightarrow\; \text{sum} = 21 \times 43 = 903$
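
The count-then-multiply method for evenly spaced sets can be sketched as a small function. The name `evenly_spaced_sum` is hypothetical; the function assumes integer terms and that `last` is actually reached from `first` in steps of `step`.

```python
def evenly_spaced_sum(first, last, step):
    """Return (count, total) for the terms first, first + step, ..., last."""
    count = (last - first) // step + 1
    # mean = (first + last) / 2, so total = count * (first + last) / 2
    total = count * (first + last) // 2
    return count, total


print(evenly_spaced_sum(3, 83, 4))      # the series 3 + 7 + 11 + ... + 83
print(evenly_spaced_sum(101, 601, 10))  # the set {101, 111, ..., 601}
```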
3) Mode

The numbers that occur most often are the modes.

{5, 6, 1, 9, 7, 4, 6, 3} Mode =6

{6, 2, 4, 2, 9, 7, 7, 9, 5, 2, 7} Mode = 2, 7
{8, 1, 5, 4, 9, 2} Mode= 8, 1, 5, 4, 9, 2
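
A short sketch that reproduces the three mode examples with `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for the highest count. Note that it follows the same convention as the last example: when no value repeats, every value is reported as a mode.

```python
from statistics import multimode

print(multimode([5, 6, 1, 9, 7, 4, 6, 3]))           # one mode
print(multimode([6, 2, 4, 2, 9, 7, 7, 9, 5, 2, 7]))  # two modes
print(multimode([8, 1, 5, 4, 9, 2]))                 # every value occurs once
```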
The modes of a set of 9 numbers are x, y, and z, and the average
(arithmetic mean) of the 9 numbers is 20. Three of the 9 numbers
are 2x + 5, 2y, and 2z - 3. What is the value of 4(x + y + z)?
Weighted average (Average of 2 sets)

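List A has n numbers with average a → sum of the numbers in A = na

List B has m numbers with average b → sum of the numbers in B = mb

Total average of A and B = $\frac{na + mb}{n + m}$

The total average always lies between a and b:
If n > m, it is closer to a.
If n = m, it equals $\frac{a + b}{2}$.
If n < m, it is closer to b.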
Each employee of a certain company is in either Department X or
Department Y, and there are more than twice as many employees in
Department X as in Department Y. The average (arithmetic mean) salary
is $25,000 for the employees in Department X and $35,000 for the
employees in Department Y. Which of the following amounts could be
the average salary for all of the employees of the company?
A) $26,000
B) $29,000
C) $30,000
D) $31,000

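Since X > 2Y, more employees earn the lower average salary, so the average
salary of all employees must be less than the average of $25,000 and $35,000, which
is $30,000. More precisely, if X were exactly 2Y the overall average would be
(2 × $25,000 + $35,000) / 3 ≈ $28,333, and with X > 2Y it is lower still. Only
$26,000 is below $28,333, so the answer is A.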
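
A numerical check of the weighted-average formula on the department example. This is a sketch: `combined_average` is a hypothetical helper, and the head counts (2 to 1 and 9 to 1) are illustrative assumptions, since the problem only states that X has more than twice as many employees as Y.

```python
def combined_average(n, a, m, b):
    """Average of two groups: n values averaging a and m values averaging b."""
    return (n * a + m * b) / (n + m)


# Boundary case X = 2Y: the overall average is about 28,333.
boundary = combined_average(2, 25_000, 1, 35_000)

# One possible case with X > 2Y (assumed ratio 9 : 1).
example = combined_average(9, 25_000, 1, 35_000)

print(round(boundary, 2), example)
```

Any ratio with X > 2Y gives a value strictly between $25,000 and the boundary value, which is why only choice A is possible.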
Quartiles and Percentiles

Like the median M, quartiles and percentiles are numbers that divide the data into roughly equal
groups after the data have been ordered from the least value L to the greatest value G.
There are three quartile numbers that divide the data into four roughly equal groups.

The first quartile 𝑸𝟏 , the second quartile 𝑸𝟐 (which is simply the median M), and the third
quartile 𝑸𝟑 divide a group of data into four roughly equal groups as follows.

{−5, 0, 2, 2 , 3, 3, 3, 5, 7, 7, 7, 7, 8 , 10, 10, 11, 15}

$Q_1 = \frac{2 + 3}{2} = 2.5, \quad Q_2 = M = 7, \quad Q_3 = \frac{8 + 10}{2} = 9$
$Q_2$ is the median of all the data, $Q_1$ is the median of the data that come before M in the ordered list, and $Q_3$ is the
median of the data that come after M.
Sometimes a data value is unusually small or large in comparison with the rest of the data. Such
data are called outliers.
The middle half of the data lies between $Q_1$ and $Q_3$, and the interquartile range is defined as $Q_3 - Q_1$.
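
The median-of-halves rule described above can be sketched as follows. The function name `quartiles` is hypothetical, and this is only one of several quartile conventions; library routines such as `statistics.quantiles` may return slightly different values.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]                            # data before the median position
    upper = s[half + 1:] if n % 2 else s[half:]  # data after the median position
    return median(lower), median(s), median(upper)


data = [-5, 0, 2, 2, 3, 3, 3, 5, 7, 7, 7, 7, 8, 10, 10, 11, 15]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3, q3 - q1)  # the last value is the interquartile range
```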
There are 99 percentiles that divide the data into 100 roughly equal groups.

Percentiles are mostly used for very large lists of numerical data ordered from least to greatest.

The 99 percentiles $P_1, P_2, P_3, \ldots, P_{99}$ divide the data into 100 groups. Consequently,

$Q_1 = P_{25}, \quad M = Q_2 = P_{50}, \quad Q_3 = P_{75}$.

If there are 1000 data values, then:
$P_1 = \frac{\text{10th term} + \text{11th term}}{2}$
$P_2 = \frac{\text{20th term} + \text{21st term}}{2}$
and so on.
The range of the numbers in a group of data is the difference between the greatest number G in the
data and the least number L in the data, that is, G – L.
The range of previous set is 15 − (−5) = 20
If x centimeters is at the 73rd percentile, then approximately 73% of the measurements in
the distribution are less than or equal to x centimeters. The 68 measurements that are
greater than y centimeters but less than x centimeters are $\frac{68}{850} \times 100\% = 8\%$ of the
distribution. Thus approximately 73% – 8%, or 65%, of the measurements are less than or
equal to y centimeters, that is, y is approximately at the 65th percentile in the distribution.
The correct answer is Choice E.
DATA ANALYSIS
Data analysis is used to understand data and predict future events.
Data can be organized and summarized using tables, graphical and numerical methods.

A survey was taken to find the number of children in each of 25 families. A list of the values collected
in the survey follows.
1, 2, 0, 4, 1, 3, 3, 1, 2, 0, 4, 5, 2, 3, 2, 3, 2, 4, 1, 2, 3, 0, 2, 3, 1

Frequency and relative frequency distribution

Frequency is the number of times that the category or value appears in the data.
The relative frequency is the ratio of the associated frequency to the total number of data.
We call the number of children a variable.

Number of children   Frequency   Relative frequency
0                    3           12%
1                    5           20%
2                    7           28%
3                    6           24%
4                    3           12%
5                    1           4%
Total                25          100%
1 3 5 3 7
2 +7 +8 +8 +9 32
2 4 4 2 4
= ≈ 0.94
2+7+8+8+9 34
Bar Graphs

In a bar graph, rectangular bars are used to represent the categories of the data, and the
height of each bar is proportional to the corresponding frequency or relative frequency.

[Figure: bar graph "Number of children in 25 families"; x-axis: Number of children (0 to 5); y-axis: Number of families]
The chart above depicts the number of electoral votes assigned to each
of the six New England states. What is the average (arithmetic mean)
number of electoral votes, to the nearest tenth, assigned to these states?
Segmented bar graph

Bar graphs are also used to compare different groups using the same categories.

[Figure: two bar graphs titled "FALL 2009 ENROLLMENT AT FIVE COLLEGES", one of them segmented into Full-time and Part-time; x-axis: Colleges (College A to College E); y-axis: Enrollment (0 to 8000)]
The chart shows year‐end values for Darnella’s
investments. For just the stocks, what was the
increase in value from year‐end 2000 to year‐end
2003 ?
(A) $1,000
(B) $2,000
(C) $3,000
(D) $4,000
(E) $5,000
Using bar graph to compare numerical data

[Figure: bar graph "Fall 2009 and Spring 2010 enrollment at five colleges" with paired bars for Fall 2009 and Spring 2010; x-axis: College A to College E; y-axis: Enrollment (0 to 8000)]
Circle Graphs (pie charts)
They illustrate how a whole is separated into parts.
They are usually used to show relative frequency.

The area of each sector is proportional to the percent of the whole that the sector represents,
and the measure of the central angle of a sector is proportional to the percent of 360 degrees
that the sector represents.

[Figure: circle graph "FALL 2009 ENROLLMENT AT FIVE COLLEGES": College A 15%, College B 16%, College C 18%, College D 23%, College E 28%]

If the jar contains 1200 marbles and
there are twice as many orange marbles
as there are green, how many green
marbles are there?

$O + G = 100\% - (30\% + 25\% + 20\%) = 25\%$

25% of 1200 = $\frac{1}{4}$ of 1200 = 300

O = 2G and O + G = 300

⇒ O = 200, G = 100
The annual budget of a certain college is to be shown
on a circle graph. If the size of each sector of the
graph is to be proportional to the amount of the
budget it represents, how many degrees of the circle
should be used to represent an item that is 15 percent
of the budget?
(A) 15°
(B) 36°
(C) 54°
(D) 90°
(E) 150°
Histograms

When a list of data is large, it is useful to organize it by grouping the values into intervals, often called
classes. To do this, divide the entire interval of values into smaller intervals of equal length
and then count the values that fall into each interval.

Histograms are useful for identifying the general shape of a distribution of data.
Scatterplots
A scatterplot has points that show the relationship between two sets of data.
Such data are called bivariate data.

A scatterplot makes it possible to observe an overall pattern, or trend. The closer the points lie
to the trend line, the easier it is to find a relationship between the two sets of data and to make predictions.

[Figure: scatterplot "Sales and temperature for 12 Ice creams" with trend line; x-axis: Temperature (°C), 0 to 30; y-axis: Sales, $0 to $700; labeled points on the trend line: (12, 190) and (25, 610)]

To estimate the slope, estimate the coordinates of any two points on the line:

$\frac{610 - 190}{25 - 12} = \frac{420}{13} \approx 32$

Warmer weather leads to more sales.
The scatterplot above shows the densities of
7 planetoids, in grams per cubic centimeter,
with respect to their average distances from
the Sun in astronomical units (AU). The line
of best fit is also shown.

According to the scatterplot, which of the following
statements is true about the relationship between a
planetoid’s average distance from the Sun and its
density?
A) Planetoids that are more distant from the Sun
tend to have lesser densities.
B) Planetoids that are more distant from the Sun
tend to have greater densities.
C) The density of a planetoid that is twice as far
from the Sun as another planetoid is half the
density of that other planetoid.
D) The distance from a planetoid to the Sun is
unrelated to its density.
(negative correlation)
One method of calculating the approximate age, in years,
of a tree of a particular species is to multiply the diameter
of the tree, in inches, by a constant called the growth
factor for that species. The table above gives the growth
factors for eight species of trees.

The scatterplot gives the tree diameter plotted
against age for 26 trees of a single species. The
growth factor of this species is closest to that of
which of the following species of tree?
A) Red maple
B) Cottonwood
C) White birch
D) Shagbark hickory
(positive correlation)
Time Plots or line graphs

A time plot (sometimes called a time series) is a graphical display useful for showing
changes in data collected at regular intervals of time.

[Figure: time plot "Fall enrollment for college A, 2001-2009"; x-axis: Year (2001 to 2009); y-axis: Enrollment (0 to 4500)]
In what year was the percent increase in the value of a share of stock B
the greatest?
Since the slope of the graph B is steepest in 2007 (between January 1, 2007 and January 1,
2008), the rate of growth was greatest then.

What was the average yearly increase in the value of a share of stock A from
2005 to 2010?

Over the 5-year period from January 1, 2005, to January 1, 2010, the value of a share of stock A rose
from $30 to $45, an increase of $15. The average yearly increase was $15 ÷ 5 years or $3 per year.
Boxplots or box-and-whisker plots

[Figure: boxplot "Fall enrollment for college A, 2001-2009" marking, from left to right, L, $Q_1$, $Q_2$, $Q_3$, and G]
Standard Deviation (SD)
SD is, roughly speaking, an average of the distances between the mean and each value.
The more the data are spread away from the mean, the greater the standard deviation; and vice versa.

Take the data set $a_1, a_2, a_3, \ldots, a_n$ with mean m.

$SD = \sqrt{\frac{(a_1 - m)^2 + (a_2 - m)^2 + \cdots + (a_n - m)^2}{n}}$
The process of subtracting the mean from each value and then dividing the result by the
standard deviation is called standardization.

In any group of data, most of the data are within about 3 standard deviations above or below the mean.

Informally, SD is the average distance between the mean and the values.

Variance := 𝜎 2 = 𝑆𝐷2

Units of standard deviation


If the SD of a data set is s then s=1 unit of standard deviation.
If each data point of a set is increased (or decreased) by a
constant C, the SD does not change.

If each data point of a set is multiplied by a constant C, the SD is multiplied by |C|:
if |C| > 1 the SD increases, and if |C| < 1 the SD decreases.
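
The effect of shifting and scaling on the standard deviation can be checked with `statistics.pstdev`, which divides by n. The sample data and the constant `c = -4` are assumptions chosen for illustration; a negative constant is used to show that the spread scales by the absolute value of the constant.

```python
from math import isclose
from statistics import pstdev

data = [5, 9, 1, 7, 0, 9, 2]  # illustrative data set
c = -4                        # hypothetical constant

sd = pstdev(data)                           # divides by n, not n - 1
sd_shifted = pstdev([v + c for v in data])  # every value shifted by c
sd_scaled = pstdev([v * c for v in data])   # every value multiplied by c

print(isclose(sd_shifted, sd))          # shifting leaves the SD unchanged
print(isclose(sd_scaled, abs(c) * sd))  # scaling multiplies the SD by |c|
```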
Counting methods

Informally, a set is a collection of objects that have some property; members cannot be repeated.

The objects of a set are called members or elements.

A set is either finite or infinite.

The set of students in a class is finite. The set of all positive integers, {1, 2, 3, …}, is infinite.

A set that has no members is called empty and is denoted by ∅ or {}.

If A and B are sets and all of the members of A are also members of B, then A is a subset of B.

A list is like a finite set, except that its members are ordered and can be repeated.
The subsets of the set {𝒘, 𝒙, 𝒚} are {𝒘}, {𝒙}, {𝒚}, {𝒘, 𝒙},
{𝒘, 𝒚}, {𝒙, 𝒚}, {𝒘, 𝒙, 𝒚}, and { } (the empty subset). How
many subsets of the set {𝒘, 𝒙, 𝒚, 𝒛} contain 𝒘 ?
(A) Four
(B) Five
(C) Seven
(D) Eight
(E) Sixteen
If S and T are sets, then the intersection of S and T is the set of all elements that are in both S
and T and is denoted by S ∩ T.

The union of S and T is the set of all elements that are in S or T, or both, and is denoted by S ∪ T.

If sets S and T have no elements in common, they are called disjoint or mutually exclusive.

Venn diagram

U = universal set, containing the sets A and B
N = number of elements of U outside both A and B

$|A \cup B| = |A| + |B| - |A \cap B|$

$|U| = |A \cup B| + N$
Each of 25 people is enrolled in history, mathematics, or both. If 20
are enrolled in history and 18 are enrolled in mathematics, how
many are enrolled in both history and mathematics?
There are 87 balls in a jar. Each ball is painted with at least one of two colors, red or green. It is
observed that 2/7 of the balls that have red color also have green color, while 3/7 of the balls
that have green color also have red color. What fraction of the balls in the jar have both red and
green colors?
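(A) 6/14
(B) 2/7
(C) 6/35
(D) 6/29
(E) 6/42

Let R = number of balls with red, G = number of balls with green, and B = number of balls with both colors.

$B = \frac{2}{7}R = \frac{3}{7}G \;\Rightarrow\; R = \frac{7}{2}B, \quad G = \frac{7}{3}B$

$|R \cup G| = R + G - B = \frac{7}{2}B + \frac{7}{3}B - B = \frac{29}{6}B \;\Rightarrow\; 87 = \frac{29}{6}B$

$\frac{B}{87} = \frac{6}{29}$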
In a certain production lot, 40 percent of the toys are red and the remaining
toys are green. Half of the toys are small and half are large. If 10 percent of
the toys are red and small, and 40 toys are green and large, how many of the
toys are red and large?

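Red and large toys are 40% − 10% = 30% of the total, and large toys are 50%, so green and large toys are 50% − 30% = 20%.

40 green and large = 20% ⇒ total = 100% = 200

⇒ red and large = 30% of 200 = 60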


Someone wants to go by plane to New York, Washington, Los Angeles, San Francisco, or Chicago.

There are 5 possibilities.

Someone wants to go by bus, train, or plane to New York, Washington, Los Angeles, San Francisco, or Chicago.

New York: bus, train, plane
Washington: bus, train, plane
Los Angeles: bus, train, plane
San Francisco: bus, train, plane
Chicago: bus, train, plane

There are 15 possibilities.
Multiplication principle

If an operation consists of 𝒌 steps, of which the first can be done in 𝒏𝟏 ways, for each of these the
second step can be done in 𝒏𝟐 ways, for each of the first two the third step can be done in 𝒏𝟑 ways,
and so forth, then the whole operation can be done in 𝒏𝟏 𝒏𝟐 … 𝒏𝒌 ways.

A quality control inspector wishes to select a part for inspection from each of four different bins containing
4, 3, 5, and 4 parts, respectively. In how many different ways can she choose the four parts?

The total number of ways is 𝟒 × 𝟑 × 𝟓 × 𝟒 = 𝟐𝟒𝟎.

In how many different ways can one answer all the questions of a true–false test
consisting of 20 questions?

Altogether there are $2 \times 2 \times 2 \times 2 \times \cdots \times 2 \times 2 = 2^{20} = 1{,}048{,}576$ ways.
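
The multiplication principle can be confirmed by brute-force enumeration with `itertools.product`. This is a sketch: the parts in each bin are represented by index numbers, which is an assumption made only so that there is something to enumerate.

```python
from itertools import product
from math import prod

bins = [4, 3, 5, 4]  # number of parts in each of the four bins

# Enumerate every way of picking one part (by index) from each bin.
choices = list(product(*(range(n) for n in bins)))
print(len(choices), prod(bins))

# A true-false test with 20 questions has 2 choices per question.
print(2 ** 20)
```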


How many 3-digit numbers can be formed with an even hundreds digit, an
odd tens digit, and a units digit different from both?
permutations

A permutation is a distinct arrangement of n different elements of a set.

How many permutations are there of the letters a, b, and c?


The possible arrangements are: abc, acb, bac, bca, cab, cba

Using the multiplication principle we have: 3 × 2 × 1 = 6

Suppose n objects are to be ordered from 1st to nth, and we want to count the number of ways the
objects can be ordered.
$n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1 := n!$ (called n factorial)

In how many different ways can the five starting players of a basketball team be introduced to the public?
There are 5! = 5 ・ 4 ・ 3 ・ 2 ・ 1 = 120 ways in which they can be introduced.
How many different three-digit positive integers can be formed using the digits 1, 2, 3, 4, 5, 6, 7 if none
of the digits can occur more than once in the integer?

$7 \times 6 \times 5 = \frac{7!}{(7-3)!} = 210$

Suppose that k objects will be selected from a set of n objects, where k ≤ n, and the k objects
will be placed in order from 1st to kth.

$n \times (n-1) \times (n-2) \times \cdots \times (n-k+1) = \frac{n!}{(n-k)!}$

The number of ways to select and order k objects out of n objects is denoted by

$P(n, k) = \frac{n!}{(n-k)!}$
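
A brief sketch comparing the formula for P(n, k) with direct enumeration, using `math.perm` and `itertools.permutations` from the standard library (Python 3.8 or later for `math.perm`).

```python
from itertools import permutations
from math import factorial, perm

digits = [1, 2, 3, 4, 5, 6, 7]

# Every ordered selection of 3 distinct digits is one three-digit integer.
arrangements = list(permutations(digits, 3))

print(len(arrangements))                 # count by enumeration
print(perm(7, 3))                        # P(7, 3) from the standard library
print(factorial(7) // factorial(7 - 3))  # n! / (n - k)!
print(factorial(5))                      # ways to introduce five players
```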
How many different permutations are there of the letters in the word “book”?

Let us distinguish between the two o's by labeling them $o_1$ and $o_2$ → $\{b, o_1, o_2, k\}$

There are 4! = 24 different permutations.

By dropping the subscripts, $b\,o_1\,k\,o_2 = b\,o_2\,k\,o_1 = boko$.

→ Without subscripts, the two permutations in each such pair are the same.
→ The total number of arrangements is $\frac{4!}{2} = 12$.

The number of permutations of n objects of which $n_1$ are of one kind, $n_2$ are of a second kind, . . . ,
$n_k$ are of a kth kind, and $n_1 + n_2 + \cdots + n_k = n$ is

$\frac{n!}{n_1! \, n_2! \cdots n_k!}$
How many different anagrams (meaningful or nonsense) are possible for the
word MASSASAVGA?
We have one M, four A’s, three S’s, one V and one G, thus the number of different anagrams is
$\frac{10!}{1! \times 1! \times 1! \times 3! \times 4!} = 25{,}200$

In how many ways can two paintings by Monet, three paintings by Renoir, and two paintings
by Degas be hung side by side on a museum wall if we do not distinguish between the
paintings by the same artists?
$\frac{7!}{2! \, 3! \, 2!} = 210$
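
The formula for permutations with repeated objects can be sketched in a few lines. The helper name `distinct_arrangements` is hypothetical, and the string "MMRRRDD" is an assumed encoding of the seven paintings by artist initial.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_arrangements(items):
    """n! divided by the factorial of each repeated item's count."""
    total = factorial(len(items))
    for count in Counter(items).values():
        total //= factorial(count)
    return total


print(distinct_arrangements("book"))
print(len(set(permutations("book"))))       # brute-force cross-check
print(distinct_arrangements("MASSASAVGA"))
print(distinct_arrangements("MMRRRDD"))     # 2 Monet, 3 Renoir, 2 Degas
```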
Combinations

A combination is a selection of r objects taken from n distinct objects without regard to the order of
selection.

In how many different ways can a person gathering data for a market research organization select three
of the 20 households living in a certain apartment complex?

If we care about the order in which the households are selected, the answer is

$P(20, 3) = 20 \times 19 \times 18 = 6{,}840$

But each set of three households would then be counted 3! = 6 times.

If we do not care about the order in which the households are selected, there are only
$\frac{P(20, 3)}{3!} = \frac{6{,}840}{6} = 1{,}140$
ways in which the person gathering the data can do his or her job.

The number of combinations of n distinct objects taken r at a time, denoted by $C(n, r)$ or $\binom{n}{r}$, is

$\binom{n}{r} = \frac{P(n, r)}{r!} = \frac{n!}{r! \, (n - r)!}$
In how many different ways can six tosses of a coin yield two heads
and four tails?

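$\binom{6}{2} = \frac{6!}{2! \, (6 - 2)!} = \frac{6!}{2! \, 4!} = 15$

$\binom{n}{0} = 1$

$\binom{n}{1} = n$

$\binom{n}{n} = 1$

$\binom{n}{r} = \binom{n}{n - r}$

$\binom{n}{2} = \frac{n(n - 1)}{2}$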
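
A short sketch that checks the combination results and identities above with `math.comb` and `itertools.combinations`. The households are represented by the numbers 0 to 19, which is an assumption made only for enumeration.

```python
from itertools import combinations
from math import comb

# Choosing 3 of 20 households, by enumeration and by formula.
selections = list(combinations(range(20), 3))
print(len(selections), comb(20, 3))

# Two heads among six coin tosses.
print(comb(6, 2))

# The identities hold for every n and r tried here.
identities_hold = all(
    comb(n, 0) == 1
    and comb(n, 1) == n
    and comb(n, n) == 1
    and comb(n, 2) == n * (n - 1) // 2
    and all(comb(n, r) == comb(n, n - r) for r in range(n + 1))
    for n in range(1, 15)
)
print(identities_hold)
```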
“or” and “and”

How many different committees of two chemists and one
physicist can be formed from the four chemists and three
physicists on the faculty of a small college?

$\binom{4}{2} \binom{3}{1} = 6 \times 3 = 18$
In how many different ways can the letters of the word 'LEADING' be arranged in such a way that the vowels
always come together?

Out of 7 consonants and 4 vowels, how many words of 3 consonants and 2 vowels can be formed?
In a group of 6 boys and 4 girls, four children are to be selected. In how many different ways can
they be selected such that at least one boy should be there?

How many 3-digit numbers can be formed from the digits 2, 3, 5, 6, 7 and 9,
which are divisible by 5 and none of the digits is repeated?
probability
How likely something is to happen.
Probability of an event happening = $\frac{\text{number of ways the event can happen}}{\text{number of total outcomes}}$

$P(A) = \frac{\#(A)}{\#(S)}$

- When a coin is tossed, there are two possible outcomes: heads or tails.
$P(H) = \frac{1}{2}$ and $P(T) = \frac{1}{2}$
- When a single die is thrown, there are six possible outcomes: 1, 2, 3, 4, 5, 6.
$P(1) = \frac{1}{6}, \; P(2) = \frac{1}{6}, \; \ldots, \; P(6) = \frac{1}{6}$
$P(\text{even}) = P(\{2, 4, 6\}) = \frac{3}{6}$
- There are 5 marbles in a bag: 4 are blue, and 1 is red. What is
the probability that a blue marble gets picked? $P(\text{blue}) = \frac{4}{5}$

$0 \le \text{Probability of an event} \le 1$
Sarah cannot completely remember her four-digit ATM pin
number. She does remember the first two digits, and she
knows that each of the last two digits is greater than 5. The
ATM will allow her three tries before it blocks further
access. If she randomly guesses the last two digits, what is
the probability that she will get access to her account?
(A) 1/2 (B) 1/4 (C) 3/16 (D) 3/18 (E) 1/32
a and b are integers with 1 < a < 3 and −1 < b < 3.
What is the probability that 3a − 4b is less than 1?
Complementary event

There are 5 marbles in a bag: 4 are blue, and 1 is red.
$P(\text{red}) = \frac{1}{5}, \quad P(\text{blue}) = \frac{4}{5}$

The complement of selecting a blue marble is selecting a marble that is not blue.

The complement of event A occurs = event A does not occur

$P(\text{red}) + P(\text{blue}) = \frac{1}{5} + \frac{4}{5} = 1$

$P(\text{event happens}) + P(\text{event doesn't happen}) = 1$

If in a class the probability of picking a man is 0.35, what is the probability of picking a woman?

$P(\text{woman}) = 1 - P(\text{man}) = 1 - 0.35 = 0.65$


Of the 700 members of a certain organization, 120 are
lawyers. Two members of the organization will be selected at
random. Which of the following is closest to the probability
that neither of the members selected will be a lawyer?
A) 0.5
B) 0.6
C) 0.7
D) 0.8
E) 0.9

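$\frac{\binom{580}{2}}{\binom{700}{2}} = \frac{580 \times 579}{700 \times 699} \approx \frac{600 \times 600}{700 \times 700} = \frac{36}{49} \approx \frac{36}{50} = \frac{72}{100} \approx 0.7$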
There are 27 students in Mr. White’s homeroom. What is the
probability that at least 3 of them have their birthdays in the same
month?
(A) $\frac{3}{27}$
(B) $\frac{3}{12}$
(C) $\frac{1}{2}$
(D) 1
If one number is randomly selected from each of the following sets, what is the probability
that the product of three numbers is even?
{2,3}, {5,8,12,18,20}, {4, 6, 9, 10}

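$P(\text{product even}) = 1 - P(\text{product odd})$

$P(\text{product odd}) = \frac{\#\,\text{product odd}}{\#\,\text{possible outcomes}}$

# possible outcomes = 2 ways × 5 ways × 4 ways = 40

The product is odd only when all three selected numbers are odd, which happens in exactly one way (3, 5, 9), so

$P(\text{product odd}) = \frac{1}{40}$

$P(\text{product even}) = 1 - P(\text{product odd}) = 1 - \frac{1}{40} = \frac{39}{40}$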
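
The complement argument can be cross-checked by listing all 40 outcomes. This is a minimal sketch using `itertools.product` and exact fractions; the variable names are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

sets = [[2, 3], [5, 8, 12, 18, 20], [4, 6, 9, 10]]

# Every way of picking one number from each set.
outcomes = list(product(*sets))
odd = [o for o in outcomes if (o[0] * o[1] * o[2]) % 2 == 1]

p_odd = Fraction(len(odd), len(outcomes))
p_even = 1 - p_odd
print(len(outcomes), odd, p_odd, p_even)
```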
Mutually exclusive events: two events that cannot happen together

A ball is randomly selected from a box that contains red and green balls.
Event A: the ball is red
Event B: the ball is green
Event A and event B are mutually exclusive.

Select one number from {3, 6, 7, 11, 15, 17, 100}.
Event A: the selected number is odd
Event B: the selected number is a multiple of 5
15 is odd and a multiple of 5, so event A and event B are not mutually exclusive.

Throw a die.
Event A: the number is even
Event B: the number is odd
Event A and event B are mutually exclusive.
𝐸 and 𝐹 are said to be independent if the occurrence of either event
does not affect the occurrence of the other.

𝐸 and 𝐹 are independent → 𝑃 𝐸 ∩ 𝐹 = 𝑃 𝐸 × 𝑃(𝐹)

Note that if $P(E) \ne 0$ and $P(F) \ne 0$, then events E and F cannot be
both mutually exclusive and independent.
$\binom{4}{2} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^2 = \binom{4}{2} \left(\frac{1}{2}\right)^4 = \frac{6}{16}$
Two dice are thrown simultaneously. What is the probability of getting two
numbers whose product is even?
Probability of 2 events
Let E and F be two events. 𝐸 ∩ 𝐹 means both events occur together (𝑬 𝒂𝒏𝒅 𝑭).
𝐸 ∪ 𝐹 means event 𝐸 or event 𝐹 or both of them occur (𝑬 𝒐𝒓 𝑭).
Throwing a die:
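E := having an odd number = {1, 3, 5}
F := having a prime number = {2, 3, 5}

$E \cap F = \{1, 3, 5\} \cap \{2, 3, 5\} = \{3, 5\}$
$E \cup F = \{1, 3, 5\} \cup \{2, 3, 5\} = \{1, 2, 3, 5\}$

$P(E) = \frac{3}{6} = \frac{1}{2}, \quad P(F) = \frac{3}{6} = \frac{1}{2}, \quad P(E \cap F) = \frac{2}{6} = \frac{1}{3}, \quad P(E \cup F) = \frac{4}{6} = \frac{2}{3}$

$P(E \cup F) = P(E) + P(F) - P(E \cap F)$

E and F are mutually exclusive → $E \cap F = \emptyset$ → $P(E \cap F) = 0$

→ $P(E \cup F) = P(E) + P(F)$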
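
The die example and the addition rule can be reproduced with Python sets and exact fractions. This is a sketch; the helper `p` is hypothetical and assumes all six outcomes are equally likely.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space for one throw of a die
E = {1, 3, 5}           # odd number
F = {2, 3, 5}           # prime number


def p(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))


print(p(E & F), p(E | F))
print(p(E | F) == p(E) + p(F) - p(E & F))  # the addition rule
```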
There are n phone lines. The probability that any given line has a problem is 0.3. If
the probability that at least one of them does not have a problem is more than
0.99, what is the least possible value of n?
A) 2
B) 4
C) 6
D) 8
E) 10
Near a certain exit of I-17, the probabilities are 0.23 and 0.24,
respectively, that a truck stopped at a roadblock will have faulty
brakes or badly worn tires. Also, the probability is 0.38 that a truck
stopped at the roadblock will have faulty brakes and/or badly worn
tires. What is the probability that a truck stopped at this roadblock
will have faulty brakes as well as badly worn tires?

If 𝐵 is the event that a truck stopped at the roadblock will have faulty brakes and
𝑇 is the event that it will have badly worn tires, we have

𝑃 𝐵 = 0.23, 𝑃(𝑇) = 0.24, and 𝑃(𝐵 ∪ 𝑇) = 0.38;

Substituting into the formula for the probability of two events gives

$0.38 = 0.23 + 0.24 - P(B \cap T)$

→ $P(B \cap T) = 0.23 + 0.24 - 0.38 = 0.09$
Random Variables

A survey was taken to find the number of children in each of 25 families. A list of the values collected
in the survey follows.
1, 2, 0, 4, 1, 3, 3, 1, 2, 0, 4, 5, 2, 3, 2, 3, 2, 4, 1, 2, 3, 0, 2, 3, 1

# of children   Frequency   Relative frequency
0               3           12%
1               5           20%
2               7           28%
3               6           24%
4               3           12%
5               1           4%
Total           25          100%

Assume we have chosen these 25 families randomly from a wide range of families.
We call the number of children a variable X;
since the families are randomly chosen, we call X a random variable.

What is the probability that X = 3?
There are 25 families and 6 of them have 3 children. → $P(X = 3) = \frac{6}{25} = 24\%$

What is the probability that X > 3?
X > 3 → X = 4 or X = 5 → $P(X > 3) = P(X = 4) + P(X = 5) = \frac{3}{25} + \frac{1}{25} = \frac{4}{25}$

What is the probability that X < 4?
X < 4 and X ≥ 4 are mutually exclusive. → $P(X < 4) + P(X \ge 4) = 1$
$P(X \ge 4) = P(4) + P(5) = \frac{4}{25} = 16\%$ → $P(X < 4) = 1 - P(X \ge 4) = 1 - \frac{4}{25} = \frac{21}{25} = 84\%$
Let's make a histogram of the number of children in the 25 families.
Relative frequency of X = P(X)

[Figure: histogram "Probability Distribution of the Random Variable X"; x-axis: Value of X (number of children); y-axis: Probability (relative frequency)]

X       P(X)
0       0.12
1       0.20
2       0.28
3       0.24
4       0.12
5       0.04
Total   1
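The mean of the data is:

$M = \frac{0(3) + 1(5) + 2(7) + 3(6) + 4(3) + 5(1)}{25} = 0\left(\frac{3}{25}\right) + 1\left(\frac{5}{25}\right) + 2\left(\frac{7}{25}\right) + 3\left(\frac{6}{25}\right) + 4\left(\frac{3}{25}\right) + 5\left(\frac{1}{25}\right)$
$= 0P(0) + 1P(1) + 2P(2) + 3P(3) + 4P(4) + 5P(5) = 2.16$

The mean is sometimes called the expected value; it is the sum of the terms $X \cdot P(X)$.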
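
The expected value of X can be computed directly from the frequency table. This is a minimal sketch with exact fractions; the dictionary names `freq` and `dist` are assumptions for illustration.

```python
from fractions import Fraction

# Frequency of each number of children among the 25 surveyed families.
freq = {0: 3, 1: 5, 2: 7, 3: 6, 4: 3, 5: 1}
total = sum(freq.values())

# Probability distribution: P(X = x) is the relative frequency of x.
dist = {x: Fraction(f, total) for x, f in freq.items()}

expected = sum(x * p for x, p in dist.items())  # sum of X * P(X)
p_greater_than_3 = dist[4] + dist[5]

print(float(expected), p_greater_than_3, 1 - p_greater_than_3)
```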
