mcd2080 Tutorial Questions 2018 03
MCD2080 - Tutorial Questions and Computing Exercises
Information on all Tutorials:
Tutorials are divided into three parts: Part A, Part B and an Excel session.
Part A questions must be completed before attending your allocated tutorial class. This will be checked by your tutor at the tutorial.
Part B questions will be completed with the guidance of your tutor during the tutorial class.
Your tutor will award participation marks based on your performance in your tutorial session (across all three parts). Participation marks will be awarded out of 100%. The awarding of marks will be based on the following criteria.
1. Tutorial Part A questions are completed before attending the tutorial class. Answers should be written clearly with all required steps.
2. Tutorial Part B questions should be completed by the end of the tutorial class. Answers must be written clearly with all required steps.
3. Excel session lessons.
• The Excel session is essential for applying the relevant statistical methods learned in classes (both lectures and tutorials) to analyse data using Excel.
• To complete the Excel computing session effectively, you should watch the video clips in the weekly “Video Lessons” under the Weekly Tutorial Folder on Moodle. The computing session will consist of exercise activities to be completed using Excel (watch the video clips).
• NOTE: there are Excel Exercises in Weeks 2 and 4, with questions to be answered online on Moodle.
4. Students should participate actively (in discussion) in all the sessions for full marks.
5. Students must attend the tutorial class on time. Every 15 minutes of lateness will result in a 5% deduction of marks.
Tutorials 1 & 2:
The first two tutorials are about learning Excel and using it to carry out statistical calculations. Work done in these tutorials will be useful when completing the Excel Exercises (Weeks 2 & 4). The work for these tutorials is designed for self-paced learning. There is a series of videos that you are required to watch to enable you to successfully complete the weekly Tutorial Excel Lessons.
Tutorials 3 - 12:
The Tutorial participation and engagement activities will start from Week 3 onwards.
(a) a + b / 2          (b) (a + b) / 2
(c) (a + b) / 2 + c / 2          (d) ((a + b) / 2 + c) / 2
Note: It is useful to be able to read the kind of notation used in this question, because that is how
formulae appear in the formula bar in Excel. In Excel, the variables would be cell addresses. So for
example instead of a, b, and c, we might have A1, A2, and A3.
The answer to part (a) would be the number in cell B1. In order to obtain the answer to (b), you will need to supply brackets:
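The precedence rules can be checked directly; the snippet below evaluates all four expressions with illustrative values for a, b and c (these numbers are not from the question):

```python
# Operator precedence: division binds tighter than addition, exactly as in
# Excel's formula bar. Illustrative values for a, b and c.
a, b, c = 10, 6, 4

expr_a = a + b / 2              # (a)  b/2 is evaluated first
expr_b = (a + b) / 2            # (b)  brackets force the sum first
expr_c = (a + b) / 2 + c / 2    # (c)
expr_d = ((a + b) / 2 + c) / 2  # (d)

print(expr_a, expr_b, expr_c, expr_d)
```

In Excel the same expressions would use cell addresses, e.g. `=(A1+A2)/2` for part (b).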
2. Draw a number line, place the following numbers on the number line, and then answer the
questions below. –1.645, 1.645, –1.6, 1.6, –1.7, 1.7. [Remember that if x is to the left of y on the
number line, then we say x < y .]
3. In the following, n is the number of times an event occurs, so it might be positive or zero but not
negative. List the non-negative numbers n that satisfy each inequality:
4. Transform the following equations to make x the subject and hence find the value of x.
(a) 3x/11 − 2 = 4
(b) (x − 3)/2 = 1.645
(c) (x − 2.3)/1.5 = z, where z = 1.96
In (c), write x in terms of z and then make the substitution z = 1.96.
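As a check on the rearrangement, the snippet below evaluates x for each part, reading (a) as 3x/11 − 2 = 4 (the layout of the original equations is garbled, so treat that reading as an assumption):

```python
# Rearranging each equation to make x the subject, then evaluating.
# (a) 3x/11 - 2 = 4    ->  x = 11*(4 + 2)/3
x_a = 11 * (4 + 2) / 3

# (b) (x - 3)/2 = 1.645  ->  x = 2*1.645 + 3
x_b = 2 * 1.645 + 3

# (c) (x - 2.3)/1.5 = z  ->  x = 1.5*z + 2.3, then substitute z = 1.96
z = 1.96
x_c = 1.5 * z + 2.3

print(x_a, x_b, x_c)
```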
(a) ∑_{i=1}^{3} x_i      (b) ∑_{i=1}^{3} y_i      (c) ∑_{i=1}^{3} x_i·y_i
(d) ∑_{i=1}^{3} (x_i − 9)      (e) ∑_{i=1}^{3} (x_i − 9)²      (f) ( ∑_{i=1}^{3} x_i )²
(g) ∑_{i=1}^{3} x_i²      (h) ∑_{i=1}^{n} x_i² − 2·∑_{i=1}^{n} x_i
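Sigma notation translates directly into code. The sketch below evaluates several of the sums with made-up values for x₁…x₃ and y₁…y₃, since the question's data values are not reproduced here; note in particular that the square of the sum differs from the sum of the squares:

```python
# Sigma notation evaluated directly. The actual x and y values belong to the
# question sheet; these are illustrative stand-ins.
x = [10, 8, 9]   # x1, x2, x3 (assumed for illustration)
y = [2, 5, 3]    # y1, y2, y3 (assumed for illustration)

sum_x   = sum(x)                                 # sum of x_i
sum_xy  = sum(xi * yi for xi, yi in zip(x, y))   # sum of x_i * y_i
sum_dev = sum((xi - 9) ** 2 for xi in x)         # sum of (x_i - 9)^2
sum_sq  = sum(xi ** 2 for xi in x)               # sum of x_i^2
sq_sum  = sum(x) ** 2                            # (sum of x_i)^2 -- not the same!

print(sum_x, sum_xy, sum_dev, sum_sq, sq_sum)
```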
6. Suppose the average income in a certain town has varied over the years, as shown in the table:

Year    Average income
2000    $57,309
2005    $55,430
2010    $69,408
(i) What was the percentage change in average income from 2000 to 2005?
(ii) What was the percentage change in average income from 2000 to 2010?
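Percentage change is (new − old)/old × 100. A quick check of both parts using the table values:

```python
# Percentage change between two values, applied to the income table above.
incomes = {2000: 57309, 2005: 55430, 2010: 69408}

def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

change_00_05 = pct_change(incomes[2000], incomes[2005])  # negative: a fall
change_00_10 = pct_change(incomes[2000], incomes[2010])  # positive: a rise
print(round(change_00_05, 2), round(change_00_10, 2))
```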
7. Scientific notation. (This will help in reading Excel output, especially in regression.)
In many computer packages, including Excel, scientific notation has a format such as:
1.2098E-03
Write the following numbers in decimal notation accurate to three decimal places:
(d) 1.603 × 10-3
(e) 4.9862E-02
(f) 5.0907235E03
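Python's float() accepts the same E-notation that Excel displays, which makes it easy to check conversions like these:

```python
# Excel's 1.2098E-03 means 1.2098 x 10^-3. Python's float() parses the
# identical E-notation directly.
values = ["1.603E-03", "4.9862E-02", "5.0907235E03"]
for s in values:
    v = float(s)
    print(f"{s:>14} = {v:.3f}")   # decimal notation, three decimal places
```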
Tutorials 2 – 12:
General Introduction
This is a set of exercises to develop your skills with Excel. The exercises are divided into groups. If you
already have some skill with the package, you will not need to do all the exercises in the
introductory group.
Please watch video clips on Moodle in the “tutorial folder” under WEEK 01
This series of videos provides the basic Excel skills required for the completion of the Excel Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
Practice with “MyMathLab STUDY PLAN” for Mid-Tri Test
In addition, the instructions in the exercises are usually sufficient for you to ‘get by’. There is an external link for Extra Excel Tutorials under the “Textbook and Excel Notes” briefcase on Moodle. Note that there is an emphasis here on good practice in Excel: not just doing things, but doing them in the best way. Students who have a MacBook, and are willing to use it outside the MCD2080 Computing Lab sessions, can find instructions relating to Excel on the Moodle site.
It is assumed that you have some basic computer knowledge; that you:
• can boot up a computer and log in where necessary, as it is on campus
• understand ‘files’, ‘directories’, etc.
• can operate within the appropriate operating system, usually a version of Windows (though Excel is also available for Apple computers).
In these exercises, specific statistical techniques are developed which are closely related to the lectures,
tutorials, assignments, tests and exam questions.
Exercise 1: Background
Real Estate in Regional Australia
The nation’s regional property market has generally been weaker than urban areas over the past year,
which has opened up some great buying opportunities.
Your first task is to locate the data to work with. You will find the data and relevant tables in the file
PROPERTY.xls in the Tutorial Material folder within the Week 1 section on Moodle.
When reading the instructions in these tutorial problems, you may wish to refer to the following diagram
to identify the locations mentioned.
How to split:
1. While holding down the Windows key, press the Left or Right arrow key to snap the active window to the left or right of the screen respectively.
2. For example, open PROPERTY.xls. While holding down the Windows key, press the Right arrow key; this will snap the Excel window containing PROPERTY.xls to the right side of the screen. Do the same with any other open document, again holding the Windows key but pressing the Left arrow key, to place it on the left side of the screen.
Task 2A: Calculating the percentages of properties in each location across the varying number of bedrooms using the Excel FORMULAS approach
In this section of the tutorial, you will learn how to use formulas. The concepts of ‘relative’ and ‘absolute’ cell addresses are essential when using formulas. A great deal of the power of spreadsheet packages lies in the way in which they use formulas to carry out repetitive calculations.
In this task, we need to complete Table 1 in the worksheet Property Size in the PROPERTY.xls Excel
spreadsheet. We will start with location Rural first.
In cell I4 type: =B4/B9. Alternatively, you can type = and then click on cell B4, type / and then click on cell B9. (Note the equals sign: it tells Excel that what follows is a formula to be carried out.)
Right click on the cell and choose “Format Cells”. A dialogue box with a number of tabs
appears. The Number tab (shown here) offers more detailed options to change the way cell
data is displayed. Change the number of decimal places to 2.
Alternatively, click twice on the Decrease Decimal button on the Home tab on the Ribbon
(in the group labelled Number).
In practice we would enter the formula once only. Entering a formula more than once is a waste of
effort, and likely to lead to error. To get the formula into the other cells we can copy it. Drag and drop
cannot be used here because the destination range is not the same shape as the source range, but the
keyboard shortcuts can be used:
However, before we copy the formula, we need to make one of its cell references absolute.
Excel interprets a cell address in a formula as a relative cell address. For cell I4, it divides B4 by B9. Therefore, if B9 is not changed to an absolute reference, when the formula is copied to I5 it will divide B5 by B10.
To prevent this, B9 must be entered as an absolute cell address. Excel uses dollar signs to identify an absolute address. These can be typed, but the easiest way to enter them is to press the F4 function key: any cell addresses that are highlighted in the formula bar then acquire dollar signs.
Now select I4 and move the cursor over the bottom right-hand corner of the cell until it changes to a thin, solid cross. Press the mouse button and hold it down while you drag the cursor down the column. Release the button and the entries will appear. Get the sum of these percentages for the Rural location into I9 using the formula =SUM(I4:I8). (You will find more about the SUM function in Task 3.) If all is correct, this sum must be 1.00. Do you know why?
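The reason the proportions must total 1.00 can be seen by writing the calculation out: each cell is a count divided by the grand total, so the column sums to (sum of counts)/total = 1. A sketch with made-up bedroom counts (not the PROPERTY.xls figures):

```python
# Why the column of proportions must sum to 1.00: each cell is count/total,
# so their sum is (sum of counts)/total = total/total = 1.
# Illustrative counts for the Rural column (assumed, not the real data).
counts = [3, 11, 20, 9, 2]           # properties with 1..5 bedrooms
total = sum(counts)                  # plays the role of cell B9

proportions = [c / total for c in counts]   # the formula =B4/$B$9 copied down
print(proportions, sum(proportions))        # the sum matches the check in I9
```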
We need to change all the values into percentages. This can be done in two ways:
1. Highlight the numbers in column I (I4:I9). Right-click and select “Format Cells”. In the Number tab, choose “Percentage” and change the decimal places to 1.
2. Another way to format as a percentage is to click on the Percent button, also in the Number group on the Home tab. This will display the values with no decimal places.
‘Percent’ button
In this task, we will learn how to tabulate the number of bedrooms in the two locations, Rural and Urban. In other words, we need to create a cross tabulation, using what Excel calls a Pivot Table. To do this, open the worksheet Data in PROPERTY.xls. (Please watch the video for an illustration.)
• To create a Pivot Table, place the cursor anywhere in the data set.
• Click on Insert and choose Pivot Tables as shown below.
• The following window will appear. It highlights the entire data range by default.
• Click on Existing Worksheet and choose a blank cell anywhere on your worksheet. This will
allow you to place the pivot table on the same worksheet. Otherwise the pivot table will be placed
on a new worksheet.
• Click OK and a blank pivot table will appear with the Field List shown on the right of the screen.
• On the Field List, drag Bedrooms into the Row Labels window below, Location into the
Column Labels window and Location again into the Σ Values window. This is shown in the
screenshot below.
• Finally, always remember to produce your output in tabular form. If you are using Excel 2013 or
beyond, your output will automatically be produced in tabular form. If you are using an earlier
version, click on the Design tab, followed by Report Layout and choose Show in Tabular
Form. Refer to the screenshot below.
A cross tabulation, or contingency table, is a summary table for two categorical variables, as in the table obtained above. Cross tabulation allows us to examine observations that belong to specific categories on more than one variable. By examining these frequencies, we can identify relationships between the cross-tabulated variables.
Based on the table above, calculate the percentages of properties in each location across the varying
number of bedrooms.
Compare your Pivot Table with the final image of the Pivot Table with Count and Percentage Count in
the Solution of lab exercises: Week 1 section (at the end of week 1 Tutorial)
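The counting that the pivot table performs can be sketched in a few lines: tally each (bedrooms, location) pair, then divide by the column total to get the percentages. The records below are illustrative, not the PROPERTY.xls data:

```python
# A cross tabulation counts observations in each combination of two
# categorical variables -- what the pivot table does with Bedrooms x Location.
from collections import Counter

# Illustrative (bedrooms, location) records; the real data is in PROPERTY.xls.
records = [(2, "Rural"), (3, "Rural"), (3, "Urban"), (2, "Urban"),
           (3, "Urban"), (4, "Rural"), (2, "Rural"), (3, "Rural")]

crosstab = Counter(records)                      # (bedrooms, location) -> count
col_totals = Counter(loc for _, loc in records)  # total properties per location

# Percentage of properties in each location, by number of bedrooms
for (beds, loc), n in sorted(crosstab.items()):
    print(f"{beds} bedrooms, {loc}: {n} ({n / col_totals[loc]:.0%})")
```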
Task 3: Graphical Comparison of properties in each location across the varying number of
bedrooms. Continue working on task 3 in week 2
Having created the percentage distribution, the next task is to graph it as a bar chart. Excel calls
this a Column Chart.
This tab only appears on the Ribbon when the chart is selected. Point to the range H3:J8. (Note
that this includes the heading.) The dialogue box should now appear as below.
Since the No. of Bedrooms is already correctly represented on the Horizontal Axis, we need to
remove it from Legend Entries (Series). Do this by first highlighting ‘No. of Bedrooms’ and then
click on ‘Remove’, then OK. You should now have produced a column chart. (Please watch the
video for illustration)
Now switch to the Layout tab. In the Labels group, click on Chart Title, then Centred Overlay
Title. A text box will appear above the chart. Type in an appropriate title.
To remove the grid lines, highlight the lines and press delete.
Task 4: Table of Summary Statistics (Using Insert Function) Continue working on task 4 in
week 3.
For this task, we need to complete table 2 in the worksheet Selling Price in the file PROPERTY.xls.
The following statistics can be calculated for the variable RURAL using the formulas listed below. You can practise obtaining the same results using Insert Function, Statistical Functions, etc. Find the same list of descriptive statistics for the variable TOWN.
(QUARTILE.EXC computes quartiles by the exclusive method, which leaves the minimum and maximum out of the interpolation. The alternative, QUARTILE.INC, includes them and sometimes leads to odd values.)
Interquartile range, Standard deviation, Range and Coefficient of variation are all measures of
variability.
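As a cross-check on the Excel formulas, the four variability measures can be computed for a small made-up sample; Python's statistics.quantiles with its default "exclusive" method corresponds to QUARTILE.EXC:

```python
# Interquartile range, standard deviation, range and coefficient of variation
# for an illustrative sample (assumed values, not the RURAL selling prices).
import statistics

prices = [310, 285, 342, 298, 355, 321, 277, 333]

q = statistics.quantiles(prices, n=4, method="exclusive")  # like QUARTILE.EXC
iqr = q[2] - q[0]                             # Q3 - Q1
sd = statistics.stdev(prices)                 # sample standard deviation
rng = max(prices) - min(prices)               # range
cv = sd / statistics.mean(prices) * 100       # coefficient of variation, %

print(f"IQR={iqr}, SD={sd:.1f}, Range={rng}, CV={cv:.1f}%")
```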
Exercise 2
A great deal of the power of spreadsheet packages lies in the way in which they use formulas to carry
out repetitive calculations. Open the file Caring&Sharing.xlsx.
Task 1: Find expenditure for each day (check solution in the Excel table in Task 3)
Task 2: Find 2.5% brokerage on expenditure for each day (check solution in the Excel table in
Task 3)
Task 3: Find the total and average of quantity, price, expenditure and brokerage.
Excel has a large number of built-in functions. Some of these operate on individual cells or pairs of cells (for example, multiplication) and some operate on ranges. All of them can be used as parts of formulas. This exercise covers the SUM and AVERAGE functions.
In cell A8 type Sum, and in A9 type Mean. In C8, enter the total quantity of shares bought over the four days by clicking on the AutoSum button in the Editing group on the Home tab.
Calculating Percentages
Part A:
1. The owner of a large fleet of taxis is trying to estimate his costs for next year’s operations.
One of the major costs is fuel purchases. To estimate fuel purchases, the owner needs to
know the total distance his taxis will travel next year, the cost of a litre of petrol and the fuel
consumption (in km/litre) of his taxis. The owner has been provided with the first two
figures (distance estimate and cost). However, because of the high cost of petrol, the owner
has recently converted his taxis to operate on LPG. He measures the fuel consumption for
50 taxis and the results are stored.
Part B:
3. Classify each of the following variables as numerical or categorical, discrete or continuous,
ordinal or nominal.
a. your student ID number
b. eye colour (brown, blue, . . . )
c. whether a person drinks alcohol (yes, no)
d. length of cucumbers (in centimeters)
e. number of cars in a car park
f. salary (high, medium, low)
g. salary (in dollars and cents)
h. daily temperature in ◦C
5. One hundred and twenty-one university students ( n = 121) were asked to identify their
preferred leisure activity. The results are displayed in a bar chart, as shown below.
[Bar chart: Preferred Leisure Activity (Sport, TV, Music, Movies, Reading, Other), frequency axis 0 to 20]
6. Recall Lecture 1 (Exercise 2.17, p. 36 Berenson) – the data represent the electricity cost in dollars during the month of July for a random sample of 50 two-bedroom apartments in a New Zealand city.
We created a table with class intervals using the Pivot “Group” option.
Based on the information in the table, around what amount does the monthly electricity cost seem to be concentrated?
(Hint: focus on the modal class.)
3rd edition: Introduction and data collection (Chapter 1): p. 10: 1.4
Presenting data in tables and charts (Chapter 2): p. 24: 2.3, 2.4; p. 39: 2.20
4th edition: Introduction and data collection (Chapter 1): p. 10: 1.4
Presenting data in tables and charts (Chapter 2): p. 25: 2.3, 2.4; p. 40: 2.20
Please watch video clips on Moodle in the “tutorial folder” under WEEK 02
This series of videos provides the basic Excel skills required for the completion of the Excel Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
COMPLETE EXCEL HOMEWORK AND SUBMIT ONLINE (MOODLE) BY END OF WEEK 3.
Histogram
We recommend that you individually work through section E3 of the notes on the use of Excel (ExcelNotesv3.pdf) on Moodle.
Exercise 1
Complete Task 3 from Exercise 1 in Lab Week 1
Exercise 2
Task 1: Modifying a graph
Wets and Dries is a company which provides
economic advice to governments. They bill by the
hour. Over the last three months, the number of hours
of advice billed each day is recorded in the table
shown here.
Open a new workbook, enter the data and format it as
shown.
To wrap the text in the two headings, select them, and then click on the Wrap Text button in
the Alignment group on the Home tab. Use the AutoSum button to obtain the total.
We want to calculate the corresponding
percentage distribution, as shown here. To
calculate the percentages, in cell D2 enter the
formula =C2/$C$6*100. Note the dollar signs!
Drag this down to D5. Reduce the number of
decimal places to two by clicking on the
Decrease Decimal button three times.
Drag from C6 to D6 to get the sum of the percentages.
Having created the percentage distribution, the next task is to graph it as a bar chart. Excel calls this a Column Chart. (This revises what you learned in the previous exercise.) The resulting bar graph should look like this:
[Column chart of the percentage distribution for the categories 3, 4, 5 and 6]
With the chart selected, click on the Change Chart Type button on the Design tab. In the dialogue box that opens, click on Pie, then OK. This chart is now not very informative, because of the options you selected for the column chart. The Chart Layouts group on the Design tab gives you more sensible options for the layout of the pie chart. Layout 2 is shown above. Alternatively, you can use the Layout tab to make individual changes to the appearance of the chart, as described above for the column chart.
Note that the multicoloured pie chart does not print well in black and white. For this kind of printing it is better to select the greyscale version (on the Design tab, at the left of the Chart Styles group).
Finally, click on the Chart title and drag it to one of the corners. Save the file.
The trustee of a mining company’s accident compensation plan solicited the employees’
feelings toward a proposed revision in the plan. The responses are shown in the following table.
There are three employee types (Mine-Workers, Clerical staff and Managers denoted by W, C
and M) and two kinds of decisions, either for (F) or against (A). Data are available in file
Compensation.xls.
When sorting a number of columns of data by one variable, you must highlight the whole block
of data (all required rows and columns) first. If you just highlight the column for the variable
you are sorting by, then the values of this variable will be separated from the cases to which
they belong. Click on Sort & Filter in the Editing group on the Home tab, and select Custom
Sort. The dialogue shown overleaf appears. Make sure ‘My data has headers’ is checked. In
the Sort by box, select ‘Job classification’. In the Sort On box, select ‘Values’. In the Order
box, select ‘A to Z’. Click OK to sort the data.
Use the Excel function Countif to obtain the number of each kind of Decision
corresponding to each employee type. In other words fill in the frequencies in Table 2 in
the file, reproduced here:
      W    C    M
F
A
For example, the number in the shaded cell of Table 2 (cell G3 in the worksheet) should be the number of clerical staff in favour of the scheme.
When you have sorted the data, you should find that the list of Decisions by clerical staff lies in cells C3:C14, and so on.
Using the data in the “COUNTIFS” worksheet, without sorting the data, obtain the counts in the cells AND RECORD THEM IN TABLE 2:
F3 by typing: =COUNTIFS(B3:B32,"W",C3:C32,"F");
F4 by typing: =COUNTIFS(B3:B32,"W",C3:C32,"A");
Do the same for the other cells (G3, G4, then H3 and H4).
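The logic behind =COUNTIFS(B3:B32,"W",C3:C32,"F") is a row-wise AND of the two criteria. A sketch with made-up job/decision columns rather than the Compensation.xls data:

```python
# What COUNTIFS does: count the rows where BOTH criteria hold at once.
# Illustrative job/decision pairs, not the real file.
jobs      = ["W", "C", "M", "W", "C", "W", "M", "C", "W", "C"]
decisions = ["F", "F", "A", "A", "F", "F", "A", "A", "F", "F"]

def countifs(col1, crit1, col2, crit2):
    """Row-wise AND of two criteria, like Excel's COUNTIFS."""
    return sum(1 for a, b in zip(col1, col2) if a == crit1 and b == crit2)

w_for = countifs(jobs, "W", decisions, "F")        # mine workers in favour
w_against = countifs(jobs, "W", decisions, "A")    # mine workers against
print(w_for, w_against)
```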
Now create a grouped bar chart (called “clustered column chart” in Excel) showing the
number of employees in each category who are for and against the decision as follows.
(ii) On the Insert tab, select Column/2-D Column (the first chart sub-type). You can
choose between series in rows or in columns by clicking the Switch Row/Column
button in the Data group on the Design tab (A and F series, or W, C, and M series).
Decide which one is appropriate. Also try the “Stacked column” and “100% stacked
column” (the next two sub-types) by clicking the Change Chart Type button and
consider what aspects of the data are thereby made clear.
You will work with data from the file CarbEm.xlsx, provided in the subject weekly computing
exercises in electronic resources. This file is extracted from World Bank data for the year 2007
and provides, for a sample of 122 countries,
Task 1: Construct a frequency histogram (not using the Pivot Chart approach) for the variable CO2E, choosing an interval width that you consider appropriate.
SOLUTION STEPS
a. Find the maximum and minimum values of the variable CO2E using the functions
MAX and MIN (see Excel Notes E4, particularly E4.2.)
b. With 122 data points, it is reasonable to have about 8 to 10 class intervals, so choose a convenient starting point and class-interval width to cover all the data from the minimum to the maximum value calculated in (a).
(For example, you may decide that the first class interval starts at 0 and use 9 intervals of width 4, or you may decide on shorter intervals to show more detail.)
c. Construct a frequency table, filling in the lower and upper limits. The upper limit of
one interval is equal to the lower limit of the next. According to the Excel convention,
a class interval includes its upper limit but not its lower limit. Thus the first class
interval shown below includes values greater than 0 and less than or equal to 4.
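This Excel bin convention (a value v falls in the first bin whose upper limit is at least v) can be reproduced with a binary search over the upper limits; the CO2E values below are made up for illustration:

```python
# Excel's Histogram tool puts a value v in the first bin whose upper limit
# satisfies v <= upper: intervals are open at the lower limit, closed at the
# upper. bisect_left on the sorted upper limits reproduces that rule.
from bisect import bisect_left

upper_limits = [4, 8, 12, 16, 20, 24, 28, 32, 36]   # as in the frequency table
data = [0.3, 4.0, 4.1, 15.9, 16.0, 35.2]            # illustrative CO2E values

freq = [0] * len(upper_limits)
for v in data:
    freq[bisect_left(upper_limits, v)] += 1   # first index with upper >= v

print(freq)   # note 4.0 lands in (0, 4] while 4.1 lands in (4, 8]
```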
d.
   Lower limit   Upper limit   Frequency
   0             4
   etc.
e. In order to obtain the frequencies to complete this table, use the histogram tool:
Choose the Data tab. In the right hand group, click on the Data analysis button
• select as the input range the CO2E column, including the heading; select as the Bin
range the Upper Limit column of the table you are creating, including the heading.
• Check the labels box (because you have included headings in your range selections).
• Click on the white box next to Output Range and point to the cell where you want the
output to start.
• Check Chart Output.
• You should see something like this.
f. You now need to make this so-called histogram look like a proper histogram:
• The legend is not needed – remove it (click to highlight it, and press Delete).
• Provide a more informative heading (click to highlight the current heading and type a replacement). Do the same with the axis titles.
• Right-click on one of the histogram bars, select Format Data Series, and slide Gap Width to No Gap.
[Initial histogram-tool output: column chart with frequency axis 0 to 50]
(Note that the horizontal axis labels should be at the upper limit of each class but actually appear in the middle of each class.)
A quick and easy way to fix this is to replace the upper limits with the interval midpoints:
1. Create an extra column with the interval midpoints (2, 6, 10, ….)
2. Left-click on the horizontal axis with values 4, 8, 12, … (this selects the axis).
3. Right-click and choose the Select Data option (or, on the Design tab, click Select Data).
4. Under Horizontal (Category) Axis Labels, click Edit.
5. In the Axis label range, select the cells with the interval midpoints (2, 6, 10, ….), then click OK and OK again. After this step the original values 4, 8, 12, … will change to 2, 6, 10, ….
6. Name your horizontal axis accordingly (Interval midpoint of …) and provide the units of measurement. The resulting image of your histogram:
[Final histogram: title “Histogram of ........”, frequency axis 0 to 60, horizontal axis “Interval Midpoint of CO2E, metric tons per capita” with midpoints 2, 6, 10, 14, 18, 22, 26, 30, 34, 38]
5. Drag and drop CO2E into the Σ Values window and the frequencies will appear.
6. Report the table in tabular form by clicking Design and then selecting Show in Tabular Form.
THEN
7. Select any cell in the Pivot Table. In PivotTable Tools, select Pivot Chart (on the Options tab), select Column and click OK.
8. Close the gaps between the columns as follows: in Design (in the drop-down menu) select Layout 8. The resulting working image of your histogram:
[Pivot-chart histogram: frequency axis 0 to 30, class intervals 0-4, 4-8, 8-12, 12-16, 16-20, 20-24, 24-28, 28-32, 32-36, horizontal axis “CO2E emissions”]
Note: VLOOKUP assumes that the first column of the table is sorted in ascending order.
There are two tables provided:
Table 1 is the bonus band table.
Table 2 lists the name of each salesperson and the percentage by which they exceeded their target sales.
YOUR TASK:
Use the VLOOKUP function to assign the number of bonus points to each salesperson.
The VLOOKUP function is used in column C to look in the bonus band table and
automatically assign bonus points.
Once you have completed this task, move to column D and convert your answers in C to a
percentage of the total bonus points. Now in column E, calculate the bonus amount.
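VLOOKUP's approximate match (the default) returns the row with the largest first-column value that does not exceed the lookup value, which is why the bonus-band table must be sorted ascending. A sketch with assumed band thresholds and points (the real bands are in Table 1 of the file):

```python
# VLOOKUP with approximate match finds the largest first-column value <= the
# lookup value. bisect_right - 1 gives the same behaviour on a sorted list.
from bisect import bisect_right

thresholds = [0, 10, 20, 30]    # % above target: lower bound of each band (assumed)
bonus_points = [1, 2, 4, 8]     # points awarded for that band (assumed)

def lookup_bonus(pct_above_target):
    """Largest threshold <= value, as VLOOKUP's approximate match does."""
    return bonus_points[bisect_right(thresholds, pct_above_target) - 1]

for pct in (5, 10, 27, 41):
    print(pct, "->", lookup_bonus(pct))
```

An unsorted first column would make this search meaningless, which is exactly the note above about VLOOKUP requiring ascending order.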
The responses, as tallied in Table 2, were:

             Mine workers   Clerical staff   Managers
In favour         4               7              4
Against           8               5              2

[Clustered column chart: number in favour and against for each employee type]
Note that equal numbers are in favour and against overall, but the breakdown differs between the employee types.
[100% stacked column chart: percentage in favour and against for each employee type]
Thus we see that a high proportion of managers and a low proportion of mine workers are in favour of the scheme. However, from this chart, we cannot see the relative number in each worker category.
Part A:
1. Berenson p. 90: 3.30 modified – the data set below is from a sample of n = 7:
12, 7, 4, 9, 0, 7, 3
2. The side by side boxplots below shows the distribution of age at marriage of 45 married men
and 38 married women.
(a) Compare the two distributions in terms of:
i. measures of central location,
ii. measures of variability, and
iii. shape (note that it is not possible to comment on modality; do you know
why?)
(b) Comment on how the age at marriage of men compares to women for the data.
3. [Question 3 figures: two frequency histograms of Annual Rainfall (mm), with frequency axis 0 to 20 and class limits 200 to 1100 mm, together with side-by-side boxplots of the Recent and Historical periods on the same 200 to 1100 mm scale.]
4. The remuneration packages for the CEOs of 12 international companies are (in $US 000’s) as follows:
2512 3424 3800 4152
2636 3640 3870 4480
3424 3690 4078 9020
The following table of summary measures was obtained using Excel:

Remuneration of CEOs
Mean                       4060.5
Median                     3745
Mode                       3424
Standard deviation         1663.4
Coefficient of variation
Lower quartile             3424
Upper quartile             4133.5
Interquartile range
Range
Minimum                    2512
Maximum                    9020
Sum                        48726
Count                      12
(a) Complete the table by supplying the coefficient of variation, range and interquartile
range.
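Part (a) only needs arithmetic on values already in the table; as a check:

```python
# Coefficient of variation, range and IQR from the summary values Excel reports.
mean, sd = 4060.5, 1663.4
q1, q3 = 3424, 4133.5
lo, hi = 2512, 9020

cv = sd / mean * 100   # coefficient of variation, as a percentage
rng = hi - lo          # range = maximum - minimum
iqr = q3 - q1          # interquartile range = upper quartile - lower quartile

print(f"CV = {cv:.1f}%, Range = {rng}, IQR = {iqr}")
```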
(b) For each of the following, comment on its suitability as a measure of a “typical
value” from this dataset:
(i) mean
(ii) median
(iii) mode
(c) In breaking news, it has just been announced that the highest paid of these CEOs has negotiated a new remuneration package and will now receive $25 million.
For this revised data set, calculate the revised value of each of the following
summary measures, and briefly comment on whether and how the value has
changed from the corresponding value given in the table above.
(i) Mode
(ii) Median
(iii) Mean
(iv) Range
(v) Interquartile range
(vi) Given that the new standard deviation is 6201.34, calculate the new coefficient
of variation
(vii) Comment briefly on which measures of central location have changed
significantly.
(viii) Comment briefly on which measures of spread have changed significantly.
[Percentage histogram: class boundaries at 250, 500, …, 4000; vertical axis 0.0% to 10.0%]

                           No alcohol   Consume alcohol
Mean                         456.9          708.4
Median                       353            638.5
Modal class                  $0-$250        $500-$750
Standard deviation           403.0          461.3
Coefficient of variation     88.2%          65.1%
Minimum                      12             12
Maximum                      3846           3696
Lower quartile               173.75         356.75
Upper quartile               632.25         936
Interquartile range          458.5          579.25
Count                        234            766
Please watch video clips on Moodle in the “tutorial folder” under WEEK 03
This series of videos provides the basic Excel skills required for the completion of the Excel Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
COMPLETE EXCEL HOMEWORK AND SUBMIT ONLINE (MOODLE) BY END OF WEEK 4.
[hint: you can use Split Screen i.e. the learning outcome from Computing Lab Week
1, Exercise 1, Task1]
Exercise 2
Task 1: Summary statistics using Data Analysis tool.
In this exercise, the data gives the dividend yield on shareholders’ funds for Australia’s top
150 companies for the year 2005.
Note that Dividend yield is defined as the amount of a company’s annual dividend expressed
as a percentage of the current price of the share of that company. The data is in the file
Dividend.xlsx. Column A stores the dividend yield for the top 1 – 50 companies (Group A)
ranked by market capitalisation, Column B stores the dividend yield for the companies 51 –
100 (Group B), and Column C stores the dividend yield for companies 101 – 150 (Group C).
Use the Data Analysis button on the Data tab, and select Descriptive Statistics. If
Data Analysis is not available, see Excel Notes section E5.
Copy and paste all relevant values from the Excel output. The remaining values, such as the range, coefficient of variation, quartiles and IQR, may be obtained using the individual functions accessed through the Insert Function button discussed in Exercise 1.
Note that since the data concern all of the top 150 companies, rather than a sample, the STDEV.P function should be used for the standard deviation (STDEV.P calculates the population standard deviation).
(iii) shape
Note: since we have all of the top 50 companies, the next 50, and so on, these are to be regarded as populations, not samples; therefore, use STDEV.P.
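The distinction matters because STDEV.P divides by n while STDEV.S divides by n − 1. Python's statistics module mirrors the pair, shown here on made-up dividend yields (not the Dividend.xlsx data):

```python
# Population vs sample standard deviation: STDEV.P divides by n (the data is
# the whole population), STDEV.S divides by n - 1 (the data is a sample).
import statistics

yields = [4.2, 3.8, 5.1, 2.9, 4.6]    # illustrative dividend yields

pop_sd = statistics.pstdev(yields)    # like STDEV.P -- use for the top-150 data
samp_sd = statistics.stdev(yields)    # like STDEV.S -- use for a sample

print(round(pop_sd, 4), round(samp_sd, 4))
```

The n − 1 denominator always makes the sample version slightly larger; the difference shrinks as n grows.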
Exercise 3:
Task1:
A B
Minimum 5544 6701
Q1 6708.25 7578.5
Median 7316.5 8140.5
Q3 8085 9027.25
Maximum 8731 9744
Task 2:
Note: You are expected to name your horizontal axis as CFL Bulb Life and provide the units
as (hours).
Please watch video clips on Moodle in the “tutorial folder” under WEEK 04
This series of videos provides the basic Excel skills required for the completion of the Excel Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
Practice with “MyMathLab STUDY PLAN” for Mid-Tri Test
Part A:
Question 1.
Refer to the spreadsheet Elecmart.xlsx. Recall that in the Week 3 lecture you produced and interpreted a pivot table that showed the breakdown of Time across the Regions, by putting Time in the row and Region in the column, and by changing the position of variables in the pivot table.
You will be working with variables Gender and Region now. To demonstrate the differences
you get when you change the position of a variable in a pivot table, you need to use the
following pivot tables and interpret the values below.
Provide an interpretation for the following cells so that you understand the impact of
changing the position of variables in a pivot table.
Question 2.
Among households in which at least one child is attending a private school, it is found that the
total number of tablets (iPad, Kindle, etc.) owned by members of the household has the
following probability distribution:
Number of tablets X    0      1      2      3      4      5      6 or more
Probability P(X)       0.33   0.25   0.22   0.12   0.07   0.01   0.00
(a) Write down the formulae for the mean and variance of a discrete probability
distribution.
(b) Use the table below to calculate the mean µ and standard deviation σ of the number
of tablets.
X    P(X)    X × P(X)    P(X) × (X − µ)²
0
1
2
3
4
5
Total
(c) Also, find the median and mode of the number of tablets.
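A minimal Python sketch of the calculations in parts (b) and (c) (illustrative only; the hand working in the table above is still required):

```python
# Tablet-count distribution from the question (6+ has probability 0.00, so it is omitted).
x = [0, 1, 2, 3, 4, 5]
p = [0.33, 0.25, 0.22, 0.12, 0.07, 0.01]

mean = sum(xi * pi for xi, pi in zip(x, p))               # mu = sum of x * P(X)
var = sum(pi * (xi - mean) ** 2 for xi, pi in zip(x, p))  # sigma^2 = sum of P(X)(x - mu)^2
sd = var ** 0.5

mode = x[p.index(max(p))]  # the value with the largest probability

# Median: smallest x whose cumulative probability reaches 0.5.
cum = 0.0
for xi, pi in zip(x, p):
    cum += pi
    if cum >= 0.5:
        median = xi
        break

print(mean, var, sd, median, mode)
```

Running this gives µ = 1.38 and σ² = 1.6756, so σ ≈ 1.29; the mode is 0 and the median is 1.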
HyTex Company is a direct marketer of electronic equipment and wants to investigate the
efficacy (Is HyTex sending the catalogues to the right customers? If not, to whom should
HyTex send the catalogues?) of catalogue mailings to its 1,000 mail order customers.
Catalogue Marketing.xlsx contains customer demographic attributes including the Marital
Status of the customer and the Region they live in. The following pivot tables have been created:
a) Interpret the following values to understand the differences between the pivot tables.
(i) From Table 1 the values 12.40% and 25.30%
(ii) From Table 2 the value 47.13%
(iii) From Table 3 the value 24.50%
b) Is the percentage of married customers who live in the Midwest region smaller than
the percentage of customers who are not married that live in the South region?
x             0        1        2        3        4        5        6
Probability   0.1176   0.3025   0.3241   0.1852   0.0595   0.0102   0.0007
(b) (i) What is the most likely number of old model widgets in a box? Explain your
answer.
(ii) What is the expected number of old model widgets in a box? Explain your
answer.
(iii) Must the expected number be quoted as a whole number? Explain your answer.
5. In the following scenario, state whether X is a binomial random variable. Explain your
answer.
Thirty percent of households buy the leading brand of dishwasher detergent. A random sample
of 25 households is surveyed to determine the brand of dishwasher detergent they buy. Let X
be the number of households in the sample that buy the leading brand.
6. From experience, a teacher has determined that the number of times a student has failed
to attend class, X, has the following probability distribution:
X             0      1      2      3
Probability   0.75   0.16   0.06   0.03
(a) Suppose that the teacher has 20 students currently in her class. If we let Y be the
number of students who attend the class,
(i) What kind of probability distribution does Y have? State the values of the
parameters.
(ii) What is the probability that there are no more than 15 but at least 8 students who
will attend the class?
(b) What is the expected number of students who will attend the class?
(c) What is the standard deviation of number of students who will attend the class?
1. From experience, a retailer has determined that the number of broken light bulbs, X ,
in a box containing 10 dozen Super brand light bulbs has the following probability
distribution:
X             0      1      2      3
Probability   0.80   0.10   0.05   0.05
(a) What is the probability that, in a randomly selected box of Super light bulbs, X
satisfies the corresponding phrase (in the first column)?
To complete the following question you may wish to refer to a number line: 0 1 2 3
For each phrase in the first column, write down the corresponding inequality or inequalities
(Column A), and the list of numbers specified by this phrase (Column B). Then write the
probability that X satisfies this condition along with result (Column C). As an example, the
first two are completed.
Please watch video clips on Moodle in the “tutorial folder” under WEEK 04
This series of videos provides the basic Excel skills required to complete the Excel
Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
COMPLETE EXCEL HOMEWORK AND SUBMIT ONLINE (MOODLE) BY END OF WEEK 4.
You need to understand Lecture 3. For a quick reference watch the following
video: Excel 2007 Tutorial PIVOT TABLE (Part 1: Basic Introduction)
https://www.youtube.com/watch?v=w8WnVPmzmTk
https://youtu.be/9NUjHBNWe9M Useful videos
https://youtu.be/g530cnFfk8Y for your Excel Exercises
Exercise 1
To achieve a better understanding of pivot tables and to link Descriptive Statistics I, II
and the current Part III, we will continue working with the Elecmart.xlsx data,
with variables Gender (male, female) and Buy Category (low, medium, high).
Produce a pivot table with Gender in the Column and Values fields and Buy Category
in the Row field, and report the count values as:
o % Grand Total (Table 1) - (What kind of percentages are these?)
o % Row (Table 2) - (What kind of percentages are these?), and
o % Column (Table 3) - (What kind of percentages are these?)
b) Is the percentage of female customers, given that they purchase at the high buy
category level, larger than the percentage of male customers, given that they purchase
at the medium buy category level?
Exercise 2
For McHammer Hardware (McH) data, question 4, calculate the standard deviation of
old model widgets in a box. Use Excel file McH.xlsx.
Exercise 3
Repeat tutorial question 6 (a) (ii) using the relevant Excel function BINOM.DIST instead of
tables. Instructions on how to use the Excel function BINOM.DIST are given in Section E4.3
of the Excel notes and in the Week 4 lecture.
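Outside Excel, BINOM.DIST can be mirrored with the Python standard library. A sketch, assuming (as in question 6) that Y ~ Binomial(n = 20, p = 0.75), where 0.75 = P(X = 0) is the probability that a student attends:

```python
from math import comb

def binom_pmf(k, n, p):
    # Excel BINOM.DIST(k, n, p, FALSE): P(Y = k)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    # Excel BINOM.DIST(k, n, p, TRUE): P(Y <= k)
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 20, 0.75
# Question 6(a)(ii): P(8 <= Y <= 15) = BINOM.DIST(15,...) - BINOM.DIST(7,...)
prob = binom_cdf(15, n, p) - binom_cdf(7, n, p)

mean = n * p                   # part (b): expected number attending = 15
sd = (n * p * (1 - p)) ** 0.5  # part (c): standard deviation, about 1.94
print(prob, mean, sd)
```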
a) Yes, as the percentage of female customers purchasing at the low buy category
level is 38.89% whilst the percentage of male customers purchasing at the
medium buy category level is 32.53%.
Exercise 2
Exercise 3
There will be Formative Assessment Tasks covering topics learned in Week 5 – 11.
Remember to Practice with MyMathLab STUDY PLAN for your Final Exam
Tutorial Questions:
1. When using tables to obtain standard normal probabilities, values of Z can only be
specified to two decimal places. Use Table 1 in the statistical tables provided to find
the following probabilities. You will need to round the Z-values to two decimal places
in order to use the tables. If you obtain the relevant probabilities using Excel (refer to
the computing lab session) you will observe some differences from the answers obtained
using the tables.
2. When using tables to obtain standard normal percentiles, values of Z can only be
specified to two decimal places. Use Table 1 in the statistical tables provided to find
the following percentiles. You will need to round the Z-values to two decimal places
in order to use the tables. If you obtain the relevant percentiles using Excel (refer to
the computing lab session) you will observe some differences from the answers obtained
using the tables.
For your answers to the following questions, please remember to do the following:
• Define the variable
• State the distribution
• Draw curves
3. The lifetimes of the heating element in a Heatfast electric oven are normally
distributed, with a mean of 7.8 years and a standard deviation of 2.0 years.
(a) (i) If the element is guaranteed for 2 years, what percentage of the ovens sold will
need replacement in the guarantee period because of element failure?
(ii) In a year in which 10,000 ovens are sold, how many ovens would you expect to
have to replace in the guarantee period because of element failure?
(b) What proportion of elements are expected to last for between 2 and 10 years?
(d) Find the length of time such that it includes 95% of all ovens. Include a statement
describing your answer.
(ii) fail after the warranty expires, but before they have lasted for 70,000 km?
(iii) last more than 72,500 km?
(b) Bert claims to have owned a Tyrannosaurus tyre which lasted 145,000 km.
Respond to Bert’s claim, without performing any further calculation.
(c) (i) Obtain the 50th percentile of tyre life. Explain, to a non-statistician, what this
value means.
(ii) Obtain (to the nearest km) the 99th percentile of tyre life.
4th edition
The normal distribution (Chapter 6)
p. 187: 6.2, 6.4, 6.6, 6.8, 6.10
Please watch video clips on Moodle in the “tutorial folder” under WEEK 05
This series of videos provides the basic Excel skills required to complete the Excel
Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
To be able to answer the Excel Exercises you need to read and practice the following:
Select a cell to contain the result, and select the NORM.S.DIST function from the list of
Statistical functions. Insert the required z value. Insert “True” in the cumulative box. Click OK.
If you supply any positive or negative value z in the dialogue box, NORM.S.DIST will yield
the result p, where p = P(Z < z) is the area to the left of z under the standard normal curve
(shaded). Note that if z < 0, NORM.S.DIST(z, true) will be a number less than 0.5.
It is always helpful to sketch a normal curve, and shade the area you are looking for, in order
to work out exactly what calculation is needed.
X ~ N(µ = 20, σ = 5)
1. Require P(X < 30)
Answer: 0.9772
Then
P(15 < X < 30) = P(X < 30) – P(X < 15) = 0.9772 – 0.1587 = 0.8185
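These probabilities can also be checked outside Excel with Python's standard library, where `statistics.NormalDist` plays the role of NORM.DIST and NORM.S.DIST (a sketch; note that full precision gives 0.8186, while the 4-decimal table values give 0.8185):

```python
from statistics import NormalDist

X = NormalDist(mu=20, sigma=5)   # X ~ N(mu = 20, sigma = 5)

p1 = X.cdf(30)                   # P(X < 30), like NORM.DIST(30, 20, 5, TRUE)
p2 = X.cdf(30) - X.cdf(15)       # P(15 < X < 30)
print(round(p1, 4), round(p2, 4))
```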
SUMMARY: [sketch: the area to the left of x under the N(µ, σ) curve equals the area to the
left of z = (x − µ)/σ under the standard normal curve]
Calculating percentiles
Select a cell to contain the result, and select the NORM.S.INV function from the list of
Statistical functions. The following dialogue box appears. Insert the required probability. Click
on OK.
In the above dialogue box, if you supply the value of the shaded area p shown below,
NORM.S.INV will return the value of z0. Note that if p < 0.5, z0 will be a negative number.
For example, what is the
1. 10th
2. 95th
percentile of the standard normal distribution?
1. Answer: z0 = −1.2816.
2. Answer: z0 = 1.6449. (Note: p > 0.5 in this example; this is the value of z0 that cuts
off an upper tail of area 0.05.)
Any normal distribution (any value of population mean µ and standard deviation σ )
Printed tables for finding normal probabilities rely on standardisation – the tables used are of
the standard normal distribution, with mean 0 and standard deviation 1. With Excel we can
find the percentile directly, without standardising. For example, for X ~ N(µ = 20, σ = 5),
the 99th percentile is NORM.INV(0.99, 20, 5).
Answer: x0 = 31.6312.
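A quick cross-check of these percentiles using Python's `statistics.NormalDist.inv_cdf`, which plays the role of NORM.S.INV and NORM.INV (illustrative only):

```python
from statistics import NormalDist

Z = NormalDist()            # standard normal distribution
z10 = Z.inv_cdf(0.10)       # like NORM.S.INV(0.10): about -1.2816
z95 = Z.inv_cdf(0.95)       # like NORM.S.INV(0.95): about  1.6449

X = NormalDist(mu=20, sigma=5)
x99 = X.inv_cdf(0.99)       # like NORM.INV(0.99, 20, 5): about 31.63
print(z10, z95, x99)
```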
SUMMARY: [sketch: supplying probability p gives z0 = NORM.S.INV(p) on the standard
normal curve, or x0 = NORM.INV(p, µ, σ) directly]
Exercise 2:
(Always select Cumulative = True for NORM.DIST)
Exercise 4:
Part A:
1. A statistical analyst who works for a large insurance company is in the process of
examining several pension plans. Company records show that the age at which its male
clients retire is approximately normally distributed with a mean of 63.7 years and a
standard deviation of 3.1 years.
(a) Calculate the probability that a randomly selected male client will retire before the age
of 65 years.
(b) If a random sample of 50 male clients is to be selected from the company database,
what is the probability that the sample mean will be less than 65 years?
(c) Close examination of the ages of recent retirees shows that the assumption of a normal
distribution may be false.
Which, if either, of your answers above would be changed by this information, and
why?
2. Consider the following sets of data drawn from a normally distributed population.
Set A: 1,1,1,1,8,8,8,8
Set B: 1,2,3,4,5,6,7,8
Each set of data is used to calculate a 95% confidence interval for the population mean.
Without doing any calculations, state, with explanation which confidence interval will
be wider.
3. The following observations were drawn from a normal population whose variance is 100:
12, 8, 22, 15, 30, 6, 39, 48
and a 90% confidence interval for the population mean has been calculated:
22.5 − 1.645 × 10/√8 < µ < 22.5 + 1.645 × 10/√8
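The interval can be verified with a short Python sketch (illustrative working, with σ = 10 known):

```python
from statistics import NormalDist

data = [12, 8, 22, 15, 30, 6, 39, 48]
n = len(data)
xbar = sum(data) / n                      # sample mean = 22.5
sigma = 10                                # population sd (variance 100)

z = NormalDist().inv_cdf(1 - 0.10 / 2)    # 1.645 for 90% confidence
half_width = z * sigma / n ** 0.5
lower, upper = xbar - half_width, xbar + half_width
print(round(lower, 2), round(upper, 2))   # 16.68 and 28.32
```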
Part B:
For all your answers, please remember to do the following:
• Define the variable
• State the distribution
• Draw curves
4. Soft drink bottles are filled so that they contain on average 330 ml of soft drink in each
bottle. The standard deviation is 4 ml. Assume that the content of soft drink bottles is
normally distributed.
(a) Calculate the probability that a randomly selected bottle will contain less than 325 ml.
(b) The bottles are sold in 6-packs. What is the probability that in a randomly selected 6-
pack the mean amount per bottle is less than 325 ml?
(c) What if the assumption of a normal distribution was incorrect? What will happen to
your answers for Parts (a) and (b)? Which, if either, of your answers above would be
changed by this information, and why?
5. A random sample of 20 petrol stations in the city of Casey on a Tuesday found that the
mean price per litre over the 20 stations was $1.52. Assume that the population standard
deviation is 3 cents.
(a) Find a 95% confidence interval for the mean price of unleaded petrol in Casey on
that day and interpret.
(b) Find a 90% confidence interval for the mean and interpret.
(c) If the same mean had been found for a sample of 80 stations, what would the 95%
confidence interval be?
(i) Discuss the confidence interval width by comparing results in part (a)
with results in part (c)
(ii) Discuss the precision of estimation by comparing results in part (a)
with results in part (c)
(a) Construct a 98% confidence interval estimate of the mean processing time and interpret it.
Calculate with precision to 2 decimal places.
HINT: Apply and learn the following systematic approach by circling what is
appropriate and finding the missing words.
• This is statistical inference related to estimation/hypothesis testing about the
population mean/proportion because...............................................................
• Standard deviation required for this calculation is population/sample
standard deviation.
• I will use the following formulae for calculating 98% confidence
interval ………………………………,because………………………………
• Do I know all the components I need to substitute into this formula? Y/N
• List all components which are known…………………………………………
• List all components which are unknown………………………………………
• How do I find missing components?
• Find the 98% confidence interval estimate (include units)
• Interpret (include units)
(b) What assumption must you make about the population distribution in (a)?
(c) Do you think that the assumption made in (b) is seriously violated? Explain.
(d) If a random sample of size 50 was selected, would your answers to part (b) and (c) be
different? Explain.
8. Bags of a certain brand of tortilla chips claim to have a net weight of 400 grams. The net
weights vary slightly from bag to bag and are non-normally distributed.
A representative of a consumer advocacy group wishes to see if there is any evidence that the
mean net weight is less than advertised. For this, the representative randomly selects 46 bags
of this brand and determines the net weight of each. He finds the mean of these selected bags
to be 395 grams and the standard deviation to be 6.8 grams. Use these data to calculate a 90%
confidence interval for the true mean weight. State the formula, show ALL working and
remember to always interpret your interval in the context of this question.
(i) find the probability that one randomly selected unit has a length greater than 120
cm;
(ii) find the probability that if three units are randomly selected, their mean length
exceeds 120 cm;
(iv) Referring to your answers above, draw the probability density function of the
length of a subcomponent in (i) and that of the mean of three subcomponents in
(ii) on a single axis.
(v) Is the distribution of the sample mean less or more variable than the distribution
of the parent population? Explain your answer.
(vi) Close examination of the lengths of an important subcomponent shows that these
are not normally distributed. Which, if either, of your answers (i) and (ii) above
will change, and why?
2. In a random sample of 400 observations from a population whose variance is 100, it was
found that x̄ = 75. Find the 95% confidence interval estimate of the population mean and
interpret it.
3rd edition
Sampling distribution (Chapter 7)
p. 213: 7.2, 7.6, 7.8
4th edition
Sampling distribution (Chapter 7)
p. 214: 7.2, 7.6, 7.8
Please watch video clips on Moodle in the “tutorial folder” under WEEK 06
This series of videos provides the basic Excel skills required to complete the Excel
Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
The dean of a business faculty claims that the average MBA graduate is offered a starting
salary of $73,000. The standard deviation of offers is $6,000. Use Excel to answer the
following questions.
(a) Find the probability that in a random sample of 38 MBA graduates the mean starting
salary is
i. Less than $70,000
ii. Between $70,000 and $74,000
iii. More than $75,000
(b) What is the lowest mean salary in the top 5% of sample mean salaries for a random
sample of 38 MBA graduates?
(c) What is the lowest mean salary in the top 90% of sample mean salaries for a random
sample of 38 MBA graduates?
(d) Is any assumption about the distribution of MBA graduate salaries necessary?
Explain.
The t distribution
In introductory statistics courses such as this, the t distribution is used only in its role in the
sampling distribution of the sample mean when the population standard deviation has to be
estimated by the sample standard deviation. In many texts it is used only in its inverse form, to
obtain a confidence interval or a critical value. With Excel, the direct form is available; it is
used primarily to obtain p values in hypothesis tests.
The t distribution is standardised, so the mean and standard deviation are not required.
However, the distribution depends on the degrees of freedom.
However, there are also t distribution functions that deal with the area in the right tail or two
tails. This reflects the use of the distribution in obtaining p values in a hypothesis test. The
function T.DIST.RT returns the right-tail probability. The function T.DIST.2T returns the
two-tail probability.
In the following example using T.DIST.RT, a variable following the t distribution with 159
degrees of freedom has a probability of 0.01137 of having a value greater than 2.3.
T.DIST.RT(2.3,159)
= Pr(t > 2.3)
= 0.01137
2. For the two-tail probability, we find T.DIST.2T(2.3,159) = 0.02275, which is just double
the previous answer.
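If you want to reproduce these tail probabilities outside Excel, `scipy.stats.t` provides the equivalents (this assumes SciPy is available; it is not required for the unit):

```python
from scipy import stats

df = 159

p_right = stats.t.sf(2.3, df)           # like T.DIST.RT(2.3, 159): P(t > 2.3)
p_two = 2 * stats.t.sf(2.3, df)         # like T.DIST.2T(2.3, 159): both tails

t_crit = stats.t.ppf(1 - 0.05 / 2, df)  # like T.INV.2T(0.05, 159)
print(p_right, p_two, t_crit)
```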
The inverse function for the t distribution exists in two versions: T.INV can be used to obtain
the critical value of Student's t distribution that cuts off a lower tail of a specified area, while
T.INV.2T returns the value that cuts off two tails with the specified total area.
T.INV.2T(0.05, 159) = 1.974996
In this example, if a variable t has the t distribution with 159 degrees of freedom,
P(t < – 1.975 or t > 1.975) = 0.05
Exercise 2
If t is drawn from the Student t distribution with 19 degrees of freedom,
Exercise 4
The routes of postal deliveries are carefully planned so that each deliverer works between 7 and
7.5 hours per shift. The planned routes assume an average walking speed of 2kph, and no
shortcuts across lawns. In an experiment to examine the amount of time deliverers actually
spend completing their shifts, a random sample of 75 postal deliverers were secretly timed. The
data are available in Ex4.xls in worksheet (a). Assuming that the times are normally distributed,
estimate with 99% confidence the mean shift time for all postal deliverers.
P(X̄ < x*) = 0.95 ∴ x* = ?
(b)
x* = NORM.INV(0.95, 73000, 973.3) = $74,601
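The same percentile of the sampling distribution of the mean can be checked in Python (a sketch; 973.3 is the standard error 6000/√38):

```python
from statistics import NormalDist

mu, sigma, n = 73000, 6000, 38
se = sigma / n ** 0.5                      # standard error of the mean, about 973.3

# 95th percentile of the sample mean, like NORM.INV(0.95, 73000, 973.3)
x_star = NormalDist(mu, se).inv_cdf(0.95)
print(round(x_star))                       # about 74601
```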
Exercise 3
(a)
Note that zcrit should cut off an upper tail equal to alpha/2.
ie Pr(Z > zcrit) = alpha/2
so Pr(Z < zcrit) = 1 - alpha/2
ie zcrit = NORM.S.INV(1 - alpha/2)
Sample size 8
Population variance: 100
Population standard deviation 10
Sample mean 22.5
Alpha 0.1
Critical value 1.644854
Upper confidence limit 28.31544
Lower confidence limit 16.68456
We can state with 90% confidence that the true mean of the
population lies between 16.68 and 28.32.
Exercise 4
Sample size n: 75
Part A:
1. Use tables to find the p-values for the following tests. (We assume the population standard
deviation is known.) If α = 0.05, state the conclusion (no interpretation possible here).
Draw curves and clearly show where the p-value is. Circle what is appropriate and find the
missing words in the text below:
(i)
H 0 : µ = 500
H 1 : µ ≠ 500
z calc = −1.76
p-value is …………… I calculate p-value for this one/two sided test as ………………….
p-value is smaller /not smaller than ………….. therefore we can/cannot reject……………….
(ii)
H 0 : µ ≤ 200
H1 : µ > 200
z calc = 2.63
p-value is …………… I calculate p-value for this one/two sided test as ………………….
p-value is not smaller /smaller than ………….. therefore we can/cannot reject……………….
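For reference, both p-values can be computed with Python's standard library, which mirrors NORM.S.DIST (a sketch, not a substitute for the table working):

```python
from statistics import NormalDist

Phi = NormalDist().cdf   # standard normal CDF, like NORM.S.DIST(z, TRUE)

# (i) two-sided test, z_calc = -1.76: p-value = 2 * P(Z < -1.76)
p_i = 2 * Phi(-1.76)

# (ii) one-sided upper-tail test, z_calc = 2.63: p-value = P(Z > 2.63)
p_ii = 1 - Phi(2.63)
print(round(p_i, 4), round(p_ii, 5))
```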
If the machinery is working correctly, the mean diameter of the buttons will be 1.3 cm. The
variance for that machine is known to be 0.0081 cm². A sample of 16 buttons is measured with
a laser and found to have a mean diameter of 1.25 cm. Test at the 5% level of significance the
hypothesis that the mean diameter of the population differs from 1.3 cm, using the critical value
approach. Ensure that you clearly state your hypotheses, show ALL steps, ALL your working
AND interpret your conclusion in the context of this question.
3. The director of manufacturing at a fabric mill needs to determine whether a new machine is
producing a particular type of cloth according to the manufacturer’s specifications, which
Is there evidence that the machine is not meeting the manufacturer’s specifications for mean
breaking strength? Use a 5% level of significance and the critical value approach. Ensure
that you clearly state your hypotheses, show ALL steps, ALL your working AND interpret your
conclusion in context of this question.
Part B:
4. A company that produces bias-ply tires is considering a certain modification in the tread
design. An economic feasibility study indicates that the modification can be justified only if the
true average tire life under standard test conditions exceeds 20,000 km. A random sample of 16
prototype tires is manufactured and tested, resulting in a sample mean tire life of 20,758 km.
Suppose tire life is normally distributed with standard deviation 1,500 km (the value for the
current version of the tire). Do these data suggest that the modification meets the condition
required for changeover? Test the appropriate hypothesis using significance level 0.01. Use
the critical value approach.
(a) Suggest appropriate null and alternative hypotheses, and explain your choice of null
and alternative hypotheses.
(c) State the test statistic. Specify the distribution of this test statistic.
(d) Perform the test at the 1% level of significance by the critical value method. Use the
recommended 5-step procedure (refer to Lecture week 7).
Step 4: State decision rule (condition leading to rejecting H0) and make decision
about H0
Step 5: State the conclusion within the context of the problem (see hint below)
(Confirm finding the critical value during the computer lab session. For the critical value,
with α = 0.01 you will need to use NORM.S.INV(1 − 0.01) = NORM.S.INV(0.99).)
(e) Find the p-value associated with the value of the test statistic obtained from the
sample. (Confirm finding the p-value during computer lab session. To do this, use
NORM.S.DIST(….,….). Is the p-value larger or smaller than the significance level?
What does it mean in relation to the test?
Conclusion:
____________________________________________________________________
____________________________________________________________________
(a) Perform a hypothesis test to determine whether there is evidence at the 5% level of
significance to conclude that the average revenue is lower than the consultant predicted.
HINT: Apply and learn the following systematic approach by circling what is appropriate
and finding the missing words.
• This is statistical inference related to estimation/hypothesis testing about the population
mean/proportion because...............................................................
• Standard deviation required for this calculation is population/sample standard
deviation because………………….
• I will use the following formulae for test statistic ………………… with the following
distribution ……………, because………………………………
• Do I know all the components I need to substitute into this formula? Y/N
• List all components which are known…………………………………………
• List all components which are unknown………………………………………
• How do I find missing components?
• I am supposed to use the critical value/p-value approach because…………….
• Follow the recommended 5-step procedure
(b) If the test were done at the 10% level of significance, would the answer change?
(a) Can it be inferred at the 5% significance level that the management negotiator is correct?
Use the critical value approach.
(Note: In this question we are asking whether there is strong evidence that the
negotiator’s claim is correct, i.e. strong evidence that the national mean income for
building workers is less than $50,000.)
(b) Can it be inferred at the 5% level that the mean income of building workers across the
country is different from $50,000? Use the critical value approach.
8. At a large furniture and electrical store customers usually find that the furniture on display is
not held in stock. Rather than being immediately available, it must be sourced from
manufacturers. In the sofa department, the average delivery time is expected to be six
weeks after purchase, and it is believed that delivery times are normally distributed around
this value. In order to test whether the six-week target is accurate, the store recorded the
delivery time (in days) taken for 50 sofa purchases and calculated the sample mean
delivery time was 43.68 days with the sample standard deviation 27.078 days.
If there is strong evidence that the mean delivery time is greater than 42 days,
the store may
• Advise customers of a longer waiting period, potentially driving away customers
who are not prepared to wait that long
• Negotiate with suppliers to investigate the possibility of more rapid service.
Both of these may be a significant cost to the store, so will only be undertaken if
the evidence is strong.
(a) Suggest appropriate null and alternative hypotheses, and explain your choice
of null and alternative hypotheses.
(c) State the test statistic. Specify the distribution of this test statistic.
(d) Perform the test at the 5% level of significance by the critical value method. Use the
recommended 5-step procedure (refer to Lecture week 6).
(e) The p-value associated with the value of the test statistic obtained from the
sample is 0.33. (Confirm finding the p-value during computer lab session.
To do this, use T.DIST.RT which requires you to provide the sample test
statistic value from Step 5, and the degrees of freedom.) Is the p-value larger
or smaller than the significance level? What does it mean in relation to the test?
Conclusion:
____________________________________________________________________
____________________________________________________________________
The approval process for a life insurance policy requires a review of the application and the
applicant’s medical history, possible requests for additional medical information and medical
examinations, and a policy compilation stage where the policy pages are generated and then
delivered. The ability to deliver policies to customers in a timely manner is critical to the
profitability of this service. During one month, a random sample of 25 approved policies is
selected and the total processing time in days is recorded. The sample mean is found to be 34.64
and the sample standard deviation is 26.00.
(a) In the past, the mean processing time averaged 45 days. At the 5% level of
significance, is there evidence that the mean processing time has changed from 45
days?
(b) What assumption about the population distribution is needed in (a)?
pp.301 - 302: 9.46, 9.50, 9.54, 9.60 [Steel available in Chapter 9 data files]
Please watch video clips on Moodle in the “tutorial folder” under WEEK 07
This series of videos provides the basic Excel skills required to complete the Excel
Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
MyMathLab STUDY PLAN – practice topic relevant questions followed by the quiz.
This is useful for your EXAM
(ii) H0 : µ ≤ 200
H1 : µ > 200
z = 2.63
x̄ = 145, s = 50, n = 100
H0 : µ ≥ 150
H1 : µ < 150
Exercise 4
The following data were drawn from a normal population. Can we conclude at the 5%
significance level that the population mean is not equal to 32?
25 18 29 33 17
Exercise 5
Ecologists have long advocated recycling newspapers as a way of saving trees and
reducing landfills. In recent years, a number of companies have gone into the business of
collecting used newspapers from households and recycling them. A financial analyst for
one such company has recently calculated that the firm would make a profit if the mean
weekly newspaper collection from each household exceeded 1 kg. In a study to determine
the feasibility of a recycling plant, a random sample of the weights of recycled newspapers
from 100 households was obtained. Find the relevant data in Exercise5.xls
Do these data provide sufficient evidence at the 1% significance level to allow the
analyst to conclude that a recycling plant would be profitable?
-1.76 0 1.76
(b)
This hypothesis test is one sided and the p-value is the area of the upper tail. This can be
obtained as
1 − NORM.S.DIST(2.63, TRUE) = 1 − 0.99573 = 0.00427
Alternatively, use the symmetry of the normal distribution which means that the lower tail
cut off by -2.63 is equal to the upper tail cut off by 2.63, so obtain directly:
NORM.S.DIST(−2.63, TRUE) = 0.00427
Exercise 2
Notice Step 5: Conclusion for both parts (a) and (b) – we CANNOT reject H 0 at 5% level
of significance. The sample DOES NOT provide enough evidence against H 0 . Therefore the
fill level for this machine is NOT significantly different from 1050ml.
xbar 145
s 50
n 100
H 0 : mean 150
t -1.000
p-value 0.16
Exercise 4
Data
25
18
29
33
17
H 0 : µ = 32
H 1 : µ ≠ 32
Significance level 5%
xbar 24.4
s 6.913754
t -2.45802
tcrit 2.776445 and -2.77645
p-value 0.06984
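The same test can be reproduced with `scipy.stats.ttest_1samp` (assuming SciPy is available; it returns the two-sided p-value, matching H1: µ ≠ 32):

```python
from scipy import stats

data = [25, 18, 29, 33, 17]
t_stat, p_value = stats.ttest_1samp(data, popmean=32)  # two-sided by default

alpha = 0.05
reject = p_value < alpha   # False here: cannot reject H0 at the 5% level
print(t_stat, p_value, reject)
```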
Exercise 5
xbar 1.0925
s 0.330073
n 100
Test statistic t 2.80241
critical value
T.INV(0.99,99)
or –T.INV(0.01,99) 2.364606
p-value
T.DIST.RT(2.802,99) 0.003051
You will have Formative Assessment Task (FAT I) during Computer Lab period.
Students revise Weeks 4 – 7 lecture material & respective tutorials
REMEMBER: Practice with “MyMathLab STUDY PLAN” for Final Exam
Part A:
1. With the recent interest in the proportion of Australians who use recreational drugs,
suppose a survey is conducted on 10,000 randomly chosen Australians aged 15 years or older.
It is found that 1,600 of the participants in the survey currently use recreational drugs.
Obtain a 95% confidence interval for the proportion of all Australians aged 15 years and older
who currently use recreational drugs.
State the formula, show all working and remember to always interpret your interval in
the context of the question.
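One way to sketch the calculation in Python (illustrative; the interval is p̂ ± z√(p̂(1 − p̂)/n)):

```python
from statistics import NormalDist

n, successes = 10000, 1600
p_hat = successes / n                      # sample proportion = 0.16

z = NormalDist().inv_cdf(1 - 0.05 / 2)     # 1.96 for 95% confidence
se = (p_hat * (1 - p_hat) / n) ** 0.5      # estimated standard error
lower, upper = p_hat - z * se, p_hat + z * se
print(round(lower, 4), round(upper, 4))    # about 0.1528 and 0.1672
```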
2. With the imminent new cigarette packaging legislation, there is a lot of interest at the
moment in the proportion of Australians who smoke. If we have an estimate of the proportion
who smoke now, then we will have a benchmark against which to judge any change that could
be attributed to the legislation. Suppose a survey is conducted on the smoking habits of 5,000
randomly chosen Australians aged 15 years or older. It is found that 784 of the participants in
the survey currently smoke.
Obtain a 95% confidence interval for the proportion of Australians aged over 15 who currently
smoke.
HINT: Apply and learn the following systematic approach by circling what is
appropriate and finding the missing words.
• This is statistical inference related to estimation/hypothesis testing about the
population mean/proportion because...............................................................
• I will use the following formulae for calculating 95% confidence
interval ………………………………,because………………………………
• Do I know all the components I need to substitute into this formula? Y/N
• List all components which are known…………………………………………
• List all components which are unknown………………………………………
• How do I find missing components?
• Find the 95% confidence interval estimate (include units)
• Interpret (include units)
Test whether there is evidence at the 5% level of significance that the percentage of Australians
aged 15 or over who smoke is greater than 15%. Use the critical value approach. Is there
evidence at the 10% level? What is the p-value for this test? Interpret the p-value.
HINT: Apply and learn the following systematic approach by circling what is
appropriate and finding the missing words.
• This is statistical inference related to estimation/hypothesis testing about the
population mean/proportion because...............................................................
• I will use the following formulae for test statistic ………………… with the following
distribution ……………, because………………………………
• Do I know all components what I need to substitute into this formula? Y/N
• List all components which are known…………………………………………
• List all components which are unknown………………………………………
• How do I find missing components?
• I suppose to use critical value/p-value approach because…………….
• Follow recommended 5 step procedure
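The checklist working above can be cross-checked numerically. Here is a minimal Python sketch (not part of the official solution) that mirrors Excel's NORM.S.INV / NORM.S.DIST using the standard library's `statistics.NormalDist`:

```python
# Sketch: checking the smoking-survey working (784 smokers out of n = 5000).
from math import sqrt
from statistics import NormalDist

n = 5000
p_hat = 784 / n  # sample proportion of smokers, 0.1568

# 95% confidence interval: p_hat +/- z_{0.025} * sqrt(p_hat(1 - p_hat)/n)
z = NormalDist().inv_cdf(0.975)           # ~1.95996, like NORM.S.INV(0.975)
me = z * sqrt(p_hat * (1 - p_hat) / n)    # margin of error
ci = (p_hat - me, p_hat + me)             # roughly (0.1467, 0.1669)

# Upper-tail test H0: pi = 0.15 vs H1: pi > 0.15. Note the standard error
# here uses the hypothesised value 0.15, not p_hat.
z_calc = (p_hat - 0.15) / sqrt(0.15 * 0.85 / n)  # roughly 1.35
p_value = 1 - NormalDist().cdf(z_calc)           # roughly 0.089
```

With these numbers, z_calc falls short of the 5% critical value but the p-value is below 0.10, which is exactly the 5%-versus-10% contrast the question asks you to discuss.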
Part B:
4. Many public polling agencies conduct surveys to determine current consumer
sentiment concerning the state of the economy. Suppose that one agency randomly samples 484
consumers and finds that 257 are optimistic about the state of the economy.
(a) Use a 90% confidence interval to estimate the proportion of all consumers who are
optimistic about the state of the economy.
Answer the following questions:
• The point estimator of the population proportion π is …….. and equals ……..
• The calculation is based on the formula ………………………………………
because …………………………………..
• The lower confidence limit is …………………… and I can calculate it
as ……………………………
• The upper confidence limit is ……………………. and I can calculate it
as ……………………………
• I am …….% confident that the population proportion (specify what it is
within the context of this question) is somewhere between ………..
and ………….
(b) Based on the confidence interval, can we infer that the majority of all consumers are
optimistic about the economy?
6. The reputation, and hence sales, of many businesses can be severely damaged by
shipments of manufactured items that contain a large percentage of defectives. A
manufacturer of alkaline batteries wants to be reasonably certain that less than 5% of its
batteries are defective. Suppose 300 batteries are randomly selected from a very large shipment;
each is tested and 10 defective batteries are found.
(a) Does this outcome provide sufficient evidence for the manufacturer to conclude that
the fraction defective in the entire shipment is less than 0.05? Use α = 0.01 .
(b) Find the p-value for the test in part (a). How strong is the evidence
favouring the alternative hypothesis in part (a)?
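One hedged way to sanity-check the battery question, assuming the natural hypotheses H0: π ≥ 0.05 versus H1: π < 0.05:

```python
# Sketch for the battery question: lower-tail test of a proportion.
from math import sqrt
from statistics import NormalDist

n, defectives = 300, 10
p_hat = defectives / n                 # 0.0333...
se0 = sqrt(0.05 * 0.95 / n)            # standard error under H0
z_calc = (p_hat - 0.05) / se0          # about -1.32
p_value = NormalDist().cdf(z_calc)     # lower-tail p-value, about 0.093
reject = p_value < 0.01                # False: cannot conclude pi < 0.05
```

Since the p-value is well above α = 0.01, this sample alone would not let the manufacturer conclude the defective fraction is below 5%.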
(a) Recall Elecmart.xlsx sample data, Pivot table of Spent vs Gender – focus on count
(c) Recall Elecmart.xlsx sample data, Pivot table of Spent vs Gender and Time – focus
on count
3rd edition
p.256-257: 8.24, 8.30
p.306: 9.62, 9.64, 9.66, 9.68
4th edition
p.257-258: 8.24, 8.30
p.306: 9.62, 9.64, 9.66, 9.68
(a)
H0: π = 0.15
H1: π ≠ 0.15
α = 5%, zcalc = −1.79
(b)
H0: π ≥ 0.79
H1: π < 0.79
α = 1%, zcalc = −3.001
Exercise 2 A random sample of 50 consumers taste-tested a new snack food. Their responses
were coded (0: do not like; 1: like; 2: indifferent) and recorded as follows:
1 0 0 1 2 0 1 1 0 0
0 1 0 2 0 2 2 0 0 1
1 0 0 0 0 1 0 2 0 0
0 1 0 0 1 0 0 1 0 1
0 2 0 0 1 1 0 0 0 1
(a) Use an 80% confidence interval to estimate the proportion of consumers who like the
new snack food.
(b) Based on your finding in part (a) can we infer that majority of customers will like the
new snack food?
Exercise 4. Refer to PlanFinan.xlsx data (tutorial week 4) again. Create a Pivot Table of Salary
vs Sex and EducLevel – focus on Count.
(a) Obtain a 95% confidence interval for the proportion of all male within
postgraduate educational level group.
(b) Can we conclude at 5% level of significance that proportion of all male within
postgraduate educational level group is smaller than 57%?
Exercise 1
(a)
p-value approach: p-value = 2 × P(Z < −1.79) = 2 × NORM.S.DIST(−1.79,1) = 0.073454
Critical value approach: zcrit = ±z0.025; NORM.S.INV(0.025) = −1.95996 and NORM.S.INV(0.975) = 1.95996
DR: Reject H0 if p-value < α; here 0.073454 > 0.05
DR: Reject H0 if zcalc < −zcrit or zcalc > zcrit; here −1.79 > −1.95996 and −1.79 < 1.95996
Using the p-value and the critical value approach we reach the same conclusion: we CANNOT
reject the null hypothesis. The sample DOES NOT provide enough evidence against H0
at the 5% level of significance. Therefore, the population proportion IS NOT
SIGNIFICANTLY different from 15%.
(b)
p-value approach: p-value = P(Z < −3.001) = NORM.S.DIST(−3.001,1) = 0.001345
Critical value approach: zcrit = −z0.01 = NORM.S.INV(0.01) = −2.32635
DR: Reject H0 if p-value < α; here 0.001345 < 0.01
DR: Reject H0 if zcalc < zcrit; here −3.001 < −2.32635
Using the p-value and the critical value approach we reach the same conclusion: we CAN reject the
null hypothesis. The sample DOES provide enough evidence against H0 at the 1% level of
significance. Therefore, the population proportion IS SIGNIFICANTLY smaller than
0.79.
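The Excel values quoted in Exercise 1 can be reproduced with Python's `statistics.NormalDist` (a stdlib stand-in for NORM.S.DIST / NORM.S.INV; shown only as a cross-check):

```python
# Reproducing the Exercise 1 answer values in Python.
from statistics import NormalDist

nd = NormalDist()
# (a) two-tailed p-value for z_calc = -1.79
p_a = 2 * nd.cdf(-1.79)        # matches 2*NORM.S.DIST(-1.79,1) = 0.073454
z_crit_a = nd.inv_cdf(0.975)   # matches NORM.S.INV(0.975) = 1.95996
# (b) lower-tail p-value for z_calc = -3.001
p_b = nd.cdf(-3.001)           # matches NORM.S.DIST(-3.001,1) = 0.001345
z_crit_b = nd.inv_cdf(0.01)    # matches NORM.S.INV(0.01) = -2.32635
```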
Exercise 2 (a) The sample proportion of customers who like the new snack is 0.3. Use
NORM.S.INV(0.1) or NORM.S.INV(0.9) to find the critical value ±1.2815. Hence the 80%
confidence interval limits are: upper 0.38 and lower 0.22.
(b) Since both confidence interval limits are below the majority
proportion (50% or 0.5), we cannot say at the 80% confidence level that the majority of
customers will like the new snack.
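The interval in Exercise 2 can be recomputed directly from the raw coded responses; a short Python sketch (for checking only):

```python
# Sketch: 80% confidence interval for the proportion who "like" (code 1),
# built straight from the 50 coded responses in the question.
from math import sqrt
from statistics import NormalDist

responses = [1,0,0,1,2,0,1,1,0,0,
             0,1,0,2,0,2,2,0,0,1,
             1,0,0,0,0,1,0,2,0,0,
             0,1,0,0,1,0,0,1,0,1,
             0,2,0,0,1,1,0,0,0,1]
n = len(responses)                      # 50
p_hat = responses.count(1) / n          # 15/50 = 0.3 "like" the snack
z = NormalDist().inv_cdf(0.90)          # ~1.2815 for an 80% interval
me = z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
lower, upper = p_hat - me, p_hat + me   # ~0.22 and ~0.38, as in the answer
```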
Exercise 3
z = −2.8284 , fail to reject H 0.
Count of Salary (Row Labels: Sex; Column Labels: EducLevel)

Row Labels       1     2     3     4    Grand Total
0                9    46    53    46    154
1               11    39    36    34    120
Grand Total     20    85    89    80    274
(a)
n = 80, p̂ = 46/80 ≈ 0.58, zα/2 = z0.025 = 1.96
(b)
H0: π ≥ 0.47
H1: π < 0.47
Note that 47% of 80 is greater than 5 and 53% of 80 is also greater than 5,
and hence the normal approximation is valid.
Test statistic: z = (p̂ − 0.47) / √(0.47(1 − 0.47)/80) is distributed approximately as N(0,1).
zcalc = (0.58 − 0.47) / √(0.47(1 − 0.47)/80) ≈ 1.971
p-value = P(Z < 1.971) = 0.9756
Since the p-value 0.9756 > 0.05, we cannot reject H0.
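A quick check of the z and p-value quoted above. Note the working rounds the sample proportion 46/80 = 0.575 to 0.58 before substituting, so the sketch does the same:

```python
# Sketch: lower-tail test of H0: pi >= 0.47 vs H1: pi < 0.47 with n = 80.
from math import sqrt
from statistics import NormalDist

n = 80
p_hat = 0.58                            # 46/80 rounded, as in the working above
se0 = sqrt(0.47 * (1 - 0.47) / n)       # standard error under H0
z_calc = (p_hat - 0.47) / se0           # ~1.971
p_value = NormalDist().cdf(z_calc)      # P(Z < 1.971) ~ 0.9756
```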
Please watch the video clips on Moodle in the “tutorial folder” under WEEK 9.
This series of videos provides the basic Excel skills required for completing the
Excel Computing Activity Tasks.
Simply click on the "Next" or "Back" buttons to navigate through all videos.
Practice with “MyMathLab STUDY PLAN” for Final Exam
Part A:
1. A regression analysis output from Excel on the SALES and PRICE for franchises of an
(unnamed) burger chain in a selection of different cities across the US is provided below.
SALES is in thousands of dollars, while PRICE is an index over all products sold in a given
month, expressed as a notional number of dollars for a meal. Note that you will practise
creating this Excel output, along with a scatter plot, in the computing lab session.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.625541
R Square             0.391301
Adjusted R Square    0.382963
Standard Error       5.096858
Observations         75

ANOVA
             df    SS         MS         F         Significance F
Regression    1    1219.091   1219.091   46.9279   1.97E-09
Residual     73    1896.391   25.97796
Total        74    3115.482

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 99.0%   Upper 99.0%
Intercept   121.9002       6.526291         18.67832   1.59E-29   108.8933    134.9071    104.639       139.1614
[Scatter plot "Manatee deaths": Number of manatee deaths (0–60) against
Number of registered powerboats ('000s) (400–750)]
c) Find the linear model for estimating the Number of manatee deaths from the number
of registered powerboats (‘000s). Use the Excel summary output below. (You will
practice how to create this output in lab session).
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.941477289
R Square             0.886379485
Adjusted R Square    0.876911109
Standard Error       4.276387771
Observations         14

ANOVA
             df    SS         MS         F          Significance F
Regression    1    1711.979   1711.979   93.61473   5.11E-07
Residual     12    219.4499   18.28749
Total        13    1931.429

                         Coefficients    Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept                -41.43043895    7.412217         -5.58948   0.000118   -57.5803    -25.2806
Powerboats (thousands)   0.124861692     0.012905         9.675471   5.11E-07   0.096744    0.152979
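For part (c), the fitted coefficients give the linear model Deaths = −41.430 + 0.1249 × Powerboats ('000s). A hedged sketch of using it for a prediction (the input value 600 is just an illustrative figure, not from the question):

```python
# Sketch: plugging a hypothetical 600 thousand registered powerboats into
# the fitted line from the regression output above.
intercept = -41.43043895
slope = 0.124861692

def predict_deaths(powerboats_thousands):
    """Estimated number of manatee deaths for a given number of
    registered powerboats (in thousands)."""
    return intercept + slope * powerboats_thousands

estimate = predict_deaths(600)  # about 33.5 deaths
```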
Part B:
4. Assuming that the relationship is statistically significant, use the information from the
scatterplot below to estimate the Final mark (maximum 100 marks) if the study time
(a) is 0 hours
(i) comment on the validity (reliability) of this estimate
(ii) explain why this estimate differs from the observed value of the Final mark for a
study time of 0 hours.
(b) is 50 hours
(i) comment on the validity (reliability) of this estimate
(ii) explain why this estimate is not encouraging for a student who wants to
get a perfect Final mark
(iii) by visual inspection of the scatter plot, state with explanation the range
of study times leading to a perfect Final mark. Explain why a study
time of 45 hours is outside this range.
[Scatter plot: Final mark (0–120) against Study time (0–50 hours)]
6. In the context of question 4, circle what is appropriate and find the missing words in the following
report:
(a) Find the least squares regression line (in terms of variables) and interpret
(b) Find the slope of the least squares regression line and interpret
The slope is ………………. The slope of the regression line predicts that, on average,
……………………increases/decreases by ……………..(along with units) for a one unit
increase/decrease in ……………………..
(c) Find the y-intercept of the least squares regression line and interpret
The y-intercept is………………. The y-intercept of the regression line predicts that, on average, the
……………………when no ……………………… is …………..(along with units). In the context of
this question it is a valid (reliable)/invalid (not reliable) estimate because …………………
OR
a) What do you expect the relationship to be between Price and Odometer Reading?
b) Use Excel to create a scatter diagram of Price against Odometer Reading (be sure you are
able to use Excel to produce this scatterplot).
Comment on how this visual relationship compares with your expectations.
[Scatter plot: Price ($) (0–20,000) against Odometer Reading (km) (0–250,000)]
c) Based on the scatter plot, comment on whether it is appropriate to fit a regression line
to the data.
A regression analysis was performed using Excel, with the following result:
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.441278
R Square             0.194727
Adjusted R Square    0.12152
Standard Error       7249.394
Observations         13

ANOVA
             df    SS         MS         F          Significance F
Regression    1    1.4E+08    1.4E+08    2.659956   0.131176221
Residual     11    5.78E+08   52553716
Total        12    7.18E+08
Data file Pages 420 – 421 Pages 427 – 428 Pages 443 – 443
CO2.xlsx 12.4 12.15(a), (c) 12.37
Class Size.xlsx 12.7 12.18(a), (c) 12.40
3rd edition
p. 454: 12.67 (a) – (d), (f); p. 454: 12.68 (a) – (d), (f), (i) (data file crude.xlsx)
4th edition
p. 454: 12.70 (a) – (d), (f); p. 454: 12.61 (a) – (d), (f), (i) (data file crude.xlsx)
See Section E5.3 in the Excel Notes on Moodle for instructions about how to
generate a simple linear regression output.
(This will provide a 99% confidence interval for population coefficients in addition to the 95%
confidence interval that is always provided.)
As a check that your output is correct, make sure that the fourth number from the top of
the output, Standard Error, is 5.09686.
(d) State the estimated linear regression equation for this data
Using the X, Y labels
Using the variable names instead of the X, Y labels
(e) Conduct a hypothesis test to determine whether there is evidence at the 5% level of
significance that a linear relationship exists between PRICE and SALES. (Remember,
only the p-value approach is used in regression analysis; apply the 5-step hypothesis
testing approach.)
(f) What is the slope of the estimated regression line? Provide an interpretation of this value.
(g) What is the value of the intercept of the regression line? Give an interpretation of this
value and discuss whether it is meaningful in this case.
Exercise 2
A government economist is attempting to produce a better measure of poverty than is currently
in use. To help acquire information, she recorded the annual household income in $000s and
the amount of money spent on food during one week for a random sample of households. Data
is available in Excel file Exercises2-4.xlsx in worksheet Exercise 2.
(a) Use Excel to produce a scatter plot of the data. Comment on whether linear regression
will supply a suitable model of the relationship.
(b) Obtain a regression output for this data, and state the equation of the regression line.
(c) Make an economic interpretation of the slope.
(d) What does the value of the intercept tell you?
(e) Estimate the weekly expenditure on food if the annual household income is:
(i) $60,000
(ii) $150,000
Comment on these estimates.
(a) Use Excel to produce a scatter plot of the data. Comment on whether linear regression
will supply a suitable model of the relationship.
(b) Obtain a regression output for this data, and state the equation of the regression line.
(c) From the output, write down the equation of the regression line.
(d) Interpret the slope.
(e) Interpret the coefficient of determination.
(f) Can we conclude that a significant linear relationship exists between years of
education and hours of internet use? At what significance level?
Exercise 4
In order to determine a realistic price for a new product that a company wishes to market, the
company’s research department selected 10 sites thought to have essentially identical sales
potential and offered the product in each at a different price. The resulting sales are recorded in
the following table and also in the Excel file Exercises2-4.xlsx in worksheet Exercise 4.
(a) Use Excel to find the graph of the scatter plot, the regression output, and the graph of
the regression line.
(b) From the output write down the equation of the regression line.
(c) Interpret the slope.
(d) Interpret the coefficient of determination.
(e) Is there sufficient evidence at the 0.5% significance level to allow us to conclude that
a significant linear relationship exists between price and sales?
Exercise 5
Reproduce scatterplot and Excel Summary Table for question 2. Data are available in Excel
file Manatee.xlsx
[Scatter plot: Sales ($000s) (50–80) against Price ($) (4.5–7)]
(d)
Ŷ = 121.900 − 7.829X
OR: Estimated Sales = 121.900 − 7.829 × Price
Step 1:
H0: β1 = 0
H1: β1 ≠ 0, where β1 is the slope of the linear relationship.
Step 2:
α = 0.05
Step 3:
p-value = 1.97E-09 ≈ 0.0000
Step 4:
Reject H0 if p-value < α.
Since 1.97E-09 < 0.05, we CAN reject the null hypothesis.
Step 5:
We CAN reject H0 at the 5% level of significance. The sample DOES provide enough
evidence against H0. That is, a significant linear relationship DOES exist between the
Sales and Price of burgers.
(f) Slope = −7.829. For every $1 increase in Price, Sales are estimated to decrease on
average by $7,829.
(g) Intercept = 121.9. If the price were zero then the sales level, on average, would be
$121,900. This could be thought of as a prediction of the sales level if the hamburgers were
being given away. However, this is not a valid prediction because the prices in the data set
range from $4.83 to $6.49, and therefore $0 is well outside the range of the data.
[Note that when interpreting the intercept and slope, it is important to take account of the
units in which the data is specified. In the current case, in particular, the sales level is in
thousands of dollars.]
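To make the units point concrete, a small sketch using the fitted line (the price $5.50 is a hypothetical value chosen to sit inside the observed $4.83–$6.49 range):

```python
# Sketch: Estimated Sales = 121.900 - 7.829 * Price, with Sales in $000s.
def estimated_sales(price):
    # Returns estimated sales in thousands of dollars.
    return 121.900 - 7.829 * price

sales_at_550 = estimated_sales(5.50)  # about 78.84, i.e. roughly $78,840
```

A prediction at $0 would run the same arithmetic, but as noted above it would be extrapolation well outside the data range.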
Exercise 2:
(a) A linear model might work; however, with very large variation amongst the data, the
strength of any linear relationship will be low.

[Scatter plot "Weekly food expenditure vs annual household income":
Weekly food expenditure ($) (150–400) against Annual income ($000's) (20–100)]
Regression Statistics
Multiple R           0.4958
R Square             0.2459
Adjusted R Square    0.24077
Standard Error       36.9393
Observations         150

ANOVA
             df     SS            MS           F         Significance F
Regression     1    65841.5803    65841.5803   48.2528   1.1047E-10
Residual     148    201948.016    1364.5136
Total        149    267789.5963
Equation of regression line: Ŷ = 153.90 + 1.96X
where X is the annual income (in $000's) and Ŷ is the estimated weekly food expenditure.
(c) Economic interpretation of slope: On average, for every extra $1000 of annual income,
expect weekly food expenditure to rise by $1.96.
(d) Value of intercept: The intercept suggests that a household with no income will spend
$153.90 per week on food. However, zero income is a long way from the values in the data
set, so this estimate is not likely to be reliable. (any other plausible economic explanations?)
(e) Weekly expenditure on food if
(i) Income = $60,000. Weekly expenditure estimated to be
153.90 + 60 × 1.96 = $271.50 (or $271.39 if you use the unrounded values straight from the
output).
Since this income level is within the range of the data, the estimate is regarded as reliable.
(ii) Income = $150,000. Weekly expenditure estimated to be
153.90 + 150 × 1.96 = $447.90 (or $447.63 if you use the unrounded values straight from the
output). Since this income level is outside the range of the data, the estimate is not
reliable.
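The two estimates above can be reproduced from the rounded regression line; a checking sketch:

```python
# Sketch: Y-hat = 153.90 + 1.96*X, where X is annual income in $000s and
# the result is estimated weekly food expenditure in dollars.
def weekly_food_spend(income_thousands):
    return 153.90 + 1.96 * income_thousands

e60 = weekly_food_spend(60)    # $271.50 -> inside the data range, reliable
e150 = weekly_food_spend(150)  # $447.90 -> outside the data range, not reliable
```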
(f) Hypothesis test:
H0: β1 = 0
H1: β1 ≠ 0, where β1 is the slope of the linear relationship.
Two-tail p-value = 1.1047E-10 (from the regression output). If α = 5%:
since 1.1047E-10 < 0.05, we CAN reject the null hypothesis.
We CAN reject H 0 at the 5% level of significance. The sample DOES provide enough
evidence against H 0 . That is, a significant, linear relationship DOES exist between the
Weekly expenditure and Income.
The null hypothesis would be rejected at all significance levels greater than about
1.1047E-10. (In other words, at all reasonable significance levels it is virtually certain
that there is a linear relationship between these two variables.)
Exercise 3
(a)
[Scatter plot: Hours of internet use (0–12) against Years of education completed (5–19)]
Most of the data seems to be scattered around a line with positive slope, however at most years
of education completed, there are some data points for zero internet usage. This may represent
those who are not connected to the internet. So a linear regression model is worth trying, but
it will really only apply to those who have access to the internet. However, following the
instructions in the question, we continue the analysis using all the data.
(b)
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.3308
R Square             0.1094
Adjusted R Square    0.1050
Standard Error       4.4539
Observations         200

ANOVA
             df     SS          MS         F         Significance F
Regression     1    482.7345    482.7345   24.3345   0.00000171
Residual     198    3927.8205   19.8375
Total        199    4410.555
Exercise 4
(a)
[Scatter plot "Sales Versus Price": sales ($000) (0–18) against Price (14–20)]
(b)
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9107
R Square             0.8294
Adjusted R Square    0.8081
Standard Error       1.6418
Observations         10

ANOVA
             df    SS         MS         F         Significance F
Regression    1    104.8364   104.8364   38.8938   0.0002
Residual      8    21.5636    2.6955
Total         9    126.4

            Coefficients   Standard Error   t Stat    P-value      Lower 95%   Upper 95%
Intercept   49.2909        6.2576           7.8770    4.8814E-05   34.8608     63.7210
x           -2.2545        0.3615           -6.2365   0.00025      -3.0882     -1.4209
[Graph: fitted regression line, Sales ($000's) (0–12) against Price of product ($) (14–20)]
(c) Interpretation of slope: For every extra dollar in price the sales will drop on
average by $2,254.50.
(e) The p-value for the test for a significant linear relationship is 0.00025. This value
is well below 0.5% (or 0.005) and therefore we CAN reject the null hypothesis.
We CAN reject H0 at the 0.5% level of significance. The sample DOES provide enough
evidence against H0. That is, a significant linear relationship DOES exist between the Sales
and Price.
You will have Formative Assessment Task (FAT II) during this period.
Students revise Weeks 8 & 9 lecture material & respective tutorials
REMEMBER: PRACTICE WITH “MyStudyPlan” for EXAM
Part A:
1.
(a) If a contingency table has 5 row categories and 6 column categories, how many degrees of
freedom are there for the χ² test for independence?
(b) What is the critical value for the test of independence for the categories represented in the
table at the 1% level of significance?
(c) And at the 5% level of significance?
(d) If the χ² value calculated for the test is greater than the critical value, what is your
conclusion?
2. Recall Elecmart.xlsx data, Pivot table of Spent vs Gender and Time – focus on count. Find
expected frequencies.
3.
Consider the following data in the contingency table (TABLE B). Conduct a test of independence
at the 5% level for the L and M categories using Table B.

TABLE B (observed)                TABLE B (expected – to be completed)
        M1    M2    Total                 M1    M2    Total
L1      15     4     19           L1                   19
L2      28    19     47           L2                   47
Total   43    23     66           Total   43    23     66

TABLE A (observed)                TABLE A (expected)
        M1    M2    Total                 M1      M2      Total
L1      30     8     38           L1      24.76   13.24    38
L2      56    38     94           L2      61.24   32.76    94
Total   86    46    132           Total   86      46      132
(c) Conduct a test of independence at the 5% level of significance for the L and M categories
using Table A.
Find the missing values in the table below. Follow the 5-step procedure in hypothesis testing. Find the
missing words or circle the words appropriate in the context of the interpretation.

Cell     fo      fe       (fo − fe)²/fe
L1M1
L1M2     8
L2M1     56
L2M2             32.76
Total    132     132

Since every cell in the contingency table has expected/observed frequencies larger than ………, this
confirms that the χ² distribution is/is not appropriate.
Step 1: Hypotheses
Reject H 0 if……………
Since ………. > …….. we CAN/CANNOT reject H 0
Step 5: Conclusion
We can/cannot reject H 0 at …….% level of significance. The sample DOES/DOES NOT provide
enough evidence to show that variables ……………. and …………. are independent/dependent.
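As a cross-check on the Table A working, the expected frequencies fe = (row total × column total)/n and the χ² statistic can be computed in a few lines (the critical value 3.841 is the standard χ² value for df = (2−1)(2−1) = 1 at α = 0.05):

```python
# Sketch for Table A: expected frequencies and chi-square statistic.
observed = [[30, 8],
            [56, 38]]
row_totals = [sum(row) for row in observed]        # [38, 94]
col_totals = [sum(col) for col in zip(*observed)]  # [86, 46]
n = sum(row_totals)                                # 132

expected = [[r * c / n for c in col_totals] for r in row_totals]
# expected ~ [[24.76, 13.24], [61.24, 32.76]], matching the table above

chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(2) for j in range(2))    # ~4.47, above 3.841
```

Since the statistic exceeds the 5% critical value, the null hypothesis of independence is rejected for Table A.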
5. Correctly predicting the direction of change in foreign currency exchange rates can be lucrative.
216 investors were asked to predict the direction of change over a certain period, and the actual
direction was later recorded. The results are given in the following table.

                     Predicted Down   Predicted Up   TOTAL
Actual Down          65               64
Actual Up            39               48
TOTAL                                                216
TIME SERIES
6. In each of the following 4 time series plots explain the choice of additive/multiplicative
seasonality with trend/no trend. In the case of present trend, state whether it is linear/quadratic and
positive/negative. For each time series plot, suggest a relevant model for forecasting with correct
components.
Additive – because the variance (the difference between the highest and lowest values) in Yt seems
to be constant with respect to time
Seasonality – because there is an evident repetitive pattern: around every 12th time
period the value of Yt is lowest, and around every 4th or 5th time period the value of Yt is highest
No trend – because overall the level of Yt fluctuates around a constant value (Yt does not increase
or decrease with respect to time)
Suggested model for forecasting:
Additive model in the form Yt = St + It, where St represents the seasonal component and It the
irregular (random) component
8. For the following graphs of time series, comment on which components appear to be present.
Is it possible to decide whether the components should be combined additively or multiplicatively?
State the model that you suggest using for forecasting each time series. State, with an explanation,
which component is not present in the time series.
(a)
[Time series plot: Yt (0–150) against time (0–30)]
(b)
[Time series plot: Unemployment (%) in Logosia (0–20) against time in quarters
since beginning of 2001 (0–30)]
(c)
[Time series plot: GDP ($millions, current) (0–800,000) against years 1988–2004]
Instructions will be given by your Teacher
For the χ² test of independence, if you need a critical value for a certain number of degrees of
freedom, you need …
For example, in lecture week 10, the cross-classification of the job status (having or not having a
job) with the exam status (HD or not HD) has degrees of freedom (2−1)(2−1) = 1 [for two columns
and two rows], the level of significance α = 0.05, and the calculated value of the χ² test statistic is 4.444.
(a) [The critical value, correct to three decimal places, is 3.841.]
(b) The p-value, using =CHISQ.DIST.RT(4.444,1), is 0.035024.
             High income   Low income   Total
Sportpack    123           154          277
Moviepack    118           111          229
Total        241           265          506
(a) Is there evidence of a relationship between income level and subscriber option? (Use
α = 0.10 )
(b) Calculate the p-value and interpret its meaning.
________________________________________________________________________
________________________________________________________________________
____________________________________________________________
ii. Some of the necessary calculations are provided below. Complete the tables on this
page.
Expected frequencies (partial):
             High income   Low income
Sportpack
Moviepack                  119.9308
iv. Is the frequency count in all cells of the contingency table ideal to conduct this test?
Explain (answer Yes or No is not sufficient).
_______________________________________________________________________
_______________________________________________________________________
______________________________________________________________
(a) At the 0.01 level of significance, is there evidence of a significant relationship between
family role and type of preferred communication?
(b) What is your answer to (a) if you use the 0.05 level of significance?
The quarterly sales of a department store chain were recorded for the past four years from 2002 to
2005. These data are available on Moodle in Excel Exercises Week 11, Exercise1-3.xlsx worksheet
Ex3.
(i) Graph the time series. (You will need to create an appropriate column for the time variable.)
To be able to answer both questions, we advise you to follow step by step procedure and
answer the relevant questions
Recommended working for part (a)
i. State the null and alternative hypotheses:
H0: Subscriber option is independent of the income level (a relationship between income level and
subscriber option does not exist)
H1: Subscriber option is dependent on the income level (a relationship between income level and
subscriber option exists)
ii. Some of the necessary calculations are provided below. Complete the tables on this
page (calculated values are bolded).
Test statistic: χ² = Σ (fo − fe)²/fe, level of significance: α = 0.1
We cannot reject H0 at the 10% level of significance. The sample DOES NOT provide enough
evidence to show that the subscriber option and the income level are dependent.
iv. Is the frequency count in all cells of the contingency table ideal to conduct this test?
Explain (answer Yes or No is not sufficient).
We require that all expected frequencies fe ≥ 5 for the χ² test to be valid. All expected
frequencies in table ii. are larger than 5; hence the frequency count in all cells is ideal
for conducting this test.
=CHISQ.DIST.RT(2.5507,1) gives 0.110245, which is the p-value. The smallest value of alpha
leading to rejection of the null hypothesis is therefore approximately 0.11 (i.e. 11%).
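The 2.5507 test statistic quoted above can be reproduced from the observed counts; a checking sketch:

```python
# Sketch: chi-square statistic for the subscriber-option vs income table.
observed = [[123, 154],   # Sportpack: high income, low income
            [118, 111]]   # Moviepack: high income, low income
row_totals = [sum(row) for row in observed]        # [277, 229]
col_totals = [sum(col) for col in zip(*observed)]  # [241, 265]
n = sum(row_totals)                                # 506

expected = [[r * c / n for c in col_totals] for r in row_totals]
# expected[1][1] ~ 119.9308, the Moviepack/low-income value given earlier

chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(2) for j in range(2))    # ~2.5507
```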
Exercise 2:
We recommend that you set and follow a similar structure to Exercise 1 to be able to answer part
(a) and (b)
(a) The calculated value of the test statistic is 234.6986, which is larger than the critical value
21.666. This leads to rejection of the null hypothesis; hence at the 1% level of significance we can
conclude that there is a relationship between the household role and the type of preferred
communication.
(b) If the level of significance is increased to 0.05, the critical value decreases to 16.919. This
does not change the answer from part (a). The extremely small
p-value, =CHISQ.DIST.RT(234.6986,9) = 1.6843 × 10⁻⁴⁵, confirms that it
is essentially impossible not to reject the null hypothesis of independence between the type
of preferred communication and the household role; hence at any reasonable level of significance
we can conclude that there is a relationship between the household role and the type of preferred
communication.
(i)
[Time series plot: quarterly sales ($million) (0–40) against quarters since 2002 (0–16)]
(ii) There is an evident trend component (linear, positive). There is also a seasonal (quarterly)
component, which is overpowered by a strong random component. If the random component were
removed or reduced in strength, the quarterly pattern (quarterly seasonal component) would be
dominant. Removal of the random component is not required in MCD2080.
Part A:
1. Two forecasting procedures were applied to the series in Excel Ex29 in order to forecast the sales
for the four quarters of 2006. The forecasts and the actual sales are given in the following table (Sales
and forecasts in $million):
Year Quarter Sales Forecast 1 Forecast 2
2006 1 30 31.2 28.6
2 31 38.0 33.7
3 40 42.7 36.1
4 49 54.2 44.4
(a) For each forecasting procedure, calculate the Mean Absolute Deviation (MAD) and the Mean Square Error (MSE).
Show working. Summarize the results in the table below.
Forecast 1 (method 1) Forecast 2 (method 2)
MAD
MSE
(b) Based on these measures, which is the more accurate forecasting method? Explain.
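The MAD and MSE calculations above can be sketched in Python, so you can check your by-hand working against the printed values:

```python
# Sketch: MAD and MSE for the two forecast sets in the table.
actual    = [30, 31, 40, 49]
forecast1 = [31.2, 38.0, 42.7, 54.2]
forecast2 = [28.6, 33.7, 36.1, 44.4]

def mad(actual, forecast):
    # Mean Absolute Deviation: average of |actual - forecast|
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    # Mean Square Error: average of (actual - forecast)^2
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

mad1, mse1 = mad(actual, forecast1), mse(actual, forecast1)  # 4.025, 21.1925
mad2, mse2 = mad(actual, forecast2), mse(actual, forecast2)  # 3.15, 11.405
# Forecast 2 has the smaller MAD and MSE, so it is the more accurate method.
```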
2. Recently the economic welfare of North Carolina has been in the spotlight with the Democratic
Convention being held there in preparation for the 2012 Presidential election.
(a) Here is the full story of the unemployment rate in North Carolina each quarter for the last 10 years.
Note that the time series starts in the second year of the Bush presidency, and t = 29 corresponds
to January 2009, the beginning of the Obama presidency. (The quarterly figures are provided in
January, April, July and October.) 1
1 The data used in this question is derived from the US Bureau of Labor Statistics website,
http://www.bls.gov/home.htm , accessed 7 September 2012.
MCD2080 Tutorial Questions and Computing Exercise – Week 11 Page 1
3. “Shop and Run” sports kit store intends to measure the seasonal effect on its sales based on the last
three years’ data. The seasonal indices for each quarter of each of these three years have been provided
in the table below.
Use this data to calculate seasonal indices for Summer, Autumn, Winter and Spring, correct to three
decimal places.
Part B:
4. (Q2 continued)
(b) Let’s look at the period of steady decline in the unemployment rate from t = 1 (January 2002) to t
= 26 (April 2008) highlighted by the rectangle.
We could model this steady decline with a linear downward trend and a seasonal component.
The trend line was determined based on the data from t = 1 to t = 26, using linear regression, and found to
be Tt = 6.878 − 0.094t
Based on this trend line, and assuming a multiplicative model, seasonal indices were calculated and found
to be:
S1 (Jan)  1.06
S2 (Apr)  0.97
S3 (Jul)  1.04
S4 (Oct)  0.93
(i) Interpret each seasonal index, use template wording provided. Circle correct choice and find
missing words in the text (be sure that you know how to get required %)
1st quarter (Jan) = 1.06 this indicates that on average the ………………. in North Carolina is
…….. % above/below the …………………………….. projection.
2nd quarter (Apr) = 0.97 this indicates that on average the ………………. in North Carolina is
…….. % above/below the …………………………….. projection.
3rd quarter (Jul) = 1.04 this indicates that on average the ………………. in North Carolina is
…….. % above/below the …………………………….. projection.
4th quarter (Oct) = 0.93 this indicates that on average the ………………. in North Carolina is
…….. % above/below the …………………………….. projection.
(ii) Using this (linear trend and quarterly seasonality) model, state the forecasting model first and
find the forecasts of the unemployment rate for the last two quarters of the Bush presidency (July
2008, t = ? and October 2008, t = ?) and the first quarter of the Obama presidency (January 2009,
t = ?).
(For comparison only, the actual values were 6.7%, 6.9% and 9.5%.)
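A sketch of part (b)(ii), assuming t = 27 (July 2008), t = 28 (October 2008) and t = 29 (January 2009) — the last of these is stated earlier in the question, and the data are quarterly. The multiplicative forecast is Ft = Tt × St:

```python
# Sketch: multiplicative trend-plus-seasonal forecasts for t = 27, 28, 29.
seasonal = {1: 1.06, 2: 0.97, 3: 1.04, 4: 0.93}  # Jan, Apr, Jul, Oct indices

def forecast(t):
    trend = 6.878 - 0.094 * t   # T_t from the fitted trend line
    quarter = (t - 1) % 4 + 1   # quarter within the year (t = 1 is Jan 2002)
    return trend * seasonal[quarter]

f_jul08, f_oct08, f_jan09 = forecast(27), forecast(28), forecast(29)
# ~4.51%, ~3.95% and ~4.40% -- far below the actual 6.7%, 6.9% and 9.5%,
# since the model only captures the pre-2008 downward trend.
```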
(c) Assuming (is this assumption reasonable?) the seasonal indices you calculated using data from 2002
to 2008 are still valid in 2012, calculate the deseasonalised (or seasonally adjusted) unemployment
rate in North Carolina for the first three quarters of 2012. The raw data is given in the following
table, which you should complete.
Time                    Unemployment rate   Seasonally adjusted unemployment rate
t = 41, January 2012    10.5
t = 42, April 2012      9.1
t = 43, July 2012       9.8
Comment on the underlying trend in the North Carolina unemployment figures in 2012.
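A sketch for part (c): the deseasonalised (seasonally adjusted) value is the raw rate divided by the seasonal index for that quarter, using the indices from part (b):

```python
# Sketch: deseasonalised rate = actual rate / seasonal index for the quarter.
seasonal = {1: 1.06, 2: 0.97, 3: 1.04, 4: 0.93}  # Jan, Apr, Jul, Oct indices

raw = {41: 10.5, 42: 9.1, 43: 9.8}  # t: raw unemployment rate in 2012

def deseasonalise(t, rate):
    quarter = (t - 1) % 4 + 1       # t = 1 is January 2002, quarterly data
    return rate / seasonal[quarter]

adjusted = {t: round(deseasonalise(t, r), 2) for t, r in raw.items()}
# {41: 9.91, 42: 9.38, 43: 9.42}; interpreting the underlying trend is
# left to the question.
```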
The data used in this question is derived from the US Bureau of Labor Statistics website,
http://www.bls.gov/home.htm , accessed 7 September 2012.
5. Based on data from 2000 to 2005, a tourism expert developed a model of room occupancy rates in
Australia which involved a linear trend ŷt = 9807.6 + 88.43t, where t is measured in quarters since
the beginning of 2000, and seasonal indices as shown in the following table:
Quarter SI
1 0.9940
2 0.9493
3 1.0283
4 1.0284
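The forecasting mechanics for a multiplicative model like this — the trend value for period t multiplied by the seasonal index of the matching quarter — can be sketched as follows. The t = 25 example is our own illustration, not part of the question:

```python
# Multiplicative trend-and-seasonality forecast: F_t = (b0 + b1*t) * SI_q
b0, b1 = 9807.6, 88.43                              # linear trend from the question
si = {1: 0.9940, 2: 0.9493, 3: 1.0283, 4: 1.0284}   # quarterly seasonal indices

def forecast(t):
    quarter = (t - 1) % 4 + 1     # t = 1 is taken to be Q1 of 2000
    return (b0 + b1 * t) * si[quarter]

# e.g. the first quarter of 2006 corresponds to t = 25
print(round(forecast(25), 1))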
6.
Manufacturing of Australian passenger vehicles has been of particular interest in recent times with many
vehicles driven in Australia manufactured overseas. This has forced the closure of some manufacturing
plants in Australia resulting in many employees losing their jobs. The graph below shows the number of
Australian passenger vehicles manufactured monthly between January 2006 and December 2013.
[Graph: Australian passenger vehicles manufactured monthly, January 2006 – December 2013;
vertical axis from 10,000 to 60,000 vehicles]
(a) Discuss what components are present in the series, and what evidence you see in the graph for each
of them.
(b) The data was analysed by fitting a straight line to all the data from 2006 to the end of 2013 and
calculating seasonal indices based on this regression line. The estimated trend line is
T̂t = 50817.35 − 31.908t, where t is time in months, with t = 1 corresponding to January 2006. Based
on this trend line, and assuming a multiplicative model, the seasonal indices were calculated and
found to be:
(i) Using the trend and seasonal components, forecast the number of vehicles
manufactured for April 2014.
(ii) Provide an interpretation of the seasonal index for May as seen in the table.
(c) Forecasts were also obtained for the first three months of 2014. Use your answer in (b) to complete
the table below (including the total) and calculate the mean absolute deviation for the forecasts of
the Australian manufactured passenger vehicles. (Some of the necessary calculations are provided
in the table.)
(d) An alternative forecasting method is also used and is found to have a mean absolute deviation of
2647.443. Use the value calculated in (c), along with this information, to determine whether the
method used in (b) or the alternative method is the best for forecasting. Explain briefly.
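The seasonal-index table referred to in (b) has not survived in this copy, so a complete numerical answer cannot be reproduced here; the mechanics of (b)(i) can still be sketched, though. April 2014 corresponds to t = 100 when t = 1 is January 2006, and the forecast is the trend value times April's seasonal index. The index value 0.95 below is a placeholder, not the value from the original table:

```python
# Forecast = trend value * seasonal index (multiplicative model).
# Trend from the question; the April index 0.95 is a PLACEHOLDER,
# since the original seasonal-index table is missing from this copy.
def trend(t):
    return 50817.35 - 31.908 * t

t_apr_2014 = (2014 - 2006) * 12 + 4   # t = 1 is January 2006, so April 2014 is t = 100
si_april = 0.95                        # placeholder seasonal index
forecast = trend(t_apr_2014) * si_april
print(t_apr_2014, round(forecast, 1))
```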
Revenue ($million)
YEAR
QUARTER 2005 2006 2007 2008 2009
1 16 14 17 18 21
2 25 27 31 29 30
3 31 32 40 45 52
4 24 23 27 24 32
(a) Use Excel to plot the time series and comment on the components that appear to be present in
the series.
(b) Regression analysis produced the trend line ŷt = 20.2 + 0.732t, where t is the time in
quarters, with t = 1 in Quarter 1 of 2005.
S1 0.646
S2 1.045
S3 1.405
S4 0.904
Use this information to forecast revenues for the four quarters of 2010.
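One way to organise this forecast is a loop over the quarters: 2010 corresponds to t = 21 to 24, and each trend value is multiplied by the matching seasonal index. This is a sketch of the calculation, not a required method:

```python
# Forecast revenue for the four quarters of 2010:
# trend y_hat = 20.2 + 0.732*t (t = 1 in Q1 2005), times the seasonal index.
si = {1: 0.646, 2: 1.045, 3: 1.405, 4: 0.904}

forecasts = {}
for quarter in range(1, 5):
    t = 20 + quarter                     # 2010 is the sixth year: t = 21..24
    forecasts[quarter] = round((20.2 + 0.732 * t) * si[quarter], 2)
print(forecasts)
```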
You will have Formative Assessment Task (III) during this period.
Students revise the week 10 lecture material and the respective tutorial.
This will be delivered on Learning Catalytics.
Instructions will be given by your teacher.
Exercise 1
For this question, you will need to use worksheet Ex1 of the Excel document Exercise1-3.xls in
Computing Exercises week 12.
For the “Turnover in hospitality” example discussed in the Week 10 lecture, evaluate the two forecast
methods using the following steps:
1. Restrict attention to the first 20 data points relevant to years 1983 to 2002 (highlighted by black
colour in the Excel document), and estimate the two models
Model 1: y = β0 + β1t + ε
Model 2: y = β0 + β1t + β2t² + ε
Follow these shortcut instructions:
• Create the scatter plot by selecting the time and turnover variables (refer to the computing
lab in week 10)
• Select all points in your scatter plot by left-clicking
• While all points are selected, right-click and select Add Trendline from the drop-down menu
• Make your choice of function: select Linear (the default) for Model 1 or 2nd order Polynomial
for Model 2 (one at a time), and tick the box Display Equation on chart in the Trendline
Options
• The requested functions, in the form y = 3560.5 + 700.52x (Model 1) and
y = −0.2981x² + 706.78x + 3537.6 (Model 2), will appear on the chart
2. You need to recognise and use the values of β̂ 0 and β̂1 obtained for Model 1, and of β̂ 0 , β̂1 and
β̂ 2 obtained for Model 2, for calculating the values yˆi corresponding to 2003 to 2007, relevant to
the last 5 data points (highlighted by red colour in the Excel document). Use your Excel
function-creating skills from the computing lab in week 1, or follow the instructions provided.
Put these values in
columns E and F of the worksheet. For example, suppose the intercept and slope for Model 1 are
in cells L17 and L18 respectively. Then the entry in E22 would be
=$L$17+$L$18*B22
and could be dragged down to calculate the remaining years.
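If you want to verify the Excel drag-down away from the worksheet, the same fitted values can be computed directly from the chart equations. A sketch, assuming t runs from 21 to 25 for the years 2003 to 2007 (t = 1 being 1983, the first highlighted data point):

```python
# Fitted values for 2003-2007 (t = 21..25), mirroring the Excel drag-down.
# Coefficients are read straight off the chart equations from step 1.
def model1(t):
    return 3560.5 + 700.52 * t

def model2(t):
    return -0.2981 * t**2 + 706.78 * t + 3537.6

for t in range(21, 26):
    print(t, round(model1(t), 2), round(model2(t), 2))
```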
3. Now in columns G and H, calculate the sum of absolute forecast errors for Model 1 and Model 2
respectively; and in columns I and J, calculate the sum of squared forecast errors. (Absolute value
is the function ABS in Excel.)
4. Summarise the results of these calculations by completing the table below, and state, with reasons,
which model you choose.
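Steps 3 and 4 amount to two error summaries per model. A minimal sketch is below; the `actual` values are placeholders standing in for column D of the worksheet, so only the structure of the calculation carries over:

```python
# Sum of absolute and squared forecast errors (steps 3 and 4).
# The `actual` values are ILLUSTRATIVE ONLY - use column D of the
# worksheet in practice. `fitted` holds the Model 1 values for t = 21..25.
def error_sums(actual, fitted):
    abs_errors = [abs(f - a) for a, f in zip(actual, fitted)]
    sq_errors = [e**2 for e in abs_errors]
    return sum(abs_errors), sum(sq_errors)

actual = [18000.0, 18500.0, 19000.0, 19500.0, 20000.0]       # placeholder data
fitted = [18271.42, 18971.94, 19672.46, 20372.98, 21073.50]  # Model 1 values
sae, sse = error_sums(actual, fitted)
print(round(sae, 2), round(sse, 2))
```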
Model 1 Model 2
Estimated equation
Exercise 2
For this question, you will need to use worksheet Ex2 of the Excel document Exercise1-3.xls in
Computing Exercises week 12. Exports are an important component of the exchange rate and,
domestically, an important indicator of employment and profitability in certain industries. The value
of Australian exports has increased over the 26-year period described in the following table.
(a) Plot the time series.
(b) Estimate a linear trend line.
(c) Does this trend line capture the long-term behaviour of the series?
(d) Predict the trend value of exports for the year 2004.
Revenue ($million)
Year
Quarter 2001 2002 2003 2004 2005
1 16 14 17 18 21
2 25 27 31 29 30
3 31 32 40 45 52
4 24 23 27 24 32
(c) This trend line was used for calculating the seasonal indices listed below. Based on these seasonal
indices, describe the seasonal pattern of the time series.
Quarter SI
1 0.646
2 1.045
3 1.405
4 0.904
(d) Using the seasonal indices and the trend line, forecast the revenues for the four quarters of 2006.
(e) Given that the actual revenues for 2006 were observed as:
2006
quarter revenue
1 23
2 35
3 50
4 32
calculate the Mean Absolute Deviation (MAD) and Mean Square Error (MSE) for the forecast.
(f) If instead we multiply the trend-line values by seasonal indices obtained from the moving average the
MAD value for the resulting 2006 forecast is 2.005, and the MSE value is 5.081. What do you conclude
about the relative merits of the two forecasts?
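The MAD and MSE calculation in (e) can be cross-checked with a short script. Forecasts use the regression coefficients from (b) (t = 21 to 24 for 2006) and the seasonal indices from (c); because the script carries full precision, its results may differ slightly from hand calculations that round intermediate values:

```python
# MAD and MSE for the 2006 forecasts (part (e)).
# Trend y_hat = 20.2105 + 0.73233*t (t = 1 in Q1 2001), multiplicative SIs.
si = [0.646, 1.045, 1.405, 0.904]
actual = [23, 35, 50, 32]          # observed 2006 revenues

errors = []
for q in range(4):
    t = 21 + q                     # 2006 quarters are t = 21..24
    f = (20.2105 + 0.73233 * t) * si[q]
    errors.append(actual[q] - f)

mad = sum(abs(e) for e in errors) / 4
mse = sum(e**2 for e in errors) / 4
print(round(mad, 4), round(mse, 4))
```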
2.
Year    Model 1     Model 2
2003    18271.42    18248.52
2004    18971.94    18942.48
2005    19672.46    19635.85
2006    20372.98    20328.61
2007    21073.50    21020.79
3.
a. In G22, type =ABS(E22-D22), and drag the formula down
b. In H22, type =ABS(F22-D22), and drag the formula down
c. In I22, type =G22^2, and drag the formula down
d. In J22, type =H22^2, and drag the formula down
e. Then sum each of the columns G, H, I, J to get the values in the table below
4.
                                  Model 1                   Model 2
Estimated equation                ŷ = 3560.5 + 700.52t      ŷ = −0.2981t² + 706.78t + 3537.6
Sum of absolute forecast errors   16798.31                  16984.63
Sum of squared forecast errors    63059962.34               64426811.28
The sum of absolute forecast errors, and the sum of squared forecast errors are both lower for the
linear model (Model 1) than for the quadratic model (Model 2). Therefore I would choose Model
1.
[Note that we could also calculate MAD and MSFE by dividing each of the numbers in the table
by 5. This would not make any difference to the comparison.]
5.
At first sight, the answer may seem surprising, because Model 2 fitted the in-sample values better.
But this had to be the case: Model 1 is a special case of Model 2, so fitting a nonzero value of
β2 can only improve on the Model 1 in-sample result. If the time series is really linear, then the
quadratic term is being fitted to small random errors, and its forecasts can rapidly move away from
the actual values in the next few years.
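This overfitting argument can be illustrated with simulated data: generate a genuinely linear series plus noise, fit both models on the first 20 points, and compare the errors. This is an illustrative simulation, not the tutorial data, and it assumes NumPy is available:

```python
import numpy as np

# Simulate a genuinely linear series with noise, fit a line and a
# quadratic to the first 20 points, then compare squared errors on
# the remaining 5 points (illustrative simulation only).
rng = np.random.default_rng(0)
t = np.arange(1, 26)
y = 3500 + 700 * t + rng.normal(0, 300, size=t.size)

t_in, y_in = t[:20], y[:20]
t_out, y_out = t[20:], y[20:]

lin = np.polyfit(t_in, y_in, 1)
quad = np.polyfit(t_in, y_in, 2)

sse_in_lin = np.sum((np.polyval(lin, t_in) - y_in) ** 2)
sse_in_quad = np.sum((np.polyval(quad, t_in) - y_in) ** 2)
sse_out_lin = np.sum((np.polyval(lin, t_out) - y_out) ** 2)
sse_out_quad = np.sum((np.polyval(quad, t_out) - y_out) ** 2)

# In-sample the quadratic can never do worse (the line is a special case
# of it); out of sample it often does worse, because beta2 chased noise.
print("in-sample: quadratic SSE <= linear SSE?", sse_in_quad <= sse_in_lin)
print("out-of-sample SSE (linear, quadratic):", sse_out_lin, sse_out_quad)
```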
(a) [Graph: time series plot of Australian annual exports ($million), 1970–2005;
vertical axis 0–140,000]
(b)
Regression Statistics
Multiple R          0.933652
R Square            0.871706
Adjusted R Square   0.867124
Standard Error      10619.13
Observations        30
ANOVA
df SS MS F Significance F
Regression 1 2.15E+10 2.15E+10 190.248 5.21E-14
Residual 28 3.16E+09 1.13E+08
Total 29 2.46E+10
            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept   3002.021       3976.577         0.754926   0.456597   -5143.63    11147.67
t           3089.579       223.9955         13.79304   5.21E-14   2630.745    3548.413
[Graph: Australian annual exports ($million) with fitted linear trend, 1970–2005]
(c) It is clear that this trend line does not capture the long-term behaviour of the series. Rather
than being randomly above and below the trend line, the early values are above it, from 1990 to
almost 2000 they are below it, and recent values are again above the trend line. This indicates that
an appropriate model would be nonlinear; the shape of the time series graph looks approximately
exponential.
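When the trend looks exponential, a standard fix is to regress ln(y) on t and back-transform. The sketch below demonstrates this on constructed data (illustrative values only, not the export series from the worksheet):

```python
import math

# Fit an exponential trend y = a * exp(b*t) by least squares on ln(y).
# The data below are ILLUSTRATIVE, exactly exponential by construction,
# so the fit should recover a = 5000 and b = 0.08.
t_vals = list(range(1, 11))
y_vals = [5000 * math.exp(0.08 * t) for t in t_vals]

n = len(t_vals)
log_y = [math.log(y) for y in y_vals]
t_bar = sum(t_vals) / n
ly_bar = sum(log_y) / n
b = sum((t - t_bar) * (ly - ly_bar) for t, ly in zip(t_vals, log_y)) / \
    sum((t - t_bar) ** 2 for t in t_vals)
a = math.exp(ly_bar - b * t_bar)
print(round(a, 1), round(b, 3))
```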
(a) [Graph: time series plot of quarterly revenue ($million) against time in quarters since the
beginning of 2001; vertical axis 0–60]
(b)
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.4525
R Square 0.2047
Adjusted R Square 0.1606
Standard Error 8.77229
Observations 20
ANOVA
             df   SS         MS         F             Significance F
Regression   1    356.6451   356.6451   4.634580645   0.045146
Residual     18   1385.155   76.95305
Total        19   1741.8

             Coefficients   Standard Error   t Stat   P-value       Lower 95%   Upper 95%
Intercept    20.2105        4.0750           4.9596   0.000101325   11.64926    28.77179
t            0.73233        0.34017          2.1528   0.04514568    0.01765     1.447011
(c)
Interpretation of Seasonal Indices:
1st quarter = 0.646: this indicates that on average the ice cream revenue is 35.4% below the
trend line projection.
2nd quarter = 1.045: this indicates that on average the ice cream revenue is 4.5% above the
trend line projection.
3rd quarter = 1.405: this indicates that on average the ice cream revenue is 40.5% above the
trend line projection.
4th quarter = 0.904: this indicates that on average the ice cream revenue is 9.6% below the
trend line projection.
(d)
Quarter   t   Ŷt   SI   Ft
(e)
Quarter   Yt   Ft   Yt − Ft   |Yt − Ft|   (Yt − Ft)²
Total                         7.16636     17.5882526

MAD = (Σ|Yt − Ft|) / 4 = 7.16636 / 4 = 1.7916
MSE = (Σ(Yt − Ft)²) / 4 = 17.5883 / 4 = 4.3971
(both sums run over the four quarters, t = 1 to 4)
(f)
Calculated values
Method 1: MAD for the trend-line SI forecast = 1.8
Method 1: MSE for the trend-line SI forecast = 4.4
Provided values
Method 2: MAD for the moving-average SI forecast = 2.005
Method 2: MSE for the moving-average SI forecast = 5.081
The trend and seasonal forecast is better, because both the MAD and the MSE for the trend-line SI
forecast are smaller than those for the moving-average forecast.
Sampling distributions

X̄ ~ N(μX̄, σX̄²) where μX̄ = μ and σX̄ = σ/√n          Z = (X̄ − μX̄)/(σ/√n) ~ N(0, 1)

p ~ N(π, σp²) where σp = √(π(1 − π)/n), if nπ ≥ 5 and n(1 − π) ≥ 5

Estimation

x̄ − z(α/2)·σ/√n < μ < x̄ + z(α/2)·σ/√n        i.e.  x̄ ± z(α/2)·σ/√n

x̄ − t(n−1, α/2)·s/√n < μ < x̄ + t(n−1, α/2)·s/√n        i.e.  x̄ ± t(n−1, α/2)·s/√n

Test statistics

z = (x̄ − μ0)/(σ/√n)        t = (x̄ − μ0)/(s/√n)        z = (p − π0)/√(π0(1 − π0)/n)

t = (b1 − c)/se(b1) ~ t(n−2)

Mean Absolute Deviation        MAD = (Σ(t=1 to n) |Yt − Ft|) / n

Mean Square Forecast Error     MSFE = (Σ(t=1 to n) (Yt − Ft)²) / n
Cumulative probabilities (z ≤ 0)
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
–0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
–0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
–0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
–0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
–0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
–0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
–0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
–0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
–0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
–0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
–1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
–1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
–1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
–1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
–1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
–1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
–1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
–1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
–1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
–1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
–2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
–2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
–2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
–2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
–2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
–2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
–2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
–2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
–2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
–2.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
–3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
–3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
–3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
–3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
–3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
Cumulative probabilities (z ≥ 0)
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
        0.995   0.99    0.975   0.95    0.90    0.10    0.05    0.025   0.01    0.005
df: 1   0.000   0.000   0.001   0.004   0.016   2.706   3.841   5.024   6.635   7.879
2 0.010 0.020 0.051 0.103 0.211 4.605 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 0.584 6.251 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 1.064 7.779 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 1.610 9.236 11.070 12.833 15.086 16.750
6 0.676 0.872 1.237 1.635 2.204 10.645 12.592 14.449 16.812 18.548
7 0.989 1.239 1.690 2.167 2.833 12.017 14.067 16.013 18.475 20.278
8 1.344 1.646 2.180 2.733 3.490 13.362 15.507 17.535 20.090 21.955
9 1.735 2.088 2.700 3.325 4.168 14.684 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 4.865 15.987 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 5.578 17.275 19.675 21.920 24.725 26.757
12 3.074 3.571 4.404 5.226 6.304 18.549 21.026 23.337 26.217 28.300
13 3.565 4.107 5.009 5.892 7.042 19.812 22.362 24.736 27.688 29.819
14 4.075 4.660 5.629 6.571 7.790 21.064 23.685 26.119 29.141 31.319
15 4.601 5.229 6.262 7.261 8.547 22.307 24.996 27.488 30.578 32.801
16 5.142 5.812 6.908 7.962 9.312 23.542 26.296 28.845 32.000 34.267
17 5.697 6.408 7.564 8.672 10.085 24.769 27.587 30.191 33.409 35.718
18 6.265 7.015 8.231 9.390 10.865 25.989 28.869 31.526 34.805 37.156
19 6.844 7.633 8.907 10.117 11.651 27.204 30.144 32.852 36.191 38.582
20 7.434 8.260 9.591 10.851 12.443 28.412 31.410 34.170 37.566 39.997
21 8.034 8.897 10.283 11.591 13.240 29.615 32.671 35.479 38.932 41.401
22 8.643 9.542 10.982 12.338 14.041 30.813 33.924 36.781 40.289 42.796
23 9.260 10.196 11.689 13.091 14.848 32.007 35.172 38.076 41.638 44.181
24 9.886 10.856 12.401 13.848 15.659 33.196 36.415 39.364 42.980 45.559
25 10.520 11.524 13.120 14.611 16.473 34.382 37.652 40.646 44.314 46.928
26 11.160 12.198 13.844 15.379 17.292 35.563 38.885 41.923 45.642 48.290
27 11.808 12.879 14.573 16.151 18.114 36.741 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 18.939 37.916 41.337 44.461 48.278 50.993
29 13.121 14.256 16.047 17.708 19.768 39.087 42.557 45.722 49.588 52.336
30 13.787 14.953 16.791 18.493 20.599 40.256 43.773 46.979 50.892 53.672
                              Population                           Sample
Mean                          μ = Σ(i=1 to N) xi / N               x̄ = Σ(i=1 to n) xi / n
Variance                      σ² = Σ(i=1 to N) (xi − μ)² / N       s² = Σ(i=1 to n) (xi − x̄)² / (n − 1)
Location of pth percentile    Lp = (n + 1)·p/100
Standard Deviation            σ = √σ²                              s = √s²
Coefficient of Variation      CV = σ/μ × 100%                      CV = s/x̄ × 100%
Probability distributions

Discrete

xi, i = 1, …, k is the list of possible values that the variable can take.

Expected value    μ = E(X) = Σ(i=1 to k) xi·p(xi)

Variance          σ² = Var(X) = Σ(i=1 to k) (xi − μ)²·p(xi)

Binomial
Table 4a: Binomial Distribution: P(X = x)
n x p 0.05 0.10 0.20 0.25 0.30 0.40 0.50 0.60 0.70 0.75 0.80 0.90 0.95
5 0 0.7738 0.5905 0.3277 0.2373 0.1681 0.0778 0.0313 0.0102 0.0024 0.0010 0.0003 0.0000 0.0000
1 0.2036 0.3281 0.4096 0.3955 0.3602 0.2592 0.1563 0.0768 0.0284 0.0146 0.0064 0.0005 0.0000
2 0.0214 0.0729 0.2048 0.2637 0.3087 0.3456 0.3125 0.2304 0.1323 0.0879 0.0512 0.0081 0.0011
3 0.0011 0.0081 0.0512 0.0879 0.1323 0.2304 0.3125 0.3456 0.3087 0.2637 0.2048 0.0729 0.0214
4 0.0000 0.0005 0.0064 0.0146 0.0284 0.0768 0.1563 0.2592 0.3602 0.3955 0.4096 0.3281 0.2036
5 0.0000 0.0000 0.0003 0.0010 0.0024 0.0102 0.0313 0.0778 0.1681 0.2373 0.3277 0.5905 0.7738
6 0 0.7351 0.5314 0.2621 0.1780 0.1176 0.0467 0.0156 0.0041 0.0007 0.0002 0.0001 0.0000 0.0000
1 0.2321 0.3543 0.3932 0.3560 0.3025 0.1866 0.0938 0.0369 0.0102 0.0044 0.0015 0.0001 0.0000
2 0.0305 0.0984 0.2458 0.2966 0.3241 0.3110 0.2344 0.1382 0.0595 0.0330 0.0154 0.0012 0.0001
3 0.0021 0.0146 0.0819 0.1318 0.1852 0.2765 0.3125 0.2765 0.1852 0.1318 0.0819 0.0146 0.0021
4 0.0001 0.0012 0.0154 0.0330 0.0595 0.1382 0.2344 0.3110 0.3241 0.2966 0.2458 0.0984 0.0305
5 0.0000 0.0001 0.0015 0.0044 0.0102 0.0369 0.0938 0.1866 0.3025 0.3560 0.3932 0.3543 0.2321
6 0.0000 0.0000 0.0001 0.0002 0.0007 0.0041 0.0156 0.0467 0.1176 0.1780 0.2621 0.5314 0.7351
7 0 0.6983 0.4783 0.2097 0.1335 0.0824 0.0280 0.0078 0.0016 0.0002 0.0001 0.0000 0.0000 0.0000
1 0.2573 0.3720 0.3670 0.3115 0.2471 0.1306 0.0547 0.0172 0.0036 0.0013 0.0004 0.0000 0.0000
2 0.0406 0.1240 0.2753 0.3115 0.3177 0.2613 0.1641 0.0774 0.0250 0.0115 0.0043 0.0002 0.0000
3 0.0036 0.0230 0.1147 0.1730 0.2269 0.2903 0.2734 0.1935 0.0972 0.0577 0.0287 0.0026 0.0002
4 0.0002 0.0026 0.0287 0.0577 0.0972 0.1935 0.2734 0.2903 0.2269 0.1730 0.1147 0.0230 0.0036
5 0.0000 0.0002 0.0043 0.0115 0.0250 0.0774 0.1641 0.2613 0.3177 0.3115 0.2753 0.1240 0.0406
6 0.0000 0.0000 0.0004 0.0013 0.0036 0.0172 0.0547 0.1306 0.2471 0.3115 0.3670 0.3720 0.2573
7 0.0000 0.0000 0.0000 0.0001 0.0002 0.0016 0.0078 0.0280 0.0824 0.1335 0.2097 0.4783 0.6983
8 0 0.6634 0.4305 0.1678 0.1001 0.0576 0.0168 0.0039 0.0007 0.0001 0.0000 0.0000 0.0000 0.0000
1 0.2793 0.3826 0.3355 0.2670 0.1977 0.0896 0.0313 0.0079 0.0012 0.0004 0.0001 0.0000 0.0000
2 0.0515 0.1488 0.2936 0.3115 0.2965 0.2090 0.1094 0.0413 0.0100 0.0038 0.0011 0.0000 0.0000
3 0.0054 0.0331 0.1468 0.2076 0.2541 0.2787 0.2188 0.1239 0.0467 0.0231 0.0092 0.0004 0.0000
4 0.0004 0.0046 0.0459 0.0865 0.1361 0.2322 0.2734 0.2322 0.1361 0.0865 0.0459 0.0046 0.0004
5 0.0000 0.0004 0.0092 0.0231 0.0467 0.1239 0.2188 0.2787 0.2541 0.2076 0.1468 0.0331 0.0054
6 0.0000 0.0000 0.0011 0.0038 0.0100 0.0413 0.1094 0.2090 0.2965 0.3115 0.2936 0.1488 0.0515
7 0.0000 0.0000 0.0001 0.0004 0.0012 0.0079 0.0313 0.0896 0.1977 0.2670 0.3355 0.3826 0.2793
8 0.0000 0.0000 0.0000 0.0000 0.0001 0.0007 0.0039 0.0168 0.0576 0.1001 0.1678 0.4305 0.6634
9 0 0.6302 0.3874 0.1342 0.0751 0.0404 0.0101 0.0020 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.2985 0.3874 0.3020 0.2253 0.1556 0.0605 0.0176 0.0035 0.0004 0.0001 0.0000 0.0000 0.0000
2 0.0629 0.1722 0.3020 0.3003 0.2668 0.1612 0.0703 0.0212 0.0039 0.0012 0.0003 0.0000 0.0000
3 0.0077 0.0446 0.1762 0.2336 0.2668 0.2508 0.1641 0.0743 0.0210 0.0087 0.0028 0.0001 0.0000
4 0.0006 0.0074 0.0661 0.1168 0.1715 0.2508 0.2461 0.1672 0.0735 0.0389 0.0165 0.0008 0.0000
5 0.0000 0.0008 0.0165 0.0389 0.0735 0.1672 0.2461 0.2508 0.1715 0.1168 0.0661 0.0074 0.0006
6 0.0000 0.0001 0.0028 0.0087 0.0210 0.0743 0.1641 0.2508 0.2668 0.2336 0.1762 0.0446 0.0077
7 0.0000 0.0000 0.0003 0.0012 0.0039 0.0212 0.0703 0.1612 0.2668 0.3003 0.3020 0.1722 0.0629
8 0.0000 0.0000 0.0000 0.0001 0.0004 0.0035 0.0176 0.0605 0.1556 0.2253 0.3020 0.3874 0.2985
9 0.0000 0.0000 0.0000 0.0000 0.0000 0.0003 0.0020 0.0101 0.0404 0.0751 0.1342 0.3874 0.6302
10 0 0.5987 0.3487 0.1074 0.0563 0.0282 0.0060 0.0010 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.3151 0.3874 0.2684 0.1877 0.1211 0.0403 0.0098 0.0016 0.0001 0.0000 0.0000 0.0000 0.0000
2 0.0746 0.1937 0.3020 0.2816 0.2335 0.1209 0.0439 0.0106 0.0014 0.0004 0.0001 0.0000 0.0000
3 0.0105 0.0574 0.2013 0.2503 0.2668 0.2150 0.1172 0.0425 0.0090 0.0031 0.0008 0.0000 0.0000
4 0.0010 0.0112 0.0881 0.1460 0.2001 0.2508 0.2051 0.1115 0.0368 0.0162 0.0055 0.0001 0.0000
5 0.0001 0.0015 0.0264 0.0584 0.1029 0.2007 0.2461 0.2007 0.1029 0.0584 0.0264 0.0015 0.0001
6 0.0000 0.0001 0.0055 0.0162 0.0368 0.1115 0.2051 0.2508 0.2001 0.1460 0.0881 0.0112 0.0010
7 0.0000 0.0000 0.0008 0.0031 0.0090 0.0425 0.1172 0.2150 0.2668 0.2503 0.2013 0.0574 0.0105
8 0.0000 0.0000 0.0001 0.0004 0.0014 0.0106 0.0439 0.1209 0.2335 0.2816 0.3020 0.1937 0.0746
9 0.0000 0.0000 0.0000 0.0000 0.0001 0.0016 0.0098 0.0403 0.1211 0.1877 0.2684 0.3874 0.3151
10 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0010 0.0060 0.0282 0.0563 0.1074 0.3487 0.5987
Table 4a: Binomial Distribution: P(X = x) (ctd)
n x p 0.05 0.10 0.20 0.25 0.30 0.40 0.50 0.60 0.70 0.75 0.80 0.90 0.95
12 0 0.5404 0.2824 0.0687 0.0317 0.0138 0.0022 0.0002 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.3413 0.3766 0.2062 0.1267 0.0712 0.0174 0.0029 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
2 0.0988 0.2301 0.2835 0.2323 0.1678 0.0639 0.0161 0.0025 0.0002 0.0000 0.0000 0.0000 0.0000
3 0.0173 0.0852 0.2362 0.2581 0.2397 0.1419 0.0537 0.0125 0.0015 0.0004 0.0001 0.0000 0.0000
4 0.0021 0.0213 0.1329 0.1936 0.2311 0.2128 0.1208 0.0420 0.0078 0.0024 0.0005 0.0000 0.0000
5 0.0002 0.0038 0.0532 0.1032 0.1585 0.2270 0.1934 0.1009 0.0291 0.0115 0.0033 0.0000 0.0000
6 0.0000 0.0005 0.0155 0.0401 0.0792 0.1766 0.2256 0.1766 0.0792 0.0401 0.0155 0.0005 0.0000
7 0.0000 0.0000 0.0033 0.0115 0.0291 0.1009 0.1934 0.2270 0.1585 0.1032 0.0532 0.0038 0.0002
8 0.0000 0.0000 0.0005 0.0024 0.0078 0.0420 0.1208 0.2128 0.2311 0.1936 0.1329 0.0213 0.0021
9 0.0000 0.0000 0.0001 0.0004 0.0015 0.0125 0.0537 0.1419 0.2397 0.2581 0.2362 0.0852 0.0173
10 0.0000 0.0000 0.0000 0.0000 0.0002 0.0025 0.0161 0.0639 0.1678 0.2323 0.2835 0.2301 0.0988
11 0.0000 0.0000 0.0000 0.0000 0.0000 0.0003 0.0029 0.0174 0.0712 0.1267 0.2062 0.3766 0.3413
12 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0022 0.0138 0.0317 0.0687 0.2824 0.5404
15 0 0.4633 0.2059 0.0352 0.0134 0.0047 0.0005 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.3658 0.3432 0.1319 0.0668 0.0305 0.0047 0.0005 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
2 0.1348 0.2669 0.2309 0.1559 0.0916 0.0219 0.0032 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
3 0.0307 0.1285 0.2501 0.2252 0.1700 0.0634 0.0139 0.0016 0.0001 0.0000 0.0000 0.0000 0.0000
4 0.0049 0.0428 0.1876 0.2252 0.2186 0.1268 0.0417 0.0074 0.0006 0.0001 0.0000 0.0000 0.0000
5 0.0006 0.0105 0.1032 0.1651 0.2061 0.1859 0.0916 0.0245 0.0030 0.0007 0.0001 0.0000 0.0000
6 0.0000 0.0019 0.0430 0.0917 0.1472 0.2066 0.1527 0.0612 0.0116 0.0034 0.0007 0.0000 0.0000
7 0.0000 0.0003 0.0138 0.0393 0.0811 0.1771 0.1964 0.1181 0.0348 0.0131 0.0035 0.0000 0.0000
8 0.0000 0.0000 0.0035 0.0131 0.0348 0.1181 0.1964 0.1771 0.0811 0.0393 0.0138 0.0003 0.0000
9 0.0000 0.0000 0.0007 0.0034 0.0116 0.0612 0.1527 0.2066 0.1472 0.0917 0.0430 0.0019 0.0000
10 0.0000 0.0000 0.0001 0.0007 0.0030 0.0245 0.0916 0.1859 0.2061 0.1651 0.1032 0.0105 0.0006
11 0.0000 0.0000 0.0000 0.0001 0.0006 0.0074 0.0417 0.1268 0.2186 0.2252 0.1876 0.0428 0.0049
12 0.0000 0.0000 0.0000 0.0000 0.0001 0.0016 0.0139 0.0634 0.1700 0.2252 0.2501 0.1285 0.0307
13 0.0000 0.0000 0.0000 0.0000 0.0000 0.0003 0.0032 0.0219 0.0916 0.1559 0.2309 0.2669 0.1348
14 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0005 0.0047 0.0305 0.0668 0.1319 0.3432 0.3658
15 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0005 0.0047 0.0134 0.0352 0.2059 0.4633
20 0 0.3585 0.1216 0.0115 0.0032 0.0008 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.3774 0.2702 0.0576 0.0211 0.0068 0.0005 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
2 0.1887 0.2852 0.1369 0.0669 0.0278 0.0031 0.0002 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
3 0.0596 0.1901 0.2054 0.1339 0.0716 0.0123 0.0011 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
4 0.0133 0.0898 0.2182 0.1897 0.1304 0.0350 0.0046 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
5 0.0022 0.0319 0.1746 0.2023 0.1789 0.0746 0.0148 0.0013 0.0000 0.0000 0.0000 0.0000 0.0000
6 0.0003 0.0089 0.1091 0.1686 0.1916 0.1244 0.0370 0.0049 0.0002 0.0000 0.0000 0.0000 0.0000
7 0.0000 0.0020 0.0545 0.1124 0.1643 0.1659 0.0739 0.0146 0.0010 0.0002 0.0000 0.0000 0.0000
8 0.0000 0.0004 0.0222 0.0609 0.1144 0.1797 0.1201 0.0355 0.0039 0.0008 0.0001 0.0000 0.0000
9 0.0000 0.0001 0.0074 0.0271 0.0654 0.1597 0.1602 0.0710 0.0120 0.0030 0.0005 0.0000 0.0000
10 0.0000 0.0000 0.0020 0.0099 0.0308 0.1171 0.1762 0.1171 0.0308 0.0099 0.0020 0.0000 0.0000
11 0.0000 0.0000 0.0005 0.0030 0.0120 0.0710 0.1602 0.1597 0.0654 0.0271 0.0074 0.0001 0.0000
12 0.0000 0.0000 0.0001 0.0008 0.0039 0.0355 0.1201 0.1797 0.1144 0.0609 0.0222 0.0004 0.0000
13 0.0000 0.0000 0.0000 0.0002 0.0010 0.0146 0.0739 0.1659 0.1643 0.1124 0.0545 0.0020 0.0000
14 0.0000 0.0000 0.0000 0.0000 0.0002 0.0049 0.0370 0.1244 0.1916 0.1686 0.1091 0.0089 0.0003
15 0.0000 0.0000 0.0000 0.0000 0.0000 0.0013 0.0148 0.0746 0.1789 0.2023 0.1746 0.0319 0.0022
16 0.0000 0.0000 0.0000 0.0000 0.0000 0.0003 0.0046 0.0350 0.1304 0.1897 0.2182 0.0898 0.0133
17 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0011 0.0123 0.0716 0.1339 0.2054 0.1901 0.0596
18 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0031 0.0278 0.0669 0.1369 0.2852 0.1887
19 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0005 0.0068 0.0211 0.0576 0.2702 0.3774
20 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0008 0.0032 0.0115 0.1216 0.3585
Table 4b: Cumulative Binomial Distribution: P(X ≤ x)
n x p 0.05 0.10 0.20 0.25 0.30 0.40 0.50 0.60 0.70 0.75 0.80 0.90 0.95
5 0 0.7738 0.5905 0.3277 0.2373 0.1681 0.0778 0.0313 0.0102 0.0024 0.0010 0.0003 0.0000 0.0000
1 0.9774 0.9185 0.7373 0.6328 0.5282 0.3370 0.1875 0.0870 0.0308 0.0156 0.0067 0.0005 0.0000
2 0.9988 0.9914 0.9421 0.8965 0.8369 0.6826 0.5000 0.3174 0.1631 0.1035 0.0579 0.0086 0.0012
3 1.0000 0.9995 0.9933 0.9844 0.9692 0.9130 0.8125 0.6630 0.4718 0.3672 0.2627 0.0815 0.0226
4 1.0000 1.0000 0.9997 0.9990 0.9976 0.9898 0.9688 0.9222 0.8319 0.7627 0.6723 0.4095 0.2262
5 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
6 0 0.7351 0.5314 0.2621 0.1780 0.1176 0.0467 0.0156 0.0041 0.0007 0.0002 0.0001 0.0000 0.0000
1 0.9672 0.8857 0.6554 0.5339 0.4202 0.2333 0.1094 0.0410 0.0109 0.0046 0.0016 0.0001 0.0000
2 0.9978 0.9842 0.9011 0.8306 0.7443 0.5443 0.3438 0.1792 0.0705 0.0376 0.0170 0.0013 0.0001
3 0.9999 0.9987 0.9830 0.9624 0.9295 0.8208 0.6563 0.4557 0.2557 0.1694 0.0989 0.0159 0.0022
4 1.0000 0.9999 0.9984 0.9954 0.9891 0.9590 0.8906 0.7667 0.5798 0.4661 0.3446 0.1143 0.0328
5 1.0000 1.0000 0.9999 0.9998 0.9993 0.9959 0.9844 0.9533 0.8824 0.8220 0.7379 0.4686 0.2649
6 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
7 0 0.6983 0.4783 0.2097 0.1335 0.0824 0.0280 0.0078 0.0016 0.0002 0.0001 0.0000 0.0000 0.0000
1 0.9556 0.8503 0.5767 0.4449 0.3294 0.1586 0.0625 0.0188 0.0038 0.0013 0.0004 0.0000 0.0000
2 0.9962 0.9743 0.8520 0.7564 0.6471 0.4199 0.2266 0.0963 0.0288 0.0129 0.0047 0.0002 0.0000
3 0.9998 0.9973 0.9667 0.9294 0.8740 0.7102 0.5000 0.2898 0.1260 0.0706 0.0333 0.0027 0.0002
4 1.0000 0.9998 0.9953 0.9871 0.9712 0.9037 0.7734 0.5801 0.3529 0.2436 0.1480 0.0257 0.0038
5 1.0000 1.0000 0.9996 0.9987 0.9962 0.9812 0.9375 0.8414 0.6706 0.5551 0.4233 0.1497 0.0444
6 1.0000 1.0000 1.0000 0.9999 0.9998 0.9984 0.9922 0.9720 0.9176 0.8665 0.7903 0.5217 0.3017
7 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
8 0 0.6634 0.4305 0.1678 0.1001 0.0576 0.0168 0.0039 0.0007 0.0001 0.0000 0.0000 0.0000 0.0000
1 0.9428 0.8131 0.5033 0.3671 0.2553 0.1064 0.0352 0.0085 0.0013 0.0004 0.0001 0.0000 0.0000
2 0.9942 0.9619 0.7969 0.6785 0.5518 0.3154 0.1445 0.0498 0.0113 0.0042 0.0012 0.0000 0.0000
3 0.9996 0.9950 0.9437 0.8862 0.8059 0.5941 0.3633 0.1737 0.0580 0.0273 0.0104 0.0004 0.0000
4 1.0000 0.9996 0.9896 0.9727 0.9420 0.8263 0.6367 0.4059 0.1941 0.1138 0.0563 0.0050 0.0004
5 1.0000 1.0000 0.9988 0.9958 0.9887 0.9502 0.8555 0.6846 0.4482 0.3215 0.2031 0.0381 0.0058
6 1.0000 1.0000 0.9999 0.9996 0.9987 0.9915 0.9648 0.8936 0.7447 0.6329 0.4967 0.1869 0.0572
7 1.0000 1.0000 1.0000 1.0000 0.9999 0.9993 0.9961 0.9832 0.9424 0.8999 0.8322 0.5695 0.3366
8 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
9 0 0.6302 0.3874 0.1342 0.0751 0.0404 0.0101 0.0020 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.9288 0.7748 0.4362 0.3003 0.1960 0.0705 0.0195 0.0038 0.0004 0.0001 0.0000 0.0000 0.0000
2 0.9916 0.9470 0.7382 0.6007 0.4628 0.2318 0.0898 0.0250 0.0043 0.0013 0.0003 0.0000 0.0000
3 0.9994 0.9917 0.9144 0.8343 0.7297 0.4826 0.2539 0.0994 0.0253 0.0100 0.0031 0.0001 0.0000
4 1.0000 0.9991 0.9804 0.9511 0.9012 0.7334 0.5000 0.2666 0.0988 0.0489 0.0196 0.0009 0.0000
5 1.0000 0.9999 0.9969 0.9900 0.9747 0.9006 0.7461 0.5174 0.2703 0.1657 0.0856 0.0083 0.0006
6 1.0000 1.0000 0.9997 0.9987 0.9957 0.9750 0.9102 0.7682 0.5372 0.3993 0.2618 0.0530 0.0084
7 1.0000 1.0000 1.0000 0.9999 0.9996 0.9962 0.9805 0.9295 0.8040 0.6997 0.5638 0.2252 0.0712
8 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9980 0.9899 0.9596 0.9249 0.8658 0.6126 0.3698
9 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
10 0 0.5987 0.3487 0.1074 0.0563 0.0282 0.0060 0.0010 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.9139 0.7361 0.3758 0.2440 0.1493 0.0464 0.0107 0.0017 0.0001 0.0000 0.0000 0.0000 0.0000
2 0.9885 0.9298 0.6778 0.5256 0.3828 0.1673 0.0547 0.0123 0.0016 0.0004 0.0001 0.0000 0.0000
3 0.9990 0.9872 0.8791 0.7759 0.6496 0.3823 0.1719 0.0548 0.0106 0.0035 0.0009 0.0000 0.0000
4 0.9999 0.9984 0.9672 0.9219 0.8497 0.6331 0.3770 0.1662 0.0473 0.0197 0.0064 0.0001 0.0000
5 1.0000 0.9999 0.9936 0.9803 0.9527 0.8338 0.6230 0.3669 0.1503 0.0781 0.0328 0.0016 0.0001
6 1.0000 1.0000 0.9991 0.9965 0.9894 0.9452 0.8281 0.6177 0.3504 0.2241 0.1209 0.0128 0.0010
7 1.0000 1.0000 0.9999 0.9996 0.9984 0.9877 0.9453 0.8327 0.6172 0.4744 0.3222 0.0702 0.0115
8 1.0000 1.0000 1.0000 1.0000 0.9999 0.9983 0.9893 0.9536 0.8507 0.7560 0.6242 0.2639 0.0861
9 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9990 0.9940 0.9718 0.9437 0.8926 0.6513 0.4013
10 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
Table 4b: Cumulative Binomial Distribution: P(X ≤ x) (ctd)
n  x  p = 0.05 0.10 0.20 0.25 0.30 0.40 0.50 0.60 0.70 0.75 0.80 0.90 0.95
12 0 0.5404 0.2824 0.0687 0.0317 0.0138 0.0022 0.0002 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.8816 0.6590 0.2749 0.1584 0.0850 0.0196 0.0032 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
2 0.9804 0.8891 0.5583 0.3907 0.2528 0.0834 0.0193 0.0028 0.0002 0.0000 0.0000 0.0000 0.0000
3 0.9978 0.9744 0.7946 0.6488 0.4925 0.2253 0.0730 0.0153 0.0017 0.0004 0.0001 0.0000 0.0000
4 0.9998 0.9957 0.9274 0.8424 0.7237 0.4382 0.1938 0.0573 0.0095 0.0028 0.0006 0.0000 0.0000
5 1.0000 0.9995 0.9806 0.9456 0.8822 0.6652 0.3872 0.1582 0.0386 0.0143 0.0039 0.0001 0.0000
6 1.0000 0.9999 0.9961 0.9857 0.9614 0.8418 0.6128 0.3348 0.1178 0.0544 0.0194 0.0005 0.0000
7 1.0000 1.0000 0.9994 0.9972 0.9905 0.9427 0.8062 0.5618 0.2763 0.1576 0.0726 0.0043 0.0002
8 1.0000 1.0000 0.9999 0.9996 0.9983 0.9847 0.9270 0.7747 0.5075 0.3512 0.2054 0.0256 0.0022
9 1.0000 1.0000 1.0000 1.0000 0.9998 0.9972 0.9807 0.9166 0.7472 0.6093 0.4417 0.1109 0.0196
10 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9968 0.9804 0.9150 0.8416 0.7251 0.3410 0.1184
11 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9978 0.9862 0.9683 0.9313 0.7176 0.4596
12 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
15 0 0.4633 0.2059 0.0352 0.0134 0.0047 0.0005 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.8290 0.5490 0.1671 0.0802 0.0353 0.0052 0.0005 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
2 0.9638 0.8159 0.3980 0.2361 0.1268 0.0271 0.0037 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
3 0.9945 0.9444 0.6482 0.4613 0.2969 0.0905 0.0176 0.0019 0.0001 0.0000 0.0000 0.0000 0.0000
4 0.9994 0.9873 0.8358 0.6865 0.5155 0.2173 0.0592 0.0093 0.0007 0.0001 0.0000 0.0000 0.0000
5 0.9999 0.9978 0.9389 0.8516 0.7216 0.4032 0.1509 0.0338 0.0037 0.0008 0.0001 0.0000 0.0000
6 1.0000 0.9997 0.9819 0.9434 0.8689 0.6098 0.3036 0.0950 0.0152 0.0042 0.0008 0.0000 0.0000
7 1.0000 1.0000 0.9958 0.9827 0.9500 0.7869 0.5000 0.2131 0.0500 0.0173 0.0042 0.0000 0.0000
8 1.0000 1.0000 0.9992 0.9958 0.9848 0.9050 0.6964 0.3902 0.1311 0.0566 0.0181 0.0003 0.0000
9 1.0000 1.0000 0.9999 0.9992 0.9963 0.9662 0.8491 0.5968 0.2784 0.1484 0.0611 0.0022 0.0001
10 1.0000 1.0000 1.0000 0.9999 0.9993 0.9907 0.9408 0.7827 0.4845 0.3135 0.1642 0.0127 0.0006
11 1.0000 1.0000 1.0000 1.0000 0.9999 0.9981 0.9824 0.9095 0.7031 0.5387 0.3518 0.0556 0.0055
12 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9963 0.9729 0.8732 0.7639 0.6020 0.1841 0.0362
13 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9995 0.9948 0.9647 0.9198 0.8329 0.4510 0.1710
14 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9995 0.9953 0.9866 0.9648 0.7941 0.5367
15 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
20 0 0.3585 0.1216 0.0115 0.0032 0.0008 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.7358 0.3917 0.0692 0.0243 0.0076 0.0005 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
2 0.9245 0.6769 0.2061 0.0913 0.0355 0.0036 0.0002 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
3 0.9841 0.8670 0.4114 0.2252 0.1071 0.0160 0.0013 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
4 0.9974 0.9568 0.6296 0.4148 0.2375 0.0510 0.0059 0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
5 0.9997 0.9887 0.8042 0.6172 0.4164 0.1256 0.0207 0.0016 0.0000 0.0000 0.0000 0.0000 0.0000
6 1.0000 0.9976 0.9133 0.7858 0.6080 0.2500 0.0577 0.0065 0.0003 0.0000 0.0000 0.0000 0.0000
7 1.0000 0.9996 0.9679 0.8982 0.7723 0.4159 0.1316 0.0210 0.0013 0.0002 0.0000 0.0000 0.0000
8 1.0000 0.9999 0.9900 0.9591 0.8867 0.5956 0.2517 0.0565 0.0051 0.0009 0.0001 0.0000 0.0000
9 1.0000 1.0000 0.9974 0.9861 0.9520 0.7553 0.4119 0.1275 0.0171 0.0039 0.0006 0.0000 0.0000
10 1.0000 1.0000 0.9994 0.9961 0.9829 0.8725 0.5881 0.2447 0.0480 0.0139 0.0026 0.0000 0.0000
11 1.0000 1.0000 0.9999 0.9991 0.9949 0.9435 0.7483 0.4044 0.1133 0.0409 0.0100 0.0001 0.0000
12 1.0000 1.0000 1.0000 0.9998 0.9987 0.9790 0.8684 0.5841 0.2277 0.1018 0.0321 0.0004 0.0000
13 1.0000 1.0000 1.0000 1.0000 0.9997 0.9935 0.9423 0.7500 0.3920 0.2142 0.0867 0.0024 0.0000
14 1.0000 1.0000 1.0000 1.0000 1.0000 0.9984 0.9793 0.8744 0.5836 0.3828 0.1958 0.0113 0.0003
15 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9941 0.9490 0.7625 0.5852 0.3704 0.0432 0.0026
16 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9987 0.9840 0.8929 0.7748 0.5886 0.1330 0.0159
17 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9964 0.9645 0.9087 0.7939 0.3231 0.0755
18 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9995 0.9924 0.9757 0.9308 0.6083 0.2642
19 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9992 0.9968 0.9885 0.8784 0.6415
20 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
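The table entries above can be reproduced directly from the binomial formula, P(X ≤ x) = Σₖ₌₀ˣ C(n, k) pᵏ (1 − p)ⁿ⁻ᵏ. As a sketch (not part of the unit materials), the following Python function computes a cumulative binomial probability and cross-checks two entries from the tables:

```python
from math import comb

def binom_cdf(n: int, x: int, p: float) -> float:
    """Cumulative binomial probability P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Cross-check against two table entries (rounded to 4 decimal places):
print(round(binom_cdf(20, 10, 0.5), 4))  # n = 20, x = 10, p = 0.50 -> 0.5881
print(round(binom_cdf(10, 4, 0.3), 4))   # n = 10, x =  4, p = 0.30 -> 0.8497
```

This is useful for checking a table lookup, or for values of n and p that the printed tables do not cover.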
Notes on the Faculty-approved HP 10bII+ Financial Calculator
Decimal places
By default, the calculator displays only 2 decimal places.
To change this, press SHIFT then DISP, followed by a digit (for example, 6 for 6 decimal places).
For a convenient display of up to the maximum number of digits, press SHIFT then DISP,
followed by the decimal point key.