
Data Analysis Exercise

Open Study Materials – Week 8 and download the Excel file: Data Analysis Practice 2022 Data

BASIC DESCRIPTIVE STATISTICS – TAB 1

1. Save a copy to your personal university Microsoft OneDrive

2. Calculate basic statistics for each column of data: report the mean (average) in row 47 and the
standard deviation in row 48 under each column

Example Method for pH:

MEAN Put cursor in the cell C47 (where the mean result for pH will go)

Click ‘Formulas’ – ‘more functions’ – ‘statistical’ – ‘average’ – Select or highlight the cell range (C2:C46)
in the Number1 box – OK. The result will appear in cell C47

STANDARD DEVIATION (shows how widely the individual results are spread around the mean value)

Report as a ± from the mean value. Move cursor to result cell, C48.

Click ‘Formulas’ – ‘more functions’ – ‘statistical’ – ‘STDEVA’ - Select or highlight the cell range, e.g.
(C2:C46) in value 1 box, ENTER.

In cell C49, report as mean ± standard deviation, e.g. pH = 5.8 ± 0.2 (± special Symbol character code
00B1)

3. Repeat for all data columns – the formula can be copied and pasted, but make sure the cell references
in each formula point to the correct column
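As a cross-check on the spreadsheet result, the mean ± standard deviation report can be sketched in Python. This is only an illustration: the function name `summarise` and the pH values are made up, and `statistics.stdev` is the sample standard deviation (n - 1 denominator), which matches STDEVA when the column holds only numbers.

```python
from statistics import mean, stdev


def summarise(values, decimals=1):
    """Return 'mean ± standard deviation' as text, e.g. '5.8 ± 0.2'."""
    avg = mean(values)    # same role as AVERAGE(C2:C46)
    spread = stdev(values)  # sample standard deviation (n - 1)
    # \u00b1 is the plus-minus sign, character code 00B1
    return f"{avg:.{decimals}f} \u00b1 {spread:.{decimals}f}"


# Hypothetical pH readings, not taken from the course workbook
ph_values = [5.6, 5.8, 6.0]
print("pH =", summarise(ph_values))
```

Swapping in another column's values repeats the calculation, in the same way that the formula is copied across columns in the worksheet.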

GRAPHS – CLICK ON THE GRAPH TAB AT BOTTOM OF WORKSHEET

Draw a graph showing conductivity with a trend line and error bars for standard deviation

1. Calculate the mean and standard deviation of the triplicate results for each grid reference
and put the results in the adjacent columns.

e.g. For grid reference SK05392 44856, cell range = (B2:B4), mean result in F2, std dev result G2

2. Use the cursor to highlight area C2:F16

Click ‘Insert’ – ‘Chart’ – select type as either column/bar chart or scatter/XY plot – your own choice

3. Use the + symbol on left of chart to open up all the different options

4. Add Chart title

5. Add titles to x-axis & y-axis (& units) by pressing > option
6. Add a trend line by pressing > more options – linear – automatic – tick ‘Display R-squared value on
chart’ (R-squared is the coefficient of determination; a value close to 1 means the trend line fits the data well)

7. Add Custom Error bars for the standard deviation – Press + for chart options

Check ‘Error Bars’ – click on horizontal line & delete using PC keyboard, so only vertical ones remain

Click on > symbol next to Error bars, ‘More Options’ – Click ‘custom’ box – ‘Specify Value’ – New box
opens. Highlight the standard deviation cells for both the positive and the negative error values

Note: I used different data so will not look like yours!
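The graph steps above can be sketched numerically in Python: a mean and standard deviation for each set of triplicates (the error bar values), then a least-squares trend line with its R-squared. The function names, the conductivity numbers and the use of site position 1, 2, 3 as the x-axis are all assumptions made for illustration; they are not taken from the workbook.

```python
from statistics import mean, stdev


def triplicate_stats(replicates):
    """Mean and sample standard deviation for each group of replicates."""
    return [(mean(group), stdev(group)) for group in replicates]


def linear_trend(xs, ys):
    """Least-squares straight line; returns (slope, intercept, r_squared)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)  # fails if all y values are identical
    return slope, intercept, r_squared


# Hypothetical conductivity triplicates for three sampling sites
conductivity = [[410, 420, 430], [455, 460, 465], [500, 510, 520]]
stats = triplicate_stats(conductivity)
means = [m for m, _ in stats]        # plotted values
error_bars = [s for _, s in stats]   # custom error bar values

slope, intercept, r2 = linear_trend([1, 2, 3], means)
print(means, error_bars)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, R-squared={r2:.3f}")
```

The same standard deviation is used above and below each mean, which is why the custom error bar box takes the same cells for the positive and negative values.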


NORMALITY TESTING – IS DATA PARAMETRIC OR NON-PARAMETRIC? - PARAMETRIC tab

Click on the web link to open the Shapiro-Wilk calculator

Delete their data and copy and paste your data into the Data box

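If p > 0.05, the data are consistent with a normal distribution: treat as parametric

If p < 0.05, the data depart significantly from a normal distribution: treat as non-parametric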
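The decision rule above can be written as a small Python helper. This is a sketch only: the function name `classify_distribution` is hypothetical, the p-value is whatever the Shapiro-Wilk calculator reports, and treating a value of exactly 0.05 as non-parametric is an assumption, since the rule above does not cover that case.

```python
def classify_distribution(p_value, alpha=0.05):
    """Apply the p > 0.05 / p < 0.05 rule to a Shapiro-Wilk p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Assumption: p exactly equal to alpha is treated as non-parametric
    return "parametric" if p_value > alpha else "non-parametric"


# Hypothetical p-values copied from a calculator
print(classify_distribution(0.31))   # parametric
print(classify_distribution(0.004))  # non-parametric
```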

INFERENTIAL TESTS TO SEE IF 2 DATA SETS ARE SIGNIFICANTLY DIFFERENT

Go to Inferential tab on worksheet

1. Run a Shapiro-Wilk test on the data to check whether the distribution is parametric or non-parametric

Mann-Whitney for non-parametric data

Use this website if you are unsure

https://www.statskingdom.com/170median_mann_whitney.html

Input the 2 sets of data to compare and the website calculates the p-value
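To show roughly what such a calculator does, here is a minimal Python sketch of the Mann-Whitney U test. It is an approximation under stated assumptions: tied values share an average rank, the two-tailed p-value comes from the normal approximation with no tie or continuity correction, and the data are invented. The website may use an exact method, so its p-value can differ, particularly for small samples.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U, two-tailed p) using the normal approximation."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = 2 * NormalDist().cdf((u - mu) / sigma)
    return u, min(1.0, p)


# Hypothetical data sets
u, p = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(f"U = {u}, p = {p:.4f}")
```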

t-test, for statistical differences for parametric data

Click on ‘Formulas’ – more functions – statistical – TTEST

Array 1 – highlight the first data set

Array 2 – highlight the second data set to compare

Tail = 2 (2 = a 2-tailed distribution)

Type = 3 (3 = type of t-test being 2 sample unequal variance)

Example null hypothesis: The mean concentration of nitrates in the River Churnet in 2020 is NOT
significantly different from the 2022 data (in other words, there is no statistical difference between
the data sets). Test it at the p = 0.05 significance level.

If p>0.05 accept the null hypothesis (no significant difference).

If p<0.05 reject the null hypothesis (significant difference)

(High p-value = no evidence against the null hypothesis, i.e. no statistical difference; low p-value = evidence against the null hypothesis, i.e. a statistical difference.)

Quote your actual P number in results.
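For reference, the calculation behind TTEST with Tail = 2 and Type = 3 (two-sample, unequal variance, often called Welch's t-test) can be sketched in Python using only the standard library. The function names and data are hypothetical, and the p-value is obtained by numerically integrating the t-distribution with Simpson's rule, so it is an approximation that may differ from Excel in the later decimal places.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value; steps must be even for Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


def welch_t_test(a, b):
    """Return (t, degrees of freedom, two-tailed p) for unequal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_tailed_p(t, df)


# Hypothetical nitrate results for the two years
t, df, p = welch_t_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
decision = "reject" if p < 0.05 else "accept"
print(f"t = {t:.3f}, df = {df:.2f}, p = {p:.3f}: {decision} the null hypothesis")
```

The last line applies the same p < 0.05 rule as above and prints the actual p-value, which is the number to quote in the results.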
