
Name: _____________________________________ Date: ____/____/____

Exercise list 3 – Deadline May 3rd, 4 p.m.

1) Explain in your own words what linear regression is. Also in your own words, explain the
assumptions of the method.

2) Total dissolved solids (TDS) is one of the parameters used to determine water quality for
human consumption. The World Health Organization (WHO) establishes 500 ppm as the
acceptable limit of TDS. However, under conditions where water is scarce and this standard
cannot be achieved, WHO states that a limit of 2000 ppm is tolerable. The file ‘LaGuajira Interior.xlsx’
contains information gathered across 67 wells in the vicinity of Maicao city, La Guajira. Perform a
statistical analysis (including Principal Component Analysis) of the variables and construct the best
possible linear regression model to predict TDS content in the groundwater. Discuss your results.
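
As a rough illustration of the workflow in exercise 2, the sketch below runs a PCA on standardized predictors and fits an ordinary least squares model with NumPy. It is only a sketch under stated assumptions: the data are random placeholders (67 rows, three hypothetical predictors), not the contents of the spreadsheet, and the variable names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: 67 rows (one per well) and 3 hypothetical predictors.
# Replace X and y with the columns read from the spreadsheet.
n_wells = 67
X = rng.normal(size=(n_wells, 3))
y = 800.0 + 300.0 * X[:, 0] - 150.0 * X[:, 1] + rng.normal(scale=50.0, size=n_wells)

# PCA on standardized predictors: eigen-decomposition of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues come back in ascending order
order = np.argsort(eigvals)[::-1]         # reorder so PC1 explains the most variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = eigvals / eigvals.sum()       # share of variance per component
scores = Z @ eigvecs                      # principal component scores

# Ordinary least squares with an intercept: y = b0 + b1*x1 + b2*x2 + b3*x3
A = np.column_stack([np.ones(n_wells), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print("explained variance ratio:", np.round(explained, 3))
print("coefficients:", np.round(coef, 2))
print("R^2:", round(float(r2), 3))
```

Comparing candidate models (different predictor subsets, or regression on the leading component scores) by R^2 and residual behaviour is one way to argue which model is "best"; the choice of criterion is left to the analyst.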

3) In Sicua you will find a new version of the slides from ‘Class-Stochastic EXES’. This new set
contains instructions on how to use principal components to run a stochastic simulation of correlated
values. Use the file ‘GeoEAS Stochastic.xlsx’ to run a simulation of 1000 correlated values for
Arsenic, Cadmium and Lead. Finally, calculate the correlation matrix of the original values and
the correlation matrix of the simulated values. Discuss the results.
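
One common way to simulate correlated values with principal components is sketched below; it may differ in detail from the procedure in the class slides. Independent normal scores are drawn for each component, scaled by the square root of its eigenvalue, and rotated back with the eigenvectors of the correlation matrix. The "original" data here are random placeholders standing in for the Arsenic, Cadmium and Lead columns, and the sketch assumes normally distributed variables, so it can produce negative values for strongly skewed concentrations.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder "original" data for three columns (As, Cd, Pb).
# Replace with the values read from the spreadsheet.
base = rng.normal(size=(100, 3))
mix = np.array([[1.0, 0.6, 0.3],
                [0.0, 0.8, 0.5],
                [0.0, 0.0, 0.7]])
original = base @ mix + np.array([10.0, 2.0, 25.0])

mean = original.mean(axis=0)
std = original.std(axis=0, ddof=1)
corr_original = np.corrcoef(original, rowvar=False)

# Principal components of the correlation matrix: C = V diag(lambda) V^T
eigvals, eigvecs = np.linalg.eigh(corr_original)
eigvals = np.clip(eigvals, 0.0, None)   # guard against tiny negative round-off

# Independent scores with variance lambda_i, rotated back to the variable space
n_sim = 1000
scores = rng.standard_normal((n_sim, 3)) * np.sqrt(eigvals)
z_sim = scores @ eigvecs.T              # standardized, correlated values
simulated = mean + z_sim * std          # back to original units

corr_simulated = np.corrcoef(simulated, rowvar=False)
print(np.round(corr_original, 2))
print(np.round(corr_simulated, 2))
```

The two matrices should agree up to sampling noise, which shrinks as the number of simulated values grows.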

4) Stochastic simulation is an especially useful tool when it is necessary to generate maps and quantify
the uncertainty of such maps. Use the file ‘LaGuajira Interior.xlsx’ to generate 60 normally
distributed values of Total Dissolved Solids for each sample site. That means that the simulated
stochastic values will be associated with a geographic XY location, i.e., the simulation should be
performed across rows. To achieve that, use the TDS value of each collected sample as the mean and
the standard deviation of all the collected samples as the parameters for the Excel function that will
generate the normally distributed values (Tip: create a separate spreadsheet with random numbers
between 0 and 1 to use in the formula). Finally, after you generate the 60 values, calculate the
average, median and 95th percentile of the simulated values and use Surfer to create
interpolated maps of these 3 parameters using inverse distance weighting. Interpret and discuss the
generated maps.
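
The row-wise simulation in exercise 4 can be sketched outside Excel as follows. The sketch mirrors the tip above: a uniform random number between 0 and 1 is passed through the inverse normal CDF, which is what a formula of the form NORM.INV(RAND(), mean, sd) does. The TDS values are made-up placeholders, and the `percentile` helper is a hypothetical function written for this example.

```python
import random
from statistics import NormalDist, mean, median, stdev

random.seed(1)

# Hypothetical TDS values (ppm), one per sample site; replace with the real column.
tds = [350.0, 820.0, 1500.0, 640.0, 2100.0]
sd_all = stdev(tds)  # sample standard deviation of all collected samples


def percentile(values, p):
    """Percentile by linear interpolation between ranks (PERCENTILE.INC convention)."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


n_sim = 60
summary = []  # one (average, median, 95th percentile) tuple per site
for site_mean in tds:
    dist = NormalDist(mu=site_mean, sigma=sd_all)
    # Inverse-CDF sampling; inv_cdf requires 0 < u < 1, and random.random()
    # returns exactly 0.0 only with negligible probability.
    draws = [dist.inv_cdf(random.random()) for _ in range(n_sim)]
    summary.append((mean(draws), median(draws), percentile(draws, 95)))

for site_mean, (avg, med, p95) in zip(tds, summary):
    print(f"site TDS {site_mean:7.1f}: mean {avg:7.1f}, median {med:7.1f}, p95 {p95:7.1f}")
```

Each tuple in `summary` would then be joined with the site's XY coordinates and exported for interpolation in Surfer.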
