Math IA

IB Internal Assessment
Math AA SL
Creating an equation to predict the relationship between the death rate and PM2.5 values of a
country
Introduction and Rationale
The applications of mathematics in real life have always fascinated me. I have always
been intrigued by how one can apply the mathematics learnt in the classroom to real-life contexts
and use it to resolve real-life problems. I became highly enthusiastic when I stumbled upon
something that could elucidate the elusive nature of mathematics and that was highly
personal to each one of us during the pandemic.
According to recent scientific and mathematical research, it is hypothesized that the Covid-19
death rate and the PM2.5 level of a country have a strong positive correlation. PM2.5 refers to
atmospheric particulate matter that has a diameter of less than 2.5 micrometers. Particles in this
category are so small that they can only be seen with a microscope. Owing to their minute size,
particles smaller than 2.5 micrometers are able to bypass the nose and throat and penetrate deep
into the lungs, and some may even enter the circulatory system. As I continued to research, I found that
since Covid-19 also has airborne transmission, particulate matter (PM2.5) could act as a carrier
and spread the infection at a higher rate. Particulate matter could also damage the
lung cells and increase inflammation, which could further affect the severity of Covid-19
in an individual. Upon reading this, I was captivated and wanted to learn more about this
relationship.
The situation of the world during the pandemic is critical, and any information that can
potentially save lives is valuable. I believe that modelling an equation that can make real-world
predictions relating pollution levels and death rates could have many benefits. Creating this
model with context and real-life considerations could help us decide on the best way to
predict the death rate in a country.
Aim and Method
As established in the introduction, the aim of this exploration is to determine the relationship
between pollution levels and death rates and to create an equation that can make predictions
in the real world. This model will relate the PM2.5 level of a country to the death rate of that
country for the year 2020.
Because I wanted this model to determine a relationship between the PM2.5 levels of a country and
the death rate of a country, there was no way to collect primary data and I had to use secondary
data. To make sure my data was authentic, I used data only from official government
websites. I decided to collect the PM2.5 levels and the death figures of 30 different countries
for the year 2020.
I decided to approach the problem by checking whether the relationship between the PM2.5 levels and the
death rate of a country for the year 2020 could be described by an equation. The data used in this analysis
consists of daily deaths due to COVID-19 for 30 countries and their respective provinces. It
covers one year, from January 2020 to December 2020, and was obtained from
the WHO website. The data for 2020 was analyzed because it was the most accessible
at the time of writing. The PM2.5 level of each country was extracted from IQAir. A
large number of deaths were not reported during the pandemic, which could affect the reliability of
the data.
To achieve my aim, it is extremely important to collect credible data. As
mentioned before, the data was collected for the year 2020 from the WHO website and
IQAir. I then decided to graph my data on a scatter plot to help me understand whether the
association between the variables is mostly linear or non-linear. Next, I would calculate the
value of the Pearson's Product Moment Correlation Coefficient (r) between the independent
variable and the dependent variable. The value of r helps determine the strength of the linear
correlation between the two variables. Then, I would calculate the Spearman's
Rank Correlation Coefficient ($r_s$), which measures the statistical dependence between the
rankings of the two variables. It assesses how well the relationship between the two variables can
be described using a monotonic function. If the variables have a monotonic but non-linear relationship, the
Spearman's Rank Correlation Coefficient would be expected to be higher than the Pearson's
Product Moment Correlation Coefficient.
According to my aim, I want to create an equation that describes the relationship between the total
deaths due to COVID-19 and the PM2.5 values of a country, so it is not enough to calculate only
the correlation coefficients. To achieve my aim, I will deduce the best-fit model for my data, from
which predictions about the relationship between the total deaths due to COVID-19 and the
PM2.5 values can be made. This can help us mathematically understand the implications of
the PM2.5 levels and how the total deaths due to COVID-19 and the PM2.5 levels are related.
I will use the Coefficient of Determination (R²) to determine the accuracy and predictive power of the
model. Since R² is the ratio of the variation explained by the model to the total
variation, we can interpret it as a percentage, which will help us determine the accuracy of the
model.
Data Collection
Country                   Total deaths from COVID-19     PM2.5 value (µg/m³)
                          (as of the end of 2020)
India                     148738                         58.80
Nepal                     2690                           39.20
Pakistan                  10047                          59.00
Indonesia                 21944                          40.70
Myanmar                   2664                           29.40
Afghanistan               2189                           46.50
Oman                      1497                           44.40
United Arab Emirates      665                            29.20
Bangladesh                7531                           77.10
Bosnia and Herzegovina    4050                           40.60
China                     4634                           34.70
North Macedonia           2488                           30.60
Thailand                  61                             21.40
Sri Lanka                 199                            22.40
Madagascar                261                            20.00
South Korea               879                            19.50
Malaysia                  471                            15.60
Taiwan                    8                              15.00
Mali                      269                            37.90
Andorra                   85                             7.40
Australia                 908                            7.60
New Zealand               30                             7.00
Norway                    433                            5.70
Estonia                   266                            5.90
Myanmar                   2637                           29.40
Armenia                   2807                           24.90
Serbia                    3163                           24.30
Kazakhstan                1783                           21.90
Georgia                   2313                           20.40
Croatia                   3920                           21.20
Table 1: Data collection
Based on the data collected, I decided to plot a scatter plot which will help observe the relationship
between the variables.
Line of best fit: y = 0.0003x + 26.607, R² = 0.167
Graph 1: the pollution rate is plotted on the y-axis with µg/m³ as the unit and total deaths on the x-axis
On plotting the data points, we can observe that the correlation between the variables appears to be moderate.
The data points are not equally distributed about the line of best fit,
especially in the left-hand portion of the graph. To measure the strength of association between
the two variables, I will calculate the Pearson's Product Moment Correlation Coefficient (r).
The value of r for this graph is calculated below:
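Table 1: Calculation of r value
$$M_x = 7654.333, \qquad \sum x = 229630$$
$$SS_x = \sum (x - M_x)^2 = 21121402166.66$$
$$M_y = 28.59, \qquad \sum y = 857.7$$
$$SS_y = \sum (y - M_y)^2 = 8487.387$$
$$r = \frac{\sum (x - M_x)(y - M_y)}{\sqrt{SS_x \, SS_y}} \qquad (1)$$
$$r = \frac{5471195.5}{\sqrt{(21121402166.66)(8487.387)}} = 0.409$$
where $M_x$ is the mean of the x values and $M_y$ is the mean of the y values. Squaring this value gives $r^2 = 0.167$, which agrees with the R² value shown on Graph 1.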
The value of r always lies in the range -1 ≤ r ≤ +1. If r ≥ 0.50, the relationship is considered a strong positive
one. In this case r < 0.50, so the linear relationship between the two variables is weak. Since a
weak linear correlation is observed, we can check whether the relationship is non-linear. To do so,
I will graph a curve of best fit (a line with non-zero curvature) and see whether it
passes closer to more of the data points.
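As a cross-check of equation (1), the calculation can be sketched in a few lines of Python. This is only an illustrative sketch: the names `DATA` and `pearson_r` are my own, and the values are copied from Table 1. Squaring the result should reproduce the R² value shown with Graph 1.

```python
import math

# (total COVID-19 deaths by the end of 2020, PM2.5 in µg/m³), copied from Table 1
DATA = [
    (148738, 58.80), (2690, 39.20), (10047, 59.00), (21944, 40.70),
    (2664, 29.40), (2189, 46.50), (1497, 44.40), (665, 29.20),
    (7531, 77.10), (4050, 40.60), (4634, 34.70), (2488, 30.60),
    (61, 21.40), (199, 22.40), (261, 20.00), (879, 19.50),
    (471, 15.60), (8, 15.00), (269, 37.90), (85, 7.40),
    (908, 7.60), (30, 7.00), (433, 5.70), (266, 5.90),
    (2637, 29.40), (2807, 24.90), (3163, 24.30), (1783, 21.90),
    (2313, 20.40), (3920, 21.20),
]


def pearson_r(pairs):
    """Pearson's r following equation (1): S_xy / sqrt(SS_x * SS_y)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    ss_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    ss_y = sum((y - mean_y) ** 2 for _, y in pairs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return s_xy / math.sqrt(ss_x * ss_y)


r = pearson_r(DATA)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```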
Graph 2: the pollution rate is plotted on the y-axis with µg/m³ as the unit and total deaths on the x-axis
This curve is able to account for more data points, and the variables appear to be related monotonically, which means that as
one variable increases the other variable increases as well. The Spearman's Rank Coefficient lies
in the range -1 ≤ $r_s$ ≤ +1, and its sign indicates the direction of association. A positive
coefficient shows that as one variable increases, so does the other, while a negative coefficient
shows that as one variable increases the other decreases. We can calculate the Spearman's Rank
Coefficient to determine the strength of the monotonic association between the two variables I have selected.
Table 2: Spearman's Rank Coefficient Calculation
To calculate the correlation:
$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \qquad (2)$$
(where n = sample size and d = difference in ranks)
$$r_s = 1 - \frac{6 \times 1259}{30(30^2 - 1)}$$
$$r_s = 0.72$$
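Equation (2) can be sketched in Python as follows. The function names are my own, and I have assumed that tied values are given the average of the ranks they share. Equation (2) is exact only when there are no ties, so with tied values (such as the repeated PM2.5 value of 29.40 in Table 1) it should be read as an approximation.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_from_sum_d2(sum_d2, n):
    """Equation (2): r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def spearman_rs(xs, ys):
    """Rank both variables, then apply equation (2) to the rank differences."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return spearman_from_sum_d2(sum_d2, len(xs))


# Substituting the values used in the manual calculation
print(round(spearman_from_sum_d2(1259, 30), 3))
```

Calling `spearman_rs(deaths, pm25)` on the two columns of Table 1 would repeat the whole calculation, including the ranking step.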
I used technology to verify this result: the value I obtained in RStudio is 0.720, which agrees with
my manual calculation. After reflecting on Graph 1 and Graph 2, we can also see that
the value of r obtained was much lower than the value of $r_s$, which suggests a monotonic but non-linear relationship
between the variables. Any coefficient $r_s$ ≥ +0.70 is considered a strong positive correlation.
Since my $r_s$ = 0.72, we can conclude that there is a strong positive correlation between
my variables; however, the two variables do not increase in the same proportion, as the relationship is non-linear.
Reflecting back on my main aim, I wanted to formulate an equation to relate the total deaths and
PM2.5 levels of countries. To achieve this, I decided I needed to linearize my data. Most
relationships that are not linear can be re-expressed so that their graph is a straight line. Linearization does
not change the fundamental relationship or what it represents, but it does change the way the graph
looks. I decided to linearize my data to make the analysis easier and to compute an
equation. One method of linearizing data uses logarithms: we re-express the
data points by taking the logarithm of one or both variables. Logarithms can be used to
linearize data in three forms:
• log(x) is plotted against y
• log(y) is plotted against x
• log(x) is plotted against log(y)
According to my aim, I wanted to formulate an equation that would relate the two variables. To
achieve this, I will conduct a regression analysis. The first step is curve
fitting, a process of specifying the model that best fits the specific dataset. Plotting these
three graphs (log x vs y, log y vs x and log x vs log y) and determining a model for each of them can
help me understand which model makes the most accurate predictions. The
R² value of each model can also help me determine which will yield the best result.
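The comparison described above can be sketched in Python. This is a minimal sketch under my own assumptions: logarithms are taken to base 10, the names `DATA`, `r_squared` and `fits` are hypothetical, and each R² is that of the least-squares line in the transformed coordinates (for a straight-line fit this equals the square of Pearson's r). The printed values can then be compared with the R² value reported for each graph.

```python
import math

# (total COVID-19 deaths by the end of 2020, PM2.5 in µg/m³), copied from Table 1
DATA = [
    (148738, 58.80), (2690, 39.20), (10047, 59.00), (21944, 40.70),
    (2664, 29.40), (2189, 46.50), (1497, 44.40), (665, 29.20),
    (7531, 77.10), (4050, 40.60), (4634, 34.70), (2488, 30.60),
    (61, 21.40), (199, 22.40), (261, 20.00), (879, 19.50),
    (471, 15.60), (8, 15.00), (269, 37.90), (85, 7.40),
    (908, 7.60), (30, 7.00), (433, 5.70), (266, 5.90),
    (2637, 29.40), (2807, 24.90), (3163, 24.30), (1783, 21.90),
    (2313, 20.40), (3920, 21.20),
]


def r_squared(xs, ys):
    """R² of the least-squares straight line through the points (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy ** 2 / (ss_x * ss_y)


deaths = [d for d, _ in DATA]
pm25 = [p for _, p in DATA]
log_deaths = [math.log10(d) for d in deaths]
log_pm25 = [math.log10(p) for p in pm25]

fits = {
    "linear (x vs y)": r_squared(deaths, pm25),
    "logarithmic (log x vs y)": r_squared(log_deaths, pm25),
    "exponential (x vs log y)": r_squared(deaths, log_pm25),
    "power (log x vs log y)": r_squared(log_deaths, log_pm25),
}
for name, value in fits.items():
    print(f"{name}: R^2 = {value:.3f}")
```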
Power model (log(x) vs log(y))
I first re-expressed the data in terms of log(x) vs log(y) to see whether this would linearize it. I
plotted log(total deaths) on the x-axis as the independent variable and
log(pollution rate) on the y-axis as the dependent variable.
Line of best fit: y = 0.2278x + 0.6729, R² = 0.434
Graph 3: log(total deaths) is plotted on the x-axis and log(pollution rate) on the y-axis.
Upon visualising the data, it is clear that the log-log model does not do the best job of linearizing
the data. Even without looking at the R² value, it is clear that the data points do not follow the
line of best fit closely. We must then use the properties of logarithms and exponents to find the non-linear
model.
Linearization method    Linear model                             Non-linear model
log(y) vs log(x)        log(y) = m log(x) + c                    y = 10^c x^m
                        (m is the slope, c the y-intercept)
The equation of the line obtained is:
y = 0.2278x + 0.6729
(where the plotted y is log(PM2.5 level) and the plotted x is log(total deaths))
Since this equation is fitted to the re-expressed data, it can be rewritten in terms of the original
variables, with x the total deaths and y the PM2.5 level, as
$$\log(y) = 0.2278 \log(x) + 0.6729$$
The next step is to find the corresponding non-linear model. The equation above
would be ideal if we wanted to predict log(y) and not y, which is why it
is important to transform this equation into a power model. To predict y, we need to
remove the logarithm by raising 10 to the power of both sides.
$$10^{\log(y)} = 10^{0.2278 \log(x) + 0.6729}$$
Since $10^{\log(y)} = y$, we can rewrite the equation as
$$y = 10^{0.2278 \log(x) + 0.6729}$$
Using the properties $10^{a+b} = 10^{a} \times 10^{b}$ and $10^{ab} = (10^{a})^{b}$,
$$y = (10^{\log(x)})^{0.2278} \times 10^{0.6729}$$
Since $10^{\log(x)} = x$, we can rewrite the equation as
$$y = x^{0.2278} \times 10^{0.6729}$$
The power model involves taking the logarithm of both the dependent and the independent
variable. It has an R² value of 0.434, which is low and tells us that the power model is a weak
model. The model can be interpreted by saying
that a 1% increase in the value of x results in an increase in the value of y of approximately 0.2278%, which is
a very small effect. I decided to continue with the other methods of linearization to find the best-fit model
to help create the equation.
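The back-transformation of the power model can be checked numerically. The sketch below is only illustrative: the function names are my own, logarithms are assumed to be base 10, and the slope and intercept are the rounded values from the line of best fit. It confirms that the logarithmic form and the power form give the same prediction, and it evaluates the effect of a 1% increase in x.

```python
import math

M, C = 0.2278, 0.6729  # slope and intercept of the log-log line of best fit


def log_form(x):
    """Prediction from log(y) = M * log(x) + C, undone with 10 ** (...)."""
    return 10 ** (M * math.log10(x) + C)


def power_form(x):
    """Prediction from the equivalent power model y = 10^C * x^M."""
    return 10 ** C * x ** M


x = 1783  # total deaths for one of the countries in Table 1
print(log_form(x), power_form(x))

# Proportional change in y when x increases by 1%
ratio = power_form(1.01 * x) / power_form(x)
print(f"{(ratio - 1) * 100:.4f}% increase in y")
```

The percentage printed is close to, but not exactly, the slope M, because reading M as a percentage change is itself an approximation for small changes in x.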
Logarithmic model (log(x) vs y)
Next, I created a log-lin graph. This model is of the form y = m log(x) + c (where m is the
slope and c is the y-intercept).
For this graph I re-expressed the data in terms of log(x) vs y. I plotted
log(total deaths) on the x-axis as the independent variable and the pollution rate on the y-axis
as the dependent variable.
Line of best fit: y = 13.30x - 12.08, R² = 0.457
Graph 4: log(total deaths) is plotted on the x-axis and PM2.5 levels on the y-axis.
Without doing any mathematical analysis, it can be seen by eye that the data points are not equally
distributed about the line of best fit. They deviate greatly from it, particularly in the center portion,
where the data points show values much higher and lower than the line. The R² value
of 0.457 is not high, although it is slightly higher than that of the previous
model. Since the data points deviate from the line of best fit, we can reasonably assume that there
is a better-fitting model for the equation. This model can be interpreted by saying that when x increases
by a factor of 10, y increases by 13.30 units.

Linearization method    Linear model    Non-linear model
y vs log(x)             y = mx + c      y = m log(x) + c
The equation of the line obtained is
y = 13.30x - 12.08
This equation must then be rewritten to fit the re-expressed data. Since the plotted x is
log(total deaths), the equation can be re-expressed in terms of the original variable x as
$$y = 13.30 \log(x) - 12.08$$
This fits the logarithmic model y = m log(x) + c, where m is the slope and c
is the y-intercept.
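The interpretation of the slope in the logarithmic model can be illustrated with a short Python sketch. The function name is my own and logarithms are assumed to be base 10. Multiplying x by 10 adds exactly one unit to log(x), so the prediction rises by the slope, 13.30.

```python
import math

M, C = 13.30, -12.08  # slope and intercept of the log-lin line of best fit


def predict_pm25(total_deaths):
    """Logarithmic model y = M * log10(x) + C."""
    return M * math.log10(total_deaths) + C


low = predict_pm25(1_000)
high = predict_pm25(10_000)
print(f"x = 1000:  y = {low:.2f}")
print(f"x = 10000: y = {high:.2f}")
print(f"difference = {high - low:.2f}")
```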
Exponential model (x vs log(y))
Lastly, I created the lin-log graph to see whether it would give me better results. For this graph I
re-expressed the data in terms of log(y) vs x. I plotted the total deaths
on the x-axis as the independent variable and log(pollution rate) on the y-axis as the
dependent variable.
Line of best fit: y = 3.60 × 10⁻⁶x + 1.34, R² = 0.96
Graph 5: total deaths are plotted on the x-axis and log(pollution rate) on the y-axis.
From observing the plotted graph, a few observations can be made. It
appears that this model does the best job of linearizing the data. The line of best fit does not pass
through all the data points, but the data points seem to be much better distributed
about the line of best fit. However, it is not enough to consider the visual fit of the model;
finding the R² value will help us determine whether the exponential model is the best model. R² is the
coefficient of determination. Since it is the ratio of the variation explained by the model to the total variation, we can
interpret it as a percentage. If the obtained R² value is 0.90, we can say that 90% of the variation is
explained by the regression line. I decided to calculate the R² value for this particular model, as it
seems to be the best model. A high R² value means that the predictive power of the model is
high and that it describes the relationship between x and y well.
I have manually calculated the value of R² for this graph in the table
below:
Table 3: Calculation of the R² value
$$R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \qquad (3)$$
$$R^2 = 1 - \frac{2.351}{58.88} = 0.960$$
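Equation (3) can be sketched as a small Python function. The function name and the toy data are my own and are not taken from the data set. The sketch makes explicit that the denominator is the sum of squared deviations of the observed values from their mean. The final line repeats the substitution made in the manual calculation.

```python
def r_squared(observed, predicted):
    """Equation (3): 1 - SS_res / SS_tot, with SS_tot measured about the mean."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Toy example (hypothetical values, for illustration only)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
toy_r2 = r_squared(observed, predicted)
print(f"toy R^2 = {toy_r2:.3f}")

# Substituting the two sums used in the manual calculation
manual_r2 = 1 - 2.351 / 58.88
print(f"manual R^2 = {manual_r2:.3f}")
```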
In this case the R² value is 0.96, so we can say that 96% of the variation is explained by the regression line.
Such a high R² value indicates a strong relationship and suggests that the model is able to predict
with accuracy. The model can be interpreted by saying that as x increases by 1
unit, y is multiplied by a factor of $10^{3.60 \times 10^{-6}} \approx 1.000008$, so each additional death
corresponds to only a very small proportional increase in y.
The main aim of my exploration was to determine an equation that describes the relationship
between the total deaths and the PM2.5 values. To do this, we have to
convert the linear model into a non-linear model.

Linearization method    Linear model        Non-linear model
x vs log(y)             log(y) = mx + c     y = 10^c (10^m)^x
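The equation of the line obtained is
y = 3.60 × 10⁻⁶x + 1.34
Since the plotted y is log(PM2.5 level), the equation can be rewritten in terms of the original variable y as
$$\log(y) = 3.60 \times 10^{-6} x + 1.34$$
Raising 10 to the power of both sides,
$$10^{\log(y)} = 10^{3.60 \times 10^{-6} x + 1.34}$$
Since $10^{\log(y)} = y$, we can rewrite it as
$$y = (10^{3.60 \times 10^{-6}})^{x} \times 10^{1.34}$$
This fits the exponential model $y = (10^{m})^{x} \times 10^{c}$.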
So far, we have demonstrated that there is a strong non-linear association between the total deaths
and the PM2.5 values, with a Spearman's rank correlation coefficient of 0.72. After
linearizing the data and creating a model, we obtained a model with an R² value of 0.96.
The Coefficient of Determination, or R² value, assesses the overall effectiveness of the model.
This means that the model explains 96% of the variation in the data, and this high value suggests that the model has high predictive
power.
Comparisons
Model                                     Line of best fit                                         R² value (rounded to 3 d.p.)
Total deaths vs log(PM2.5 values)         The line of best fit mostly passes through the points    0.960
log(Total deaths) vs log(PM2.5 values)    It mostly does not fit the points                        0.434
log(Total deaths) vs PM2.5 values         It mostly does not fit the points                        0.457
Table 4: Comparisons of models
From the above table, it is clear that the plot of total deaths vs log(PM2.5 values) does the best job
of linearizing the data compared to the other graphs. It has the highest R² value, which suggests that
the model fits the data well.
Verification of equation
To further evaluate and test the predictive strength of the equation formulated, I decided to verify
the equation and conduct error analysis. I had stated in my aim that I wanted my equation to make
real-world predictions, verifying the equation using our collected data will help us achieve this.
Observed data from the data collected:
Country       x (observed)    y (observed)
Kazakhstan    1783            21.90
Thailand      61              21.40
Table 5: Observed data
Calculated data using the equation:
$$y = (10^{3.60 \times 10^{-6}})^{x} \times 10^{1.34}$$
Country       Calculation                              y (calculated)
Kazakhstan    (10^(3.60 × 10⁻⁶))^1783 × 10^1.34        22.204
Thailand      (10^(3.60 × 10⁻⁶))^61 × 10^1.34          21.888
Table 6: Calculated data
Error analysis
$$\text{Percentage error} = \frac{|\text{measured value} - \text{real value}|}{\text{real value}} \times 100$$
Country       Calculation                         % error
Kazakhstan    |22.204 - 21.90| / 21.90 × 100      1.388%
Thailand      |21.888 - 21.40| / 21.40 × 100      2.280%
Table 7: Error analysis
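The verification and error analysis can be reproduced with a short Python sketch. The function names and the `results` dictionary are my own, and the slope and intercept are the rounded values from the exponential model, so small differences in the last decimal place can arise from rounding.

```python
M, C = 3.60e-6, 1.34  # slope and intercept of the lin-log line of best fit


def predict_pm25(total_deaths):
    """Exponential model y = (10^M)^x * 10^C."""
    return (10 ** M) ** total_deaths * 10 ** C


def percentage_error(calculated, observed):
    """|calculated - observed| / observed * 100, with the observed value as the real value."""
    return abs(calculated - observed) / observed * 100


# country: (total deaths, observed PM2.5 in µg/m³), taken from Table 1
observed_data = {"Kazakhstan": (1783, 21.90), "Thailand": (61, 21.40)}

results = {}
for country, (deaths, pm25) in observed_data.items():
    calculated = predict_pm25(deaths)
    results[country] = (calculated, percentage_error(calculated, pm25))
    print(f"{country}: y = {calculated:.3f}, error = {results[country][1]:.3f}%")
```

Note that the percentage error divides by the observed value, not by the calculated value.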
Both percentage errors are relatively small, and the calculated values are very
close to the observed values. This helps us conclude that the exponential model is
the best model and does the best job of describing the data. Reflecting back on the aim, I have
determined the best model to establish the relationship between the total deaths and
PM2.5 values and to predict values. This model can therefore be used, and the variables
extrapolated, to make predictions about the total deaths in the future. Since the R² value obtained was
extremely high and the model did a good job of predicting the data, the null hypothesis can be rejected and it
can be concluded that as the PM2.5 values of a country increase, the total deaths due to COVID-19
in that country also increase.
Evaluation
I tried to check the accuracy of the data I collected by comparing many different websites and
did not see any discrepancies. However, since the reported deaths were lower than the actual
deaths in all countries, we cannot determine whether my model will be able to predict accurately. Hence, it
might be challenging to determine the validity of the model I produced.
I mainly focused on two variables: total deaths and PM2.5 values. A potential
extension of this investigation could be to analyze more variables, such as recovered cases and
population, and to formulate an equation that takes many factors into consideration. I could also
look at other air pollutants such as CO₂, SO₂, NO₂ and O₃ and formulate an equation taking
these into consideration.
In general, despite the limitations, I have achieved the main aim of the exploration and used
accurate data from government websites throughout. I have also rigorously carried out the
linearization of the data and found the best model to predict the relationship between the
variables. All my manual calculations have been verified using technology and were mostly precise.
Bibliography
Ali, Nurshad, and Farjana Islam. “Infection and Mortality—A Review on Recent Evidence.” Frontiers in
Public Health, 2020, https://doi.org/10.3389/fpubh.2020.580057. Accessed October 2020.
Bashir, Muhammad Farhan, et al. “Correlation between environmental pollution indicators and
COVID-19 pandemic: A brief study in Californian context.” Environmental Research, vol. 187,
2020, 109652, doi:10.1016/j.envres.2020.109652. Accessed January 2021.
Travaglio, Marco, Yizhou Yu, et al. “Links between air pollution and COVID-19 in England.” Volume
268, Part A, ScienceDirect, 2021, https://doi.org/10.1016/j.envpol.2020.115859. Accessed October
2021.
Muhammad, Sulaman, et al. “COVID-19 pandemic and environmental pollution: A blessing in
disguise?” The Science of the Total Environment, vol. 728, 2020, 138820,
doi:10.1016/j.scitotenv.2020.138820. Accessed March 2021.
WHO Coronavirus (COVID-19) Dashboard, https://covid19.who.int/. Accessed April 2021.
IQAir Dashboard, https://www.iqair.com/in-en/world-most-polluted-countries. Accessed April 2021.