Finding The Most Efficient Square Root Algorithm


Downloaded from www.clastify.com by Hania Elgawsaky

MATHS COURSEWORK

Finding the most efficient square root algorithm

By Martim Cardeira


Table of Contents

Plan of investigation
Obtaining data
Algorithm 1: Babylonian method
Algorithm 2: Bakhshali method
Algorithm 3: Exponential identity
Raw and processed results
    Algorithm 1: Babylonian method
    Algorithm 2: Bakhshali method
    Algorithm 3: Exponential identity
    All graphs superimposed
Conclusion
Reflection
Works cited

Plan of investigation

I have chosen this topic as it relates to my interests in physics and computer science. Physics simulation software such as Simscale, MATLAB and BeamNG.Drive (which is more of a game, but still has physics that emulates real life) all operate on a type of computer software known as a "physics engine". These physics engines apply real-life physics to a virtual world and objects of your choice. Thus, the mathematical (physics) formulas and arithmetic used in real life must also be used to calculate the movement and state of your virtual world / objects.

This is where square roots (and algorithms for finding them) are essential; one very basic but crucial example of the necessity of finding square roots in physics is vector addition. In vector addition, we represent the forces acting on an object using arrows which signify magnitude and direction. We "add" those arrows to form a triangle with a missing side, and then use trigonometric tools such as the Pythagorean and cosine formulae (both of which require us to find square roots in order to get an answer) to calculate not only the magnitude (length) but also the angle of the missing side, which is the resultant / net force experienced by the object. Of course, computers have no need to represent these forces as arrows and triangles, especially when there are lots of objects and forces and no room for extra data in memory, but the arithmetic is the same, and thus an algorithm is needed to calculate the square root before the computer can calculate a resultant force.

This is only one example out of many where we may require the square root of a number in a physics engine. As with any piece of software, it is imperative that we optimise the code to be as efficient as possible. In essence, we want our square root algorithm to calculate our answer in the least amount of time possible, so that we can have more calculations per second, or so that our list of calculations can be completed more quickly.

Obtaining data:

First of all, to find out which algorithm is the most efficient, I must test every single algorithm that I deem sensible enough to run (ones that, by visual inspection of the code or the actual maths, do not appear to take too many calculations / operations to be competitive). As a result, I must either code, or find existing code for, these algorithms in the programming language Java, then run each algorithm multiple times with a variety of different numerical inputs while simultaneously running a timer, to find out which algorithm takes the least time on average to make a calculation. I will be using the NetBeans IDE and compiler to achieve this, as it is the program I am most familiar with from my computer science lessons. To facilitate this, I will write a method for each algorithm which will iterate through all the number inputs and write the time taken to a CSV spreadsheet file. In the case of algorithms such as the Babylonian method, we may need to adapt certain variables, such as x₀, our initial guess, to our domain. As we know, the domain for any square-root finding algorithm is any real, positive number. For the sake of simplicity, I will restrict my domain to all integers from 1 to 1000 inclusive. My rationale for this is that the most efficient algorithm will be applied to a physics engine, and so the integers 1 to 1000 will cover vector additions from a range of 1 to 1000 newtons in magnitude. If we go over 1000 newtons, then we can switch to a larger scale, kilonewtons (kN), and use the domain 1 to 1000 again. Naturally, this will make our uncertainty much larger and produce more errors in our physics simulation (our absolute uncertainty is multiplied by 1000); however, our physics simulation is basic and not intended for exorbitantly large-scale simulations. As to why we're not using decimal numbers, I would like to keep my domain short so that it is well within the hardware capabilities of my laptop. As to the effect of using integers instead of decimal numbers (floats), there will be no practical difference, as I will denote my integer inputs as floats (decimal numbers) in my Java programs so that they are handled exactly like decimal numbers and carry out floating-point arithmetic. In short, my use of integers will not affect my data / trends in comparison to decimal inputs.
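A minimal Java sketch of the measurement loop just described might look as follows. The method and file names, the use of `DoubleUnaryOperator` to pass in an algorithm, and the `sink` accumulator are my own illustrative choices, not the coursework's actual code:

```java
import java.io.FileWriter;
import java.io.IOException;
import java.util.function.DoubleUnaryOperator;

public class SqrtTimer {
    /** Times one sqrt implementation for S = 1..1000, writing one CSV row per input. */
    static long[] timeAlgorithm(DoubleUnaryOperator sqrt, String csvPath) throws IOException {
        long[] nanos = new long[1001];
        double sink = 0;                                // keeps the JIT from discarding calls
        try (FileWriter csv = new FileWriter(csvPath)) {
            csv.write("S,nanoseconds\n");
            for (int s = 1; s <= 1000; s++) {
                long start = System.nanoTime();
                sink += sqrt.applyAsDouble((double) s); // integer input handled as a float
                nanos[s] = System.nanoTime() - start;
                csv.write(s + "," + nanos[s] + "\n");
            }
        }
        return nanos;
    }

    public static void main(String[] args) throws IOException {
        // Math.sqrt stands in here for whichever of the three methods is under test.
        timeAlgorithm(Math::sqrt, "baseline.csv");
    }
}
```

Each algorithm under test would be passed in place of `Math.sqrt`, and each full run repeated to produce the 10 repeats per method.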

Algorithm 1: Babylonian method

The Babylonian method is likely the first ever algorithm used for approximating √S. This, however, does not mean we should discount it. The earliest account of the Babylonian method in use was 1700 BC (Baez), on a clay tablet. From the artifacts that have been found, it is apparent that the Babylonians were commonly using this square root solving method for trigonometry, more specifically applying it to the Pythagorean identity in order to find the hypotenuse of a right-angled triangle. The most likely reason for this need for trigonometry is architecture and construction. What is interesting is that the Babylonians were commonly using integer inputs (in this case, triangle side lengths), which is exactly what we're doing in our experiment. This perhaps suggests that the Babylonian method may be well suited to integer inputs. On the other hand, the Babylonians did not have access to calculators, so it's likely they used integers solely to simplify the arithmetic, not because the method is suited to integers. Nevertheless, it is an interesting method to consider.


The Babylonian method is an iterative algorithm (at least if the first iteration does not achieve the desired accuracy, which in our case it never will). Here is how the method works. S is the number whose square root we want to find. Let's take x as a guess for √S. If our guess x is an overestimate, then S/x will be an underestimate, or vice versa. The average of these two numbers will provide a closer approximation to √S. We can represent this as:

√S ≈ (S/x + x) / 2.

Of course, this one step will not always produce a very accurate estimate. If we complete this step, we will have an error ε such that S = (x + ε)². If the error produced satisfies our set threshold, we can take the estimate for √S from this first step. If this is not the case (which it never will be in our case), we must begin the iterative process. By expanding S = (x + ε)² and solving for ε we get:

S = (x + ε)² = x² + 2εx + ε²,
S − x² = 2εx + ε²,
S − x² = ε(2x + ε),
ε = (S − x²) / (2x + ε) ≈ (S − x²) / (2x), since ε ≪ x.

With this in mind, we can come up with an improved, more accurate estimate:

x + ε ≈ x + (S − x²)/(2x) = (S + x²)/(2x) = (S/x + x)/2 = x_revised.

If the new value of ε still doesn't satisfy our threshold, we may repeat the step above over and over until it does, each time taking x_revised and plugging it back into the equation as x.
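This iteration can be sketched in Java as follows. The initial guess x₀ = S/2 and the error threshold passed in are illustrative assumptions, not necessarily the exact values used in my test programs:

```java
public class Babylonian {
    /** Approximates √S by repeatedly averaging x with S/x until |x² − S| ≤ tolerance. */
    static double sqrt(double s, double tolerance) {
        double x = s / 2.0;                       // initial guess x0 (an assumption)
        while (Math.abs(x * x - s) > tolerance) {
            x = (s / x + x) / 2.0;                // x_revised = (S/x + x) / 2
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(sqrt(2.0, 1e-12));     // ≈ 1.4142135623...
    }
}
```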


My rationale for this algorithm being efficient is the fact that it uses simple operations for a computer to perform (addition and squaring, which is multiplication), which can be considered "cheap" (or rather, quick) operations, with the exception of division, which takes twice as long as multiplication (which itself takes roughly 4 times as long as addition / subtraction) (Hindriksen). In addition to that, the Babylonian method is quadratically convergent, meaning the number of correct digits of the approximation roughly doubles with every single iteration ("Methods of computing square roots").

However, the greatest strength of this algorithm has to do with the fact that it is well suited to a base 2 number system. If we take the equation (S/x + x)/2 = x_revised, which is repeated every iteration, we'll notice that only two division operations are happening. Of these two division operations, one is simply dividing the numerator of the entire fraction by 2. So, what is the significance of this number 2? Well, computers use binary, which is a base 2 number system. This means every digit of a binary number can represent two values, as the digit can be either 0 or 1. A place-value table helps to illustrate the binary system ("Binary").

If we were to add an extra digit to the right of a binary number (with the value 0), we would double the value of every other digit and essentially double the number. So, in essence, shifting every digit one place to the left will double the number. If we shift every digit one place to the right instead, the opposite will happen: we will halve the number. Note that moving every digit to the right in this case causes the rightmost digit (1) to disappear, making the result half of the previous value, rounded down to the nearest integer. We don't need to worry about this, as the computer will handle the situation using floating-point arithmetic (this problem would only occur with integer arithmetic). All of this goes to show that multiplying by 2 and, more importantly in our case, dividing by 2 are two very quick operations that take the computer very little time, as they are just a matter of shifting all digits one place. These operations are so efficient that, in the case of (S/x + x)/2 = x_revised, we can treat the equation as having only one division operation in terms of processing time. With this perspective, the Babylonian algorithm has only a single, heavily processor-taxing operation per iteration. This sounds very promising in terms of algorithm efficiency.
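The digit-shifting idea can be demonstrated directly with Java's integer shift operators; this is purely an illustrative aside (for a normal double, dividing by 2 likewise keeps the same significand and only lowers the binary exponent by one):

```java
public class ShiftDemo {
    public static void main(String[] args) {
        int n = 0b1011;                  // the binary number 1011, i.e. 11 in decimal
        System.out.println(n << 1);      // 22: shifting every digit left doubles the value
        System.out.println(n >> 1);      // 5: shifting right halves it, rounded down,
                                         //    because the rightmost 1 is discarded
    }
}
```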

Algorithm 2: Bakhshali method

The Bakhshali method is an ancient Indian square root solving method from a time period between the 6th and 12th centuries (Bailey). Information regarding the Bakhshali method is very scarce, and little is known about the rationale behind, or application of, the method. The scholar G. R. Kaye believed the "mathematical content was derivative from Greek sources" (Bailey). Given this possibility, we may consider this method an improvement on, or successor to, previously existing methods.

The Bakhshali method is yet another iterative algorithm. Even though, on first inspection of the formula, the method may seem very taxing on the processor (by the standards that were set in terms of the 'cost' of operations), it should still be considered, as the Bakhshali method is quartically convergent (as opposed to the quadratically convergent Babylonian method), and each of its iterations is therefore equivalent to two iterations of the aforementioned Babylonian method given the same initial guess (Bailey). Yet again, we must make an initial guess; the closer it is to the actual square root, the more accurate the results of the first and each subsequent iteration will be. Let x₀ be our first guess for √S. We must then iterate as follows:

xₙ₊₁ = bₙ − aₙ² / (2bₙ) = (xₙ + aₙ) − aₙ² / (2(xₙ + aₙ)).

Definitions:

aₙ = (S − xₙ²) / (2xₙ),
bₙ = xₙ + aₙ.

(Think of xₙ₊₁ as x_revised from the Babylonian method.)


We can use this to make a rational approximation to the square root. So long as x₀² is close to S, the first iteration of the Bakhshali method can be written, with d = S − x₀², and simplified as follows ("Methods of computing square roots"):

√S ≈ x₀ + d/(2x₀) − d²/(8x₀³ + 4x₀d)
   = (8x₀⁴ + 8x₀²d + d²) / (8x₀³ + 4x₀d)
   = (x₀⁴ + 6x₀²S + S²) / (4x₀³ + 4x₀S)
   = (x₀²(x₀² + 6S) + S²) / (4x₀(x₀² + S)).

As previously touched upon, my main rationale for the Bakhshali method being competitive is that each of its iterations is worth two Babylonian iterations. Despite there being far more operations per iteration, the lower number of iterations needed for an accurate (per our error threshold) value of √S, as a result of its quartic convergence, may prove efficient later on.
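The iteration defined above can be sketched in Java as follows; the initial guess x₀ = S/2 and the tolerance passed in are again illustrative assumptions rather than the exact values from my test programs:

```java
public class Bakhshali {
    /** One Bakhshali step: x_{n+1} = b - a^2/(2b), with a = (S - x^2)/(2x), b = x + a. */
    static double step(double s, double x) {
        double a = (s - x * x) / (2.0 * x);
        double b = x + a;
        return b - (a * a) / (2.0 * b);
    }

    /** Iterates until |x² − S| ≤ tolerance. */
    static double sqrt(double s, double tolerance) {
        double x = s / 2.0;                       // initial guess x0 (an assumption)
        while (Math.abs(x * x - s) > tolerance) {
            x = step(s, x);
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(sqrt(289.0, 1e-12));   // ≈ 17
    }
}
```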
gm

Algorithm 3: Exponential identity

Pocket calculators commonly use exponential identities to calculate the square root of a number. This algorithm is particularly intriguing, as by a quick visual inspection it is quite hard to get a sense of whether or not it will be efficient, or even competitive, amongst the other methods. Considering the use of a natural logarithm, which is very taxing on the processor, the algorithm may initially seem inefficient. However, we must consider that the algorithm is non-iterative. Following the properties of logarithms, we can find the identity for √S:

√S = S^(1/2),
ln √S = ln S^(1/2),
ln √S = (1/2) ln S,
√S = e^(ln √S) = e^((1/2) ln S).
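The identity amounts to a single line of Java, using the StrictMath library's exp and log methods (how the timing harness wraps this call is an implementation detail I leave out here):

```java
public class ExponentialIdentity {
    /** Non-iterative square root: sqrt(S) = e^((1/2) ln S). */
    static double sqrt(double s) {
        return StrictMath.exp(0.5 * StrictMath.log(s));
    }

    public static void main(String[] args) {
        System.out.println(sqrt(2.0));   // ≈ 1.414..., accuracy limited by exp/log rounding
    }
}
```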

The inherent problem with this formula is that the efficiency of our identity is dependent on the efficiency of our logarithm and exponentiation methods. (You may refer to the source code of the Java maths library to find these respective methods.) The methods I will use come from the StrictMath Java library (Blake), a library that is very commonly and extensively used by Java programmers. Since this library is extensively used, I am assuming that its methods are the most efficient available.


My rationale for this method being efficient mostly comes down to the fact that it is non-iterative. Whilst exponentiation and logarithms are extremely taxing on the processor, they only have to be done once. Due to this, I feel that the exponential identity may not prove the most efficient with smaller inputs to begin with, as the other methods will require few iterations for smaller numbers. However, when it comes to larger inputs, I feel that the exponential identity method will excel in terms of efficiency, as the other methods will require a lot of iterations to reach an accurate result. My second reason for thinking this algorithm is efficient is simply contextual: this algorithm is already being used in pocket calculators, devices which operate in binary (base 2). This suggests that not only is the method efficient enough to be used commercially, it is also well suited to our application (computers, which run on binary).

Raw and processed results:

Using Java, I have iterated through each of the three methods for 1 ≤ S ≤ 1000 until reaching 12 digits of precision for each result (slightly less than what MATLAB uses, which is 16 digits of precision ("Increase Precision of Numeric Calculations - MATLAB & Simulink")), and to achieve a high degree of accuracy in my mean average I have done 10 repeats per method. Looking through my results, I found several anomalies that needed to be removed so as not to affect my trendlines, which are semi-automatically generated by Excel (the type of trendline (e.g., polynomial, linear, exponential, etc.) is selected by me; however, the actual trendline and its respective equation are generated automatically). As there were 3 × (1000 × 10) = 30000 results in total, I would have to use a process that would automatically eliminate these anomalies from my mean average, which is the column that I would be graphing for each method. To accomplish this, I used the TRIMMEAN Excel function, which systematically eliminates outliers from an array of values before taking the mean average. I used the function to eliminate the top 20% (2) and the bottom 20% (2) of the values in each array. This meant my mean average would be considerably less accurate, as it would be using an array of 6 results each time instead of 10; however, this was my only good option, as it would be extremely time-consuming to look through 30000 results. I couldn't just iterate each method more times for more results and then apply the TRIMMEAN function, as Excel on my laptop was already lagging (almost beyond reasonable usability) with my current set of results. Finally, I did not record the processing time for any of the methods where S = 1, as a Java error would include the time taken to compile the program (which is not part of the algorithm).
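TRIMMEAN's behaviour here (40% total trimming on 10 repeats, i.e. the 2 largest and 2 smallest values discarded) can be mimicked in Java roughly as follows; this is an illustrative sketch of the processing step, not the spreadsheet itself:

```java
import java.util.Arrays;

public class TrimmedMean {
    /** Mean of the values after discarding `trim` values from each end (cf. Excel TRIMMEAN). */
    static double trimmedMean(long[] times, int trim) {
        long[] sorted = times.clone();
        Arrays.sort(sorted);
        double sum = 0;
        for (int i = trim; i < sorted.length - trim; i++) {
            sum += sorted[i];
        }
        return sum / (sorted.length - 2 * trim);     // e.g. 6 of the 10 repeats survive
    }

    public static void main(String[] args) {
        long[] repeats = {310, 305, 9000, 298, 301, 307, 299, 12, 303, 304};
        System.out.println(trimmedMean(repeats, 2)); // outliers 9000 and 12 are discarded
    }
}
```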

Algorithm 1: Babylonian method:

This is the graph of processing time in nanoseconds vs S for the Babylonian algorithm. I chose my trendline as a polynomial of order 3, as I could visually deduce two turning points. It seems as though the results are a lot more accurate when S < 350, as after that the deviation from the trendline becomes a lot greater. I have modelled the function for the processing time in nanoseconds given S for the domain { S ∈ ℤ | 0 < S ≤ 1000 }, visible in the top right of the graph. Some of the visible outliers on the graph were the points S = 401, 629, 742 and 864. I re-ran these inputs through my method several times (on the order of 20-30 times) and eventually found that they do comply with the trend of the other points; for some reason, Java does not always report the correct execution time for these inputs, which has resulted in these outliers.

The total time taken for all 1000 inputs with this method is 0.373 ms (3 s.f.).

Algorithm 2: Bakhshali method:

The results of the Bakhshali method were quite peculiar. Due to the shape of the graph, I decided to split the trendline in two, making it a piecewise function. The first part of the graph, where S ≤ 288, exhibits a quite clear exponential relationship between the processing time and S, as is evident from the exponential growth. However, and quite interestingly, when S > 288, the graph exhibits a logarithmic relationship between the time and S. I have modelled the first function for the processing time in nanoseconds given S for the domain { S ∈ ℤ | 0 < S ≤ 288 }, visible in the top right of the graph (the upper function). I have also modelled the second function for the domain { S ∈ ℤ | 288 < S ≤ 1000 } (the lower function). Immediately from visual inspection I can tell that the Bakhshali method will be highly inefficient (for the large majority of the domain 0 < S ≤ 1000) relative to the other two, as is evident from the scale of the y-axis (20 times larger than in the previous graph).

I found the changeover point, S = 289, to be particularly intriguing. Firstly, 289 is not a power of 2, therefore we can't attribute the change in trend to a change in the size of the memory address which stores S. One thing that is interesting is that 289 is a perfect square, being the square of 17. Again, 17 isn't a power of two, so we can't attribute the change in trend to any quirks of the base 2 number system. Perfect squares do have particular significance in our root finding algorithms, as they usually converge very quickly iteration-wise. One possible explanation for this changeover in behaviour may be a sudden decrease in the iterations needed to reach the desired accuracy. To test this, I altered my Bakhshali method code to instead count the number of iterations taken to give a result that falls within my error threshold. I calculated the number of iterations for values of S from 286 to 292 inclusive, which are in and around the changeover point. The results were as follows:

S      Number of iterations
286    32
287    32
288    33
289    1
290    2
291    2
292    3

As expected, the perfect square 289 resulted in very quick convergence. This is to be expected. What is not to be expected, however, is the shockingly low number of iterations required for the next few integer inputs, all of which are imperfect squares. I lack the knowledge to explain why this is the case; however, these results affirm that this change in trend is real and not just a program error.

This discovery led me to notice a pattern: the bottoms of the trails in this graph were all perfect squares (which is to be expected). The intriguing part is that the execution time for each following integer after a perfect square would increase linearly until there was another perfect square, at which point the execution time would again reset back to a near-zero value. The gradient of these linear patterns would increase with each trail prior to S = 289, leading to an overall exponential trend. After the changeover, the opposite would happen: the gradient of the linear patterns would decrease with each trail, leading to a logarithmic overall trend. Whilst I lack an explanation for this phenomenon, it is quite interesting. Still, the Bakhshali method appears quite inefficient at first inspection.

The total time taken for all 1000 inputs with this method is 9.22 ms (3 s.f.).

Algorithm 3: Exponential method:

Here is the graph of the exponential method. The general shape of the graph seems a lot closer to the expected results. I chose a polynomial of order 4 as my trendline, as I could see (though it is very hard to tell) 3 different turning points, one of which (below S = 200) was not meant to be there; however, I could not control how the trendline was generated. Judging visually by the scale of the y-axis, this method seems to be more or less on par with the Babylonian method (especially when you consider that the Babylonian graph had far greater, but very few, outliers which contribute to the increase in scale). I have modelled the function for the processing time in nanoseconds given S for the domain { S ∈ ℤ | 0 < S ≤ 1000 }, visible in the top right of the graph.

The outliers in this graph are exactly like the ones in algorithm 1, meaning that they do actually comply with the trend and are just the result of a Java error.

Perfect squares did not have an effect on execution time in this algorithm; this is to be expected, due to the non-iterative nature of the exponential identity being used and the lack of gradual convergence. There isn't much else to be said about this algorithm; I feel the results are too uncertain to deduce a solid trend. It does seem that this method loosely exhibits linear correlation; however, our set of data is too small to confirm this.

The total time taken for all 1000 inputs with this method is 0.180 ms (3 s.f.).

All graphs superimposed:

Finally, I used Desmos to superimpose the graphs from each image into one, to graphically check which method is the most efficient. I did this by typing the equation for each trendline into Desmos. To see which method is the most efficient, we have to check which line (or in this case, lines) is the lowest on the graph. We can see that the red line, the first part of the piecewise function for the Bakhshali method, is the most efficient from S = 0 to the point where it intersects the purple trendline of the Babylonian method, at S = 69.265. For S > 69.265, the exponential identity method becomes the most efficient one. The yellow trendline of the second part of the Bakhshali piecewise function intercepts the x-axis at the point 295.0. It is briefly the most efficient method until it intersects the trendline of the exponential identity at the point S = 298.46, where the exponential method becomes the most efficient yet again until the end of the domain (S ≤ 1000). It is important to note that we only take integer inputs for S, and the points of intersection of the trendlines give us non-integers. Hence, we should round every point of intersection down to the previous whole number, not the nearest. For example, there is an intersection at the point S = 298.46, meaning that the Bakhshali method is most efficient at S = 298 and the exponential identity is most efficient at S = 299. Following this, the points of method changeover are: S = 69, 295, 298.

Conclusion:

From the processed results, we can graphically deduce that the Bakhshali algorithm is most efficient for 0 < S < 70 and 295 ≤ S < 299. On the other hand, the exponential identity algorithm is most efficient for 70 ≤ S < 295 and for 299 ≤ S ≤ 1000. From these results, we should not consider the Babylonian algorithm for our physics engine, as it is never the most efficient method. We should aim to use the exponential identity and Bakhshali algorithms in the respective domains where they are the most efficient. This can be achieved by using if statements to check which of these domains an input S falls into, and then using the best method to calculate its square root (e.g., if S < 70 → Bakhshali(S)). If statements are relatively quick operations; however, given the small domain where the Bakhshali method is most efficient, the cost of one or more if statements on every single call would increase the processing time unnecessarily compared to just sticking to a single method. For this reason, I have chosen the exponential identity algorithm as the single algorithm to use for solving square roots, as its domain is the largest. To further support this choice, the exponential identity has a total execution time of 0.180 ms (3 s.f.) for all 1000 integer inputs. This is less than half of the total execution time for all 1000 integer inputs with the Babylonian method, 0.373 ms (3 s.f.). Compared to the Bakhshali method, however, the difference is even more drastic, with its total execution time of 9.22 ms (3 s.f.). By this metric of efficiency, the exponential identity comes out as the most efficient method by quite a margin (0.193 ms less than the Babylonian method).
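For completeness, the rejected if-statement dispatch would have looked roughly like this; `bakhshali` and `exponentialIdentity` are hypothetical stand-in names (here backed by library calls purely so the sketch runs), not my actual tested implementations:

```java
public class SqrtDispatch {
    /** Picks the empirically fastest method for each sub-domain of S (1..1000). */
    static double sqrt(double s) {
        // Branch boundaries taken from the trendline intersections.
        if (s < 70 || (s >= 295 && s < 299)) {
            return bakhshali(s);
        }
        return exponentialIdentity(s);   // fastest everywhere else in the domain
    }

    // Hypothetical stand-ins for the two tested implementations.
    static double bakhshali(double s) { return Math.sqrt(s); }
    static double exponentialIdentity(double s) {
        return StrictMath.exp(0.5 * StrictMath.log(s));
    }

    public static void main(String[] args) {
        System.out.println(sqrt(50.0));   // routed to the Bakhshali branch
    }
}
```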

Reflection:

Overall, this experiment was quite unsuccessful. To start with, some of the results for all methods would occasionally have huge uncertainties, such as ±200 ns, when the uncertainties should be 0. This is because each method will mathematically always take a set number of steps (operations per iteration × iterations) to reach an approximation of √S within our error threshold, meaning that, given the static clock rate of my processor, it should take the exact same time, every time, for a particular input. This is not the case in my results, as is evident from the uncertainties I obtained. This tells me that there are lots of 'hidden' variables that would need to be controlled in my experiment, a task that is nigh on impossible. Factors such as the caching of my program mid-execution are impossible to control, and Java itself has some oddities in execution which, yet again, I can't control.

If I were to answer this question again, I would test actual practical application. Testing each method in an actual physics engine, running lots of simulations and deducing which method has the least total processing time / lowest mean processing time would be the best approach. This unfortunately isn't possible for me, as I haven't designed a physics engine (it's quite complicated).

The second-best thing to do would be to use a larger domain for S, continuing into the millions. I found that my rather small set of data was insufficient to deduce a solid trend. This is also the reason that I have deliberately avoided extrapolation in my analysis, as a trend was very difficult to discern in all of the methods.

Unfortunately, my IA topic leaves me with very little theoretical information to go on. It is extremely difficult to theoretically calculate the execution time required for a square root method, as there are thousands of operations (which aren't always equal to each other in terms of CPU time) to keep track of. This leaves me with nothing to compare my practical results against.
Works Cited:

Baez, John. "Babylon and the Square Root of 2 | Azimuth." Azimuth, 2 December 2011, https://johncarlosbaez.wordpress.com/2011/12/02/babylon-and-the-square-root-of-2/. Accessed 15 June 2022.

Bailey, David H. "Ancient Indian Square Roots: An Exercise in Forensic Paleo-Mathematics." David H Bailey, https://www.davidhbailey.com/dhbpapers/india-sqrt.pdf. Accessed 15 June 2022.

"Binary." japanistry.com, https://www.japanistry.com/binary/. Accessed 15 June 2022.

Blake, Eric. "Source for java.lang.StrictMath (GNU Classpath 0.95 Documentation)." developer.classpath.org, https://developer.classpath.org/doc/java/lang/StrictMath-source.html. Accessed 15 June 2022.

Hindriksen, Vincent. "How expensive is an operation on a CPU? - StreamHPC." StreamHPC, 16 July 2012, https://streamhpc.com/blog/2012-07-16/how-expensive-is-an-operation-on-a-cpu/. Accessed 15 June 2022.

"Increase Precision of Numeric Calculations - MATLAB & Simulink." MathWorks, https://www.mathworks.com/help/symbolic/increase-precision-of-numeric-calculations.html. Accessed 15 June 2022.

"Methods of computing square roots." Wikipedia, https://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method. Accessed 15 June 2022.
