(Report) PACN Lab Assignment - 2
Experiment 01
Aim: To compare three methods of measuring the execution time of a program titled ‘cpuhog’.
Description: Create a program ‘cpuhog’. In this experiment, the program repeatedly calculates the inverse tangent and
tangent of values derived from a base number (8) using Java. The number of iterations is varied so that the execution
time ranges from a few seconds to a minute. The three techniques used to measure the execution time are:
(i) Using a stopwatch on a mobile phone
(ii) Using the time command on the command line
(iii) Using the time() and getrusage() functions before and after the loop (in the Java program below,
System.currentTimeMillis() plays this role)
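The third technique can be sketched in Python as follows. This is a minimal illustration, not the program used in the experiment: the function names `cpu_hog` and `measure` are hypothetical, the workload only loosely mirrors cpuhog, and the `resource` module is available on Unix-like systems only.

```python
import math
import resource  # Unix-only; exposes getrusage()
import time


def cpu_hog(iterations: int) -> float:
    """Burn CPU with atan/tan calls; returning the sum keeps the work observable."""
    total = 0.0
    for i in range(iterations):
        total += math.atan(i) * math.tan(i)
    return total


def measure(iterations: int) -> dict:
    """Return wall-clock, user-CPU and system-CPU seconds for one run of cpu_hog."""
    wall_start = time.time()
    usage_start = resource.getrusage(resource.RUSAGE_SELF)

    cpu_hog(iterations)

    usage_end = resource.getrusage(resource.RUSAGE_SELF)
    wall_end = time.time()
    return {
        "wall": wall_end - wall_start,
        "user": usage_end.ru_utime - usage_start.ru_utime,
        "sys": usage_end.ru_stime - usage_start.ru_stime,
    }


if __name__ == "__main__":
    print(measure(200_000))
```

Taking the readings immediately before and after the loop excludes interpreter start-up, which is one reason an in-program measurement tends to be lower than a stopwatch or time command reading.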
Procedure: Measure the execution time using the three methods mentioned in the description.
➔ The following readings were observed using the stopwatch on a mobile phone:
No. of Iterations | Round 01 (seconds) | Round 02 (seconds) | Round 03 (seconds) | Average (seconds)
No. of Iterations | Round 01 (seconds) | Round 02 (seconds) | Round 03 (seconds) | Average Execution Time (seconds)
No. of Iterations | Round 01 (seconds) | Round 02 (seconds) | Round 03 (seconds) | Average Execution Time (seconds)
Program cpuhog:
public class cpuhog {
    public static void main(String[] args) {
        // Number of times the CPU-bound workload is repeated (first command-line argument).
        int iterations = Integer.parseInt(args[0]);
        // Take the timestamps immediately around the loop so that only the workload is timed.
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            double result = calculateTan(8);
        }
        long end = System.currentTimeMillis();
        System.out.println("Execution Time: " + ((end - start) / 1000f) + "s");
    }

    // Sums atan(x) * tan(x) for x = baseNumber^7 down to 0.
    public static double calculateTan(int baseNumber) {
        double result = 0;
        double baseNumberPowered = Math.pow(baseNumber, 7);
        while (baseNumberPowered >= 0) {
            result += Math.atan(baseNumberPowered) * Math.tan(baseNumberPowered);
            baseNumberPowered--;
        }
        return result;
    }
}
Observations :
1. The execution time measured with the currentTimeMillis() method is the lowest of the three methods, whereas
the execution time measured with the stopwatch is the highest in every round of experimentation.
2. The time command gives nearly uniform execution times across the rounds for a given number of iterations,
but the same is not true for the other two methods.
3. The time command reports the elapsed (real) time together with the user and system CPU time of the program.
The system time covers the time spent in system calls on the program's behalf, and the real time also includes
periods in which other programs were running.
Inferences:
1. Execution time: Stopwatch > time command ≈ currentTimeMillis()
2. Since there is always a margin of human error in measuring execution time with the stopwatch, its results
are not as accurate as those of the other two methods.
3. All the methods used for measuring execution time show similar behavior, i.e., the graph grows roughly
linearly, with the time command measurements being the most linear.
4. As the number of iterations increases, the execution time also increases in all the methods. So, we can conclude
that,
Execution time ∝ No. of iterations
Experiment 02
Factors given:
As per the given experiment, we have a total of four factors (k = 4), and each factor can have a minimum of four
levels (n = 4, assumed). For a full factorial design, the total number of rounds to be conducted is
$n^k = 4^4 = 256$ experiments
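The full factorial count can be checked by enumerating every combination of levels. The sketch below assumes four levels per factor, numbered 1 to 4 as placeholders because the actual level values are not listed here. The factor names follow Table 2.1, and `full_factorial` is a hypothetical helper.

```python
from itertools import product

# Factor names follow Table 2.1; level ids 1..4 are placeholders (assumption).
FACTORS = ["file_size", "concurrent_downloads", "speed_limit", "time_of_day"]
LEVELS_PER_FACTOR = 4


def full_factorial(factors, n_levels):
    """Return one dict per experiment, covering every combination of levels."""
    level_ids = range(1, n_levels + 1)
    return [dict(zip(factors, combo))
            for combo in product(level_ids, repeat=len(factors))]


runs = full_factorial(FACTORS, LEVELS_PER_FACTOR)
print(len(runs))  # 4**4 = 256
print(runs[0])
```

Each additional factor multiplies the run count by the number of levels, which is why the full design quickly becomes impractical.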
Since we are using fractional factorial design, the number of factors to be considered can be reduced by including only
the factors that affect the result the most. So, here we are reducing the number of factors to 2. So, the total number of
rounds to be conducted can be given as
Procedure: Measure throughput using the chosen factors and their respective levels.
Table 2.1 - Factors with their respective levels considered for Fractional Factorial Design
Exp No. | File Size | Number of Concurrent File Downloads | Download Speed Limit | Time of the Day | Round 1 (MB/s) | Round 2 (MB/s) | Round 3 (MB/s) | Average Throughput (MB/s)
Note:
The autologs of the three repetitions of the above experiment are available at the link below:
➔ https://drive.google.com/drive/folders/1qiDxJoZVpqNZzXJtHn0XKK9rPywLy9Go?usp=share_link
Using the above data, we plot the Network Throughput (in MB/s) vs. No. of Concurrent File Downloads graph for the
file sizes 100 KB, 1 MB, and 100 MB.
Note:
To plot the above graphs, the average of all values for each level has been taken as a single point.
Observations from the Line Charts:
● As the File Size increases, the network throughput decreases, i.e., the Network Throughput falls as the File
Size grows.
● The behavior regarding Time of the Day is somewhat similar to that of the File Size, i.e., as the day progresses
(from morning to evening), the network throughput decreases.
● The behavior of the remaining two factors is similar, i.e., the graph dips as the factor increases to its middle
level and then rises again.
Python Code: Below is the Python script that runs the wget command to download the given files concurrently using a
thread pool. After the wget command finishes, the method “run_wget_cmd” returns the wget log (wget writes it to
stderr), from which we extract the network throughput for each downloaded file.
import numpy as np
import re, os, subprocess, concurrent.futures

# Runs the wget command and returns its log (wget writes the log to stderr).
def run_wget_cmd(wget_cmd, verbose=False, *args, **kwargs):
    process = subprocess.Popen(
        wget_cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        shell=True
    )
    std_out, std_err = process.communicate()
    return std_err

# Driver excerpt: logFile, links, sizes, repetition, logFileName, noOfConcurrentDownloads
# and runConcurrentDownloads are defined in parts of the script that are not shown here.
print("#########################################################################################",
      file=logFile)
for i, link in enumerate(links):
    runConcurrentDownloads(repetition, sizes[i], link, logFileName, noOfConcurrentDownloads[i], i)
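The throughput extraction step is not shown in the listing above. One way it could look is sketched below. The helper names `parse_throughput_mb_per_s` and `run_concurrent` are hypothetical, the regular expression assumes wget's usual summary line of the form `(1.25 MB/s) - ‘file’ saved [...]`, and unit prefixes are treated as multiples of 1024. The `runner` argument stands in for `run_wget_cmd` so that the sketch can be tried without network access.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Matches the average rate in wget's summary line, e.g. "(1.25 MB/s)" (assumed format).
RATE_RE = re.compile(r"\((\d+(?:[.,]\d+)?)\s*([KMG]?)B/s\)")
# Conversion of the reported unit to MB/s, assuming 1024-based prefixes.
TO_MB = {"": 1 / 1024**2, "K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_throughput_mb_per_s(wget_log):
    """Return the last reported rate in MB/s, or None if the log has no rate."""
    matches = RATE_RE.findall(wget_log)
    if not matches:
        return None
    value, prefix = matches[-1]
    return float(value.replace(",", ".")) * TO_MB[prefix]


def run_concurrent(cmds, runner, max_workers):
    """Run each command through `runner` in a thread pool; return rates in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        logs = list(pool.map(runner, cmds))
    return [parse_throughput_mb_per_s(log) for log in logs]


# Illustrative log text, not taken from the experiment's autologs.
SAMPLE_LOG = "2023-03-01 10:15:42 (1.25 MB/s) - 'file.bin' saved [1048576/1048576]"

if __name__ == "__main__":
    fake_runner = lambda cmd: SAMPLE_LOG  # stand-in for run_wget_cmd
    print(run_concurrent(["wget <url>"] * 3, fake_runner, max_workers=3))
```

Passing the runner as a parameter keeps the parsing logic testable on canned logs before any real downloads are started.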
Analysis Methods: Three different methods have been employed for the performance analysis of the above experiment
namely,
1. Ranking Method
2. Range Method
3. Allocation of Variation
➔ Ranking Method: The ranking method is similar to the observation method except that the experiments are written in
the order of increasing or decreasing responses so that the experiment with the best response is first and the worst
response is the last. Now the factor columns are observed to find the levels that consistently produce good or bad
results.
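Exp No. | File Size | Number of Concurrent File Downloads | Download Speed Limit | Time of the Day | Average Throughput (MB/s)
3 | -1 | 1 | -1 | -1 | 1.71
1 | -1 | -1 | 1 | 0 | 1.21
2 | -1 | 0 | 0 | 1 | 0.72
6 | 0 | 1 | -1 | -1 | 0.65
7 | 1 | -1 | 1 | -1 | 0.59
4 | 0 | -1 | 1 | 1 | 0.52
5 | 0 | 0 | 0 | 0 | 0.25
8 | 1 | 0 | 0 | 0 | 0.12
9 | 1 | 1 | -1 | 1 | 0.04
Ranking of experiments by average throughput: 3 > 1 > 2 > 6 > 7 > 4 > 5 > 8 > 9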
● We can’t infer much from the other factors as their effect is not very clear.
File Size > Download Speed Limit = Concurrent File Downloads = Time of the Day
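The ranking step can be reproduced with a short sort over the nine experiments. The sketch below uses the throughput values from the ranking table. The factor columns are assumed to follow the column order of Table 2.1, and the tuple layout is chosen here for illustration.

```python
# (exp_no, file_size, concurrent_downloads, speed_limit, time_of_day, avg_throughput_mb_s)
# Column order of the factor levels is assumed to follow Table 2.1.
ROWS = [
    (1, -1, -1, 1, 0, 1.21),
    (2, -1, 0, 0, 1, 0.72),
    (3, -1, 1, -1, -1, 1.71),
    (4, 0, -1, 1, 1, 0.52),
    (5, 0, 0, 0, 0, 0.25),
    (6, 0, 1, -1, -1, 0.65),
    (7, 1, -1, 1, -1, 0.59),
    (8, 1, 0, 0, 0, 0.12),
    (9, 1, 1, -1, 1, 0.04),
]

# Best response first: sort by the throughput column in descending order.
ranked = sorted(ROWS, key=lambda row: row[-1], reverse=True)
order = [row[0] for row in ranked]
file_size_levels = [row[1] for row in ranked]

print(" > ".join(map(str, order)))  # 3 > 1 > 2 > 6 > 7 > 4 > 5 > 8 > 9
print(file_size_levels)             # [-1, -1, -1, 0, 1, 0, 0, 1, 1]
```

Reading the File Size column in ranked order shows the lowest level clustered at the top and the highest level near the bottom, which is the pattern the ranking method looks for.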
➔ Range Method: In the Range Method, we find the average response corresponding to each level of the factor and find
the difference between the maximum and minimum of such averages. This difference is called the range. A factor
with a large range is considered important.
Table 2.4 - Factor Averages and Range for the Network Throughput Study
File Size > Time of the Day > Download Speed Limit = Concurrent File Downloads
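The range computation can be sketched from the nine experiments of the ranking table. As an assumption, the factor columns are taken in the order of Table 2.1, and `factor_ranges` is a hypothetical helper name.

```python
from collections import defaultdict

# Factor column order is assumed to follow Table 2.1.
FACTORS = ("file_size", "concurrent_downloads", "speed_limit", "time_of_day")

# (factor levels, average throughput in MB/s) for experiments 1..9.
ROWS = [
    ((-1, -1, 1, 0), 1.21),
    ((-1, 0, 0, 1), 0.72),
    ((-1, 1, -1, -1), 1.71),
    ((0, -1, 1, 1), 0.52),
    ((0, 0, 0, 0), 0.25),
    ((0, 1, -1, -1), 0.65),
    ((1, -1, 1, -1), 0.59),
    ((1, 0, 0, 0), 0.12),
    ((1, 1, -1, 1), 0.04),
]


def factor_ranges(rows):
    """For each factor: max minus min of the per-level average responses."""
    ranges = {}
    for col, name in enumerate(FACTORS):
        by_level = defaultdict(list)
        for levels, response in rows:
            by_level[levels[col]].append(response)
        averages = [sum(vals) / len(vals) for vals in by_level.values()]
        ranges[name] = max(averages) - min(averages)
    return ranges


ranges = factor_ranges(ROWS)
for name, value in sorted(ranges.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

With these values the ranges come out at roughly 0.96 for File Size, 0.56 for Time of the Day, and 0.44 for each of the other two factors, in line with the ordering stated above.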
Conclusion:
As per the results from the ranking method and the range method, the two factors that have the greatest
impact are File Size and Time of the Day.
➔ Allocation of Variation: In order to analyze the effect of the two most crucial factors found using the Ranking and
Range Methods, let’s consider a design with r replications of each of the ab experiments corresponding to the a levels
of factor A (File Size) and b levels of factor B (Time of the Day). The model in this case is,
$y_{ijk} = \mu + \alpha_j + \beta_i + \gamma_{ij} + e_{ijk}$
Here,
$y_{ijk}$ = response (observation) in the kth replication of experiment with factor A at level j and factor B at level i
$\mu$ = mean response
$\alpha_j$ = effect of factor A at level j
$\beta_i$ = effect of factor B at level i
$\gamma_{ij}$ = effect of interaction between factor A at level j and factor B at level i
$e_{ijk}$ = experimental error
Note:
While calculating the data for the above table, the below mentioned factors were kept constant.
● Number of Concurrent Downloads = 25
● Download Speed Limit = 10 MB/s
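Computation of Effects: The expressions for effects can be obtained in a manner similar to that used for two-factor
designs without replications. The observations are assumed to be arranged in ab cells forming a matrix of ‘b’
rows and ‘a’ columns. Each cell contains r observations belonging to the replications of one experiment. With the
effects and interactions constrained to sum to zero and the errors in each cell averaging to zero, averaging
the observations in each cell produces,
$\bar{y}_{ij.} = \mu + \alpha_j + \beta_i + \gamma_{ij}$
Similarly, averaging across columns, across rows, and over all observations produces,
$\bar{y}_{i..} = \mu + \beta_i$, $\bar{y}_{.j.} = \mu + \alpha_j$, $\bar{y}_{...} = \mu$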
Table 2.6: Computation of Effects for the Network Throughput Study with Replications
The interactions (or cell effects) for the (i, j)th cell are computed by subtracting $\mu + \alpha_j + \beta_i$ from the
cell mean $\bar{y}_{ij.}$, i.e., $\gamma_{ij} = \bar{y}_{ij.} - \mu - \alpha_j - \beta_i$.
The computation can be verified by checking that the row as well as column sums of interactions are zero.
The total variation of y can be allocated to the two factors, the interaction between them, and the experimental errors.
$SSY = SS0 + SSA + SSB + SSAB + SSE$
The percentage of variation explained by a factor or interaction can be used to measure the importance of the
corresponding effect. For example, with $SST = SSY - SS0$, the percentage explained by factor A is $100 \times SSA / SST$.
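The effect computation and the allocation of variation can be sketched as follows for a two-factor design with replications. The observations in `CELLS` are invented for illustration only and are not the measurements of this experiment. `allocate_variation` is a hypothetical helper that follows the standard sums of squares SSA = br Σ α_j², SSB = ar Σ β_i², SSAB = r Σ γ_ij², with SSE taken as the squared deviations from the cell means.

```python
def allocate_variation(cells):
    """cells[i][j] holds the r replications with factor B at level i and factor A at level j."""
    b, a, r = len(cells), len(cells[0]), len(cells[0][0])
    cell_mean = [[sum(obs) / r for obs in row] for row in cells]
    mu = sum(map(sum, cell_mean)) / (a * b)
    alpha = [sum(cell_mean[i][j] for i in range(b)) / b - mu for j in range(a)]
    beta = [sum(cell_mean[i]) / a - mu for i in range(b)]
    gamma = [[cell_mean[i][j] - mu - alpha[j] - beta[i] for j in range(a)]
             for i in range(b)]

    ssy = sum(y * y for row in cells for obs in row for y in obs)
    ss0 = a * b * r * mu ** 2
    ssa = b * r * sum(x * x for x in alpha)
    ssb = a * r * sum(x * x for x in beta)
    ssab = r * sum(g * g for row in gamma for g in row)
    sse = sum((y - cell_mean[i][j]) ** 2
              for i in range(b) for j in range(a) for y in cells[i][j])
    sst = ssy - ss0
    percent = {"A": 100 * ssa / sst, "B": 100 * ssb / sst,
               "AB": 100 * ssab / sst, "error": 100 * sse / sst}
    return {"mu": mu, "alpha": alpha, "beta": beta, "gamma": gamma,
            "SST": sst, "percent": percent}


# Hypothetical throughput data (MB/s): 2 levels of B (rows), 3 levels of A (columns), r = 3.
CELLS = [
    [[1.70, 1.75, 1.68], [0.66, 0.63, 0.65], [0.05, 0.04, 0.03]],
    [[1.20, 1.25, 1.18], [0.50, 0.55, 0.51], [0.10, 0.12, 0.14]],
]

result = allocate_variation(CELLS)
for name, pct in result["percent"].items():
    print(f"{name}: {pct:.1f}%")
```

The four percentages add up to 100, and the interactions in `result["gamma"]` sum to zero along every row and column, which gives a quick check on the arithmetic.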
Conclusions: