FA20-BCS-030 Statistics A1

Statistics and Probability Theory
Name: Waleed Ilyas
Reg#: FA20-BCS-030
Assignment No. 1
Submitted To:
Ma’am Nosheen
Q.1) Describe significant role/applications of “Statistics” in computer
science.
Use of Statistics in Computer Science

Statistics is used in many areas of computer science, including data mining, data
compression, and speech recognition. Other areas include vision and image analysis,
artificial intelligence, and network and traffic modeling.
Statistics has been especially useful in speech recognition software such as Apple’s Siri.
Statistics also backs programs such as Google Translate, which uses data to perform online
translations. In both cases the spoken or typed words are converted into a sequence of
numbers, which is then matched against known dictionaries.
Data mining uses statistical functions to find irregularities or inconsistencies within data.
Data compression uses statistical algorithms to compress data. Statistics is also used in
network traffic modeling, where statistical models help make the best use of the available
bandwidth while avoiding network congestion.
Artificial intelligence tries to simulate human thought using algorithms similar to those in
voice recognition or translation software. Other statistical uses in computer science include
quality management, software engineering, storage and retrieval processes, and hardware
engineering and manufacturing. Algorithms have become necessary in many facets of
computer programming and data mining.
Use of Probability in Computer Science

Probability pervades many areas of computer science, particularly when performance is
being considered. Here are six examples where some knowledge of probability theory is
important:
1. Computer hardware: The computer’s cache memory is designed and managed to
maximize the effective speed of memory access.
2. Computer algorithms: We efficiently test whether a very large number is prime.
(This is one example of a randomized algorithm.)
3. Data structures: We implement a hash table that provides the fastest lookup times
on average.
4. Optimizing compilers: We reorganize the machine code created from a program so
that it executes with the fastest speed.
5. Cloud computing: We bring extra computing resources online to guarantee good user
response times, and take them offline again when demand drops, so that we do not
waste money on keeping idle capacity around.
6. Databases: We distribute data between expensive fast memory (e.g. RAM) and
cheaper slower memory (e.g. spinning disks) so that the best access times on average
are maintained.
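Item 2 above mentions randomized primality testing. A minimal sketch of one such test, the Miller-Rabin test, is given below. The function name is_probable_prime and the default of 20 rounds are illustrative choices and are not taken from the assignment.

```python
import random


def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin test.

    False means n is certainly composite; True means n is prime
    with very high probability (error at most 4**-rounds).
    """
    if n < 2:
        return False
    # Handle small primes and their multiples directly.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # the base a proves that n is composite
    return True


print(is_probable_prime(2**61 - 1))  # a known Mersenne prime
print(is_probable_prime(323))        # 17 * 19, composite
```

Each round picks a random base, which is where probability enters: a composite number survives a single round with probability at most 1/4, so several independent rounds make a wrong answer extremely unlikely.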
Q.2) Excel file
For the given variables (weight and height) in “Sheet 1”:
a. Construct a simple bar chart.
b. Construct a histogram.
c. Identify the distribution of both variables as symmetrical or skewed. If
skewed, state the type of skewness.
d. Also identify the classes in which the mean, median and mode occur.
HEIGHT GRAPHS
[Figure: Height Frequency Histogram. Horizontal axis: height classes 70-71, 72-73, 74-75, 76-77, 78-79, 80-81, 82-83, 84-85; vertical axis: frequency.]
[Figure: Height Frequency Ogive (less than), over the same height classes.]
[Figure: Height bar chart.]
[Figure: Height Pivot Graph. Count of each height value from 70 to 84.]
WEIGHT GRAPHS
[Figure: Weight Frequency. Horizontal axis: weight classes 170-176, 177-183, 184-190, 191-197, 198-204, 205-211, 212-218, 219-225, 226-232, 233-240; vertical axis: frequency.]
[Figure: Weight Frequency Ogive (less than).]
[Figure: Weight bar chart over the same weight classes.]
[Figure: Weight Pivot Graph. Count of each weight value from 170 to 235.]
Q.3) Consider the frequency polygon for weights of packets of sugar purchased in
a store per day. Answer the following questions.
[Figure: Frequency Polygon. Horizontal axis: Weights of Packets / Kg, plotted at 9.8, 10.1, 10.4, 10.7, 11, 11.3, 11.6, 11.9, 12.2; vertical axis: No. of Packets.]
a. How many packets have an average weight of more than 11.6 kg?
Ans: 2
b. How many packets have an average weight of less than 11 kg?
Ans: 4
c. What is the class interval?
Ans: 0.3
d. Make the frequency distribution from the given graph.
(Hint: h = 10.1 − 9.8 = 0.3, h/2 = 0.3/2 = 0.15, LCB = 9.8 − 0.15 = 9.65,
UCB = 9.8 + 0.15 = 9.95)

Class    Frequency   Cumulative   Class Boundaries   Class Limits
Marks                Frequency    LCB      UCB       LCL     UCL
9.8      2           2            9.65     9.95      9.7     9.9
10.1     6           8            9.95     10.25     10.0    10.2
10.4     8           16           10.25    10.55     10.3    10.5
10.7     14          30           10.55    10.85     10.6    10.8
11.0     12          42           10.85    11.15     10.9    11.1
11.3     9           51           11.15    11.45     11.2    11.4
11.6     4           55           11.45    11.75     11.5    11.7
11.9     4           59           11.75    12.05     11.8    12.0
12.2     6           65           12.05    12.35     12.1    12.3
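e. Find mean and median. Identify the distribution as symmetrical or skewed.

Mean = Σfx/Σf = 714.1/65 = 10.99 (using the class marks 9.8, 10.1, ..., 12.2)
Median = l + [(n/2 − c)/f] × h
= 10.85 + [(32.5 − 30)/12] × 0.3
= 10.91
The mean (10.99) is slightly greater than the median (10.91), so the distribution is
slightly positively skewed.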
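The grouped mean and median for this frequency distribution can be checked with a short Python sketch. It assumes the class marks are the values on the polygon's horizontal axis (9.8 to 12.2) and takes the frequencies from the table; the function names grouped_mean and grouped_median are illustrative.

```python
# Class marks from the polygon's horizontal axis, frequencies from the table.
marks = [9.8, 10.1, 10.4, 10.7, 11.0, 11.3, 11.6, 11.9, 12.2]
freqs = [2, 6, 8, 14, 12, 9, 4, 4, 6]
h = 0.3  # class width


def grouped_mean(marks, freqs):
    """Mean = sum(f * x) / sum(f)."""
    return sum(x * f for x, f in zip(marks, freqs)) / sum(freqs)


def grouped_median(marks, freqs, h):
    """Median = l + ((n/2 - c) / f) * h for the median class."""
    n = sum(freqs)
    cum = 0  # cumulative frequency before the current class
    for x, f in zip(marks, freqs):
        if cum + f >= n / 2:
            lower = x - h / 2  # lower class boundary of the median class
            return lower + (n / 2 - cum) / f * h
        cum += f
    raise ValueError("empty frequency table")


print(round(grouped_mean(marks, freqs), 2))       # 10.99
print(round(grouped_median(marks, freqs, h), 2))  # 10.91
```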
Q.4) Consider the less-than and more-than frequency ogives for the speed of cars.
Answer the following questions.
a. How many cars have a speed of more than 29.5 meters per second?
Ans: 8
b. What percentage of cars have a speed of less than 44.5 meters per second?
Ans: 61.53
c. What is the class interval?
d. Construct the frequency distribution from the given graph.
LCB      UCB      LCL    UCL     x        less f   cf    f     xf
4.495    9.495    4.5    9.49    6.995    0        0     0     0
9.495    14.495   9.5    14.49   11.995   3        3     3     35.985
14.495   19.495   14.5   19.49   16.995   11       14    11    186.945
19.495   24.495   19.5   24.49   21.995   27       38    24    527.88
24.495   29.495   24.5   29.49   26.995   44       71    33    890.835
29.495   34.495   29.5   34.49   31.995   56       100   29    927.855
34.495   39.495   34.5   39.49   36.995   63       119   19    702.905
39.495   44.495   39.5   44.49   41.995   68       131   12    503.94
44.495   49.495   44.5   49.49   46.995   72       140   9     422.955
49.495   54.495   49.5   54.49   51.995   74       146   6     311.97
54.495   59.495   54.5   59.49   56.995   76       150   4     227.98
59.495   64.495   59.5   64.49   61.995   77       153   3     185.985
64.495   69.495   64.5   69.49   66.995   78       155   2     133.99
Total                                                    155   5059.23
e. Find mean and median. Identify the distribution as symmetrical or skewed.
Mean = Σxf/Σf = 5059.23/155 = 32.64
Median = l + [(n/2 − c)/f] × h
= 29.495 + [(77.5 − 71)/29] × 5
= 30.6
Q.5) The median of values 2, 7, 9, 10, x, 19, 23, 25, 30 is 17. Find the mean of
these values?
Ans: There are nine ordered values, so the median is the 5th value: x = 17.
2, 7, 9, 10, 17, 19, 23, 25, 30
Mean = x̅ = Σx/n = (2+7+9+10+17+19+23+25+30)/9 = 142/9 = 15.78
x̅ = 15.78
Q.6) The median of values 2, 7, 9, 10, x, 19, 23, 25, 30, 36 is 15.5. Find the mean of
these values?
Ans: There are ten ordered values, so the median is the average of the 5th and 6th values.
(x + 19)/2 = 15.5
x = 15.5 × 2 − 19
x = 12
2, 7, 9, 10, 12, 19, 23, 25, 30, 36
Mean = Σx/n = (2+7+9+10+12+19+23+25+30+36)/10 = 173/10 = 17.3
x̅ = 17.3
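The two median problems above differ only in whether the number of values is odd or even. A small check with Python's standard statistics module is sketched below; the variable names are illustrative.

```python
from statistics import mean, median

# Q.5: nine ordered values, so the median is the 5th value and x = 17.
q5 = [2, 7, 9, 10, 17, 19, 23, 25, 30]
assert median(q5) == 17
q5_mean = mean(q5)  # 142 / 9

# Q.6: ten ordered values, so the median is the average of the 5th and 6th:
# (x + 19) / 2 = 15.5, which gives x = 2 * 15.5 - 19.
x = 2 * 15.5 - 19
q6 = [2, 7, 9, 10, x, 19, 23, 25, 30, 36]
assert median(q6) == 15.5
q6_mean = mean(q6)  # 173 / 10

print(round(q5_mean, 2), round(q6_mean, 2))  # 15.78 17.3
```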
Q.7) Find x and y so that the ordered data set has a mean of 42.7 and a median of 37.
17, 22 , 26 , 29 , 34 , x , 42 , 67 , 70 , y
There are ten ordered values, so the median is the average of the 5th and 6th values.
Median = (34 + x)/2 = 37
37 × 2 = 34 + x
x = 40
17, 22, 26, 29, 34, 40, 42, 67, 70, y
Mean = x̅ = Σx/n = (17 + 22 + 26 + 29 + 34 + 40 + 42 + 67 + 70 + y)/10
42.7 = (347 + y)/10
427 − 347 = y
y = 80
Ans: x = 40, y = 80
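The same two steps (solve the median equation for x, then the mean equation for y) can be written as a short Python sketch. The helper names missing_for_median and missing_for_mean are hypothetical and are introduced only for this illustration.

```python
def missing_for_median(left, target_median):
    """Even-sized ordered set: median = (left + x) / 2, solved for x."""
    return 2 * target_median - left


def missing_for_mean(known, target_mean):
    """Mean = (sum(known) + y) / n with n = len(known) + 1, solved for y."""
    n = len(known) + 1
    return n * target_mean - sum(known)


x = missing_for_median(34, 37)               # 5th value is 34, median is 37
known = [17, 22, 26, 29, 34, x, 42, 67, 70]  # every value except y
y = missing_for_mean(known, 42.7)

print(x, round(y, 2))  # 40 80.0
```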
Q.8) Find x so that the ordered data set has a mode of 42.
17, 22, 26, 29, 29, 42, x, 42, 67, 70
x = 42
The mode is the most frequent value in the data set. Without x, both 29 and 42 occur
twice, so x must be 42 for 42 to become the mode.
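A short Python sketch can confirm this by counting occurrences with and without x. The helper name modes is illustrative; it returns every value that attains the highest count.

```python
from collections import Counter


def modes(values):
    """Return every value that attains the highest count."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


base = [17, 22, 26, 29, 29, 42, 42, 67, 70]  # the data set without x

print(modes(base))         # [29, 42]: a tie, so 42 is not yet the only mode
print(modes(base + [42]))  # [42]: with x = 42 the mode is 42
```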
Q.9) Find x and y so that the ordered data set has a mean of 117.22 and a median of
111.
99, 105, 106, 109, x, 120, 125, y, 150
There are nine ordered values, so the median is the 5th value:
x = 111
99, 105, 106, 109, 111, 120, 125, y, 150
Mean = x̅ = Σx/n = (99 + 105 + 106 + 109 + 111 + 120 + 125 + y + 150)/9
117.22 × 9 = 925 + y
1055 − 925 = y (117.22 × 9 = 1054.98, which rounds to 1055)
y = 130
99, 105, 106, 109, 111, 120, 125, 130, 150

Q.10) Given the following grouped frequency table, in which interval does the
median fall?
Class Interval   f    CF
40-49            8    36
30-39            12   28
20-29            10   16
10-19            4    6
0-9              2    2
Median = l + [(n/2−c)/f] × h
n/2 = 36/2 = 18, and the first class with CF of at least 18 is 30-39 (CF = 28), so
l = 29.5
a. 30-39 b. 20-29 c. 10-19 d. 0-9
Ans: a. 30-39
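Locating the median class only needs the cumulative frequencies. A minimal Python sketch, with the classes listed in ascending order and the illustrative function name median_class:

```python
# Classes in ascending order with their frequencies (from the table above).
classes = [("0-9", 2), ("10-19", 4), ("20-29", 10), ("30-39", 12), ("40-49", 8)]


def median_class(classes):
    """Return the first class whose cumulative frequency reaches n/2."""
    n = sum(f for _, f in classes)
    cum = 0
    for label, f in classes:
        cum += f
        if cum >= n / 2:
            return label
    raise ValueError("empty frequency table")


print(median_class(classes))  # 30-39
```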
