Be - Computer Engineering - Semester 7 - 2022 - December - Big Data Analysis Rev 2019 C Scheme

Paper / Subject Code: 42172 / BIG DATA ANALYTICS

Time: 03 Hours Marks: 80

Note: 1. Question 1 is compulsory.
2. Answer any three out of the remaining five questions.
3. Assume any suitable data wherever required and justify the same.

Q1 a) What is the function of Map tasks in the MapReduce framework? Explain with the help of an example. [5]

b) Demonstrate how business problems have been successfully solved faster, cheaper and more effectively, considering Google's MapReduce NoSQL case study. Also illustrate the business drivers and the findings in it. [5]

c) Why is HDFS better suited to applications with large datasets than to applications with many small files? Elaborate. [5]

d) Explain the concept of a Bloom filter with an example. [5]
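
One way to picture the Bloom filter named in this question is the minimal Python sketch below. The class name, the 64-bit array, the choice of 3 hash functions and the salted SHA-256 hashing are all assumptions made for illustration; none of them come from the paper.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: a bit array plus several hash functions."""

    def __init__(self, num_bits=64, num_hashes=3):
        # Assumed sizes; a real deployment would derive them from the
        # expected number of items and the tolerated false-positive rate.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Simulate independent hash functions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        # Set every bit position that the item hashes to.
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bloom = BloomFilter()
for word in ["apple", "mango"]:
    bloom.add(word)

print(bloom.might_contain("apple"))   # an added item is never reported absent
print(bloom.might_contain("banana"))  # usually False, but a false positive is possible
```

The key property the sketch shows is asymmetry: an item that was added is always reported as possibly present, while an item that was never added may occasionally be reported as present because its bit positions were set by other items.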

Q2 a) Name the three ways that resources can be shared between computer systems. Name the architecture used in big data solutions and describe it in detail. [10]

b) Write MapReduce pseudocode for the word count problem. Show the working of MapReduce on the following document: [10]

“This is an apple. Apple is red in color”.
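
The map, shuffle and reduce stages of word count on this document can be sketched in plain Python as follows. This is a single-process illustration, not Hadoop code: the function names are hypothetical, and lowercasing the words and dropping punctuation are assumptions about how the document should be tokenized.

```python
import re
from collections import defaultdict

DOCUMENT = "This is an apple. Apple is red in color"


def map_phase(text):
    """Map: emit a (word, 1) pair for every word in the input split."""
    # Assumption: words are lowercased and punctuation is ignored.
    return [(word, 1) for word in re.findall(r"[a-z]+", text.lower())]


def shuffle_phase(pairs):
    """Shuffle: group the emitted values by key (word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups):
    """Reduce: sum the list of ones collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


mapped = map_phase(DOCUMENT)
grouped = shuffle_phase(mapped)
word_counts = reduce_phase(grouped)

print(mapped)       # nine (word, 1) pairs
print(grouped)      # e.g. 'apple': [1, 1]
print(word_counts)  # e.g. 'apple': 2, 'is': 2, all other words 1
```

Under these tokenization assumptions, "is" and "apple" each occur twice and the remaining five words occur once.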


Q3 a) Suppose the stream is 1, 3, 2, 1, 2, 3, 4, 3, 1, 2, 3, 1. Let h(x) = (6x + 1) mod 5. Show how the Flajolet-Martin algorithm will estimate the number of distinct elements in this stream. [10]
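
The Flajolet-Martin computation for this stream can be sketched in Python as shown below. Two assumptions are made: the hash is read as (6x + 1) mod 5, and a hash value of 0 is given a tail length of 0, which is one common textbook convention (other treatments exist). The function names are hypothetical.

```python
STREAM = [1, 3, 2, 1, 2, 3, 4, 3, 1, 2, 3, 1]


def h(x):
    # Hash function from the question, read as (6x + 1) mod 5.
    return (6 * x + 1) % 5


def tail_length(n):
    """Number of trailing zero bits in the binary form of n."""
    if n == 0:
        return 0  # assumption: treat the all-zero hash as tail length 0
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


def flajolet_martin(stream):
    # R is the longest tail seen over the stream; the estimate is 2^R.
    max_tail = max(tail_length(h(x)) for x in stream)
    return max_tail, 2 ** max_tail


for x in sorted(set(STREAM)):
    print(x, h(x), format(h(x), "03b"), tail_length(h(x)))

max_tail, estimate = flajolet_martin(STREAM)
print(max_tail, estimate)
```

With these conventions the hash values of 1, 2, 3 and 4 are 2, 3, 4 and 0, the longest tail is 2 (from h(3) = 4 = 100 in binary), and the estimate is 2^2 = 4 distinct elements.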


b) Consider the data frame given below: [10]

subject  class  marks
1        1      56
2        2      75
3        1      48
4        2      69
5        1      84
6        2      53

i. Create a subset where subject is less than 4 by using the subset() function and demonstrate the output.

ii. Create a subset where the subject column is less than 3 and the class equals 2 by using [ ] brackets and demonstrate the output.

Q4 a) What are the Core Hadoop components? Explain in detail. [10]


b) With a neat sketch, explain the architecture of the data-stream management system. [10]

Q5 a) Determine communities for the given social network graph using the Girvan-Newman algorithm. [10]

15786 Page 1 of 2

[Figure: social network graph with nodes A, B, C, D, E, F, G]


b) Mr. John, the data analyst at Argon Technology, needs to enter the salaries of 10 employees in R. The salaries of the employees are given in the following table: [10]

Sr. No.  Name of employee  Salary
1        Vivek             21000
2        Karan             55000
3        James             67000
4        Soham             50000
5        Renu              54000
6        Farah             40000
7        Hetal             30000
8        Mary              70000
9        Ganesh            20000
10       Krish             15000

i. Which R command will Mr. John use to enter these values? Demonstrate the output.

ii. Mr. John now wants to add the salaries of 5 new employees to the existing table. Which command will he use to join the datasets with the new values in R? Demonstrate the output.

Q6 a) [10]
i. Write a script to sort the values contained in the following vector in ascending order and descending order: (23, 45, 10, 34, 89, 20, 67, 99). Demonstrate the output.

ii. Name and explain the operators used to form data subsets in R.

b) How is recommendation done based on the properties of a product? Elaborate with a suitable example. [10]

-----------------