
Hadoop interview questions

HPE rounds

Round 4- customer/client round

1- brief about yourself and your current role.

2- what measures are you taking to avoid errors and issues in your cluster?

3- what is the end-to-end architecture of your cluster and its components, and how does your company work?

4- high availability: for which components, and how is it provided?

4- storage layer? processing layer?

5- total capacity of your HDFS?

6- YARN specifications?

7- ingestion tool?

8- what is the replication factor in your cluster?

9- why is 3 the replication factor?

10- have you heard of erasure coding?

11- what is rack awareness? and topology?

12- why do we do rack awareness?

13- suppose incoming data has an invalid schema: what can you do?
can you stop that data, or update it?

14- any recent issues you have faced in your daily routines?

15- what do you do if you receive an RC (request for change)?

16- suppose a request for change for a service comes to you with all permissions and
approvals completed: what changes will you make, and how?

17- encountered any such request for change recently? what was the request and how did you handle it?
and did you do it all by yourself?

18- do you know shell scripting?

19- have you created any Linux script on your own?

20- how good are you in Linux?

21- command for renaming a file in Linux?

22- single command for renaming a file and simultaneously moving it to another location?
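
For questions 21 and 22: on the Linux shell, `mv old.txt new.txt` renames a file, and `mv old.txt /other/dir/new.txt` renames and moves it in a single command. The same idea can be sketched in Python with `shutil.move`; the function name `rename_and_move` and the file names below are hypothetical, chosen only for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def rename_and_move(src: Path, dest_dir: Path, new_name: str) -> Path:
    """Rename src to new_name and place it in dest_dir with one move call."""
    # Create the destination directory if it does not exist yet.
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / new_name
    # shutil.move renames when it can, and falls back to copy + delete
    # when source and target are on different filesystems.
    shutil.move(str(src), str(target))
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "old.txt"          # hypothetical file name
        src.write_text("sample")
        moved = rename_and_move(src, Path(tmp) / "archive", "new.txt")
        print(moved.name, moved.exists(), src.exists())
```

Giving the full target path, directory plus new file name, is what makes the rename and the move a single operation.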

Round 3- HPE manager round

1- introduce yourself with your current roles and responsibilities?


2- present company?

3- cluster specifications?

4- how do you check logs? CLI command?

5- suppose I want to check the logs of a specific task from the CLI: what is the command?

6- did you graduate in comp. science?

7- why and how did you transition from civil to comp. science?

8- any recent issues faced in your cluster along with solution?

9- what will you do when a running task is consuming a lot of resources and,
because of it, a DataNode goes down? what can be done now?
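
For questions 4 and 5 of this round, one possible answer, assuming the jobs run on YARN with log aggregation enabled, is the `yarn logs` command with `-applicationId`, narrowed to one task's container with `-containerId`. The sketch below only builds the command line; the helper name `yarn_logs_command` and the IDs are hypothetical, and some older Hadoop releases may also expect a node address alongside the container ID.

```python
from typing import List, Optional


def yarn_logs_command(application_id: str,
                      container_id: Optional[str] = None) -> List[str]:
    """Build the argument list for fetching aggregated YARN logs."""
    cmd = ["yarn", "logs", "-applicationId", application_id]
    if container_id is not None:
        # Restrict the output to the container that ran the task.
        cmd += ["-containerId", container_id]
    return cmd


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    app = "application_1700000000000_0001"
    container = "container_1700000000000_0001_01_000002"
    print(" ".join(yarn_logs_command(app)))
    print(" ".join(yarn_logs_command(app, container)))
```

The resulting list can be passed to `subprocess.run` on a host where the `yarn` client is installed.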

Round 2- HPE on-site manager

1- introduce yourself, with your current company and a description

2- are you married?

3- do you have a son or a daughter?

4- how many years of marriage?

5- what's the name of your son?

6- are you ready to relocate?

7- how many members in the family?

8- how many nodes of cluster?

9- what do you know about high availability?

10- why do we do NameNode high availability?

11- mechanism and components behind high availability of the NameNode?

12- what is the purpose of the standby/secondary NameNode?

13- any recent issues faced and solutions provided to them?

14- are you OK with working rotational shifts?

Round 1- HPE tech lead

1- introduce yourself

2- cluster specifications?

3- what type of client/domain ?


4- worked in migration team?

5- what is the SLA in your team?

6- which ticketing tool do you use?

7- how do you track the SLA, and how you and your team are working against it?

8- how do you decide which ticket to solve first according to the SLA?

9- team structure in your company?

10- any issues faced in your current company, and how did you solve them?

11- out of memory error?

12- do you know PySpark?

13- what is the replication factor in your env?

14- what is rack awareness? topology?

15- what is fault tolerance?

16- suppose 1 rack goes down and 2 replicas of your file go down with it: what
will be done now?

17- can running the balancer re-replicate the data to 2 more nodes?
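
Questions 13 to 17 can be reasoned through with a small model. The sketch below is a simplified illustration, not HDFS code: it assumes a replication factor of 3, two racks, and the usual rack-aware placement (first replica on the writer's node, second on another rack, third on a different node of that second rack). All rack and host names are hypothetical. When the rack holding 2 replicas fails, 1 replica survives, and it is the NameNode that schedules new copies of the under-replicated block; the balancer only moves existing blocks to even out disk usage and does not create replicas.

```python
from typing import List, Tuple

Node = Tuple[str, str]  # (rack, host), hypothetical names

NODES: List[Node] = [
    ("rack1", "h1"), ("rack1", "h2"), ("rack1", "h3"),
    ("rack2", "h4"), ("rack2", "h5"), ("rack2", "h6"),
]


def place_replicas(writer: Node, nodes: List[Node]) -> List[Node]:
    """Simplified rack-aware placement for replication factor 3."""
    first = writer
    # Second replica: any node on a different rack than the writer.
    second = next(n for n in nodes if n[0] != first[0])
    # Third replica: a different node on the same rack as the second.
    third = next(n for n in nodes if n[0] == second[0] and n != second)
    return [first, second, third]


def surviving(replicas: List[Node], failed_rack: str) -> List[Node]:
    """Replicas still reachable after a whole rack goes down."""
    return [n for n in replicas if n[0] != failed_rack]


def re_replicate(live: List[Node], nodes: List[Node],
                 failed_rack: str, target: int = 3) -> List[Node]:
    """Model of re-replication: copy from a live replica to healthy nodes."""
    if not live:
        raise ValueError("no surviving replica, the block is lost")
    result = list(live)
    for n in nodes:
        if len(result) >= target:
            break
        if n[0] != failed_rack and n not in result:
            result.append(n)
    return result


if __name__ == "__main__":
    replicas = place_replicas(("rack1", "h1"), NODES)
    live = surviving(replicas, "rack2")
    print("placed:", replicas)
    print("after rack2 failure:", live)
    print("after re-replication:", re_replicate(live, NODES, "rack2"))
```

Because the placement never puts all 3 replicas on one rack, a single rack failure always leaves at least 1 copy to re-replicate from, which is the point of rack awareness.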
