
Name : ABHINANDITA BANERJEE

REG NO :20BCE2080
THEORY DIGITAL ASSIGNMENT
DATA VISUALIZATION
DATASET: https://www.kaggle.com/benroshan/factors-affecting-campus-placement
This dataset contains placement data for students of a campus referred to as XYZ. It includes
secondary and higher secondary school percentages and specializations, degree specialization and
type, work experience, and the salary offered to placed students. The attributes used in this
assignment are described below:

gender - gender of the student

degree_p - degree percentage

degree_t - undergraduate degree type (field of degree education)

workex - work experience

specialisation - MBA (postgraduate) specialization; the column is spelled "specialisation" in the CSV

mba_p - MBA percentage of the student (if done)

status - Placed or Not Placed, i.e. whether the student was recruited

salary - salary of the placed student

INTERACTIVE VISUALIZATIONS:

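# Install once from a terminal (not inside the Python script):
#   conda install -c plotly plotly
#   conda install -c conda-forge cufflinks-py
import pandas as pd
import cufflinks as cf
import plotly.offline

# Render charts locally; cufflinks adds the .iplot() method to DataFrames and Series.
cf.go_offline()
# offline=True keeps the configuration consistent with go_offline() above.
cf.set_config_file(offline=True, world_readable=True)

dc = pd.read_csv("recruitment.csv")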
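Before plotting, it can help to confirm that the file has the columns described above and that salary is empty only for unplaced students. The sketch below is a minimal check under stated assumptions: the three CSV rows are invented stand-ins for recruitment.csv, and the names sample, expected and missing are illustrative.

```python
import io
import pandas as pd

# Three invented rows standing in for recruitment.csv (not real data).
csv_text = """gender,degree_p,degree_t,workex,specialisation,mba_p,status,salary
M,58.0,Sci&Tech,No,Mkt&HR,58.8,Placed,270000
F,64.0,Comm&Mgmt,Yes,Mkt&Fin,66.3,Not Placed,
M,72.0,Comm&Mgmt,No,Mkt&Fin,59.4,Placed,250000
"""
sample = pd.read_csv(io.StringIO(csv_text))

# Columns used by the plots in this assignment.
expected = ["gender", "degree_p", "degree_t", "workex",
            "specialisation", "mba_p", "status", "salary"]
missing = [col for col in expected if col not in sample.columns]

# Unplaced students should have no salary recorded.
unplaced_without_salary = sample.loc[sample["status"] == "Not Placed", "salary"].isna().all()

print("missing columns:", missing)
print("unplaced rows have empty salary:", unplaced_without_salary)
```

With the real file, the same checks would run on dc instead of sample.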
1. degree_p vs salary
dc.iplot(kind="scatter", theme="white", x="degree_p", y="salary", categories="gender")

OBSERVATION:
For the same range of degree percentage (degree_p), male candidates were generally offered
higher salaries than female candidates.
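The comparison read off the scatter plot can also be checked numerically by grouping degree_p into bands and comparing the median salary per gender within each band. This is only a sketch: the six rows are invented for illustration, and the 10-point band edges are an assumption, not something taken from the dataset.

```python
import pandas as pd

# Invented rows for illustration; with the real data use dc from recruitment.csv.
toy = pd.DataFrame({
    "gender":   ["M", "F", "M", "F", "M", "F"],
    "degree_p": [58.0, 59.0, 66.0, 67.0, 74.0, 73.0],
    "salary":   [270000, 250000, 300000, 260000, 350000, 300000],
})

# Band degree_p into 10-point ranges so salaries are compared like for like.
bands = pd.cut(toy["degree_p"], bins=[50, 60, 70, 80])

# Median salary for each (band, gender) pair, with one column per gender.
median_salary = (
    toy.groupby([bands, "gender"], observed=True)["salary"]
       .median()
       .unstack("gender")
)
print(median_salary)
```

On the real data, unplaced students have no salary, and median() skips those missing values by default.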
2. How the chances of placement vary with gender

# dc was already loaded above, so the CSV does not need to be read again.
placed = dc[dc['status'] == 'Placed']['gender'].value_counts()
notplaced = dc[dc['status'] == 'Not Placed']['gender'].value_counts()

# One row per placement status, one column per gender.
df1 = pd.DataFrame([placed, notplaced])
df1.index = ['Placed', 'NotPlaced']
df1.iplot(kind='bar', barmode='stack', title='Placement by Gender')


OBSERVATION:
More male candidates were placed than female candidates. The number of unplaced female
candidates was also lower than the number of unplaced male candidates.

i) There were 48 placed female and 100 placed male candidates.

ii) There were 29 unplaced female and 39 unplaced male candidates.
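Raw counts do not by themselves show how the chance of placement differs, because the two groups are of different sizes. A minimal sketch, using only the four counts reported in the observation above, converts them into placement rates per gender (the names counts and rates are illustrative):

```python
import pandas as pd

# Counts copied from the observation above (rows: gender, columns: status).
counts = pd.DataFrame(
    {"Placed": [48, 100], "Not Placed": [29, 39]},
    index=["Female", "Male"],
)

# Divide each row by its total to get the share of each gender in each status.
rates = counts.div(counts.sum(axis=1), axis=0)
print(rates.round(3))
```

With these counts, about 72% of male candidates and about 62% of female candidates were placed. On the full DataFrame, pd.crosstab(dc["gender"], dc["status"], normalize="index") should give the same kind of table directly.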

3. Distribution of the salaries earned by placed students
placed = dc[dc['status'] == 'Placed']['salary']
placed.iplot(kind="histogram", bins=20, theme="white", title="Salary Awarded",
             xTitle='Salary', yTitle='Count')
OBSERVATION:
The highest salary among placed students is 900K (1 student), but most placed students
received a package in the 250K-299K range (54 students).
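The bin counts read from the histogram can be reproduced in a table by cutting the salaries into fixed-width bins. The sketch below is an assumption-laden illustration: the nine salaries are invented, and the 50K-wide bins starting at 200K are chosen for readability rather than taken from the plot, which used bins=20.

```python
import pandas as pd

# Invented salaries for illustration; with the real data use the placed Series.
salary = pd.Series(
    [200000, 250000, 260000, 275000, 290000, 300000, 360000, 400000, 900000],
    name="salary",
)

# Left-closed 50K-wide bins: [200K, 250K), [250K, 300K), ..., [900K, 950K).
edges = list(range(200000, 950001, 50000))
binned = pd.cut(salary, bins=edges, right=False)

# Number of students per bin, in salary order, and the most common bin.
counts = binned.value_counts().sort_index()
top_bin = counts.idxmax()
print(counts[counts > 0])
print("most common range:", top_bin, "with", counts.max(), "students")
```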

4. Scores obtained by male and female students of each department
import plotly.express as px

# One box per degree type, split by gender.
fig = px.box(dc, x="degree_t", y="degree_p", color="gender",
             title="Degree percentage vs degree specialization based on gender")
fig.show()
OBSERVATIONS:

i) The maximum marks scored by males in Sci and Tech is 78.86, the lowest is 52, and the median
is 65.5.

ii) The maximum marks scored by females in Sci and Tech is 91, the lowest is 59, and the median
is 69.6.

iii) The maximum marks scored by males in Commerce and Management is 83, the lowest is 55, and
the median is 60.

iv) The maximum marks scored by females in Commerce and Management is 85, the lowest is 55.2,
and the median is 68.

v) The maximum marks scored by males in other specializations is 65, the lowest is 52, and the
median is 60.

vi) The maximum marks scored by females in other specializations is 78, the lowest is 52, and
the median is 59.
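The minimum, median and maximum listed above can be computed directly instead of being read from the box plot hover labels. The following is a sketch on invented rows; the degree_t labels "Sci&Tech" and "Comm&Mgmt" are assumed spellings and should be checked against the actual CSV.

```python
import pandas as pd

# Invented rows for illustration; the degree_t labels are assumed spellings.
toy = pd.DataFrame({
    "degree_t": ["Sci&Tech", "Sci&Tech", "Sci&Tech", "Sci&Tech",
                 "Comm&Mgmt", "Comm&Mgmt", "Comm&Mgmt", "Comm&Mgmt"],
    "gender":   ["M", "M", "F", "F", "M", "M", "F", "F"],
    "degree_p": [60.0, 70.0, 65.0, 75.0, 55.0, 65.0, 58.0, 72.0],
})

# One row per (degree type, gender) with the three statistics quoted above.
summary = toy.groupby(["degree_t", "gender"])["degree_p"].agg(["min", "median", "max"])
print(summary)
```

One caveat when comparing with the plot: box plot whiskers usually stop at the fences, so an outlier can make the true minimum or maximum differ from the whisker ends.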

5. MBA percentage vs specialization based on work experience


# box=True draws a small box plot inside each violin.
fig = px.violin(dc, y="mba_p", x="specialisation", color="workex", box=True,
                title="MBA percentage vs specialization based on work experience")
fig.show()

OBSERVATIONS:
For both specializations, mba_p is higher for students who have work experience. This suggests
that work experience is associated with a higher MBA percentage, possibly because students with
work experience are more knowledgeable.
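The size of the difference seen in the violin plot can be quantified by comparing the median mba_p with and without work experience inside each specialization. This is a sketch on invented rows; the specialisation labels "Mkt&HR" and "Mkt&Fin" and the "Yes"/"No" values of workex are assumptions to be checked against the CSV.

```python
import pandas as pd

# Invented rows for illustration; labels are assumed, not confirmed from the CSV.
toy = pd.DataFrame({
    "specialisation": ["Mkt&HR", "Mkt&HR", "Mkt&HR", "Mkt&HR",
                       "Mkt&Fin", "Mkt&Fin", "Mkt&Fin", "Mkt&Fin"],
    "workex": ["Yes", "Yes", "No", "No", "Yes", "Yes", "No", "No"],
    "mba_p":  [64.0, 66.0, 60.0, 62.0, 66.0, 68.0, 61.0, 63.0],
})

# Median MBA percentage per specialization, one column per workex value.
medians = toy.groupby(["specialisation", "workex"])["mba_p"].median().unstack("workex")

# Positive gap means students with work experience scored higher.
gap = medians["Yes"] - medians["No"]
print(medians)
print(gap)
```

A gap computed this way describes an association only; it does not show that work experience causes the higher percentage.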
