PA Lab 4
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline
df = pd.read_csv("HR_comma_sep.csv")
df.head()
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-2-928fbcd7c60a> in <module>
----> 1 df = pd.read_csv("HR_comma_sep.csv")
2 df.head()
5 frames
/usr/local/lib/python3.9/dist-packages/pandas/io/common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
784 if ioargs.encoding and "b" not in ioargs.mode:
785 # Encoding
--> 786 handle = open(
787 handle,
788 ioargs.mode,
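The traceback above shows that `HR_comma_sep.csv` was not found in the runtime's working directory. A minimal sketch of a guard that fails early with a clearer message is given here; `load_hr_data` is a hypothetical helper, not part of the original notebook, and the default path is simply the file name used in the cell above.

```python
from pathlib import Path

import pandas as pd


def load_hr_data(csv_path="HR_comma_sep.csv"):
    """Read the HR CSV, raising a descriptive error if the file is missing.

    Hypothetical helper: the original notebook calls pd.read_csv directly.
    """
    path = Path(csv_path)
    if not path.is_file():
        # Report the working directory so the missing upload is easy to spot.
        raise FileNotFoundError(
            f"{path} not found (working directory: {Path.cwd()}); "
            "upload the file or pass its full path."
        )
    return pd.read_csv(path)
```

In the notebook this could replace the direct `pd.read_csv` call, for example `df = load_hr_data()`, once the CSV has been uploaded to the runtime.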
left = df[df.left==1]
left.shape
(3571, 10)
retained = df[df.left==0]
retained.shape
(11428, 10)
1 of 6 25-09-2023, 12:40
PA Lab Exp 4 Predicting HR Analytics using logistic _regression_exerci... https://colab.research.google.com/drive/1fE4-juU7B1VadoTPkDZWp...
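# numeric_only=True skips the text columns (Department, salary), which newer pandas versions refuse to average
df.groupby('left').mean(numeric_only=True)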
left
pd.crosstab(df.salary,df.left).plot(kind='bar')
<matplotlib.axes._subplots.AxesSubplot at 0x1442bf3eda0>
pd.crosstab(df.Department,df.left).plot(kind='bar')
<matplotlib.axes._subplots.AxesSubplot at 0x1442bf75128>
subdf = df[['satisfaction_level','average_montly_hours','promotion_last_5years','salary']]
subdf.head()
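The next cell concatenates `salary_dummies`, but the cell that defines it does not appear in this export. One plausible way to build it is to one-hot encode the `salary` column with `pd.get_dummies`. The sketch below assumes that approach and a `salary` prefix; `demo_subdf` is a hypothetical three-row stand-in for `subdf`, and the salary levels `low`, `medium` and `high` are assumed rather than taken from the source.

```python
import pandas as pd

# Hypothetical stand-in for subdf; the real frame is selected from HR_comma_sep.csv.
demo_subdf = pd.DataFrame({
    "satisfaction_level": [0.38, 0.80, 0.11],
    "average_montly_hours": [157, 262, 272],
    "promotion_last_5years": [0, 0, 0],
    "salary": ["low", "medium", "high"],  # assumed category labels
})

# One-hot encode the categorical salary column (assumed step, not shown in the source).
salary_dummies = pd.get_dummies(demo_subdf.salary, prefix="salary")

# One indicator column is produced per salary level.
print(salary_dummies.columns.tolist())
```

In the notebook itself, the same call would be applied to `subdf.salary` so that the row index lines up with `subdf` in the concatenation that follows.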
df_with_dummies = pd.concat([subdf,salary_dummies],axis='columns')
df_with_dummies.head()
df_with_dummies.drop('salary',axis='columns',inplace=True)
df_with_dummies.head()
0 0.38 157 0 0
1 0.80 262 0 0
2 0.11 272 0 0
3 0.72 223 0 0
4 0.37 159 0 0
X = df_with_dummies
X.head()
0 0.38 157 0 0
1 0.80 262 0 0
2 0.11 272 0 0
3 0.72 223 0 0
4 0.37 159 0 0
y = df.left
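The cells that follow use `X_train`, `X_test`, `y_train`, `y_test` and `model`, none of which are defined in this export. Given the exercise title, a reasonable guess is a scikit-learn `train_test_split` followed by a `LogisticRegression` estimator. The sketch below illustrates that setup on synthetic data; `demo_X` and `demo_y` are hypothetical, and the 70/30 split, `random_state` and `max_iter` values are assumptions rather than the notebook's settings.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic features standing in for X (the real X is df_with_dummies).
rng = np.random.default_rng(0)
demo_X = pd.DataFrame({
    "satisfaction_level": rng.uniform(0, 1, 200),
    "average_montly_hours": rng.integers(96, 311, 200),
})
# Illustrative label only: mark low-satisfaction rows as having left.
demo_y = (demo_X.satisfaction_level < 0.4).astype(int)

# Hold out 30% of the rows for testing (assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    demo_X, demo_y, test_size=0.3, random_state=0
)

# Logistic regression classifier; max_iter is raised because the features are unscaled.
model = LogisticRegression(max_iter=1000)
```

In the notebook, `demo_X` and `demo_y` would be replaced by `X` and `y`. The score reported below came from the original run, so a different split will not reproduce it exactly.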
model.fit(X_train, y_train)
model.predict(X_test)
model.score(X_test,y_test)
0.78428571428571425