Aquif Ibrar 1212

3/7/23, 7:43 PM AquifIbrar_1212.ipynb - Colaboratory

We import the libraries needed to build the ML model and load the data into a dataframe named HR.

    import warnings
    warnings.filterwarnings("ignore")
    import pandas as pd

    HR = pd.read_csv("/content/HR_comma_sep.csv")
    HR.info()

[Output of HR.info(): 14999 non-null entries per column; dtypes: float64(2), int64(6), object(2).]

Next we check the head of the dataframe.

    HR.head()

[Output: the first rows, with columns including satisfaction_level, last_evaluation, number_project, average_montly_hours and time_spend_company.]

Now we count the employees who left (1) and those who were retained (0).

    HR.left.value_counts()

[Output: the count per class; Name: left, dtype: int64.]

Creating a list with all the columns.

    list(HR.columns)

Dropping the column "left" to obtain the list of features.

    X_features = list(HR.columns)
    X_features.remove("left")
    X_features

[Output: the feature names, including satisfaction_level, number_project and promotion_last_5years.]

Using one-hot encoding to create (N-1) dummy variables for N categories.

    encoded_df = pd.get_dummies(HR[X_features], drop_first=True)
    list(encoded_df.columns)

[Output: the encoded column names, including satisfaction_level, Department_marketing and salary_medium.]
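The (N-1) dummy encoding used above can be illustrated on a toy frame. This is a minimal sketch under assumptions: the three-row `toy` dataframe and its values are made up for illustration and do not come from the HR data; only the `pd.get_dummies(..., drop_first=True)` call mirrors the notebook.

```python
import pandas as pd

# Hypothetical toy data (not taken from the HR file):
# one numeric column and one categorical column with 3 levels.
toy = pd.DataFrame({
    "satisfaction_level": [0.38, 0.80, 0.11],
    "salary": ["low", "medium", "high"],
})

# drop_first=True drops the first category in sorted order ("high"),
# so the 3 salary levels become 2 dummy columns.
toy_encoded = pd.get_dummies(toy, drop_first=True)
print(list(toy_encoded.columns))
```

The dropped level acts as the baseline: a row with both salary dummies equal to zero is a "high" salary row, which avoids perfectly collinear columns once a constant is added.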
Importing the statsmodels library as sm, creating the dependent variable Y and the independent variables X, and adding a constant so that the model estimates an intercept.

    import statsmodels.api as sm

    Y = HR.left
    X = sm.add_constant(encoded_df)

Splitting the dataframe: the model is trained on 70% of the data and tested on the remaining 30%.

    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=42)

Fitting the regression model with statsmodels and naming it logit_model.

    logit = sm.Logit(y_train, X_train)
    logit_model = logit.fit()

Now checking the summary of the model to identify significant and non-significant variables.

    logit_model.summary2()

[Output of logit_model.summary2(): Pseudo R-squared 0.225; Dependent Variable left; Date 2023-03-07 13:24; No. Observations 10499; Df Model 18; Df Residuals 10480; LLR p-value 0.0000; Converged 1.0000; No. Iterations 7.0000. The coefficient table lists Coef., Std.Err., z, P>|z| and the [0.025, 0.975] interval for each variable, among them number_project, average_montly_hours, time_spend_company, promotion_last_5years, Department_marketing, Department_product_mng, Department_support and salary_medium.]
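To read the coefficient table, it helps to recall how a logit model turns coefficients into a probability: the linear predictor z = intercept + sum(coef * value) is passed through the sigmoid 1 / (1 + exp(-z)). The sketch below is only an illustration; the intercept, coefficients and feature values are hypothetical and are not the fitted values from this notebook.

```python
import math

def logit_probability(intercept, coefs, features):
    """Return P(left = 1) for one row under a logit model."""
    z = intercept + sum(coefs[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, chosen only for illustration.
coefs = {"number_project": -0.3, "time_spend_company": 0.25}

p = logit_probability(0.5, coefs, {"number_project": 4, "time_spend_company": 3})

# exp(coef) is the odds ratio: the factor by which the odds of leaving
# change for a one-unit increase in that variable.
odds_ratio = math.exp(coefs["time_spend_company"])
print(round(p, 4), round(odds_ratio, 3))
```

A negative coefficient lowers the predicted probability of leaving as the variable grows, and a positive one raises it.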
Defining a function that extracts the significant variables (p-value <= 0.05) from the fitted model.

    def get_significant_vars(lm):
        # Put the p-values in a dataframe and keep the variable names next to them
        var_p_vals_df = pd.DataFrame(lm.pvalues)
        var_p_vals_df["vars"] = var_p_vals_df.index
        var_p_vals_df.columns = ["pvals", "vars"]
        return list(var_p_vals_df[var_p_vals_df.pvals <= 0.05]["vars"])

Viewing the significant variables.

    significant_vars = get_significant_vars(logit_model)
    significant_vars

[Output: the list of significant variables, starting with const and ending with salary_medium.]

Now creating a new model, final_logit, with only the significant variables of the previous model.

    final_logit = sm.Logit(y_train, sm.add_constant(X_train[significant_vars])).fit()

[Output: Current function value: 0.425996]

Now checking the summary of the final model.

    final_logit.summary2()

[Output of final_logit.summary2(): Pseudo R-squared 0.224; Dependent Variable left; Date 2023-03-07 13:26; No. Observations 10499; Df Residuals 10488; LLR p-value 0.0000; Converged 1.0000; No. Iterations 7.0000.]
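The filtering step inside get_significant_vars can also be written directly on a Series of p-values. This is a minimal sketch, assuming the p-values are a pandas Series indexed by variable name (as statsmodels returns when the model is fitted on a dataframe); the p-values below are hypothetical, not the ones from the summary above.

```python
import pandas as pd

# Hypothetical p-values indexed by variable name (illustration only).
pvalues = pd.Series({
    "const": 0.001,
    "number_project": 0.0,
    "Department_marketing": 0.07,
    "salary_medium": 0.0001,
})

# Keep the names whose p-value is at or below the 5% level.
significant = list(pvalues[pvalues <= 0.05].index)
print(significant)
```

Note that this rule treats each dummy column separately, so some levels of a categorical variable can be kept while others are dropped.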
Creating a new dataframe, y_pred_df, which holds the actual values and the predicted probabilities for the test set.

    y_pred_df = pd.DataFrame({
        "actual": y_test,
        "predicted_prob": final_logit.predict(sm.add_constant(X_test[significant_vars])),
    })

Checking the predicted values.

    y_pred_df.sample(10, random_state=42)

[Output: a sample of rows with the columns actual and predicted_prob.]

Creating a column that converts each predicted probability into a class: 1 if the probability is greater than 0.50, else 0.

    y_pred_df["predicted"] = y_pred_df.predicted_prob.map(lambda x: 1 if x > 0.5 else 0)
    y_pred_df.sample(10, random_state=42)

[Output: a sample of rows with the columns actual, predicted_prob and predicted.]

Importing the libraries needed for plotting graphs.

    import matplotlib.pyplot as plt
    import seaborn as sn

Defining a function that draws a confusion matrix to evaluate the performance of the classification model. A confusion matrix is a table that summarizes the performance of a classification model by showing the number of true positives, true negatives, false positives, and false negatives.

    from sklearn import metrics

    def draw_cm(actual, predicted):
        cm = metrics.confusion_matrix(actual, predicted)
        sn.heatmap(cm, annot=True, fmt=".2f",
                   xticklabels=["Bad credit", "Good Credit"],
                   yticklabels=["Bad credit", "Good Credit"])
        plt.ylabel("True label")
        plt.xlabel("Predicted label")
        plt.show()

    draw_cm(y_pred_df.actual, y_pred_df.predicted)

[Figure: confusion matrix heatmap, True label against Predicted label.]

Now printing a classification report, which gives precision, recall, f1-score and support for each class.

    print(metrics.classification_report(y_pred_df.actual, y_pred_df.predicted))

[Output: the classification report table (precision, recall, f1-score, support) for both classes, with accuracy, macro avg and weighted avg rows.]

Plotting a histogram of the predicted probabilities for each class. The plot labels reuse "Bad Credit" for employees who left (actual == 1) and "Good Credit" for those who were retained (actual == 0).
    plt.figure(figsize=(8, 6))
    sn.distplot(y_pred_df[y_pred_df.actual == 1]["predicted_prob"],
                kde=False, color="b", label="Bad Credit")
    sn.distplot(y_pred_df[y_pred_df.actual == 0]["predicted_prob"],
                kde=False, color="g", label="Good Credit")
    plt.legend()
    plt.show()

[Figure: overlaid histograms of predicted_prob for the two classes.]

Defining a function that plots the Receiver Operating Characteristic (ROC) curve of a binary classification model and calculates the Area Under the Curve (AUC) score.

    def draw_roc(actual, probs):
        fpr, tpr, thresholds = metrics.roc_curve(actual, probs, drop_intermediate=False)
        auc_score = metrics.roc_auc_score(actual, probs)
        plt.figure(figsize=(8, 6))
        plt.plot(fpr, tpr, label="ROC curve (area = %0.2f)" % auc_score)
        plt.xlabel("False Positive Rate or [1 - True Negative Rate]")
        plt.ylabel("True Positive Rate")
        plt.legend(loc="lower right")
        plt.show()
        return fpr, tpr, thresholds

    fpr, tpr, thresholds = draw_roc(y_pred_df.actual, y_pred_df.predicted_prob)

[Figure: ROC curve, True Positive Rate against False Positive Rate.]

Now calculating the AUC score from the predicted probabilities and the actual labels.

    auc_score = metrics.roc_auc_score(y_pred_df.actual, y_pred_df.predicted_prob)
    round(float(auc_score), 2)

Creating a dataframe that contains the True Positive Rate (TPR), the False Positive Rate (FPR) and the threshold values, then sorting by the difference TPR - FPR to find the best threshold.

    tpr_fpr = pd.DataFrame({"tpr": tpr, "fpr": fpr, "thresholds": thresholds})
    tpr_fpr["diff"] = tpr_fpr.tpr - tpr_fpr.fpr
    tpr_fpr.sort_values("diff", ascending=False)[0:5]

[Output: the five rows with the largest diff, with the columns tpr, fpr, thresholds and diff.]

Now creating a new column, predicted_new, in y_pred_df, where the predicted class labels are based on a new threshold of 0.22 for the predicted probabilities.

    y_pred_df["predicted_new"] = y_pred_df.predicted_prob.map(lambda x: 1 if x > 0.22 else 0)
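The threshold search above (largest TPR - FPR, often called Youden's J) can be sketched without scikit-learn to show what the sorted table is doing. This is a simplified stand-in for metrics.roc_curve, not a replacement for it; the function name best_threshold and the eight toy observations are hypothetical.

```python
import numpy as np

def best_threshold(actual, probs):
    """Return (threshold, tpr - fpr) for the cut-off that maximises tpr - fpr."""
    actual = np.asarray(actual)
    probs = np.asarray(probs)
    positives = (actual == 1).sum()
    negatives = (actual == 0).sum()
    best = None
    # Every distinct predicted probability is a candidate cut-off.
    for t in np.unique(probs):
        predicted = probs >= t  # roc_curve also classifies with score >= threshold
        tpr = (predicted & (actual == 1)).sum() / positives
        fpr = (predicted & (actual == 0)).sum() / negatives
        if best is None or tpr - fpr > best[1]:
            best = (float(t), float(tpr - fpr))
    return best

# Hypothetical labels and probabilities (illustration only).
actual = [0, 0, 0, 0, 1, 1, 0, 1]
probs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.40, 0.60, 0.90]

threshold, diff = best_threshold(actual, probs)
print(threshold, round(diff, 2))
```

On this toy data the best cut-off falls well below 0.5, which is the same kind of situation that leads the notebook to move from 0.50 to 0.22.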
Again plotting the confusion matrix, now with the labels from the new threshold.

    draw_cm(y_pred_df.actual, y_pred_df.predicted_new)

[Figure: confusion matrix heatmap for the 0.22 threshold.]

Again printing the classification report to check precision and recall at the new threshold.

    print(metrics.classification_report(y_pred_df.actual, y_pred_df.predicted_new))
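The effect of moving the cut-off from 0.50 to 0.22 can be checked by hand on a small example. The sketch below counts the confusion-matrix cells and derives precision and recall for the positive class; the helper names and the eight toy observations are hypothetical, and the figures are not the notebook's results.

```python
def confusion_counts(actual, probs, threshold):
    """Count (tp, fp, tn, fn), labelling a row 1 when prob > threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(actual, probs):
        pred = 1 if p > threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def precision_recall(tp, fp, tn, fn):
    """Precision and recall for the positive class (1 = left)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labels and probabilities (illustration only).
actual = [0, 0, 0, 0, 1, 1, 0, 1]
probs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.40, 0.60, 0.90]

at_050 = precision_recall(*confusion_counts(actual, probs, 0.50))
at_022 = precision_recall(*confusion_counts(actual, probs, 0.22))
print(at_050, at_022)
```

Lowering the threshold can only keep or increase recall for the positive class; precision may move in either direction depending on the data, so both should be compared in the two classification reports.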
