ML Implementation Diabetes Dataset

11/11/23, 12:28 PM    ML-Diabetes_dataset - Jupyter Notebook

We will use the Pima Indians Diabetes dataset to predict whether a person has diabetes, based on features such as blood pressure, skin thickness and age.

Importing the Libraries

In [1]:
    import pandas as pd
    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt
    import plotly.express as px
    import warnings
    warnings.filterwarnings("ignore")
    %matplotlib inline

In [2]:
    df = pd.read_csv("pima-indians-diabetes.csv",
                     names=['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
                            'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome'])
    df.head()

Out[2]:
       Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction
    0            6      148             72             35        0  33.6                     0.627
    1            1       85             66             29        0  26.6                     0.351
    2            8      183             64              0        0  23.3                     0.672
    3            1       89             66             23       94  28.1                     0.167
    4            0      137             40             35      168  43.1                     2.288

Checking the null values

In [3]:
    df.isnull().sum()

Out[3]:
    Pregnancies                 0
    Glucose                     0
    BloodPressure               0
    SkinThickness               0
    Insulin                     0
    BMI                         0
    DiabetesPedigreeFunction    0
    Age                         0
    Outcome                     0
    dtype: int64

There are no null values.

Statistical Analysis

In [4]:
    df.describe()

Out[4]:
           Pregnancies     Glucose  BloodPressure  SkinThickness     Insulin         BMI
    count   768.000000  768.000000     768.000000     768.000000  768.000000  768.000000
    mean      3.845052  120.894531      69.105469      20.536458   79.799479   31.992578
    std       3.369578   31.972618      19.355807      15.952218  115.244002    7.884160
    min       0.000000    0.000000       0.000000       0.000000    0.000000    0.000000
    25%       1.000000   99.000000      62.000000       0.000000    0.000000   27.300000
    50%       3.000000  117.000000      72.000000      23.000000   30.500000   32.000000
    75%       6.000000  140.250000      80.000000      32.000000  127.250000   36.600000
    max      17.000000  199.000000     122.000000      99.000000  846.000000   67.100000

Checking the distribution of the dataset

In [5]:
    plt.figure(figsize=(20,15), facecolor='white')
    plot_num = 1
    for column in df:
        if plot_num <= 9:  # number of columns is 9
            ax = plt.subplot(3,3,plot_num)
            sns.distplot(df[column])
            plt.xlabel(column, fontsize=20)
        plot_num += 1
    plt.show()

[Figure: 3x3 grid of distribution plots, one per column: Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age, Outcome]

The data shows some skewness. Also, a few rows in the Glucose, Insulin, SkinThickness, BMI and BloodPressure columns have a value of 0. Zero is not a plausible measurement for these features, so we treat it as a missing value.

In [6]:
    df.columns

Out[6]:
    Index(['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
           'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome'],
          dtype='object')

In [7]:
    # replacing zero values with the mean of the column
    df['BMI'] = df['BMI'].replace(0, df['BMI'].mean())
    df['BloodPressure'] = df['BloodPressure'].replace(0, df['BloodPressure'].mean())
    df['Glucose'] = df['Glucose'].replace(0, df['Glucose'].mean())
    df['Insulin'] = df['Insulin'].replace(0, df['Insulin'].mean())
    df['SkinThickness'] = df['SkinThickness'].replace(0, df['SkinThickness'].mean())

Distribution of data after replacing zero values with mean

In [8]:
    plt.figure(figsize=(20,15), facecolor='white')
    plot_num = 1
    for column in df:
        if plot_num <= 9:  # number of columns is 9
            ax = plt.subplot(3,3,plot_num)
            sns.distplot(df[column])
            plt.xlabel(column, fontsize=20)
        plot_num += 1
    plt.show()

[Figure: 3x3 grid of distribution plots for the same nine columns after the zero values were replaced]

So we have dealt with the zero values.
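The zero-replacement step can also be written as a loop over the affected columns. The following is only a minimal sketch, not the notebook's own code: `impute_zeros_with_mean` is a hypothetical helper, the `toy` frame is made-up data so the block runs without the CSV, and the `exclude_zeros` flag is an added assumption. With the default setting it behaves like In [7], where the mean is computed with the zeros still included.

```python
import pandas as pd


def impute_zeros_with_mean(df, columns, exclude_zeros=False):
    """Return a copy of df where zeros in `columns` are replaced by the column mean.

    Hypothetical helper for illustration. With exclude_zeros=False the mean
    includes the zeros, as in In [7]; with exclude_zeros=True the mean is
    taken over the non-zero values only (an assumption, not from the notebook).
    """
    out = df.copy()
    for col in columns:
        if exclude_zeros:
            mean = out.loc[out[col] != 0, col].mean()
        else:
            mean = out[col].mean()
        out[col] = out[col].replace(0, mean)
    return out


# Made-up data; float columns keep the dtype unchanged after replacement.
toy = pd.DataFrame({
    "Glucose": [0.0, 100.0, 200.0],
    "BMI": [30.0, 0.0, 60.0],
    "Outcome": [0, 1, 0],
})

filled = impute_zeros_with_mean(toy, ["Glucose", "BMI"])
filled_nonzero = impute_zeros_with_mean(toy, ["Glucose", "BMI"], exclude_zeros=True)
```

On the real data the call would look like `impute_zeros_with_mean(df, ['BMI', 'BloodPressure', 'Glucose', 'Insulin', 'SkinThickness'])`. Averaging over non-zero values only gives a larger fill value, because the zeros no longer pull the mean down; which variant is preferable depends on the analysis.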
But there are still some outliers in the dataset.

In [9]:
    fig, ax = plt.subplots(figsize=(15,10))
    sns.boxplot(data=df, width=0.5, ax=ax, fliersize=3)

Out[9]:
[Figure: box plots of all nine columns: Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age, Outcome]

Removing some amount of outliers from the dataset

In [10]:
    q = df['Pregnancies'].quantile(0.98)
    # we are removing the top 2% data from the Pregnancies column
    df_cleaned = df[df['Pregnancies']
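The last cell is cut off in the source, so the exact filter is not known. As a hedged sketch of the idea described in its comment (dropping the top 2% of a column), the block below keeps only rows strictly below the 0.98 quantile. The strict `<` comparison, the helper name `drop_above_quantile` and the `toy` data are all assumptions made for illustration.

```python
import pandas as pd


def drop_above_quantile(df, column, q=0.98):
    """Keep rows whose value in `column` is strictly below the q-th quantile.

    Hypothetical helper; the strict '<' is an assumption, since the
    original notebook cell is truncated.
    """
    cutoff = df[column].quantile(q)
    return df[df[column] < cutoff]


# Made-up data: the integers 0..100, so the 0.98 quantile is exactly 98.
toy = pd.DataFrame({"Pregnancies": list(range(101))})
cleaned = drop_above_quantile(toy, "Pregnancies", 0.98)
```

The helper returns a filtered view of the rows and leaves the input frame untouched, so it can be applied to one column after another, each time on the result of the previous call.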