Data Science and Analytics with Python

1st Edition Jesus Rogel-Salazar


DATA SCIENCE
AND ANALYTICS
WITH PYTHON
Chapman & Hall/CRC
Data Mining and Knowledge Discovery Series
SERIES EDITOR
Vipin Kumar
University of Minnesota
Department of Computer Science and Engineering
Minneapolis, Minnesota, U.S.A.

AIMS AND SCOPE


This series aims to capture new developments and applications in data mining and knowledge
discovery, while summarizing the computational tools and techniques useful in data analysis.
This series encourages the integration of mathematical, statistical, and computational methods
and techniques through the publication of a broad range of textbooks, reference works,
and handbooks. The inclusion of concrete examples and applications is highly encouraged. The
scope of the series includes, but is not limited to, titles in the areas of data mining and knowledge
discovery methods and applications, modeling, algorithms, theory and foundations, data and
knowledge visualization, data mining systems and tools, and privacy and security issues.

PUBLISHED TITLES
ACCELERATING DISCOVERY: MINING UNSTRUCTURED INFORMATION FOR
HYPOTHESIS GENERATION
Scott Spangler
ADVANCES IN MACHINE LEARNING AND DATA MINING FOR ASTRONOMY
Michael J. Way, Jeffrey D. Scargle, Kamal M. Ali, and Ashok N. Srivastava
BIOLOGICAL DATA MINING
Jake Y. Chen and Stefano Lonardi
COMPUTATIONAL BUSINESS ANALYTICS
Subrata Das
COMPUTATIONAL INTELLIGENT DATA ANALYSIS FOR SUSTAINABLE
DEVELOPMENT
Ting Yu, Nitesh V. Chawla, and Simeon Simoff
COMPUTATIONAL METHODS OF FEATURE SELECTION
Huan Liu and Hiroshi Motoda
CONSTRAINED CLUSTERING: ADVANCES IN ALGORITHMS, THEORY,
AND APPLICATIONS
Sugato Basu, Ian Davidson, and Kiri L. Wagstaff
CONTRAST DATA MINING: CONCEPTS, ALGORITHMS, AND APPLICATIONS
Guozhu Dong and James Bailey
DATA CLASSIFICATION: ALGORITHMS AND APPLICATIONS
Charu C. Aggarwal
DATA CLUSTERING: ALGORITHMS AND APPLICATIONS
Charu C. Aggarwal and Chandan K. Reddy
DATA CLUSTERING IN C++: AN OBJECT-ORIENTED APPROACH
Guojun Gan
DATA MINING: A TUTORIAL-BASED PRIMER, SECOND EDITION
Richard J. Roiger
DATA MINING FOR DESIGN AND MARKETING
Yukio Ohsawa and Katsutoshi Yada
DATA MINING WITH R: LEARNING WITH CASE STUDIES, SECOND EDITION
Luís Torgo
DATA SCIENCE AND ANALYTICS WITH PYTHON
Jesus Rogel-Salazar
EVENT MINING: ALGORITHMS AND APPLICATIONS
Tao Li
FOUNDATIONS OF PREDICTIVE ANALYTICS
James Wu and Stephen Coggeshall
GEOGRAPHIC DATA MINING AND KNOWLEDGE DISCOVERY,
SECOND EDITION
Harvey J. Miller and Jiawei Han
GRAPH-BASED SOCIAL MEDIA ANALYSIS
Ioannis Pitas
HANDBOOK OF EDUCATIONAL DATA MINING
Cristóbal Romero, Sebastian Ventura, Mykola Pechenizkiy, and Ryan S.J.d. Baker
HEALTHCARE DATA ANALYTICS
Chandan K. Reddy and Charu C. Aggarwal
INFORMATION DISCOVERY ON ELECTRONIC HEALTH RECORDS
Vagelis Hristidis
INTELLIGENT TECHNOLOGIES FOR WEB APPLICATIONS
Priti Srinivas Sajja and Rajendra Akerkar
INTRODUCTION TO PRIVACY-PRESERVING DATA PUBLISHING: CONCEPTS AND
TECHNIQUES
Benjamin C. M. Fung, Ke Wang, Ada Wai-Chee Fu, and Philip S. Yu
KNOWLEDGE DISCOVERY FOR COUNTERTERRORISM AND
LAW ENFORCEMENT
David Skillicorn
KNOWLEDGE DISCOVERY FROM DATA STREAMS
João Gama
LARGE-SCALE MACHINE LEARNING IN THE EARTH SCIENCES
Ashok N. Srivastava, Ramakrishna Nemani, and Karsten Steinhaeuser
MACHINE LEARNING AND KNOWLEDGE DISCOVERY FOR
ENGINEERING SYSTEMS HEALTH MANAGEMENT
Ashok N. Srivastava and Jiawei Han
MINING SOFTWARE SPECIFICATIONS: METHODOLOGIES AND APPLICATIONS
David Lo, Siau-Cheng Khoo, Jiawei Han, and Chao Liu
MULTIMEDIA DATA MINING: A SYSTEMATIC INTRODUCTION TO
CONCEPTS AND THEORY
Zhongfei Zhang and Ruofei Zhang
MUSIC DATA MINING
Tao Li, Mitsunori Ogihara, and George Tzanetakis
NEXT GENERATION OF DATA MINING
Hillol Kargupta, Jiawei Han, Philip S. Yu, Rajeev Motwani, and Vipin Kumar
RAPIDMINER: DATA MINING USE CASES AND BUSINESS ANALYTICS APPLICATIONS
Markus Hofmann and Ralf Klinkenberg
RELATIONAL DATA CLUSTERING: MODELS, ALGORITHMS,
AND APPLICATIONS
Bo Long, Zhongfei Zhang, and Philip S. Yu
SERVICE-ORIENTED DISTRIBUTED KNOWLEDGE DISCOVERY
Domenico Talia and Paolo Trunfio
SPECTRAL FEATURE SELECTION FOR DATA MINING
Zheng Alan Zhao and Huan Liu
STATISTICAL DATA MINING USING SAS APPLICATIONS, SECOND EDITION
George Fernandez
SUPPORT VECTOR MACHINES: OPTIMIZATION BASED THEORY, ALGORITHMS,
AND EXTENSIONS
Naiyang Deng, Yingjie Tian, and Chunhua Zhang
TEMPORAL DATA MINING
Theophano Mitsa
TEXT MINING: CLASSIFICATION, CLUSTERING, AND APPLICATIONS
Ashok N. Srivastava and Mehran Sahami
TEXT MINING AND VISUALIZATION: CASE STUDIES USING OPEN-SOURCE TOOLS
Markus Hofmann and Andrew Chisholm
THE TOP TEN ALGORITHMS IN DATA MINING
Xindong Wu and Vipin Kumar
UNDERSTANDING COMPLEX DATASETS: DATA MINING WITH MATRIX
DECOMPOSITIONS
David Skillicorn
DATA SCIENCE
AND ANALYTICS
WITH PYTHON

Jesús Rogel-Salazar

Boca Raton London New York

CRC Press is an imprint of the Taylor & Francis Group, an informa business
A CHAPMAN & HALL BOOK
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2017 by Taylor & Francis Group, LLC


CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper


Version Date: 20170517

International Standard Book Number-13: 978-1-498-74209-2 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to
publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials
or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If
any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any
form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming,
and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400.
CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been
granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To A. J. Johnson and Prof. Bowman

Thanks to Alan M Turing for opening up my mind
Contents

1 Trials and Tribulations of a Data Scientist
1.1 Data? Science? Data Science!
1.1.1 So, What Is Data Science?
1.2 The Data Scientist: A Modern Jackalope
1.2.1 Characteristics of a Data Scientist and a Data Science Team
1.3 Data Science Tools
1.3.1 Open Source Tools
1.4 From Data to Insight: the Data Science Workflow
1.4.1 Identify the Question
1.4.2 Acquire Data
1.4.3 Data Munging
1.4.4 Modelling and Evaluation
1.4.5 Representation and Interaction
1.4.6 Data Science: an Iterative Process
1.5 Summary

2 Python: For Something Completely Different
2.1 Why Python? Why not?!
2.1.1 To Shell or not To Shell
2.1.2 iPython/Jupyter Notebook
2.2 First Slithers with Python
2.2.1 Basic Types
2.2.2 Numbers
2.2.3 Strings
2.2.4 Complex Numbers
2.2.5 Lists
2.2.6 Tuples
2.2.7 Dictionaries
2.3 Control Flow
2.3.1 if... elif... else
2.3.2 while
2.3.3 for
2.3.4 try... except
2.3.5 Functions
2.3.6 Scripts and Modules
2.4 Computation and Data Manipulation
2.4.1 Matrix Manipulations and Linear Algebra
2.4.2 NumPy Arrays and Matrices
2.4.3 Indexing and Slicing
2.5 Pandas to the Rescue
2.6 Plotting and Visualising: Matplotlib
2.7 Summary

3 The Machine that Goes “Ping”: Machine Learning and Pattern Recognition
3.1 Recognising Patterns
3.2 Artificial Intelligence and Machine Learning
3.3 Data is Good, but other Things are also Needed
3.4 Learning, Predicting and Classifying
3.5 Machine Learning and Data Science
3.6 Feature Selection
3.7 Bias, Variance and Regularisation: A Balancing Act
3.8 Some Useful Measures: Distance and Similarity
3.9 Beware the Curse of Dimensionality
3.10 Scikit-Learn is our Friend
3.11 Training and Testing
3.12 Cross-Validation
3.12.1 k-fold Cross-Validation
3.13 Summary

4 The Relationship Conundrum: Regression
4.1 Relationships between Variables: Regression
4.2 Multivariate Linear Regression
4.3 Ordinary Least Squares
4.3.1 The Maths Way
4.4 Brain and Body: Regression with One Variable
4.4.1 Regression with Scikit-learn
4.5 Logarithmic Transformation
4.6 Making the Task Easier: Standardisation and Scaling
4.6.1 Normalisation or Unit Scaling
4.6.2 z-Score Scaling
4.7 Polynomial Regression
4.7.1 Multivariate Regression
4.8 Variance-Bias Trade-Off
4.9 Shrinkage: LASSO and Ridge
4.10 Summary

5 Jackalopes and Hares: Clustering
5.1 Clustering
5.2 Clustering with k-means
5.2.1 Cluster Validation
5.2.2 k-means in Action
5.3 Summary

6 Unicorns and Horses: Classification
6.1 Classification
6.1.1 Confusion Matrices
6.1.2 ROC and AUC
6.2 Classification with KNN
6.2.1 KNN in Action
6.3 Classification with Logistic Regression
6.3.1 Logistic Regression Interpretation
6.3.2 Logistic Regression in Action
6.4 Classification with Naïve Bayes
6.4.1 Naïve Bayes Classifier
6.4.2 Naïve Bayes in Action
6.5 Summary

7 Decisions, Decisions: Hierarchical Clustering, Decision Trees and Ensemble Techniques
7.1 Hierarchical Clustering
7.1.1 Hierarchical Clustering in Action
7.2 Decision Trees
7.2.1 Decision Trees in Action
7.3 Ensemble Techniques
7.3.1 Bagging
7.3.2 Boosting
7.3.3 Random Forests
7.3.4 Stacking and Blending
7.4 Ensemble Techniques in Action
7.5 Summary

8 Less is More: Dimensionality Reduction
8.1 Dimensionality Reduction
8.2 Principal Component Analysis
8.2.1 PCA in Action
8.2.2 PCA in the Iris Dataset
8.3 Singular Value Decomposition
8.3.1 SVD in Action
8.4 Recommendation Systems
8.4.1 Content-Based Filtering in Action
8.4.2 Collaborative Filtering in Action
8.5 Summary

9 Kernel Tricks up the Sleeve: Support Vector Machines
9.1 Support Vector Machines and Kernel Methods
9.1.1 Support Vector Machines
9.1.2 The Kernel Trick
9.1.3 SVM in Action: Regression
9.1.4 SVM in Action: Classification
9.2 Summary

Pipelines in Scikit-Learn
Bibliography
Index

The book shows the use of computer code by enclosing it in a box as follows:

> 1 + 1 # Example of computer code

We have made use of a diple (>) to denote the command line terminal prompt
shown in the Python shell. Please note that the same commands can be used in
the iPython interactive shell or iPython/Jupyter notebook, although the look
and feel may be quite different. As you may have already noticed, the book
uses margin notes, such as the one that appears to the right of this
paragraph, to highlight certain areas or commands, as well as to provide some
useful comments.

[Margin note: This is an example of the margin notes used throughout this book.]
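
As a rough illustration of the prompt convention described above, the boxed expression can also be saved in a script and run from there. This is only a sketch: the variable name `result` is introduced here for illustration and does not come from the book. In a script there is no prompt and no automatic echo of the value, so the result is printed explicitly.

```python
# Sketch only: the boxed expression written as a script instead of being
# typed after the shell prompt. `result` is a name chosen for illustration.
result = 1 + 1  # Example of computer code

# The interactive shell echoes the value of an expression automatically;
# a script has to print it explicitly.
print(result)
```

Typing `1 + 1` after the prompt in an interactive shell displays the same value without the call to `print`.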

The book is organised in a way that individual chapters are sufficiently
independent from each other so that the reader is comfortable using the
contents as a reference rather than a textbook. Inevitably, there will be
occasions where certain topics make reference to other parts of the book and
I will point out when that may be the case. I would also like to take this
opportunity to mention that the implementations presented are by no means the
only or best way to do things. Programming is pretty similar to the creative
process of writing: the fact that you have a set of words does not imply that
we all write reports in a poetic manner. I would be delighted to hear from
you all about the implementations and changes you make to the code presented
here. Do get in touch!

[Margin note: Programming is a creative process, and as such there is more than one way to do things.]
1. Good King Wenceslas looked out
On the Feast of Stephen,
When the snow lay round about,
Deep and crisp and even.
Brightly shone the moon that night,
Though the frost was cruel,
When a poor man came in sight,
Gathering winter fuel.

2. “Hither, page, and stand by me
If thou know’st it, telling,
Yonder peasant, who is he?
Where and what his dwelling?”
“Sire, he lives a good league hence
Underneath the mountain;
Right against the forest fence,
By St. Agnes’ fountain.”

3. “Bring me flesh, and bring me wine,
Bring me pine-logs hither;
Thou and I will see him dine,
When we bear them thither.”
Page and monarch forth they went,
Forth they went together
Through the rude wind’s wild lament
And the bitter weather.

4. “Sire! the night is darker now,
And the wind blows stronger;
Fails my heart, I know not how,
I can go no longer.”
“Mark my footsteps, good my page;
Tread thou in them boldly;
Thou shalt find the winter’s rage
Freeze thy blood less coldly.”

5. In his master’s steps he trod
Where the snow lay dinted;
Heat was in the very sod
Which the saint had printed.
Therefore Christian men, be sure,
Wealth or rank possessing,
Ye who now will bless the poor
Shall yourselves find blessing.
As Joseph was a-walking

1. As Joseph was a-walking
He heard an angel sing,
“This night shall be the birthtime
Of Christ, the Heavenly King.

2. “He neither shall be born
In housen nor in hall.
Nor in the place of Paradise,
But in an ox’s stall.

3. “He neither shall be clothed
In purple nor in pall,
But in the fair white linen
That usen babies all.

4. “He neither shall be rocked
In silver nor in gold,
But in a wooden manger
That resteth on the mould.”

5. As Joseph was a-walking,
There did an angel sing;
And Mary’s child at midnight
Was born to be our King.

6. Then be ye glad, good people,
This night of all the year,
And light ye up your candles,
For His star it shineth clear.
Christmas Day in the Morning

1. I saw three ships come sailing in,
On Christmas Day, on Christmas Day;
I saw three ships come sailing in,
On Christmas Day in the morning.

2. And what was in those ships all three
On Christmas Day, on Christmas Day;
And what was in those ships all three
On Christmas Day in the morning?

3. Our Saviour Christ and His lady
On Christmas Day, on Christmas Day;
Our Saviour Christ and His lady
On Christmas Day in the morning.

4. Pray whither sailed those ships all three
On Christmas Day, on Christmas Day;
Pray whither sailed those ships all three
On Christmas Day in the morning?

5. O they sailed into Bethlehem
On Christmas Day, on Christmas Day;
O they sailed into Bethlehem
On Christmas Day in the morning.

6. And all the bells on earth shall ring
On Christmas Day, on Christmas Day;
And all the bells on earth shall ring
On Christmas Day in the morning.

7. And all the angels in Heaven shall sing
On Christmas Day, on Christmas Day;
And all the angels in Heaven shall sing
On Christmas Day in the morning.

8. And all the souls on earth shall sing
On Christmas Day, on Christmas Day;
And all the souls on earth shall sing
On Christmas Day in the morning.

9. Then let us all rejoice amain
On Christmas Day, on Christmas Day;
Then let us all rejoice amain
On Christmas Day in the morning.
God rest you merry, Gentlemen!
1. God rest you merry, Gentlemen!
Let nothing you dismay;
Remember Christ our Saviour
Was born upon this day,
To save us all from Satan’s power
When we were gone astray.
O tidings of comfort and joy,
O tidings of comfort and joy.

2. In Bethlehem in Jury
This blessèd Babe was born,
And laid within a manger
Upon this blessèd morn:
The which His Mother Mary
Nothing did take in scorn.
O tidings of comfort and joy,
O tidings of comfort and joy.

3. From God, our Heavenly Father,
A blessèd angel came,
And unto certain shepherds
Brought tidings of the same,
How that in Bethlehem was born
The Son of God by name.
O tidings of comfort and joy,
O tidings of comfort and joy.

4. “Fear not,” then said the angel,
“Let nothing you affright:
This day is born a Saviour
Of virtue, power, and might;
So frequently to vanquish all
The friends of Satan quite.”
O tidings of comfort and joy,
O tidings of comfort and joy.

5. The shepherds at those tidings
Rejoiced much in mind,
And left their flocks a-feeding
In tempest, storm, and wind,
And went to Bethlehem straightway
This blessèd Babe to find.
O tidings of comfort and joy,
O tidings of comfort and joy.

6. But when to Bethlehem they came,
Where this dear Infant lay,
They found Him in a manger
Where oxen feed on hay;
His mother Mary, kneeling,
Unto the Lord did pray.
O tidings of comfort and joy,
O tidings of comfort and joy.

7. Now to the Lord sing praises,
All you within this place,
And with true love and brotherhood
Each other now embrace;
This holy-tide of Christmas
All others doth efface.
O tidings of comfort and joy,
O tidings of comfort and joy.
The Holy Well
1. As it fell out one May morning,
On a bright holiday,
Sweet Jesus ask’d His mother dear,
If He might go to play.
“To play, to play, sweet Jesus go.
And to play now get you gone,
And let me hear of no complaints,
At night when you come home.”

2. Sweet Jesus went down to yonder town,
As far as the Holy Well,
And there did see as fine children
As any tongue can tell.
He said, “God bless you ev’ry one,
May Christ your portion be:
Little children, shall I play with you?
And you shall play with Me.”

3. But they made answer to Him, “No,”
They were lords’ and ladies’ sons;
And He the meanest of them all,
Was born in an ox’s stall.
Sweet Jesus turned Him around,
And He neither laugh’d nor smil’d,
But the tears came trickling from His eyes
Like water from the skies.

4. Sweet Jesus turned Him about,
To His mother’s dear home went He,
And said “I’ve been in yonder town,
As after you may see.
I’ve been in yonder town,
As far as the Holy Well;
There did I meet as fine children
As any tongue can tell.

5. I bid God bless them ev’ry one,
