
Data Visualization Using Python

Dr. Muhammad Hanif


Department of Computer Science, Electrical and Space Engineering
Lulea University of Technology, Sweden
Data Scientist in 2D….

https://www.slideshare.net/joshwills/production-machine-learninginfrastructure
Data Scientist in 3D….

http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Data Scientist in 5D….

https://speakerdeck.com/chdoig/the-state-of-python-for-data-science-pyss
Data Scientists' Responsibilities

http://berkeleysciencereview.com/article/first-rule-data-science/
Why Python?
• General purpose
• IPython
• Popular and mature (in both API and community support)
• Glue language (high-level APIs, low-level C/Fortran bindings)
• Science ecosystem (growing!)


Python’s Popularity: Widespread Knowledge and Many Tools

https://becominghuman.ai/top-20-most-popular-programming-languages-for-2021-and-beyond-735ee8370c61
Python’s Popularity: Widespread Knowledge and Many Tools

https://yalantis.com/blog/top-10-programming-languages/
Avoid the Two-Language Problem
Python’s Usage: Spread Over Whole Data Science Workflow

https://speakerdeck.com/chdoig/the-state-of-python-for-data-science-pyss-2015
One day at FB’s Data Science: a member could…

(a) author a multistage processing pipeline in Python, design a hypothesis test,
(b) perform a regression analysis over a data sample with R,
(c) design and implement an algorithm for some data-intensive service in Hadoop,
or (d) communicate the results of our analysis.

Jeff Hammerbacher

http://berkeleysciencereview.com/scienti%EF%AC%81c-collaborations-uc-berkeley-data-driven-cover/
Python Fits All!
Python: Tools
• Interactivity / Collaboration
    ◦ IPython
    ◦ Jupyter

• Data Wrangling / Analysis
    ◦ NumPy
    ◦ pandas

• Data Visualization
    ◦ Matplotlib
    ◦ Seaborn etc.
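
As a rough sketch of how these tools fit together, the snippet below uses pandas to summarise a small table, NumPy for one numeric reduction, and Matplotlib to draw the result. The city names and temperature values are made up purely for illustration; they are not data from this deck.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up illustrative readings (an assumption, not real measurements).
temps = pd.DataFrame({
    "city": ["Lulea", "Lulea", "Lulea", "Stockholm", "Stockholm", "Stockholm"],
    "temp_c": [-10.0, -8.0, -6.0, -2.0, 0.0, 2.0],
})

# Data wrangling with pandas: mean temperature per city.
mean_by_city = temps.groupby("city")["temp_c"].mean()

# Numeric analysis with NumPy: range (max minus min) over all readings.
spread = np.ptp(temps["temp_c"].to_numpy())

# Visualization with Matplotlib, driven directly by the pandas result.
fig, ax = plt.subplots()
mean_by_city.plot(kind="bar", ax=ax)
ax.set_ylabel("Mean temperature (°C)")
ax.set_title(f"Mean temperature per city (overall range: {spread:.0f} °C)")
```

In a Jupyter notebook the `matplotlib.use("Agg")` line can be dropped, since the figure is then rendered inline.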
Why Visualize?
Visualize to Analyze
• Patterns
• Correlation
• Trends
Make decisions based on a massive dataset IN ONE LOOK
Visualize to Discover
Interactive Visualization: Lets You Discover Information

https://ocean.sagepub.com/blog/tools-and-tech/turning-covid-19-into-a-data-visualization-exercise-for-your-students
Visualize to Support a Story
Visualize to Tell a Story by Itself
Distribution of Global Wealth

http://news.bbc.co.uk/2/shared/spl/hi/guides/457000/457022/html/nn5page1.stm
Visualize to Teach
Our brain is often said to process visuals 60,000 times faster than text

https://twitter.com/omnivex/status/1126879918804094976
Python Libraries for Data Visualization
Data Science Is Becoming Important for the Python Community

6 of the 25 most popular libraries are for data science


https://www.python.org/dev/peps/pep-0465/#but-isn-t-matrix-multiplication-a-pretty-niche-requirement
Science Stack is Getting Better Each Day

https://speakerdeck.com/chdoig/the-state-of-python-for-data-science-pyss-2015
Matplotlib

• A Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.

• The forerunner among Python data visualization libraries.

• “is extremely powerful but with that power comes complexity.”
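
A minimal sketch of the Matplotlib workflow: build a figure, label it, and export it to more than one hardcopy format. The curves, labels, and buffer names are illustrative choices, and the figure is written to in-memory buffers only so the example leaves no files behind; passing a file name to `savefig` works the same way.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)

# One figure with one set of axes and two labelled curves.
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), linestyle="--", label="cos(x)")
ax.set_xlabel("x (radians)")
ax.set_ylabel("value")
ax.set_title("Sine and cosine")
ax.legend()

# Export the same figure as a raster image (PNG) and a vector document (PDF).
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png", dpi=300)
pdf_buffer = io.BytesIO()
fig.savefig(pdf_buffer, format="pdf")
```

Even this small plot needs several explicit calls for labels and the legend, which hints at the complexity the quote refers to.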
Matplotlib

https://realpython.com/python-matplotlib-guide/
Seaborn

• Harnesses the power of matplotlib to create beautiful charts in a few lines of code.

• The key difference is that Seaborn’s default styles and color palettes are designed to be more aesthetically pleasing and modern.
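
A short sketch of the "few lines of code" idea, assuming Seaborn 0.11 or later (for `set_theme`). The data frame is synthetic, generated from a fixed random seed, and its column names (`hours`, `score`, `group`) are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
df = pd.DataFrame({"hours": rng.uniform(0, 10, 60)})
df["score"] = 50 + 4 * df["hours"] + rng.normal(0, 5, 60)
df["group"] = np.where(df["hours"] > 5, "high", "low")

# Apply Seaborn's default style and palette, then draw a coloured scatter plot.
sns.set_theme()
ax = sns.scatterplot(data=df, x="hours", y="score", hue="group")
ax.set_title("Score versus hours (synthetic data)")
```

Seaborn returns an ordinary Matplotlib `Axes`, so the plot can still be adjusted with Matplotlib calls afterwards.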
Seaborn

https://blog.insightdatascience.com/data-visualization-in-python-advanced-functionality-in-seaborn-20d217f1a9a6
ggplot

• A plotting system for Python based on R's ggplot2 and the Grammar of Graphics.

• Layer components to create a complete plot.

ggplot
Bokeh

• Also based on the Grammar of Graphics, but unlike ggplot, it is native to Python, not ported over from R.

• Supports streaming and real-time data.
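
A minimal Bokeh sketch, assuming a recent Bokeh release: glyphs are layered onto a figure and the result is rendered to a standalone HTML page. The data and titles are illustrative, and the example covers only a static plot; streaming is not shown here.

```python
import numpy as np
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

x = np.linspace(0, 10, 100)
y = np.sin(x)

# Create a figure, then layer two glyphs on it: a line and a few markers.
plot = figure(title="Sine wave", x_axis_label="x", y_axis_label="sin(x)")
plot.line(x, y, line_width=2, legend_label="sin(x)")
plot.scatter(x[::10], y[::10], size=8)

# Render to a self-contained HTML string that loads BokehJS from the CDN.
html = file_html(plot, CDN, "Bokeh sketch")
```

Writing `html` to a file and opening it in a browser gives the interactive pan and zoom tools without running a server.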


Bokeh

https://www.kaggle.com/pavansanagapati/pandas-bokeh-visualization-tutorial
pygal

• Offers interactive plots that can be embedded in the web browser.
    ◦ Its prime differentiator is the ability to output charts as SVGs.

• Each chart type is packaged into a method and the built-in styles are pretty,
    ◦ so it’s easy to create a nice-looking chart in a few lines of code.
pygal

https://www.pluralsight.com/guides/charts-in-pygal
plotly

• Its forte is making interactive plots, but it also offers some charts you won’t find in most libraries, like contour plots, dendrograms, and 3D charts.
plotly

https://pypi.org/project/plotly/
geoplotlib
• A toolbox for creating maps and plotting geographical data.

• You can use it to create a variety of map types, like choropleths, heatmaps, and dot density maps.
geoplotlib

https://www.pluralsight.com/guides/building-geoplots-with-geoplotlib
