Matplotlib (By Devert Alexandre)
Matplotlib
A tutorial
Devert Alexandre
School of Software Engineering of USTC
Table of Contents
First steps
Curve plots
Scatter plots
Boxplots
Histograms
Usage example
Curve plot
Let's plot a curve
    import math
    import matplotlib.pyplot as plt

    # Generate a sinusoid
    nbSamples = 256
    xRange = (-math.pi, math.pi)
    x, y = [], []
    for n in range(nbSamples):
        k = (n + 0.5) / nbSamples
        x.append(xRange[0] + (xRange[1] - xRange[0]) * k)
        y.append(math.sin(x[-1]))

    # Plot the sinusoid
    plt.plot(x, y)
    plt.show()
This will show you something like this:
[Figure: the sinusoid, with y ranging from -1.0 to 1.0]
numpy
matplotlib can work with numpy arrays
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Generate a sinusoid
    nbSamples = 256
    xRange = (-math.pi, math.pi)
    x, y = numpy.zeros(nbSamples), numpy.zeros(nbSamples)
    for n in range(nbSamples):
        k = (n + 0.5) / nbSamples
        x[n] = xRange[0] + (xRange[1] - xRange[0]) * k
        y[n] = math.sin(x[n])

    # Plot the sinusoid
    plt.plot(x, y)
    plt.show()
numpy
zeros builds arrays filled with 0
linspace builds arrays filled with an arithmetic sequence
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Generate a sinusoid
    nbSamples = 256
    x = numpy.linspace(-math.pi, math.pi, num=nbSamples)
    y = numpy.zeros(nbSamples)
    for n in range(nbSamples):
        y[n] = math.sin(x[n])

    # Plot the sinusoid
    plt.plot(x, y)
    plt.show()
numpy
numpy functions can work on entire arrays
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Generate a sinusoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)

    # Plot the sinusoid
    plt.plot(x, y)
    plt.show()
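To check that one call on the whole array gives the same values as the explicit loop, a small comparison can help. This is only a sketch; the names y_loop and y_vec are chosen for this example and do not come from the slides.

```python
import math
import numpy

# Same sampling as in the listings above
x = numpy.linspace(-math.pi, math.pi, num=256)

# Element-by-element evaluation with the math module
y_loop = numpy.zeros(len(x))
for n in range(len(x)):
    y_loop[n] = math.sin(x[n])

# One call on the entire array
y_vec = numpy.sin(x)

# Compare both results up to floating-point rounding
same = numpy.allclose(y_loop, y_vec)
print(same)
```

The array version is shorter and avoids the Python-level loop.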
PDF output
Exporting to a PDF file takes just one change
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Generate a sinusoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)

    # Plot the sinusoid
    plt.plot(x, y)
    plt.savefig('sinplot.pdf', transparent=True)
Curve plots
Multiple curves
It's often convenient to show several curves in one figure
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Generate a sinusoid and a cosinoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)
    z = numpy.cos(x)
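The listing stops once the data is generated. A minimal sketch of the plotting step, assuming each curve is simply drawn by its own call to plt.plot on the same axes (sin_line and cos_line are names chosen for this example):

```python
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Assumption: successive plot() calls draw on the same axes
sin_line, = plt.plot(x, y)
cos_line, = plt.plot(x, z)
```

Calling plt.show() afterwards displays the figure, as in the earlier listings.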
[Figure: the sine and cosine curves drawn in one figure]
Custom colors
Changing colors can help to make nice documents
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Generate a sinusoid and a cosinoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)
    z = numpy.cos(x)
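The plotting lines of this listing are missing. A possible version, assuming the colors are passed through the c argument with the same hexadecimal codes that the later listings use:

```python
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Assumption: colors given as '#RRGGBB' strings through the c argument
sin_line, = plt.plot(x, y, c='#FF4500')
cos_line, = plt.plot(x, z, c='#4682B4')
```

Named colors such as 'red' are accepted as well; hexadecimal codes just give finer control.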
[Figure: the sine and cosine curves drawn in custom colors]
Line thickness
Line thickness can be changed as well
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Generate a sinusoid and a cosinoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)
    z = numpy.cos(x)
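Here too the plotting lines did not survive. A sketch of how the thickness might be set, assuming the linewidth argument of plt.plot; the value 3.0 is an arbitrary choice for this example:

```python
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Assumption: thickness set with linewidth, in points (3.0 is arbitrary)
sin_line, = plt.plot(x, y, linewidth=3.0, c='#FF4500')
cos_line, = plt.plot(x, z, linewidth=3.0, c='#4682B4')
```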
[Figure: the sine and cosine curves drawn with a different line thickness]
Line patterns
For printed documents, colors can be replaced by line patterns
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Linestyles can be '-', '--', '-.', ':'

    # Generate a sinusoid and a cosinoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)
    z = numpy.cos(x)
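The plotting step is not part of the listing. One way it could look, assuming the pattern is chosen with the linestyle argument and both curves are drawn in black for printing:

```python
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Assumption: one pattern per curve, both in black ('k')
sin_line, = plt.plot(x, y, linestyle='-', c='k')
cos_line, = plt.plot(x, z, linestyle='--', c='k')
```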
[Figure: the sine and cosine curves drawn with different line patterns]
Markers
It is sometimes relevant to show the data points
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Markers can be '.', ',', 'o', '1' and more

    # Generate a sinusoid and a cosinoid
    x = numpy.linspace(-math.pi, math.pi, num=64)
    y = numpy.sin(x)
    z = numpy.cos(x)
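As the listing ends before the plotting step, here is a hedged sketch of it, assuming the marker argument of plt.plot and two of the marker codes named in the comment:

```python
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid and a cosinoid, with few enough points to see markers
x = numpy.linspace(-math.pi, math.pi, num=64)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Assumption: one marker per data point, chosen with the marker argument
sin_line, = plt.plot(x, y, marker='o', c='#FF4500')
cos_line, = plt.plot(x, z, marker='.', c='#4682B4')
```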
[Figure: the sine and cosine curves with a marker on each data point]
Legend
A legend can help to make self-explanatory figures
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Legend location can be 'best', 'center', 'left', 'right', etc.

    # Generate a sinusoid and a cosinoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)
    z = numpy.cos(x)

    # Plot the curves
    plt.plot(x, y, c='#FF4500', label='sin(x)')
    plt.plot(x, z, c='#4682B4', label='cos(x)')
    plt.legend(loc='best')
    plt.show()
[Figure: the sine and cosine curves with a legend listing sin(x) and cos(x)]
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Legend location can be 'best', 'center', 'left', 'right', etc.

    # Generate a sinusoid and a cosinoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)
    z = numpy.cos(x)

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111)
    axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

    # Plot the curves
    plt.plot(x, y, c='#FF4500', label='sin(x)')
    plt.plot(x, z, c='#4682B4', label='cos(x)')
    plt.legend(loc='best')
    plt.show()
[Figure: the sine and cosine curves with a legend, y axis from -1.5 to 1.5]
Grid
The same goes for a grid, which can be helpful
    import math
    import numpy
    import matplotlib.pyplot as plt

    # Legend location can be 'best', 'center', 'left', 'right', etc.

    # Generate a sinusoid and a cosinoid
    x = numpy.linspace(-math.pi, math.pi, num=256)
    y = numpy.sin(x)
    z = numpy.cos(x)

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111)
    axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)
    axis.grid(True)

    # Plot the curves
    plt.plot(x, y, c='#FF4500', label='sin(x)')
    plt.plot(x, z, c='#4682B4', label='cos(x)')
    plt.legend(loc='best')
    plt.show()
[Figure: the sine and cosine curves with a legend and a grid, y axis from -1.5 to 1.5]
Error bars
Your data might come with a known measurement error
    import math
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    # Generate a noisy sinusoid
    x = numpy.linspace(-math.pi, math.pi, num=48)
    y = numpy.sin(x + 0.05 * numpy.random.standard_normal(len(x)))
    # Error magnitudes must not be negative
    y_error = numpy.abs(0.1 * numpy.random.standard_normal(len(x)))

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111)
    axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

    # Plot the curves
    plt.plot(x, y, c='#FF4500')
    plt.errorbar(x, y, yerr=y_error)
    plt.show()
[Figure: the noisy sinusoid with error bars, y axis from -1.5 to 1.5]
Scatter plots
Scatter plot
A scatter plot just shows one point for each dataset entry
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    # Generate a 2d normal distribution
    nbPoints = 100
    x = numpy.random.standard_normal(nbPoints)
    y = numpy.random.standard_normal(nbPoints)

    # Plot the points
    plt.scatter(x, y)
    plt.show()
[Figure: scatter plot of the 100 points]
Aspect ratio
It can be very important to have the same scale on both axes
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    # Generate a 2d normal distribution
    nbPoints = 100
    x = numpy.random.standard_normal(nbPoints)
    y = 0.1 * numpy.random.standard_normal(nbPoints)

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111, aspect='equal')

    # Plot the points
    plt.scatter(x, y, c='#FF4500')
    plt.show()
[Figure: scatter plot with the same scale on both axes]
Aspect ratio
An alternative way to keep the same scale on both axes
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    # Generate a 2d normal distribution
    nbPoints = 100
    x = numpy.random.standard_normal(nbPoints)
    y = 0.1 * numpy.random.standard_normal(nbPoints)

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111)
    cmin, cmax = min(min(x), min(y)), max(max(x), max(y))
    cmin -= 0.05 * (cmax - cmin)
    cmax += 0.05 * (cmax - cmin)
    axis.set_xlim(cmin, cmax)
    axis.set_ylim(cmin, cmax)

    # Plot the points
    plt.scatter(x, y, c='#FF4500')
    plt.show()
[Figure: scatter plot with identical limits on both axes]
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

    # Generate a 2d normal distribution
    nbPoints = 100
    x, y = [], []
    x += [numpy.random.standard_normal(nbPoints)]
    y += [0.25 * numpy.random.standard_normal(nbPoints)]
    x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
    y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111, aspect='equal')

    # Plot the points
    for i in range(len(x)):
        plt.scatter(x[i], y[i], c=colors[i % len(colors)])
    plt.show()
Showing centers
It can help to see the centers or the median points
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

    # Generate a 2d normal distribution
    nbPoints = 100
    x, y = [], []
    x += [numpy.random.standard_normal(nbPoints)]
    y += [0.25 * numpy.random.standard_normal(nbPoints)]
    x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
    y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111, aspect='equal')

    # Plot the points, then the median point of each group
    for i in range(len(x)):
        col = colors[i % len(colors)]
        plt.scatter(x[i], y[i], c=col)
        plt.scatter([numpy.median(x[i])], [numpy.median(y[i])], c=col)
    plt.show()
[Figure: scatter plot of the two groups with their median points]
Marker styles
You can use different marker styles
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    markers = ('+', ..., '.')  # the middle marker is missing from the source

    # Generate a 2d normal distribution
    nbPoints = 100
    x, y = [], []
    x += [numpy.random.standard_normal(nbPoints)]
    y += [0.25 * numpy.random.standard_normal(nbPoints)]
    x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
    y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111, aspect='equal')
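The plotting loop of this listing is cut off. A sketch of how it might continue, assuming the marker argument of plt.scatter and cycling through the markers the same way the earlier listing cycles through colors. The middle marker 'x' is a placeholder chosen for this example.

```python
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Assumption: 'x' stands in for the marker that is unreadable in the slides
markers = ('+', 'x', '.')

# Generate two groups of points
nbPoints = 100
x, y = [], []
x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]
x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot each group with its own marker
for i in range(len(x)):
    plt.scatter(x[i], y[i], marker=markers[i % len(markers)])
```

Calling plt.show() afterwards displays the figure.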
[Figure: scatter plot of the groups, each with its own marker style]
Boxplots
Let's do a simple boxplot
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt
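Only the imports of this listing survive. A minimal sketch of a simple boxplot, assuming plt.boxplot and 100 samples from a standard normal distribution as stand-in data:

```python
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Assumption: placeholder data, 100 standard normal samples
data = numpy.random.standard_normal(100)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Show a boxplot; the returned dict holds the drawn lines
parts = plt.boxplot(data)
```

The returned dictionary gives access to the boxes, medians, whiskers and outliers if they need restyling.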
[Figure: a single boxplot]
Boxplots
You might want to show the original data at the same time
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt
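The rest of the listing is missing. One possible way to overlay the raw data, assuming a scatter plot placed at x = 1 (the default position of a single boxplot) with a little horizontal jitter. The data and the jitter amount are arbitrary choices for this sketch.

```python
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Assumption: placeholder data, 100 standard normal samples
data = numpy.random.standard_normal(100)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Show a boxplot, drawn at x = 1 by default
parts = plt.boxplot(data)

# Overlay the samples, spread slightly around x = 1 to limit overlap
jitter = 0.02 * numpy.random.standard_normal(len(data))
points = plt.scatter(1.0 + jitter, data, c='#FF4500')
```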
[Figure: a boxplot with the original data points]
Multiple boxplots
Boxplots are often used to show various distributions side by side
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt
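Again only the imports remain. A sketch of several boxplots in one figure, assuming plt.boxplot is given a list with one array per distribution. The three distributions below are made up for illustration.

```python
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Assumption: three placeholder distributions with different centers
nbPoints = 100
data = [
    numpy.random.standard_normal(nbPoints) + 4.0,
    numpy.random.standard_normal(nbPoints) + 8.0,
    numpy.random.standard_normal(nbPoints) + 12.0,
]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# One box per entry of the list, drawn at x = 1, 2, 3
parts = plt.boxplot(data)
```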
[Figure: several boxplots side by side]
Orientation
Changing the orientation is easily done
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt
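The listing ends after the imports. A sketch of horizontal boxplots, assuming the long-standing vert=False argument of plt.boxplot; recent Matplotlib releases also offer an orientation argument for the same purpose. The data is made up for illustration.

```python
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Assumption: three placeholder distributions with different centers
nbPoints = 100
data = [
    numpy.random.standard_normal(nbPoints) + 4.0,
    numpy.random.standard_normal(nbPoints) + 8.0,
    numpy.random.standard_normal(nbPoints) + 12.0,
]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# vert=False lays the boxes out horizontally
parts = plt.boxplot(data, vert=False)
```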
[Figure: boxplots drawn horizontally]
Legend
Good graphics have a proper legend
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt
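The code for this figure is not in the slides beyond the imports. A sketch that reuses the title, axis label and item names visible in the figure, assuming set_title, set_ylabel and set_xticklabels are used to place them. The concentrations are random placeholders, not the original data.

```python
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Item names, spelled as they appear in the figure
names = ['mercury', 'lead', 'lithium', 'tungstene', 'cadnium']

# Assumption: random placeholder values between 100 and 112 ppm
data = [100 + 12 * numpy.random.random_sample(50) for _ in names]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_title('Alien nodules composition')
axis.set_ylabel('concentration (ppm)')

# Show the boxplots, drawn at x = 1 ... len(names)
parts = plt.boxplot(data)

# Name each box
axis.set_xticks(range(1, len(names) + 1))
axis.set_xticklabels(names)
```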
[Figure: "Alien nodules composition"; boxplots of concentration (ppm), from 100 to 112, for mercury, lead, lithium, tungstene and cadnium]
Histograms
Histograms are convenient to sum up results
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    # Some data
    data = numpy.abs(numpy.random.standard_normal(30))

    # Show a histogram
    plt.barh(range(len(data)), data, color='#4682B4')
    plt.show()
[Figure: horizontal bar chart of the 30 values]
Histograms
A variant to show 2 quantities per item on 1 figure
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    # Some data
    data = [[], []]
    for i in range(len(data)):
        data[i] = numpy.abs(numpy.random.standard_normal(30))

    # Show a histogram, with the second quantity drawn towards the left
    labels = range(len(data[0]))
    plt.barh(labels, data[0], color='#4682B4')
    plt.barh(labels, -1 * data[1], color='#FF4500')
    plt.show()
[Figure: horizontal bars extending to both sides of zero, one quantity per side]
Labels
Very often, we need to name items on a histogram
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    # Some data
    names = ['Wang Bu', 'Cheng Cao', 'Zhang Xue Li', 'Tang Wei', 'Sun Wu Kong']
    marks = 7 * numpy.random.random_sample(len(names)) + 3

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111)
    axis.set_xlim(0, 10)
    # Bars are centered on their positions, so the names go at 0, 1, 2, ...
    plt.yticks(numpy.arange(len(marks)), names)
    axis.set_title('Data-mining marks')
    axis.set_xlabel('mark')

    # Show a histogram
    plt.barh(range(len(marks)), marks, color='#4682B4')
    plt.show()
[Figure: "Data-mining marks"; one horizontal bar per name, x axis "mark" from 0 to 10]
Error bars
Error bars, to indicate the accuracy of values
    import numpy
    import numpy.random
    import matplotlib.pyplot as plt

    # Some data
    names = ['6809', '6502', '8086', 'Z80', 'RCA1802']
    speed = 70 * numpy.random.random_sample(len(names)) + 30
    error = 9 * numpy.random.random_sample(len(names)) + 1

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111)
    # Bars are centered on their positions, so the names go at 0, 1, 2, ...
    plt.yticks(numpy.arange(len(names)), names)
    axis.set_title('8 bits CPU benchmark - string test')
    axis.set_xlabel('score')

    # Show a histogram
    plt.barh(range(len(names)), speed, xerr=error, color='#4682B4')
    plt.show()
[Figure: "8 bits CPU benchmark - string test"; one horizontal bar with an error bar per CPU (6809, 6502, 8086, Z80, RCA1802), x axis "score" from 0 to 100]
Usage example
Old Faithful
This works, but it is a good example of a half-done job
    import numpy
    import matplotlib.pyplot as plt

    # Read the data
    data = numpy.loadtxt('./datasets/geyser.dat')

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111, aspect='equal')

    # Plot the points
    plt.scatter(data[:, 0], data[:, 1], c='#FF4500')
    plt.show()
[Figure: scatter plot of the geyser data, without title or axis labels]
Old Faithful
Let's make a more readable figure
    import numpy
    import matplotlib.pyplot as plt

    # Read the data
    data = numpy.loadtxt('./datasets/geyser.dat')

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111)
    axis.set_title('Old Faithful geyser dataset')
    axis.set_xlabel('waiting time (mn.)')
    axis.set_ylabel('eruption duration (mn.)')

    # Plot the points
    plt.scatter(data[:, 0], data[:, 1], c='#FF4500')
    plt.show()
[Figure: "Old Faithful geyser dataset"; eruption duration (mn.) against waiting time (mn.)]
    import numpy
    import matplotlib.pyplot as plt

    # Read the data
    data = numpy.loadtxt('./datasets/fish.dat')

    # Axis setup
    fig = plt.figure()
    axis = fig.add_subplot(111)
    axis.set_title('North Carolina fishes mercury concentration')
    axis.set_xlabel('weight (g.)')
    axis.set_ylabel('mercury concentration (ppm)')

    # Plot the points
    plt.scatter(data[:, 3], data[:, 4], c='#FF4500')
    plt.show()
[Figure: mercury concentration (ppm) against weight (g.)]
[Figure: mercury concentration (ppm) against weight (g.) and against length (cm.)]