Matplotlib in Python


Introduction to Data Visualization in Python


• Matplotlib is a powerful plotting library in Python used for creating static,
animated, and interactive visualizations.
• It was originally designed to bring MATLAB-style plotting to Python.
• Matplotlib is popular due to its ease of use, extensive documentation,
and wide range of plotting capabilities.
• Other packages build on Matplotlib for their plotting features, for example
pandas and seaborn. Matplotlib itself works with NumPy arrays.
• Other visualization libraries include seaborn, Altair, ggpy, Bokeh, and plotly.
• Some of these, such as seaborn, are built on top of Matplotlib, while others
are independent.
In Matplotlib, a figure is the top-level container
that holds all the elements of a plot.
It represents the entire window or page where
the plot is drawn.
The parts of a Matplotlib figure include:
• Figure (the canvas)
• Axes (the coordinate system)
• Axis (the x-axis and y-axis)
• Markers
• Lines
• Title
• Axis labels
• Ticks and tick labels
• Legend
• Gridlines
• Spines (Borders of the plot area)
• The package is imported into a Python script with the following
statement:
from matplotlib import pyplot as plt
• Here pyplot is a module, not a function. It is the most commonly used
part of the Matplotlib library and is mainly used to plot 2D data.
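The parts of a figure listed above can be located on a real plot. The following is a minimal sketch that uses Matplotlib's object-oriented calls (plt.subplots() and methods on the returned Axes). The sample data, label text, and tick names are made up for illustration only.

```python
import matplotlib.pyplot as plt

# Figure = the canvas; Axes = one plotting area with its own coordinate system.
fig, ax = plt.subplots(figsize=(6, 4))

# A line with markers (illustrative sample data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 5, 3]
(line,) = ax.plot(x, y, marker="o", label="sample series")

# Title and axis labels.
ax.set_title("Parts of a figure")
ax.set_xlabel("x values")
ax.set_ylabel("y values")

# Ticks and tick labels on the x-axis.
ax.set_xticks(x)
ax.set_xticklabels(["one", "two", "three", "four", "five"])

# Legend and gridlines.
ax.legend()
ax.grid(True, linestyle="--", linewidth=0.5)

# Spines are the borders of the plot area; hide the top and right ones.
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

# Call plt.show() here to display the figure.
```

Each call maps to one entry in the list: the Figure and Axes come from plt.subplots(), and the title, labels, ticks, legend, gridlines, and spines are all set on the Axes object.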
Pyplot in Matplotlib
• Pyplot is a Matplotlib module that provides a MATLAB-like interface.
• Each pyplot function makes some changes to a figure: e.g., creates a
figure, creates a plotting area in a figure, plots some lines in a plotting
area, decorates the plot with labels, etc.
• The plot types available through Pyplot include line plots, histograms,
scatter plots, 3D plots, images, contour plots, and polar plots.
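To make the "each pyplot function changes a figure" idea concrete, here is a small sketch of how pyplot keeps track of a current figure and current Axes. The data and label text are illustrative. plt.gcf() and plt.gca() are the pyplot functions that return the current figure and Axes.

```python
import matplotlib.pyplot as plt

plt.figure()                      # creates a figure and makes it the current one
plt.plot([1, 2, 3], [1, 4, 9])    # creates a plotting area (Axes) and draws a line in it
plt.xlabel("x")                   # decorates the current Axes
plt.ylabel("x squared")
plt.title("pyplot tracks the current figure")

fig = plt.gcf()   # "get current figure"
ax = plt.gca()    # "get current axes"

# One Axes in the figure, one line in that Axes.
print(len(fig.axes), len(ax.lines))   # prints: 1 1
```

Because every call acts on the current figure, the order of the calls matters, which is why plt.show() comes last in the examples in this deck.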
Basic Functions for Chart Creation
• Use the plot() function of matplotlib.pyplot to draw the graph. It
takes the x values, the y values, and an optional format string
(marker, line style, and color) as arguments.
• Use the show() function of matplotlib.pyplot to display the graph
window. It is normally called without arguments.
• Use the title() function of matplotlib.pyplot to give the graph a
title. It takes the title string as its argument.
• Use the xlabel() function of matplotlib.pyplot to label the x-axis.
It takes the label string as its argument.
• Use the ylabel() function of matplotlib.pyplot to label the y-axis.
It takes the label string as its argument.
• Use the savefig() function of matplotlib.pyplot to save the result in a file.
• Use the annotate() function of matplotlib.pyplot to highlight
specific locations in the chart.
• Use the legend() function of matplotlib.pyplot to add a legend to the
chart.
• The subplot() function lets you draw several plots in the same
figure. Its first argument specifies the number of rows in the grid,
the second the number of columns, and the third the index of the
active subplot (counting from 1).
• Use the bar() function to draw a bar graph instead of a line graph,
e.g. plt.bar(x, y, color='g', align='center')
• Use the hist() function to draw a histogram, a graphical
representation of the frequency distribution of data. Each class
interval, called a bin, is drawn as a rectangle whose width is the
interval and whose height is the frequency. hist() takes the input
array and bins as its two main parameters. When bins is an array, its
successive elements act as the boundaries of the bins.
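Several of the functions above do not appear in the later examples, so here is a hedged sketch combining subplot(), a format string, annotate(), legend(), and hist() with explicit bin boundaries. The marks list, the bin edges, and all label text are invented for illustration.

```python
import matplotlib.pyplot as plt

# Illustrative data: 15 marks and the bin boundaries 0-25, 25-50, 50-75, 75-100.
marks = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]
bin_edges = [0, 25, 50, 75, 100]

fig = plt.figure(figsize=(9, 4))

# subplot(rows, columns, index): a 1 x 2 grid with the first cell active.
plt.subplot(1, 2, 1)
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
plt.plot(x, y, "g--", label="y = x squared")   # format string: green dashed line
plt.annotate("largest value", xy=(5, 25), xytext=(2, 20),
             arrowprops=dict(arrowstyle="->"))  # arrow from the text to the point
plt.legend()

# Second cell: histogram whose bins are given as an array of boundaries.
plt.subplot(1, 2, 2)
counts, edges, patches = plt.hist(marks, bins=bin_edges, edgecolor="black")
plt.title("Histogram with explicit bins")

plt.tight_layout()
print(counts)   # frequencies per bin: 5, 3, 5 and 2

# Call plt.savefig("combined.png") and then plt.show() here if needed.
```

hist() returns the frequency counts, the bin edges, and the drawn rectangles, so the counts can be checked against the data without reading them off the chart.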
Example
Note: Remember to call plt.savefig() before plt.show().
# importing matplotlib module
from matplotlib import pyplot as plt
# x-axis values
x = [5, 2, 9, 4, 7]
# y-axis values
y = [10, 5, 8, 4, 2]
# Function to plot
plt.plot(x, y)
# save the plot to a file, then show it
plt.savefig("line_plot.png")
plt.show()
Line Plot with Markers, Gridlines, and a Title
import matplotlib.pyplot as plt
# Define X and Y data points
X = [12, 34, 23, 45, 67, 89]
Y = [1, 3, 67, 78, 7, 5]
# Plot the graph using matplotlib
plt.plot(X, Y, marker='o', markerfacecolor='r')
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
# Add gridlines to the plot
plt.grid(color='green', linestyle='--', linewidth=0.5)
# `plt.grid()` also works
# displaying the title (pad is the gap above the plot area, in points)
plt.title(label='Number of Users of a particular Language',
          fontweight=10, pad=2.0)
# Function to view the plot
plt.show()
Plotting Multiple Lines in a Line Plot
import matplotlib.pyplot as plt
import numpy as np
# create data
x = [1,2,3,4,5]
y = [3,3,3,3,3]
# plot lines
plt.plot(x, y, label = "line 1", linestyle="-")
plt.plot(y, x, label = "line 2", linestyle="--")
plt.plot(x, np.sin(x), label = "curve 1", linestyle="-.")
plt.plot(x, np.cos(x), label = "curve 2", linestyle=":")
plt.legend()
plt.show()
Bar Plot
from matplotlib import pyplot as plt
# x-axis values
x = [5, 2, 9, 4, 7]
# Y-axis values
y = [10, 5, 8, 4, 2]
# Function to plot the bar
plt.bar(x,y)
# function to show the plot
plt.show()
Horizontal Bar Chart
import matplotlib.pyplot as plt
y=['one', 'two', 'three', 'four', 'five']
# getting values against each value of y
x=[5,24,35,67,12]
plt.barh(y, x)
# setting label of y-axis
plt.ylabel("pen sold")
# setting label of x-axis
plt.xlabel("price")
plt.title("Horizontal bar graph")
plt.show()
Stacked Bar Chart
import matplotlib.pyplot as plt
import pandas as pd
data = [['A', 10, 20, 10, 26], ['B', 20, 25, 15, 21],
        ['C', 12, 15, 19, 6], ['D', 10, 18, 11, 19]]
df = pd.DataFrame(data, columns=['Team', 'Round 1', 'Round 2', 'Round 3', 'Round 4'])
print(df)
# plot the data as a stacked bar chart
df.plot(x='Team', kind='bar', stacked=True, title='Stacked Bar Graph by dataframe')
plt.show()
Two Bar Plots in One Graph
from matplotlib import pyplot as plt
from matplotlib import style
style.use('ggplot')
plt.bar([0.25,1.25,2.25,3.25,4.25], [50,40,70,80,20], label="BMW", color='g', width=.5)  # 1st bar
plt.bar([.75,1.75,2.75,3.75,4.75], [80,20,20,50,60], label="Audi", color='r', width=.5)  # 2nd bar
plt.legend()  # legend
plt.xlabel('Days')  # x-axis label
plt.ylabel('Distance (kms)')  # y-axis label
plt.title('Information')  # chart title
plt.show()
Histogram
from matplotlib import pyplot as plt
# data values whose frequencies are plotted
y = [10, 5, 5, 8, 4, 10, 10, 2]
# Function to plot histogram
plt.hist(y)
# Function to show the plot
plt.show()
Plotting 2 histograms in the same graph
import matplotlib.pyplot as plt
# giving two age groups data
age_g1 = [1, 3, 5, 10, 15, 17, 18, 16, 19, 21, 23, 28, 30, 31, 33, 38, 32, 40, 45, 43, 49, 55, 53, 63, 66, 85,
80, 57, 75, 93, 95]
age_g2 = [6, 4, 15, 17, 19, 21, 28, 23, 31, 36, 39, 32, 50, 56, 59, 74, 79, 34, 98, 97, 95, 67, 69, 92, 45,
55, 77,76, 85]
# plotting first histogram
plt.hist(age_g1, label='Age group1', bins=14, edgecolor='red')
# plotting second histogram
plt.hist(age_g2, label="Age group2", bins=14, edgecolor='yellow')
plt.legend()
# Showing the plot using plt.show()
plt.show()
Scatter Plot
from matplotlib import pyplot as plt
x = [5, 2, 9, 4, 7]
# Y-axis values
y = [10, 5, 8, 4, 2]
# Function to plot scatter
plt.scatter(x, y)
# function to show the plot
plt.show()
Another example for scatter plot
import matplotlib.pyplot as plt
from matplotlib import style
style.use('ggplot')  # applying the ggplot style
x = [1,1.5,2,2.5,3,3.5,3.6]
y = [7.5,8,8.5,9,9.5,10,10.5]
x1 = [8,8.5,9,9.5,10,10.5,11]
y1 = [3,3.5,3.7,4,4.5,5,5.2]
plt.scatter(x, y, label='high income low saving', color='r')  # 1st scatter plot
plt.scatter(x1, y1, label='low income high savings', color='b')  # 2nd scatter plot
plt.xlabel('saving*100')  # x-axis label
plt.ylabel('income*1000')  # y-axis label
plt.title('Scatter Plot')  # chart title
plt.legend() #legend
plt.show() #plot display
Pie Plot in Python
import matplotlib.pyplot as plt
slices = [7,2,2,13] #slices in pie plot
activities = ['sleeping', 'eating', 'working', 'playing']  # labels of pie plot
cols = ['c', 'm', 'r', 'b']  # colors in pie plot
plt.pie(slices, labels=activities, colors=cols)
plt.title('Pie Plot') #Plot title
plt.show() #Displaying the plot
The seaborn library in Python
• Seaborn is a library mostly used for statistical plotting in Python.
• It is built on top of Matplotlib and provides beautiful default styles
and color palettes to make statistical plots more attractive.
Plotting using seaborn
• We will plot a simple line plot using the iris dataset.
• The iris dataset contains five columns: Petal Length, Petal Width,
Sepal Length, Sepal Width, and Species Type.
• It is one of the example datasets that seaborn can load by name.
load_dataset() fetches it from seaborn's online example-data repository,
so the first call needs an internet connection.
Step 1-> pip install seaborn
Step 2-> import seaborn as sns
Step 3-> sns.load_dataset("iris")
load_dataset() returns the iris dataset as a pandas DataFrame.
Creating a Basic Line Plot with seaborn in Python
# importing packages
import seaborn as sns
# loading dataset
data = sns.load_dataset("iris")
# draw lineplot
sns.lineplot(x="sepal_length", y="sepal_width", data=data)
Using seaborn with Matplotlib
import seaborn as sns
import matplotlib.pyplot as plt
# loading dataset
data = sns.load_dataset("iris")
# draw lineplot
sns.lineplot(x="sepal_length", y="sepal_width", data=data)
# setting the lower x limit of the plot to 5
plt.xlim(5)
plt.show()
Heatmap
• A heatmap is a graphical representation of data that uses colors to
visualize the values of a matrix.
• Typically, brighter colors (often reddish) represent more common values
or higher activity, and darker colors represent less common values or
lower activity.
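The idea of mapping matrix values to colors can be sketched with plain Matplotlib before turning to seaborn. This is a minimal illustration, not the approach used in the rest of this section: it assumes a small hand-written matrix and the built-in "hot" colormap, which runs from dark (low values) to bright (high values).

```python
import numpy as np
import matplotlib.pyplot as plt

# Small illustrative matrix; each cell's value decides its color.
data = np.array([[1, 20, 30],
                 [40, 50, 60],
                 [70, 80, 99]])

fig, ax = plt.subplots()
image = ax.imshow(data, cmap="hot")          # "hot": dark for low, bright for high
fig.colorbar(image, ax=ax, label="value")    # scale that maps colors back to values

# Write each value into its cell so the color can be compared with the number.
for row in range(data.shape[0]):
    for col in range(data.shape[1]):
        ax.text(col, row, str(data[row, col]),
                ha="center", va="center", color="gray")

ax.set_title("Heatmap drawn with plain Matplotlib")

# Call plt.show() here to display the figure.
```

The colorbar is what makes the picture readable as data: without it, the colors show only relative differences.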
Basic Heatmap in Python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# generating 2-D 10x10 matrix of random integers from 1 to 99 (high is exclusive)
data = np.random.randint(low = 1, high = 100, size = (10, 10))
print("The data to be plotted:\n")
print(data)
# plotting the heatmap
hm = sns.heatmap(data = data, annot=True) #adding data values in the heatmap
# displaying the plotted heatmap
plt.show()
seaborn.heatmap() function
Syntax (abbreviated): seaborn.heatmap(data, *, vmin=None, vmax=None, cmap=None,
center=None, annot=None, fmt='.2g', annot_kws=None, linewidths=0,
linecolor='white', cbar=True, **kwargs)
Important Parameters:
• data: 2D dataset that can be coerced into an ndarray.
• vmin, vmax: Values to anchor the colormap, otherwise they are inferred
from the data and other keyword arguments.
• cmap: The mapping from data values to color space.
• center: The value at which to center the colormap when plotting
divergent data.
• annot: If True, write the data value in each cell.
• fmt: String formatting code to use when adding annotations.
• linewidths: Width of the lines that will divide each cell.
• linecolor: Color of the lines that will divide each cell.
• cbar: Whether to draw a colorbar.
All the parameters except data are optional.
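The parameters above can be tried together in one call. The following sketch assumes seaborn is installed and uses a made-up 4x4 matrix of values between -1 and 1. The "coolwarm" colormap and the formatting choices are illustrative, not recommendations.

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Illustrative data: a 4x4 matrix of values between -1 and 1.
rng = np.random.default_rng(seed=0)
data = rng.uniform(low=-1.0, high=1.0, size=(4, 4))

fig, ax = plt.subplots()
sns.heatmap(
    data,
    vmin=-1, vmax=1,       # anchor the colormap instead of inferring it from the data
    center=0,              # value placed at the middle of the diverging colormap
    cmap="coolwarm",       # mapping from data values to colors
    annot=True,            # write the data value in each cell
    fmt=".2f",             # format annotations with two decimal places
    linewidths=0.5,        # width of the lines dividing the cells
    linecolor="black",     # color of those lines
    cbar=True,             # draw the colorbar
    ax=ax,                 # draw into the Axes created above
)
ax.set_title("seaborn heatmap with explicit parameters")

# Call plt.show() here to display the figure.
```

Fixing vmin and vmax is useful when several heatmaps must share the same color scale. Left at None, each plot scales to its own data.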
