Pandas - Series - Short - Notes
Pandas is an open-source Python library that provides high-performance, easy-to-use data manipulation
and analysis tools. It is built on top of NumPy and is particularly suited for working with structured data,
such as tables or relational databases.
Pandas and NumPy:
Pandas: Designed for working with structured data, particularly tables or relational databases.
NumPy: Primarily focused on numerical operations and manipulating homogeneous numerical arrays.
Pandas: Provides a high-level interface for data manipulation and analysis with built-in data structures like DataFrame and Series.
NumPy: Provides a multi-dimensional array object called ndarray, which is more suitable for numerical computations.
Pandas: Offers powerful data alignment and handling of missing data.
NumPy: Does not have built-in support for handling missing data.
Pandas: Supports heterogeneous data types within a single data structure.
NumPy: Supports homogeneous data types, allowing for efficient storage and computation.
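The contrast above can be seen in a few lines. This is a minimal sketch, not a full comparison; the variable names (arr, s, np_total) are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# A NumPy array holds one dtype: mixing ints with a float upcasts everything.
arr = np.array([1, 2, 3.5])
print(arr.dtype)  # float64

# A Series adds index labels and treats NaN as missing data.
s = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])
print(s.isnull().sum())  # 1
print(s.sum())           # 4.0 (NaN is skipped by default)

# The plain NumPy sum propagates NaN instead of skipping it.
np_total = np.array([1.0, np.nan, 3.0]).sum()
print(np_total)          # nan
```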
ii. List:
pd.Series([1, 2, 3, 4]) # Creates a Series from the list elements; no index is given, so the default index (0, 1, 2, 3) is used
v. Dictionary:
# Creates a Series in which the keys of the dictionary (i.e., Name, Title, Profession) become the index of the Series.
# No separate index is needed when a Series is made from a dictionary.
pd.Series(
{"Name": "NISHA",
"Title": "JHA",
"Profession": "PGT IP"
}
)
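The list and dictionary constructions above can be checked with a short runnable sketch. The names from_list, from_dict, data and with_index are illustrative. The last part shows what happens if an index is passed along with a dictionary anyway: pandas keeps only the requested labels and fills any label missing from the dictionary with NaN.

```python
import pandas as pd

# From a list: the default integer index is used.
from_list = pd.Series([1, 2, 3, 4])
print(list(from_list.index))  # [0, 1, 2, 3]

# From a dictionary: the keys become the index.
data = {"Name": "NISHA", "Title": "JHA", "Profession": "PGT IP"}
from_dict = pd.Series(data)
print(list(from_dict.index))  # ['Name', 'Title', 'Profession']
print(from_dict["Title"])     # JHA

# Passing an index with a dictionary selects by label; "Age" is not a key,
# so its value is NaN.
with_index = pd.Series(data, index=["Name", "Age"])
print(with_index.isnull().tolist())  # [False, True]
```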
Slicing of Series:
Slicing in a Series refers to extracting a portion of the Series based on its index. It allows you to select
specific elements or a range of elements from the Series.
Example:
series = pd.Series([1, 2, 3, 4, 5])
series[2] # Returns the element at index 2 (value: 3)
series[1:4] # Returns a new Series with elements from index 1 to index 3 (values: [2, 3, 4])
series[:3] # Returns a new Series with elements from the beginning up to index 2 (values: [1, 2, 3])
series[3:] # Returns a new Series with elements from index 3 to the end (values: [4, 5])
series[::-1] # Returns series in reverse order
series[:] # Returns all elements of the series
series[::] # Returns all elements of the Series
series[:3] # Returns first 3 elements
series[-3:] # Returns the last three elements (values: [3, 4, 5])
series[1::2] # Returns elements from index 1 to the last index with a step value of 2 (values: [2, 4])
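A runnable version of the main slicing patterns, using the same five-element Series, makes the results easy to verify. The sketch assumes the default integer index; the result names (middle, reversed_series, last_three, stepped) are illustrative.

```python
import pandas as pd

series = pd.Series([1, 2, 3, 4, 5])

middle = series[1:4]           # positions 1 to 3; the stop position is excluded
reversed_series = series[::-1] # all elements in reverse order
last_three = series[-3:]       # the last three elements
stepped = series[1::2]         # from position 1 to the end, taking every 2nd element

print(middle.tolist())           # [2, 3, 4]
print(reversed_series.tolist())  # [5, 4, 3, 2, 1]
print(last_three.tolist())       # [3, 4, 5]
print(stepped.tolist())          # [2, 4]
```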
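iloc is used for integer position-based indexing. It allows you to access elements using their integer positions.
Example:
series = pd.Series([1, 2, 3, 4, 5])
series.iloc[2] # Returns the element at integer position 2 (value: 3)
series.iloc[1:4] # Returns a new Series with elements from integer position 1 to position 3 (values: [2, 3, 4])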
loc is used for label-based indexing. It allows you to access elements using labels or index values.
Example:
series = pd.Series([1, 2, 3, 4, 5], index=['A', 'B', 'C', 'D', 'E'])
series.loc['C'] # Returns the element with the label 'C' (value: 3)
series.loc['B':'D'] # Returns a new Series with elements from label 'B' to label 'D', both included (values: [2, 3, 4])
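The difference between loc and iloc is easiest to see when the labels are integers that do not match the positions. The sketch below uses a made-up Series s with index [2, 1, 0] for that purpose, then repeats the labelled example to show that a loc slice includes its end label while an iloc slice excludes its end position.

```python
import pandas as pd

# Integer labels deliberately out of step with positions.
s = pd.Series([10, 20, 30], index=[2, 1, 0])
print(s.loc[0])   # 30 -> the element whose label is 0
print(s.iloc[0])  # 10 -> the element at position 0

labeled = pd.Series([1, 2, 3, 4, 5], index=["A", "B", "C", "D", "E"])
by_label = labeled.loc["B":"D"]  # end label 'D' is included
by_position = labeled.iloc[1:3]  # end position 3 is excluded
print(by_label.tolist())     # [2, 3, 4]
print(by_position.tolist())  # [2, 3]
```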
Filtering of Series:
Filtering in a Series involves selecting specific elements based on certain conditions. It allows you to extract
a subset of data that satisfies a given criterion.
Example:
series = pd.Series([1, 2, 3, 4, 5])
filtered_series = series[series > 3] # Returns a new Series with elements greater than 3 (values: [4, 5])
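Filtering works through a Boolean mask: the condition produces a Series of True/False values, and indexing with it keeps the True rows along with their original index labels. A minimal sketch follows; it also combines conditions with & (and) and | (or), where each condition needs its own parentheses. The names mask, between and even_or_big are illustrative.

```python
import pandas as pd

series = pd.Series([1, 2, 3, 4, 5])

mask = series > 3
print(mask.tolist())                # [False, False, False, True, True]
print(series[mask].index.tolist())  # [3, 4] -> original labels are kept

# Combine conditions element-wise; parentheses around each one are required.
between = series[(series > 1) & (series < 5)]
even_or_big = series[(series % 2 == 0) | (series > 4)]
print(between.tolist())      # [2, 3, 4]
print(even_or_big.tolist())  # [2, 4, 5]
```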
Comparison of Two Series:
Comparing two Series involves checking the equality or inequality of corresponding elements in the Series.
Example:
series1 = pd.Series([1, 2, 3])
series2 = pd.Series([3, 2, 1])
series1 == series2 # Returns a new Series of Boolean values indicating whether corresponding elements are equal (values: [False, True, False])
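The other comparison operators work the same way, element by element. The sketch below also shows equals(), which returns a single True/False for the whole Series, and a point worth knowing: comparing two Series with == raises a ValueError when their index labels differ. The names left, right and raised are illustrative.

```python
import pandas as pd

series1 = pd.Series([1, 2, 3])
series2 = pd.Series([3, 2, 1])

print((series1 == series2).tolist())  # [False, True, False]
print((series1 != series2).tolist())  # [True, False, True]
print((series1 > series2).tolist())   # [False, False, True]
print(series1.equals(series2))        # False -> one answer for the whole Series

# Operators such as == require identically labelled Series.
left = pd.Series([1, 2], index=["A", "B"])
right = pd.Series([1, 2], index=["B", "C"])
raised = False
try:
    left == right
except ValueError:
    raised = True
print(raised)  # True
```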
Arithmetic operations between two Series match elements by index label, not by position.
Example:
series1 = pd.Series([1, 2, 3], index=['A', 'B', 'C'])
series2 = pd.Series([4, 5, 6], index=['B', 'C', 'D'])
# addition
series1 + series2 # Returns a new Series with addition of elements, but NaN for mismatched indices
# subtraction
series1 - series2 # Returns a new Series with subtraction of elements, but NaN for mismatched indices
# multiplication
series1 * series2 # Returns a new Series with multiplication of elements, but NaN for mismatched indices
# division
series1 / series2 # Returns a new Series with division of elements, but NaN for mismatched indices
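Running the operators on the two Series above shows exactly where the NaN values appear. Labels 'B' and 'C' exist in both Series, so only they get a computed value; 'A' and 'D' exist in just one, so they become NaN. The result names (total, product, quotient) are illustrative.

```python
import pandas as pd

series1 = pd.Series([1, 2, 3], index=["A", "B", "C"])
series2 = pd.Series([4, 5, 6], index=["B", "C", "D"])

total = series1 + series2
product = series1 * series2
quotient = series1 / series2

print(total.index.tolist())     # ['A', 'B', 'C', 'D'] -> union of both indexes
print(total.isnull().tolist())  # [True, False, False, True]
print(total["B"], total["C"])        # 6.0 8.0
print(product["B"], product["C"])    # 8.0 15.0
print(quotient["B"], quotient["C"])  # 0.5 0.6
```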
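The add(), sub(), mul() and div() methods accept fill_value, which stands in for a value that is missing from one of the Series before the operation is applied.
Example:
series1 = pd.Series([1, 2, 3], index=['A', 'B', 'C'])
series2 = pd.Series([4, 5, 6], index=['B', 'C', 'D'])
# addition
series1.add(series2, fill_value=0) # Adds elements, treating a label missing from one Series as 0 (values: [1.0, 6.0, 8.0, 6.0])
# subtraction
series1.sub(series2, fill_value=0) # Subtracts elements, treating a label missing from one Series as 0 (values: [1.0, -2.0, -2.0, -6.0])
# multiplication
series1.mul(series2, fill_value=0) # Multiplies elements, treating a label missing from one Series as 0 (values: [0.0, 8.0, 15.0, 0.0])
# division
series1.div(series2, fill_value=0) # Divides elements, treating a label missing from one Series as 0 (values: [inf, 0.5, 0.6, 0.0]); 'A' is 1 / 0, which gives inf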
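A runnable check of the fill_value behaviour is sketched below. The key point is that the fill happens before the arithmetic, so the result depends on the operation: with division, a missing divisor filled with 0 produces inf rather than 0. The result names (added, subtracted, multiplied, divided) are illustrative.

```python
import numpy as np
import pandas as pd

series1 = pd.Series([1, 2, 3], index=["A", "B", "C"])
series2 = pd.Series([4, 5, 6], index=["B", "C", "D"])

added = series1.add(series2, fill_value=0)
subtracted = series1.sub(series2, fill_value=0)
multiplied = series1.mul(series2, fill_value=0)
divided = series1.div(series2, fill_value=0)

print(added.tolist())       # [1.0, 6.0, 8.0, 6.0]
print(subtracted.tolist())  # [1.0, -2.0, -2.0, -6.0]
print(multiplied.tolist())  # [0.0, 8.0, 15.0, 0.0]
print(divided.tolist())     # [inf, 0.5, 0.6, 0.0]
```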
Methods
Method             Description                                          Example
head(n)            Returns the first n elements of the Series           series.head(5)
tail(n)            Returns the last n elements of the Series            series.tail(3)
describe()         Provides summary statistics of the Series            series.describe()
unique()           Returns an array of unique values                    series.unique()
nunique()          Returns the number of unique values                  series.nunique()
sort_values()      Sorts the Series by values                           series.sort_values()
sort_index()       Sorts the Series by index                            series.sort_index()
max()              Returns the maximum value in the Series              series.max()
min()              Returns the minimum value in the Series              series.min()
mean()             Returns the mean of the Series                       series.mean()
median()           Returns the median of the Series                     series.median()
sum()              Returns the sum of the Series                        series.sum()
std()              Returns the standard deviation of the Series         series.std()
isnull()           Returns a Boolean Series indicating null values      series.isnull()
notnull()          Returns a Boolean Series indicating non-null values  series.notnull()
dropna()           Removes null values from the Series                  series.dropna()
fillna(value)      Fills null values with the specified value           series.fillna(0)
astype(dtype)      Converts the data type of the Series                 series.astype('float')
value_counts()     Returns a Series with value frequencies              series.value_counts()
replace(old, new)  Replaces specified values with new values            series.replace(0, np.nan)
These are some commonly used attributes and methods of a Pandas Series. They can be used to retrieve information about the Series, manipulate the data, and perform various operations. The most frequently asked attributes and methods are highlighted in yellow.
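Several of the methods listed above can be tried together on one small Series. The sample data here is made up for illustration and includes one NaN and one 0 so that the null-handling methods have something to act on.

```python
import numpy as np
import pandas as pd

series = pd.Series([3, 1, 2, 3, np.nan, 0])

print(series.head(2).tolist())  # [3.0, 1.0]
print(series.nunique())         # 4 -> NaN is not counted
print(series.isnull().sum())    # 1
print(series.max())             # 3.0
print(series.mean())            # 1.8 -> NaN is skipped

cleaned = series.dropna()
print(cleaned.sort_values().tolist())  # [0.0, 1.0, 2.0, 3.0, 3.0]
print(cleaned.median())                # 2.0

print(series.fillna(0).sum())          # 9.0
print(series.value_counts().loc[3.0])  # 2 -> the value 3.0 appears twice

# Replacing 0 with NaN leaves two null values in total.
print(series.replace(0, np.nan).isnull().sum())  # 2
```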
Here's an example of using the name attribute in a Pandas Series (two ways):
import pandas as pd
series = pd.Series([10, 20, 30, 40, 50], name="NISHA")
print(series)
Output:
0    10
1    20
2    30
3    40
4    50
Name: NISHA, dtype: int64
Note: This is the first way, in which the name is given when the Series is created.