STD & Var


Let's go through the concepts of variance and standard deviation and how to calculate them using a Python DataFrame with the pandas library.

Concepts
● Variance: Measures how far a set of numbers is spread out from its average value. The formula for variance ($\sigma^2$) is:

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$

Where $N$ is the number of data points, $x_i$ is each data point, and $\mu$ is the mean of the data points.
● Standard Deviation: The square root of the variance, providing a measure of the spread of data in the same units as the data itself. The formula is:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$
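
The two definitions can be worked through by hand before turning to pandas. The following is a minimal sketch that divides by N, as in the definitions above; the helper names `population_variance` and `population_std` are illustrative and not part of any library.

```python
import math


def population_variance(values):
    """Mean of the squared deviations from the mean (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


sample = [1, 2, 3, 4, 5]

# The mean is 3, the squared deviations are 4, 1, 0, 1, 4, and their sum is 10.
var_a = population_variance(sample)  # 10 / 5
std_a = population_std(sample)       # square root of the variance

print(var_a)
print(std_a)
```

Because the standard deviation is the square root of the variance, it is expressed in the same units as the original values.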

Using Python DataFrame


Let's create a DataFrame and calculate the variance and standard deviation for its columns.

Calculating Variance and Standard Deviation

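Pandas provides built-in methods to calculate variance and standard deviation:

● DataFrame.var(): Calculates the variance of each column.

● DataFrame.std(): Calculates the standard deviation of each column.

Both methods default to ddof=1, so they divide by N - 1 (the sample variance and sample standard deviation) rather than by N. Pass ddof=0 to divide by N instead.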
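
Both methods accept a `ddof` ("delta degrees of freedom") argument that sets the divisor to N - ddof. The sketch below compares the default call with `ddof=0` on a small Series; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# Default ddof=1: the sum of squared deviations (10) is divided by N - 1 = 4.
sample_var = s.var()
sample_std = s.std()

# ddof=0: the same sum is divided by N = 5.
population_var = s.var(ddof=0)
population_std = s.std(ddof=0)

print(sample_var, population_var)
print(sample_std, population_std)
```

The sample version is the usual choice when the data is a sample drawn from a larger population; `ddof=0` treats the data as the whole population.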


import pandas as pd

# Sample data: three numeric columns
data = {
    'A': [1, 2, 3, 4, 5],
    'B': [2, 4, 6, 8, 10],
    'C': [5, 7, 9, 11, 13],
}

# Create DataFrame
df = pd.DataFrame(data)
print("DataFrame:")
print(df)

# Calculate the variance of each column (pandas default: ddof=1)
variance = df.var()
print("\nVariance:")
print(variance)

# Calculate the standard deviation of each column (pandas default: ddof=1)
std_dev = df.std()
print("\nStandard Deviation:")
print(std_dev)

Output

DataFrame:
   A   B   C
0  1   2   5
1  2   4   7
2  3   6   9
3  4   8  11
4  5  10  13

Variance:
A     2.5
B    10.0
C    10.0
dtype: float64

Standard Deviation:
A    1.581139
B    3.162278
C    3.162278
dtype: float64
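
As a quick consistency check, the standard deviation of each column should equal the square root of that column's variance. The sketch below rebuilds the same DataFrame and compares the two; the names `sqrt_of_variance` and `difference` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [2, 4, 6, 8, 10],
    'C': [5, 7, 9, 11, 13],
})

# Element-wise square root of the per-column variance.
sqrt_of_variance = df.var() ** 0.5

# Largest absolute gap between sqrt(variance) and std across the columns.
difference = (sqrt_of_variance - df.std()).abs().max()

print(difference < 1e-12)
```

Columns B and C have the same spread around their means (deviations of -4, -2, 0, 2, 4), so they share the same variance and standard deviation even though their values differ.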
