This notebook will explain the following topics and concepts:


Rows and Columns - adding & removing

Adding new columns to a DataFrame
Inserting rows using values calculated from values in other rows
Filtering Data
Sorting Data
Sorting Indexes
Resetting Indexes
Multi Indexes and Cross Sections

Importing the libraries

Almost every piece of data analysis you carry out using Python will begin with these 3 lines of code.

In [ ]: import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Display floats with thousands separators and 2 decimal places
pd.options.display.float_format = '{:,.2f}'.format
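To see what the float format setting does, here is a minimal sketch. The `Close` column and its values are made up for illustration and are not part of the course data.

```python
import pandas as pd

# The format string '{:,.2f}' adds thousands separators and keeps 2 decimals
fmt = '{:,.2f}'.format
print(fmt(1234567.891))  # 1,234,567.89

# Applying it as a display option changes how floats are shown, not how they are stored
pd.options.display.float_format = fmt

# Hypothetical values, for illustration only
df_demo = pd.DataFrame({'Close': [1234567.891, 0.5]})
print(df_demo)

# The underlying value is unchanged
print(df_demo['Close'].iloc[0] == 1234567.891)  # True
```

The option only affects display, so calculations still use the full-precision values.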

Load in some data from

Use the read_excel function to read the contents of a spreadsheet into a DataFrame.


The data we are using in this example came from a single Excel file

The Excel file has 3 worksheets containing price information for 3 companies: Google, IBM and
All 3 sheets contain the same type of information: daily trading and stock price information from 2010 to
Each set of data has a Date column, which we will use as our index (more on indexes later)
For this example we have already exported each sheet from Excel to its own CSV file: Google.csv, IBM.csv,
We will read in each CSV file separately
Later in the course we will cover how to read directly from Excel (.xlsx) files.
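As a rough sketch of what using the Date column as the index looks like, the example below reads a small in-memory CSV instead of a real file. The column names `Close` and `Volume` and all values are assumptions made up for illustration; the actual files may be laid out differently.

```python
import io
import pandas as pd

# Hypothetical sample standing in for one exported CSV file;
# the column names and values are invented for illustration.
csv_text = """Date,Close,Volume
2010-01-04,10.50,1000
2010-01-05,10.75,1200
2010-01-06,10.60,900
"""

# index_col='Date' makes the Date column the index;
# parse_dates=True asks pandas to parse that index as dates
df_sample = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=True)

# The index is now a DatetimeIndex named 'Date'
print(df_sample.index)

# Rows can be looked up by date
print(df_sample.loc[pd.Timestamp('2010-01-05')])
```

With a date index in place, rows can be selected by date rather than by position.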

In [ ]: df_GOOGL = pd.read_excel('../Data/market_data.xlsx', sheet_name='GOOGL', index_col='Date', parse_dat
df_IBM = pd.read_excel('../Data/market_data.xlsx', sheet_name='IBM', index_col='Date', parse_dates=T
df_MSFT = pd.read_excel('../Data/market_data.xlsx', sheet_name='MSFT', index_col='Date', parse_dates

Selecting data

