
Midterm Report (at most 4 members / Due: April 24, 23:59):

Data Analysis and Reporting with Numpy and Pandas

Objective: Your task is to perform a comprehensive data analysis and write a
detailed report using Python's Numpy and Pandas libraries. This assignment
aims to enhance your skills in data manipulation, analysis, and reporting using
these powerful libraries.

Dataset: Download CSV files containing 1-year historical stock price data for
2 companies from Yahoo Finance. Each CSV file includes the following
columns:

 Date: The date of the trading day.

 Adj Close: The adjusted closing price of the stock.

 Volume: The trading volume of the stock for the day.

Tasks:

1. Data Loading and Cleaning:

 Read the CSV files into separate pandas DataFrames.

 Handle any missing values and duplicates in the datasets.
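
One possible way to approach the loading and cleaning step is sketched below. The in-memory CSV text, the function name load_and_clean, and the cleaning policy (keep the first row per date, forward-fill missing prices, drop rows with missing volume) are assumptions made for illustration; a real submission would pass the path of a downloaded file instead and may justify a different policy.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded file; the values are made up.
SAMPLE_CSV = """Date,Adj Close,Volume
2024-01-02,100.0,1000
2024-01-03,,1200
2024-01-03,,1200
2024-01-04,103.0,
2024-01-05,104.0,900
"""


def load_and_clean(source):
    """Read one CSV and apply a simple, assumed cleaning policy."""
    df = pd.read_csv(source, parse_dates=["Date"])
    # Keep one row per trading day.
    df = df.drop_duplicates(subset="Date", keep="first")
    df = df.sort_values("Date").reset_index(drop=True)
    # Assumption: carry the last known price forward over gaps.
    df["Adj Close"] = df["Adj Close"].ffill()
    # Assumption: rows still missing a price or a volume are dropped.
    df = df.dropna(subset=["Adj Close", "Volume"])
    return df.reset_index(drop=True)


prices = load_and_clean(io.StringIO(SAMPLE_CSV))
print(prices)
```

Calling load_and_clean once per company keeps the two DataFrames separate, as the task asks.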

2. Exploratory Data Analysis (EDA):

 Calculate basic statistics (mean, median, min, max) for the
adjusted closing price and volume.

 Explore the trend of adjusted closing prices and volume using.

 Identify and visualize trends in adjusted closing prices and
volume over time for each company.
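
The basic statistics in this section could be computed along the following lines. The sample values are invented, and the extra "Daily Return" column is only one assumed way of making a trend visible in numbers.

```python
import pandas as pd

# Hypothetical cleaned data for one company; the values are made up.
prices = pd.DataFrame(
    {
        "Date": pd.date_range("2024-01-02", periods=5, freq="D"),
        "Adj Close": [100.0, 102.0, 101.0, 105.0, 107.0],
        "Volume": [1000, 1500, 1200, 1300, 2000],
    }
)

# One row per statistic, one column per variable.
summary = prices[["Adj Close", "Volume"]].agg(["mean", "median", "min", "max"])
print(summary)

# Assumed trend indicator: day-over-day percentage change in price.
prices["Daily Return"] = prices["Adj Close"].pct_change()
print(prices[["Date", "Adj Close", "Daily Return"]])
```

For the visual part, and assuming matplotlib is installed, prices.plot(x="Date", y="Adj Close") draws a line chart of the same column.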

3. Data Manipulation and Aggregation:

 Merge the DataFrames on the 'Date' column to combine the
data for all companies.

 Create new columns to calculate total volume traded and
average adjusted closing price for each company.

 Aggregate the data to calculate monthly average adjusted
closing prices and total volume traded.
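
A minimal sketch of the merge and aggregation steps follows. The company labels A and B, the sample values, and the reading of "new columns" as per-company totals repeated on every row are all assumptions; other readings of the task are equally defensible.

```python
import pandas as pd

# Hypothetical data for two companies on the same trading days.
dates = pd.to_datetime(["2024-01-30", "2024-01-31", "2024-02-01", "2024-02-02"])
company_a = pd.DataFrame(
    {"Date": dates, "Adj Close": [10.0, 12.0, 14.0, 16.0], "Volume": [100, 200, 300, 400]}
)
company_b = pd.DataFrame(
    {"Date": dates, "Adj Close": [50.0, 54.0, 52.0, 56.0], "Volume": [10, 20, 30, 40]}
)

# Inner merge keeps only dates present in both files; suffixes tell the
# overlapping column names apart.
merged = pd.merge(company_a, company_b, on="Date", how="inner", suffixes=("_A", "_B"))

# Assumed reading of the task: one total and one average per company,
# stored as constant columns.
for tag in ("A", "B"):
    merged[f"Total Volume_{tag}"] = merged[f"Volume_{tag}"].sum()
    merged[f"Avg Adj Close_{tag}"] = merged[f"Adj Close_{tag}"].mean()

# Monthly aggregation: mean price and summed volume per calendar month.
monthly = merged.groupby(merged["Date"].dt.to_period("M")).agg(
    {"Adj Close_A": "mean", "Adj Close_B": "mean", "Volume_A": "sum", "Volume_B": "sum"}
)
print(monthly)
```

An inner merge silently drops dates missing from either file, so comparing len(merged) with the lengths of the inputs is a quick sanity check.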

4. Advanced Analysis with Numpy and Pandas:

 Compute the correlation matrix between the adjusted closing
prices of the companies.

 Calculate rolling statistics such as the 5-day rolling mean and
standard deviation for the adjusted closing prices.
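
The correlation and rolling-window calculations might look like the sketch below. The column labels A and B and the sample prices are hypothetical. Both the pandas and the Numpy call compute Pearson correlation by default, so their results should agree.

```python
import numpy as np
import pandas as pd

# Hypothetical adjusted closing prices for two companies.
closes = pd.DataFrame(
    {
        "A": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
        "B": [20.0, 19.0, 23.0, 22.0, 26.0, 25.0],
    },
    index=pd.date_range("2024-01-02", periods=6, freq="D"),
)

# Correlation matrix, once with pandas and once with Numpy.
corr_pandas = closes.corr()
corr_numpy = np.corrcoef(closes["A"], closes["B"])
print(corr_pandas)

# 5-day rolling statistics; the first four entries are NaN because the
# window is not yet full.
rolling_mean = closes["A"].rolling(window=5).mean()
rolling_std = closes["A"].rolling(window=5).std()
print(pd.DataFrame({"mean_5d": rolling_mean, "std_5d": rolling_std}))
```

Note that rolling(window=5) counts rows, not calendar days, so on real data it spans five trading days.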

5. Reporting:

 Write a detailed report summarizing your analysis and findings.

 Include visualizations (e.g., plots, histograms) to support your
analysis.

 Provide insights into stock price trends, trading volume patterns,
and any correlations observed among the companies.

Submission Guidelines:

 Submit a Python script (.py) containing your code implementing the
above tasks and the output file (.txt).

 Include the 2 CSV files with historical stock price data for the selected
companies.

 Write a report in a separate document (e.g., word file or pdf file)
summarizing your analysis and findings.

 Ensure that your report is well-organized, clearly written, and includes
relevant visualizations to support your analysis.
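
For the output file (.txt) mentioned in the guidelines, one simple option is to write each result table to a text file as it is produced. The helper name write_text_report, the file name output.txt, and the sample table are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

# Hypothetical result table; the values are made up.
summary = pd.DataFrame(
    {"Adj Close": [103.0, 102.0], "Volume": [1400.0, 1300.0]},
    index=["mean", "median"],
)


def write_text_report(path, sections):
    """Write each titled DataFrame to a plain-text file."""
    with open(path, "w", encoding="utf-8") as fh:
        for title, frame in sections.items():
            fh.write(f"{title}\n")
            fh.write(frame.to_string())
            fh.write("\n\n")


# A temporary directory keeps this sketch from touching the working folder.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "output.txt")
write_text_report(out_path, {"Basic statistics": summary})
print(open(out_path, encoding="utf-8").read())
```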

Note: Feel free to explore additional functions beyond the ones mentioned
above. This assignment is designed to encourage experimentation and deep
learning of Numpy and Pandas functionalities. Please copy the following table
to the beginning of your report.

Student ID Name
