Final Project
1 Final Project
3 Introduction
3.1 Question(s) of interest
The goal of this project is to analyze the performance of Disney movies based on their
inflation-adjusted gross values. I explore three aspects: which movies grossed the most, the role
that MPAA ratings play in drawing audiences to theatres, and the impact of genre on a movie’s
success. Examining the data gives insight into Disney’s box office success and helps identify
patterns that contribute to its financial achievements.
# Importing files
import pandas as pd
total_gross = pd.read_csv("data/disney_movies_total_gross.csv")
    total_gross inflation_adjusted_gross
0  $184,925,485           $5,228,953,251
1   $84,300,000           $2,188,229,052
2   $83,320,000           $2,187,090,808
3   $65,000,000           $1,078,510,579
4   $85,000,000             $920,608,730
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 579 entries, 0 to 578
Data columns (total 6 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   movie_title               579 non-null    object
 1   release_date              579 non-null    object
 2   genre                     562 non-null    object
 3   MPAA_rating               523 non-null    object
 4   total_gross               579 non-null    object
 5   inflation_adjusted_gross  579 non-null    object
dtypes: object(6)
memory usage: 27.3+ KB
We also need to clean the dataframe before doing any analysis. For this, I wrote a
cleaning function in a module named “clean_my_data.py”, so that the code in this report
can still run and give meaningful insights even if the dataset at the URL changes.
The function pays special attention to “movie_title”: it is the most basic field in
the data, so I treat a row with a null title as unreliable. The function also drops
the release_date column, because the analysis uses only inflation_adjusted_gross,
which already accounts for the time factor and makes release_date redundant.
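The contents of clean_my_data.py are not shown in this report, so the following is only a minimal sketch of what a cleaning step matching the description above could look like. The function name `clean_disney_data`, the choice to drop rows with a null title, and the sample rows are all assumptions made for illustration.

```python
import pandas as pd


def clean_disney_data(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning routine; not the actual clean_my_data code."""
    cleaned = data.copy()

    # Assumption: a row without a title is treated as unreliable and dropped.
    cleaned = cleaned.dropna(subset=["movie_title"])

    # release_date is not needed once grosses are inflation-adjusted.
    cleaned = cleaned.drop(columns=["release_date"], errors="ignore")

    # Strip "$" and "," with an explicit regex character class, then
    # convert the remaining digits to float.
    for column in ["total_gross", "inflation_adjusted_gross"]:
        cleaned[column] = (
            cleaned[column]
            .astype(str)
            .str.replace(r"[$,]", "", regex=True)
            .astype(float)
        )

    return cleaned


# Tiny made-up sample with the same columns as the raw file.
raw = pd.DataFrame(
    {
        "movie_title": ["Movie A", None, "Movie B"],
        "release_date": ["Jan 1, 2000", "Jan 2, 2000", "Jan 3, 2000"],
        "genre": ["Musical", "Drama", None],
        "MPAA_rating": ["G", "PG", None],
        "total_gross": ["$1,000", "$2,000", "$3,500"],
        "inflation_adjusted_gross": ["$1,500", "$2,500", "$4,000"],
    }
)
cleaned = clean_disney_data(raw)
print(cleaned)
```

Rows with a missing genre or MPAA_rating are kept in this sketch, since the text singles out only the title as essential.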
Calling that function now
[19]: import clean_my_data as cd
/home/jupyter/prog-python-ds-students/release/final_project/clean_my_data.py:22:
FutureWarning: The default value of regex will change from True to False in a
future version. In addition, single character regular expressions will *not* be
treated as literal strings when regex=True.
  data['total_gross'] = data['total_gross'].str.replace('$', '').str.replace(',', '').astype(float)
/home/jupyter/prog-python-ds-students/release/final_project/clean_my_data.py:23:
FutureWarning: The default value of regex will change from True to False in a
future version. In addition, single character regular expressions will *not* be
treated as literal strings when regex=True.
  data['inflation_adjusted_gross'] = data['inflation_adjusted_gross'].str.replace('$', '').str.replace(',', '').astype(float)
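The warning appears because `$` is also a regular-expression anchor, so its meaning depends on whether `str.replace` treats the pattern as a regex. One way to make the intent explicit, shown here as a sketch and not as the code in clean_my_data.py, is to pass `regex=False`. The two sample values are taken from the data preview earlier in the report.

```python
import pandas as pd

# Two gross values as they appear in the raw data preview.
gross = pd.Series(["$184,925,485", "$84,300,000"])

# regex=False treats "$" and "," as literal characters, so the result
# does not depend on the default value of `regex` in the installed pandas.
gross_as_float = (
    gross.str.replace("$", "", regex=False)
    .str.replace(",", "", regex=False)
    .astype(float)
)
print(gross_as_float)
```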
     movie_title                           inflation_adjusted_gross
0    Snow White and the Seven Dwarfs       5.228953e+09
1    Pinocchio                             2.188229e+09
2    Fantasia                              2.187091e+09
8    101 Dalmatians                        1.362871e+09
6    Lady and the Tramp                    1.236036e+09
3    Song of the South                     1.078511e+09
564  Star Wars Ep. VII: The Force Awakens  9.366622e+08
4    Cinderella                            9.206087e+08
13   The Jungle Book                       7.896123e+08
179  The Lion King                         7.616409e+08
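The cell that produced this ranking is not included in the report. A ranking of this kind can be obtained with `DataFrame.nlargest`, as in the sketch below; whether the original used `nlargest` or `sort_values` is unknown. The four rows are copied from the listing above, and the frame name `movies` is an assumption.

```python
import pandas as pd

# Four rows copied from the ranking above, deliberately out of order.
movies = pd.DataFrame(
    {
        "movie_title": ["Cinderella", "Pinocchio", "The Lion King", "Fantasia"],
        "inflation_adjusted_gross": [
            9.206087e08,
            2.188229e09,
            7.616409e08,
            2.187091e09,
        ],
    }
)

# Keep the three largest grosses, highest first.
top_n = movies.nlargest(3, "inflation_adjusted_gross")[
    ["movie_title", "inflation_adjusted_gross"]
]
print(top_n)
```

`nlargest` keeps the original row labels, which is consistent with the non-sequential index values (8, 6, 564, ...) in the listing.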
[25]: # Investigate the relationship between MPAA ratings and movie success
mpaa_rating_success = total_gross_cleaned.groupby('MPAA_rating')['total_gross'].mean()
print(mpaa_rating_success)
[26]: total_gross_cleaned.head()
inflation_adjusted_gross
0 5.228953e+09
1 2.188229e+09
2 2.187091e+09
3 1.078511e+09
4 9.206087e+08
bar_chart_total_gross
[27]: alt.Chart(…)
title='Top 10 Highest-Grossing Movies (Inflation-Adjusted Gross)'
)
bar_chart_inflation_adjusted_gross
[28]: alt.Chart(…)
Because the value of money changes over time, the analysis uses the ‘inflation-adjusted
gross’ column rather than the raw gross. It also compares the average, not the total,
inflation-adjusted gross per category: the number of movies differs between categories,
so totals would favor whichever category has the most releases.
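The effect of unequal group sizes can be seen in a small sketch. The numbers below are made up: genre "A" has four modest films and genre "B" has a single hit, so the two genres rank differently by total and by average.

```python
import pandas as pd

# Made-up numbers, not taken from the Disney dataset.
films = pd.DataFrame(
    {
        "genre": ["A", "A", "A", "A", "B"],
        "inflation_adjusted_gross": [100.0, 100.0, 100.0, 100.0, 300.0],
    }
)

# Count, total, and average gross for each genre.
per_genre = films.groupby("genre")["inflation_adjusted_gross"].agg(
    ["count", "sum", "mean"]
)
print(per_genre)
```

Genre "A" leads on the total only because it contains more films, while the average is not inflated by the number of releases.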
[29]: import altair as alt
tooltip=['genre', 'inflation_adjusted_gross']
).properties(
title='Average Inflation Adjusted Gross per Genre of Disney Movies'
)
bar_genre
[29]: alt.Chart(…)
y=alt.Y('inflation_adjusted_gross:Q', title='Average Inflation Adjusted Gross'),
tooltip=['MPAA_rating', 'inflation_adjusted_gross']
).properties(
title='Average Inflation Adjusted Gross per MPAA Rating of Disney Movies'
)
bar_mpaa
[30]: alt.Chart(…)
5 Discussions
The analysis of Disney movies’ average inflation-adjusted gross points to two patterns. First, the
graph of average inflation-adjusted gross per genre shows that the ‘musical’ genre has a markedly
higher average gross than the other genres. This suggests that Disney musicals tend to perform
exceptionally well at the box office, even after adjusting for inflation. The enduring appeal of
Disney musicals, with their songs and storytelling, has likely contributed to this financial
success over the years.
Additionally, the graph of average inflation-adjusted gross per MPAA rating shows that ‘G’-rated
movies have the highest average gross of any rating. The ‘G’ rating signifies that a movie is
suitable for all audiences, which aligns with Disney’s family-friendly brand and suggests that
these movies appeal to a wide range of viewers, children and adults alike.
The ‘musical’ genre and the ‘G’ rating are both associated with a higher average
inflation-adjusted gross, which is consistent with strong audience demand for these types of
movies. This information can help Disney understand audience preferences and make strategic
decisions when developing and marketing future projects.
Note that although the average inflation-adjusted gross is informative, individual movie
performance varies within each genre and MPAA rating category, and an average can be pulled upward
by a few very high-grossing titles. In the top-ten list, for example, Snow White and the Seven
Dwarfs has an inflation-adjusted gross of about $5.2 billion, more than double that of any other
movie. Factors such as production budgets, marketing strategies, release timing, and critical
reception also influence a movie’s financial success. Further analysis that takes these factors
into account would give a more comprehensive understanding of Disney’s movie performance.
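One simple way to check how much a category average depends on a few titles is to report the median and the count next to the mean. The sketch below uses made-up numbers and neutral category labels "X" and "Y"; it is not a result from the Disney data.

```python
import pandas as pd

# Made-up numbers: category "X" contains one extreme value.
films = pd.DataFrame(
    {
        "category": ["X", "X", "X", "Y", "Y", "Y"],
        "inflation_adjusted_gross": [50.0, 60.0, 1000.0, 200.0, 210.0, 220.0],
    }
)

# Mean and median side by side, with the number of films per category.
spread = films.groupby("category")["inflation_adjusted_gross"].agg(
    ["count", "mean", "median"]
)
print(spread)
```

Here "X" has the higher mean but the lower median, which signals that its average is driven by a single film.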