Keshav - Bollywood Box Office
• No accurate measure was available to quantify the hype around a movie
Causes of Error
• Many filmmakers do not release the exact budget, so the budget figures are
estimates and may be wrong
• Hype around the movie is not incorporated
• The effect of superstar actors is not incorporated in the analysis
• Competition from Hollywood and regional movies doing well in Indian
theaters is not considered
• The effect of release on festivals, holidays and extended weekends is not
incorporated
• Audience genre preference is not incorporated
• The effect of big and famous production houses is not incorporated
Data
| Name | Date of Release | Budget (in Cr.) | Gross Indian Collection (in Cr.) | IMDB (out of 10) | ROI | No. of simultaneous releases | No. of movie releases in that month |
|---|---|---|---|---|---|---|---|
| 1921 | 12-Jan | 15 | 19.69 | 4.2 | 1.312667 | 3 | 5 |
| Kaalakaandi | 12-Jan | 18 | 6.15 | 6.2 | 0.341667 | 3 | 5 |
| Mukkabaaz | 12-Jan | 12 | 13.67 | 8.1 | 1.139167 | 3 | 5 |
| Vodka Diaries | 19-Jan | 3 | 1.26 | 5.6 | 0.42 | 1 | 5 |
| Padmaavat | 25-Jan | 215 | 360.89 | 7 | 1.678558 | 1 | 5 |
| Pad Man | 09-Feb | 76 | 100.53 | 8 | 1.322763 | 1 | 3 |
| Aiyaary | 16-Feb | 57 | 22.33 | 5.2 | 0.391754 | 1 | 3 |
| Sonu Ke Titu Ki Sweety | 23-Feb | 40 | 128.85 | 7.1 | 3.22125 | 1 | 3 |
| Pari | 02-Mar | 21 | 35.15 | 6.6 | 1.67381 | 2 | 8 |
| Veerey Ki Wedding | 02-Mar | 16 | 3.65 | 2.8 | 0.228125 | 2 | 8 |
| Hate Story 4 | 09-Mar | 17 | 25.64 | 3.3 | 1.508235 | 3 | 8 |
| Dil Junglee | 09-Mar | 13 | 1.47 | 4 | 0.113077 | 3 | 8 |
| 3 Storeys | 09-Mar | 11 | 2.84 | 7.1 | 0.258182 | 3 | 8 |
| Raid | 16-Mar | 72 | 125.66 | 7.4 | 1.745278 | 1 | 8 |
| Hichki | 23-Mar | 20 | 59.13 | 7.5 | 2.9565 | 1 | 8 |
| Baaghi 2 | 30-Mar | 59 | 205.44 | 5 | 3.482034 | 1 | 8 |
| Blackmail | 06-Apr | 18 | 28.81 | 7 | 1.600556 | 1 | 5 |
| October | 13-Apr | 33 | 50 | 7.5 | 1.515152 | 1 | 5 |
| Beyond the Clouds | 20-Apr | 7 | 2.1 | 6.9 | 0.3 | 2 | 5 |
| Nanu Ki Janu | 20-Apr | 15 | 4.12 | 5 | 0.274667 | 2 | 5 |
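The ROI column appears to be the gross Indian collection divided by the budget. A minimal sketch that recomputes it for a few rows of the table (the function name `roi` and the choice of rows are illustrative, not part of the original analysis):

```python
# Sample rows copied from the data table:
# (name, budget in Cr., gross Indian collection in Cr.)
movies = [
    ("1921", 15, 19.69),
    ("Kaalakaandi", 18, 6.15),
    ("Padmaavat", 215, 360.89),
    ("Pad Man", 76, 100.53),
]


def roi(budget_cr, collection_cr):
    """Return on investment, assumed here to be collection divided by budget."""
    return collection_cr / budget_cr


for name, budget, collection in movies:
    value = roi(budget, collection)
    verdict = "recovered its budget" if value > 1 else "did not recover its budget"
    print(f"{name}: ROI = {value:.6f} ({verdict})")
```

An ROI above 1 means the collection exceeded the budget.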
[Figure: scatter plot of Box Office Collection (in Cr.) vs Budget (in Cr.)]
Scatter Plot of log(Box Office Collection) vs log(Budget)
[Figure: x-axis Log(Budget), y-axis Log(Box Office Collection), with linear trend line]
Fitted line: y = 1.422x - 3.6693, R² = 0.6492
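A log-log trend line like the one reported for this plot can be obtained with ordinary least squares on the logged values. The sketch below uses only the first ten rows of the data table, so its coefficients are not expected to match the reported fit. It also assumes base-10 logs of rupee amounts, since the slide states neither the log base nor the unit; `fit_line` and `CRORE` are illustrative names.

```python
import math

# (budget in Cr., gross Indian collection in Cr.) for the first ten table rows
rows = [(15, 19.69), (18, 6.15), (12, 13.67), (3, 1.26), (215, 360.89),
        (76, 100.53), (57, 22.33), (40, 128.85), (21, 35.15), (16, 3.65)]

CRORE = 1e7  # 1 crore = 10 million rupees

# Assumption: base-10 logs of rupee amounts (base and unit are not given in the slide).
x = [math.log10(b * CRORE) for b, _ in rows]
y = [math.log10(c * CRORE) for _, c in rows]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


slope, intercept, r2 = fit_line(x, y)
print(f"y = {slope:.3f}x + {intercept:.3f}, R^2 = {r2:.3f}")
```

In a log-log fit the slope reads as an elasticity: a slope above 1 would mean collection grows more than proportionally with budget.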
Scatter Plot of Return on Investment vs IMDB Rating
[Figure: x-axis IMDB Rating, y-axis Return on Investment]
MATRIX PLOT
Linear Regression Analysis
• I will fit 3 models and compare them to find the best applicable model
• First, I will include all the variables and run the regression
• Then I will drop the variable whose p-value is greater than 0.05
• Finally, I will refit and also drop the next variable whose p-value is greater
than 0.05
Model 1 (using all variables):
Y = 𝛽1 + 𝛽2X2 + 𝛽3X3 + 𝛽4X4 + 𝛽5X5
Model 2:
Y = 𝛽1 + 𝛽2X2 + 𝛽3X3 + 𝛽4X4
Model 3:
Y = 𝛽1 + 𝛽2X2 + 𝛽3X3
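The three nested models follow a backward-elimination pattern: fit, drop the least significant variable, refit. A minimal sketch of that loop is shown below. `fit_pvalues` is a hypothetical callback (not from the original analysis) that refits the regression on the given variables and returns their p-values; in practice it would wrap whatever regression tool is used.

```python
def backward_eliminate(variables, fit_pvalues, alpha=0.05):
    """Drop the least significant variable one at a time until all p-values <= alpha.

    `fit_pvalues` is a hypothetical callback: given a list of variable names,
    it refits the regression and returns a dict {name: p_value}.
    Returns the list of variable sets tried, from the full model to the final one.
    """
    current = list(variables)
    models = [list(current)]
    while current:
        pvalues = fit_pvalues(current)
        # The variable with the largest p-value is the weakest candidate.
        worst = max(current, key=lambda name: pvalues[name])
        if pvalues[worst] <= alpha:
            break  # every remaining variable is significant at level alpha
        current.remove(worst)
        models.append(list(current))
    return models
```

Starting from X2 to X5, two eliminations would yield the same three variable sets as Model 1, Model 2 and Model 3.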
OLS Condition Satisfaction
• Model-2:
Y = 𝛽1 + 𝛽2X2 + 𝛽3X3 + 𝛽4X4
Adjusted R² = 0.5313
Multiple R² = 0.552
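For reference, the standard relation between the two figures is given below, where n is the number of observations and k the number of predictors excluding the intercept (k = 3 for Model 2). The sample size n is not stated in this section, so the formula is shown without plugging in values.

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
```

Because the penalty grows with k, adjusted R² is the fairer figure for comparing models with different numbers of variables.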
Correlation of residuals with different variables
• All correlations of the residuals with the variables are very small, except the one with box office collection
Multicollinearity
VIF for model 1
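The variance inflation factor for predictor X_j is conventionally defined as below, where R_j² is the R² obtained by regressing X_j on all the other predictors in the model. The VIF values themselves are not reproduced in this section.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

A VIF near 1 indicates little collinearity. As a common rule of thumb (not a claim from the original slides), values above roughly 5 to 10 are treated as a warning sign.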
• This model is not very accurate, because various factors are not
incorporated in it
• Good public reviews have a very significant effect on a movie's collection,
and a good rating increases the chances of a return on investment greater
than 1, i.e. box office collection greater than the budget of the film
THANK YOU