Predictive Analysis 1 Assignment


9/20/23, 10:55 PM Untitled19.ipynb - Colaboratory

import pandas as pd

boll = pd.read_csv('/content/bollywood.csv')

boll.head(2)

   SlNo Release Date     MovieName ReleaseTime     Genre  Budget  BoxOfficeCollection  YoutubeViews  YoutubeLi
0     1    18-Apr-14      2 States          LW   Romance      36                104.0       8576361  26
1     2     4-Jan-13  Table No. 21           N  Thriller      10                 12.0       1087320  1

boll.Genre.value_counts()

Comedy 36
Drama 35
Thriller 26
Romance 25
Action 21
Thriller 3
Action 3
Name: Genre, dtype: int64
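
Thriller and Action each appear twice in the counts above, which usually means some labels differ by stray characters. A minimal sketch, assuming the duplicates come from leading or trailing spaces (the toy labels below are hypothetical, not read from bollywood.csv):

```python
import pandas as pd

# Toy data: hypothetical labels, some padded with spaces.
genres = pd.Series(['Thriller', ' Thriller', 'Action', 'Action ', 'Comedy'], name='Genre')

# Before cleaning, padded labels are counted as separate categories.
raw_counts = genres.value_counts()

# Strip surrounding whitespace so the variants collapse into one label.
clean_counts = genres.str.strip().value_counts()
print(clean_counts)
```

If this assumption holds for the real data, applying the same `str.strip()` to `boll['Genre']` before counting should merge the duplicated rows.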

boll[['Genre','ReleaseTime']].value_counts()

Genre     ReleaseTime
Drama     N              24
Comedy    N              23
Thriller  N              20
Romance   N              15
Action    N              12
Drama     HS              6
Comedy    LW              5
          HS              5
Thriller  FS              4
Romance   LW              4
Drama     FS              4
Comedy    FS              3
Action    N               3
Romance   FS              3
          HS              3
Action    LW              3
          HS              3
          FS              3
Thriller  N               2
Thriller  HS              1
          LW              1
Drama     LW              1
Thriller  LW              1
dtype: int64

boll['month'] = pd.DatetimeIndex(boll['Release Date']).month
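
The month extraction above relies on pandas inferring the date format. A small sketch of the same step with an explicit format string, assuming every value follows the day-month-year pattern seen in the output (for example 18-Apr-14):

```python
import pandas as pd

# Two dates copied from the head() output above.
dates = pd.Series(['18-Apr-14', '4-Jan-13'])

# %d-%b-%y = day, abbreviated month name, two-digit year.
parsed = pd.to_datetime(dates, format='%d-%b-%y')

# .dt.month returns the month number of each parsed date.
months = parsed.dt.month
print(months.tolist())
```

Stating the format makes the parse fail loudly on any row that does not match, instead of silently guessing.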

boll.head(1)

https://colab.research.google.com/drive/1yp6bmte8FJDIV36MybTtJcHeugjFQKF6#scrollTo=1hbjGmTcn40T&printMode=true 1/5

   SlNo Release Date MovieName ReleaseTime    Genre  Budget  BoxOfficeCollection  YoutubeViews  YoutubeLi
0     1    18-Apr-14  2 States          LW  Romance      36                104.0       8576361  26

boll['month'].value_counts()

1 20
3 19
5 18
7 16
2 16
4 11
9 10
6 10
11 10
10 9
8 8
12 2
Name: month, dtype: int64

boll[boll['Budget']>25][['MovieName','month']].value_counts()

MovieName        month
2 States         4        1
Raja Natwarlal   8        1
Kill Dil         11       1
Kochadaiiyaan    5        1
Krrish 3         11       1
                         ..
Highway          2        1
Himmatwala       3        1
Holiday          6        1
Humshakals       6        1
Zilla Ghaziabad  2        1
Length: 62, dtype: int64
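
Every (MovieName, month) pair above has a count of 1, so the result is effectively a list of the 62 films with Budget > 25. If the aim is instead the number of such releases per month (an assumption about intent), a sketch on a toy frame with hypothetical names and values could look like this:

```python
import pandas as pd

# Toy frame with hypothetical values; the real notebook would use `boll`.
toy = pd.DataFrame({
    'MovieName': ['A', 'B', 'C', 'D', 'E'],
    'Budget':    [36, 10, 40, 30, 26],
    'month':     [4, 1, 6, 6, 11],
})

# Keep films with Budget > 25, then count releases per month.
high_budget_per_month = (
    toy.loc[toy['Budget'] > 25, 'month']
       .value_counts()
       .sort_index()
)
print(high_budget_per_month)
```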

boll['ROI']=(boll['BoxOfficeCollection']-boll['Budget'])/boll['Budget']
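
As a quick check of the ROI formula, the sketch below applies it to the two rows shown in the earlier head() output (Budget and BoxOfficeCollection values copied from there; the frame name `rows` is only for illustration):

```python
import pandas as pd

# The two films from the head(2) output above.
rows = pd.DataFrame({
    'MovieName': ['2 States', 'Table No. 21'],
    'Budget': [36, 10],
    'BoxOfficeCollection': [104.0, 12.0],
})

# ROI = (collection - budget) / budget
rows['ROI'] = (rows['BoxOfficeCollection'] - rows['Budget']) / rows['Budget']
print(rows[['MovieName', 'ROI']].round(2))
```

For 2 States this is (104 - 36) / 36, about 1.89, meaning the film earned back its budget plus roughly 189% more.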

boll.nlargest(10,['ROI'])


     SlNo Release Date                  MovieName ReleaseTime    Genre  Budget  BoxOfficeCollection  YoutubeViews
64     65    26-Apr-13                 Aashiqui 2           N  Romance      12                110.0       2926673
89     90    19-Dec-14                         PK          HS    Drama      85                735.0      13270623
132   133    13-Sep-13                Grand Masti          LW   Comedy      35                298.0       1795640
135   136    20-Sep-13               The Lunchbox           N    Drama      10                 85.0       1064854
87     88    14-Jun-13                     Fukrey           N   Comedy       5                 36.2        227912
58     59     5-Sep-14                   Mary Kom           N    Drama      15                104.0       6086811
128   129    18-Oct-13                     Shahid          FS    Drama       6                 40.0       1148516
37     38    11-Jul-14  Humpty Sharma Ki Dulhania           N  Romance      20                130.0       6604595
101   102    12-Jul-13        Bhaag Milkha Bhaag           N    Drama      30                164.0       2635390
115   116     9-Aug-13            Chennai Express          FS   Comedy      75                395.0       1882346

boll.groupby('ReleaseTime')['ROI'].mean()

ReleaseTime
FS    0.973853
HS    0.850867
LW    1.127205
N     0.657722
Name: ROI, dtype: float64

import matplotlib.pyplot as plt
import seaborn as sn
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

plt.hist(boll['Budget'])

(array([64., 40., 19., 11.,  4.,  4.,  2.,  2.,  1.,  2.]),
 array([  2. ,  16.8,  31.6,  46.4,  61.2,  76. ,  90.8, 105.6, 120.4,
        135.2, 150. ]),
 <BarContainer object of 10 artists>)
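
The tuple returned by plt.hist holds the bin counts, the bin edges, and the drawn bars. With default settings the bins are 10 equal-width intervals between the minimum and maximum, computed the same way as np.histogram. A minimal sketch with hypothetical budgets (only the minimum of 2 and maximum of 150 are taken from the output above) reproduces the same edges:

```python
import numpy as np

# Hypothetical budgets; only the minimum (2) and maximum (150) match the notebook.
budgets = np.array([2, 5, 10, 12, 20, 36, 75, 85, 150])

# np.histogram defaults to 10 equal-width bins spanning [min, max].
counts, edges = np.histogram(budgets)
print(edges)   # bin width is (150 - 2) / 10 = 14.8
print(counts)
```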

sn.distplot(boll['Budget'])


<Axes: xlabel='Budget', ylabel='Density'>

sn.distplot(boll[boll['Genre']=='Comedy']['ROI'],color='g',label='comedy')
sn.distplot(boll[boll['Genre']=='Drama']['ROI'],color='r',label='drama')
plt.legend()

<matplotlib.legend.Legend at 0x7bea32b63880>

feature=['BoxOfficeCollection','YoutubeLikes']
boll[feature].corr()

                     BoxOfficeCollection  YoutubeLikes
BoxOfficeCollection             1.000000      0.682517
YoutubeLikes                    0.682517      1.000000
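
DataFrame.corr() computes the Pearson coefficient by default, which is the covariance of the two columns divided by the product of their standard deviations. A short sketch on hypothetical numbers (not the notebook's data) checks the manual formula against pandas:

```python
import pandas as pd

# Hypothetical values, chosen only to illustrate the calculation.
toy = pd.DataFrame({
    'BoxOfficeCollection': [104.0, 12.0, 110.0, 735.0, 85.0],
    'YoutubeLikes': [260, 10, 150, 900, 40],
})

x, y = toy['BoxOfficeCollection'], toy['YoutubeLikes']

# Pearson r = cov(x, y) / (std(x) * std(y)), all using the sample (ddof=1) form.
manual_r = x.cov(y) / (x.std() * y.std())

# The same coefficient read from the correlation matrix.
pandas_r = toy.corr().loc['BoxOfficeCollection', 'YoutubeLikes']
print(manual_r, pandas_r)
```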

heatfeature=['Budget', 'BoxOfficeCollection','YoutubeViews','YoutubeLikes','YoutubeDislikes']
sn.heatmap(boll[heatfeature].corr(),annot= True)


<Axes: >
