
INFORMATICS PRACTICES PROJECT

ON
IPL ANALYSIS

Submitted by
Aethen Paul Mathew
Class XII C
INDEX

• DECLARATION

• ACKNOWLEDGMENT

• HEADER FILES USED

• INTRODUCTION ABOUT PYTHON

• INTRODUCTION ABOUT MYSQL

• SOFTWARE AND HARDWARE REQUIREMENTS

• WORKING DESCRIPTION

• DATA COLLECTION

• DATA VISUALIZATION

• SOURCE CODE

• OUTPUT

• CONCLUSION

• BIBLIOGRAPHY
DECLARATION

I declare that the project work entitled "IPL ANALYSIS", submitted to the Department of INFORMATICS PRACTICES, ST. PHILOMENA'S PUBLIC SCHOOL, ELANJI, was prepared by me. All the code is the result of my personal effort.

Submitted by
Aethen Paul Mathew
Class XII C
ACKNOWLEDGMENT
First of all, I thank God for enabling me to complete this project successfully. I would also like to thank my Informatics Practices teacher, Mrs. Jasmine Jacob, whose valuable guidance helped me refine the project and make it a complete success. Her suggestions and instructions were the major contribution towards the completion of this project.

I also express my gratitude to our senior principal, Rev. Dr. John Erniakulathil, and our principal, Joju Joseph, for their encouragement and for all the facilities provided for the completion of this project.

Finally, I would like to thank my parents and friends, whose valuable suggestions and guidance were helpful in various phases of this project.


HEADER FILES USED

• CSV Connectivity
INTRODUCTION ABOUT PYTHON

Python is a high-level, general-purpose, open-source programming language. It supports both object-oriented and procedural programming. Python is an extremely powerful language.

FEATURES OF PYTHON
• Python is a high-level, open-source, general-purpose programming language.

• It is object-oriented, procedural and functional.

• It has libraries to support GUI development.

• It is extremely powerful and easy to learn.

• It is open source, so it is freely available to everyone.

• It runs on Windows, Linux and macOS.

• Python enables us to write clear, logical applications for small and large tasks.

• It has high-level built-in data types: strings, lists, dictionaries, etc.

• It encourages us to write clear and well-structured code.

APPLICATIONS OF PYTHON

• Machine Learning

• Data Analysis

• Web Development

• Console based authentication

• 3D CAD Applications
INTRODUCTION ABOUT MYSQL

MySQL is an open-source and freely available Relational Database Management System (RDBMS) that uses Structured Query Language (SQL). It provides excellent features for creating, storing, maintaining and accessing data stored in the form of databases and their respective tables.

The MySQL database system works on a client-server architecture. It consists of a MySQL server, which runs on the machine containing the databases, and MySQL clients, which connect to the server over a network.

ADVANTAGES OF MYSQL

• Reliability and performance

• Modifiable

• Multi platform support

• Powerful Processing Capabilities

• Integrity

• Authorization
SOFTWARE AND HARDWARE

REQUIREMENTS

SOFTWARE REQUIREMENTS:

• Python 3.6.x or a higher version

• Pandas library preinstalled

• Matplotlib library preinstalled

• NumPy and Seaborn libraries preinstalled

HARDWARE REQUIREMENTS:

• A computer or laptop running Windows 7 or above.

• x86 64-bit CPU (Intel/AMD architecture)

• 4GB RAM

• 5 GB free disk space


WORKING DESCRIPTION

INTRODUCTION ABOUT PROJECT

Cricket is one of the most popular games in India. After the start of the IPL, Indian cricket standards reached a new level, and many talented players got a chance to prove themselves on a platform where many international cricketers play together. The IPL is one of the leading cricket tournaments in the world.

The Indian Premier League (IPL) is a professional Twenty20 cricket league in India. It was initiated by the Board of Control for Cricket in India (BCCI), headquartered in Mumbai, and is supervised by BCCI Vice President Rajeev Shukla, who serves as the league's chairman and commissioner. The IPL works on a franchise system based on the American style of hiring players and transfers.

And cricket, as you can imagine, is ripe

with data points. It's a battle between bat and

ball played across different formats and different

levels. The ball-by-ball analysis of matches can

produce some surprising hidden insights, such as

batting partnerships and who the best batting

partner is.
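The partnership idea mentioned above can be sketched roughly as follows. This is only an illustration: the column names (match_id, inning, batsman, non_striker, total_runs) are assumed, and the rows are made up rather than taken from the project dataset. It also treats every ball faced by the same two batsmen in an innings as one partnership, which is a simplification.

```python
import pandas as pd

# Made-up ball-by-ball rows (hypothetical data, not from the project files)
deliveries = pd.DataFrame({
    "match_id":    [1, 1, 1, 1, 1, 1],
    "inning":      [1, 1, 1, 1, 1, 1],
    "batsman":     ["A", "B", "A", "A", "C", "A"],
    "non_striker": ["B", "A", "B", "C", "A", "C"],
    "total_runs":  [4, 1, 6, 2, 1, 0],
})

def partnership_runs(balls):
    """Sum runs per unordered batting pair within each match and innings."""
    balls = balls.copy()
    # Sort the two names so that (A, B) and (B, A) count as the same pair
    balls["pair"] = [
        " & ".join(sorted(pair))
        for pair in zip(balls["batsman"], balls["non_striker"])
    ]
    return (
        balls.groupby(["match_id", "inning", "pair"])["total_runs"]
        .sum()
        .sort_values(ascending=False)
    )

partnerships = partnership_runs(deliveries)
print(partnerships)
```

The pair at the top of the printed result is the most productive partnership in the sample rows.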

THE MAIN OBJECTIVES OF THIS PROJECT ARE:

• To find the team that won by the maximum runs.

• To find the team that won by the maximum wickets.

• To find the team that won by the minimum runs.

• To find the team that won by the minimum wickets.

• To find the season that had the most matches.

• To find the most successful IPL team.

• To find the players who won Man of the Match the most times.
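The objectives listed above can be tried out on a small table before running them on the full dataset. The sketch below uses made-up rows. The column names follow those used later in the source code, and the variable names are chosen here only for illustration.

```python
import pandas as pd

# Made-up match summaries (hypothetical data); column names follow the source code
matches = pd.DataFrame({
    "season":          [2008, 2008, 2009, 2009, 2009],
    "winner":          ["Team A", "Team B", "Team A", "Team C", "Team A"],
    "win_by_runs":     [140, 0, 1, 0, 25],
    "win_by_wickets":  [0, 10, 0, 3, 0],
    "player_of_match": ["P1", "P2", "P1", "P3", "P1"],
})

# Largest margins: idxmax gives the row label of the maximum value
max_runs_team = matches.loc[matches["win_by_runs"].idxmax(), "winner"]
max_wickets_team = matches.loc[matches["win_by_wickets"].idxmax(), "winner"]

# Smallest margins: skip zeros, which mean the match was not won that way
won_by_runs = matches[matches["win_by_runs"] >= 1]
won_by_wickets = matches[matches["win_by_wickets"] >= 1]
min_runs_team = matches.loc[won_by_runs["win_by_runs"].idxmin(), "winner"]
min_wickets_team = matches.loc[won_by_wickets["win_by_wickets"].idxmin(), "winner"]

# Counts: value_counts tallies each value, idxmax picks the most frequent one
busiest_season = matches["season"].value_counts().idxmax()
most_successful = matches["winner"].value_counts().idxmax()
top_player = matches["player_of_match"].value_counts().idxmax()

print(max_runs_team, max_wickets_team, min_runs_team, min_wickets_team)
print(busiest_season, most_successful, top_player)
```

Filtering out zero margins before taking the minimum matters because a match won by wickets records 0 in the runs column, and the reverse.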


DATA COLLECTION

Data has been collected from www.iplt20.com and www.cricsheet.org. The data consists of ball-by-ball details for a total of 696 matches from 2008 to 2018. Ball-by-ball data gives in-depth detail of every ball bowled in each over: whether the ball was a wide, a dead ball or a no-ball, or whether the batsman scored a single, double, triple, four or six off it. The dataset consists of two CSV files. Matches.csv gives the details of the match venue, location, season, contesting teams, toss winner and toss decision, match result, whether the win was by runs or by wickets, player of the match, all three umpires, the match winner, etc.

Deliveries.csv holds the ball-by-ball data, combining all the deliveries from 2008 to 2018.

It consists of attributes such as Match_id, bowling team, batting team, batsman, bowler, non-striker, no-ball runs, penalty runs, extra runs, over, total runs, etc. Innings tells whether the first or the second team was batting. Over gives the current over number. Ball gives the number of the current ball within the current over.
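As a rough sketch of how the two files described above could be used together, the example below joins ball-by-ball rows to match-level rows on the match identifier. The tiny in-memory CSV text stands in for the real files, and the column names (id, match_id, inning, total_runs and so on) are assumptions made for this illustration.

```python
import io
import pandas as pd

# Tiny stand-ins for the two CSV files (made-up rows, assumed column names)
matches_csv = io.StringIO(
    "id,season,winner\n"
    "1,2008,Team A\n"
    "2,2008,Team B\n"
)
deliveries_csv = io.StringIO(
    "match_id,inning,over,ball,total_runs\n"
    "1,1,1,1,4\n"
    "1,1,1,2,1\n"
    "1,2,1,1,6\n"
    "2,1,1,1,0\n"
    "2,2,1,1,2\n"
)

matches = pd.read_csv(matches_csv)
deliveries = pd.read_csv(deliveries_csv)

# Attach match-level details to every ball via the shared match identifier
balls = deliveries.merge(matches, left_on="match_id", right_on="id", how="left")

# Total runs scored in each innings of each match
innings_totals = balls.groupby(["match_id", "inning"])["total_runs"].sum()
print(innings_totals)
```

With the real files, the io.StringIO objects would be replaced by the paths to Matches.csv and Deliveries.csv.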


DATA VISUALIZATION

The most significant part of data visualization and predictive analysis is representing the data in the form of charts and graphs to get a visual presentation of it. The collected data is visualized to get a better and clearer understanding of all the parameters of the season, the teams, all-rounders, batsmen and bowlers, so that it will be helpful to team selectors, captains and managers for the next auction. Different packages are used to get proper analysis and visualization for players and teams.
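A minimal charting sketch is given below, assuming Matplotlib and Pandas are installed. The win counts are made up for illustration, and the figure is written to an in-memory buffer only so that the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up win counts (hypothetical data for illustration only)
wins = pd.Series({"Team A": 12, "Team B": 9, "Team C": 7}).sort_values()

fig, ax = plt.subplots(figsize=(8, 4))
# Horizontal bars keep long team names readable
bars = ax.barh(list(wins.index), list(wins.values))
ax.set_xlabel("Matches won")
ax.set_title("Wins per team (sample data)")
fig.tight_layout()

# Render the chart as PNG bytes; pass a file name instead to save it to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In an interactive session, plt.show() can be used in place of saving the figure.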
SOURCE CODE

import numpy as np # numerical computing

import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import matplotlib.pyplot as plt # visualization

import seaborn as sns # modern visualization

plt.rcParams['figure.figsize'] = (14, 8)

sns.set_style("darkgrid")

df = pd.read_csv(r"E:\ipl1.csv") # raw string so the backslash is not read as an escape

print('--------------------------------------------')
print(' --------------------------------------------')

print(df.info())

print()

print(' --------------------------------------------')

print('--------------------------------------------')

print('Total Matches are::::', len(df)) # count rows; the largest id is not always the match count

print()

print('-------------------------------------------- ')

print('--------------------------------------------')

print("How many seasons data we've got in the dataset?")


print(df['season'].unique())

print()

print('--------------------------------------------')

print('--------------------------------------------')

print('Which Team had won by maximum runs?')

print(df.iloc[df['win_by_runs'].idxmax()])

print()

print('--------------------------------------------')

print('--------------------------------------------')

print('Which Team had won by maximum wickets?')

print(df.iloc[df['win_by_wickets'].idxmax()]['winner'])
print()

print('--------------------------------------------')

print('--------------------------------------------')

print('Which Team had won by (closest margin) minimum runs?')

# ge(1) drops matches that were not won by runs (margin recorded as 0)
print(df.iloc[df[df['win_by_runs'].ge(1)].win_by_runs.idxmin()]['winner'])

print()

print('--------------------------------------------')

print('--------------------------------------------')

print('Which Team had won by minimum wickets?')

print(df.iloc[df[df['win_by_wickets'].ge(1)].win_by_wickets.idxmin()])
print()

print('--------------------------------------------')

print('--------------------------------------------')

print('Which season had most number of matches?')

sns.countplot(x='season', data=df)

plt.show()

print()

print('--------------------------------------------')

print('--------------------------------------------')
print('The Most Successful IPL Team is:::')

data = df.winner.value_counts()

sns.barplot(y = data.index, x = data, orient='h')

plt.show() # display this chart before the next figure is created

print()

print('--------------------------------------------')

print('--------------------------------------------')

print('Players who got max times Man of Match are:::')

top_players=df.player_of_match.value_counts()[:10]

#sns.barplot(x="day", y="total_bill", data=tips)

fig, ax = plt.subplots()

ax.set_ylim([0,20])
ax.set_ylabel("Count")

ax.set_title("Top player of the match Winners")

#top_players.plot.bar()

sns.barplot(x = top_players.index, y = top_players, orient='v');

#palette="Blues");

plt.show()
Output
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 637 entries, 0 to 636
Data columns (total 17 columns):
Match_SK        637 non-null int64
match_id        637 non-null int64
Team1           637 non-null object
Team2           637 non-null object
match_date      637 non-null object
Season_Year     637 non-null int64
Venue_Name      636 non-null object
City_Name       637 non-null object
Country_Name    637 non-null object
Toss_Winner     636 non-null object
match_winner    634 non-null object
Toss_Name       636 non-null object
Win_Type        635 non-null object
Outcome_Type    637 non-null object
ManOfMach       633 non-null object
Win_Margin      628 non-null float64
Country_id      637 non-null int64
dtypes: float64(1), int64(4), object(12)
memory usage: 84.7+ KB

df.groupby('Season_Year')['match_winner'].value_counts()

Season_Year  match_winner
2008         Rajasthan Royals               13
             Kings XI Punjab                10
             Chennai Super Kings             9
             Delhi Daredevils                7
             Mumbai Indians                  7
             Kolkata Knight Riders           6
             Royal Challengers Bangalore     4
             Deccan Chargers                 2
2009         Delhi Daredevils               10
             Deccan Chargers                 9
             Royal Challengers Bangalore     9
             Chennai Super Kings             8
             Kings XI Punjab                 7
             Rajasthan Royals                6
             Mumbai Indians                  5
             Kolkata Knight Riders           3
2010         Mumbai Indians                 11
             Chennai Super Kings             9
             Deccan Chargers                 8
             Royal Challengers Bangalore     8
             Delhi Daredevils                7
             Kolkata Knight Riders           7
             Rajasthan Royals                6
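One way to read a grouped result like the one above is to keep only the team with the most wins in each season. The sketch below does this for the top three rows of 2008 and 2009, with the counts copied from the output; the variable names are chosen here for illustration. Note that the team with the most match wins in a season is not necessarily the team that won the title.

```python
import pandas as pd

# Win counts copied from the 2008 and 2009 rows of the output above
wins = pd.DataFrame({
    "Season_Year": [2008, 2008, 2008, 2009, 2009, 2009],
    "match_winner": [
        "Rajasthan Royals", "Kings XI Punjab", "Chennai Super Kings",
        "Delhi Daredevils", "Deccan Chargers", "Royal Challengers Bangalore",
    ],
    "wins": [13, 10, 9, 10, 9, 9],
})

# Sort by wins, then keep the first (highest) row for every season
top_per_season = (
    wins.sort_values("wins", ascending=False, kind="stable")
    .drop_duplicates("Season_Year")
    .set_index("Season_Year")["match_winner"]
    .sort_index()
)
print(top_per_season)
```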

df['Season_Year'].value_counts()

2013 76

2012 74

2011 73

2017 60

2016 60

2014 60

2010 60

2015 59

2008 58

2009 57

Name: Season_Year, dtype: int64
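The season with the most matches can also be read from these counts in code rather than from a chart. The sketch below copies the numbers from the output above into a Series; the variable names are chosen here for illustration.

```python
import pandas as pd

# Matches per season, copied from the value_counts() output above
season_counts = pd.Series(
    {2013: 76, 2012: 74, 2011: 73, 2017: 60, 2016: 60,
     2014: 60, 2010: 60, 2015: 59, 2008: 58, 2009: 57},
    name="Season_Year",
)

busiest = season_counts.idxmax()   # season label with the highest count
total = int(season_counts.sum())   # adds up to the 637 rows reported by df.info()
print(busiest, season_counts.max(), total)
```

The total of the per-season counts matching the row count is a quick check that no season is missing from the data.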

Which season had most number of matches?


-----------------------------------

-----------------------------------

Most successful IPL Teams is:::


CONCLUSION
In this project, the performance of cricket players (batsmen) and toss-related analysis in the IPL for the seasons 2008-2018 have been visualized. Finding the hidden parameters, patterns and attributes that lead to the outcome of a cricket match helps team owners and selectors to recognize better players. The salary of an IPL player is decided through the auction process. Thus, deciding which player to bid for, and at what cost, based on the player's past performance in the IPL is an important part of a franchise's decision making. Every selector needs young and dynamic players who can handle pressure calmly and take the team over the winning line.

This project highlights player performance, especially that of batsmen, and covers the analysis done for Maximum Man of the Match awards, Maximum Centuries Scored by Batsmen, Top Batsmen, Batsmen with Top Strike Rate, and Top 10 Players with Maximum Runs. Statistics of 696 matches have been used in this experiment, including for toss-related analysis such as Count of Toss Wins, Decision Taken by Each Team after Winning the Toss, Toss Decision Season Wise, and Toss Decision Team Wise. SK Raina is considered the finest batsman; he is second in the top lists of batsmen for maximum runs, maximum Man of the Match awards and maximum centuries scored. V Kohli is in first position for maximum runs and is also in the list for maximum centuries. The other Indian star batsmen, MS Dhoni (best captain, maximum runs and maximum Man of the Match awards), Rishabh Pant (second-best strike rate and maximum centuries), RG Sharma, S Dhawan, G Gambhir, YK Pathan and M Vijay, performed very well in the last five overs. Selectors have a clear reason to give preference to Indian players first, as they performed very well in the seasons from 2008 to 2018.

We also presented toss-related analysis, in which MS Dhoni is the best captain for CSK, who won the toss the maximum number of times, with a count of 77, and elected to bat first. Their choice to bat first mostly results in a win. Most of the time, captains elect to field first so that they can plan and perform well while chasing. RCB, KKR, MI and KXIP elected to field first most of the time, with counts of 57 and 49. Selectors have a clear choice to select batsmen from Mumbai Indians and Kings XI Punjab, as these two teams handled pressure very well during all the seasons from 2008 to 2018. By considering all of this visualization and toss-related analysis, team management can select the right players and the right teams at the time of auction. A good and strong cricket team, with the highest chance of winning, can be formed within a given budget.
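The toss-related counts discussed above could be produced along the following lines. This is a sketch under assumptions: the column names toss_winner and toss_decision are assumed here, and the rows are made up, so the numbers it prints are not the project's results.

```python
import pandas as pd

# Made-up toss records (hypothetical data and assumed column names)
tosses = pd.DataFrame({
    "toss_winner":   ["Team A", "Team A", "Team B", "Team B", "Team B"],
    "toss_decision": ["bat", "field", "field", "field", "bat"],
    "winner":        ["Team A", "Team B", "Team B", "Team A", "Team B"],
})

# Rows: team that won the toss; columns: what it chose to do
decisions = pd.crosstab(tosses["toss_winner"], tosses["toss_decision"])

# Share of matches in which the toss winner also won the match
toss_and_match = (tosses["toss_winner"] == tosses["winner"]).mean()

print(decisions)
print(f"Toss winner also won the match in {toss_and_match:.0%} of games")
```

A cross-tabulation like this gives the toss decision counts team-wise; grouping by season instead of team would give the season-wise view.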

BIBLIOGRAPHY

1. Informatics Practices with Python by Preeti Arora

2. http://en.wikipedia.org

3. http://www.botskOOl.com
