Data Analytics Assignment 1


Data Analytics | Assignment 1 | Name – Anirban Saha | Roll - 20065

Duckworth-Lewis-Stern Method
Problem :
Using only the first-innings data in the given data set, we need to find the best-fit 'run
production functions' in terms of wickets in hand w and overs to go u.
The model is:
$$Z(u, w) = Z_0(w)\left[1 - \exp\left(-\frac{L u}{Z_0(w)}\right)\right]$$
The loss function is the sum of squared errors, summed across overs and wickets. We need to
plot the ten functions (one per value of w) and report the parameters associated with the
production functions, along with the total error.
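
As a minimal sketch of the model above, the function below evaluates Z(u, w) for a single wicket count. The name `run_production` and the numeric values are illustrative assumptions, not fitted parameters.

```python
import numpy as np

def run_production(u, z0, L):
    """Z(u, w) = Z0(w) * (1 - exp(-L * u / Z0(w))) for one fixed wicket count w.

    u  : overs to go (scalar or array)
    z0 : Z0(w), the asymptotic runs with w wickets in hand
    L  : slope of the curve at u = 0, shared by all ten functions
    """
    u = np.asarray(u, dtype=float)
    return z0 * (1.0 - np.exp(-L * u / z0))

# Illustrative values only: the curve starts at 0 and saturates towards z0.
print(run_production([0, 10, 50], z0=250.0, L=10.0))
```

With no overs left the function returns 0, and as u grows it approaches Z0(w), which is why Z0(w) can be read as the runs obtainable with w wickets in hand and unlimited overs.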

Solution Approach :
1. The data covers ODI matches from 1999 to 2011 and is provided as a CSV file. First, the
relevant columns are read from the file. During data processing, four features are extracted
for each record: innings number, remaining runs, remaining overs, and wickets remaining.
2. Next, all first-innings data points are taken, and the loss is defined as the sum of
squared errors, summed across overs and wickets. The minimization problem can be written as:
$$\min_{Z_0(1), Z_0(2), \ldots, Z_0(10), L} \; \sum_{n=1}^{N} \bigl( y_n - Z(u_n, w_n, Z_0, L) \bigr)^2$$

where $N$ is the number of first-innings data points, $y_n$ is the actual runs, and
$Z(u_n, w_n, Z_0, L)$ is the predicted runs.
3. The scipy.optimize library is then used to minimize the objective function defined above.
Two functions from this library are used: scipy.optimize.least_squares and
scipy.optimize.minimize. With scipy.optimize.minimize, several methods were tried:
L-BFGS-B, BFGS, Powell, COBYLA, and CG. The total loss for each method is compared
below:


Approach                               Total Loss
scipy.optimize.minimize, L-BFGS-B      104818193.940977
scipy.optimize.minimize, BFGS          135130856.904964
scipy.optimize.minimize, Powell        141407485.515783
scipy.optimize.minimize, COBYLA        105966517.922816
scipy.optimize.minimize, CG            105966517.922816
scipy.optimize.least_squares           5671522878410885.0
Table 1: Comparison of total loss for different methods.
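
The three steps of the solution approach can be sketched end to end as follows. This is not the original assignment code: the data is synthetic, and the function names, starting point, and bounds are all assumptions made for illustration. The sketch also shows one difference between the two SciPy entry points that is worth keeping in mind when comparing losses: `minimize` takes a scalar objective, while `least_squares` takes the vector of residuals and reports `cost` as half the sum of squared residuals.

```python
import numpy as np
from scipy.optimize import least_squares, minimize

def predict(params, u, w):
    """Z(u, w) with params = [Z0(1), ..., Z0(10), L]; w holds integers 1..10."""
    z0 = np.asarray(params[:10])[w - 1]
    return z0 * (1.0 - np.exp(-params[10] * u / z0))

def residuals(params, u, w, y):
    """Per-point errors y_n - Z(u_n, w_n); the form least_squares expects."""
    return y - predict(params, u, w)

def total_loss(params, u, w, y):
    """Sum of squared errors; the scalar form minimize expects."""
    return float(np.sum(residuals(params, u, w, y) ** 2))

# Synthetic stand-in for the CSV (hypothetical values, not the assignment data).
rng = np.random.default_rng(0)
n = 4000
innings = rng.integers(1, 3, size=n)               # innings number: 1 or 2
u_all = rng.integers(1, 51, size=n).astype(float)  # overs to go
w_all = rng.integers(1, 11, size=n)                # wickets in hand
true_params = np.append(np.linspace(30.0, 300.0, 10), 12.0)
y_all = predict(true_params, u_all, w_all) + rng.normal(0.0, 5.0, size=n)

# Step 1: keep first-innings points only.
first = innings == 1
u, w, y = u_all[first], w_all[first], y_all[first]

# Steps 2 and 3: minimize the squared-error loss over the 11 parameters.
x0 = np.append(np.linspace(10.0, 100.0, 10), 10.0)  # assumed starting point
lower = np.append(np.full(10, 1.0), 0.1)            # assumed bounds, keep Z0 and L positive

res_min = minimize(total_loss, x0, args=(u, w, y), method="L-BFGS-B",
                   bounds=list(zip(lower, [None] * 11)))
res_lsq = least_squares(residuals, x0, bounds=(lower, np.inf), args=(u, w, y))

print("L-BFGS-B sum of squared errors:     ", res_min.fun)
print("least_squares sum of squared errors:", 2.0 * res_lsq.cost)
```

To put both results on the same scale, compare `res_min.fun` with `2 * res_lsq.cost` (or recompute the loss at `res_lsq.x`). Whether a scale difference of this kind explains the least_squares row of Table 1 cannot be determined from the report alone.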

Results :
From the comparison above, scipy.optimize.minimize with the L-BFGS-B method gives the
lowest loss. The detailed result is:
TOTAL LOSS: 104818193.940977
Z[1]   Z[2]   Z[3]   Z[4]   Z[5]   Z[6]   Z[7]   Z[8]   Z[9]   Z[10]   L
12.7   27.2   49.5   75.9   100.6  132.6  162.4  199.3  229.8  272.0   11.56
Table 2: Optimized values of all Z and L parameters.
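
As a quick check on how to read Table 2, the sketch below plugs the reported parameters back into the model. The helper name `predicted_runs` is an assumption introduced here for illustration.

```python
import numpy as np

# Parameters as reported in Table 2: Z[1]..Z[10] and L.
Z0 = np.array([12.7, 27.2, 49.5, 75.9, 100.6, 132.6, 162.4, 199.3, 229.8, 272.0])
L = 11.56

def predicted_runs(u, w):
    """Predicted further runs with u overs to go and w wickets in hand (w in 1..10)."""
    z0 = Z0[np.asarray(w) - 1]
    return z0 * (1.0 - np.exp(-L * np.asarray(u, dtype=float) / z0))

# Predicted runs for a few (overs to go, wickets in hand) combinations.
for w in (10, 5, 1):
    print(w, [round(float(predicted_runs(u, w)), 1) for u in (50, 25, 10)])
```

Each Z[w] is the asymptote of its curve, so with few wickets in hand the prediction saturates after only a handful of overs, while with all ten wickets the curve is still rising at 50 overs.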

Plots :
Plots of the ten functions are given below:
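
The ten curves can be regenerated from the parameters in Table 2. The following is a minimal sketch, assuming matplotlib is available; the output file name, figure size, and styling are arbitrary choices and not taken from the original report.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file without needing a display
import matplotlib.pyplot as plt
import numpy as np

# Parameters as reported in Table 2: Z[1]..Z[10] and L.
Z0 = np.array([12.7, 27.2, 49.5, 75.9, 100.6, 132.6, 162.4, 199.3, 229.8, 272.0])
L = 11.56

u = np.linspace(0.0, 50.0, 201)  # overs to go

fig, ax = plt.subplots(figsize=(8, 5))
for w, z0 in enumerate(Z0, start=1):
    # One run production function per number of wickets in hand.
    ax.plot(u, z0 * (1.0 - np.exp(-L * u / z0)), label=f"w = {w}")

ax.set_xlabel("Overs to go (u)")
ax.set_ylabel("Predicted runs Z(u, w)")
ax.set_title("Run production functions")
ax.legend(title="Wickets in hand", ncol=2)
fig.savefig("run_production_functions.png", dpi=150)  # hypothetical file name
```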
