Group 6

Optimal Model to Predict Defender’s Tackle Points


-Submitted to Dr. Achint Nigam by Group 6

Aigal Chetas Manjunath - 2023H1540802P


Kannaiahgari Sahith - 2023H1540832P
Sunil Kumar Behera - 2023H1540843P
Naveen Kumar - 2023H1540845P
Vudayagiri Sai Shiva Kumar - 2023H1540859P
M V S Sri Sathvika - 2023H1540864P
Bodempudi Siri - 2023H1540871P
OVERVIEW
1. Problem Statement
2. Objective
3. Data Set Snapshot
4. Methodology
5. Libraries Used
6. Linearity Check
7. Multiple Linear Regression
8. Correlation Matrix Plot
9. Model - 1
10. Model - 2
11. Model - 3
12. Comparison between models
13. Conclusion

Problem Statement

Problem:
In Professional Kabaddi League, defenders play a pivotal role in
preventing the opposing team’s raiders from scoring points.
Analysing and estimating a defender’s ability to accumulate tackle
points can give team management insights that support strategic
decisions.
Proposed Solution:
To address this problem, a predictive model using Multiple Linear
Regression is developed. It forecasts the tackle points that a
defender is likely to score in kabaddi matches.

Objective

Predict the Total Tackle Points scored by a defender, taking the following
parameters into consideration:
● Total Tackles
● Height
● Weight
● Age
● High 5s
● Super Tackles
● Matches Played
● Auction Price
● Position
● Average time on mat

Upon completion of this project, a predictive model is developed which can
accurately estimate a defender’s tackle points in kabaddi matches.

Data Description

Data Set Snapshot

Methodology
● Linearity Check
● Correlation Matrix
● Model Building
● Model Evaluation
● Model Comparison
● Picking the best model

Libraries Used
● library(tidyverse)
● library(ggplot2)
● library(coefplot)
● library(corrplot)
● library(caret)

Linearity Check

Multiple Linear Regression
● Multiple Linear Regression is a statistical method used to model the
relationship between a dependent variable and two or more
independent variables.
● There are many ways to build an MLR model. We used the
following approaches:
1. Building a model based on Correlation
2. Stepwise Regression
3. Building a model based on p-values
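
As a minimal sketch of what fitting such a model looks like in R, the block below uses `lm()` from base R on synthetic data. The data frame `defenders`, its column names and all values are assumptions made up for illustration; they are not the project's data set.

```r
# Synthetic data only: column names and values are hypothetical,
# not the project's data set.
set.seed(42)
n <- 60
defenders <- data.frame(
  total_tackles = rpois(n, lambda = 40),
  super_tackles = rpois(n, lambda = 3),
  age           = round(runif(n, 20, 35))
)
# Simulated response: linear in two predictors plus random noise
defenders$tackle_points <- 0.5 * defenders$total_tackles +
  1.0 * defenders$super_tackles + rnorm(n, sd = 1)

# Fit a multiple linear regression with lm() from base R
fit <- lm(tackle_points ~ total_tackles + super_tackles + age,
          data = defenders)

# Coefficient table: estimates, standard errors, t values and p-values
print(summary(fit)$coefficients)
```

Each of the three approaches listed above amounts to choosing a different set of terms for the right-hand side of the formula.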
Correlation Matrix Plot
A correlation matrix is a table of pairwise correlation coefficients that shows
how strongly each pair of variables in a data set is linearly related. For our
project it is plotted below, with a threshold of 0.7.
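
A hedged sketch of how a 0.7 threshold might be applied: compute the matrix with `cor()` from base R and list the pairs whose absolute correlation exceeds the threshold. The variables and values are synthetic assumptions; the deck's own plot was presumably drawn with the corrplot package it lists.

```r
# Synthetic predictors only: names and values are hypothetical.
set.seed(1)
n <- 80
matches       <- rpois(n, 18) + 1
total_tackles <- 3 * matches + rnorm(n, sd = 3)  # tied to matches by construction
height        <- rnorm(n, mean = 178, sd = 6)    # unrelated to the other two
predictors <- data.frame(matches, total_tackles, height)

# Pairwise Pearson correlations
corr <- cor(predictors)

# Flag pairs in the upper triangle whose |r| exceeds the threshold
threshold <- 0.7
high <- which(abs(corr) > threshold & upper.tri(corr), arr.ind = TRUE)
pairs <- data.frame(
  var1 = rownames(corr)[high[, "row"]],
  var2 = colnames(corr)[high[, "col"]],
  r    = corr[high]
)
print(pairs)
```

Restricting the search to the upper triangle avoids reporting each pair twice and skips the diagonal, where every variable correlates perfectly with itself.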
Model - 1: Correlation Model
● We drop Matches Played, High 5s and Average time on mat, and keep Total
Tackles because it is highly correlated with the dependent variable.
● The model is built without these 3 independent variables.
The results are as follows:

Graphical Interpretations of Models
1. Coefficient Plot:

Evaluation of Model - 1

Model – 2: Stepwise Regression
Stepwise regression with backward elimination using the Akaike
Information Criterion (AIC) is a method for automatic variable selection in a
regression model. The AIC is a measure of a model's goodness of fit,
penalized for the number of parameters, which helps prevent overfitting.
Initial Model:
● Start with a full model that includes all potential predictor variables.
Backward Elimination:
● Iteratively remove the variable whose removal gives the lowest AIC.
● The criterion balances model complexity (number of predictors)
against goodness of fit.
Iteration:
● At each step, the algorithm computes the AIC of each reduced model
obtained by leaving out one variable.
● Continue removing variables as long as a removal lowers the AIC.
Final Model:
● The process stops when removing any remaining variable no longer
lowers the AIC.
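
The procedure above can be sketched with `step()` from base R, which performs AIC-based backward elimination when called with `direction = "backward"`. The data are synthetic and the column names are assumptions for illustration only.

```r
# Synthetic data only: names and values are hypothetical.
set.seed(7)
n <- 100
d <- data.frame(
  total_tackles = rpois(n, 40),
  super_tackles = rpois(n, 3),
  height        = rnorm(n, 178, 6),  # no true effect in this simulation
  weight        = rnorm(n, 78, 8)    # no true effect in this simulation
)
d$tackle_points <- 0.5 * d$total_tackles + d$super_tackles + rnorm(n)

# Initial model: every candidate predictor
full <- lm(tackle_points ~ ., data = d)

# Backward elimination: drop terms while doing so lowers the AIC
reduced <- step(full, direction = "backward", trace = 0)

print(formula(reduced))
print(c(full = AIC(full), reduced = AIC(reduced)))
```

Setting `trace = 1` instead prints the AIC of every candidate removal at each step, which corresponds to the step-by-step listing a slide would show.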
Model – 2 (Stepwise Regression): Steps and Summary

Evaluation of Model - 2

Model – 3 (P-Value approach)
● Initially we built a model including all the independent variables and
examined its summary.
● In the next step we retained only the variables that are significant, i.e.
p-value <= 0.05.
● We considered the following variables to build the model:
1) Total Tackles
2) Super Tackles
3) High 5s

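
A minimal sketch of the p-value approach, assuming numeric predictors only: fit the full model, read the p-values from `summary()`, and refit with the terms at or below 0.05. The data and column names are synthetic assumptions, so the variables selected here need not match those of the project.

```r
# Synthetic data only: names and values are hypothetical.
set.seed(11)
n <- 100
d <- data.frame(
  total_tackles = rpois(n, 40),
  super_tackles = rpois(n, 3),
  age           = round(runif(n, 20, 35)),
  weight        = rnorm(n, 78, 8)
)
d$tackle_points <- 0.5 * d$total_tackles + d$super_tackles + rnorm(n)

# Step 1: full model with all predictors
full <- lm(tackle_points ~ ., data = d)

# Step 2: p-values of the predictors (row 1 is the intercept, so drop it)
pvals <- summary(full)$coefficients[-1, "Pr(>|t|)"]
keep  <- names(pvals)[pvals <= 0.05]

# Step 3: refit using only the significant predictors.
# Matching coefficient names to columns works for numeric predictors;
# a factor such as Position would need its dummy names mapped back.
model3 <- lm(reformulate(keep, response = "tackle_points"), data = d)
print(formula(model3))
```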
Evaluation of Model - 3

Comparison Between Models

Using ANOVA
• As per the ANOVA test, stepwise regression (Model 2) is the best model.

Using AIC
• As per the AIC comparison, stepwise regression with backward elimination
(Model 2) is the best model.

Conclusion
● All 3 models were compared using ANOVA and AIC.
● In the ANOVA comparison, Model 2 has the least RSS, indicating the
best model among the 3.
● In the AIC comparison, Model 2 has the least AIC, indicating the best
model out of the 3.
● Hence, as per the ANOVA and AIC comparisons, we conclude that the
stepwise regression model is the best.
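
A hedged sketch of such a comparison with `anova()` and `AIC()` from base R. The three models here (`m_small`, `m_mid`, `m_full`) are hypothetical nested fits on synthetic data, not the project's Models 1 to 3. Two points are worth keeping in mind when reading the output: the F-tests from `anova()` assume the models are nested, and RSS never increases when predictors are added, which is why AIC's penalty on the number of parameters matters.

```r
# Synthetic data only: names and values are hypothetical.
set.seed(3)
n <- 100
d <- data.frame(
  total_tackles = rpois(n, 40),
  super_tackles = rpois(n, 3),
  age           = round(runif(n, 20, 35))  # no true effect in this simulation
)
d$tackle_points <- 0.5 * d$total_tackles + d$super_tackles + rnorm(n)

# Three nested candidate models
m_small <- lm(tackle_points ~ total_tackles, data = d)
m_mid   <- lm(tackle_points ~ total_tackles + super_tackles, data = d)
m_full  <- lm(tackle_points ~ total_tackles + super_tackles + age, data = d)

# ANOVA table: RSS per model and F-tests between successive models
print(anova(m_small, m_mid, m_full))

# AIC table: lower is better
aic_table <- AIC(m_small, m_mid, m_full)
print(aic_table)
best <- rownames(aic_table)[which.min(aic_table$AIC)]
print(best)
```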
Regression Equation:

Multiple R-squared: 0.9559
So, 95.59% of the variance in Total Tackle Points is explained by the
independent variables in the model.
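
For reference, the standard definition of multiple R-squared is given below, where $y_i$ are the observed tackle points, $\hat{y}_i$ the fitted values and $\bar{y}$ the sample mean (the symbols are notation introduced here, not taken from the slides). With the reported value, the unexplained share of the variance is $1 - 0.9559 = 0.0441$.

```latex
R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}
    = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\frac{\mathrm{RSS}}{\mathrm{TSS}} = 1 - 0.9559 = 0.0441
```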
Thank You
