Sonal Fra Milestone1 v1


Project – FRA MILESTONE 1

NAME: Sonal Singh


PGP-DSBA Online July’ 21
Date: 17/10/2021

Summary: This project report supports a potential investor in deciding
which company to invest in, so that the chosen company is likely to
perform well in future and the investor gains on the invested amount
rather than losing it.

Data Description:

Ans:
• Number of rows: 3586
• Number of columns: 67
• DATA INFO:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3586 entries, 0 to 3585
Data columns (total 67 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Co_Code 3586 non-null int64
1 Co_Name 3586 non-null object
2 Networth Next Year 3586 non-null float64
3 Equity Paid Up 3586 non-null float64
4 Networth 3586 non-null float64
5 Capital Employed 3586 non-null float64
6 Total Debt 3586 non-null float64
7 Gross Block 3586 non-null float64
8 Net Working Capital 3586 non-null float64
9 Current Assets 3586 non-null float64
10 Current Liabilities and Provisions 3586 non-null float64
11 Total Assets/Liabilities 3586 non-null float64
12 Gross Sales 3586 non-null float64
13 Net Sales 3586 non-null float64
14 Other Income 3586 non-null float64
15 Value Of Output 3586 non-null float64
16 Cost of Production 3586 non-null float64
17 Selling Cost 3586 non-null float64
18 PBIDT 3586 non-null float64
19 PBDT 3586 non-null float64
20 PBIT 3586 non-null float64
21 PBT 3586 non-null float64
22 PAT 3586 non-null float64
23 Adjusted PAT 3586 non-null float64
24 CP 3586 non-null float64
25 Revenue earnings in forex 3586 non-null float64
26 Revenue expenses in forex 3586 non-null float64
27 Capital expenses in forex 3586 non-null float64
28 Book Value (Unit Curr) 3586 non-null float64
29 Book Value (Adj.) (Unit Curr) 3582 non-null float64
30 Market Capitalisation 3586 non-null float64
31 CEPS (annualised) (Unit Curr) 3586 non-null float64
32 Cash Flow From Operating Activities 3586 non-null float64
33 Cash Flow From Investing Activities 3586 non-null float64
34 Cash Flow From Financing Activities 3586 non-null float64
35 ROG-Net Worth (%) 3586 non-null float64
36 ROG-Capital Employed (%) 3586 non-null float64
37 ROG-Gross Block (%) 3586 non-null float64
38 ROG-Gross Sales (%) 3586 non-null float64
39 ROG-Net Sales (%) 3586 non-null float64
40 ROG-Cost of Production (%) 3586 non-null float64
41 ROG-Total Assets (%) 3586 non-null float64
42 ROG-PBIDT (%) 3586 non-null float64
43 ROG-PBDT (%) 3586 non-null float64
44 ROG-PBIT (%) 3586 non-null float64
45 ROG-PBT (%) 3586 non-null float64
46 ROG-PAT (%) 3586 non-null float64
47 ROG-CP (%) 3586 non-null float64
48 ROG-Revenue earnings in forex (%) 3586 non-null float64
49 ROG-Revenue expenses in forex (%) 3586 non-null float64
50 ROG-Market Capitalisation (%) 3586 non-null float64
51 Current Ratio[Latest] 3585 non-null float64
52 Fixed Assets Ratio[Latest] 3585 non-null float64
53 Inventory Ratio[Latest] 3585 non-null float64
54 Debtors Ratio[Latest] 3585 non-null float64
55 Total Asset Turnover Ratio[Latest] 3585 non-null float64
56 Interest Cover Ratio[Latest] 3585 non-null float64
57 PBIDTM (%)[Latest] 3585 non-null float64
58 PBITM (%)[Latest] 3585 non-null float64
59 PBDTM (%)[Latest] 3585 non-null float64
60 CPM (%)[Latest] 3585 non-null float64
61 APATM (%)[Latest] 3585 non-null float64
62 Debtors Velocity (Days) 3586 non-null int64
63 Creditors Velocity (Days) 3586 non-null int64
64 Inventory Velocity (Days) 3483 non-null float64
65 Value of Output/Total Assets 3586 non-null float64
66 Value of Output/Gross Block 3586 non-null float64
dtypes: float64(63), int64(3), object(1)
memory usage: 1.8+ MB
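• Number of Null Values (from the non-null counts in DATA INFO):
Inventory Velocity (Days): 103
Book Value (Adj.) (Unit Curr): 4
Each of the 11 [Latest] ratio columns (Current Ratio[Latest] to APATM (%)[Latest]): 1
All other columns: 0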
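
A minimal sketch of how the null counts per column could be obtained with pandas. The toy frame and its values are assumptions for illustration only; the real dataset has 3586 rows and 67 columns.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the company data (illustrative values only).
toy = pd.DataFrame({
    "Total Debt": [10.0, 250.0, 40.0, 5.0],
    "Inventory Velocity (Days)": [30.0, np.nan, 45.0, np.nan],
    "Current Ratio[Latest]": [1.2, 0.8, np.nan, 2.1],
})

# Count missing values per column and keep only the columns that have any.
null_counts = toy.isnull().sum()
columns_with_nulls = null_counts[null_counts > 0].sort_values(ascending=False)

print(columns_with_nulls)
print("Total missing values:", int(null_counts.sum()))
```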

1.1 Outlier Treatment

The outliers in the dataset:

Most of the numerical variables have outliers.

Outliers are treated using the interquartile range (IQR) of each
numerical column. Values above the upper range are capped at the
upper quartile (75%) value, and values below the lower range are
capped at the lower quartile (25%) value.
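
The treatment described above can be sketched as follows. This is a sketch under assumptions: the report does not state how the upper and lower range are computed, so the conventional fences Q1 - 1.5*IQR and Q3 + 1.5*IQR are assumed, and the function name `cap_outliers_iqr` and the toy values are hypothetical. Flagged values are replaced with the quartile values, as the text states; replacing them with the fence values instead is a common variant.

```python
import pandas as pd

def cap_outliers_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Replace values outside the IQR fences with the nearest quartile value."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr  # assumed definition of the "lower range"
    upper_fence = q3 + k * iqr  # assumed definition of the "upper range"
    capped = series.copy()
    capped[series > upper_fence] = q3
    capped[series < lower_fence] = q1
    return capped

# Toy column with one very large and one very small value.
toy_debt = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 2000.0, -500.0])
capped_debt = cap_outliers_iqr(toy_debt)
print(capped_debt.tolist())
```

For a whole frame, the same function could be applied column by column, for example with `df[numeric_cols].apply(cap_outliers_iqr)`, where `numeric_cols` is the list of numerical columns.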
The numerical variables after outlier treatment are shown in the boxplot:

1.2 Missing Value Treatment


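The DATA INFO listing shows 118 missing values in total. Most of them,
103, are in a single column, Inventory Velocity (Days); the rest are 4 in
Book Value (Adj.) (Unit Curr) and 1 in each of the 11 [Latest] ratio
columns.

We imputed them with the column median and ensured there are no
missing values after treatment.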
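
A minimal sketch of median imputation with pandas, followed by the check that no missing values remain. The toy frame is an assumption for illustration; only the column names come from the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame with missing values in one column (illustrative values only).
toy = pd.DataFrame({
    "Inventory Velocity (Days)": [30.0, np.nan, 45.0, 60.0, np.nan],
    "Total Debt": [10.0, 250.0, 40.0, 5.0, 80.0],
})

# Fill each numeric column's missing values with that column's median.
numeric_cols = toy.select_dtypes(include="number").columns
imputed = toy.copy()
imputed[numeric_cols] = imputed[numeric_cols].fillna(imputed[numeric_cols].median())

# Confirm that nothing is missing after treatment.
assert imputed.isnull().sum().sum() == 0
print(imputed)
```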

1.3 Transform Target variable into 0 and 1

No target variable is defined in the dataset. Since the objective is
to build a model that helps an investor decide which company to
invest in, the variable Networth Next Year can be transformed into
the target variable.

If the company's Networth Next Year is greater than 0, the company
is expected to continue returning a good investment for the investor
and is labelled 0 (NON-DEFAULT).

If the company's Networth Next Year is 0 or less, the company is
unlikely to return a good investment to the investor and is labelled
1 (DEFAULT).
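
The rule above can be sketched in one line with `numpy.where`. The toy values and the column name `default` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Toy values covering the positive, zero and negative cases.
toy = pd.DataFrame({"Networth Next Year": [120.5, 0.0, -35.2, 8.1]})

# 0 = NON-DEFAULT when next year's net worth is above 0, otherwise 1 = DEFAULT.
toy["default"] = np.where(toy["Networth Next Year"] > 0, 0, 1)

# Share of each class, useful for spotting class imbalance.
print(toy["default"].value_counts(normalize=True))
```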

1.4 Univariate & Bivariate analysis with proper interpretation

Some of the important parameters that are likely to contribute most
to determining which companies default are:
Market Capitalisation, Total Debt, ROG-Net Worth (%), Book
Value (Unit Curr), Value of Output/Gross Block, ROG-Cost of
Production (%)

Their boxplots are shown below:

1. Heat map:
Most of the variables are highly correlated with other variables in
the dataset, as can be seen in the heat map below:
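
A minimal sketch of how the correlation matrix behind such a heat map could be computed, and how the highly correlated pairs could be listed. The toy data and the 0.8 threshold are assumptions for illustration; the report does not state a threshold.

```python
import numpy as np
import pandas as pd

# Toy data: Net Sales is built to track Gross Sales closely, Total Debt is independent.
rng = np.random.default_rng(0)
sales = rng.normal(100, 20, size=200)
toy = pd.DataFrame({
    "Gross Sales": sales,
    "Net Sales": sales * 0.95 + rng.normal(0, 1, size=200),
    "Total Debt": rng.normal(50, 10, size=200),
})

corr = toy.corr()

# Keep the upper triangle only so that each pair appears once.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(upper).stack()

# Pairs whose absolute correlation exceeds the assumed threshold.
high_pairs = pairs[pairs.abs() > 0.8]
print(high_pairs)
```

The matrix `corr` is what a plotting call such as seaborn's `heatmap` would take as input.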

2. Total debt:
One company has borrowed the highest sum from the market, nearly
2000 units, and some companies have taken debt of around 500-750.
Most of the companies have debt lower than 250.
Companies with high debt may carry a high risk of not making a
profit next year.
3. ROG-Net Worth (%):
This is the rate of growth of net worth. The higher it is, the more
likely the company is to make more profit next year. The given dataset
shows nearly half of the companies with a positive growth rate of net
worth and the rest with a negative one.
4. Book Value (Unit Curr):
Book value in unit currency indicates the net asset value of the
company. If it is positive, the company has assets that can be
liquidated should losses need to be covered. Four companies have very
good book values, and some have negative book values. If the companies
with negative book values also carry high debt, their losses cannot be
covered by selling assets.
5. Value of Output/Gross Block:
The figure shows that almost 25% of the companies have a good ratio of
value of output to gross block, which means these companies are most
likely not to default.
6. ROG-Cost of Production (%):
The rate of growth of cost of production shows how fast the company's
production cost is growing. A high rate may mean that the company has
rising operating costs, a growing market share, or both.
In the dataset this rate is evenly distributed, and the companies with
the highest rates are mostly either emerging companies or companies
that are performing really well.
7. The pairplot of these variables against default is shown below:

The total debt of defaulters is high and their market capitalisation is
low, which is to say they owe more money than the company is worth in
the market. The net worth growth percentage of defaulters is lower than
that of non-defaulters, and so are their asset values. The output
values of defaulters are also lower, which is to say that companies not
likely to make a profit next year produce less.

1.5 Train Test Split

The target variable is Default. Co_Code and Co_Name are identifiers,
and Networth Next Year is the variable from which the target was
derived, so they are dropped. The remaining dataset is divided into
train and test data in a 67:33 ratio with random state 42, stratified
on Default so that train and test data have a similar proportion of
defaulters and non-defaulters. This is done because the dataset is
imbalanced, with more non-defaulters.
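
A minimal sketch of the split described above, using scikit-learn's `train_test_split`. The toy frame and the column name `default` are assumptions for illustration; the test size, random state and stratification follow the text.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame with the identifier columns, the source of the target, and one feature.
rng = np.random.default_rng(42)
n = 300
toy = pd.DataFrame({
    "Co_Code": np.arange(n),
    "Co_Name": [f"Company {i}" for i in range(n)],
    "Networth Next Year": rng.normal(50, 40, size=n),
    "Total Debt": rng.normal(100, 30, size=n),
})
toy["default"] = np.where(toy["Networth Next Year"] > 0, 0, 1)

# Drop the identifiers, the variable the target was derived from, and the target itself.
X = toy.drop(columns=["Co_Code", "Co_Name", "Networth Next Year", "default"])
y = toy["default"]

# 67:33 split, stratified so both parts keep a similar default rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42, stratify=y
)

print("Default rate in train:", round(y_train.mean(), 3))
print("Default rate in test:", round(y_test.mean(), 3))
```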

1.6 Build Logistic Regression Model (using statsmodels library)
on most important variables on Train Dataset and choose the
optimum cutoff. Also showcase your model building approach

The logistic regression model equation is:
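
For reference, the standard form of the logistic regression model is given here in generic notation. The symbols are assumptions introduced for this sketch, not taken from the report: p is the probability of default, x_k is the k-th independent variable, and the β terms are the fitted coefficients.

```latex
\log\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{k=1}^{K} \beta_k x_k
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{k=1}^{K} \beta_k x_k\right)}}
```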


• The equation is built and fitted with statsmodels as:
model = SM.logit(formula='Dependent Variable ~ Σ Independent Variable(k)', data='Data Frame containing the required values').fit()
• The first model is built using all 64 independent variables, but it
does not give any usable results because there are too many variables.
So the Variance Inflation Factor (VIF) is used: only the variables
with a VIF below 5 are kept to create the statsmodels model, and its
performance is listed below:
• Variables used:
Market Capitalisation + Selling Cost + ROG-Capital Employed (%) +
Total Debt + Other Income + Revenue expenses in forex +
Cash Flow From Operating Activities + Value of Output/Total Assets +
Equity Paid Up + ROG-Total Assets (%) + ROG-PBIT (%) + ROG-CP (%) +
Revenue earnings in forex + Adjusted PAT +
Cash Flow From Financing Activities + Net Working Capital +
ROG-Net Worth (%) + Cash Flow From Investing Activities +
Book Value (Unit Curr) + Value of Output/Gross Block +
Debtors Velocity Days + ROG-Net Sales (%) + Creditors Velocity Days +
ROG-Cost of Production (%) + Total Asset Turnover RatioLatest +
ROG-Market Capitalisation (%) + ROG-Gross Block (%) +
Inventory Velocity Days + CPM (%)[Latest] + PBIDTM (%)[Latest] +
Fixed Assets RatioLatest + Interest Cover RatioLatest +
Inventory RatioLatest + Current RatioLatest + Debtors RatioLatest +
Book Value (Adj.) (Unit Curr)

1.7 Validate the Model on Test Dataset and state the
performance matrices. Also state interpretation from the
model
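
A minimal sketch of how a cutoff could be chosen and the performance metrics computed from predicted probabilities. The toy labels and probabilities are assumptions for illustration, and so is the cutoff criterion: Youden's J statistic (the threshold that maximises true positive rate minus false positive rate) is used here, while the report does not state which criterion was applied.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_curve

# Toy labels (about 15% defaulters) and toy predicted probabilities of default.
rng = np.random.default_rng(1)
y_true = (rng.random(400) < 0.15).astype(int)
y_prob = np.clip(0.15 + 0.5 * y_true + rng.normal(0, 0.15, size=400), 0, 1)

# Pick the threshold that maximises TPR - FPR (Youden's J, an assumed criterion).
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
best = int(np.argmax(tpr - fpr))
cutoff = thresholds[best]

# Classify with the chosen cutoff and derive the usual metrics.
y_pred = (y_prob >= cutoff).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print("Cutoff:", round(float(cutoff), 3))
print("Accuracy:", round(accuracy, 3), "Precision:", round(precision, 3), "Recall:", round(recall, 3))
```

In practice the cutoff would be chosen on the train data and then applied unchanged to the test data, so that the test metrics stay an honest estimate.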
