321 Project Write Up
Finance 321
Professor Juneja
5 May 2023
For this project, I decided to use the company John Deere, simply because it has been around
for so long and is very good at innovating its technology to give people the best possible
product. I also personally invest in this company because I believe in its agricultural
importance and longevity. John Deere has allowed our nation and world to grow to agricultural
levels that would not be possible without it. Throughout this project, I will be analyzing my
firm’s returns and forecasted stock returns, using a set of variables and linear regressions to
accomplish this goal. After this is concluded, I will have a good
1.
The first task was a simple but very important one. I had to go to the Wharton Research Data
Services (WRDS) database, which has access to all the variables needed, and collect all the
data for my company to complete my analysis.
My first step to collect this data was to go to the link and log into my account using the
information given by my professor. Once logged in, I clicked “Get Data”, then “Data
Dashboard”, then “Third Party”, then “Fama French Portfolios and Factors”, and then
“Factors-Monthly Frequency”. In step 1, I set the date range to cover all data from “1980-01”
to “2017-12”. In step 2, I picked the factors for the query, selecting all but momentum. That
left the four variables used throughout this project: excess market return, high minus low,
small minus big, and the risk-free rate. In step 3, I selected the query output, choosing
“*.xlsx” as the format and “None” as the compression type. Finally, I clicked “Submit Query”.
This concludes the steps required to collect all the monthly data containing the Fama French
factors.
Now that I had the Fama French data, it was time to collect the data for my selected
company, John Deere. I used the same website and login as before; the only additional
information I needed was my company’s stock ticker symbol, DE. Once logged in, I
clicked “Get Data”, then “CRSP”, then “Stock/Security File”, then “Monthly Stock File”.
In step 1, I selected the same date range as the Fama French data, “1980-01” to “2017-12”,
to make sure the two data sets align on the same dates. In step 2, I entered my stock ticker
and, under “time series information”, selected PRC (the price variable). In step 3, for
“Select Query Output”, I chose “*.xlsx” and “None” under compression type. The final steps
Although all I did was follow the steps to collect this data, it is crucial to make
sure all my data is collected properly so that I have accurate outputs and comparisons. Now
that I had all the Fama French data and my data for John Deere, I had to combine all of it into
one Excel sheet. Once I put the John Deere data in the same file, I made a new column for the
returns using John Deere stock prices. I did that with the simple formula (New - Old)/Old,
where the parentheses ensure the difference is divided by the old price. Doing that for every
row down to the final row, I now have a return for every year
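The return column can also be sketched outside of Excel. The snippet below is a minimal illustration of the (New - Old)/Old calculation; the function name `simple_returns` and the sample prices are assumptions made up for this example, not John Deere data.

```python
def simple_returns(prices):
    """Return period-over-period simple returns: (new - old) / old."""
    returns = []
    # Pair each price with the next one; the first row has no prior price,
    # so the result has one fewer entry than the price list.
    for old, new in zip(prices, prices[1:]):
        returns.append((new - old) / old)
    return returns


# Hypothetical month-end prices, for illustration only.
sample_prices = [100.0, 110.0, 99.0]
print(simple_returns(sample_prices))
```

Note that the first month in the sample has no return, because there is no earlier price to compare it with.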
2A.
Now that I had all my data, it was time to estimate the regression line with John Deere
stock returns as the dependent variable and the variables above as the independent
variables. To do this, I first needed to add the Analysis ToolPak to my Excel software.
With that installed, I could estimate the regression line for each of these three
variables: excess market return, small minus big, and high minus low. The process was the
same for each, except that the X variable changes. The steps were to click:
Data>Data Analysis>Regression. After you click Regression, you get this
pop-up.
For all three regressions, the Y Range is always the stock return. I also made sure
to include Line Fit Plots and Residuals for each. The only thing that changes between them is
the X Range. For example, for excess market return, I input all the excess market return data
from my Excel data sheet. Clicking “OK” then created a new sheet, which I will provide below
for each. Once it was created, I made sure each sheet had a “line fit plot” and added the
trendline, the Y formula, and the R-squared to each graph. This can all be done by clicking
the sidebar of each graph and selecting them to display on the chart. The last thing I had to
change was to relabel each “X Variable” under “Intercept” with the variable I was
estimating a regression line for. For Question 2A, there were 3 parts: 2AA (Excess
Market Return), 2AB (Small Minus Big), and 2AC (High Minus Low). Below I will display the
table and data for each variable with its corresponding part of the question. I will also put a brief
“Excess return is identified by subtracting the return of one investment from the total return
“Small Minus Big (SMB) is a size effect based on the market capitalization of a company. SMB
measures the historic excess of small-cap companies over big-cap companies” (Investopedia 1).
companies with a high book-to-market value ratio (value companies) and companies with a low
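As a rough cross-check on the single-variable regressions described in this section, the intercept, slope, and R-squared that Excel reports can be reproduced with ordinary least squares. The sketch below is only an illustration: the function name `simple_ols` and the sample numbers are assumptions, and it does not compute the p-values or residual plots that the Excel tool also produces.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical factor values (x) and stock returns (y), for illustration only.
factor = [0.01, -0.02, 0.03, 0.00, 0.02]
stock_return = [0.02, -0.03, 0.05, 0.01, 0.01]
print(simple_ols(factor, stock_return))
```

Running the same function three times, once with each factor as `x`, mirrors the three separate regressions in parts 2AA, 2AB, and 2AC.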
2B.
After those were performed, it was time to do the overall model regression analysis. The
process was similar to what I did before, using the Analysis ToolPak and following the
steps: Data>Data Analysis>Regression. At the same blank pop-up as before, I again input
the same Y Range as before: all John Deere stock returns from 1980-2017. For the X
Range, I input all of the variables above: excess market return, high minus low, and
small minus big. I did not need a table for this one, so I then clicked “OK”. Once it created the
new sheet, I had to change the name of what was below “Intercept” to its corresponding name in
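The overall model uses several independent variables at once. One way to sketch how such coefficients can be estimated is to solve the normal equations (X'X)b = X'y. The code below is an illustration under that assumption, not a description of how Excel computes its output; the helper names `solve_linear_system` and `multiple_ols` and the sample data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def multiple_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk and return [b0, b1, ..., bk]."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical monthly rows of (EMR, HML, SMB) and stock returns, illustration only.
factors = [(0.01, 0.00, 0.02), (-0.02, 0.01, 0.00), (0.03, -0.01, 0.01),
           (0.00, 0.02, -0.01), (0.02, 0.01, 0.01)]
returns = [0.02, -0.03, 0.05, 0.01, 0.01]
print(multiple_ols(factors, returns))
```

For real data, a statistics package would normally be preferable, since it also reports standard errors, p-values, and the significance F.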
With all of that now complete, it is time to use that data to determine which regression
coefficients are statistically significant. For a variable’s coefficient to be statistically
significant, the p-value for that variable has to be below 0.05. Below I will
show each variable’s strength as well as each p-value and significance F. Below that will be the
overall model coefficient data. For this table, I used the data from the question above, but rearranged
because of the significance F. That is because it is so minuscule and close to 0. For the
individual models, the only one that I would say is moderately significant is HML due to how the
2D.
According to Investopedia, “Adjusted R-squared can provide a more precise view of that
correlation by also taking into account how many independent variables are added to a
particular model against which the stock index is measured” (Investopedia 1). R-squared is
expressed as a value between 0 and 100%, with 100% being a perfect correlation. Investopedia also
states that “The figure does not indicate how well a particular group of securities is performing.
It only measures how closely the returns align with those of the measured benchmark. It is also
backward-looking—it is not a predictor of future results” (Investopedia 1). Below I will show
similar tables to those in 2C but will add R-squared and adjusted R-squared instead.
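For reference, adjusted R-squared is derived from R-squared by penalizing the number of independent variables: adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of independent variables. The short sketch below shows that standard formula; the function name and the sample inputs are assumptions for illustration and are not values from this project.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical inputs, for illustration only.
print(adjusted_r_squared(0.5, 11, 2))   # moderate fit, two predictors
print(adjusted_r_squared(0.0, 50, 3))   # no fit: the adjustment pushes it below zero
```

The second call shows why an adjusted R-squared can be negative when a variable explains almost none of the variation in returns.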
In the individual regressions for John Deere, the adjusted R-squared values for HML and SMB
are both very small and/or negative, showing very little explanatory power. Something to note on the overall
closely correlated to the overall Adjusted R-Squared, there is proof of explanatory power in the
model’s meaning. This suggests that John Deere’s returns are more influenced by what is
3A.
Next, I had to run another overall model regression, using only the data from 1980
analysis>Regression. I then input the stock returns from those dates into the Y Range
(dependent) and then the EMR, SMB, HML, and RFR into the X Range (Independent). Click ok
With the information I got from 3A, I was now able to predict the stock price return for
2001-2017 and compute the root mean square error of the model-predicted return over the same
period. The first thing I did was make a new column in the WRDS Excel sheet labeled “Pred
Ret (Yhat)” for my forecasted stock return. I then input this formula into that
1.9527*RFR.
The numbers I input into that formula were all the coefficients for each independent variable
that I got from doing the 1980-2000 overall model regression analysis. Once in there, I input
each variable’s value for that month into the equation to get that month’s value. I then clicked
WRDS sheet, I had to make two new columns right next to “Pred Ret (Yhat)”. Those columns were
labeled “Error (y-yhat)” and “SQ Error”. The first column, “Error (y-yhat)”, was simply the
John Deere return minus “Pred Ret (Yhat)” for that month. I then pressed Ctrl+D to fill the
formula down the whole column. The next column, “SQ Error”, was just that month’s
“Error (y-yhat)” squared. I then pressed Ctrl+D again to fill the formula down the whole
column.
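The per-month calculations described above (predicted return, error, and squared error) can be sketched as follows. The coefficient and factor values are hypothetical placeholders, not the coefficients estimated from the 1980-2000 regression, and the function names are made up for this example.

```python
def predict_return(coefs, factors):
    """Predicted return = intercept + sum of (coefficient * factor value).

    coefs:   [intercept, b_EMR, b_SMB, b_HML, b_RFR]
    factors: [EMR, SMB, HML, RFR] for one month
    """
    intercept, betas = coefs[0], coefs[1:]
    return intercept + sum(b * f for b, f in zip(betas, factors))


def forecast_error(actual, predicted):
    """Return (error, squared error), where error = y - yhat."""
    error = actual - predicted
    return error, error ** 2


# Hypothetical coefficients and one month of hypothetical data (illustration only).
coefs = [0.01, 1.0, 0.5, -0.5, 2.0]
factors = [0.02, 0.01, -0.01, 0.003]
actual_return = 0.05

y_hat = predict_return(coefs, factors)
error, sq_error = forecast_error(actual_return, y_hat)
print(y_hat, error, sq_error)
```

Applying the two functions to every month from 2001 to 2017 corresponds to filling the three columns down with Ctrl+D.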
Lastly, I had to get the MSE to ultimately get the RMSE. One column over, I made two new
columns named MSE and RMSE. The MSE was the average of all the SQ Errors from 2001 to
2017. Since it was an average, I did not need to apply this to every row because it would just be
the same number. The final column was the RMSE, which is simply the square
root of the MSE. I also only had to do this in one row because it would be the same number all the
way down.
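The last two columns reduce to two short calculations: the mean of the squared errors, and its square root. A minimal sketch, with hypothetical squared errors standing in for the 2001-2017 values:

```python
import math


def mean_squared_error(squared_errors):
    """MSE is the average of the squared forecast errors."""
    return sum(squared_errors) / len(squared_errors)


def root_mean_squared_error(squared_errors):
    """RMSE is the square root of the MSE."""
    return math.sqrt(mean_squared_error(squared_errors))


# Hypothetical squared errors, for illustration only.
sample_sq_errors = [4.0, 0.0, 4.0, 8.0]
print(mean_squared_error(sample_sq_errors))
print(root_mean_squared_error(sample_sq_errors))
```

Because the RMSE is in the same units as the returns themselves, it is usually easier to interpret than the MSE.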
4.
After reading over many class notes, listening to lectures, doing extensive research online, and
running multiple regression models, I am now able to discuss and interpret the strength of the
relationship between stock price returns and the independent variables for my chosen firm.
After doing regression analysis on multiple variables and comparing results, it is evident that
determining statistical significance through p-values is one of the more important tools for
interpreting a regression. In my data, I believe excess market return showed the most
significance, not only individually but also in the overall regression model. Given that the
others (SMB, HML, and RFR) were rather insignificant, it is safe to say that the best
representation of the stock’s performance was excess market return, because its results were
similar in both the individual and overall regression analyses.
Now I will discuss adjusted R-squared and its significance. The overall adjusted R-squared
gives the percentage of the variance in the stock returns that can be attributed to all the
independent variables together. For example, the adjusted R-squared in my overall
regression model was .25894. As a percentage, that means the four variables used
(EMR, SMB, HML, RFR) explained about 25.9% of the variance in John Deere’s stock returns. With
that being said, the very small overall significance F value indicates that the model as a
whole is significant, but it does not by itself show that each individual independent variable
is significant. As I stated a few steps prior, I believe that the similarity between the
adjusted R-squared for EMR individually and for the overall regression model shows that EMR
is much more significant when it comes to the reason
To conclude, through all the research and models I completed analyzing this company
and its past, I believe the biggest factor in the stock’s performance is the excess
market return. Although HML also had a “significant” p-value, I don’t think it had enough
significance to be considered a main driver of the stock returns. EMR was the only one to have
a truly significant p-value both individually and in the overall regression model. Lastly, having a
high significance in excess market return is something all investors want and look for. It shows
that they constantly beat benchmarks and are not only outdoing themselves but also showing
and-adjusted-rsquared.asp#toc-r-squared-vs-adjusted-r-squared-an-overview
https://corporatefinanceinstitute.com/resources/valuation/fama-french-three-factor-model/