BAX 442 Syllabus
Winter 2020
Office Hours: You can contact me on Slack any time. For video conferencing, we can use Zoom
(I will send the link) or Skype (my id: pnaik007). For in-person meetings, please use Slack or
email me (panaik007@gmail.com) to arrange a mutually convenient date and time.
Course Description
This course introduces statistical methods to solve business problems. It covers both
cross-sectional and time-series regression models. Cross-sectional regression models are
presented in the first five classes and time-series regression models in the subsequent classes.
We begin gently with the linear regression model that you studied in the Fall quarter. In
class 1, we refresh our understanding of linear regression and then apply it to design new
products. Specifically, how can managers learn consumers’ willingness to pay for some
feature of a new product? What would be its expected market share with and without the new
features? What should be the “right” price?
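To preview the idea, here is a minimal sketch in Python of estimating willingness to pay (WTP) from a regression of preference on a feature dummy and price. The data, coefficients, and variable names are hypothetical illustrations, not the course's actual datasets or methods:

```python
import numpy as np

# Hypothetical conjoint-style data: each row is a product profile with a
# feature dummy (1 = has the new feature) and a price; y is stated preference.
rng = np.random.default_rng(0)
n = 200
feature = rng.integers(0, 2, size=n)          # 0/1: new feature absent/present
price = rng.uniform(5, 15, size=n)            # price in dollars
# Assumed true model: preference rises with the feature, falls with price.
y = 2.0 + 1.5 * feature - 0.4 * price + rng.normal(0, 0.5, size=n)

# Ordinary least squares via the normal-equations solver
X = np.column_stack([np.ones(n), feature, price])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# WTP for the feature = feature coefficient divided by |price coefficient|
wtp = beta[1] / -beta[2]
```

The ratio of the feature coefficient to the (negative) price coefficient converts the feature's utility into dollar terms, which is one common way to read WTP off a linear model.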
In linear regression, we require more observations than the number of variables. But what if
the number of variables exceeds the sample size? Classes 2 and 3 tackle such “big data”
estimation. The term “big” refers to either a large sample size (Big-N) or many variables
(Big-p). The former involves no new statistical issues (only computational ones). Hence, we
focus on the Big-p problem and learn Principal Components Regression (PCR). We apply PCR
to create perceptual maps, which help us visualize competing brands and differentiate the
focal brand.
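A minimal PCR sketch in Python, using synthetic data with more variables than observations (the Big-p setting); the dimensions and data are illustrative assumptions, not course materials:

```python
import numpy as np

# Principal Components Regression: compress p predictors into k components,
# then regress y on the k component scores instead of all p predictors.
rng = np.random.default_rng(1)
n, p, k = 30, 50, 3                      # n < p: Big-p setting
X = rng.normal(size=(n, p))
y = X[:, :2].sum(axis=1) + rng.normal(0, 0.1, size=n)

Xc = X - X.mean(axis=0)                  # center the predictors
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:k].T                        # scores on the first k components

# Ordinary regression of y on the k scores (plus an intercept)
gamma, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), Z]), y, rcond=None)
y_hat = np.column_stack([np.ones(n), Z]) @ gamma
```

The component scores are mutually orthogonal by construction, which also makes the first two scores a natural set of coordinates for a perceptual map.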
In PCR we combine all the original variables (i.e., eliminate none) to create a few new variables.
In contrast, in class 3, we learn how to eliminate many of the original variables and retain just a
few important ones. To this end, we apply “shrinkage” via Ridge regression, Lasso regression,
and Elastic Net regression. Thus, we shall learn two ways of tackling Big-p data: dimension
reduction (PCR) and shrinkage methods (Ridge, Lasso, EN).
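As a sketch of the shrinkage idea, here is Ridge regression in closed form on synthetic data; the penalty values and data are illustrative assumptions (Lasso and Elastic Net need iterative solvers and are not shown):

```python
import numpy as np

# Ridge regression: beta = (X'X + lam * I)^{-1} X'y.
# As the penalty lam grows, the coefficients shrink toward zero.
rng = np.random.default_rng(2)
n, p = 40, 10
X = rng.normal(size=(n, p))
y = X @ np.r_[3.0, -2.0, np.zeros(p - 2)] + rng.normal(0, 0.5, size=n)

def ridge(X, y, lam):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

b_small = ridge(X, y, 0.01)      # nearly ordinary least squares
b_large = ridge(X, y, 1000.0)    # heavily shrunken coefficients
```

Comparing the two fits shows the shrinkage path: the coefficient vector's norm falls as the penalty rises, trading a little bias for a large reduction in variance.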
The above regression models assume a linear relation between the response and predictor
variables. But what if this relation is not linear? Nonlinearity need not be known a priori; it
can be more complex than what variable transformations, polynomial terms, or interaction
terms can capture. To analyze such data, in class 4, we learn how to apply tree-based methods
such as Bagging, Random Forest, and Boosting. They help us discover important variables
without assuming specific nonlinear functions.
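To illustrate the bagging idea, here is a sketch that averages many single-split regression "stumps", each fit on a bootstrap sample; the stump fitter, data, and tree count are simplified assumptions (real implementations grow full trees):

```python
import numpy as np

# Bagging: fit a weak learner on many bootstrap samples and average them.
rng = np.random.default_rng(3)
n = 200
x = rng.uniform(-3, 3, size=n)
y = np.sin(x) + rng.normal(0, 0.2, size=n)   # nonlinear truth, unknown a priori

def fit_stump(x, y):
    """Best single-split regression stump: (split, left_mean, right_mean)."""
    best = (0.0, y.mean(), y.mean(), np.inf)
    for s in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left, right = y[x <= s], y[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean())**2).sum() + ((right - right.mean())**2).sum()
        if sse < best[3]:
            best = (s, left.mean(), right.mean(), sse)
    return best[:3]

def bagged_predict(x_train, y_train, x_new, n_trees=100):
    preds = np.zeros_like(x_new, dtype=float)
    for _ in range(n_trees):
        idx = rng.integers(0, len(x_train), size=len(x_train))  # bootstrap
        s, lm, rm = fit_stump(x_train[idx], y_train[idx])
        preds += np.where(x_new <= s, lm, rm)
    return preds / n_trees

pred = bagged_predict(x, y, np.array([-1.5, 1.5]))
```

Averaging over bootstrap samples smooths the jagged single-stump fit; Random Forest adds random feature subsets at each split, and Boosting fits the learners sequentially to residuals instead of averaging independent fits.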
The above 8 types of regression (LM, PCR, Ridge, Lasso, EN, Bagging, Random Forest,
Boosting) belong to the set of cross-sectional methods and constitute the content for the
Midterm Exam in class 5.
In class 6, we turn our attention to time-series analysis and forecasting. In time-series data,
the past influences the present and the present influences the future. Such inter-temporal
dependencies are the main difference between cross-sectional and time-series analysis. Given
any single time-series variable (e.g., sales, consumer price index, GDP, temperature), how
should we separate the signal from the noise, and how should we further decompose this
signal into trend and seasonal components? To this end, we shall use the methods of moving
averages, loess (nonparametric regression), and the Holt-Winters filter.
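The moving-average route to decomposition can be sketched as follows; the synthetic monthly series and the simple centered window are illustrative assumptions (a textbook even-window version uses a 2x12 moving average):

```python
import numpy as np

# Decomposition sketch: monthly series = trend + seasonal + noise.
# A centered 12-month moving average estimates the trend; averaging the
# detrended values by calendar month estimates the seasonal component.
rng = np.random.default_rng(4)
t = np.arange(120)                               # 10 years of monthly data
trend = 0.5 * t
seasonal = 10 * np.sin(2 * np.pi * t / 12)
y = trend + seasonal + rng.normal(0, 1, size=120)

def centered_ma(y, window=12):
    """Simple centered moving average; the ends are left as NaN."""
    kernel = np.ones(window) / window
    out = np.full(len(y), np.nan)
    half = window // 2
    out[half:len(y) - half + 1] = np.convolve(y, kernel, mode="valid")
    return out

trend_hat = centered_ma(y)                       # smooths out the seasonality
detrended = y - trend_hat
month = t % 12
seasonal_hat = np.array([np.nanmean(detrended[month == m]) for m in range(12)])
```

Because the window spans exactly one seasonal period, the seasonal component averages out of the trend estimate; what remains after detrending is the seasonal pattern plus noise.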
Classes 8 and 9 demystify the celebrated Kalman filter (KF). For example, NASA used the KF
to land a man on the Moon; the Department of Defense used it in anti-aircraft gunfire control;
the aerospace industry uses it for navigational guidance of spacecraft, aircraft, ships, cars, and
now people via GPS on smartphones; the KF has also found applications in statistics,
economics, and business. I pioneered its application to advertising (Naik et al. 1998). We shall
apply the KF to quantify and infer the time-varying effectiveness of marketing actions.
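A minimal one-dimensional Kalman filter sketch: a random-walk state observed with noise. The noise variances are assumed known here for illustration (class 9 covers estimating them by maximum likelihood):

```python
import numpy as np

# State equation:       x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
# Observation equation: y_t = x_t + v_t,      v_t ~ N(0, r)
rng = np.random.default_rng(5)
T = 300
true_x = np.cumsum(rng.normal(0, 0.1, size=T))   # slowly drifting state
y = true_x + rng.normal(0, 1.0, size=T)          # noisy observations

def kalman_1d(y, q=0.01, r=1.0):
    """Filtered state estimates; q, r are assumed-known noise variances."""
    x_hat, P = 0.0, 1.0
    out = np.empty(len(y))
    for t, obs in enumerate(y):
        P = P + q                      # predict: uncertainty grows
        K = P / (P + r)                # Kalman gain
        x_hat = x_hat + K * (obs - x_hat)  # update toward the observation
        P = (1 - K) * P                # uncertainty shrinks after the update
        out[t] = x_hat
    return out

x_filtered = kalman_1d(y)
```

The filter recursively blends each new observation with the prediction, weighting by the Kalman gain, so the filtered path tracks the hidden state far more closely than the raw observations do.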
The final class reserves time to catch up, review, and discuss the advancing frontiers of statistics
and business.
Grading
• Participation (10%)
• Homework (40%) – the lowest-scoring HW of the 5 HWs will be dropped.
• Final Exam (50%)
Textbooks
Tentative Plan

Readings: Book 2: Ch4, Ch5
Breakout Session
• Practice them on various SKU time series

Class 7 (Feb 18): ARIMAX Models (Readings: Book 2: Ch8; Book 3: Ch2.6)
• AR(p)
• MA(q)
• Integrated (d)
• ACF, PACF
• ARIMA(p,d,q)
• ARIMA with X variables (ARIMAX)
Breakout Session: Marketing Mix Modeling (Reading 3)
HW4: Using variables identified in HW3, estimate their impact on online and offline sales

Class 8 (Feb 25): State Space Models
• Holt-Winters αβγ filter as a special case
• ARIMAX as a special case
• Regression with time-varying parameters

Class 9 (Mar 3): KF Estimation
• Coding KF + Maximum Likelihood
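The ARIMA building blocks above can be previewed with the simplest case, an AR(1) model; the simulated series and parameter values are illustrative assumptions:

```python
import numpy as np

# AR(1): y_t = phi * y_{t-1} + e_t. Simulate it, then recover phi by
# regressing y_t on y_{t-1} (conditional least squares, no intercept).
rng = np.random.default_rng(6)
phi_true = 0.7
T = 1000
e = rng.normal(0, 1, size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi_true * y[t - 1] + e[t]

phi_hat = float((y[1:] @ y[:-1]) / (y[:-1] @ y[:-1]))   # OLS slope

# The lag-1 autocorrelation (first point of the ACF) should also be near phi
acf1 = float(np.corrcoef(y[1:], y[:-1])[0, 1])
```

For a stationary AR(1), the ACF decays geometrically as phi^k, which is exactly the diagnostic pattern the ACF/PACF topics in class 7 use to pick p and q.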
Remarks
• Class 1 is mandatory. If you miss it, you will lose 10 points from your final score.
• There is no make-up for the Final Exam, so plan accordingly.
• For travel and other reasons (to be approved case by case), you may miss any one lecture
other than the first one. If you miss additional lectures, 10 points (out of 100) per 3-hour
lecture will be deducted from your final score.
• The use of smartphones and texting in class is not allowed because it distracts from my
teaching. Laptops may be used in class for note-taking, analysis, and coding, but not for
surfing the Internet or checking/responding to email.
• Academic Code of Conduct: You are required to uphold the University’s Regulation 537 on
Exams, Plagiarism, Unauthorized collaboration, Lying, Disruption, and other issues. Read
the academic code of conduct at this link: http://sja.ucdavis.edu/files/cac.pdf