PAMS-22Fall-Smart Marketing with RRM-5-Optimization
Identification
Identification concerns how we can recover our parameters from the information in the data.
• Our data will not match our model perfectly, because we can only observe limited information.
• We separate the observable and the unobservable (and unknown) parts into predictions and errors.
• Identification (accuracy) therefore depends on whether we can constrain the error, i.e., find a clean relationship between parameters and observations, through test marketing or extraneous shocks.
Likelihood Function
When we have the relevant data, the mission becomes fitting the model by comparing data and predictions, which involves three steps.
Firstly, write down the prediction.
• We model the predictions (usually using linear modeling), and treat the errors as randomly distributed
(usually normally distributed).
$$u = \text{prediction} + \text{error}$$
$$\Rightarrow u = \beta \cdot \text{observables} + \text{Normal}(\alpha, \sigma)$$
$$\Rightarrow u = \beta_1 \cdot \text{price} + \beta_2 \cdot \text{expectation} + \dots + \text{Normal}(\alpha, \sigma)$$
Link Function
A side note: after we have obtained the parameter estimates, the resulting task is also a maximization problem.
$$\text{profit} = (\text{price} - \text{cost}) \cdot \text{sales}$$
$$\text{sales} = \text{demand} = \textstyle\sum_i d_i$$
$$d_i = L^{-1}(u_i;\ \beta_1, \beta_2, \dots)$$
The remaining question, therefore, is how we can perform maximization.
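As a rough sketch of this profit-maximization setup (the logistic inverse link, the parameter values, and the per-customer utility baselines below are illustrative assumptions, not from the lecture):

```python
import math

def inverse_link(u):
    """Illustrative inverse link L^{-1}: a logistic transform of utility."""
    return 1.0 / (1.0 + math.exp(-u))

def profit(price, cost, beta1, utility_bases):
    """(price - cost) * sales, where sales = demand = sum of individual d_i."""
    sales = sum(inverse_link(base + beta1 * price) for base in utility_bases)
    return (price - cost) * sales

# A negative beta1 means a higher price lowers each purchase probability.
p = profit(price=5.0, cost=2.0, beta1=-0.5, utility_bases=[1.0, 2.0, 3.0])
```

Maximizing `profit` over `price` is then exactly the kind of problem the rest of this session addresses.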
Optimization
1. Optimization
Algorithms
2. Application
The Problem(s)
Both demand estimation and decision making need to solve a maximization problem.
• In demand estimation, we choose the distributional properties of the parameters (the actual quantities being estimated) to maximize the likelihood function.
• In decision making, we choose the decision parameters to maximize the objective function.
$$f(x) = -x^2$$
$$f'(x) = -2x \;\begin{cases} > 0 & \text{if } x < 0 \\ = 0 & \text{if } x = 0 \\ < 0 & \text{if } x > 0 \end{cases}$$
The Limitation of Mathematical Deduction
The calculation burden of analytical deduction grows quickly as more functions are involved, especially with complicated structures.
• Sometimes the derivative has no closed form.
• It is also practically difficult in a business decision environment: practitioners may not have the mathematical sophistication, and the model structure may be subject to frequent changes.
2. Solving the Problem Numerically
When we cannot solve the maximization problem analytically, a numerical solution can be good enough.
By the term numerical, we mean that we only get the solution in the form of numbers.
$$f(x) = -x^2 + 1$$
This is the analytical form of our objective function.
The computer stores such a function numerically, meaning it can output the value of f(x₀) when the input is x₀.
• f(x) = 1 when x = 0
• f(x) = 0 when x = 1 …
Brute-Force
The intuitive way to get a numerical maximum is to exhaust every possibility.
• In terms of maximization, this means searching the whole domain of definition to find the maximum value.
• This is equivalent to plotting the entire function.
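A minimal brute-force sketch of this idea, assuming a one-dimensional objective and an illustrative grid size:

```python
def brute_force_max(f, lo, hi, steps=10_000):
    """Evaluate f on an evenly spaced grid over [lo, hi]
    and return (argmax, max) among the grid points."""
    best_x, best_v = lo, f(lo)
    for i in range(1, steps + 1):
        x = lo + (hi - lo) * i / steps
        v = f(x)
        if v > best_v:
            best_x, best_v = x, v
    return best_x, best_v

# The lecture's example f(x) = -x^2 + 1 peaks at x = 0 with value 1.
x_star, v_star = brute_force_max(lambda x: -x**2 + 1, -2.0, 2.0)
```

The accuracy is limited by the grid spacing, and the cost grows with the size of the search domain.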
Accuracy
A common way to find local maximums (or minimums) is to find roots of the first derivative.
3. Root-Finding Algorithm
Beyond brute-forcing, we can add some basic rationales to improve our root-finding process.
Think about the following treasure hunting example.
Root-Finding Algorithm Rationales
The three rationales of treasure hunting can translate into root finding.
• Always run in the right direction – when f(x) is larger than 0 and f(x) is increasing, we should guess a smaller x next time.
• When the destination is close, check more often – when f(x) is already close to 0, our next guess should not be far away.
• When the path switches direction a lot, check more carefully – when f'(x) is large in absolute value, which means f(x) is changing fast, our next guess should stay conservatively close.
$$x_{n+1} = x_n - \delta \cdot \frac{f(x_n)}{f'(x_n)}$$
Stopping Point
For a root-finding algorithm, the ideal stopping point is of course finding the root.
However, numerical computation comes with error, so it is common to miss the root by a small margin.
• In practice, we need to set an error tolerance, so that we can stop once we are within that tolerance of a root.
Another consideration is that sometimes our algorithm cannot approach the root at all, which is called divergence (the converse is called convergence).
• In practice, we also need to set a maximum iteration count.
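Putting the update rule, the error tolerance, and the maximum iteration count together, a hedged sketch of (damped) Newton's method might look like:

```python
def newton_root(f, f_prime, x0, delta=1.0, tol=1e-8, max_iter=100):
    """Damped Newton's method: x_{n+1} = x_n - delta * f(x_n) / f'(x_n).
    Stops when |f(x)| is within tol, or after max_iter steps (divergence guard)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        x = x - delta * fx / f_prime(x)
    return None  # did not converge within max_iter

# Root of f'(x) = -2x for the lecture's example f(x) = -x^2: x = 0.
root = newton_root(lambda x: -2 * x, lambda x: -2.0, x0=3.0)
```

Returning `None` on non-convergence is one design choice among several; raising an exception would work equally well.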
Convergence
Newton’s Method saves a lot of time compared with the brute-force method, but it does not guarantee finding a root.
• Such a trade-off between computational efficiency and convergence is common in numerical algorithms. For example, a smaller δ (step length) makes convergence more likely, but slows the algorithm down.
$$x_{n+1} = x_n - \boldsymbol{\delta} \cdot \frac{f(x_n)}{f'(x_n)}$$
Root Bracketing
Suppose f(x1) > 0 > f(x2). Pick x3 ∈ (x1, x2); if f(x3) > 0, then by the intermediate value theorem there must be at least one root between x3 and x2.
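A minimal bisection sketch of this bracketing idea (the tolerance settings and the test function are illustrative):

```python
def bisect(f, a, b, tol=1e-8, max_iter=200):
    """Bisection: assumes f(a) and f(b) have opposite signs, so the
    intermediate value theorem guarantees a root inside (a, b)."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "need a sign change on [a, b]"
    for _ in range(max_iter):
        m = 0.5 * (a + b)
        fm = f(m)
        if abs(fm) < tol or (b - a) < tol:
            return m
        if fa * fm < 0:   # the root lies in (a, m)
            b, fb = m, fm
        else:             # the root lies in (m, b)
            a, fa = m, fm
    return 0.5 * (a + b)

r = bisect(lambda x: x**2 - 2.0, 0.0, 2.0)  # approximates the root of x^2 = 2
```

Each iteration halves the bracket, so convergence is guaranteed but only linear in speed.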
Software Practices
The problem with root bracketing is its low speed; it is just an upgrade of the brute-force method that improves the search efficiency.
A prominent problem we face with MLE is that it is often multivariate.
An intuitive way to handle this is to imitate the Newton–Raphson method and move in the direction of the maximum (or minimum).
$$(x_1, x_2, \dots)_{n+1} = (x_1, x_2, \dots)_n + \delta \cdot \nabla f(\boldsymbol{x}) = (x_1, x_2, \dots)_n + \delta \cdot \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots \right)$$
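A hedged sketch of this gradient-based update, with an illustrative fixed step length δ and stopping criteria of my own choosing:

```python
def gradient_ascent(grad, x0, delta=0.1, tol=1e-8, max_iter=10_000):
    """Move in the gradient direction: x_{n+1} = x_n + delta * grad(x_n).
    Stops when every partial derivative is within tol, or after max_iter steps."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        x = [xi + delta * gi for xi, gi in zip(x, g)]
        if max(abs(gi) for gi in g) < tol:
            return x
    return x

# Maximize f(x1, x2) = -(x1 - 1)^2 - (x2 + 2)^2, whose gradient is
# (-2(x1 - 1), -2(x2 + 2)); the maximum sits at (1, -2).
opt = gradient_ascent(lambda x: [-2 * (x[0] - 1), -2 * (x[1] + 2)], [0.0, 0.0])
```

In practice the step length is often shrunk adaptively, for the same efficiency-versus-convergence reasons discussed for Newton's method.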
Logit Link Function
In the previous session, we used one way to specify the likelihood function – calculating the cumulative normal distribution (the probit model).
• In practice, calculating a cumulative probability distribution can be resource demanding, so an approximate form has been developed.
$$u_i = \ln \frac{p_i}{1 - p_i} \quad (\text{where } p_i = \text{prob}(d_i = 1))$$
This is called the logit model, and the related link function is the logit function. The related error distribution is the Gumbel distribution.
• In practice, probit and logit models tend to yield similar results.
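As a small sketch, the logit link and its inverse (the logistic function) can be written directly:

```python
import math

def logit(p):
    """Logit link: u = ln(p / (1 - p)), mapping (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def inv_logit(u):
    """Inverse of the logit link (the logistic function): p = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

# The two functions invert each other, and u = 0 corresponds to p = 0.5.
```

Unlike the normal CDF of the probit model, both directions here have cheap closed forms, which is the computational advantage noted above.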
2. Integration
The trapezoidal rule sets two initial points and draws a secant; we then sum the areas of the resulting trapezoids to approximate the integral.
• Beyond this intuition, we can also use derivative information – a steeper function should use smaller sections to approximate more precisely.
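A minimal sketch of the composite trapezoidal rule with equal-width sections (the section count is illustrative; the adaptive, derivative-aware refinement mentioned above is not implemented here):

```python
def trapezoid(f, a, b, n=1_000):
    """Composite trapezoidal rule: approximate the integral of f over [a, b]
    as the sum of n equal-width trapezoid areas."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))  # endpoints get half weight
    for i in range(1, n):
        total += f(a + i * h)    # interior points get full weight
    return total * h

# Integral of -x^2 + 1 over [-1, 1]; the exact value is 4/3.
area = trapezoid(lambda x: -x**2 + 1, -1.0, 1.0)
```

The approximation error shrinks quadratically as the section width shrinks, which is why steeper regions deserve finer sections.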
3. Constraints
A simple form of optimization problem, where the objective function is linear with linear
constraints, is called linear programming.
General Considerations
The direct implication of constraints is that we now need to consider corner solutions (the function value at the edge of a constraint).
• Besides local maxima, we also need to consider the values of the corner solutions given by the constraints.
Sometimes we may have multiple objectives, and some of them may not be compatible.
• Extracting the most profit vs. expanding market size
• Fostering customer loyalty vs. controlling costs
Combining Goals
The most intuitive and commonly used way to resolve goal inconsistency is to merge the goals into one objective.
• We can aggregate different goals together
– by applying weights or specifying a higher-dimensional objective function
• We can set a satisficing criterion for each of them
– by adding such criteria as constraints
• We can also use a penalty function for deviation from certain points
– by setting a reference point for some goals, and then conducting single-objective optimization with penalties on deviations from those references
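As an illustrative sketch combining these ideas (all weights, reference levels, and penalty values below are made-up assumptions, not lecture material):

```python
def combined_objective(profit, market_share, w_profit=1.0, w_share=100.0,
                       share_reference=0.2, penalty=500.0):
    """One way to merge goals into a single objective:
    a weighted sum of profit and market share, minus a quadratic
    penalty for falling below a reference market share."""
    shortfall = max(0.0, share_reference - market_share)
    return w_profit * profit + w_share * market_share - penalty * shortfall**2
```

A single-objective optimizer (such as the gradient or root-finding methods above) can then be applied to this merged function; the choice of weights encodes the trade-off between the original goals.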
Numerical Concerns
In this class, we covered some basics of solving our optimization problems (in both parameter estimation and decision analysis) numerically.
• A numerical solution is an approximation when the analytical form is hard to get.
• When the dimensions are high and the function form is complicated, we cannot rely on the brute-force method.
• Improved algorithms need to face the tradeoff between efficiency and convergence.
• In applications, we also need to deal with some special problem properties: multivariate objectives and integration add to the computational difficulty and prompt us to find more efficient methods; constraints and goal inconsistency require special treatment in the optimization problem.
The Overall SMART Decision Framework
[Framework diagram. Recoverable labels: Objective (1); Action Space; Decision; Outcome; Changing Reality, split into not-controllable and controlled, pre-specified parts; Estimate; Model (sessions 2 & 3 & 4); step 5.]
Final Essay
The topic of our final essay is to develop a decision-analysis parametric model for your selected case.
• Detailed guidelines will be provided in subsequent email(s).
The due date is Oct 23, with Oct 9 being the due date for early submission.
• Early submissions will receive feedback within one week of submission.
The End