Dynamic Density Estimation of Market Microstructure Variables Via Auxiliary Particle Filtering
DANIEL NEHREN is the global head of the Equity Quantitative Solutions groups at J.P. Morgan in New York, NY.

DAVID FELLAH runs the Americas Quantitative Solutions' Algorithmic Trading group at J.P. Morgan in New York, NY.

JESUS RUIZ-MATA runs the Quantitative Solutions' Portfolio Analytics group at J.P. Morgan in New York, NY.

YICHEN QIN is a Ph.D. student at the Department of Applied Math and Statistics, Johns Hopkins University in Baltimore, Maryland.

The Journal of Trading 2012.7.4:55-64. Downloaded from www.iijournals.com by Riadh Zaatour on 03/04/14.

In the past decade, we have witnessed an explosion in the amount of research on financial market microstructure. This is partially because the subject is interesting, but also because it has become more and more important in capital markets in terms of price formation and regulation. In this article, we establish a framework in which the distribution of market microstructure variables can be modeled sequentially, and inferential tasks can be carried out easily. The distribution of market microstructure variables can be used in many ways, for example, to help develop better trading strategies, to build more robust risk management tools, or to provide more informative market signals. However, modeling distributions is unquestionably harder than modeling the market microstructure variable itself, since we only have one observation at each time point. The method presented in this article is our attempt in this direction. Positive results have been found, but we believe more exciting results will be forthcoming.

GOAL AND BACKGROUND

Our ultimate goal is to estimate the distribution of any market microstructure variable yt at any time t based on the past information y1:t, so that we can evaluate the status of the market and take appropriate actions. When assuming a parametric distribution family for the market microstructure variable, p(yt; θ), density estimation is essentially parameter estimation. However, density (parameter) estimation usually uses cross-sectional (i.e., i.i.d.) data as opposed to time series data. Even with time series data, ergodicity is often assumed to make the data easier to analyze. In our case, we do not have ergodicity, as market microstructure is influenced by many events happening across the day. Meanwhile, we have only one single path of the time series data from which to estimate a whole series of distributions. Traditional methods become inadequate in the face of these difficulties.

Our proposed method is designed to deal with this situation. Instead of directly estimating parameters, we give them a prior distribution pt(θ). At each time point, parameter estimation becomes updating the posterior distribution pt(θ | yt), and the posterior distribution is then taken as the prior distribution for the next time point (i.e., pt+1(θ) = pt(θ | yt)). Iterating this Bayesian updating, we obtain a series of posterior distributions of the parameter. With the parametric distribution of the market microstructure variable conditional on the parameters, p(· | θ), we obtain the marginal distribution of the market microstructure variable by integrating over the parameter space, that is,

p(y) = ∫ p(y | θ) pt(θ | yt) dθ

Another benefit coming from the prior distribution is flexibility. For example, if the prior of θ places probability 0.9 on the parameter values (0, 1) and probability 0.1 on (5, 10), and y follows a normal distribution with θ, then the marginal distribution of y is 0.9ϕ(0, 1) + 0.1ϕ(5, 10), which is a mixture of normal distributions. This distribution can be used for modeling outliers. When assuming a continuous prior distribution, the marginal distribution can take an arbitrary shape, which grants us a great deal of flexibility.

In this article, we use the bid–ask spread as an example of market microstructure variables, but please keep in mind that our methodology is generally applicable to any market microstructure measure. Given that the bid–ask spread can only be positive, we assume it follows a gamma distribution. We choose the gamma distribution because of its flexibility: it can accommodate the skewness and kurtosis of the data, and its parameter estimation is relatively easy.

The spread follows a process with a particular pattern across the day. Right after the opening of the market, the spread tends to stay high and have large volatility, which reflects the chaos of supply and demand and the information accumulated since the closing price on the previous day. During the middle of the day, the spread usually remains low because price formation is stable and practitioners have better information about the market. Toward the end of the day, the spread goes up again, but with less magnitude than in the morning.

Let us briefly talk about the state space model as the basis of our framework. The state space model is a mathematical model of a dynamic system that is driven by an unobserved underlying process and has outputs generated by the underlying process at every time point. It is defined as follows:

xt = f(xt−1, εt)   (1)

yt = g(xt, ηt)   (2)

Under this framework, we can make inference on the state variables at each time point. In our analysis, we choose the bid–ask spread as a particular example of market microstructure variables. We assume that, at each time point, the market microstructure variable yt (e.g., the bid–ask spread) follows a gamma distribution with parameters αt and βt, which in turn follow a vector autoregressive model (VAR) of order 1. We can only observe yt, but not αt and βt (i.e., xt = {αt, βt}). Hence, yt is the observable variable, and αt and βt are the state variables. Since the measurement function is a gamma distribution, this is a nonlinear state space model.

The VAR process takes values on the entire real line, but the parameters of the gamma distribution can only be positive. To make the VAR suitable for our analysis, we build the model using log αt and log βt instead, since the log function maps a positive number to the entire real line. The nonlinear state space model is summarized as follows:

yt ∼ gamma(αt, βt),  αt, βt > 0   (3)

(log αt, log βt)′ = ρ (log αt−1, log βt−1)′ + εt,  εt ∼ N(0, Σt),  where Σt has entries Σ11t, Σ12t, Σ21t, Σ22t   (4)

Aiming at simplicity, we take ρ to be the identity matrix, which means the VAR is a random walk. We also let Σ12t = Σ21t = 0, which means there is no interaction between αt and βt. The state space model becomes

yt ∼ gamma(αt, βt)   (5)

log αt = log αt−1 + ε1t,  ε1t ∼ N(0, Σ11t)   (6)

log βt = log βt−1 + ε2t,  ε2t ∼ N(0, Σ22t)   (7)
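As a concrete illustration of the simplified model (5)–(7), the following sketch simulates gamma observations whose log-parameters follow independent Gaussian random walks. This is our own minimal example, not the authors' code; the starting values, the noise variances, and the choice of βt as a rate (rather than scale) parameter are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_spread(T=500, s11=0.01, s22=0.01, alpha0=2.0, beta0=50.0):
    """Simulate the state space model (5)-(7): gamma observations whose
    log-parameters follow independent random walks."""
    log_a, log_b = np.log(alpha0), np.log(beta0)
    ys, als, bes = [], [], []
    for _ in range(T):
        log_a += rng.normal(0.0, np.sqrt(s11))        # eq. (6)
        log_b += rng.normal(0.0, np.sqrt(s22))        # eq. (7)
        a, b = np.exp(log_a), np.exp(log_b)
        ys.append(rng.gamma(shape=a, scale=1.0 / b))  # eq. (5), beta as a rate
        als.append(a); bes.append(b)
    return np.array(ys), np.array(als), np.array(bes)

y, alphas, betas = simulate_spread()
```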
56 DYNAMIC DENSITY ESTIMATION OF M ARKET M ICROSTRUCTURE VARIABLES VIA AUXILIARY PARTICLE FILTERING FALL 2012
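The sequential Bayesian updating described under Goal and Background — pt+1(θ) = pt(θ | yt), with the marginal obtained by integrating p(y | θ) against the current posterior — can be sketched on a discrete parameter grid. The grid ranges, the toy observations, and the rate parameterization below are our illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from math import lgamma

# Discrete (alpha, beta) grid standing in for the prior support of theta.
alphas_grid = np.linspace(0.5, 8.0, 40)
betas_grid = np.linspace(5.0, 120.0, 40)
A, B = np.meshgrid(alphas_grid, betas_grid, indexing="ij")
lg = np.vectorize(lgamma)

def gamma_logpdf(y, a, b):
    # log of b^a * y^(a-1) * exp(-b*y) / Gamma(a), with b a rate parameter
    return a * np.log(b) + (a - 1) * np.log(y) - b * y - lg(a)

prior = np.full(A.shape, 1.0 / A.size)       # flat prior p_1(theta)
for y_t in [0.03, 0.05, 0.04]:               # toy spread observations
    post = prior * np.exp(gamma_logpdf(y_t, A, B))
    prior = post / post.sum()                # p_{t+1}(theta) = p_t(theta | y_t)

# Marginal density of a new observation: p(y) = sum_theta p(y | theta) p_t(theta)
p_new = np.sum(np.exp(gamma_logpdf(0.04, A, B)) * prior)
```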
Note that there is a t in the subscripts of Σ11t and Σ22t, which means these hyper parameters change across time. The time-varying assumption is realistic because there will be structural changes in the underlying processes across the day as news and other important information become available, and there is also a significant difference in trading activity between the beginning of the day and the rest of the day. However, estimation of these parameters can be very difficult. We propose an estimation procedure based on the one-step-ahead likelihood, described below.

PARTICLE FILTERING

There is a whole spectrum of ways to treat the state space model. Having prior distributions on the parameters, we think the sequential Monte Carlo method, namely the particle filter, is an appropriate choice. The particle filter sequentially generates samples (i.e., particles) to approximate distributions that are dynamically changing. It has an obvious advantage over other methods: distributional flexibility. Using samples to approximate distributions is much less constrained than a parametric form in terms of a distribution's properties, such as multiple modes and fat tails. Given that computational power has become less expensive, the particle filter has also become more accessible. Because of the temporal association that we are after, we choose the auxiliary particle filter (APF) for this analysis. In this section, we first review importance sampling and then describe the filtering recursion.

Usually, the original distribution p(x) is very hard to draw samples from but can be evaluated point-wise. To approximate p(x), we generate a sample {x^i, i = 1, …, N} following another distribution q(x), called the importance function. The approximation of p(x) is given by

p(x) = Σ_{i=1}^N w^i δ_{x^i}(x),  where w^i ∝ p(x^i) / q(x^i) and Σ_{i=1}^N w^i = 1   (8)

Filtering and Bayesian Updating

Filtering is about how to sequentially obtain p(xt | y1:t), the posterior distribution of the parameters. All filtering methods consist of two steps: prediction and updating (filtering). By applying Bayes' rule, we have a basic relationship that connects prediction and updating. In our particular example of the bid–ask spread, the particle xt contains αt and βt, the parameters of the gamma distribution, and the relationship can be written in terms of αt and βt as follows:

p(αt, βt | y1:t) ∝ p(yt | αt, βt) · p(αt, βt | y1:t−1)   (9)
[filtering at t]    [likelihood]      [prediction]

p(αt, βt | y1:t−1) = ∫∫ p(αt, βt | αt−1, βt−1) p(αt−1, βt−1 | y1:t−1) dαt−1 dβt−1   (11)

Approximating the prediction integral with the particles from time t−1 gives

p(αt, βt | y1:t) ≈ p(yt | αt, βt) Σ_{i=1}^N p(αt, βt | α^i_{t−1}, β^i_{t−1}) w^i_{t−1}   (12)

We use importance sampling to obtain new particles. The target function is

p(αt, βt, i | y1:t) ∝ p(yt | αt, βt) p(αt, βt, i | y1:t−1)   (13)
= p(yt | αt, βt) p(αt, βt | α^i_{t−1}, β^i_{t−1}) p(i | y1:t−1)   (14)
= p(yt | αt, βt) p(αt, βt | α^i_{t−1}, β^i_{t−1}) w^i_{t−1}   (15)

The importance function is defined as a mixture model:

q(αt, βt, i | y1:t) = q(αt, βt | i, y1:t) q(i | y1:t)   (17)

where u^i_t and v^i_t are the expectations of αt and βt conditional on α^i_{t−1} and β^i_{t−1}, and p(yt | u^i_t, v^i_t) can be considered a proxy for p(yt | αt, βt) when choosing q(i | y1:t).

The quantity p(yt | y1:t−1) stands for the likelihood of observing the current observation yt given the hyper parameters Σ11t and Σ22t. Maximizing it is essentially maximum likelihood estimation. This estimation is a two-dimensional optimization problem that is computationally expensive and has many local maxima. We therefore regularize the parameter space and transform the estimation into a one-dimensional optimization.
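To illustrate the importance-sampling approximation in (8), here is a minimal sketch in which the target p(x), evaluated only point-wise, is approximated with draws from a wider importance function q(x). The particular target (a standard normal) and importance function (a wider normal) are our illustrative choices, not part of the article.

```python
import numpy as np

rng = np.random.default_rng(2)

def p(x):
    # Unnormalized target density: N(0, 1) without the 1/sqrt(2*pi) constant,
    # which cancels after the weights are normalized.
    return np.exp(-0.5 * x ** 2)

xs = rng.normal(0.0, 3.0, 10_000)                 # x^i ~ q, with q = N(0, 9)
q_density = np.exp(-0.5 * (xs / 3.0) ** 2) / 3.0  # q, also up to the shared constant
w = p(xs) / q_density                             # w^i proportional to p(x^i)/q(x^i)
w /= w.sum()                                      # enforce sum of w^i = 1, as in (8)

mean_est = np.sum(w * xs)        # E_p[x] under the particle approximation (near 0)
second_moment = np.sum(w * xs ** 2)  # E_p[x^2] (near 1 for the standard normal)
```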
EXHIBIT 1
The General Algorithm of the Auxiliary Particle Filter

To solve the optimization problem, note that we already have a good approximation of p(αt−1, βt−1 | y1:t−1) from the particles at time t−1, and p(yt | αt, βt) is the gamma density by assumption. The underlying process pΣ*t(αt, βt | α^i_{t−1}, β^i_{t−1}) is the only part that contains the hyper parameter. To evaluate pt(yt | y1:t−1) at each candidate value of Σ*t, we first take draws from p(αt−1, βt−1 | y1:t−1), then draw new particles according to the underlying process pΣ*t(αt, βt | α^i_{t−1}, β^i_{t−1}); finally, we average p(yt | αt, βt) over these draws to approximate the predictive likelihood.
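The evaluation just described — resample the time t−1 particles, propagate them through the underlying process under a candidate Σ*t, and average the gamma density of yt — can be sketched as a one-dimensional grid search. The stand-in particle values, the grid, and the rate parameterization of the gamma density are our assumptions for illustration.

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(3)
lg = np.vectorize(lgamma)

def gamma_pdf(y, a, b):
    # Gamma density with shape a and rate b (an assumed parameterization)
    return np.exp(a * np.log(b) + (a - 1) * np.log(y) - b * y - lg(a))

def predictive_likelihood(y_t, log_a, log_b, w, sigma_star):
    """Monte Carlo estimate of p_t(y_t | y_{1:t-1}) for one candidate Sigma*_t."""
    # draws from p(alpha_{t-1}, beta_{t-1} | y_{1:t-1}) via the particle weights
    idx = rng.choice(len(w), size=len(w), p=w)
    # propagate through the random-walk underlying process with variance Sigma*_t
    la = log_a[idx] + rng.normal(0.0, np.sqrt(sigma_star), len(idx))
    lb = log_b[idx] + rng.normal(0.0, np.sqrt(sigma_star), len(idx))
    # average the gamma density of y_t over the propagated particles
    return gamma_pdf(y_t, np.exp(la), np.exp(lb)).mean()

N = 2000
log_a = rng.normal(np.log(2.0), 0.05, N)   # stand-in particles at time t-1
log_b = rng.normal(np.log(50.0), 0.05, N)
w = np.full(N, 1.0 / N)
grid = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]       # one-dimensional candidate grid
best = max(grid, key=lambda s: predictive_likelihood(0.04, log_a, log_b, w, s))
```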
Simulation Results

EXHIBIT 3
Tracking the Simulated Data
Note: For a color version of this exhibit, please visit The Journal of Trading website at www.iijournals.com/jot.
… (Σ*t = 0.001). The dotted line is the large constant hyper parameter (Σ*t = 1). We can see from the exhibit that the APF generally tracks the spread very well. The model with the small hyper parameter responds to jumps slowly. The model with the large hyper parameter responds to jumps swiftly, but it is very volatile even when nothing is happening in the data. Our time-varying hyper parameter combines the merits of these two algorithms: it moves smoothly when the spread is stable and tracks jumps very well when the data are volatile. The bottom part of Exhibit 5 indicates when the algorithm is actively estimating the hyper parameter. Combining these two parts, we can see that most of the jumps and spikes are captured by this change point detection test.

We are ultimately interested in the distribution, so we plot the estimated quantiles of the spread for the different methods in Exhibit 6. In the exhibit, Part A uses the small hyper parameter (Σ*t = 0.001), Part B uses the large hyper parameter (Σ*t = 1), Part C uses the time-varying hyper parameter with a threshold of 0.01, and Part D is the method of moments (MOM) estimate of the distribution based on an exponentially weighted sample window (sample size of 20). In these figures, the thin flat lines indicate the 90%, 50%, and 10% percentiles…
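For comparison, here is a minimal sketch of a Part D–style benchmark: a method-of-moments gamma fit on an exponentially weighted sample window. The decay factor and the shape/rate parameterization are our assumptions — the article specifies only a sample size of 20 for its window.

```python
import numpy as np

def ewma_gamma_mom(y, lam=0.9):
    """Method-of-moments gamma fit with exponentially decaying weights;
    the newest observation gets weight 1."""
    w = lam ** np.arange(len(y) - 1, -1, -1.0)
    w /= w.sum()
    m = np.sum(w * y)                    # weighted mean
    v = np.sum(w * (y - m) ** 2)         # weighted variance
    return m * m / v, m / v              # shape = mean^2/var, rate = mean/var

rng = np.random.default_rng(4)
y = rng.gamma(shape=4.0, scale=0.5, size=5000)   # true shape 4, rate 2
alpha_hat, beta_hat = ewma_gamma_mom(y, lam=0.999)
```

With a decay factor close to 1 the fit uses a long effective window and recovers the parameters well; a short window (as in Part D) makes the estimate adapt faster but, as the text notes, it can collapse toward a point when the spread is stable.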
EXHIBIT 4

EXHIBIT 5
EXHIBIT 6
The Comparison of Different Methods
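Quantile curves like those in Exhibit 6 can be read off a weighted particle approximation of the spread distribution. A minimal sketch, with illustrative particle values (not the article's data):

```python
import numpy as np

def weighted_quantiles(values, weights, qs=(0.1, 0.5, 0.9)):
    """Quantiles of a discrete weighted distribution (particle approximation)."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w) / np.sum(w)                    # weighted CDF
    return np.array([v[np.searchsorted(cum, q)] for q in qs])

rng = np.random.default_rng(5)
spread_particles = rng.gamma(shape=2.0, scale=0.02, size=5000)  # illustrative
weights = np.full(5000, 1.0 / 5000)
q10, q50, q90 = weighted_quantiles(spread_particles, weights)
```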
…becomes stable, the MOM estimated distribution converges to a point, which is useless for us. Moreover, when there is a structural change to the spread, MOM…

CONCLUSION

In this article, we present a novel approach to sequential density estimation for market microstructure variables using the auxiliary particle filter. Results from simulated and real data show excellent performance. The approach we introduce is flexible and can be adopted in many other settings.

Although significant results are found using our proposed model, there are many openings for future research. The first is the regularization of the hyper parameter space; more elegant ways of regularization are needed. We can also incorporate more complicated structures into ρ and Σijt. For example, we can use a non-identity matrix for ρ to introduce a trend in the underlying process, or introduce a negative correlation between αt and βt so that the gamma distribution is more robust. Last but not least, instead of modeling a single market microstructure variable, we can build state space models with multiple variables (e.g., trading volume, spread, and so on).

Another separate direction is to build multiple auxiliary particle filters at different frequencies.