
Circular Arbitrage Detection Using Graphs

Zhenyu Cui (1) and Stephen Taylor (2)

(1) School of Business, Stevens Institute of Technology, 1 Castle Point on Hudson, Hoboken, NJ 07030, USA
(2) New Jersey Institute of Technology, Martin Tuchman School of Management, 3000 Central Avenue
Building (CAB), Newark, New Jersey 07102, USA

October 15, 2018

Abstract

We propose a novel graph-theoretic method for the detection of circular arbitrage in foreign
exchange (FX) markets and demonstrate the runtime improvements of this algorithm
over the brute force approach. An application to empirical currency bid/ask price data
validates this technique and provides an example of increased computational efficiency,
especially in the case where a large number of currencies are considered. Using minute-level
market data for all G10 currency pairs, we demonstrate the efficiency of the algorithm as
well as potential returns of higher order circular arbitrage trades. Finally, several potential
extensions are discussed.

1 Introduction

Quote from the Wall Street Journal (footnote 1):

There is still “free lunch” in foreign exchange, but algorithmic traders are increasingly
eating it faster than the rest of the market.

Foreign exchange (FX) markets in aggregate comprise one of the largest trading venues in
existence with an average daily turnover of approximately $5.1 trillion according to the Bank for
International Settlements [3]. Although one may trade a variety of forward, future and derivative
securities on OTC and exchange-based FX platforms, we will focus on spot markets. These
markets have existed for centuries for major currency pairs and are quite mature; however, there
are dozens if not hundreds of market makers actively quoting FX spot prices, some on microsecond
time scales as discussed in Aarheim [1]. In contrast to equity markets which are mainly clustered
regionally (e.g. the S&P 500, FTSE 100, Hang Seng Index, etc.), the FX market is global in nature.
It is also becoming increasingly automated. As high frequency traders enter into these markets,
methods for accurately and efficiently detecting arbitrage trading opportunities are required. The
main institutions that participate in the FX market are investment, commercial, and central banks,
inter-dealer brokers, institutional investors, and buy-side currency speculators. Within such a
complicated system with varying participants and vast amounts of price data, inefficiencies and
mispricings will inevitably arise.

There are two main types of arbitrage opportunities arising in the foreign exchange market:
violations of covered interest parity and circular arbitrage, c.f. Akram [2] and Fenn [5]. In this
paper, we will focus on developing an efficient computational method to identify the latter type of
arbitrage. In particular, we will represent an FX market as a graph whose nodes denote individual
currencies and directed edges represent exchange rates between the currencies. We demonstrate
that iterating the max-plus product operation on the adjacency matrix of this graph will allow one
to efficiently identify potential circular arbitrage opportunities of arbitrary length.

(1) see https://blogs.wsj.com/marketbeat/2012/11/19/high-frequency-traders-getting-a-free-lunch-in-fx/

Electronic copy available at: https://ssrn.com/abstract=3267020

2 Specification and Arbitrage Detection

We first describe a foreign exchange market as a bi-directional weighted graph whose nodes are
individual currencies and edges are the logarithm of corresponding FX rates. Given a set of
currencies $c_i$ for $i = 1, \ldots, n$, let $A_{ij}$ denote the log of the best ask price for the currency pair
$c_i/c_j$. If one wishes to execute a market order immediately to exchange currency $c_j$ into currency
$c_i$, $A_{ij}$ is the log ask price for which this market order would be filled. For example, if $c_1$ = USD
and $c_2$ = EUR then $A_{12}$ would be the log of the best ask price for the USD/EUR exchange rate
and $A_{21}$ would be the log best ask price for the EUR/USD reciprocal rate. This FX rate matrix
satisfies several properties, which include $A_{ii} = 0$, positivity of all rates, $\exp(A_{ij}) > 0$, and also
$A_{ij} \approx -A_{ji}$, where the variations in this approximation are typically within bid/ask spreads for
major currency pairs.
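
The properties listed above can be checked numerically. The following is a minimal sketch that uses the four-currency ask rates from Table 1 of this paper. The row/column convention (row = currency sold, column = currency bought) and the names `RATES`, `log_rate_matrix`, and `max_reciprocal_gap` are assumptions made for illustration only.

```python
import math

CURRENCIES = ["USD", "GBP", "CAD", "EUR"]

# Ask rates copied from Table 1. Assumed convention (not stated in the paper):
# RATES[i][j] is the amount of currency j received for one unit of currency i.
RATES = [
    [1.0,     0.76103, 1.29853, 0.86327],
    [1.31401, 1.0,     1.70628, 1.13434],
    [0.77010, 0.58607, 1.0,     0.66481],
    [1.15839, 0.88157, 1.50420, 1.0],
]


def log_rate_matrix(rates):
    """Return A with A[i][j] = log(rates[i][j]); every rate must be positive."""
    return [[math.log(r) for r in row] for row in rates]


A = log_rate_matrix(RATES)
n = len(A)

# Property 1: the diagonal is zero, since a currency exchanges for itself at rate 1.
diagonal_is_zero = all(A[i][i] == 0.0 for i in range(n))

# Property 2: A[i][j] + A[j][i] is close to (but generally not exactly) zero.
max_reciprocal_gap = max(abs(A[i][j] + A[j][i]) for i in range(n) for j in range(n))

print("zero diagonal:", diagonal_is_zero)
print("largest |A_ij + A_ji|:", max_reciprocal_gap)
```

The reciprocal gap is a rough measure of how far the quoted rates are from being perfectly consistent; for these example rates it is small.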

If an exchange rate is missing or if we would like to exclude it from a study, then we assign
this rate a zero value. Since our main goal below will be to determine which cycle with a fixed
number of edges has the largest sum of log FX rates, this zero rate condition is equivalent to
precluding any cycle that contains a missing rate from contributing to the largest sum since the
log of the exchange rate will be −∞. In other words, in the graph that we construct whose nodes
are individual currencies and edges are the log of the spot rates, the sum of the edges of any cycle
that contains a missing rate will be −∞. For example, if EURUSD were missing, then we would
set this rate to zero. One would then only be able to convert any amount of USD to zero Euros
and any circular currency trade involving a USD to EUR conversion would return zero units of the
initial currency, since there would be a zero factor in the product of the exchange rates used to compute the return on an
initial non-zero investment. This has the effect of eliminating all cycles that contain the missing
rate from being considered as the best arbitrage opportunity. We finally note that we would set the
reciprocal rate USDEUR to zero only if this rate were missing as well. If both rates were missing
then the reciprocal approximation $A_{ij} \approx -A_{ji}$ would not hold for this particular currency pair.
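
The zero-rate convention can be sketched in a few lines. The example below assumes a hypothetical three-currency market (the rates are made up for illustration) and a helper named `safe_log`, which is needed because Python's `math.log(0.0)` raises an error rather than returning −∞.

```python
import math

NEG_INF = float("-inf")


def safe_log(rate):
    """Log of an exchange rate; a zero (missing or excluded) rate maps to -inf."""
    # math.log(0.0) raises ValueError, so the zero case is handled explicitly.
    return math.log(rate) if rate > 0.0 else NEG_INF


def cycle_log_sum(log_rates, cycle):
    """Sum the log rates along a closed cycle given as a list of node indices."""
    total = 0.0
    for pos, src in enumerate(cycle):
        dst = cycle[(pos + 1) % len(cycle)]  # wrap around to close the cycle
        total += log_rates[src][dst]
    return total


# Hypothetical rates for currencies 0, 1, 2; the rate 0 -> 1 is missing (zero).
rates = [
    [1.0, 0.0, 2.0],
    [0.8, 1.0, 1.6],
    [0.5, 0.6, 1.0],
]
A = [[safe_log(r) for r in row] for row in rates]

blocked = cycle_log_sum(A, [0, 1, 2])  # uses the missing 0 -> 1 edge
allowed = cycle_log_sum(A, [0, 2, 1])  # avoids the missing edge

print("cycle through missing rate:", blocked)
print("cycle avoiding missing rate:", math.exp(allowed))
```

Because −∞ plus any finite number remains −∞, a cycle through the missing edge can never be selected by a maximum over cycle sums.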

Our aim is to detect potential circular arbitrage opportunities from the current FX rate values
that define A. For example, such an opportunity is realized by a circular trade that starts with
holding currency $c_{i_1}$, then converting to currency $c_{i_2}$ at a log rate $A_{i_1 i_2}$, and then converting again
into currency $c_{i_3}$ at a log rate of $A_{i_2 i_3}$ and so on. This process is continued until moving back to
the original currency at a log rate $A_{i_n i_1}$. Since n currencies are involved in this trade, we say that
this is an n-th order circular arbitrage if

(1)   $1 < \alpha \equiv \exp\left( A_{i_n i_1} + \sum_{j=1}^{n-1} A_{i_j i_{j+1}} \right).$

The return on such a trade starting with one unit of currency $c_{i_1}$ is thus α − 1 > 0. If multiple
n-th order arbitrage opportunities exist for a given currency set, we are interested in determining
the return of the maximum arbitrage trade. Specifically, if n = 3, this becomes a search for the
largest triangular arbitrage opportunity.
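
As an illustration of Eq. (1), the sketch below evaluates α for one given ordering of currencies. The function name `circular_trade_alpha` and the three-currency rates are hypothetical; the matrix convention (entry [i][j] is the log rate for converting currency i into currency j) is an assumption made for this example.

```python
import math


def circular_trade_alpha(A, cycle):
    """Gross return alpha of Eq. (1) for the circular trade visiting `cycle` in order.

    A[i][j] is assumed to be the log rate for converting currency i into
    currency j. The trade closes by converting the last currency back into
    the first one.
    """
    n = len(cycle)
    log_sum = A[cycle[n - 1]][cycle[0]]          # closing leg i_n -> i_1
    for j in range(n - 1):                        # legs i_j -> i_{j+1}
        log_sum += A[cycle[j]][cycle[j + 1]]
    return math.exp(log_sum)


# Hypothetical rates: consistent except that 1 -> 2 is quoted at 2.1 instead of 2.0.
rates = [
    [1.0,  2.0, 4.0],
    [0.5,  1.0, 2.1],
    [0.25, 0.5, 1.0],
]
A = [[math.log(r) for r in row] for row in rates]

alpha_forward = circular_trade_alpha(A, [0, 1, 2])  # 0 -> 1 -> 2 -> 0
alpha_reverse = circular_trade_alpha(A, [0, 2, 1])  # 0 -> 2 -> 1 -> 0

print("forward return:", alpha_forward - 1.0)
print("reverse return:", alpha_reverse - 1.0)
```

Only the ordering that passes through the mispriced 1 → 2 conversion satisfies α > 1, which shows why the direction of the cycle matters.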

This problem may be rephrased in the language of graph theory as identifying the maximal
cycle of a fixed length in a bi-directional weighted graph. The nodes of the graph represent the
n currencies being considered and the edges represent the logarithm of the values of the exchange
rates that are the entries of the adjacency matrix A. Here the log of the exchange rates is taken
because we will be seeking cycles with the largest sum, which is equivalent to finding the largest
product of exchange rates in a circular arbitrage trade. The task of finding the maximum circular
arbitrage opportunity is equivalent to finding the longest k-cycle in this graph.

The brute force method to determine the largest k-order arbitrage requires checking $\binom{n}{k}$ possible
cycles, each of which requires k multiplications. In order to identify the maximum cycle of
arbitrary length, this would require $\sum_{k=1}^{n} k \binom{n}{k} = 2^{n-1} n$ operations. To determine an arbitrage
opportunity of a fixed length, this has a runtime of $k \binom{n}{k} \sim O(n^k/(k-1)!)$. The problem of
detecting the minimal length cycle in a weighted directed graph is a well-studied problem. The
Floyd-Warshall [6, 8] algorithm can determine the minimum length cycle in $O(n^3)$ time. There
have been a number of refinements summarized in Orlin [7], where the author improves upon them to
construct an $O(nm)$ algorithm, where m is the number of directed edges in the graph. This method
results in considerable performance improvement in the case when one considers a sparse graph;
however, in our applications we will typically consider complete or near complete graphs. Since
there are $n(n + 1)/2 \sim O(n^2)$ edges in a complete graph, this method becomes an $O(n^3)$ algorithm.
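
For reference, a brute force search of the kind discussed above might be sketched as follows. This is an illustrative implementation, not the code used in the paper. The name `brute_force_max_cycle` and the example rates are hypothetical. The sketch tries every visiting order of each k-subset of currencies, a factor that the operation count in the text does not include.

```python
import math
from itertools import combinations, permutations


def brute_force_max_cycle(A, k):
    """Return (best log sum, best cycle) over all cycles on k distinct nodes."""
    n = len(A)
    best_sum, best_cycle = float("-inf"), None
    for nodes in combinations(range(n), k):
        first, rest = nodes[0], nodes[1:]
        # Fixing the first node avoids re-checking rotations of the same cycle.
        for order in permutations(rest):
            cycle = (first,) + order
            total = sum(A[cycle[i]][cycle[(i + 1) % k]] for i in range(k))
            if total > best_sum:
                best_sum, best_cycle = total, cycle
    return best_sum, best_cycle


# Hypothetical rates: consistent except that 1 -> 2 is quoted at 2.1 instead of 2.0.
rates = [
    [1.0,  2.0, 4.0],
    [0.5,  1.0, 2.1],
    [0.25, 0.5, 1.0],
]
A = [[math.log(r) for r in row] for row in rates]
best_sum, best_cycle = brute_force_max_cycle(A, 3)
print(best_cycle, math.exp(best_sum) - 1.0)

# Numerical check of the operation-count identity quoted in the text for n = 10.
n = 10
total_operations = sum(k * math.comb(n, k) for k in range(1, n + 1))
print(total_operations == n * 2 ** (n - 1))
```

Returning the best cycle alongside its log sum is a convenience of the brute force approach: the path is known as soon as the maximum is found.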

Next we shall describe a technique first proposed in Zwick [10] for determining the shortest
cycle of a fixed length. The method iterates the max-plus product operation on the graph
adjacency matrix. Specifically, for two matrices A and B, the max-plus product is
defined component-wise by:

(2)   $(A \circ B)_{ij} := \max_k (A_{ik} + B_{kj}).$

Then the (i, j)-th component of the k-th power of A with respect to the max-plus product gives
the length of the longest path from i to j that uses exactly k edges. In other words,
the maximum value of the diagonal of the k-th max-plus power of the adjacency matrix
gives the length of the longest k-cycle, or equivalently the
maximum k-order arbitrage.
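
A direct, cubic-time implementation of Eq. (2) and of the max-plus power can be sketched as follows. The function names and the three-currency rates are hypothetical and chosen only for illustration. Because the maximum in Eq. (2) ranges over every intermediate node, the closed walks scored by this sketch may revisit a currency or use the zero-weight diagonal entry.

```python
import math


def max_plus_product(A, B):
    """Max-plus product of Eq. (2): C[i][j] = max over m of A[i][m] + B[m][j]."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [
        [max(A[i][m] + B[m][j] for m in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def max_plus_power(A, k):
    """k-th max-plus power of A, computed with k - 1 max-plus products."""
    result = A
    for _ in range(k - 1):
        result = max_plus_product(result, A)
    return result


def max_closed_walk(A, k):
    """Largest diagonal entry of the k-th max-plus power of A."""
    power = max_plus_power(A, k)
    return max(power[i][i] for i in range(len(A)))


# Hypothetical rates: consistent except that 1 -> 2 is quoted at 2.1 instead of 2.0.
rates = [
    [1.0,  2.0, 4.0],
    [0.5,  1.0, 2.1],
    [0.25, 0.5, 1.0],
]
A = [[math.log(r) for r in row] for row in rates]

third_order = math.exp(max_closed_walk(A, 3)) - 1.0
fourth_order = math.exp(max_closed_walk(A, 4)) - 1.0
print(third_order, fourth_order)
```

In this toy market the fourth power is maximized by going around the two-currency loop 1 → 2 → 1 twice, so the diagonal value should be read as a bound over closed walks of the given length rather than over simple cycles only.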

To detect the length of the maximum cycle, we are interested in the case when i = j. In this
situation, one can view A ◦ A as computing, for each vertex i, the sum of the weights from i to every other
vertex in the graph and back, and then returning the maximum of all such values; hence it determines the length
of the largest 2-cycle. If one iterates again, considering A ◦ A ◦ A, then the first iteration identifies
the longest path between vertex i and all other vertices that pass through a single node. The
second iteration finds the longest path from all path lengths considered in the first iteration, which
contains one additional edge connecting these paths back to the initial vertex. This corresponds to
a search for the longest three-cycle. Similarly, the diagonal elements of A ◦ · · · ◦ A with k − 1 max-plus
product operations will find the longest cycles of length k starting at the vertex associated with
their respective diagonal value. A simple implementation of the max-plus product has an $O(n^3)$
runtime, Zwick [10]; and recent advances in Williams [9] have improved upon this to $O(n^3 / 2^{\Omega(\sqrt{\ln n})})$,
where for two functions f, g, we have that f ∈ Ω(g) indicates there exists a constant c > 0 such that f ≥ c g in
the large n limit. As a result, since n > k in practice, this method outperforms the brute force
approach in the k ≥ 4 case.

Lastly, we give an example of how to apply this technique to a simple FX market consisting
of four currencies where all exchange rates are available. These example rates are given in Table
1 and can be viewed as the rate one would receive if an immediate FX spot market order were
executed. From the entries in this table, we construct the log FX rate matrix A, then compute the
third and fourth max-plus powers of A and display the resulting diagonal elements:

(3) exp(diag(A ◦ A ◦ A)) = 1 + [9.97e−6, 1.07e−5, 1.07e−5, 1.07e−5]

(4) exp(diag(A ◦ A ◦ A ◦ A)) = 1 + [1.15e−5, 1.15e−5, 1.44e−5, 1.44e−5]

Rate    USD       GBP       CAD       EUR
USD     1         0.76103   1.29853   0.86327
GBP     1.31401   1         1.70628   1.13434
CAD     0.77010   0.58607   1         0.66481
EUR     1.15839   0.88157   1.50420   1

Table 1: Example FX Rate best ask price matrix for four currencies: USD, GBP, CAD, EUR.

Taking the maximum value over both these arrays, we find the largest potential triangular
arbitrage return is 0.00107% and the largest fourth order circular arbitrage return is 0.00144%.
These marginal return values are typical of FX markets comprised of major currencies, and after
one accounts for transaction costs a profitable triangular arbitrage trade does not exist.

Next, we change a single value in the FX rate matrix, the GBP/EUR rate, from 0.88157 to
0.89 to simulate a stale dealer quote and recompute the return of the maximum arbitrage trade to
understand how the detected arbitrage opportunities change. Denoting the resulting adjusted log rate
matrix by Ã, the analogous diagonal elements to the above are

(5) exp(diag(Ã ◦ Ã ◦ Ã)) = 1 + [9.57e−3, 9.57e−3, 9.57e−3, 9.57e−3]

(6) exp(diag(Ã ◦ Ã ◦ Ã ◦ Ã)) = 1 + [9.57e−3, 1.92e−2, 9.57e−3, 1.92e−2]

Now the largest triangular arbitrage return is 0.957% and the largest fourth order arbitrage
return is 1.92%, both of which are now significant. In reality, individual trades will typically
yield significantly reduced profits; however, executing on a large number of small but virtually
guaranteed arbitrage trades has the potential to yield a very strong trading strategy.
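
The four-currency example can be reproduced approximately with a short script. The sketch below is not the authors' code. It assumes that the Table 1 entry in row i and column j is the amount of currency j received per unit of currency i, and that the stale quote replaces the 0.88157 entry in the EUR row and GBP column. The names `RATES`, `STALE`, and `diagonal_returns` are introduced here for illustration.

```python
import math

CURRENCIES = ["USD", "GBP", "CAD", "EUR"]

# Table 1 ask rates. Assumed convention: RATES[i][j] is the amount of
# currency j received per unit of currency i.
RATES = [
    [1.0,     0.76103, 1.29853, 0.86327],
    [1.31401, 1.0,     1.70628, 1.13434],
    [0.77010, 0.58607, 1.0,     0.66481],
    [1.15839, 0.88157, 1.50420, 1.0],
]


def max_plus_product(A, B):
    """Max-plus product of two square matrices of equal size."""
    size = len(A)
    return [
        [max(A[i][m] + B[m][j] for m in range(size)) for j in range(size)]
        for i in range(size)
    ]


def diagonal_returns(rates, k):
    """Return exp(diag(A^k)) - 1, where A^k is the k-th max-plus power of log(rates)."""
    A = [[math.log(r) for r in row] for row in rates]
    power = A
    for _ in range(k - 1):
        power = max_plus_product(power, A)
    return [math.exp(power[i][i]) - 1.0 for i in range(len(A))]


# Stale quote: the 0.88157 entry (EUR row, GBP column) is replaced by 0.89.
STALE = [row[:] for row in RATES]
STALE[3][1] = 0.89

original_3 = diagonal_returns(RATES, 3)
stale_3 = diagonal_returns(STALE, 3)
stale_4 = diagonal_returns(STALE, 4)

for label, values in [("original, k=3", original_3),
                      ("stale, k=3", stale_3),
                      ("stale, k=4", stale_4)]:
    print(label, ["%.2e" % v for v in values])
```

Under these assumptions the stale-quote maxima land close to the 0.957% and 1.92% figures quoted above. In this sketch the fourth order maximum at the GBP and EUR nodes comes from traversing the EUR → GBP → EUR loop twice, since the max-plus power does not exclude repeated currencies.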

3 Applications

We now provide applications of this graph-based arbitrage detection technique. We first describe
a dataset of minute-level FX spot rates associated with all G10 currency pairs that will be used
below. We then compare the computation times between this technique and the brute force
detection method for several combinations of cycle length and total number of currencies in the
FX market being considered. Finally, we compute the magnitude of the most profitable triangular
arbitrage trade as increasingly larger sets of currencies are considered for each minute over a single
day.

3.1 Data Description

We construct a dataset of historical bid and ask prices for spot exchange rates between all pairs of
the G10 member countries. These FX rates represent the most liquid currency pairs in the FX spot
market. Data was downloaded from Bloomberg’s Generic Composite Pricing (BGN) source which
aggregates quotes from major FX dealers. We restricted our pricing sources to dealers in the New
York FX spot market. The Python packages pandas, tia, and blpapi were used in a script that
automates this download process and organizes the price data in a manner amenable to further
analysis.

Bloomberg provides access to best bid and ask prices at minute frequencies for all major
foreign exchange pairs for the six months prior to the present date. These feeds are constructed
by combining live spot rate quotes from dozens of currency dealers, filtering unrealistic quotes
from those who have not had prior sufficient volume in currency pairs for which they are providing
prices. Finally, the best bid and ask prices are selected from the remaining quotes and made
available for download. One cannot directly identify the dealer with the best bid or ask price;
however, prices from this dataset were in principle tradable at the time they were quoted. We
also note that Butz and Oomen [4] observe that large FX dealers can take up to ten minutes to
internalize customer trades. This fact motivates examining minute level time scales for potential
arbitrage opportunities.

We use data for all G10 currency pairs on 7/12/2017 between 4:00 and 13:00 GMT, which
corresponds to the opening and closing times of the New York FX market. All combinations of
the three symbol identifiers for each currency were taken, and associated Bloomberg identifiers
were formed, e.g. “USDEUR BGN Curncy” for the USD/EUR rate. We then download data
corresponding to Bloomberg’s “BEST BID” and “BEST ASK” fields. In addition, a “numEvents”
field for each FX rate provides the number of trades that occur for each minute level bucket. We
compute the average value of this field for all currency pairs and use the result to filter non-standard
quoting conventions which have low average trade counts. For example, “USDEUR” is the preferred
convention for the dollar/euro exchange rate and “EURUSD” is a non-standard quoting convention.
The resulting 90 best bid and ask time series that had the greatest number of average numEvents
counts are used in subsequent applications. From these time series, we construct bid and ask
series for missing reciprocal currency pairs using the identities of the form, Bid(USD/EUR) =
1/Ask(EUR/USD) and Ask(USD/EUR) = 1/Bid(EUR/USD). We finally forward fill any missing
data and note that one may directly incorporate both a currency pair and its reciprocal pair into the
FX rate matrix, assuming there is sufficient liquidity in both to provide quality price information.
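
The reciprocal identities and the forward fill step can be sketched without any external libraries. The quote values and the function names `reciprocal_quotes` and `forward_fill` below are hypothetical and are not taken from the dataset or the script described above. A pandas-based pipeline would likely use `DataFrame.ffill` for the last step.

```python
def reciprocal_quotes(bid, ask):
    """Given the bid and ask for a pair X/Y, return (bid, ask) for the pair Y/X.

    Follows the identities Bid(Y/X) = 1 / Ask(X/Y) and Ask(Y/X) = 1 / Bid(X/Y).
    """
    return 1.0 / ask, 1.0 / bid


def forward_fill(series):
    """Replace None entries with the most recent earlier observation."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)  # stays None until the first observation arrives
    return filled


# Hypothetical EUR/USD quote used to build the missing USD/EUR quote.
eurusd_bid, eurusd_ask = 1.1400, 1.1402
usdeur_bid, usdeur_ask = reciprocal_quotes(eurusd_bid, eurusd_ask)
print(usdeur_bid, usdeur_ask)

# Hypothetical minute-level series with gaps.
minute_asks = [None, 1.1402, None, None, 1.1405, None]
filled_asks = forward_fill(minute_asks)
print(filled_asks)
```

Swapping bid and ask when inverting keeps the reconstructed bid below the reconstructed ask, so the derived pair still carries a positive spread.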

3.2 Computing Time Tests

We first determine the relative computing time ratios between brute force arbitrage detection and
the graph-based method for detecting the maximum circular arbitrage opportunity. We consider
cases of different numbers of currencies in the FX market, in particular n = 3, 4, 5, 6, 8 and 10. For
each of these sets, we detect maximum profit circular arbitrage of order 3, 4, 5, and 6. For each
combination, we then run both the brute force and graph-based arbitrage detection methods 1,000
times, and average their associated run times. The ratios of these resulting average computing
times are displayed in Table 2.
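
A timing comparison of this kind could be set up along the following lines. This is a sketch under stated assumptions rather than the benchmark used for Table 2. The log rate matrix is synthetic, the function names are hypothetical, and the measured ratio will depend on the machine and on implementation details, so no particular value should be expected.

```python
import random
import time
from itertools import combinations, permutations


def brute_force_max(A, k):
    """Largest log sum over all cycles on k distinct nodes."""
    best = float("-inf")
    for nodes in combinations(range(len(A)), k):
        for order in permutations(nodes[1:]):
            cycle = (nodes[0],) + order
            best = max(best, sum(A[cycle[i]][cycle[(i + 1) % k]] for i in range(k)))
    return best


def max_plus_max(A, k):
    """Largest diagonal entry of the k-th max-plus power of A."""
    size = len(A)
    power = A
    for _ in range(k - 1):
        power = [
            [max(power[i][m] + A[m][j] for m in range(size)) for j in range(size)]
            for i in range(size)
        ]
    return max(power[i][i] for i in range(size))


def average_runtime(func, A, k, repeats):
    """Average wall-clock time of func(A, k) over the given number of repeats."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(A, k)
    return (time.perf_counter() - start) / repeats


# Synthetic log rates: consistent mid prices plus small random noise.
random.seed(0)
n, k = 6, 4
log_price = [random.uniform(-1.0, 1.0) for _ in range(n)]
A = [
    [0.0 if i == j else log_price[i] - log_price[j] + random.gauss(0.0, 1e-4)
     for j in range(n)]
    for i in range(n)
]

brute_time = average_runtime(brute_force_max, A, k, repeats=50)
graph_time = average_runtime(max_plus_max, A, k, repeats=50)
ratio = brute_time / graph_time if graph_time > 0 else float("inf")
brute_value, graph_value = brute_force_max(A, k), max_plus_max(A, k)
print("runtime ratio (brute force / max-plus):", ratio)
print("values:", brute_value, graph_value)
```

The two values need not coincide: the max-plus diagonal also scores closed walks that repeat a currency, so in this sketch it is at least as large as the brute force maximum over simple cycles.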

Arb. Deg. \ No. Cur.       3       4       5       6        8       10
3                       0.68    1.06    1.34     1.5      2.7      4.5
4                       2.53    5.67    11.0    17.0     39.4     78.5
5                       6.31    19.1    46.0    91.7    273.9    661.0
6                       21.6    89.0   258.1   571.2   2180.2   6626.5

Table 2: Relative computing time ratios between the brute force arbitrage detection and graph-
based algorithms’ runtimes. For example, the (5, 6) entry displays that the maximum fifth order
circular arbitrage magnitude in an FX market where six currencies are considered can be detected
91.7 times faster with the graph-based methods than the brute force approach.

We note that the graph-based method is considerably faster than the alternative brute force
technique, by up to three orders of magnitude in the cases tested. One exception lies in the case of detecting
triangular arbitrage within a three currency market, where the ratio is 0.68. In addition, the performance
advantage of the graph method grows rapidly as one either fixes the number of currencies and increases the
arbitrage degree or fixes the arbitrage degree and increases the number of currencies being considered.
The biggest performance improvements are realized in situations where a large number of
currencies are considered for the detection of circular arbitrage in the market.

3.3 Arbitrage Detection Example

We next consider an example of triangular arbitrage detection over a single trading day in the
New York spot FX market on 7/12/2017. We consider four currency sets: the G4
currencies; six currencies, which include the G4 as well as CAD and CHF; eight currencies, which
add AUD and NZD to the six currency set; and finally the full G10. Our aim is to consider how
the magnitude of the maximum profit triangular arbitrage trade varies for each of these currency
sets over the full day.

In Figure 1, we plot time series of the arbitrage magnitude for each of these currency sets
over a trading day. Here, we assume that we start with one unit of the base currency and plot the
value of this portfolio after the maximum triangular arbitrage trade has been executed assuming
there are no transaction costs. This graph has several notable features. First, when only the G4
currencies are considered, all arbitrage opportunities are below three basis points, which indicates
that a profitable trade most likely does not exist when transaction costs are taken into account.
The arbitrage magnitudes are roughly uniform throughout the day with the exception of the G10
currencies. During the first two hours of trading, triangular arbitrage opportunities exist that are
typically between 15 and 20 basis points, with a short time window when slightly higher profitable
trades exist. Finally, we note that there is a prominent spike around 8:30 and a smaller one
near 13:00. Swift detection of such transient arbitrage opportunities is a natural strength of this
algorithm over the brute force method as one needs to execute trades to realize this arbitrage as
quickly as possible.

[Figure 1 plot: "Triangular Arbitrage on 2017-07-12 in the NYC Spot Market". Series: G4, 6 Curs., 8 Curs., G10. Horizontal axis: Time (GMT), 04:00 to 12:00. Vertical axis: Arbitrage Magnitude, 1.0005 to 1.0020.]

Figure 1: Triangular arbitrage magnitude for each minute during New York FX spot market trading
hours for each currency set on 2017-07-12.

Although it is hard to identify the precise cause of these spikes, we note that the largest price
changes during the trading day occurred in a time interval around these spikes. For example, near
13:00 there was nearly a 60 basis point increase in the USD/EUR rate. As such price changes
propagate through to other currency pairs we suspect some dealers lag their peers in updating
price quotes, which results in potential circular arbitrage trading opportunities.

Finally, in Table 3, we display the currency triples that most frequently resulted in the maximum
arbitrage trade over this single trading day. For example, within the G4 currencies, the GBP
to JPY to USD back to GBP trade resulted in the maximum triangular arbitrage opportunity,
when compared with all other such three currency trades, 11.5% of the time.

Rank   G4                    6 Curs.               8 Curs.               G10
1      GBP/JPY/USD: 11.5%    CAD/CHF/GBP: 47.4%    CAD/CHF/NZD: 57.2%    AUD/NOK/SEK: 14.6%
2      EUR/GBP/JPY: 10.0%    CAD/CHF/JPY: 6.8%     CAD/GBP/NZD: 6.8%     NOK/NZD/SEK: 13.2%
3      EUR/GBP/USD: 4.2%     CAD/CHF/USD: 2.5%     AUD/CHF/NZD: 5.9%     JPY/NOK/SEK: 8.7%
4      EUR/JPY/USD: 2.0%     CHF/GBP/JPY: 2.4%     CHF/GBP/NZD: 5.1%     CAD/NOK/NZD: 3.1%
5      *                     CHF/GBP/USD: 0.5%     CHF/GBP/USD: 1.8%     GBP/NOK/SEK: 2.3%

Table 3: Percentage of Minutes during which the Maximum Triangular Arbitrage Opportunity
Occurred for Specific Currency Triples for each Currency Set

Note that as we increase the number of currencies, the new currencies that extend previous
sets tend to enter into the top ranking triples. The new currencies are less liquid than those in the
previous currency set. These results provide evidence that the most profitable triangular arbitrage
trades typically involve less liquid currency pairs.

3.4 Higher Order Arbitrage

Finally, we compare gains on third, fourth, and fifth order arbitrage trades for each day over a
three month period while considering all G10 currency pairs in our dataset. For each day, we
compute the maximum arbitrage over each trading minute and then order the trades and estimate
the 99th percentile most profitable trade over the day. These values are then compiled over roughly a three
month time period and plotted in Figure 2. We note returns on the 99th percentile triangular arbitrage
trade are around 2.5 basis points on average, 7 basis points for the fourth order arbitrage, and
11 basis points for the fifth order arbitrage trades; hence pursuing higher order arbitrage trades
may lead to increasingly profitable trading strategies, although one should take additional transaction
costs and the complexity of simultaneously executing the trades into account for a more thorough
evaluation.
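
The daily 99th percentile statistic can be sketched as below. The paper does not state which percentile estimator was used, so the nearest-rank definition here is an assumption, as are the function name and the placeholder per-minute returns, which are not market data.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    # pct * len(ordered) is an integer product, which avoids float rounding in the rank.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Placeholder per-minute maximum arbitrage returns for one day (540 minutes
# between 4:00 and 13:00); a real run would use the max-plus output per minute.
minute_returns = [i * 1e-6 for i in range(1, 541)]

daily_p99 = nearest_rank_percentile(minute_returns, 99)
print("99th percentile return:", daily_p99)
```

Using a high percentile rather than the daily maximum makes the statistic less sensitive to a single extreme minute, such as an isolated stale quote.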

[Figure 2 plot: "Triangular, Fourth, and Fifth Order Arbitrage ninety-nine percentile Trade Profit". Series: 3, 4, 5. Horizontal axis: date, May to Jul 2017. Vertical axis: Arbitrage Magnitude, 1.0002 to 1.0014.]

Figure 2: Triangular, fourth and fifth order arbitrage ninety-nine percentile trades over a three
month period.

4 Conclusion

In summary, we have linked the problem of circular arbitrage detection to the identification of
the maximum length cycle in a graph constructed from the best ask price exchange rate matrix
being considered. Next, an algorithm that uses the max-plus product to find the length of the
longest cycle of a given length is discussed and its runtime is compared to an analogous brute
force approach. We finally demonstrate this runtime performance in several examples as well as
expected returns for triangular, fourth, and fifth order arbitrage trades.

There are several additional questions that would be worthwhile to investigate. First, does a
related technique exist that determines both the length and the path of the longest cycle in an FX
rate graph? Secondly, it would be of interest to incorporate transaction costs into the study of
higher order arbitrage, as well as to study the feasibility of executing simultaneously the required
number of trades to realize such arbitrage opportunities. Lastly, we have not considered matching
order sizes when executing circular arbitrage trades, which would be necessary for a practical
application of this technique. This would require the arbitrageur to either limit the potential trades
being considered or maintain excess cash in currencies with the smallest order sizes in the trade.
Both situations may negatively impact the trade profitability and it would be of interest to quantify
to what degree this will affect the practical viability of this procedure.

Acknowledgements: Both authors would like to thank Uri Zwick for bringing to their
attention the max-plus product approach to maximum length cycle identification as well as providing
comments that improved this draft.

References
[1] Aarheim, M. and Johnsen, N. (2013). High frequency arbitrage in foreign exchange markets.
Masters Thesis, BI Norwegian Business School.
[2] Akram, Q.F., Rime, D. and Sarno, L. (2005). Arbitrage in the foreign exchange market:
Turning on the microscope. Working Paper, Norges Bank.
[3] BIS (2016). Triennial Central Bank Survey of Foreign Exchange and OTC Derivatives Markets
in 2016. Bank for International Settlements.
[4] Butz, M. and Oomen, R. (2017). Internalisation by electronic FX spot dealers. Quantitative
Finance.
[5] Fenn, D., Howison, S., McDonald, M., Williams, S., and Johnson, N. (2011). The Mirage
of Triangular Arbitrage in the Spot Foreign Exchange Market. International Journal of
Theoretical and Applied Finance 12 (08).
[6] Floyd, R.W. (1962). Algorithm 97: shortest path. Comm. ACM 5, 345.
[7] Orlin, J. (2017). An O(nm) time algorithm for finding the min length directed cycle in a
graph.
[8] Warshall, S. (1962). A theorem on Boolean matrices. Journal of the ACM 9, 11-12.
[9] Williams, R. (2014). Proceedings of the 46th Annual ACM symposium.
[10] Zwick, U. (2002). All pairs shortest paths using bridging sets and rectangular matrix
multiplication. Journal of the ACM 49 (3), 289-317.
