
Xerberus Whitepaper

Quis custodiet ipsos custodes?


Risk Analysis and Prediction Markets
- VERSION 1 -

November 2022

– Abstract –

Xerberus is a decentralized risk rating organization using advanced mathematics in lieu of the more common machine learning or artificial intelligence methods. Based on provable processes and incentive structures designed with integrity, the organization provides a public good in the form of risk assessment. Xerberus stays true to the original values of crypto: standing up for radical transparency and accountability. Only by creating an objective understanding of the essence of our industry can we fulfill the promise of financial emancipation and freedom.

1 Problem Statement
The decisive impulse for the rise of cryptocurrencies came from the devastating effects of the 2008 financial crisis. At the core of this crisis was a systematic failure of rating agencies. The purpose of rating agencies is to create clarity and establish certainty for investors; understanding risk is essential to guide decisions. Yet these institutions were complicit in selling substanceless securities into the heart of our economy. Centralized power allowed them to slide down the slope of questionable incentives, leading us into crisis.

Since 2009 we have witnessed the birth of cryptocurrencies as a means of liberation from the traditional financial system and its excesses. Unfortunately, however, the new world of crypto has proven to be a breeding ground for the same kind of malicious behavior.

We hear voices demanding that we invite the same guardians into our system that let us down before. It is thought that centralized authorities (e.g., governments and rating agencies) are needed to determine what is safe and what is risky for us. Xerberus is the daring endeavor to prove them wrong and create a verifiable, decentralized source for understanding risk independently. Making crypto a place of equal accountability for all participants will unleash its true potential.

Contents

1 Problem Statement
2 Introduction
3 Decentralized Risk Rating
4 Risk Assessment Logic
   4.1 Data Input
   4.2 Research Markets
   4.3 Research Markets Functionality
      4.3.1 How Accurate Can Prediction Markets Be?
      4.3.2 Number of Needed Participants?
      4.3.3 What About (Prediction) Market Manipulation?
   4.4 The Outcome Oracle
5 Technical Implementation
   5.1 Statistical Analysis
      5.1.1 Wallet Identification
      5.1.2 Velocity and Distribution
      5.1.3 Governance Power
      5.1.4 Trading Analytics
   5.2 Topological Data Analysis
   5.3 Persistence Diagram
   5.4 Shape Risk Score
   5.5 Risk Manifold
      5.5.1 Local Region
      5.5.2 Global Region
   5.6 Trajectory Risk Score
   5.7 Prediction Scoring
   5.8 Curve-Fitting (Machine Learning)
6 Roadmap
   6.1 Xerberus Labs
   6.2 Xerberus Protocol
   6.3 Pre-Alpha Stage
   6.4 Alpha Stage
   6.5 Beta Stage
   6.6 Fully Functional Stage
   6.7 Decentralization Stage
7 Tokenomics
   7.1 Token Utility
   7.2 Utility Token Value Thesis
   7.3 Utility Token Demand Thesis
   7.4 Utility Token Pricing Discussion
      7.4.1 Lemma
   7.5 Token Distribution
8 Conclusion

2 Introduction
In an unregulated and new financial system like cryptocurrencies, the chance that a project is a "rug-pull" or has malicious intent is much higher than in standard financial markets. We seek to classify projects based on their inherent chance of success, failure, or mal-intent using provable methods that can be freely verified by the public. In doing so, we approach the classification using mathematical proofs rather than artificial intelligence (AI) or machine learning (ML). Both of these methods, although promising and with a lot of potential, do not adequately fulfill our desire for verifiability.

In addition, both methods have drawbacks that we consider too large for them to be used at the computational level of our analysis:

• AI lends itself to the bias of its developers; an unseen bias when considering risk can "blind" us and allow for manipulation.

• ML requires an exhaustive training process that can easily be trained to detect, or ignore, potentially mission-critical information; crypto financial markets are a new system and no one has clearly defined parameters that allow for accurate and thorough training.

• Both methods are considered "black boxes" with no definitive way to understand the exact path the computation took to reach its decision; when dealing with finances and risk, not being able to clearly explain why you make your claims can be reckless.

We desire a method of classification that is more robust and verifiable in order to achieve our goal of a bias-free and open risk analysis framework. We believe that by relying on the logical proof mechanics of mathematics we can achieve a more accurate and more easily verifiable risk model.

The reader should note that section [5.8] specifies our uses of machine learning. In short, we do not use it as a computation layer but as an aid in estimating the 'risk manifold' discussed in section [5.5]. Machine learning serves only as an estimation layer to "guess" the correct manifold and is not used for risk scoring itself.

Additionally, we believe in the power of crowd knowledge and its uses. We introduce a prediction market to harness this wisdom and use it for analysis; by doing so, we are able to accurately accumulate non-empirical and non-conventional data to feed our risk analysis. Providing an avenue for rational actors to turn their proprietary knowledge into profits brings us closer to the efficient market hypothesis, which states that share prices reflect all known information. Xerberus' approach to information input will yield a best-in-class, all-encompassing risk assessment model by combining blockchain clarity, mathematical provability, and proprietary information aggregation.

3 Decentralized Risk Rating
The Xerberus organization will build upon the foundations created by traditional rating agencies and combine them with decentralization and transparency. Decentralization of risk rating is achieved by using the wisdom of the crowds instead of the opinions of a few selected analysts. Non-transparent incentives can easily corrupt the latter, while the former includes all available knowledge orchestrated by transparent incentives.

The Xerberus research markets collect the crowd's wisdom. The markets allow researchers to communicate their opinions about future events and their importance by staking tokens on a research poll. For example, a poll might ask whether project X will achieve its next milestone within the published timeline and whether accomplishing said milestone matters. Thereby we create an incentive for knowledgeable actors to monetize their insights and thus provide us with an event-based probability curve.

In theory, markets are efficient; however, crypto markets have displayed weaker efficiency in the past. By creating public sub-markets for particular questions, we encourage both insiders and experts to share their knowledge, incentivized by profits. Therefore, our event-based probability curve will contain more of the available knowledge than is currently expressed in the market price, and thus we move closer to stronger market efficiency.

The event-based probability curves are used both to weight the empirical data collected from the Cardano blockchain and as data points in their own right. Thereby we create an all-inclusive risk model that assigns a score based on all available information, that is, on-chain and off-chain information. Transparency is created by storing the dynamic outputs of the risk model on-chain and updating them continuously. Everyone with sufficient knowledge can understand the mathematical proof underlying our risk assessment.

Marrying research markets with the infallible nature of mathematics to form a risk model creates an incorruptible and more complete risk assessment that liberates investors from relying merely on word of mouth or 'influencer' opinions and forms a fair playing field of equal accountability.

4 Risk Assessment Logic
The inner workings of the Xerberus model are quite complex and intertwined. Below we have provided a flow chart of the methods used to reach our risk scores. In the upcoming section, we briefly explore the systems and tools used in reaching our solutions. For the sake of easy reading, we have omitted the mathematical equations from our explanation; Appendix [A] provides a more rigorous definition. For an example of how to use the flow chart, calculating the "Shape Risk Score" and understanding the tools needed is analogous to the path:
[IN-1] → [SA] → [TDA] → [SRS].

4.1 Data Input


Generally, we source and divide our input data into two broad categories: empirical and non-empirical. Empirical data is data sourced from definite means: blockchain data, tax returns, and confirmed sources. Non-empirical data is data received from biased sources: implied information from research markets and sentiment analysis. By dividing our data sources into these two categories, we can construct two versions of our risk score, one that relies only on concrete facts and one that takes into account the larger macro factors that may influence the risk of an asset.

A project's data is projected as a point cloud in $\mathbb{R}^n$, where the dimension $n$ is the number of data points per sample. A Vietoris–Rips complex is constructed from the input point cloud and allows us to study the specific aspects of each project via its particular shape. This process converts the data from a point cloud into a filtration of simplicial complexes. Taking the homology at each step of the filtration gives a persistence module:

$$H_i(X_{r_0}) \to H_i(X_{r_1}) \to H_i(X_{r_2}) \to \dots$$

Projecting the data in such a way has the benefit that we are able to score and evaluate the relations of data points for each sample set without having to infer the physical magnitude of their scalar values, giving us a very powerful tool that is invariant under magnitude.
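To make the pipeline concrete, the following minimal sketch builds a Vietoris–Rips filtration from a toy point cloud and reads off its persistence intervals. It assumes the open-source GUDHI library purely for illustration; the point cloud and all parameters are placeholders, not the actual Xerberus feature set.

```python
import numpy as np
import gudhi  # open-source TDA library, assumed here purely for illustration

# Toy stand-in for a project's data: each row is one sample, each column one feature.
rng = np.random.default_rng(42)
point_cloud = rng.normal(size=(200, 5))  # 200 samples in R^5

# Build the Vietoris-Rips filtration up to a chosen scale and dimension.
rips = gudhi.RipsComplex(points=point_cloud, max_edge_length=2.0)
simplex_tree = rips.create_simplex_tree(max_dimension=2)

# Persistence intervals (dimension, (birth, death)) of the persistence module
# H_i(X_{r0}) -> H_i(X_{r1}) -> ...
diagram = simplex_tree.persistence()
for dim, (birth, death) in diagram[:10]:
    print(f"H{dim} feature born at {birth:.3f}, dies at {death:.3f}")
```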

4.2 Research Markets
The efficient market hypothesis states that all available information is already priced into an asset's price. Academia knows several versions of this theory, ranging from weak to strong market efficiency. Empirical data confirms that some markets are, in fact, less efficient than others. Tran and Leirvik (2019) [1] show, for example, that bitcoin markets were significantly less efficient before 2017.

Some dispute market efficiency because of financial crises. However, a financial crisis does not contradict market efficiency. The efficient market hypothesis only claims that known information is priced in; unknown information that causes crises is not yet priced in. This phenomenon is well described as market uncertainty by Slovik [2], and we will refer to it as unknowable information.

One could argue that some market participants know the unknown before other market participants consider it true; this point is strengthened by the fact that we can observe differences in efficiency across markets. For example, some market participants knew that the cryptocurrency "Luna" carried a real risk of collapse while most market participants did not.

Therefore we propose a model of thinking in three stages: Priced Information, Knowable Information, and Unknowable Information. The difference between priced and knowable information in crypto markets is bigger than in well-established financial markets. One reason might be that, in general, smaller markets are less efficient than larger ones. Another reason for less efficiency in crypto markets could be the high complexity of their products and the retail-driven demand for their tokens. Retail demand is often based on word of mouth and influencers rather than in-depth understanding.

Xerberus' proposed research markets and analysis methods provide insight into the divergence between priced and knowable information. This insight is generated by incentivizing experts to communicate their knowledge in curated sub-markets around specific questions. The corporate prediction markets described by Cowgill and Zitzewitz (2015) [3] inspired the proposed research markets. These corporate prediction markets displayed a higher accuracy than expert predictions, despite having insufficient incentives. Our markets are open to experts, enabling them to monetize their knowledge, thus providing an incentive to participate. The openness might also invite experts with limited insights; however, the nature of the market will reward knowledgeable traders and penalize gamblers. Prediction markets inspire the research markets; however, we will not use a "betting mechanism" - instead, we use interconnected polls. These polls come in the form of questionnaires we call "forecasts." Additionally, Xerberus intends to implement a system that is statistically impossible to win via random bets, through mathematical methods described in future versions of this paper.

4.3 Research Markets Functionality

The research markets allow researchers to stake Xerberus tokens on forecasts. These forecasts are essentially a varying set of predictions a researcher can choose from. A researcher can inspect all offered forecasts and choose those believed to be accurate by staking on them. The researcher must also state whether they believe their forecast is remarkable or whether they assume most other researchers will have come to similar conclusions. Additionally, the researcher is asked about their confidence in their prediction and to add some optional qualitative context. Rewards pay out after an event occurs and prediction accuracy becomes apparent. Tokens can only be won but never lost within the research market. The reward function accumulates with the repeated success of a researcher but punishes researchers who appear to stake on forecasts at random. A "gambling" researcher will receive no rewards in our system since no information has been added. Veteran researchers who consistently add information with statistical significance to the model can propose forecasts themselves as long as they commit collateral to a question proposal. The collateral is burned if the question does not adhere to the formulation rules. The formulation rules aim to prevent biased questions from being posted on the research markets.
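This whitepaper leaves the exact reward function to a later version. Purely as an illustrative assumption, the sketch below shows one way a researcher's reward weight could grow with a consistent, better-than-chance track record while remaining zero for random staking; every name and formula here is hypothetical.

```python
def reward_weight(hits: int, total: int, baseline: float = 0.5) -> float:
    """Hypothetical reward weight: zero unless a researcher beats random guessing.

    hits: forecasts the researcher staked on that turned out correct.
    total: forecasts the researcher staked on overall.
    baseline: accuracy a pure gambler would reach by chance.
    """
    if total == 0:
        return 0.0
    edge = hits / total - baseline   # information added beyond chance
    if edge <= 0:
        return 0.0                   # gamblers accumulate nothing
    return edge * total              # weight grows with a consistent track record

# A veteran with 70% accuracy over 50 forecasts outweighs a lucky newcomer at 2 out of 3.
print(reward_weight(35, 50), reward_weight(2, 3))
```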

Given the research cited earlier, almost all tokens within our ecosystem will show significant market
inefficiency, and the proposed system will be able to address it. In addition to introducing higher
efficiency into the market, there is reason to believe Xerberus might be able, given enough data, to
quantify and predict unknowable information. Anticipating unknowable information could go as
far as being able to predict crashes, failed projects and "rug-pulls" well before they become apparent.

4.3.1 How Accurate Can Prediction Markets Be?


In order to ensure an accurate prediction market, we require two main factors.
Firstly, the forecasts should have unbiased wording and be timely positioned; that is to say, the question posed should be fair in its wording, and trading on the question should take place on a time scale relevant to the question. Asking about the completion probability of a project's road-map should not be posed on day one of the project, as no one can accurately know whether the project is likely to fulfill its intentions on time until some historical evidence of the speed of development is present and a concise time frame is conceivable.
Secondly, the standard error of any given forecast should be calculable and defined. In contrast to more traditional forecasting methods, measures of forecast standard error are not readily apparent for prediction markets, because prices (predictions) are not averages of random-walk samples but something more guided in its intentions. Market prices/predictions are also not typical time series of fundamental variables. Instead, they are a sequence of forecasts of a single future outcome. Berg, Nelson, and Rietz (2003) [4] lay a foundation for an accurate calculation of standard errors using various methods.
Additionally, the ability of the prediction market to aggregate information and make accurate predictions rests on the efficient-market hypothesis [4.2]. Blockchain technologies give us a greater amount of information about a project's finances; by combining prediction markets with blockchain data we are able to further increase our model's accuracy by approximating the efficient-market hypothesis with financial analysis, whale-wallet watching, and other notions reviewed in other papers.

Another method for increasing accuracy was presented by MIT in 2017 with an algorithm called the "surprisingly popular" algorithm. The algorithm relies on the idea of querying traders to evaluate their own confidence when placing a bet. The method asks people two things for each bet: what they think the right answer is, and what they think popular opinion will be. The variation between the two aggregate responses indicates the correct answer.
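The following sketch illustrates the "surprisingly popular" selection rule for a binary question. It is a minimal rendering of the published idea, not Xerberus' implementation, and the toy numbers are placeholders.

```python
from typing import List

def surprisingly_popular(answers: List[bool], predicted_share_yes: List[float]) -> bool:
    """Select the answer whose actual support exceeds the support the crowd predicted for it.

    answers: each respondent's own yes/no answer.
    predicted_share_yes: each respondent's estimate of the fraction answering yes.
    """
    actual_yes = sum(answers) / len(answers)
    expected_yes = sum(predicted_share_yes) / len(predicted_share_yes)
    # "Yes" is surprisingly popular when more people say yes than the crowd expected.
    return actual_yes > expected_yes

# 40% answer "yes" although respondents expected only 25% to do so,
# so "yes" is surprisingly popular and is taken as the likely correct answer.
votes = [True] * 4 + [False] * 6
predictions = [0.25] * 10
print(surprisingly_popular(votes, predictions))  # True
```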

4.3.2 Number of Needed Participants?
Surprisingly, our research suggests that the number of active traders needed for any given bet is quite low. Christiansen (2007) [5] reported that prediction markets with more than 16 traders were well-calibrated. Research by McHugh and Jackson (2012) [6] found that changing the number of traders had minimal impact on accuracy as long as the markets had more than 20 traders.
4.3.3 What About (Prediction) Market Manipulation?
It is likely that groups will be motivated to manipulate the market for personal gain. However, in practice, such attempts at manipulation have always proven to be very short-lived. In their paper, Hanson, Oprea, and Porter (2005) [7] show how attempts at market manipulation can in fact end up increasing the accuracy of the market because they provide a profit incentive to bet against the manipulator.
4.4 The Outcome Oracle
Determining who won a bet requires an impartial means of verification. Achieving this verification in a decentralized way is critical to preserving the system's integrity. We intend to deploy the Cardano Open Oracle Protocol (COOP), developed by Orcfax [10], to address this challenge. However, in the early stage of development, the Xerberus organization will determine the outcome of bets in good faith with the help of community contributors.

5 Technical Implementation
5.1 Statistical Analysis
When we receive empirical data, we need a method of converting it from its raw informative form into one that can be analyzed uniformly. To accomplish this, we employ methods rooted in statistics to estimate the probability of any particular event. A pillar of blockchain technology is privacy through anonymity; because of this, we can never define any given notion with complete certainty, but we can give a statistical representation of the chance that an event is true or not.
5.1.1 Wallet Identification
One method of analysis is reviewing the individual wallets of each project. A project whose wallets primarily trade against its market or sell large amounts of tokens bought at some token discount event is riskier than a project whose wallets primarily buy or use the token in ever-increasing amounts. We take a statistical approach and classify wallets based on:

1. Origin point of tokens in a wallet (e.g., from a DEX or other means)

2. Buying and selling tendencies

3. A wallet's activity in relation to other assets held

4. Identification and tracking of whale wallets and insider wallets

By reviewing these parameters, we can assign probabilities that a wallet belongs to a particular subtype: insider, institution, founder, team, whale, hodler, and many more.
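Purely as an illustrative sketch, the snippet below turns a few of the parameters listed above into subtype probabilities for a single wallet. The feature names, weights, and subtypes are hypothetical; the actual Xerberus features and model are not specified in this paper.

```python
from dataclasses import dataclass

@dataclass
class WalletFeatures:
    # Hypothetical features derived from a wallet's on-chain history.
    received_from_dex_ratio: float   # share of tokens that arrived via a DEX
    sell_to_buy_ratio: float         # selling pressure relative to buying
    holdings_share: float            # share of circulating supply held by the wallet

def subtype_probabilities(w: WalletFeatures) -> dict:
    """Turn heuristic scores into a probability distribution over wallet subtypes."""
    scores = {
        # Large holdings plus heavy selling hint at an insider or founder wallet.
        "insider": 2.0 * w.holdings_share + w.sell_to_buy_ratio,
        # Large holdings acquired on the open market hint at a whale.
        "whale": 2.0 * w.holdings_share + w.received_from_dex_ratio,
        # Small, DEX-sourced, buy-dominated wallets look like ordinary holders.
        "hodler": w.received_from_dex_ratio + max(0.0, 1.0 - w.sell_to_buy_ratio),
    }
    total = sum(scores.values()) or 1.0
    return {subtype: score / total for subtype, score in scores.items()}

print(subtype_probabilities(WalletFeatures(0.1, 1.8, 0.25)))
```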

5.1.2 Velocity and Distribution
A token's velocity differs from its volume in that the former quantifies how much that particular token moves between wallets, while the latter refers to trading activity. A token that is considered a utility token will have a very high token velocity, whereas a governance token should have a low velocity, yet both can share similar trading volumes. Quantifying the difference between velocity and trading volume adds an extra layer for identifying the projects and wallets acting on chain, and we can then use both for further quantification.
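A minimal sketch of the distinction, under the simplifying assumption that velocity is total token movement over a period divided by circulating supply, while volume counts only transfers that touch a trading venue; the record format is hypothetical.

```python
from typing import Dict, List, Tuple

def velocity_and_volume(transfers: List[Dict], circulating_supply: float) -> Tuple[float, float]:
    """Derive a simple token velocity and trading volume from transfer records.

    transfers: records like {"amount": float, "is_trade": bool}, where is_trade
    marks transfers that touch a trading venue (e.g. a DEX).
    """
    total_moved = sum(t["amount"] for t in transfers)
    traded = sum(t["amount"] for t in transfers if t["is_trade"])
    velocity = total_moved / circulating_supply   # turnover of the supply in the period
    volume = traded                                # units traded on markets in the period
    return velocity, volume

# A governance token: little overall movement, so low velocity even though most
# of what does move is traded.
records = [{"amount": 1_000.0, "is_trade": True}, {"amount": 200.0, "is_trade": False}]
print(velocity_and_volume(records, circulating_supply=1_000_000.0))
```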
5.1.3 Governance Power
Tokens that have an aspect of governance utility also need evaluation: not every token of a project is used for governance proposals, as wallets defined as treasuries and the like should not be allowed to vote. We seek to validate which wallets can and do vote and to track the decentralization of a project. Additionally, not every governance token is created equal; some provide more "voting power" per token than others, so we develop a scoring system to rank the potential influence that holding a particular token provides.
5.1.4 Trading Analytics
Lastly, we can observe the trading metrics of a token: Where and in what quantities is the token being traded? By which set of wallets? On what time scales? Answers to these questions, together with commonly known trading theory methods, let us infer the state of a project and refine our statistical outlook via the differences in these metrics.
5.2 Topological Data Analysis
Topological data analysis (TDA)[9] can broadly be described as a collection of data analysis
methods that find structure in data. These methods include clustering, manifold estimation,
nonlinear dimension reduction, mode estimation, ridge estimation and persistent homology.[8]
Using topological methods we are able to gain insights on datasets that are very high-dimensional,
incomplete and noisy. In addition, this approach rests on an important mathematical notion, "functoriality": a method of mapping between individual sets/categories. The main theory driving
TDA is that the "shape" or structure of data contains relevant information. For example, a linear
relationship between two variables can be defined as a line, a cyclic relationship (think of the
changing of seasons, or bear vs bull markets) can be classified as a circle or relevant sinusoidal
function and so forth.
5.3 Persistence Diagram
Each project has a unique set of data values, and therefore a unique shape, commonly defined by the Vietoris–Rips complex, can be generated for it. Slight changes in the underlying data correspond to a morphism of the generated shape. All shapes (lines, planes, volumes, etc.) that make up the larger super-shape of a project are then cataloged by how long each particular shape persists in the data, through parameterization techniques defined by TDA. The resulting information records the creation and destruction of each shape type and is then graphed as a persistence diagram: a collection of points in $\Delta := \{(u, v) \in \mathbb{R}^2 \mid u, v \ge 0,\ u \le v\}$, where a point at $(x, y)$ indicates a topological feature born at scale $x$ that persists until scale $y$. The persistence of these topological features is crucial in determining the behavior of the underlying signal and the dynamic nature of the project generating it.

More persistent features are detected over a wide range of spatial scales and are deemed more likely to represent true features of the underlying space rather than artifacts of sampling, noise, or a particular choice of parameters. When comparing the persistence diagrams generated from two separate projects X and Y, the Wasserstein distance allows us to quantify the similarities or dissimilarities between the projects' shapes.

A persistence barcode is a way to structure the data very efficiently while still preserving all information about a data set in the form of Betti numbers and persistence diagrams. These parameters can be efficiently placed on-chain for verification and transparency of how we determine the relative amount of risk per project.
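As a small sketch of this comparison step, the snippet below computes degree-1 persistence diagrams for two toy point clouds and measures the Wasserstein distance between them; the use of GUDHI (with the POT optimal-transport package) and all parameters are assumptions for illustration.

```python
import numpy as np
import gudhi
from gudhi.wasserstein import wasserstein_distance  # needs the POT package installed

def degree1_diagram(points: np.ndarray) -> np.ndarray:
    """Degree-1 persistence diagram (loops) of a point cloud as an (n, 2) array."""
    rips = gudhi.RipsComplex(points=points, max_edge_length=2.0)
    st = rips.create_simplex_tree(max_dimension=2)
    st.compute_persistence()
    return st.persistence_intervals_in_dimension(1)

rng = np.random.default_rng(0)
# Toy stand-ins for two projects' point clouds; real feature sets are not specified here.
project_x = rng.normal(size=(150, 4))
project_y = rng.normal(size=(150, 4)) + 0.5

# A small distance means the two projects' data have a similar shape.
print("W1 distance:", wasserstein_distance(degree1_diagram(project_x),
                                           degree1_diagram(project_y), order=1.0))
```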

5.4 Shape Risk Score
For the first risk score, we rely exclusively on empirical data for shape generation and therefore classification. Consider the set of all projects and their corresponding shapes; some projects (take for example the entire set of DEXs) will have very similar shapes because of their similarity of scope.

We take all shapes generated by all projects and order them by commonality or similarity.

This risk score essentially takes all the defined shapes and grades them from most common to least common. If the reader desires to understand the potential risk of a project, they can reference the "shape scale" and see whether the project in question is functioning in a similar manner to other projects of its type. Additionally, we can subcategorize projects based on their shape: say that all rug-pulls possess a certain isomorphism that shows some level of consistency throughout all their data sets; then any project also containing that isomorphism inherits a level of risk from it. The risk score is generated as follows:

Risk of a project is defined as the variance of its shape from all other projects, and the similarities in shapes give an indication of analogous intent.
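A minimal sketch of this grading, reusing the diagram distance above: each project is scored by its average Wasserstein distance to all other projects, so common shapes score low and outliers score high. The plain-mean aggregation is an illustrative assumption, not the published formula.

```python
import numpy as np

def shape_risk_scores(distance_matrix: np.ndarray) -> np.ndarray:
    """Score projects by how far their shape lies from everyone else's.

    distance_matrix[i, j] holds the Wasserstein distance between the persistence
    diagrams of projects i and j (zero on the diagonal).
    """
    n = distance_matrix.shape[0]
    # Average distance to all other projects: common shapes score low, outliers high.
    return distance_matrix.sum(axis=1) / (n - 1)

# Toy three-project example: the third project is the outlier and scores highest.
D = np.array([[0.0, 0.2, 1.5],
              [0.2, 0.0, 1.4],
              [1.5, 1.4, 0.0]])
print(shape_risk_scores(D))
```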

5.5 Risk Manifold


The risk score defined above gives a brief description of the classification of projects based on their data; however, this method taken alone can be insular, failing to capture macro events and the fact that some projects naturally vary in shape from others. RealFi projects vary greatly in shape from "pfp NFT" projects, for example. To account for this, we project the projects' shapes into a "space-time" referred to as the risk manifold. This space is parameterized by both statistical and research market data points.
We define a manifold constructed from non-empirical and often macro parameters that can be used as a background of sorts, where we use the shape of a project to determine its location in the risk manifold uniformly. The manifold can be used as a boundary to observe the change in shape of each project over time.

Theory 1: The risk manifold is an n-dimensional manifold constructed of all possible shapes on a local level, where the "curvature" of the space is determined by macro-events and is continuous.

One can think of the risk manifold as the complete set of all possible shapes. The observation of the change of shape over time suggests the idea of movement, or some notion of velocity, where an inevitable change in the data of a project "moves" it in the risk manifold. The lifespan of a project can then be abstracted into a simple path along the risk manifold.

Theory 2: All projects lie on the surface of the risk manifold, and there is an infinite number of paths that a project can take to translate from point A to point B.

That is to say, every project is some amorphous projection of an idea, and by moving along the risk manifold every project can theoretically become any other. Any morphism of a project is another point on the risk manifold. We use these theories to generate a space where the classification of projects is done globally and in context.
5.5.1 Local Region
A local region of the risk manifold can be thought of as a space on the risk manifold that encompasses all projects of a particular sub-type. Take, for example, any project that classifies as a "DEX"; when observing this part of the manifold through time, we can see projects "move" here and there as described in the section above. We can then define areas of shapes that could be classified as risky. When a project morphs into, or out of, these areas it is bound by the theories stated above.

Figure 1: Snapshot of a local area in the risk manifold

The figure gives an arbitrary depiction of the risk space with two areas, one high risk and one low risk. Through time we can observe the trajectory of a project and define a more dynamic score: it may be better to be a project that sits in a higher-risk area but is moving over time towards the low-risk area than to start in a low-risk area while pointed towards higher risk. Local areas give an indication of the success of a project in relation to other projects of a similar sub-type.

5.5.2 Global Region
The global region is defined as the super-space that is dictated by macro-events and general sentiment and is generated from the outcomes of the prediction markets. If a project is free to move about the surface on a local level, then the global region determines the set of available paths that can be taken.

Figure 2: Global representation of the risk manifold

The figure is an example of the global structure of the risk manifold. The motivation for this construction is as follows: a group of projects carries some spectrum of risk that is a manifestation of risk for their particular niche; by mapping all projects globally we can understand the differences in risk per grouping and better weigh the effects of each group's risk on the others. Consider each color as an inflection of risk based on different parameters. A project in the blue global section may have the same risk score as a project in the yellow section, yet the two projects received the same score for different reasons.
5.6 Trajectory Risk Score
With this framework in mind, we are able to define our second risk score:

Risk of a project is defined as the speed of translation of its shape in the risk manifold, and its path about the manifold determines the trajectory of the project. The local area of a project defines the parameters upon which we classify risk.

By molding a shape that captures a global picture of all possible shapes a project can actuate, via research markets and global factors, we are able to embed our risk score in its context. By doing so, we allow for a greater understanding and a unique determination of risk for each sub-type of project. However, the introduction of research markets also turns our risk score from a mathematically provable model into a mathematical model with potentially fallible inputs. In an effort to remain true to our quest for provability and transparency, we found it necessary to provide the initial determination of risk as well as this secondary and ultimately more contextual version.

5.7 Prediction Scoring


The introduction of the risk manifold gives us a unique opportunity: we are able to give accurate risk predictions for the futures of projects as well. By observing the speed and direction of a project, and keeping true to the two theories stated in section [5.5], we are able to estimate what potential problems or risk factors a project may face in the future through projection analysis of the project's path. If this proves accurate enough, we can also advise said projects with the precise information needed to "course correct". This provides another use case for Xerberus in the form of advising projects to success based on their past and present risk.
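Purely as an illustrative sketch of this projection analysis, the snippet below estimates a project's velocity from its recent positions in some local coordinates of the manifold and extrapolates a step ahead; the coordinates, window, and linear extrapolation are assumptions, not the published method.

```python
import numpy as np

def extrapolate_position(history: np.ndarray, steps_ahead: int = 1) -> np.ndarray:
    """Estimate a project's next position in local coordinates of the risk manifold.

    history: (t, d) array of the project's embedded coordinates over time.
    Uses the average recent displacement as a crude velocity estimate.
    """
    velocity = np.diff(history, axis=0).mean(axis=0)  # average step per time unit
    return history[-1] + steps_ahead * velocity

# Toy path of a project drifting steadily in one direction of the local chart.
path = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.4], [0.3, 0.6]])
print(extrapolate_position(path, steps_ahead=2))  # roughly [0.5, 1.0]
```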
5.8 Curve-Fitting (Machine Learning)
Although we have stated that we will not use machine learning for our calculations, we will use it for risk predictions and futures. The key takeaway is that although we use ML for a speculative aspect of our scores, it is not used as a core component of any calculation of the risk scores themselves.

In reality, when we construct the risk manifold using the methods described above, it will not, at least naturally, fulfill Theory 1 of the risk manifold with regard to continuity. We employ machine learning curve-fitting techniques to find an approximation of the manifold that is continuous, which allows us to plot not only the historical path of a project but also to give some idea of where it is headed.

Figure 3: A smoothing of the Global Region of the risk manifold by machine learning

If we can define a space that holds under Theories 1 and 2 presented for the risk manifold, we can create a risk score that becomes invariant under the prediction markets and is therefore provable. We seek to use ML as a temporary tool to refine the structure of the risk manifold more quickly than canonical observation can.
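The sketch below illustrates one possible smoothing step of this kind: fitting a smooth, continuous surface to scattered samples with a standard regressor. The choice of scikit-learn and a Gaussian-process model is an assumption made for illustration, not a method Xerberus has committed to.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)

# Toy, noisy samples of a "risk surface" z = f(x, y) observed at scattered points.
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
z = np.sin(np.pi * xy[:, 0]) * np.cos(np.pi * xy[:, 1]) + 0.05 * rng.normal(size=200)

# Fit a smooth, continuous approximation of the surface.
model = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-2)
model.fit(xy, z)

# The fitted surface can be evaluated anywhere, e.g. along a project's projected path.
query = np.array([[0.25, -0.40], [0.30, -0.35]])
print(model.predict(query))
```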

6 Roadmap
6.1 Xerberus Labs
"Xerberus Labs" will be incorporated as a limited liability company (GmbH) in Zug, Switzerland, with the intention to become a recognized, active member of the 'crypto valley'. Switzerland provides the optimal legal framework for Xerberus products. The company is essential for building up the Xerberus organization and providing a legal counter-party for operations. Furthermore, some of the intellectual property will remain within Xerberus Labs. Additionally, Switzerland offers a legislation package allowing for a tokenized equity model enforcing on-chain governance of the underlying company. We believe this regulated and enforceable governance to be the future and feel confident that Switzerland is, and will remain, an important location for innovation in our industry. Note that a tokenized equity would be a fully regulated security token; such a token would be issued separately and does not connect to the Xerberus utility token used to access, benefit from, and participate in the governance of the protocol.

6.2 Xerberus Protocol
The protocol combines user governance with transparency to create a system for everyone to view and understand token risk (see Appendix B). To achieve this, the Xerberus Protocol provides three solutions:
• It allows open cooperation of experts to find a consensus about risks and incentivizes their cooperation.
• It finds a fair market price for information.
• It provides transparency as to why a risk score is given and archives previous scores on-chain for the public to see and reference.

6.3 Pre-Alpha Stage


• Building infrastructure to read information from the blockchain and perform statistical analysis and other research.
• Funding: bootstrap.
• Output: A basic dashboard with fundamental analysis on tokens; nothing world-changing, but a good set of information that provides a foundation for more advanced research.
• Finished by: End of Q4 2022

6.4 Alpha Stage
• Use of the available data combined with TDA [5.2]. We look at the shape of the data and see which projects' data look similar to each other. Based on the similarity, we provide a risk score. The idea is that good projects should look alike, as should rug-pulls.

• Funding: Attempt to acquire our pre-seed funding via ISPO and/or Project Catalyst.

• Output: Expanding the dashboard with risk scores based on blockchain data that provide unique and in-depth insights into the quality of a project. Verifiable and provable.

• Finished by: End of Q1 2023

6.5 Beta Stage


• After understanding the shape of successful and failed projects, we need to add a prediction element to the risk analysis. Therefore we create a public research market and allow broad knowledge to flow into those predictions while avoiding closed-door decisions. By combining the research market output with the TDA data we build a "risk universe": a multidimensional space in which projects exist like planets. By creating this space we can now predict, like astronomers, where the objects in our space are headed.

• Funding: We seek to receive our seed funding via strategic investors.

• Output: A research market for researchers whose knowledge will improve risk scores significantly by providing a richer data set. Everyone can now learn about the quality of a project today, and on what trajectory it is heading in the future.

• Finished by: End of Q3 2023

6.6 Fully Functional Stage


• During the last stage of building the core, we add a machine learning module that is trained to recognize and sort the data shapes of projects. Thereby we make the model run by itself, without human interference beyond feeding information into the model via the research market.

• Funding: Revenues of the prediction market and B2B API.

• Output: A fully functional risk rating organization that provides in-depth knowledge about the quality of a project, its future trajectory, and how likely it is to continue on this trajectory. The rating is transparent, provable, and void of corruptible human interference.

• Finished by: End of Q4 2023 / Early Q1 2024

6.7 Decentralization Stage


• After the model is fully established, we focus on moving all parts of the organization over to a fully decentralized infrastructure, including but not limited to decentralized hosting, securing of rankings on-chain, and allowing people to reference our risk output in their on-chain transactions: e.g., triggering a liquidation if a score falls below a threshold.

• Output: The first fully decentralized and transparent risk rating organization in the history
of finance.

• Finished by: Q2-Q4 of 2024

7 Tokenomics
The Xerberus token's purpose is to enable users to engage with the Xerberus research markets in a decentralized way that aligns incentives to create a virtuous cycle of value creation on the platform. The token is the essential vehicle for value creation and distribution. The token creates value by equipping the researcher's opinion with weight and balancing that weight toward researchers with the highest accuracy. The token distributes value by providing prioritized access to critical information to those parties with the highest stake in a specific data category. The token will also carry platform governance rights.

A Xerberus token grants you the right to use the Xerberus platform either as a researcher or as a consumer of data. You may benefit from the platform through your active research contributions as a researcher. The Xerberus token is the only way a user can contribute to and access the output of the Xerberus research markets. Therefore, the token is a utility token.

7.1 Token Utility


The aforementioned utility makes the Xerberus token system critical as it is central for value
creation within Xerberus’ risk assessment:

• Stake tokens on any given forecast to earn more tokens and access the research markets.

• Stake tokens on signal streams to receive priority access to information.

• Vote on platform governance decisions such as listing new assets and defining parameters of
the risk manifold.

7.2 Utility Token Value Thesis


The first distribution of Xerberus tokens to users will be via different stages of product pre-sales.
After this, the token will be available on various trading venues. The ability to exchange Xerberus
tokens for other digital assets is essential for the incentive mechanism that drives Xerberus. No
expert will share their knowledge if the tokens they receive are stuck on the platform or have no value.

Therefore, the question of why Xerberus tokens have value is critical to the underlying incentive mechanism. The Xerberus token is a vehicle to coordinate different actors to produce an output: risk assessments that guide investors in their decisions. These risk scores provide non-obvious insight into which tokens are valuable. Since these insights aren't yet widely known, they are not priced into the market. By extension of this logic, publishing or changing a Xerberus risk score for an asset should change its market price. This price change is an opportunity for a trader and thus provides direct value to the public.

If the probability of a forecast changes the Xerberus risk score of asset λ, and the market price adjusts the market capitalization of asset λ from 100 million ADA up to 104 million ADA, then the Xerberus impact on the market price is four percent. This price change demonstrates that the information implied in the forecast has a value of four million ADA. The Xerberus utility token is a tool to capture the value the experts put into the platform (i.e., their effort) and distribute this value back onto the market while optimizing both the quality of input and the output price.

Time   Event
t=0    The risk score for asset λ puts it among the blue-chip assets.
t=1    Forecast β on a critical event for asset λ is published.
t=2    The probability of forecast β falls into the negative, indicating the event is unlikely to occur.
t=3    The risk score of asset λ is adjusted, removing it from the blue-chip ranking.
t=4    Traders open new positions in expectation of an imminent price change of asset λ.
t=5    The price of asset λ adjusts.

Table 1: Logic; Risk Score Impact on Price
7.3 Utility Token Demand Thesis
Token demand from the data supply side: For any forecast on an event critical to the price development of an asset, there is a number of people who know critical information. Every person will have a different amount of liquidity available to monetize their knowledge. The available liquidity depends on their means and on the actors' certainty about their insight. For example, a rational actor who knows with 100 percent certainty that an event will not occur will monetize this risk-free opportunity to maximize profits. This actor would be willing to invest any amount into Xerberus tokens, as the pay-off is guaranteed to be higher than the original investment. A rational actor who believes with little certainty that an event will not occur will be more cautious when staking on a forecast.

Token demand from the information consumption side: For any prediction on an event critical to the price development of an asset, there is a number of traders capable of capitalizing on price volatility, given the right information. These traders will receive prioritized access to such information as long as they have staked Xerberus tokens on a signal stream for a specific token. The protocol will distribute the signals following the staking leaderboard. Therefore, traders will constantly buy more tokens to have the highest stake compared to other traders, as long as the profit they can make by capitalizing on price volatility is higher than the cost of an additional stake.

7.4 Utility Token Pricing Discussion


We have established that there is rational demand for the Xerberus utility token; we still have to develop a simple metric to understand whether the Xerberus token is cheap or expensive. One way of looking at the fair price is to compare the price of the token with its practical impact on the market prices of other assets. The price impact should equal the value of the information communicated through the tokens, weighted by market participants' usage of the resulting risk score. This value then also accurately reflects the total value traders could extract if the volatility created is traded optimally.

Aggregated Price Impact = Pricing-Relevant Information × Risk Score Distribution

7.4.1 Lemma

If the aggregated price impact of all Xerberus tokens is larger than the token's market capitalization, the token is undervalued. As a result, the price volatility value is higher than the value of all tokens. However, professional traders can turn the unpriced value into profit, and thus they will buy more tokens until the price impact and market cap are equal again.
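Purely as a toy illustration of the lemma, using the four million ADA price impact from the example above; the token market cap and the decision rule are illustrative assumptions.

```python
def token_valuation(aggregated_price_impact: float, market_cap: float) -> str:
    """Compare the value of the information the token moves with the token's market cap."""
    if aggregated_price_impact > market_cap:
        return "undervalued: traders can profitably buy until impact and market cap equalize"
    if aggregated_price_impact < market_cap:
        return "overvalued by this metric"
    return "fairly priced by this metric"

# One forecast moves asset lambda by 4 million ADA; suppose (hypothetically) the
# token's market capitalization is 3 million ADA.
print(token_valuation(aggregated_price_impact=4_000_000, market_cap=3_000_000))
```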

Ultimately, the price of a token depends on supply and demand in an open marketplace; consequently, the metric described above is only one perspective on how one could evaluate the Xerberus token based on the relative utility it provides.

7.5 Token Distribution


The exact token distribution, as well as the parsing, token reward schemes, etc., will be published separately and attached to future versions of this whitepaper.

8 Conclusion
Xerberus believes in pushing the boundaries of what we collectively deem possible. In this paper, we have laid the groundwork for our methods of providing a risk assessment model with openly verifiable proofs via mathematics and a way to aggregate additional crowd-sourced knowledge via research markets. In addition, we have presented our token utility thesis and shown that the token is the central value creation mechanism in a token-based economy.

Furthermore, we elaborated on our perspective on market efficiency and demonstrated how our
research market could reduce market inefficiencies and uncertainty.

Xerberus takes the data gathered from the blockchain and the prediction market and projects it into a higher dimension where we can calculate the relations and equivalences of each data point to all other data points. Once we have this higher-dimensional point cloud, we create a topological map parameterized by the underlying point cloud. When looking at the resulting topology of the data set, we can understand unique connections that would otherwise be omitted. Mapping data in this way gives a functorial method for reaching conclusions, allowing us not only to look at risk as a result but also to grade and quantify every aspect that contributes to the risk in an ever-increasing field of relevance: we can calculate more than just a risk score; we can show which specific aspects contribute to risk per project.

Risk for a Decentralized Exchange differs from risk for a RealFi project, and our method lays a foundation for quantifying those differences. Understanding the substance and the interconnectedness of risk enables us to provide profound insights to investors. This insight can protect and guide the further development of our ecosystem. We are confident that with the expansion of our ecosystem, more assets will benefit from our risk understanding. Thereby Xerberus will help to create a more accountable world through transparency.

Appendix A
Topological Data Analysis
An important part of TDA, and the core of our risk model, is the notion of persistent homology. Persistent homology is a method of mapping and calculating topological spaces and shapes; the longer a feature persists, the more likely it is to be a true feature of the underlying data. To calculate persistent homology we first take the data and build a simplicial complex from it, where the data points in $\mathbb{R}^n$ are considered vertices; the creation of higher-dimensional complexes must satisfy:

1. Every face of a simplex from $\kappa$ is also in $\kappa$

2. The non-empty intersection of any two simplices $\sigma_1, \sigma_2 \in \kappa$ is a face of both $\sigma_1$ and $\sigma_2$

A pure or homogeneous simplicial k-complex $\kappa$ is a simplicial complex where every simplex of dimension less than k is a face of some simplex $\sigma \in \kappa$ of dimension exactly k. Informally, a pure 1-complex "looks" like it is made of a bunch of lines, a 2-complex "looks" like it is made of a bunch of triangles, etc. An example of a non-homogeneous complex is a triangle with a line segment attached to one of its vertices. Pure simplicial complexes can be thought of as triangulations and provide a definition of polytopes. [13] More simply put: a distance function on the underlying space corresponds to a filtration of the simplicial complex, that is, a nested sequence of increasing subsets.

Simplicial homology gives a toolbox of invariant shape description and characterization:

Given a simplicial complex $\kappa$ it is possible to define the chain complex associated with $\kappa$, denoted $C_*(\kappa) := (C_n(\kappa), \partial_n)_{n \in \mathbb{Z}}$, where $C_n(\kappa)$ is the free Abelian group generated by the n-simplices of $\kappa$ and $\partial_n : C_n(\kappa) \to C_{n-1}(\kappa)$ is a homomorphism called the boundary map, which encodes the boundary relations between the n-simplices and the (n-1)-simplices of $\kappa$, such that $\partial^2 = 0$. We denote by $Z_n(\kappa) := \ker(\partial_n)$ the group of n-cycles of $\kappa$ and by $B_n(\kappa) := \operatorname{im}(\partial_{n+1})$ the group of n-boundaries of $\kappa$. Then we define the n-th homology group of $\kappa$ as:

$$H_n(\kappa) := H_n(C_*(\kappa)) = \frac{Z_n(\kappa)}{B_n(\kappa)}$$

For intuition's sake, homology groups show whether, and how many, "holes" a shape has. For each degree n, the n-th Betti number $\beta_n$ is defined as the rank of $H_n(\kappa)$, and it counts the number of independent n-cycles which do not represent the boundary of any collection of simplices of $\kappa$. In dimension 0, $\beta_0$ coincides with the number of connected components of the complex; in dimension 1, its tunnels and holes; in dimension 2, the shells surrounding voids or cavities; and so on. Persistent homology allows us to calculate the homology of a data set in a multi-scale fashion.
Shape Risk Score
Largely, when measuring the variance in shapes, the Wasserstein distance is used [11]. The p-th Wasserstein distance is defined between two probability measures $\mu$ and $\nu$ in $\mathcal{P}_p(M)$, where $\mathcal{P}_p(M)$ denotes the collection of all probability measures $\mu$ on the metric space $M$ for which there exists some $x_0$ in $M$ such that:

$$\int_M d(x, x_0)^p \, d\mu(x) < \infty$$

The Wasserstein distance is then:

$$W_p(\mu, \nu) := \inf_{\varphi \in \Gamma(\mu, \nu)} \left( \int_{M \times M} d(x, y)^p \, d\varphi(x, y) \right)^{1/p}$$

where $\Gamma(\mu, \nu)$ is the collection of all couplings, i.e. measures on $M \times M$ with marginals $\mu$ and $\nu$. In essence, the Wasserstein distance measures the "work" required to transform one shape into another.

Say we take the Wasserstein distance between all metric sets defined by project data; we could then define a distribution set $B$ with samples

$$B_i = \sum_{j,k} W(j, k)$$

The mean, median, and mode can then be calculated with respect to which shapes (projects) are most common and therefore more predominant and, by definition, less risky.
Persistence Diagrams / Barcodes
To be completed in a future version of this paper
Risk Manifold
An n-dimensional topological manifold RM is a topological Hausdorff space with a countable base which is locally homeomorphic to $\mathbb{R}^n$; that is, for every point $p$ in RM there is an open neighbourhood $U$ of $p$ and a homeomorphism $\varphi : U \to V$ which maps the set $U$ onto an open set $V \subset \mathbb{R}^n$. [12]
Trajectory Risk Score
To be completed in a future version of this paper

Appendix B
Risk is inherently subjective; what one person defines as financially risky, another may reason the opposite. Xerberus seeks to provide a consistent set of metrics that any interested investor can take and adjust to their preference in order to define risk to their own taste. In this paper we define a risk score in the sense of difference from the mean; in reality, risk may present itself in many other ways. To enable this, we do not use machine learning or AI at a computational level.

References
[1] Tran and Leirvik, "Efficiency in the Markets of Crypto-Currencies." Finance Research Letters, November 2019.
https://www.researchgate.net/publication/337624087_Efficiency_in_the_Markets_of_Crypto-Currencies

[2] Slovik, "Market uncertainty and market instability."
https://www.bis.org/ifc/publ/ifcb34ad.pdf

[3] Cowgill and Zitzewitz, "Corporate Prediction Markets: Evidence from Google, Ford, and Firm X." Review of Economic Studies, April 2015.
http://www.houdekpetr.cz/!data/papers/Cowgill%20et%20al%202015.pdf

[4] Berg, Nelson, and Rietz, "Accuracy and Forecast Standard Error of Prediction Markets." Departments of Accounting, Economics and Finance, University of Iowa, July 2003.
https://iemweb.biz.uiowa.edu/archive/forecasting.pdf

[5] Christiansen, J. D. (2007). "Prediction markets: practical experiments in small markets and behaviours observed." Journal of Prediction Markets, 1(1), 17–41.
http://refhub.elsevier.com/S0169-2070(18)30087-6/sb14

[6] McHugh, P., & Jackson, A. (2012). "Prediction market accuracy: the impact of size, incentives, context and interpretation." Journal of Prediction Markets, 6(2), 22–46.
http://www.ubplj.org/index.php/jpm/article/view/500

[7] Hanson, Oprea, and Porter (2005), "Information aggregation and manipulation in an experimental market." Journal of Economic Behavior & Organization, 60(4), August 2006, 449–459.
https://www.sciencedirect.com/science/article/abs/pii/S0167268105001575

[8] Wasserman (2017), "Topological Data Analysis." Annual Review of Statistics and Its Application. Department of Statistics, Carnegie Mellon University.

[9] Carlsson (2009), "Topology and Data." Bulletin (New Series) of the American Mathematical Society, 46(2), April 2009, 255–308.
https://www.ams.org/journals/bull/2009-46-02/S0273-0979-09-01249-X/S0273-0979-09-01249-X.pdf

[10] Orcfax. https://www.orcfax.link/orcfax-july-technical-update/

[11] Oh, Pouryahya, Iyer, Apte, Tannenbaum, and Deasy, "Kernel Wasserstein Distance." Department of Medical Physics, Memorial Sloan Kettering Cancer Center, and Departments of Applied Mathematics and Computer Science, Stony Brook University.
https://arxiv.org/pdf/1905.09314.pdf

[12] Keng, "Manifolds: A Gentle Introduction." Bounded Rationality.
https://bjlkeng.github.io/posts/manifolds/

[13] "Simplicial complex." Wikipedia. https://en.wikipedia.org/wiki/Simplicial_complex
