Final Report
Project Report
Author: Group 05
Tran Ngoc Khanh - 20200326
Nguyen The Minh Duc - 20204904
Nguyen Ngoc Toan - 20200544
Pham Thanh Nam - 20204921
Nguyen Hoang Tien - 20204927
Advisor: Assoc. Prof. Nguyen Linh Giang
Academic year: 2022
Abstract
The year 2008 marked the birth of a completely new concept in the financial world: cryptocurrency. A cryptocurrency is a new type of digital currency in which all transactions are verified and maintained by a decentralized system rather than by a centralized authority, as in the current financial system. This development goes hand in hand with the blockchain concept and has huge growth potential in the future financial system. For this reason, the cryptocurrency market was born, where blockchain projects can find investment, users can place their trust in a decentralized digital currency to perform transactions, and investors can put in their money to make a profit. Price prediction systems, which have been studied for a long time, can be applied by investors and investment funds in this new market. Although this is more difficult in a market that is highly volatile and erratic, cryptocurrency price prediction is still a very hot and interesting topic for researchers. Our target is to apply some statistical models, machine learning models and deep learning models to the price movement prediction problem. Our study compares all these models and selects the best one for the final training strategy and evaluation. The best model turned out to be LightGBM; with this model, we obtained an average accuracy of 55.262% and an average AUC score of 0.56747 on the test set. The corresponding values on the validation sets are 54.658% and 0.55127.
Acknowledgment
We would like to express our sincere gratitude to Assoc. Prof. Nguyen Linh Giang for giving us the opportunity to work on this wonderful project on Bitcoin price prediction. Through our brief meetings with him, his valuable words and advice motivated us greatly. We are also extremely grateful to Assoc. Prof. Than Quang Khoat, who taught us Machine Learning in semester 20212; the knowledge he shared is indispensable to completing this project. Preparing this project in collaboration with our teachers was a refreshing experience.
Contents
1 Introduction 4
2 Methodology 5
3 Data Preparation 5
  3.1 Data collection 5
  3.2 Exploratory data analysis 5
    3.2.1 Label analysis (Close price) 5
    3.2.2 Feature analysis 8
  3.3 Feature engineering 8
  3.4 Feature scaling and normalization 8
    3.4.1 Window normalization 8
    3.4.2 Logarithmic scale 8
4 Models 9
  4.1 Theoretical background 9
    4.1.1 Statistical models 9
    4.1.2 Machine Learning models 11
    4.1.3 Ensemble methods 13
    4.1.4 Deep learning models 17
  4.2 Practical Results 20
  4.3 Evaluation and Error Analysis 20
  4.4 Final Pipeline 22
5 Conclusion 23
A RAW DATA 26
List of Figures
1 Global crypto market cap [7] 4
2 Top 10 Crypto Fund Managers by Managed AUM (Swfinstitute) [8] 4
3 Project pipeline 5
4 Bitcoin Close Price 6
5 Bitcoin close price 1st difference 6
6 Close price trend seasonal component 7
7 Close price residual component 7
8 First 100 close price seasonal component datapoints 7
9 Residual histogram 7
10 Residual divided by trend histogram - 1 hour 7
11 Residual divided by trend histogram - 1 day 8
12 Features correlation matrix 8
13 Close price before difference 9
14 The close price after first difference 9
15 ACF and PACF plot of each difference of Close Price 10
16 Returned series 10
17 Summary of ARIMA model 10
18 The random 300 data points of training set results 10
19 GARCH model summary of volatility 11
20 Confidence interval on whole test set 11
21 300 random continuous data points with confidence interval 11
22 Odds and logit function 12
23 The distance between the expectations and the sum of the variances affect the discriminant degree of the data [13] 12
24 Error with training rounds [17] 14
25 Amount of say 15
26 Error with training rounds [19] 15
27 The repeating module in a standard RNN contains a single layer [24] 17
28 The repeating module in an LSTM contains four interacting layers [24] 17
29 LSTM notations [24] 17
30 LSTM cell state line [24] 18
31 LSTM gate [24] 18
32 LSTM forget gate [24] 18
33 LSTM input gate and new cell state [24] 18
34 Update cell state [24] 18
35 LSTM output gate [24] 18
36 GAN architecture [26] 19
37 Generator architecture [26] 19
38 Features importance barplot 20
39 AUC-Iterations plot 21
40 Train set score distribution 21
41 Validation set score distribution 21
42 Test set score distribution 21
43 Validation set calibration curve 22
44 Test set calibration curve 22
45 Test score histogram 23
46 Test set calibration curve 23
47 Validation score histogram 23
48 Validation set calibration curve 23
49 Raw data 26
50 Features and indicators - 1 27
51 Features and indicators - 2 28
causing its price to halve and fluctuate between $3,000 and $4,000 until April 2019. From April 4 to the end of June of that year, the price of BTC increased sharply, roughly tripling, and then declined back to the old price 9 months later. BTC then moved sideways through October 2020 in the $8,000 to $12,000 price range. The first 10 months of that year were also the time when big investors and whales gathered to prepare for a big BTC wave in 2021. One year later, BTC peaked at nearly $70,000, and it has since corrected to a third of its peak price, with a current value of just $21,000.

Figure 5: Bitcoin close price 1st difference

3.2.1.2. Decomposition of the close price time series data

*Theoretical background
First of all, we need to understand the 3 patterns that make up a time series, which are "trend", "seasonal" and "residual".

Trend (T_t): A trend pattern shows the tendency of the data to increase or decrease during a period of time. A time series can go up or down instantly, but the trend must be upward, downward or stable over a period of time.

Seasonal (S_t): A seasonal pattern represents the seasonal factors that affect the time series data. The seasonality is fixed and has a known frequency.

Residual (R_t): A residual pattern is what remains when we extract the trend and seasonal components from the original time series data. This component may be named the remainder by other researchers, but in this report we use "residual", following the library convention.

Additive decomposition: y_t = T_t + S_t + R_t

Multiplicative decomposition: y_t = T_t \cdot S_t \cdot R_t

*Analysis
We decompose the close price data into 3 components using the statsmodels library [12]:
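As an illustration (not the exact code used in this project), such a decomposition can be obtained with statsmodels.tsa.seasonal.seasonal_decompose [12]. In the sketch below, the synthetic hourly series and the 24-hour period are assumptions made only for the example:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Stand-in hourly series; in the project this would be the collected BTC close price.
index = pd.date_range("2021-01-01", periods=24 * 90, freq="H")
close = pd.Series(
    30000 + np.linspace(0, 5000, len(index))                 # trend
    + 200 * np.sin(2 * np.pi * np.arange(len(index)) / 24)   # daily seasonality
    + np.random.default_rng(0).normal(0, 100, len(index)),   # noise (residual)
    index=index,
)

# Additive decomposition y_t = T_t + S_t + R_t with an assumed period of 24 hours
result = seasonal_decompose(close, model="additive", period=24)
trend, seasonal, residual = result.trend, result.seasonal, result.resid
result.plot()
```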
Odds is a function that varies between 0 and ∞. The good situation occurs when odds > 1: bad odds lie between 0 and 1, while good odds lie between 1 and ∞, so the two ranges are not symmetric. This asymmetry makes it difficult to compare the two situations, which is why we need a logarithmic scale.

The logarithmic scale of the original odds function is the logit function. This function makes the odds assessment symmetrical and fair. So we have:

logit(p) = logit(P(y = k|x)) = \log(odds(p)) = \log \frac{P(y = k|x)}{1 - P(y = k|x)}

where p = P(y = k|x) is the beneficial probability (when y = k) given x.

4.1.2.1. Logistic regression
Suppose that this logit function can be formed by a linear relation with our data x. So:

logit(p) = \log \frac{P(y = k|x)}{1 - P(y = k|x)} = A_0 + A^T x

The decision boundary is a hyperplane, the set of points x for which the log-odds are zero, defined by A_0 + A^T x = 0. There are 2 popular methods that result in linear logits: linear logistic regression, which is discussed in this section, and linear discriminant analysis, which is discussed in the next section.

Now we have:

P(y = k|x) = \frac{e^{A_0 + A^T x}}{1 + e^{A_0 + A^T x}}

which can be seen as a sigmoid function of x. In order to find A_0 and A, we minimize a loss function; the popular loss function for the classification problem is binary cross-entropy.

4.1.2.2. Linear Discriminant Analysis (LDA)
The approach of LDA is different from logistic regression: instead of assuming linear logits (i.e., that logit(P(y = k|x)) can be formed by a linear relation A_0 + A^T x), we assume that p = P(y = k|x) follows a normal distribution.

Figure 23: The distance between the expectations and the sum of the variances affect the discriminant degree of the data [13]

The problem of LDA is to find a projection onto an arbitrary axis through which we can determine the separation of the two classes. This method can be considered a dimension reduction method, where we reduce the original dimension to a lower one in which we can easily determine the separation mentioned above.

Assume that there are N data points x_1, x_2, ..., x_N which are divided into 2 classes, C_1 and C_2. The data projection onto a straight line can be described by a coefficient vector w, and the corresponding value of each new data point is given by y_i = w^T x_i, 1 ≤ i ≤ N.

The expected vector of each class is:

m_k = \frac{1}{N_k} \sum_{n \in C_k} x_n, \quad k = 1, 2

According to Figure 23, the best solution is when m_1 and m_2 are farthest apart and s_1, s_2 are as small as possible. That is the reason LDA tries to solve:

\max J(w) = \frac{(m_1 - m_2)^2}{s_1^2 + s_2^2}

where s_k^2 denotes the variance (spread) of the projected data of class C_k around its projected mean.
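For illustration only, both linear methods can be fitted with scikit-learn as sketched below. The stand-in random data, feature dimensions and parameter values are assumptions for the example, not the project's actual features or configuration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

# Stand-in data; in the project these would be the window-normalized features
# and the binary up/down labels of the train and validation windows.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(500, 20)), rng.integers(0, 2, 500)
X_val, y_val = rng.normal(size=(200, 20)), rng.integers(0, 2, 200)

logreg = LogisticRegression(max_iter=1000)   # fits A_0, A by minimizing binary cross-entropy
lda = LinearDiscriminantAnalysis()           # assumes Gaussian class-conditional densities

for name, model in [("Logistic regression", logreg), ("LDA", lda)]:
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_val)[:, 1]
    print(name, "validation AUC:", roc_auc_score(y_val, scores))
```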
Decision trees are a popular method for various machine learning tasks. Tree learning "come[s] closest to meeting the requirements for serving as an off-the-shelf procedure for data mining", say Hastie et al., "because it is invariant under scaling and various other transformations of feature values, is robust to inclusion of irrelevant features, and produces inspectable models. However, they are seldom accurate".

In particular, trees that are grown very deep tend to learn highly irregular patterns: they overfit their training sets. Random forests are a way of averaging multiple deep decision trees, trained on different parts of the same training set, with the goal of reducing the variance. This comes at the expense of a small increase in the bias and some loss of interpretability, but generally greatly boosts the performance of the final model.

Forests are like the pulling together of decision tree algorithm efforts: the teamwork of many trees improves on the performance of a single random tree. Though not quite the same, forests give an effect similar to k-fold cross-validation.

Bagging
The training algorithm for random forests applies the general technique of bootstrap aggregating, or bagging, to tree learners. Given a training set X = x_1, x_2, ..., x_n with responses Y = y_1, y_2, ..., y_n, bagging repeatedly (B times) selects a random sample with replacement of the training set and fits a tree to each sample; predictions for an unseen sample x' are then made by averaging the predictions of the individual regression trees, or by taking the majority vote in the case of classification trees.

This bootstrapping procedure leads to better model performance because it decreases the variance of the model without increasing the bias. This means that while the predictions of a single tree are highly sensitive to noise in its training set, the average of many trees is not, as long as the trees are not correlated. Simply training many trees on a single training set would give strongly correlated trees (or even the same tree many times, if the training algorithm is deterministic); bootstrap sampling is a way of de-correlating the trees by showing them different training sets.

Additionally, an estimate of the uncertainty of the prediction can be made as the standard deviation of the predictions from all the individual regression trees on x':

\sigma = \sqrt{\frac{\sum_{b=1}^{B} \left(\hat{f}_b(x') - \hat{f}\right)^2}{B - 1}}

The number of samples/trees, B, is a free parameter. Typically, a few hundred to several thousand trees are used, depending on the size and nature of the training set. An optimal number of trees B can be found using cross-validation, or by observing the out-of-bag error: the mean prediction error on each training sample x_i, using only the trees that did not have x_i in their
bootstrap sample. The training and test error tend to level off after some number of trees have been fit.

From bagging to random forests
The above procedure describes the original bagging algorithm for trees. Random forests also include another type of bagging scheme: they use a modified tree learning algorithm that selects, at each candidate split in the learning process, a random subset of the features. This process is sometimes called "feature bagging". The reason for doing this is the correlation of the trees in an ordinary bootstrap sample: if one or a few features are very strong predictors for the response variable (target output), these features will be selected in many of the B trees, causing them to become correlated. An analysis of how bagging and random subspace projection contribute to accuracy gains under different conditions is given by Ho.

As a result, Random Forest tends to produce a model with low bias and low variance, that is, a model with good predictive results.

The idea of aggregating decision trees in the Random Forest algorithm is similar to the idea of The Wisdom of Crowds proposed by James Surowiecki in 2004. The Wisdom of Crowds says that aggregating information from a group is usually better than relying on an individual. The Random Forest algorithm likewise synthesizes information from a group of decision trees, and the results are better than those of the Decision Tree algorithm alone.
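A minimal sketch of bagging with "feature bagging" and the out-of-bag error is shown below, using scikit-learn's RandomForestClassifier on stand-in data; the parameter values are illustrative assumptions, not the settings used in this project:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in data; in the project these would be the engineered features and up/down labels.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 30)), rng.integers(0, 2, 1000)

forest = RandomForestClassifier(
    n_estimators=500,      # B: typically a few hundred to several thousand trees
    max_features="sqrt",   # "feature bagging": random subset of features at each split
    oob_score=True,        # estimate generalization error from out-of-bag samples
    random_state=0,
)
forest.fit(X, y)           # each tree is fit on a bootstrap sample of (X, y)
print("Out-of-bag accuracy:", forest.oob_score_)
```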
Figure 25: Amount of say

Note that the total error will always be between 0 and 1: 0 indicates a perfect stump and 1 indicates a horrible stump. From Figure 25, it can be seen that if the error rate is 0.5 (the classifier predicts half right and half wrong), the "amount of say" will be 0. If the error rate is small, alpha will be positive. If the error rate is large, alpha will be negative.

New sample weight = Old weight × e^{-α}

The new sample weights will be normalized, and a new training round will be performed.

In conclusion, the individual learners can be weak, but as long as the performance of each one is slightly better than random guessing, the final model can be proven to converge to a strong learner.

AdaBoost often does not overfit in practice
When using AdaBoost, we observe it behaving like this:

4.1.3.3. LightGBM
First, we need to understand how the Gradient Boosting Decision Tree works.

For a given data set with n examples and m features

D = \{(x_i, y_i)\} \quad (|D| = n, \; x_i \in \mathbb{R}^m, \; y_i \in \mathbb{R})

starting with the predicted value obtained by the tree using K additive functions, we have:

\hat{y}_i = \phi(x_i) = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F}

where \mathcal{F} is the space of regression trees, each f_k corresponds to an independent tree structure q and leaf weights w, and \hat{y}_i is the predicted value.

To learn the set of functions used in the model, we minimize the following regularized objective:

L(\phi) = \sum_i l(\hat{y}_i, y_i) + \sum_k \Omega(f_k), \qquad \text{where } \Omega(f) = \gamma T + \frac{1}{2} \lambda \|w\|^2

The first term is the training loss and the second term, the regularization term, represents the complexity of all the trees. Here, T is the number of leaves in a tree and w is the leaf weight vector of each tree; γ and λ are the 2 pre-defined regularization parameters. Our objective is to minimize the loss function above.
\tilde{L}^{(t)}(q) = -\frac{1}{2} \sum_{j=1}^{T} \frac{\left(\sum_{i \in I_j} g_i\right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma T

The formula above is used to measure the quality of the tree structure q. And for evaluating the split candidates, we have the following formula:

L_{split} = \frac{1}{2} \left[ \frac{\left(\sum_{i \in I_L} g_i\right)^2}{\sum_{i \in I_L} h_i + \lambda} + \frac{\left(\sum_{i \in I_R} g_i\right)^2}{\sum_{i \in I_R} h_i + \lambda} - \frac{\left(\sum_{i \in I} g_i\right)^2}{\sum_{i \in I} h_i + \lambda} \right] - \gamma

where I_L, I_R and I are the instance sets of the left leaf, the right leaf and the leaf before splitting, respectively.

That is the basics of the Gradient Boosting Decision Tree. Now, we will dive deeper into LightGBM. LightGBM is a gradient boosting framework based on decision trees, designed to increase the efficiency of the model and reduce memory usage. It uses two novel techniques, Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), which address the limitations of the histogram-based algorithm that is primarily used in all Gradient Boosting Decision Tree frameworks.
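For illustration, a minimal LightGBM training sketch on stand-in data is shown below; the parameter values are assumptions made for the example and not the tuned configuration used to produce the results in Section 4.2:

```python
import numpy as np
import lightgbm as lgb
from sklearn.metrics import roc_auc_score

# Stand-in data; in the project these would be the engineered features
# and binary up/down labels for the train and validation windows.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(2000, 50)), rng.integers(0, 2, 2000)
X_val, y_val = rng.normal(size=(500, 50)), rng.integers(0, 2, 500)

train_set = lgb.Dataset(X_train, label=y_train)
val_set = lgb.Dataset(X_val, label=y_val, reference=train_set)

params = {
    "objective": "binary",
    "metric": "auc",
    "learning_rate": 0.05,   # illustrative values only
    "num_leaves": 31,
    "lambda_l2": 1.0,        # the λ regularization term in the objective above
}
booster = lgb.train(params, train_set, num_boost_round=500, valid_sets=[val_set])
print("Validation AUC:", roc_auc_score(y_val, booster.predict(X_val)))
```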
remember the information: it can remove or add information to this cell state at each unit, carefully regulated by structures called gates.

The next step is to decide what new information we're going to store in the cell state. This has two parts. First, a sigmoid layer called the "input gate layer" decides which values we'll update. Next, a tanh layer creates a vector of new candidate values, C̃_t, that could be added to the state. The LSTM then combines these two to create an update to the state.
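To make the gate mechanics concrete, the following is a small NumPy sketch of a single LSTM step following the gate equations illustrated in [24]; the shapes, parameter layout and variable names are assumptions made for the example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # One LSTM time step; gate parameters are stacked as [forget, input, candidate, output].
    z = W @ x_t + U @ h_prev + b
    f, i, g, o = np.split(z, 4)
    f = sigmoid(f)                 # forget gate: what to discard from C_{t-1}
    i = sigmoid(i)                 # input gate: which values to update
    g = np.tanh(g)                 # candidate values C~_t
    o = sigmoid(o)                 # output gate
    c_t = f * c_prev + i * g       # updated cell state
    h_t = o * np.tanh(c_t)         # new hidden state / output
    return h_t, c_t

# Toy dimensions: 3 input features, hidden size 4
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(16, 3)), rng.normal(size=(16, 4)), np.zeros(16)
h, c = np.zeros(4), np.zeros(4)
h, c = lstm_step(rng.normal(size=3), h, c, W, U, b)
print(h)
```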
As we can see, the performance of the statistical models is the lowest. This is relatively easy to understand: in this method we only use the close price for forecasting, so the model uses much less information than the others, and low performance is inevitable.

Among the linear methods, the one with the best performance is LDA. We did not have time to analyze the exact reason, but our assumption is that the window normalization pushes the feature distributions towards normal distributions, and most of the dimensions do not carry much classification capability, so the data satisfy LDA's assumptions better than those of Logistic Regression or Support Vector Machine.

and time-consuming. Therefore, we will choose LightGBM as a representative model for analysis in the following sections.

4.3. Evaluation and Error Analysis
In Section 4.2, we can see that the LightGBM model gives the best results. Therefore, in this section, all the performance evaluation and error analysis will be based on the output of the LightGBM model. We will first look at the important features of this model:
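For reference, the importance-gain ranking of a trained LightGBM booster can be read as sketched below; this reuses the hypothetical booster variable from the earlier LightGBM sketch, so the names are assumptions rather than the project's actual code:

```python
import pandas as pd

# "booster" is the LightGBM model trained in the earlier sketch (an assumption).
importance = pd.Series(
    booster.feature_importance(importance_type="gain"),
    index=booster.feature_name(),
).sort_values(ascending=False)
print(importance.head(20))   # the top features by importance gain
```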
as log_volume_resid_divide_prev. Finally, it
can be seen that all the features can be grouped
into the following: Price, Volume, Exchange,
Coinbase Exchange.
strategy for validation set, the curve on test set and validation set are bad).

Figure 43: Validation set calibration curve

4.4. Final Pipeline
After doing feature engineering and testing on various models, we choose LightGBM, which has the highest performance, as our best model. After that, we carry out the feature selection process. Looking at the trained LightGBM model, we can see the features with their importance gain. For faster training with a cumulative assessment, we only choose the top important features that contribute 99.7% of the total importance gain of all features. The number 99.7% comes from the probability that a normal random variable falls within 3 standard deviations of the mean. With this feature selection, we keep only 260 features, which means we can remove about 660 features.
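A sketch of this cumulative 99.7% importance-gain selection is shown below; it reuses the hypothetical booster from the earlier sketches and is not the project's exact selection code:

```python
import numpy as np
import pandas as pd

# Rank features by importance gain of the trained booster (an assumption, see earlier sketch).
gain = pd.Series(
    booster.feature_importance(importance_type="gain"),
    index=booster.feature_name(),
).sort_values(ascending=False)

cumulative = gain.cumsum() / gain.sum()
# Smallest set of top features whose cumulative gain reaches 99.7% of the total
k = int(np.searchsorted(cumulative.values, 0.997)) + 1
selected = list(gain.index[:k])
print(f"kept {k} of {len(gain)} features")
```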
Figure 46: Test set calibration curve

Figure 47: Validation score histogram

5. Conclusion
Bitcoin prediction is an extremely difficult topic for any method. As mentioned in the Feature Engineering section, the problem depends on too many factors, and based only on common analytical data types it is difficult to obtain a highly accurate result. In this report, we have applied the methods we learned in the two subjects, Applied Statistics and Machine Learning, to analyze the data and build a Bitcoin price forecasting model. Although we used many different analytical methods and many different models in search of a better result, we only saw performance improve little by little. Due to the difficulty of the problem, we did not expect a high result, so we always tried to come up with the most complete process possible. We see this as an open topic, with potential for further exploration in the future.
In terms of data, we can add many other on-chain indicators as well as calculate other technical indicators to improve the model. In addition, we can add data related to news and investor sentiment - these indicators can also be accurate predictors of the next direction of the price.

In terms of models, we can consider adding CNN, Transformers and Attention, Deep Forest, etc. These robust models can be combined to improve the prediction performance.

References
[1] Satoshi Nakamoto. "Bitcoin: A peer-to-peer electronic cash system". In: Decentralized Business Review (2008), p. 21260.
[2] How Digital Currency Is Ushering in a New Era of Money Technology. Available at https://www.firstcitizens.com/commercial/insights/technology/digital-currency. May 2021.
[3] Guardian Nigeria. The idea and a brief history of cryptocurrencies. Available at https://guardian.ng/technology/tech/the-idea-and-a-brief-history-of-cryptocurrencies. May 2022.
[4] Cryptopedia Staff. The Early Days of Crypto Exchanges. Available at https://www.gemini.com/cryptopedia/crypto-exchanges-early-mt-gox-hack. Mar. 2022.
[5] Wikipedia. Mt. Gox. Available at https://en.wikipedia.org/wiki/Mt._Gox.
[6] Brian McGleenon. How a bitcoin court case in Japan may create crypto millionaires. Available at https://uk.finance.yahoo.com/news/britcoin-millionaires-mt-gox-case-japan-153624083-230116218.html. Oct. 2021.
[7] TradingView. Cryptocurrency Market. Available at https://www.tradingview.com/markets/cryptocurrencies/global-charts/. 2022.
[8] Swfinstitute. Rankings by Total Managed AUM. Available at https://www.swfinstitute.org/fund-manager-rankings/crypto-fund-manager. 2022.
[9] W. Ballmann and T. Seng. Quantitative Investing. Available at https://www.eurekahedge.com/Research/News/1102/Quantitative-Investing. Oct. 2005.
[10] Binance. Binance API Documentation. Available at https://binance-docs.github.io/apidocs/spot/en. Sept. 2019.
[11] CryptoQuant. Bitcoin: Summary, on-chain data analytics. Available at https://cryptoquant.com/asset/btc/summary.
[12] statsmodels.tsa.seasonal.seasonal_decompose. Available at https://www.statsmodels.org/stable/generated/statsmodels.tsa.seasonal.seasonal_decompose.html.
[13] Linear Discriminant Analysis. Available at https://machinelearningcoban.com/2017/06/30/lda. June 2017.
[14] Necati Demir. Ensemble Methods: Elegant Techniques to Produce Improved Machine Learning Results. Available at https://www.toptal.com/machine-learning/ensemble-methods-machine-learning.
[15] Wikipedia. Random Forest. Available at https://en.wikipedia.org/wiki/Random_forest.
[16] Tuan Nguyen. Random Forest algorithm. Available at https://machinelearningcoban.com/tabml_book/ch_model/random_forest.html.
[17] Chetan Prabhu. What Do Random Forests And The Wisdom of Crowds Have In Common? Available at https://www.linkedin.com/pulse/random-forest-wisdom-crowds-chetan-prabhu/. May 2017.
[18] Neelam Tyagi. Understanding the Gini Index and Information Gain in Decision Trees. Available at https://medium.com/analytics-steps/understanding-the-gini-index-and-information-gain-in-decision-trees-ab4720518ba8. Mar. 2020.
A. RAW DATA