R for Deep Learning (I): Build Fully Connected Neural Network from Scratch
February 13, 2016 by Peng Zhao (http://www.parallelr.com/author/patric/)
Background
Deep Neural Networks (DNN) (https://en.wikipedia.org/wiki/Deep_learning) have made great progress in recent years in image recognition, natural language processing, and automated driving. As Picture.1 shows, from 2012 to 2015 DNNs improved the accuracy of ImageNet (http://image-net.org/challenges/LSVRC/2015/) classification from ~80% to ~95%, decisively beating traditional computer vision (CV) methods.
(http://www.parallelr.com/wp-content/uploads/2016/02/ces2016.png)
Picture.1 – From NVIDIA CEO Jensen's talk in CES16 (https://www.youtube.com/playlist?list=PLZHnYvH1qtObAdk3waYMalhA8bdbhyTm2)
In this post, we will focus on fully connected neural networks, which are commonly called DNN in data science. The biggest advantage of DNN is its ability to extract and learn features automatically through a deep-layer architecture, especially for complex and high-dimensional data that feature engineers can't capture easily, as in many Kaggle examples (http://blog.kaggle.com/2014/08/01/learning-from-the-best/). Therefore, DNN is very attractive to data scientists, and there are many successful cases in classification, time series, and recommendation systems, such as Nick's post (http://blog.dominodatalab.com/using-r-h2o-and-domino-for-a-kaggle-competition/) and credit scoring (http://www.r-bloggers.com/using-neural-networks-for-credit-scoring-a-simple-example/) by DNN. In CRAN and the R community, there are several popular and mature DNN packages, including nnet (https://cran.r-project.org/web/packages/nnet/index.html), neuralnet (https://cran.r-project.org/web/packages/neuralnet/), H2O (https://cran.r-project.org/web/packages/h2o/index.html), darch (https://cran.r-project.org/web/packages/darch/index.html), deepnet (https://cran.r-project.org/web/packages/deepnet/index.html) and mxnet (https://github.com/dmlc/mxnet), and I strongly recommend the H2O DNN algorithm and its R interface (http://www.h2o.ai/verticals/algos/deep-learning/).
(http://www.parallelr.com/wp-content/uploads/2016/02/matrureDNNPackages-2.png)
With an existing DNN package, you need only one line of R code to build your DNN model most of the time, and there is an example (http://www.r-bloggers.com/fitting-a-neural-network-in-r-neuralnet-package/) using neuralnet. For the inexperienced user, however, the processing and results may be difficult to understand. Therefore, it is a valuable exercise to implement your own network in order to understand the details from the mechanism and computation points of view.
– Build a customized network with your new ideas
DNN is a rapidly developing area. Lots of novel work and research results are published in top journals and on the Internet every week, and users often need their own neural network configuration to match their problems, such as different activation functions, loss functions, regularization, and connectivity graphs. On the other hand, existing packages definitely lag behind the latest research, and almost all of them are written in C/C++ or Java, so it is not easy to apply the latest changes and your own ideas to those packages.
– Debug and visualize the network and data
As mentioned, existing DNN packages are highly assembled and written in low-level languages, so it is a nightmare to debug the network layer by layer or node by node. It is also not easy to visualize the results in each layer, monitor the data or weight changes during training, or show the patterns discovered by the network.
First, about the network's architecture: it refers to the configuration of the DNN, such as how many layers are in the network, how many neurons are in each layer, and what kind of activation, loss function, and regularization are used.
(http://www.parallelr.com/wp-content/uploads/2016/02/dnn_architecture.png)
Now, we will go through the basic components of a DNN and show how they are implemented in R.
Weights and Bias
The weights are initialized with random numbers from rnorm, and the bias is just a one-dimensional matrix with the same size as the number of neurons, set to zero. Other initialization approaches, such as calibrating the variances with 1/sqrt(n) and sparse initialization, are introduced in the weight initialization (http://cs231n.github.io/neural-networks-2/#init) part of Stanford CS231n.
Pseudo R code:
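(A minimal sketch of this initialization; the layer sizes D and H below are illustrative placeholders, not part of the original pseudo-code.)
# D neurons in layer i feeding H neurons in layer i+1 (illustrative sizes)
D <- 4; H <- 6
# small random weights and an all-zero bias row
weight.i <- 0.01 * matrix(rnorm(D * H), nrow = D, ncol = H)
bias.i   <- matrix(0, nrow = 1, ncol = H)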
Another common implementation approach combines the weights and bias together, so that the dimension of the input becomes N+1, which indicates N input features plus 1 bias, as in the code below:
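(A minimal sketch of the combined representation, again with illustrative sizes; the extra constant-1 column appended to the input is what carries the bias through the matrix product.)
D <- 4; H <- 6
# weights and bias folded into one (D+1) x H matrix; the last row acts as the bias
weight <- 0.01 * matrix(rnorm((D + 1) * H), nrow = D + 1, ncol = H)
# append a constant-1 column to the input so the bias row is picked up automatically
X  <- matrix(rnorm(10 * D), nrow = 10, ncol = D)
X1 <- cbind(X, 1)
hidden <- X1 %*% weight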
Neuron
A neuron is the basic unit of a DNN and is a biologically inspired model of the human neuron. A single neuron performs a weight-and-input multiply-add (FMA), which is the same computation as linear regression in data science, and the FMA result is then passed to an activation function. Commonly used activation functions include sigmoid (https://en.wikipedia.org/wiki/Sigmoid_function), ReLU (https://en.wikipedia.org/wiki/Rectifier_(neural_networks)), Tanh (https://reference.wolfram.com/language/ref/Tanh.html), and Maxout. In this post, I will take the rectified linear unit (ReLU) as the activation function, f(x) = max(0, x). For other types of activation functions, you can refer here (http://cs231n.github.io/neural-networks-1/#actfun).
(http://www.parallelr.com/wp-content/uploads/2016/02/neuron.png)
In R, we can implement a neuron in various ways, such as sum(xi*wi), but a more efficient representation uses matrix multiplication.
R code:
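(A minimal sketch of a single ReLU neuron, first element-wise and then via matrix multiplication; the values of x, w, and b are illustrative.)
# one neuron: weighted sum of inputs plus bias, then ReLU
x <- c(1, 2, 3); w <- c(0.2, -0.1, 0.4); b <- 0.5
max(0, sum(x * w) + b)
# the same neuron written as a matrix multiplication
max(0, x %*% w + b)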
Implementation Tips
In practice, we always update all neurons in a layer with a batch of examples for performance reasons. Thus, the above code will not work directly in that setting.
1) Matrix Multiplication and Addition
As the code below shows, input %*% weights and bias have different dimensions, so they can't be added directly. Two solutions are provided: the first replicates the bias into a full matrix, but this wastes lots of memory for big input data, so the second approach, using sweep, is better.
# dimension: 2x2
input   <- matrix(1:4, nrow = 2, ncol = 2)
# dimension: 2x3
weights <- matrix(1:6, nrow = 2, ncol = 3)
# dimension: 1x3
bias    <- matrix(1:3, nrow = 1, ncol = 3)
# doesn't work since the dimensions don't match
input %*% weights + bias
Error in input %*% weights + bias : non-conformable arrays
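The two solutions might look like the sketch below; the first materializes the bias into a full matrix, while the second uses sweep, the same idiom used in the prediction and training code later in this post.
# solution 1: replicate the bias for every row of the result (costly for big batches)
s1 <- input %*% weights + matrix(bias, nrow = nrow(input), ncol = ncol(bias), byrow = TRUE)
# solution 2: add the bias to every row with sweep (MARGIN = 2 means "by column")
s1 <- sweep(input %*% weights, 2, bias, '+')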
2) Element-wise max value for a matrix
# correct
# put the matrix first: the scalar will be recycled to match the matrix structure
> pmax(s1, 0)
     [,1] [,2] [,3]
[1,]    8   17   26
[2,]   11   24   37
Layer
– Input Layer
The input layer is relatively fixed, with only one layer, and the number of units is equal to the number of features in the input data.
– Hidden Layers
Hidden layers vary a lot and are the core component of a DNN. In general, more hidden layers are needed to capture the desired patterns when the problem is more complex (non-linear).
– Output Layer
The units in the output layer most commonly have no activation function, because they are usually taken to represent the class scores in classification and arbitrary real-valued numbers in regression. For classification, the number of output units matches the number of categories to predict, while for regression there is only one output node.
IRIS is a well-known built-in dataset in stock R for machine learning, so you can take a look at this dataset with summary at the console directly, as below.
R code:
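(The one-line call below is all that is needed; the full console output is omitted here.)
# four numeric features plus the Species factor with three levels
summary(iris)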
From the summary, there are four features and three categories of Species, so we can design a DNN architecture as below.
(http://www.parallelr.com/r-deep-neural-network-from-scratch/iris_network/)
We then keep our DNN model in a list, which can be reused for retraining or prediction, as below. We can actually keep more interesting parameters in the model with great flexibility.
R code:
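(A sketch of how such a list can be assembled for the iris network with 4 inputs, 6 hidden neurons, and 3 outputs; the listing that follows comes from a trained model, so its weights and biases differ from these initial values.)
D <- 4; H <- 6; K <- 3
W1 <- 0.01 * matrix(rnorm(D * H), nrow = D, ncol = H)
b1 <- matrix(0, nrow = 1, ncol = H)
W2 <- 0.01 * matrix(rnorm(H * K), nrow = H, ncol = K)
b2 <- matrix(0, nrow = 1, ncol = K)
model <- list(D = D, H = H, K = K, W1 = W1, b1 = b1, W2 = W2, b2 = b2)
str(model)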
List of 7
$ D : int 4
$ H : num 6
$ K : int 3
$ W1: num [1:4, 1:6] 1.34994 1.11369 -0.57346 -1.12123 -0.00107 ...
$ b1: num [1, 1:6] 1.336621 -0.509689 -0.000277 -0.473194 0 ...
$ W2: num [1:6, 1:3] 1.31464 -0.92211 -0.00574 -0.82909 0.00312 ...
$ b2: num [1, 1:3] 0.581 0.506 -1.088
Prediction
Prediction, also called classification or inference in the machine learning field, is concise compared with training: it just walks through the network layer by layer from input to output by matrix multiplication. In the output layer, no activation function is needed. For classification, the probabilities are calculated by softmax (https://en.wikipedia.org/wiki/Softmax_function), while for regression the output represents the predicted real value. This process is called feed forward or forward propagation.
R code:
# Prediction
predict.dnn <- function(model, data = X.test) {
  # new data, transfer to matrix
  new.data <- data.matrix(data)
  # Feed Forward
  hidden.layer <- sweep(new.data %*% model$W1, 2, model$b1, '+')
  # neurons : Rectified Linear
  hidden.layer <- pmax(hidden.layer, 0)
  score <- sweep(hidden.layer %*% model$W2, 2, model$b2, '+')
  # Loss Function: softmax
  score.exp <- exp(score)
  probs <- sweep(score.exp, 1, rowSums(score.exp), '/')
  # select the label with the highest probability
  labels.predicted <- max.col(probs)
  return(labels.predicted)
}
Training
Training searches for the optimal parameters (weights and bias) under the given network architecture so as to minimize the classification error or residuals. This process has two parts: feed forward and back propagation. Feed forward goes through the network with the input data (as in the prediction part) and then computes the data loss in the output layer with a loss function (https://en.wikipedia.org/wiki/Loss_function) (cost function). "The data loss measures the compatibility between a prediction (e.g. the class scores in classification) and the ground truth label." In our example code, we selected the cross-entropy function to evaluate the data loss; see the details here (http://cs231n.github.io/neural-networks-2/#losses).
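For reference, the cross-entropy data loss of one batch is a single line in R; probs, Y.index, and batchsize are the names used in the training function later in this post:
# average negative log-probability assigned to each example's true class
data.loss <- sum(-log(probs[Y.index])) / batchsize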
After getting the data loss, we need to minimize it by changing the weights and bias. The popular approach is to back-propagate the loss into every layer and neuron by gradient descent (https://en.wikipedia.org/wiki/Gradient_descent) or stochastic gradient descent (https://en.wikipedia.org/wiki/Stochastic_gradient_descent), which requires the derivative of the data loss with respect to each parameter (W1, W2, b1, b2). Back propagation (http://colah.github.io/posts/2015-08-Backprop/) differs for different activation functions; see here (http://mochajl.readthedocs.org/en/latest/user-guide/neuron.html) and here (http://like.silk.to/studymemo/ChainRuleNeuralNetwork.pdf) for the derivative formulas and methods, and Stanford CS231n (http://cs231n.github.io/neural-networks-3/) for more training tips.
In our example, the point-wise derivative for ReLU is:
(http://www.parallelr.com/r-deep-neural-network-from-scratch/class-relu/)
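In plain terms, the derivative is 1 where the input was positive and 0 elsewhere, so during back propagation the gradient flowing into the hidden layer is simply zeroed for inactive neurons, as in this sketch (dhidden and hidden.layer are the names used in the training code below):
# ReLU derivative as a mask: block the gradient where the neuron did not fire
dhidden[hidden.layer <= 0] <- 0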
R code (the data preparation and gradient steps are sketched in for completeness; the exact original source is linked in the notes at the end of this post):

# Train: build and train a 2-layer neural network
train.dnn <- function(x, y, traindata = data, testdata = NULL,
                      # set hidden layers and neurons
                      # currently, only support 1 hidden layer
                      hidden = c(6),
                      # max iteration steps
                      maxit = 2000,
                      # delta loss
                      abstol = 1e-2,
                      # learning rate
                      lr = 1e-2,
                      # regularization rate
                      reg = 1e-3,
                      # show results every 'display' step
                      display = 100,
                      random.seed = 1)
{
  # to make the case reproducible
  set.seed(random.seed)
  # number of training examples
  N <- nrow(traindata)
  # extract the data and the (integer) labels
  X <- unname(data.matrix(traindata[, x]))
  Y <- traindata[, y]
  if (is.factor(Y)) { Y <- as.integer(Y) }
  # (row, true class) index pairs, used to pick each example's own score
  Y.index <- cbind(1:N, Y)
  # network sizes: D input features, H hidden neurons, K categories
  D <- ncol(X)
  H <- hidden
  K <- length(unique(Y))
  # create and initialize weights and bias
  W1 <- 0.01 * matrix(rnorm(D * H), nrow = D, ncol = H)
  b1 <- matrix(0, nrow = 1, ncol = H)
  W2 <- 0.01 * matrix(rnorm(H * K), nrow = H, ncol = K)
  b2 <- matrix(0, nrow = 1, ncol = K)
  # use all train data to update weights since it's a small dataset
  batchsize <- N
  # updated: March 17. 2016
  # init loss to a very big value
  loss <- 100000
  # iterate until the loss is small enough or maxit is reached
  i <- 0
  while (i < maxit && loss > abstol) {
    # iteration index
    i <- i + 1
    # forward ....
    # 1 indicate row, 2 indicate col
    hidden.layer <- sweep(X %*% W1, 2, b1, '+')
    # neurons : ReLU
    hidden.layer <- pmax(hidden.layer, 0)
    score <- sweep(hidden.layer %*% W2, 2, b2, '+')
    # softmax
    score.exp <- exp(score)
    probs <- sweep(score.exp, 1, rowSums(score.exp), '/')
    # cross-entropy data loss plus L2 regularization
    data.loss <- sum(-log(probs[Y.index])) / batchsize
    reg.loss  <- 0.5 * reg * (sum(W1 * W1) + sum(W2 * W2))
    loss <- data.loss + reg.loss
    if (i %% display == 0) cat("step", i, "loss", loss, "\n")
    # backward ....
    dscores <- probs
    dscores[Y.index] <- dscores[Y.index] - 1
    dscores <- dscores / batchsize
    dW2 <- t(hidden.layer) %*% dscores
    db2 <- colSums(dscores)
    dhidden <- dscores %*% t(W2)
    # derivative of ReLU: gradient passes only where the neuron was active
    dhidden[hidden.layer <= 0] <- 0
    dW1 <- t(X) %*% dhidden
    db1 <- colSums(dhidden)
    # update ....
    dW2 <- dW2 + reg * W2
    dW1 <- dW1 + reg * W1
    W1 <- W1 - lr * dW1
    b1 <- b1 - lr * db1
    W2 <- W2 - lr * dW2
    b2 <- b2 - lr * db2
  }
  # final results
  # create list to store learned parameters
  # you can add more parameters for debug and visualization
  # such as residuals, fitted.values ...
  model <- list(D = D,
                H = H,
                K = K,
                # weights and bias
                W1 = W1,
                b1 = b1,
                W2 = W2,
                b2 = b2)
  return(model)
}
We randomly split the iris data into a training set and a testing set, and then use the training set to train the model while the testing set measures the generalization ability of our model.
R code:

########################################################################
# testing
#######################################################################
set.seed(1)

# 0. EDA
summary(iris)
plot(iris)

# 1. split data into test/train
samp <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))

# 2. train model
ir.model <- train.dnn(x = 1:4, y = 5, traindata = iris[samp, ], testdata = iris[-samp, ],
                      hidden = 6, maxit = 2000)

# 3. prediction
labels.dnn <- predict.dnn(ir.model, iris[-samp, -5])

# accuracy
mean(as.integer(iris[-samp, 5]) == labels.dnn)
# 0.98
The data loss on the training set and the accuracy on the test set are shown below:
(http://www.parallelr.com/r-deep-neural-network-from-scratch/iris_loss_accuracy-2/)
Then we compare our DNN model with the 'nnet' package, as in the code below (the hidden layer size for nnet is set to 6 here to match our DNN).
library(nnet)
ird <- data.frame(rbind(iris3[, , 1], iris3[, , 2], iris3[, , 3]),
                  species = factor(c(rep("s", 50), rep("c", 50), rep("v", 50))))
# size = 6 hidden units, mirroring our DNN's hidden layer
ir.nn2 <- nnet(species ~ ., data = ird, subset = samp, size = 6,
               decay = 1e-2, maxit = 2000)
labels.nnet <- predict(ir.nn2, ird[-samp, ], type = "class")

# accuracy
mean(ird$species[-samp] == labels.nnet)
# 0.96
Update: 4/28/2016
Google's TensorFlow team released a very cool website to visualize neural networks here (http://playground.tensorflow.org/). We have covered almost all of the techniques shown on Google's playground in this blog, so you can build them up with R as well 🙂
Summary
In this post, we have shown how to implement a neural network in R from scratch. The code implements only the core concepts of DNN, and the reader can practice further by:
Solving other classification problems, such as the toy case here (http://www.wildml.com/2015/09/implementing-a-neural-network-from-scratch/)
Selecting various hidden layer sizes, activation functions, and loss functions
Extending the single hidden layer network to multiple hidden layers
Adjusting the network to solve regression problems
Visualizing the network architecture, weights, and bias in R; an example is here (https://beckmw.wordpress.com/tag/nnet/)
In the next post (http://www.parallelr.com/r-dnn-parallel-acceleration/), I will introduce how to accelerate this code using multicore CPUs and NVIDIA GPUs.
Notes:
1. The entire source code of this post is here (https://github.com/PatricZhao/ParallelR/blob/master/ParDNN/iris_dnn.R)
2. The PDF version of this post is here (http://www.parallelr.com/materials/2_DNN/ParallelR_R_DNN_1_BUILD_FROM_SCRATCH.pdf)
3. The pretty R syntax in this blog was created by inside-R.org (http://blog.revolutionanalytics.com/2016/08/farewell-inside-rorg.html) (acquired by MS and now defunct)
55 COMMENTS
In this blog, I intend to keep the code as simple as possible to understand, so only 1 hidden layer is used and we initialized the weights/bias with random numbers. The construction of the DNN model mainly refers to the online course Stanford CS231n (http://cs231n.github.io/neural-networks-3/).
As you mentioned, the initialization methods of Stacked Autoencoders and Stacked RBMs (or other parts of DNN) can be implemented in the example R code, which is relatively flexible compared with mature packages. That is one of the major reasons why we build DNN from scratch rather than using CRAN packages.
The example code of this blog is on GitHub (https://github.com/PatricZhao/ParallelR/tree/master/ParDNN), and I will be very happy if you can improve the code with the features a real DNN has, as you mentioned, so that more readers can learn clearly from the code.
Thanks again for your informative notes.
Regarding which kind of NN structure is better, it really depends on the problem and the input data; finding it is the real state of the art in DNN and is very time-consuming 🙂
Back to the code: the forward and back propagation of a multiple-layer network are very similar to the 1-layer case. The only thing you need is to add a loop. Take 2 hidden layers as an example in the code below. In practice, you should generalize this piece of code by the input parameters, e.g. hidden=c(3,3,3) stands for 3 hidden layers with 3 neurons each:
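A minimal sketch of what the 2-hidden-layer forward pass could look like (W1/W2/W3 and b1/b2/b3 are illustrative names for the three sets of weights and biases):
# forward pass with two ReLU hidden layers
hidden.layer1 <- pmax(sweep(X %*% W1, 2, b1, '+'), 0)
hidden.layer2 <- pmax(sweep(hidden.layer1 %*% W2, 2, b2, '+'), 0)
score         <- sweep(hidden.layer2 %*% W3, 2, b3, '+')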
More info in:
http://stackoverflow.com/questions/9390337/purpose-of-decay-parameter-in-nnet-function-in-r
http://stats.stackexchange.com/questions/29130/difference-between-neural-net-weight-decay-and-learning-rate
Actually, a learning rate (LR) is not needed in nnet, and I am trying to understand the weight updating of nnet. I will reply to you when I figure out more details. Good questions for further learning 🙂
2. traindata is defined inside the function, so you can't execute it line by line. But you can assign real data first, as below:
> samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
> traindata <- iris[samp,]
> nrow(traindata)
[1] 75
BTW, the whole R code is on GitHub (https://github.com/PatricZhao/ParallelR); you can try to run it and then debug it line by line.
To switch from the current classification network to a regression network, the major changes include (a small sketch of the loss change follows this list):
1. The activation function, from ReLU to tanh or sigmoid. Note that it is usually not necessary to set an activation function on the output layer;
2. The loss function, from softmax to mean squared error or absolute error;
3. The back propagation processing.
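For instance, the softmax/cross-entropy head in train.dnn could be swapped for a mean-squared-error head roughly as sketched here, assuming Y holds the real-valued targets; this is only an illustration:
# regression head: no softmax; mean squared error on the raw score
data.loss <- sum((score - Y)^2) / (2 * batchsize)
# gradient of the loss w.r.t. the score, replacing 'dscores' in back propagation
dscores <- (score - Y) / batchsize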
BTW, the regression NN code can be found at https://github.com/PatricZhao/ParallelR/tree/master/ParDNN; you can try to run it and then debug it line by line.
Thanks for your comments again.
2 TRACKBACKS
R for Deep Learning (II): Achieve High-Performance DNN with Parallel Acceleration – ParallelR (http://www.parallelr.com/r-dnn-parallel-acceleration/)
R for Deep Learning (III): CUDA and MultiGPUs Acceleration – ParallelR (http://www.parallelr.com/r-dnn-cuda-multigpu/)