R for Deep Learning (I): Build Fully Connected Neural Network from Scratch
(http://www.parallelr.com/r-deep-neural-network-from-scratch/)
February 13, 2016 by Peng Zhao (http://www.parallelr.com/author/patric/)

Backgrounds

Deep Neural Networks (DNN) (https://en.wikipedia.org/wiki/Deep_learning) have made great progress in recent years in image recognition, natural language processing, and autonomous driving. As Picture.1 shows, from 2012 to 2015 DNNs improved ImageNet (http://image-net.org/challenges/LSVRC/2015/) accuracy from ~80% to ~95%, clearly beating traditional computer vision (CV) methods.

(http://www.parallelr.com/wp-content/uploads/2016/02/ces2016.png)
Picture.1 – From NVIDIA CEO Jensen's talk at CES16
(https://www.youtube.com/playlist?list=PLZHnYvH1qtObAdk3waYMalhA8bdbhyTm2)
In this post, we will focus on fully connected neural networks, which are commonly called DNN in data science. The biggest advantage of a DNN is its ability to extract and learn features automatically through its deep layered architecture, especially for complex and high-dimensional data whose features human feature engineers can't capture easily; see the examples from Kaggle (http://blog.kaggle.com/2014/08/01/learning-from-the-best/). Therefore, DNNs are also very attractive to data scientists, and there are many successful cases in classification, time series, and recommendation systems, such as Nick's post (http://blog.dominodatalab.com/using-r-h2o-and-domino-for-a-kaggle-competition/) and credit scoring (http://www.r-bloggers.com/using-neural-networks-for-credit-scoring-a-simple-example/) by DNN. In CRAN and the R community, there are several popular and mature DNN packages, including nnet (https://cran.r-project.org/web/packages/nnet/index.html), neuralnet (https://cran.r-project.org/web/packages/neuralnet/), H2O (https://cran.r-project.org/web/packages/h2o/index.html), DARCH (https://cran.r-project.org/web/packages/darch/index.html), deepnet (https://cran.r-project.org/web/packages/deepnet/index.html) and mxnet (https://github.com/dmlc/mxnet), and I strongly recommend the H2O DNN algorithm and its R interface (http://www.h2o.ai/verticals/algos/deep-learning/).

(http://www.parallelr.com/wp-content/uploads/2016/02/matrureDNNPackages-2.png)

So, why do we need to build a DNN from scratch at all?

– Understand how a neural network works

Using an existing DNN package, you usually need only one line of R code for your DNN model, as in this example (http://www.r-bloggers.com/fitting-a-neural-network-in-r-neuralnet-package/) with neuralnet. For the inexperienced user, however, the processing and results may be difficult to understand. Therefore, it is a valuable exercise to implement your own network in order to understand more details of the mechanism and the computation.

– Build a specific network with your new ideas

DNN is a rapidly developing area. Lots of novel work and research results are published in the top journals and on the Internet every week, and users often need their own network configuration to fit their problems, such as different activation functions, loss functions, regularization, and connectivity graphs. On the other hand, the existing packages are necessarily behind the latest research, and almost all existing packages are written in C/C++ or Java, so it's not flexible to apply the latest changes and your own ideas to them.

– Debug and visualize the network and data

As we mentioned, the existing DNN packages are highly assembled and written in low-level languages, so it's a nightmare to debug the network layer by layer or node by node. It's also not easy to visualize the results in each layer, monitor how the data or weights change during training, or show the patterns discovered by the network.

Fundamental Concepts and Components

A fully connected neural network (http://lo.epfl.ch/files/content/sites/lo/files/shared/1990/IEEE_78_1637_Oct1990.pdf), called DNN in data science, is a network in which adjacent layers are fully connected to each other: every neuron in one layer is connected to every neuron in the adjacent layers.

A very simple and typical neural network is shown below, with 1 input layer, 2 hidden layers, and 1 output layer. Mostly, when researchers talk about a network's architecture, they refer to the configuration of the DNN: how many layers are in the network, how many neurons are in each layer, and what kind of activation, loss function, and regularization are used.

(http://www.parallelr.com/wp-content/uploads/2016/02/dnn_architecture.png)
Now, we will go through the basic components of a DNN and show you how they are implemented in R.

Weights and Bias

Take the DNN architecture above, for example: there are 3 groups of weights, from the input layer to the first hidden layer, from the first to the second hidden layer, and from the second hidden layer to the output layer. A bias unit links to every hidden node; it affects the output scores but doesn't interact with the actual data. In our R implementation, we represent weights and bias by matrices. The weight size is defined by

(number of neurons in layer M) X (number of neurons in layer M+1)

and weights are initialized with random numbers from rnorm. The bias is just a one-dimensional matrix with the same size as the number of neurons in the next layer, set to zero. Other initialization approaches, such as calibrating the variances with 1/sqrt(n) and sparse initialization, are introduced in the weight initialization (http://cs231n.github.io/neural-networks-2/#init) part of Stanford CS231n.
Pseudo R code:

weight.i <- 0.01 * matrix(rnorm(layer.size(i) * layer.size(i + 1)),
                          nrow = layer.size(i),
                          ncol = layer.size(i + 1))
bias.i   <- matrix(0, nrow = 1, ncol = layer.size(i + 1))
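As a side note, the 1/sqrt(n) calibration mentioned above can be sketched in the same pseudo style; layer.size is the same illustrative helper as above, and this snippet is only my sketch of the CS231n idea, not code used later in this post:

# calibrated initialization (sketch): scale by 1/sqrt(fan-in) so the output
# variance of each neuron does not grow with the number of inputs
weight.i <- matrix(rnorm(layer.size(i) * layer.size(i + 1)) / sqrt(layer.size(i)),
                   nrow = layer.size(i),
                   ncol = layer.size(i + 1))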

Another common implementation approach combines the weights and bias together, so that the dimension of the input is N+1, indicating N input features plus 1 bias, as in the code below:

weight <- 0.01 * matrix(rnorm((layer.size(i) + 1) * layer.size(i + 1)),
                        nrow = layer.size(i) + 1,
                        ncol = layer.size(i + 1))
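For completeness, a minimal sketch (my own, with illustrative names) of how the combined form is used: the input matrix is augmented with a constant column of ones, so the last row of the combined weight matrix plays the role of the bias:

# append a column of ones so the bias row is picked up by the matrix product
X.aug <- cbind(X, 1)        # dimension: N x (features + 1)
score <- X.aug %*% weight   # equivalent to X %*% W + b in the separated form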

Neuron

A neuron is the basic unit in a DNN, a biologically inspired model of the human neuron. A single neuron performs weight and input multiplication and addition (FMA), which is the same as linear regression in data science, and then the FMA result is passed to an activation function. Commonly used activation functions include sigmoid (https://en.wikipedia.org/wiki/Sigmoid_function), ReLU (https://en.wikipedia.org/wiki/Rectifier_(neural_networks)), Tanh (https://reference.wolfram.com/language/ref/Tanh.html) and Maxout. In this post, I will take the rectified linear unit (ReLU) as the activation function, f(x) = max(0, x). For other types of activation functions, you can refer here (http://cs231n.github.io/neural-networks-1/#actfun).

(http://www.parallelr.com/wp-content/uploads/2016/02/neuron.png)

In R, we can implement a neuron in various ways, such as sum(xi*wi). But a more efficient representation is matrix multiplication.
R code:

neuron.ij <- max(0, input %*% weight + bias)
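For comparison, the other activations named above can be dropped into the same place; this is just an illustrative sketch for a single neuron, assuming input, weight and bias as in the line above:

# the same FMA result passed through other common activations
z <- input %*% weight + bias
neuron.sigmoid <- 1 / (1 + exp(-z))   # sigmoid
neuron.tanh    <- tanh(z)             # tanh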

Implementation Tips

In practice, we always update all neurons in a layer with a batch of examples for performance reasons. Thus, the above code will not work correctly.

1) Matrix Multiplication and Addition

As the code below shows, input %*% weights and bias have different dimensions, so they can't be added directly. Two solutions are provided. The first one repeats the bias for every row of the result; however, it wastes lots of memory for big data input. Therefore, the second approach is better.

# dimension: 2x2
input   <- matrix(1:4, nrow = 2, ncol = 2)
# dimension: 2x3
weights <- matrix(1:6, nrow = 2, ncol = 3)
# dimension: 1x3
bias    <- matrix(1:3, nrow = 1, ncol = 3)
# doesn't work since the dimensions are unmatched
input %*% weights + bias
# Error in input %*% weights + bias : non-conformable arrays

# solution 1: repeat bias, aligned to 2x3
s1 <- input %*% weights + matrix(rep(bias, each = nrow(input)), nrow = nrow(input))

# solution 2: sweep addition
s2 <- sweep(input %*% weights, 2, bias, '+')

all.equal(s1, s2)
# [1] TRUE

2) Element-wise max value for a matrix

Another trick here is to replace max with pmax to get element-wise maximum values instead of a global one, and be careful about the argument order in pmax 🙂

# the original matrix
> s1
     [,1] [,2] [,3]
[1,]    8   17   26
[2,]   11   24   37

# max returns the global maximum
> max(0, s1)
[1] 37

# s1 is aligned with the scalar, so the matrix structure is lost
> pmax(0, s1)
[1]  8 11 17 24 26 37

# correct
# put the matrix first; the scalar will be recycled to match the matrix structure
> pmax(s1, 0)
     [,1] [,2] [,3]
[1,]    8   17   26
[2,]   11   24   37

Layer

– Input Layer

The input layer is relatively fixed, with only 1 layer, and the number of units is equal to the number of features in the input data.

– Hidden Layers

Hidden layers vary a lot and are the core component of a DNN. In general, more hidden layers are needed to capture the desired patterns when the problem is more complex (non-linear).

– Output Layer

The units in the output layer most commonly have no activation function, because they are usually taken to represent the class scores in classification and arbitrary real-valued numbers in regression. For classification, the number of output units matches the number of categories to predict, while there is only one output node for regression.
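As a quick illustration of how these sizes are read off a dataset (my own snippet, using the iris data introduced in the next section):

# input layer: one unit per feature; output layer: one unit per class
D <- ncol(iris) - 1           # 4 input features
K <- nlevels(iris$Species)    # 3 output units for classification
# a regression network would instead use a single output unit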

Build Neural Network: Architecture, Prediction, and Training

So far, we have covered the basic concepts of a deep neural network, and now we are going to build one, which includes determining the network architecture, training the network, and then predicting new data with the learned network. To keep things simple, we use a small data set, Edgar Anderson's Iris Data (iris (https://en.wikipedia.org/wiki/Iris_flower_data_set)), to do classification with a DNN.

Network Architecture

IRIS is a well-known built-in dataset in stock R for machine learning, so you can take a look at it with summary at the console directly, as below.
R code:

summary(iris)


Sepal.Length Sepal.Width Petal.Length Petal.Width Species
Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100 setosa :50
1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300 versicolor:50
Median :5.800 Median :3.000 Median :4.350 Median :1.300 virginica :50
Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500

From the summary, there are four features and three categories of Species, so we can design a DNN architecture as below.

(http://www.parallelr.com/r-deep-neural-network-from-scratch/iris_network/)

And then we will keep our DNN model in a list, which can be used for retraining or prediction, as below. Actually, we can keep more interesting parameters in the model with great flexibility.

R output:

List of 7
$ D : int 4
$ H : num 6
$ K : int 3
$ W1: num [1:4, 1:6] 1.34994 1.11369 -0.57346 -1.12123 -0.00107 ...
$ b1: num [1, 1:6] 1.336621 -0.509689 -0.000277 -0.473194 0 ...
$ W2: num [1:6, 1:3] 1.31464 -0.92211 -0.00574 -0.82909 0.00312 ...
$ b2: num [1, 1:3] 0.581 0.506 -1.088
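This listing is the str() output of a trained model; it is shown here for reference (the ir.model object is produced by the train.dnn() call in the testing section below, so this is a forward reference rather than code runnable at this point of the post):

# inspect the learned parameters of a trained model (see the testing section)
str(ir.model)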

Prediction

Prediction, also called classification or inference in the machine learning field, is concise compared with training: it walks through the network layer by layer from input to output by matrix multiplication. In the output layer, the activation function isn't needed. For classification, the probabilities are calculated by softmax (https://en.wikipedia.org/wiki/Softmax_function), while for regression the output represents the predicted real value. This process is called feed forward or forward propagation.

R code:

# Prediction
predict.dnn <- function(model, data = X.test) {
  # new data, transfer to matrix
  new.data <- data.matrix(data)

  # Feed Forward
  hidden.layer <- sweep(new.data %*% model$W1, 2, model$b1, '+')
  # neurons : Rectified Linear
  hidden.layer <- pmax(hidden.layer, 0)
  score <- sweep(hidden.layer %*% model$W2, 2, model$b2, '+')

  # Loss Function: softmax
  score.exp <- exp(score)
  probs <- sweep(score.exp, 1, rowSums(score.exp), '/')

  # select the class with max probability
  labels.predicted <- max.col(probs)
  return(labels.predicted)
}
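A small usage note of my own (not from the original post): predict.dnn() returns the column index of the most probable class, i.e. an integer in 1..K, so for iris the predictions can be mapped back to species names via the factor levels. Assuming ir.model is the model trained in the testing section below:

# hypothetical usage: map integer predictions back to factor levels
pred <- predict.dnn(ir.model, iris[, 1:4])
head(levels(iris$Species)[pred])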

Training

Training is searching for the optimal parameters (weights and bias) under the given network architecture so as to minimize the classification error or residuals. This process includes two parts: feed forward and back propagation. Feed forward goes through the network with the input data (as in the prediction part) and then computes the data loss in the output layer by a loss function (https://en.wikipedia.org/wiki/Loss_function) (cost function). "Data loss measures the compatibility between a prediction (e.g. the class scores in classification) and the ground truth label." In our example code, we selected the cross-entropy function to evaluate the data loss; see the details here (http://cs231n.github.io/neural-networks-2/#losses).
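Written out explicitly (my own transcription of what the training code below computes, not a formula from the original post), the total loss combines the averaged cross-entropy data loss with an L2 regularization term, where N is the batch size, p_{i,y_i} is the predicted probability of the true class of example i, and reg is the regularization rate:

L = -\frac{1}{N}\sum_{i=1}^{N}\log p_{i,\,y_i} \;+\; \frac{\mathrm{reg}}{2}\left(\lVert W_1\rVert_2^2 + \lVert W_2\rVert_2^2\right)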

After getting the data loss, we need to minimize it by changing the weights and bias. The most popular method is to back-propagate the loss into every layer and neuron by gradient descent (https://en.wikipedia.org/wiki/Gradient_descent) or stochastic gradient descent (https://en.wikipedia.org/wiki/Stochastic_gradient_descent), which requires the derivatives of the data loss with respect to each parameter (W1, W2, b1, b2). Back propagation (http://colah.github.io/posts/2015-08-Backprop/) differs for different activation functions; see here (http://mochajl.readthedocs.org/en/latest/user-guide/neuron.html) and here (http://like.silk.to/studymemo/ChainRuleNeuralNetwork.pdf) for their derivative formulas and methods, and Stanford CS231n (http://cs231n.github.io/neural-networks-3/) for more training tips.

In our example, the point-wise derivative for ReLU is 1 where the input is positive and 0 otherwise:

(http://www.parallelr.com/r-deep-neural-network-from-scratch/class-relu/)

R code:
# Train: build and train a 2-layer neural network
train.dnn <- function(x, y, traindata = data, testdata = NULL,
                      # set hidden layers and neurons
                      # currently, only support 1 hidden layer
                      hidden = c(6),
                      # max iteration steps
                      maxit = 2000,
                      # delta loss
                      abstol = 1e-2,
                      # learning rate
                      lr = 1e-2,
                      # regularization rate
                      reg = 1e-3,
                      # show results every 'display' steps
                      display = 100,
                      random.seed = 1)
{
  # to make the case reproducible
  set.seed(random.seed)

  # total number of training examples
  N <- nrow(traindata)

  # extract the data and label
  # don't need attribute
  X <- unname(data.matrix(traindata[, x]))
  Y <- traindata[, y]
  if (is.factor(Y)) { Y <- as.integer(Y) }
  # updated: 10.March.2016: create index for both row and col
  Y.len   <- length(unique(Y))
  Y.set   <- sort(unique(Y))
  Y.index <- cbind(1:N, match(Y, Y.set))

  # number of input features
  D <- ncol(X)
  # number of categories for classification
  K <- length(unique(Y))
  H <- hidden

  # create and init weights and bias
  W1 <- 0.01 * matrix(rnorm(D * H), nrow = D, ncol = H)
  b1 <- matrix(0, nrow = 1, ncol = H)

  W2 <- 0.01 * matrix(rnorm(H * K), nrow = H, ncol = K)
  b2 <- matrix(0, nrow = 1, ncol = K)

  # use all train data to update weights since it's a small dataset
  batchsize <- N
  # updated: March 17. 2016
  # init loss to a very big value
  loss <- 100000

  # Training the network
  i <- 0
  while (i < maxit && loss > abstol) {

    # iteration index
    i <- i + 1

    # forward ....
    # 1 indicates row, 2 indicates col
    hidden.layer <- sweep(X %*% W1, 2, b1, '+')
    # neurons : ReLU
    hidden.layer <- pmax(hidden.layer, 0)
    score <- sweep(hidden.layer %*% W2, 2, b2, '+')

    # softmax
    score.exp <- exp(score)
    probs <- sweep(score.exp, 1, rowSums(score.exp), '/')

    # compute the loss
    corect.logprobs <- -log(probs[Y.index])
    data.loss <- sum(corect.logprobs) / batchsize
    reg.loss  <- 0.5 * reg * (sum(W1 * W1) + sum(W2 * W2))
    loss <- data.loss + reg.loss

    # display results and update model
    if (i %% display == 0) {
      if (!is.null(testdata)) {
        model <- list(D = D,
                      H = H,
                      K = K,
                      # weights and bias
                      W1 = W1,
                      b1 = b1,
                      W2 = W2,
                      b2 = b2)
        labs <- predict.dnn(model, testdata[, -y])
        # updated: 10.March.2016
        accuracy <- mean(as.integer(testdata[, y]) == Y.set[labs])
        cat(i, loss, accuracy, "\n")
      } else {
        cat(i, loss, "\n")
      }
    }

    # backward ....
    dscores <- probs
    dscores[Y.index] <- dscores[Y.index] - 1
    dscores <- dscores / batchsize

    dW2 <- t(hidden.layer) %*% dscores
    db2 <- colSums(dscores)

    dhidden <- dscores %*% t(W2)
    dhidden[hidden.layer <= 0] <- 0

    dW1 <- t(X) %*% dhidden
    db1 <- colSums(dhidden)

    # update ....
    dW2 <- dW2 + reg * W2
    dW1 <- dW1 + reg * W1

    W1 <- W1 - lr * dW1
    b1 <- b1 - lr * db1

    W2 <- W2 - lr * dW2
    b2 <- b2 - lr * db2
  }

  # final results
  # create list to store learned parameters
  # you can add more parameters for debug and visualization
  # such as residuals, fitted.values ...
  model <- list(D = D,
                H = H,
                K = K,
                # weights and bias
                W1 = W1,
                b1 = b1,
                W2 = W2,
                b2 = b2)

  return(model)
}

Testing and Visualization

We have built a simple 2-layer DNN model, and now we can test it. First, the dataset is split into two parts for training and testing, and then the training set is used to train the model while the testing set is used to measure the generalization ability of our model.

R code:

########################################################################
# testing
#######################################################################
set.seed(1)

# 0. EDA
summary(iris)
plot(iris)

# 1. split data into test/train
samp <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))

# 2. train the model
ir.model <- train.dnn(x = 1:4, y = 5, traindata = iris[samp, ], testdata = iris[-samp, ],
                      hidden = 6, maxit = 2000, display = 50)

# 3. prediction
labels.dnn <- predict.dnn(ir.model, iris[-samp, -5])

# 4. verify the results
table(iris[-samp, 5], labels.dnn)
#             labels.dnn
#               1  2  3
# setosa      25  0  0
# versicolor   0 24  1
# virginica    0  0 25

# accuracy
mean(as.integer(iris[-samp, 5]) == labels.dnn)
# 0.98

The data loss on the training set and the accuracy on the test set are shown below:

(http://www.parallelr.com/r-deep-neural-network-from-scratch/iris_loss_accuracy-2/)
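If you want to reproduce such a curve yourself, one simple approach (my own sketch, not part of the original code) is to capture the "iteration loss accuracy" lines that train.dnn() prints with cat() at every display step and turn them into a data frame; samp is assumed to be the train/test split defined in the code above:

# capture the progress lines printed by train.dnn() and plot them
log.lines <- capture.output(
  ir.model <- train.dnn(x = 1:4, y = 5, traindata = iris[samp, ],
                        testdata = iris[-samp, ], hidden = 6,
                        maxit = 2000, display = 50)
)
log.df <- read.table(text = log.lines,
                     col.names = c("iteration", "loss", "accuracy"))
par(mfrow = c(1, 2))
plot(log.df$iteration, log.df$loss,     type = "l", xlab = "iteration", ylab = "data loss")
plot(log.df$iteration, log.df$accuracy, type = "l", xlab = "iteration", ylab = "test accuracy")
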
Then we compare our DNN model with the 'nnet' package, as in the code below.

library(nnet)
ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
                  species = factor(c(rep("s", 50), rep("c", 50), rep("v", 50))))
ir.nn2 <- nnet(species ~ ., data = ird, subset = samp, size = 6, rang = 0.1,
               decay = 1e-2, maxit = 2000)

labels.nnet <- predict(ir.nn2, ird[-samp, ], type = "class")
table(ird$species[-samp], labels.nnet)
#   labels.nnet
#     c  s  v
# c  22  0  3
# s   0 25  0
# v   3  0 22

# accuracy
mean(ird$species[-samp] == labels.nnet)
# 0.96

Update: 4/28/2016:
Google's TensorFlow team released a very cool website to visualize neural networks, here (http://playground.tensorflow.org/). And we have covered almost all of the techniques from Google's website in this blog, so you can build it up with R as well 🙂

(http://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.86675&showTestData=false&discretize=false&percTrainD)

Summary

In this post, we have shown how to implement an R neural network from scratch. But the code only implements the core concepts of a DNN, and the reader can practice further by:

Solving other classification problems, such as a toy case here (http://www.wildml.com/2015/09/implementing-a-neural-network-from-scratch/)
Selecting various hidden layer sizes, activation functions, and loss functions
Extending the single-hidden-layer network to multiple hidden layers
Adjusting the network to solve regression problems
Visualizing the network architecture, weights, and bias with R; an example is here (https://beckmw.wordpress.com/tag/nnet/).

In the next post (http://www.parallelr.com/r-dnn-parallel-acceleration/), I will introduce how to accelerate this code using multicore CPUs and NVIDIA GPUs.

Notes:
1. The entire source code of this post is here (https://github.com/PatricZhao/ParallelR/blob/master/ParDNN/iris_dnn.R)
2. The PDF version of this post is here (http://www.parallelr.com/materials/2_DNN/ParallelR_R_DNN_1_BUILD_FROM_SCRATCH.pdf)
3. Pretty R syntax in this blog was created by inside-R.org (http://blog.revolutionanalytics.com/2016/08/farewell-inside-rorg.html) (acquired by MS and now defunct)


55 COMMENTS

gary says:
February 19, 2016 at 5:15 am

Nice, very helpful.
Looking forward to more blogs.
up up up!

Javid says:
July 13, 2017 at 12:49 pm

Dear Peng,
Great work to learn... Appreciated... I just want to ask: does it work with panel data? If so, how should the data be structured?
Best regards

Richard Paris says:
February 22, 2016 at 9:07 pm

Great work. Also looking forward to more blogs in this super interesting field!

Charles says:
March 9, 2016 at 3:05 pm

Great post!
Small error/typo in the code for the predict function: the function name should be 'predict.dnn', not just 'predict'.
Thanks again.
Peng Zhao says:
March 10, 2016 at 2:36 am

Hi Charles,
Really thanks for your comments.
Good catch. Fixed the 'predict' name to 'predict.dnn'.
Thanks again for reading 🙂
–Peng

VladMinkov says:
May 6, 2016 at 8:52 am

The neural network in this article has nothing to do with a deep neural network. This is a common MLP. Only 2 packages ("darch", "deepnet") actually create deep neural networks, initialized by Stacked Autoencoders and Stacked RBMs. All the others create a simple neural network with regularization and the usual initialization of the neuron weights.
Obviously you do not quite understand the term "deep".

Peng says:
May 10, 2016 at 3:42 am

Hi VladMinkov,
Thanks for your comments. Great point about the difference between MLP and DNN; there are several nice answers here:
https://www.quora.com/How-is-deep-learning-different-from-multilayer-perceptron

In this blog, I intend to keep the code as simple as possible to understand, so only 1 hidden layer is used and we initialized the weights/bias with random numbers. The construction of the DNN model mainly follows the online course CS231n:
http://cs231n.github.io/neural-networks-3/
As you mentioned, the initialization methods of Stacked Autoencoders and Stacked RBMs (or other parts of a DNN) can be implemented in this example R code, which is relatively flexible compared with mature packages. And that is one of the major reasons why we need to build a DNN from scratch rather than using CRAN packages.

The example code of this blog is on GitHub, and I will be very happy if you can improve the code with what a real DNN has, as you mentioned, so that more readers can learn it clearly from the code.
https://github.com/PatricZhao/ParallelR/tree/master/ParDNN
Thanks again for your informative notes.

wedding t shirts says:
May 9, 2016 at 3:10 am

I love it when people get together and share ideas. Great site, stick with it!

tony says:
May 9, 2016 at 5:24 am

I really like what you guys tend to be up to. This kind of clever work and exposure! Keep up the good work guys, I've included you in our blogroll.

Peng says:
May 9, 2016 at 5:30 am

Hi,
Thank you very much for your encouragement and blogrolling.
We'll keep going to make our work better.
BR,
–Peng

VladMinkov says:
May 10, 2016 at 9:04 am

Hi Peng,
I'm not a programmer, I'm a practitioner. It is important for me to know the basic principles of the algorithm without unnecessary gory details.
Have a look at a few of my articles here: https://www.mql5.com/en/articles/1103 and https://www.mql5.com/en/articles/2029 and https://www.mql5.com/en/articles/1628.
Best regards.

Peng says:
May 10, 2016 at 9:17 am

Hi VladMinkov,
Very great articles, and I think I need to refresh my knowledge of DNN.
You are welcome to leave further valuable comments on our site.

BR,
–Peng

precious basaldua says:
May 15, 2016 at 1:24 am

Great post. I was checking this blog constantly and I'm impressed! Very helpful information, particularly the closing part 🙂 I care about such info a lot. I was seeking this particular info for a very long time.
Thanks and good luck.

bastcilkdoptb says:
May 18, 2016 at 12:34 pm

What's up, very cool web site!! Guy .. Beautiful .. Wonderful .. I'll bookmark your website and take the feeds additionally... I'm happy to find so much useful info here in the post, we need to develop extra strategies in this regard, thanks for sharing.

furtdsolinopv says:
May 19, 2016 at 7:53 am

You have done an excellent job. I will certainly dig it and personally suggest it to my friends. I'm sure they'll benefit from this site.

hendrik says:
June 1, 2016 at 1:13 pm

How do we make it work for regression?
Here is an overview of my dataset:
Suppose my train data has 4 columns and 288 rows:
col 1 = data from two days ago, col 2 = data from the previous day, col 3 = data now, col 4 = response variable (target).
I have tried this code with
dtaipei.model # accuracy
> mean(as.integer(dtaipei[-samp, 4]) == preddtaipei.dnn)
[1] 0.01388889

What should I do?

Peng says:
June 2, 2016 at 5:20 am

Hi Hendrik,
Thanks for reading my blog and practicing with real data.
This is a very good question, and it's not very difficult to switch from the current classification network to a regression network.

Actually, the major changes include:

1. Activation function from ReLU to tanh or sigmoid;
2. Loss function from softmax to mean squared error or absolute error.
I think you can get the correct code after a little warm-up, and you are welcome to check your code into the ParallelR GitHub.
On the other hand, I will release the regression code later in case you have issues building the regression NN.
Thanks for your comments again.
–Peng
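For readers following this thread, here is a rough sketch of my own (not code from the post) of what those two changes look like in the notation of train.dnn above; W1, b1, W2, b2, X and batchsize are as in the training function, and Y is now assumed to be a numeric target vector rather than a factor:

# forward pass for regression: tanh hidden layer, single linear output, no softmax
hidden.layer <- tanh(sweep(X %*% W1, 2, b1, '+'))
score <- sweep(hidden.layer %*% W2, 2, b2, '+')
# mean squared error instead of cross-entropy (the backward pass changes accordingly)
data.loss <- sum((score - Y)^2) / (2 * batchsize)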

hendrik says:
June 8, 2016 at 7:29 am

Thanks for the reply...
After I tried it for regression, I came up with some further questions:
– How do we modify the number of hidden layers, from 2 to 5 layers, instead of a fixed number of hidden layers? Does a 2-hidden-layer architecture guarantee the best structure of the network? The idea is to perform grid search or random search hyper-parameter optimization to look for the best architecture, for example:
– hidden (try 2 to 5 layers deep, 10 to 2000 neurons per layer)
– rectifier / rectifier with dropout
– adaptive learning rate, etc.
– Can we consider your network architecture deep learning? Since DNN is in the family of deep learning, deep learning usually, as a first step, uses unsupervised pre-training (RBM, AE, SAE, CRBM) to learn features, followed by supervised training at the top (fine-tuning, backpropagation, etc.)

Correct me if I was wrong...

I'm not an expert, I still need to learn many things.
Thanks in advance...

Peng says:
June 10, 2016 at 3:23 am

That's great to go further in DNN 🙂
1) The fixed 1 hidden layer (3 layers in total) in the article is only to simplify the code and make it shorter.
In practice, people prefer to use many hidden layers (ranging from a few layers to hundreds of layers) to capture complicated patterns in the data; however, too many layers make it prone to over-fitting. Thus, it's a tradeoff between a shallow and a deep network, and you can refer to this paper as well (https://arxiv.org/pdf/1312.6184.pdf). Dropout and adaptive LR are commonly used in modern networks.

Regarding which kind of NN structure is better, it really depends on the problem and the input data, and I think this is the real state-of-the-art in DNN and very time-consuming 🙂
Going back to the code, the forward and back propagation of a multiple-layer network are very similar to the 1-layer case. The only thing for you to do is add a loop. Take 2 hidden layers, for example, in the code below. Actually, you need to generalize this piece of code by the number of input parameters, so that hidden=c(3,3,3) stands for 3 hidden layers, each with 3 neurons:

############# Forward ######################
# input to layer1
hidden.layer1 <- sweep(X %*% W1, 2, b1, '+')
hidden.layer1 <- pmax(hidden.layer1, 0)
# layer1 to layer2
hidden.layer2 <- sweep(hidden.layer1 %*% W2, 2, b2, '+')
hidden.layer2 <- pmax(hidden.layer2, 0)
# layer2 to output layer
score <- sweep(hidden.layer2 %*% W3, 2, b3, '+')

############ Backward #####################
dscores <- probs
dscores[Y.index] <- dscores[Y.index] - 1
dscores <- dscores / batchsize
dW3 <- t(hidden.layer2) %*% dscores
db3 <- colSums(dscores)
dhidden2 <- dscores %*% t(W3)
dhidden2[hidden.layer2 <= 0] <- 0
dW2 <- t(hidden.layer1) %*% dhidden2
db2 <- colSums(dhidden2)
dhidden1 <- dhidden2 %*% t(W2)
dhidden1[hidden.layer1 <= 0] <- 0
dW1 <- t(X) %*% dhidden1
db1 <- colSums(dhidden1)

2) Actually, I am not a statistician 🙁

My understanding mainly comes from the online course: http://cs231n.github.io/
You can refer to the materials from VladMinkov in the comments above for what deep learning really is, or the slides below:
ftp://ftp.dca.fee.unicamp.br/pub/docs/vonzuben/ia353_1s15/topico10_IA353_1s2015.pdf
Thanks

Lola Junge says:
June 4, 2016 at 2:15 pm

Sweet blog! I found it while surfing around on Yahoo News. Do you have any tips on how to get listed in Yahoo News? I've been trying for a while but I never seem to get there! Many thanks

Peng says:
June 6, 2016 at 7:43 am

Hi Lola,
Thanks. Though this is a little off topic, I'm very happy to know the blog has been listed on Yahoo.
Actually, I don't know the exact reasons, but I think a well-baked article with rich technical instruction will be favored by readers.
Good Luck!

Yaeko Kahawai says:
June 14, 2016 at 1:29 am

Wow, wonderful weblog format! How long have you been blogging? You make running a blog look easy. The full look of your site is great, let alone the content!

WilliamCavy says:
June 21, 2016 at 5:45 am

I really liked your article post. Much thanks again. Keep writing. Satre

eebest8 says:
July 6, 2016 at 4:05 am

"Thank you ever so for your blog post. Really looking forward to reading more."

Harshith says:
August 11, 2016 at 10:12 am

In which package is the function layer.size present? I'm getting the error "could not find the function 'layer.size'".

Peng says:
August 11, 2016 at 1:35 pm

Hi Harshith,
Sorry for the confusion. Actually, 'layer.size' is a fake function I used to illustrate how to initialize the weights and bias.
When we define the network architecture, we pass the hidden layer size to the train.dnn function with 'hidden=c(3)', and we can save all layer sizes in a vector such as 'size.nn <- c(D, hidden, K)'. So 'layer.size' can be as simple as: layer.size <- function(x) { size.nn[x] }
Hope this is helpful 🙂 And I will try to refine the post later to make it clearer.

Thanks for your reading and very great feedback.

George Chao says:
July 14, 2017 at 8:05 am

Hi Zhao Peng,
I think the bias is
`bias.i <- matrix(0, nrow=1, ncol = layer.size(i))`

George Chao says:
July 14, 2017 at 8:05 am

Sorry, `bias.i <- matrix(0, nrow=1, ncol = layer.size(i+1))`

Matrix Chen says:
July 15, 2017 at 12:38 am

Hi George,
Really thanks for your comments.
Good catch. Fixed 'layer.size(i)' to 'layer.size(i+1)'.
Thanks again for reading 🙂

kenny192 says:
September 11, 2016 at 1:27 pm

Really thank you for the good post!!
I have a few questions about the 'nnet' package.
Why doesn't 'nnet' use a learning rate? Is the learning rate set randomly?

Peng says:
September 12, 2016 at 2:38 am

kenny192, thanks for reading 🙂

Updated answer:

Actually, nnet doesn't use a learning rate, as Brian D. Ripley said:
https://stat.ethz.ch/pipermail/r-help/2005-May/071822.html
"I think you need to read the book nnet supports. It does not do on-line learning and does not have a learning rate."
And NN is in Chapter 8 of the book from Brian, below:
ftp://ftp.cs.ntust.edu.tw/yang/R-BOOK/Venables,%20WN%20and%20Ripley,%20BD(2002)%20Modern%20Applied%20Statistics%20with%20S,%204th%20Ed.pdf

More info in:
http://stackoverflow.com/questions/9390337/purpose-of-decay-parameter-in-nnet-function-in-r
http://stats.stackexchange.com/questions/29130/difference-between-neural-net-weight-decay-and-learning-rate

kenny192 says:
September 12, 2016 at 4:29 am

Thank you for the reply. So you mean the 'nnet' package only has a decay parameter? And how is the decay parameter used to set the learning rate? Kind regards,

Peng Zhao says:
September 12, 2016 at 6:23 am

Sorry for the inaccurate answer previously; I have modified it.

Actually, an LR is not needed in nnet, and I am trying to understand the weight updating of nnet. I will reply to you when I figure out more details.
Good questions for further learning 🙂
kenny192 says:
September 12, 2016 at 6:44 am

Thank you for the reply.
I think the learning rate is already included in the weight update inside the 'nnet' package. Check section 6.8.11 of this book:
http://cns-classes.bu.edu/cn550/Readings/duda-etal-00.pdf

Peng Zhao says:
September 12, 2016 at 6:55 am

Thanks for the great book and info. Very helpful!

nnet uses the "weight decay" method to update all weights by:

Wnew = Wold * (1 − decay)

And the source code of nnet can be found at:
https://github.com/cran/nnet/blob/master/src/nnet.c

kenny192 says:
September 12, 2016 at 7:17 am

Thank you, that is also very helpful.
Kind regards

kenny192 says:
November 11, 2016 at 12:38 pm

Hello, Peng 🙂 I have a question about overfitting in a neural network model. In general, overfitting means the validation error increases while the training error steadily decreases, right? But if the validation error converges (neither increasing nor decreasing) at a particular point while the training error steadily decreases, is this case also overfitting?

Donovan Kessner says:
November 20, 2016 at 6:38 am
Superb web page you’ve gotten there.

tope says:
December 28, 2016 at 2:28 pm
str(ir.model)
produces the error below in the R terminal:
Error in str(ir.model) : object ‘ir.model’ not found

tope says:
December 28, 2016 at 11:52 pm
N <- nrow(traindata)
is giving me an error:
Error in nrow(traindata) : object 'traindata' not found

Peng (http://www.parallelr.com/) says:
December 30, 2016 at 12:44 pm
Hi Tope,
Thanks for your interest in our blog; any questions are welcome.
Regarding your problems:
1. str(ir.model): you need to train the model first, as mentioned in the blog:
ir.model <- train.dnn(x=1:4, y=5, traindata=iris[samp,], testdata=iris[-samp,], hidden=6, maxit=2000, display=50)
2. traindata is a variable local to the function, so you can’t execute that line on its own. But you can assign real data first, as below:
> samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
> traindata <- iris[samp,]
> nrow(traindata)
[1] 75
BTW, the whole R code is on GitHub (https://github.com/PatricZhao/ParallelR); you can try to run it and then debug it step by step.

Michael says:
January 31, 2017 at 8:36 pm
Nice blog, I like it, thank you so much… this will be helpful for my students’ education.

Brian says:
February 14, 2017 at 3:41 pm
Hi, I’m confused about how you would convert this into an NN with two or more hidden layers. Any chance you could help me out? Fantastic article, I learnt so much.
Thanks!

Peng (http://www.parallelr.com/) says:
February 15, 2017 at 11:46 am
Hi Brian,
Thanks for your interest in the blog. Please see my reply to hendrik on June 10, 2016 at 3:23 am 🙂
Thanks

Sharon Grace says:
May 9, 2017 at 1:43 pm
Hi, I am working on a project called house rent prediction using artificial neural networks. Please assist with the coding.

jdxbx says:
July 2, 2017 at 8:10 am
Very nice work!
Sorry, I cannot find data1. Where is data1?
Thank you very much!

Peng (http://www.parallelr.com/blog) says:
July 3, 2017 at 12:17 am
We use the built-in ‘iris’ dataset rather than ‘data1’.
Thanks

Patricio says:
July 20, 2017 at 8:58 pm
Hi, really amazing blog!
I am trying to modify the code for a single node in the output layer (with other data), but I could not get it to work. Could you help me?

Matrix Chen says:
July 22, 2017 at 2:07 pm
Hi Patricio,
Thanks a lot for your comments.
I think you are trying to switch from the current classification network to a regression network.
The major changes include:
1. The activation function, from ReLU to tanh or sigmoid;
2. The loss function, from softmax to mean squared error or absolute error.
You are welcome to share your code with me; it is a great honor to help you. 🙂
Thanks again for your comments.
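
A rough sketch of the two changes above in R (hypothetical helper names, not the post's code): a tanh hidden activation in place of ReLU, and a mean-squared-error loss in place of the softmax cross-entropy.

# Hypothetical helpers for a regression variant (not the code from the post).
hidden_act <- function(z) tanh(z)                      # replaces ReLU: pmax(z, 0)
mse_loss   <- function(y, y_hat) mean((y - y_hat)^2)   # replaces softmax cross-entropy

# quick check with made-up numbers
hidden_act(c(-1, 0, 2))
mse_loss(c(0.5, 1.2), c(0.4, 1.0))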

Genwei Zhang says:
July 31, 2017 at 3:38 am
Dear Dr. Zhao,
This blog is excellent for me!
I tested the classification results, and they worked very well, just as you showed.
Now I am facing a task to fit an NN to a regression dataset. You mentioned in one of the comments above that switching from classification to regression is not very hard, but I tried to modify the code and it did not work well on my end. I am wondering whether you posted another blog somewhere about regression NNs, or could you kindly provide me with some details? I would really appreciate your help.
David Zhang,
Graduate Student,
University of Oklahoma

Matrix Chen says:
July 31, 2017 at 3:50 pm
Hi David,
Thanks a lot for your comments.
To switch from the current classification network to a regression network, the major changes are:
1. The activation function, from ReLU to tanh or sigmoid. Note that it is usually unnecessary to apply an activation function on the output layer;
2. The loss function, from softmax to mean squared error or absolute error;
3. The backpropagation step.
BTW, the regression NN code can be found at:
https://github.com/PatricZhao/ParallelR/tree/master/ParDNN
You can try to run it and then debug it step by step.
Thanks again for your comments.
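
For change 3 above, a minimal sketch of the output-layer backpropagation under a mean-squared-error loss with a linear output (illustrative names, not the repository code): the output error term reduces to y_hat − y, which then feeds the usual weight and bias gradients.

# Sketch of the output-layer gradients for a regression head (illustrative, not the repo code).
# hidden: N x H matrix of hidden activations; W2: H x 1 output weights; b2: output bias; y: length-N targets.
backprop_output_mse <- function(hidden, W2, b2, y) {
  y_hat <- hidden %*% W2 + b2          # linear output, no activation on the output layer
  delta <- (y_hat - y) / nrow(hidden)  # dLoss/dy_hat for loss = mean((y_hat - y)^2) / 2
  list(dW2     = t(hidden) %*% delta,  # gradient w.r.t. output weights
       db2     = sum(delta),           # gradient w.r.t. output bias
       dhidden = delta %*% t(W2))      # error passed back to the hidden layer
}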

Genwei Zhang says:
July 31, 2017 at 5:57 pm
Hi, thank you so much for your reply. I will try that out!

Flo-Bow says:
September 21, 2017 at 3:46 pm
Hi Dr. Zhao,
Thank you for sharing your knowledge, but this part of your R code does not work for me:
weight.i <- 0.01*matrix(rnorm(layer.size(i)*layer.size(i+1)),
                        nrow=layer.size(i),
                        ncol=layer.size(i+1))
What can I do?
Thanks in advance for replying.
Cheers, Flo-Bow

Matrix Chen says:
September 22, 2017 at 4:44 am
Hi Flo-Bow,
Thanks a lot for your comments.
This line just shows that the weight size is defined by the number of neurons in layer M and the number of neurons in layer M+1. You can complete the weight-initialization part according to this rule on your own. You can also refer to the entire source code of this post:
https://github.com/PatricZhao/ParallelR/blob/master/ParDNN/iris_dnn.R
You can try to run it and then debug it step by step.
Thanks again for your comments.
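
A minimal, self-contained version of that initialization rule (the layer sizes below are example values, not the post's exact code): each weight matrix gets one row per neuron in layer M and one column per neuron in layer M+1.

# Illustrative completion of the initialization rule described above.
set.seed(1)
layer.sizes <- c(4, 6, 3)   # e.g. input, hidden, and output neurons
weights <- vector("list", length(layer.sizes) - 1)
for (i in seq_len(length(layer.sizes) - 1)) {
  weights[[i]] <- 0.01 * matrix(rnorm(layer.sizes[i] * layer.sizes[i + 1]),
                                nrow = layer.sizes[i],       # neurons in layer i
                                ncol = layer.sizes[i + 1])   # neurons in layer i + 1
}
sapply(weights, dim)        # 4x6 and 6x3, matching consecutive layer sizes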

Mauricio says:
April 10, 2020 at 9:53 am
As a site owner myself, I believe the content here is really wonderful; I appreciate your hard work. You should keep it up forever!
Best of luck.

2 TRACKBACKS
R for Deep Learning (II): Achieve High-Performance DNN with Parallel Acceleration – ParallelR (http://www.parallelr.com/r-dnn-parallel-acceleration/)
R for Deep Learning (III): CUDA and MultiGPUs Acceleration – ParallelR (http://www.parallelr.com/r-dnn-cuda-multigpu/)
