Parameters and Hyperparameters in Machine Learning and Deep Learning
What exactly are they and how do they interact?

Kizito Nyuytiymbiy · Dec 30, 2020 · 4 min read

When you begin learning anything new, one of the first things you grapple with is the lingo
of the field you are getting into. Clearly understanding the terms (and in some cases the
symbols and acronyms) used in a field is the first and most fundamental step to
understanding the subject matter itself. When I started out in machine learning, the
concepts of parameters and hyperparameters confused me a lot. If you are here, I suspect
they confuse you too. So I wrote this article to dispel whatever confusion you might have
and set you on a path of absolute clarity.

In ML/DL, a model is defined or represented by its model parameters. Training a model,
however, involves choosing the optimal hyperparameters that the learning algorithm will
use to learn the optimal parameters, the ones that correctly map the input features
(independent variables) to the labels or targets (dependent variable) so that you achieve
some form of intelligence.

So what exactly are parameters and hyperparameters, and how do they relate?

Hyperparameters
Hyperparameters are parameters whose values control the learning process and
determine the values of model parameters that a learning algorithm ends up learning.
The prefix ‘hyper’ suggests that they are ‘top-level’ parameters that control the
learning process and the model parameters that result from it.

As a machine learning engineer designing a model, you choose and set hyperparameter
values that your learning algorithm will use before the training of the model even
begins. In this light, hyperparameters are said to be external to the model because the
model cannot change their values during learning/training.

Hyperparameters are used by the learning algorithm while it is learning, but they are
not part of the resulting model. At the end of the learning process, we have the trained
model parameters, which are effectively what we refer to as the model. The
hyperparameters that were used during training are not part of this model: we cannot,
for instance, tell from the model itself which hyperparameter values were used to train
it; we only see the model parameters that were learned.
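To make that concrete, here is a minimal sketch (plain NumPy, with invented parameter values): the artifact we save after training contains only the learned parameters; nothing in it records the learning rate or epoch count that produced them.

```python
import numpy as np

# Learned parameters after some hypothetical training run (invented values).
weights = np.array([0.42, -1.30, 0.07])
bias = 0.5

# Saving "the model" means saving its parameters. The hyperparameters that
# produced them (learning rate, epochs, batch size, ...) are not stored here.
np.savez("model.npz", weights=weights, bias=bias)

restored = np.load("model.npz")
print(restored["weights"], restored["bias"])  # parameters are recoverable; hyperparameters are not
```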

Basically, anything in machine learning and deep learning whose value or configuration you
decide before training begins, and whose value or configuration remains the same when
training ends, is a hyperparameter.

Here are some common examples (a short code sketch after the list shows several of them being set):


Train-test split ratio

Learning rate in optimization algorithms (e.g. gradient descent)

Choice of optimization algorithm (e.g., gradient descent, stochastic gradient descent, or the Adam optimizer)

Choice of activation function in a neural network (nn) layer (e.g. Sigmoid, ReLU,
Tanh)

The choice of cost or loss function the model will use

Number of hidden layers in a nn

Number of activation units in each layer

The dropout rate in a nn (dropout probability)

Number of iterations (epochs) in training a nn

Number of clusters in a clustering task


Kernel or filter size in convolutional layers

Pooling size

Batch size
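As a concrete illustration, here is a minimal sketch using scikit-learn (the data and hyperparameter values are made up for the example, not recommendations). Several of the items above appear as arguments you fix before .fit() is ever called:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# The train-test split ratio is itself a hyperparameter.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = MLPClassifier(
    hidden_layer_sizes=(64, 32),   # number of hidden layers and units per layer
    activation="relu",             # choice of activation function
    solver="adam",                 # choice of optimization algorithm
    learning_rate_init=0.001,      # learning rate
    batch_size=32,                 # batch size
    max_iter=200,                  # maximum number of training iterations (epochs)
    random_state=0,
)
clf.fit(X_train, y_train)          # training learns the parameters; the values above stay fixed
```

Note that MLPClassifier does not expose every item in the list (there is no dropout argument, for instance); it is just one convenient place to see several hyperparameters side by side.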

Parameters
Parameters, on the other hand, are internal to the model. That is, they are learned or
estimated purely from the data during training, as the algorithm tries to learn the
mapping between the input features and the labels or targets.

Model training typically starts with the parameters initialized to some values (random
values, or set to zeros). As training/learning progresses, these initial values are
updated using an optimization algorithm (e.g. gradient descent). The learning algorithm
continuously updates the parameter values as learning progresses, but the hyperparameter
values set by the model designer remain unchanged.
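Here is a minimal sketch of that loop in plain NumPy (synthetic data, illustrative values): the learning rate and epoch count are fixed up front, while only the parameters w and b change from step to step.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # 100 samples, 3 features (synthetic)
y = X @ np.array([2.0, -1.0, 0.5]) + 0.3      # targets from a known linear rule

learning_rate = 0.1    # hyperparameter: chosen before training, never updated
epochs = 500           # hyperparameter

w = np.zeros(3)        # parameters: initialized, then learned from the data
b = 0.0

for _ in range(epochs):
    error = X @ w + b - y
    grad_w = X.T @ error / len(y)    # gradient of the mean squared error w.r.t. w
    grad_b = error.mean()
    w -= learning_rate * grad_w      # only the parameters change each step
    b -= learning_rate * grad_b

print(w, b)  # should approach [2.0, -1.0, 0.5] and 0.3
```

Swapping in a different learning rate changes how the parameters move, but the loop never rewrites the learning rate itself; that asymmetry is the whole distinction.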

At the end of the learning process, model parameters are what constitute the model
itself.

Examples of parameters (a short code sketch after the list shows how to inspect them)
The coefficients (or weights) of linear and logistic regression models.

Weights and biases of a nn

The cluster centroids in clustering
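As a quick illustration with scikit-learn (toy data), here is how the learned parameters from the list above can be read off fitted models:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])

reg = LinearRegression().fit(X, y)
print(reg.coef_, reg.intercept_)   # learned parameters: coefficient and intercept

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)  # n_clusters is a hyperparameter
print(km.cluster_centers_)         # learned parameters: the cluster centroids
```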

Simply put, parameters in machine learning and deep learning are the values your
learning algorithm updates on its own as it learns, and those values are shaped by the
hyperparameters you provide. You set the hyperparameters before training begins, and the
learning algorithm uses them to learn the parameters. Behind the training scene, the
parameters are continuously updated, and the final values at the end of training
constitute your model.

Therefore, setting the right hyperparameter values is very important, because they
directly impact the performance of the model that results from using them during
training. The process of choosing the best hyperparameters for your model is called
hyperparameter tuning, and in the next article we will explore a systematic way of doing
it.
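As a small preview (scikit-learn, toy data; the grid values are placeholders, not recommendations), a grid search simply trains the model once per hyperparameter combination and keeps the one that scores best:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=0)

# Each candidate value of C (a regularization hyperparameter) gets its own
# training run; cross-validation scores decide which one wins.
search = GridSearchCV(LogisticRegression(max_iter=500), {"C": [0.01, 0.1, 1, 10]}, cv=5)
search.fit(X, y)
print(search.best_params_)
```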

Conclusion
I trust that you now have a clear understanding of what hyperparameters and parameters
are, and that you understand how hyperparameters impact the parameters your model
learns. I will be following this up with a detailed practical article on hyperparameter
tuning.

This article is a product of knowledge from:

The Deep Learning Specialization on Coursera by Andrew Ng.

The Machine Learning course on Coursera by Andrew Ng.
