Dropout regularization clusters neurons together during training, producing high similarity between consecutive weight matrices. This clustering can be used to predict trends in both the sample variance observed during test-time sampling and the overall model variance. In particular, the sizes of the hidden layers follow an almost exact power relation to the sampling variance. By using the centroids of the neuron clusters as compressed weights, we demonstrate strong model compression results while retaining the original model's predictive features.
linkewei@stanford.edu, wangjh97@stanford.edu, philhc@stanford.edu
CS 229 (Spring 2019) Final Project. Mentor: Anand Avati
Introduction

Dropout is a regularization technique introduced in (Srivastava et al., 2014). During training, each node is disabled at random with a fixed probability 1 − p (with 1 − p ≈ 0.2–0.5). During test time, the node output is multiplied by p so that the expected outputs match up.

Clustering

Idea 1 – Dropout "clusters" neurons
Result: high Rand index between consecutive weight matrices.

Idea 2 – Clustering can predict sampling variance
Result: an almost exact power relation between the size of the hidden layer and the sampling variance.

Bias-Variance Tradeoff

Findings
• Dropout decreases model variance
• Sampling at test time further decreases model variance

Comparison of Model Variances
Why does sampling at test time decrease variance? An electron cloud analogy:
• Model family ↔ Electron cloud
• Sampling ↔ Superposition of states
• Inverting at test time ↔ Collapsing wavefunction
On average, the electron cloud moves less than any one "component" of the cloud.
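The training/test protocol above can be sketched in a few lines. This is a minimal NumPy illustration of standard dropout as described in the introduction, not the project's code; p is the keep probability:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.8  # keep probability; the poster drops nodes with probability 1 - p

def dropout_train(h, p, rng):
    """Training time: disable each node independently with probability 1 - p."""
    mask = rng.random(h.shape) < p
    return h * mask

def dropout_test(h, p):
    """Test time: multiply node outputs by p so expected outputs match training."""
    return h * p

h = np.ones(10_000)
# Averaged over the random masks, the training-time output matches the
# deterministic, scaled test-time output.
print(dropout_train(h, p, rng).mean())  # ≈ 0.8
print(dropout_test(h, p).mean())        # ≈ 0.8
```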
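Idea 1's "high Rand index between consecutive weight matrices" refers to the pairwise agreement measure of [3]. Below is a self-contained sketch of that measure; the two label vectors are hypothetical stand-ins for clusterings of the same layer's neurons at consecutive training checkpoints:

```python
import numpy as np
from itertools import combinations

def rand_index(a, b):
    """Rand (1971) index: fraction of point pairs on which two clusterings agree
    (both place the pair in the same cluster, or both place it in different ones)."""
    agree = sum((a[i] == a[j]) == (b[i] == b[j])
                for i, j in combinations(range(len(a)), 2))
    return agree / (len(a) * (len(a) - 1) / 2)

# Toy stand-in for cluster labels of one layer's neurons at steps t and t+1:
labels_t  = np.array([0, 0, 1, 1, 2, 2])
labels_t1 = np.array([0, 0, 1, 1, 2, 0])  # one neuron changed cluster
print(rand_index(labels_t, labels_t1))
```

A Rand index near 1 between consecutive checkpoints is what the poster reports as evidence that the dropout-induced clusters are stable over training.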
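The finding that sampling at test time further decreases variance (the electron cloud moving less than any one of its components) can be illustrated numerically. This is a toy one-layer sketch with hypothetical weights, not the project's MNIST experiment:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.5
w = rng.standard_normal(100)  # hypothetical weights of a single hidden layer
x = np.ones(100)

def sampled_output(rng):
    """One stochastic forward pass: drop each weight with probability 1 - p."""
    mask = rng.random(w.shape) < p
    return (w * mask) @ x

# A single stochastic pass vs. the average of T = 20 passes (Monte Carlo
# integration over the dropout distribution):
single = np.array([sampled_output(rng) for _ in range(2000)])
averaged = np.array([np.mean([sampled_output(rng) for _ in range(20)])
                     for _ in range(2000)])

# The variance of the T-pass average shrinks by roughly a factor of T.
print(single.var(), averaged.var())
```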
Framework / Setup

• Interpretation as a Bayesian Neural Net
• Model drawn from the Dropout distribution
• Training is equivalent to variational inference for Bayesian neural networks (Gal and Ghahramani, 2015)
• Latent variables are the usual weights
• The Dropout test-time protocol is a linearization assumption (scaling by the Dropout factor)
• Alternative: Monte Carlo integration (sampling)

Testing setup
• MNIST dataset with a basic 3-layer feedforward neural network

Model Compression

Idea: if dropout clusters neurons together, use the "centroids" of the weight matrix to compress the network.

Advantages
• Retains original model features
• Easy computation
• Low variance

Disadvantages
• Slightly lower accuracy

Summary / Future Work

Summary
• Dropout clusters the weights of hidden layers
• These clusters can predict trends for both sample variance and model variance
• Using the centroids of these clusters, we presented a model compression algorithm with strong results

Future work
• A unified model to explain both sample and model variance through clustering
• Considering both sets of weights simultaneously for model compression
• Theoretically-motivated hyperparameters for model compression

References
[1] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.
[2] Yarin Gal and Zoubin Ghahramani. Bayesian convolutional neural networks with Bernoulli approximate variational inference. arXiv preprint arXiv:1506.02158, 2015.
[3] William M. Rand. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336):846–850, 1971. doi: 10.1080/01621459.1971.10482356

Both graphics used in the "Introduction to Dropout" section are from [1].
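The centroid-based compression idea can be sketched as follows. The poster does not specify the clustering algorithm, so plain k-means (Lloyd's algorithm) on the rows of a hidden layer's weight matrix is an assumption here, and the weight matrix is synthetic with planted cluster structure:

```python
import numpy as np

rng = np.random.default_rng(2)

def kmeans(X, k, iters=50, rng=rng):
    """Plain Lloyd's algorithm: returns centroids and per-row assignments."""
    C = X[rng.choice(len(X), k, replace=False)]  # fancy indexing copies X's rows
    for _ in range(iters):
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)  # squared distances
        z = d.argmin(1)                                     # nearest centroid
        for j in range(k):
            if (z == j).any():
                C[j] = X[z == j].mean(0)                    # recompute centroid
    return C, z

# Hypothetical hidden-layer weight matrix: 64 neurons (rows), 32 inputs,
# generated around 8 planted centers so the clustering is well-defined.
centers = rng.standard_normal((8, 32))
W = centers[rng.integers(0, 8, 64)] + 0.01 * rng.standard_normal((64, 32))

C, z = kmeans(W, k=8)
W_compressed = C[z]      # every neuron's weights replaced by its cluster centroid
ratio = C.size / W.size  # store 8 distinct rows plus assignments instead of 64 rows
print(ratio, np.abs(W - W_compressed).max())
```

Storing only the centroids and the assignment vector gives the compression; the slight accuracy loss noted under Disadvantages corresponds to the residual between each row and its centroid.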