Layer-sequential unit-variance (LSUV) initialization

LSUV is a simple method for weight initialization for deep-net learning. The method consists of two steps:
- First, pre-initialize the weights of each convolution or inner-product layer with orthonormal matrices.
- Second, proceed from the first to the final layer, normalizing the variance of the output of each layer to be equal to one.
Experiments with different activation functions (maxout, ReLU-family, tanh) show that the proposed initialization leads to learning of very deep nets.
Pseudo-code of LSUV:
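
Below is a minimal NumPy sketch of the two steps above for a toy fully-connected net. It is not the paper's reference implementation: the layer sizes, the ReLU activation, and the `tol` / `max_iters` stopping criteria are illustrative assumptions.

```python
# Minimal LSUV sketch for a toy fully-connected net (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def orthonormal(n_in, n_out):
    """Step 1: orthonormal pre-initialization (Saxe et al.)."""
    a = rng.standard_normal((max(n_in, n_out), min(n_in, n_out)))
    q, _ = np.linalg.qr(a)               # columns of q are orthonormal
    return q if n_in >= n_out else q.T   # shape (n_in, n_out)

def relu(x):
    return np.maximum(x, 0.0)

layer_sizes = [64, 128, 128, 10]         # toy architecture (assumption)
weights = [orthonormal(n_in, n_out)
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

x_batch = rng.standard_normal((256, layer_sizes[0]))   # one mini-batch

# Step 2: go from the first to the last layer, rescaling each layer's
# weights until the variance of its output on the mini-batch is ~1.
tol, max_iters = 0.1, 10                 # illustrative stopping criteria
h = x_batch
for i, w in enumerate(weights):
    for _ in range(max_iters):
        var = (h @ w).var()
        if abs(var - 1.0) < tol:
            break
        w /= np.sqrt(var)                # in-place rescale of this layer
    out = h @ w
    print(f"layer {i}: output variance ~ {out.var():.3f}")
    h = relu(out)                        # propagate to the next layer
```

Because the rescaling uses the network's actual activations on a mini-batch, the procedure automatically compensates for the variance change introduced by the non-linearity at each layer.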

Note:
- In most cases, batch normalization placed after the non-linearity performs better.
- An LSUV-initialized network performs about as well as a batch-normalized one.
- The paper does not claim that batch normalization can always be replaced by proper initialization, especially on large datasets like ImageNet.
LSUV-keras: https://github.com/ducha-aiki/LSUV-keras
