How does initializing all weights and biases to zero affect a neural network’s training?
Answer
Initializing all weights and biases to zero forces every neuron in a layer to behave identically: they compute the same output and receive the same gradient, so the network cannot learn diverse representations.
(1) Symmetry Problem: Neurons receive identical gradients, causing them to learn the same features rather than developing distinct representations.
(2) Limited Representational Capacity: The network cannot capture complex, varied patterns because all neurons behave identically.
(3) Slow/No Convergence: Because every neuron in a layer receives the same update, gradient descent cannot break the symmetry on its own, so the network converges slowly or gets stuck far from the optimal weights.
(4) Zero Output (Potentially): For some activation functions (like ReLU), zero weights and biases make the initial output of every neuron zero. This can produce zero gradients in subsequent layers, halting the learning process entirely.
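The symmetry problem from points (1) and (2) can be seen directly by training a tiny two-layer network by hand. The setup below is a hypothetical toy example (NumPy, sigmoid activations, made-up data): with all-zero initialization, every hidden unit receives the exact same gradient at every step, so the columns of the first weight matrix stay identical forever.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                     # toy input batch
y = rng.integers(0, 2, size=(8, 1)).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# two-layer network, ALL parameters initialized to zero
W1 = np.zeros((3, 4)); b1 = np.zeros(4)
W2 = np.zeros((4, 1)); b2 = np.zeros(1)

for _ in range(100):
    h = sigmoid(X @ W1 + b1)                    # every hidden unit outputs the same value
    p = sigmoid(h @ W2 + b2)
    dlogits = (p - y) / len(X)                  # BCE gradient w.r.t. the output pre-activation
    dW2 = h.T @ dlogits
    dh = dlogits @ W2.T
    dz1 = dh * h * (1 - h)
    dW1 = X.T @ dz1
    # symmetry: every hidden unit gets an identical gradient row
    assert np.allclose(dW2, dW2[0])
    W1 -= 0.5 * dW1; b1 -= 0.5 * dz1.sum(0)
    W2 -= 0.5 * dW2; b2 -= 0.5 * dlogits.sum(0)

# after 100 steps the hidden units are still exact copies of each other
print(np.allclose(W1, W1[:, :1]))               # → True
```

However many hidden units the layer has, a zero-initialized network behaves like a network with a single hidden unit.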
Here is an example comparing initializing all weights and biases to zero vs random initialization for a binary classification problem.
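A minimal sketch of such a comparison is given below (hypothetical toy setup, not from the original post): the same two-layer tanh network is trained on XOR twice, once with all-zero initialization and once with small random weights. With tanh, zero initialization makes every hidden output zero and every gradient zero, so the loss never moves from ln 2 ≈ 0.693; random initialization lets the units differentiate and fit the data.

```python
import numpy as np

def train(seed=None, steps=3000, lr=0.5):
    # XOR: a binary problem a collapsed (single-unit) network cannot solve
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    if seed is None:                            # all-zeros initialization
        W1 = np.zeros((2, 8)); W2 = np.zeros((8, 1))
    else:                                       # small random initialization
        rng = np.random.default_rng(seed)
        W1 = rng.normal(0, 0.5, (2, 8)); W2 = rng.normal(0, 0.5, (8, 1))
    b1 = np.zeros(8); b2 = np.zeros(1)
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)
        p = 1 / (1 + np.exp(-(h @ W2 + b2)))
        d = (p - y) / len(X)                    # BCE gradient at the output
        dW2, db2 = h.T @ d, d.sum(0)
        dz = (d @ W2.T) * (1 - h**2)
        dW1, db1 = X.T @ dz, dz.sum(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    eps = 1e-9                                  # avoid log(0)
    return float(-(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)).mean())

zero_loss = train(seed=None)   # gradients are all zero: loss frozen at ln 2
rand_loss = train(seed=0)      # symmetry broken: the network can fit XOR
print(round(zero_loss, 4))     # → 0.6931
print(rand_loss < zero_loss)   # → True
```

The exact random-initialization loss depends on the seed and learning rate, but any run that learns at all ends below the frozen zero-initialization baseline.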