Why are residual connections important in deep neural networks?
Answer
Residual connections, also known as skip connections, are vital in deep neural networks primarily because they tackle the infamous vanishing gradient problem and help with the related issue of network degradation as the network depth increases.
A residual connection is often expressed by the following equation:

y = F(x) + x

Where:
F(x) represents the residual mapping that the network learns (i.e., what needs to be added to the input x to achieve the desired output y).
x is the input to the residual block.

(1) Tackle vanishing gradient problem:
Residual connections add an identity shortcut around the learned transformation, so the block computes y = F(x) + x and the gradient of y with respect to x becomes ∂F/∂x + 1. Even when the gradient through the learned branch is small, the identity term guarantees a direct gradient path back to earlier layers. This keeps gradients from shrinking multiplicatively with depth during backpropagation and makes very deep networks trainable.
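A back-of-the-envelope illustration of this effect, assuming (purely for the sketch) that every layer's local derivative is a small constant of 0.1, as with a saturating activation: a plain stack multiplies these factors, while each residual block contributes 1 + F'(x):

```python
# Toy comparison of gradient magnitude through 50 layers.
# local_grad is an assumed small per-layer derivative, not measured.
local_grad = 0.1
depth = 50

plain = 1.0      # plain stack: gradient is the product of f'(x) terms
residual = 1.0   # residual stack: product of (1 + F'(x)) terms
for _ in range(depth):
    plain *= local_grad
    residual *= 1.0 + local_grad

print(plain)     # shrinks geometrically toward zero
print(residual)  # stays above 1 thanks to the identity term
```

The exact numbers depend on the assumed derivative, but the qualitative gap is the point: the identity term keeps the shortcut gradient from ever falling below 1 per block.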
(2) Address network degradation:
Residual connections mitigate the degradation problem often seen in deep networks. Without them, simply stacking more layers can raise training error, not just test error, meaning the issue is optimization difficulty rather than overfitting: a deeper plain network struggles even to reproduce the identity function that its shallower counterpart implicitly computes. With residual connections, any layer that doesn't contribute useful information can learn to output zeros in the residual branch, letting the block default to an identity mapping so that added depth never has to hurt.
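To make the "default to identity" point concrete, here is a small sketch (with an illustrative, hypothetical `dead_branch`) where the residual branch outputs zeros and the block passes its input through unchanged:

```python
def residual_block(x, F):
    # y = F(x) + x
    return [f + xi for f, xi in zip(F(x), x)]

def dead_branch(x):
    # a branch that has learned to contribute nothing (all zeros)
    return [0.0 for _ in x]

out = residual_block([3.0, -1.0], dead_branch)
print(out)  # identical to the input: the block acts as an identity
```

A plain layer would instead have to learn weights that exactly reproduce its input, which is much harder for an optimizer than driving a branch's output toward zero.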