Activation Functions
Activation functions introduce non-linearity into your network. Without them, a 47-layer neural network collapses into a single linear map: stacked matrix multiplies compose into one matrix multiply, so the extra depth buys no extra expressive power, and we'd all be out of a hobby.
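To make the collapse concrete, here is a minimal sketch in plain Python. It assumes bias-free layers and small random integer weights (chosen so the comparison is exact rather than subject to floating-point rounding). The names `matmul`, `layers`, and `collapsed` are illustrative and not taken from any particular library.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

random.seed(0)
DIM, DEPTH = 3, 47

# Hypothetical weights: small integers so the arithmetic stays exact.
layers = [
    [[random.randint(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
    for _ in range(DEPTH)
]
x = [[2], [-1], [3]]  # input as a column vector

# Forward pass through all 47 layers with no activation in between.
h = x
for w in layers:
    h = matmul(w, h)

# Pre-multiply the 47 weight matrices into one matrix.
collapsed = layers[0]
for w in layers[1:]:
    collapsed = matmul(w, collapsed)

# One matrix multiply reproduces the whole stack.
single_layer_output = matmul(collapsed, x)
print(h == single_layer_output)
```

The comparison holds for any choice of weights because matrix multiplication is associative. Adding biases turns the result into a single affine map instead, which is still one layer's worth of expressive power.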
Covers all activation functions and their backward passes, from the classic ReLU to the modern GELU.
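As a rough illustration of what a forward and backward pass look like for the two functions named above, here is a scalar sketch in plain Python. The function names and signatures are hypothetical, not this project's API. It assumes the exact (erf-based) form of GELU rather than the tanh approximation, and it treats the ReLU slope at zero as 0, which is a common convention but still a choice.

```python
import math

def relu(x):
    """ReLU forward: max(0, x)."""
    return max(0.0, x)

def relu_backward(x, grad_out):
    """ReLU backward: pass the gradient through where x > 0.

    Assumption: the slope at exactly x == 0 is taken to be 0.
    """
    return grad_out if x > 0 else 0.0

def gelu(x):
    """Exact GELU forward: x * Phi(x), with Phi the standard normal CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_backward(x, grad_out):
    """Exact GELU backward: d/dx [x * Phi(x)] = Phi(x) + x * phi(x)."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return grad_out * (cdf + x * pdf)

def numeric_grad(f, x, eps=1e-6):
    """Central finite difference, used only to sanity-check the backward passes."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

for x in (-2.0, -0.5, 0.5, 2.0):
    print(x, relu_backward(x, 1.0), numeric_grad(relu, x))
    print(x, gelu_backward(x, 1.0), numeric_grad(gelu, x))
```

Each backward function takes the original input and the upstream gradient and returns the gradient with respect to the input. For every test point, the analytic and finite-difference values printed side by side should agree to several decimal places.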