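The softmax function takes a vector $x$ of dimension $n$ and normalizes it into a probability distribution over $n$ different outcomes:

$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}, \quad i = 1, \dots, n$$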

Numerical Instability

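When softmax is computed in floating point, large input values cause instability: the exponentials overflow to infinity, and the ratio of two infinities is NaN. Instead, observe that for any constant $c$:

$$\mathrm{softmax}(x + c)_i = \frac{e^{x_i + c}}{\sum_j e^{x_j + c}} = \frac{e^{c} e^{x_i}}{e^{c} \sum_j e^{x_j}} = \mathrm{softmax}(x)_i$$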

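This means we can add any constant to every input without changing the result. A common choice is to subtract the largest input: the biggest exponent then becomes zero, so no exponential can overflow, and the denominator is at least 1, which avoids infs and NaNs.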
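As a rough illustration, here is a minimal NumPy sketch comparing a direct implementation with a shifted one. The function names `softmax_naive` and `softmax_stable` are made up for this example, and using the negative maximum as the constant is an assumed (though common) choice, not the only valid one.

```python
import numpy as np

def softmax_naive(x):
    # Direct translation of the definition; np.exp overflows for large inputs.
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # Shift every input by the same constant (here: minus the maximum),
    # so the largest exponent is exp(0) = 1 and nothing overflows.
    shifted = x - np.max(x)
    e = np.exp(shifted)
    return e / e.sum()

x = np.array([1000.0, 1001.0, 1002.0])

# exp(1000) overflows to inf in float64, and inf / inf evaluates to nan.
with np.errstate(over="ignore", invalid="ignore"):
    naive = softmax_naive(x)

stable = softmax_stable(x)
```

With these inputs, `naive` contains only NaN values, while `stable` is a valid distribution that matches the softmax of `[0, 1, 2]`, since the two input vectors differ only by a constant shift.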