A trick in deep learning to speed up training computation. Linear Normialization § For each f(x)i=γ⋅σxi−E[x]+β where γ,β are learnable parameters Batch Normalization §