Intuition. In many cases, data can be represented as a sum of commonly occurring components. For example, a handwritten number is simply a combination of pen strokes. We formalize this intuition as follows:

  • Each component is an element of a common language, i.e. a dictionary
  • Data is constructed from a weighted sum of dictionary items
    • Often, only a few types of pen strokes are used for a single number. Thus we want to choose (= weight) only a few components; i.e. enforce sparsity in the latent representation

def. Sparse Coding & Dictionary Learning. Given a dictionary $D$, let $x$ be a datapoint and $z(x)$ be the corresponding latent representation obtained by:

$$z(x) = \arg\min_{z} \; \frac{1}{2}\|x - Dz\|_2^2 + \lambda \|z\|_1$$

i.e., the latent representation is defined as the minimizer of the (sparsity-regularized) reconstruction loss. This is an odd way to define a latent representation, but it is useful: given a dictionary $D$, the representation $z(x)$ always minimizes the reconstruction loss, so part of the training is built into the inference.

  • We also enforce that the columns of $D$ have norm $1$, because if we don’t, the sparsity objective can ‘cheat’: the entries of $D$ can be scaled up (and $z$ scaled down) to shrink the sparsity penalty without changing the reconstruction.
  • This optimization is done with the Iterative Shrinkage-Thresholding Algorithm (ISTA), because the non-smooth sparsity penalty makes it difficult to use standard gradient-based tools (a minimal sketch is given below).

Objective. Now, how do we get $D$? Sometimes it’s obvious (e.g., sentences are composed of words), but in most cases it is not. We obtain the optimal $D$ with the following objective:

$$\min_{D:\;\|d_k\|_2 = 1} \; \frac{1}{n}\sum_{i=1}^{n} \min_{z^{(i)}} \; \frac{1}{2}\big\|x^{(i)} - D z^{(i)}\big\|_2^2 + \lambda \big\|z^{(i)}\big\|_1$$
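As a concrete illustration of the inner sparse-coding step, here is a minimal NumPy sketch of ISTA. It assumes a fixed dictionary `D` with unit-norm columns; the function names, step-size choice, `lam`, and iteration count are illustrative choices, not taken from these notes.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam=0.1, n_iters=200):
    """Sparse coding with ISTA: minimize 0.5*||x - D z||^2 + lam*||z||_1 for a fixed D."""
    # Step size 1/L, where L is the Lipschitz constant of the smooth term
    # (the largest eigenvalue of D^T D, i.e. the squared spectral norm of D).
    L = np.linalg.norm(D, ord=2) ** 2
    step = 1.0 / L
    z = np.zeros(D.shape[1])
    for _ in range(n_iters):
        grad = D.T @ (D @ z - x)                          # gradient of the reconstruction term
        z = soft_threshold(z - step * grad, step * lam)   # shrinkage step for the L1 penalty
    return z
```

Each iteration takes a gradient step on the smooth reconstruction term and then applies soft-thresholding, the proximal operator of the $\ell_1$ penalty; that is what makes the non-smooth objective tractable.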
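The notes stop at the objective, so how $D$ is actually optimized is not specified here. One simple alternating scheme, sketched below as an illustration rather than as the method prescribed in these notes (more sophisticated options include MOD, K-SVD, or online dictionary learning), is to alternate sparse coding (reusing `ista` from above) with a gradient step on $D$, re-normalizing the columns to unit norm after every update.

```python
def learn_dictionary(X, n_atoms, lam=0.1, n_epochs=50, lr=0.1, seed=0):
    """Toy dictionary learning: alternate ISTA sparse coding with a gradient
    step on D, projecting columns back to unit norm after each update."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    D = rng.normal(size=(d, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)          # start with unit-norm columns
    for _ in range(n_epochs):
        # Sparse-code every datapoint against the current dictionary.
        Z = np.stack([ista(D, x, lam=lam) for x in X])     # (n, n_atoms)
        # Gradient step on the average reconstruction loss w.r.t. D.
        residual = X - Z @ D.T                              # (n, d)
        D += lr * (residual.T @ Z) / n
        # Re-normalize so the sparsity penalty cannot be 'cheated' by scaling D up.
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    return D
```

For example, calling `learn_dictionary(patches, n_atoms=64)` on an `(n, d)` array of flattened image patches would return a `(d, 64)` dictionary whose columns play the role of the ‘pen strokes’ above.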