Description

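Latent Dirichlet Allocation (LDA) is a generative statistical model that explains sets of observations through unobserved (latent) groups, where shared group membership accounts for why some parts of the data are similar. In text modeling, the observations are the words in documents and the latent groups are topics.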

How it Works

  1. LDA represents documents as mixtures of topics.
  2. Each topic is modeled as a distribution over words.
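  3. LDA assumes that each document has its own random mixture of latent topics, drawn from a Dirichlet distribution.
  4. The words in each document are assumed to be produced by a two-level generative process: first a topic mixture is drawn for the document, then for each word a topic is drawn from that mixture and the word itself is drawn from that topic's distribution over words.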
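
The generative process in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an inference algorithm or any library's API. The names `sample_dirichlet` and `generate_corpus`, the symmetric concentration parameters `alpha` and `beta`, and the use of integer word ids are all hypothetical choices made for the example.

```python
import random


def sample_dirichlet(rng, concentration, size):
    """Draw one point from a symmetric Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [x / total for x in draws]


def generate_corpus(n_docs, doc_len, n_topics, vocab_size,
                    alpha=0.5, beta=0.5, seed=0):
    """Sample a toy corpus from the LDA generative story.

    alpha and beta are assumed symmetric Dirichlet concentrations for the
    document-topic and topic-word distributions respectively.
    """
    rng = random.Random(seed)

    # Each topic is a distribution over the vocabulary.
    topics = [sample_dirichlet(rng, beta, vocab_size) for _ in range(n_topics)]

    mixtures, corpus = [], []
    for _ in range(n_docs):
        # Level 1: draw this document's mixture over topics.
        theta = sample_dirichlet(rng, alpha, n_topics)
        words = []
        for _ in range(doc_len):
            # Level 2: draw a topic for the word, then the word from that topic.
            z = rng.choices(range(n_topics), weights=theta)[0]
            w = rng.choices(range(vocab_size), weights=topics[z])[0]
            words.append(w)
        mixtures.append(theta)
        corpus.append(words)
    return topics, mixtures, corpus


if __name__ == "__main__":
    topics, mixtures, corpus = generate_corpus(
        n_docs=3, doc_len=10, n_topics=2, vocab_size=8
    )
    for theta, words in zip(mixtures, corpus):
        print([round(p, 2) for p in theta], words)
```

Fitting LDA to real text runs this story in reverse: given only the words, it estimates the topic mixtures and the topic-word distributions that most plausibly produced them.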

Benefits

Limitations

Features