Skip-Gram: The Skip-Gram model predicts the context words (surrounding words) given a target word (center word). The input to the Skip-Gram model is the target word, and the outputs are the words that appear in the context of the target word.

For example, consider the sentence “The cat sat on the mat.” If “sat” is the target word, the Skip-Gram model will use “sat” to predict the context words “The”, “cat”, “on”, “the”, “mat”.
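One way to see what Skip-Gram trains on is to enumerate its (target, context) training pairs for that sentence. The sketch below is illustrative: the function name `skipgram_pairs` and the `window` parameter are my own, and the window is set wide enough to cover the whole sentence, matching the example above.

```python
def skipgram_pairs(tokens, window=5):
    """Generate (target, context) training pairs for Skip-Gram."""
    pairs = []
    for i, target in enumerate(tokens):
        # Every word within `window` positions of the target is a context word
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

tokens = "The cat sat on the mat".split()
# For the target "sat" (index 2), the pairs include
# ("sat", "The"), ("sat", "cat"), ("sat", "on"), ("sat", "the"), ("sat", "mat")
sat_pairs = [p for p in skipgram_pairs(tokens) if p[0] == "sat"]
```

Each pair is a separate training example, which is why Skip-Gram sees many more examples per sentence than CBOW does.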

Continuous Bag of Words (CBOW): The CBOW model, on the other hand, predicts the target word from its context. It takes the context words as input and tries to predict the target word.

Using the same sentence “The cat sat on the mat”, the CBOW model will take “The”, “cat”, “on”, “the”, “mat” as input and try to predict the target word “sat”.
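The CBOW direction can be sketched the same way, as (context, target) pairs. Again the function name `cbow_pairs` and the `window` parameter are illustrative, with the window spanning the whole sentence as in the example above.

```python
def cbow_pairs(tokens, window=5):
    """Generate (context, target) training pairs for CBOW."""
    pairs = []
    for i, target in enumerate(tokens):
        # All surrounding words within the window form one combined input
        context = [tokens[j]
                   for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                   if j != i]
        pairs.append((context, target))
    return pairs

tokens = "The cat sat on the mat".split()
# For the target "sat", the single training example is
# (["The", "cat", "on", "the", "mat"], "sat")
example = cbow_pairs(tokens)[2]
```

Note that the whole context produces just one training example per target word, whereas Skip-Gram produces one example per (target, context-word) pair.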

Differences:

- Skip-Gram treats each (target, context word) pair as a separate training example. This gives rare words more training signal, so it tends to learn better representations for them and works well on smaller datasets, but it is slower to train.
- CBOW combines all context words into a single prediction of the target, so it trains faster and scales well to large datasets, but its averaging of the context tends to smooth over rare words.

In summary, the choice between Skip-Gram and CBOW depends on your specific application and the amount of data you have. If you have a small dataset or care a lot about the representation of rare words, Skip-Gram might be your best bet. If you have a large dataset and speed is a concern, CBOW could be more suitable.