

What is a word vector? At one level, it’s simply a vector of weights. In a simple 1-of-N (or ‘one-hot’) encoding, every element in the vector is associated with a word in the vocabulary. The encoding of a given word is simply the vector in which the corresponding element is set to one, and all other elements are zero.
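To make the idea concrete, here is a minimal sketch of a 1-of-N encoding; the four-word vocabulary is purely illustrative and not from the papers.

```python
# A minimal sketch of 1-of-N ('one-hot') encoding over a toy vocabulary.
import numpy as np

vocabulary = ["king", "queen", "man", "woman"]  # hypothetical toy vocabulary
word_to_index = {word: i for i, word in enumerate(vocabulary)}

def one_hot(word):
    """Return the 1-of-N vector for `word`: a single 1, zeros everywhere else."""
    vec = np.zeros(len(vocabulary))
    vec[word_to_index[word]] = 1.0
    return vec

print(one_hot("queen"))  # [0. 1. 0. 0.]
```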

The third paper (‘Linguistic Regularities…’) describes vector-oriented reasoning based on word vectors and introduces the famous “King – Man + Woman = Queen” example. The last two papers give a more detailed explanation of some of the very concisely expressed ideas in the Mikolov papers. Check out the word2vec implementation on Google Code.
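If you want to reproduce the “King – Man + Woman = Queen” arithmetic yourself, one option is gensim’s `KeyedVectors` interface rather than the original C implementation; the sketch below assumes gensim is installed and that you have a set of pre-trained vectors in word2vec format (the file name here is just a placeholder).

```python
# A sketch of the "King - Man + Woman ~= Queen" vector arithmetic using gensim.
from gensim.models import KeyedVectors

# Hypothetical path: any word2vec-format vectors will do.
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# most_similar adds the 'positive' vectors, subtracts the 'negative' ones,
# and returns the nearest words to the resulting vector.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# e.g. [('queen', 0.71...)]
```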

From the second paper we get more illustrations of the power of word vectors, some additional information on optimisations for the skip-gram model (hierarchical softmax and negative sampling), and a discussion of applying word vectors to phrases.
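As a rough illustration of how those options surface in practice, here is a small training sketch using gensim (parameter names assume gensim 4.x; the two-sentence corpus is purely illustrative).

```python
# A minimal sketch of training skip-gram word vectors with gensim.
from gensim.models import Word2Vec

sentences = [["the", "king", "rules", "the", "land"],
             ["the", "queen", "rules", "the", "land"]]  # hypothetical toy corpus

# sg=1 selects the skip-gram model; negative=5 enables negative sampling,
# while hs=1 (with negative=0) would switch to hierarchical softmax instead.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1,
                 sg=1, hs=0, negative=5)

print(model.wv["queen"][:5])  # first few dimensions of the learned vector
```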
