Topic 1 Word Embeddings and Sentence Embeddings
cs224n-2019
- lecture 1: Introduction and Word Vectors
- lecture 2: Word Vectors 2 and Word Senses
slp - chapter 6: Vector Semantics
ruder.io/word-embeddings - chapter 14: The Representation of Sentence Meaning
Language is the medium through which information and knowledge are transmitted;
the precondition for effective communication is that both parties share equivalent knowledge.
How to represent the meaning of a word?
meaning: signifier (symbol) <=> signified (idea or thing)
common solution: WordNet, a thesaurus containing lists of synonym sets and hypernyms (synonyms and more general "is-a" terms).
Drawbacks: it misses new meanings of words, and it can't compute accurate word similarity.
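A quick way to see both the synsets and the drawbacks is NLTK's WordNet interface; a minimal sketch, assuming `nltk` is installed and the `wordnet` corpus has been fetched via `nltk.download('wordnet')`:

```python
from nltk.corpus import wordnet as wn

# Synonym sets ("synsets") for "good": each synset is one fixed sense.
for synset in wn.synsets("good")[:5]:
    print(synset.name(), synset.lemma_names())

# Hypernyms (more general terms) of the first sense of "panda".
panda = wn.synsets("panda")[0]
print(panda.hypernyms())
```

The fixed sense inventory is exactly the weakness noted above: a new usage of "good" never enters the lists, and overlap between synsets gives only a coarse notion of similarity.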
another solution: representing words as discrete symbols (one-hot vectors), but this suffers from the curse of dimensionality and offers no natural notion of similarity, as the sketch below shows:
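A minimal sketch of the orthogonality problem (the toy vocabulary is made up for illustration; real vocabularies run to hundreds of thousands of words, which is the dimensionality part of the problem):

```python
import numpy as np

vocab = ["motel", "hotel", "king", "queen"]          # toy vocabulary
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word: str) -> np.ndarray:
    """One 1 at the word's index, 0 everywhere else."""
    v = np.zeros(len(vocab))
    v[index[word]] = 1.0
    return v

# "motel" and "hotel" are near-synonyms, yet their one-hot vectors are
# orthogonal: the dot product of any two distinct words is always 0.
print(one_hot("motel") @ one_hot("hotel"))  # 0.0
```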
Representing words by their context
A better representation should encode similarity in the vectors themselves.
The goal of word-vector encoding is to capture word similarity: all optimization objectives and all practical uses revolve around similarity. As with any encoder, be clear about what the encoding target is!
Distributional semantics: A word’s meaning is given by the words that frequently appear close-by.
You shall know a word by the company it keeps. (J. R. Firth, 1957)
Word vectors/word embeddings: a dense vector for each word, chosen so that it is similar to vectors of words that appear in similar contexts.
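To make that concrete, a small sketch of comparing dense vectors with cosine similarity; the 4-dimensional values below are invented for illustration (real embeddings are typically 100 to 300 dimensions):

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity: near 1.0 for similar directions, 0.0 for orthogonal."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical dense embeddings (values made up for illustration).
hotel = np.array([0.8, 0.1, 0.6, 0.2])
motel = np.array([0.7, 0.2, 0.5, 0.1])
cat   = np.array([0.1, 0.9, 0.0, 0.7])

print(cosine(hotel, motel))  # high: used in similar contexts
print(cosine(hotel, cat))    # lower: contexts rarely overlap
```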
Word2vec: Overview
Word2vec (Mikolov et al. 2013) is a framework for learning word vectors. Main idea:
- We have a large corpus of text
- Every word in a fixed vocabulary is represented by a vector
- Go through each position $t$ in the text, which has a center word $c$ and context ("outside") words $o$
- Use the similarity of the word vectors for $c$ and $o$ to calculate the probability of $o$ given $c$ (or vice versa); see the sketch after this list
- Keep adjusting the word vectors to maximize this probability
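The probability in the fourth bullet is skip-gram's softmax over dot products, $P(o \mid c) = \frac{\exp(u_o^\top v_c)}{\sum_{w \in V} \exp(u_w^\top v_c)}$, where each word gets a center vector $v$ and a context vector $u$. A minimal sketch with tiny random matrices (the sizes and names are illustrative, not Word2vec's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 5, 4            # toy sizes, for illustration only

# Two vector tables, as in word2vec: V for center words, U for context words.
V = rng.normal(scale=0.1, size=(vocab_size, dim))  # rows are v_c
U = rng.normal(scale=0.1, size=(vocab_size, dim))  # rows are u_o

def prob_context_given_center(o: int, c: int) -> float:
    """Skip-gram softmax: P(o | c) = exp(u_o . v_c) / sum_w exp(u_w . v_c)."""
    scores = U @ V[c]                           # dot product with every context vector
    exp_scores = np.exp(scores - scores.max())  # subtract max for numerical stability
    return float(exp_scores[o] / exp_scores.sum())

print(prob_context_given_center(o=2, c=0))
```

Training then adjusts $U$ and $V$ by gradient descent to maximize this probability over all observed (center, context) pairs.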
Example windows and process for computing $P(w_{t+j} \mid w_t)$:
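A small sketch of that process: slide a window of radius $m$ over a sentence and emit the $P(w_{t+j} \mid w_t)$ terms to compute (the sentence follows the lecture's example; the radius is illustrative):

```python
# Sliding a window of radius m over a sentence to produce the
# (center word w_t, context word w_{t+j}) pairs skip-gram trains on.
sentence = "problems turning into banking crises as".split()
m = 2  # window radius, illustrative

for t, center in enumerate(sentence):
    for j in range(-m, m + 1):
        if j != 0 and 0 <= t + j < len(sentence):
            print(f"P({sentence[t + j]} | {center})")
```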