Topic 1 Word Embeddings and Sentence Embeddings
cs224n-2019
- lecture 1: Introduction and Word Vectors
- lecture 2: Word Vectors 2 and Word Senses
slp - chapter 6: Vector Semantics
ruder.io/word-embeddings - chapter 14: The Representation of Sentence Meaning
Language is the medium through which information and knowledge are transmitted;
the precondition for effective communication is that both parties share equivalent knowledge.
How to represent the meaning of a word?
meaning: signifier (symbol) <=> signified (idea or thing)
common solution: WordNet, a thesaurus containing lists of synonym sets and hypernyms (synonyms and more general "is-a" terms).
Drawbacks: it misses new meanings of words, and it can't compute accurate word similarity.
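A quick way to see both the synsets and the drawbacks is NLTK's WordNet interface; a minimal sketch, assuming `nltk` is installed and the `wordnet` corpus has been fetched via `nltk.download('wordnet')`:

```python
from nltk.corpus import wordnet as wn

# Synonym sets ("synsets") for "good": each synset is one fixed sense.
for synset in wn.synsets("good")[:5]:
    print(synset.name(), synset.lemma_names())

# Hypernyms (more general terms) of the first sense of "panda".
panda = wn.synsets("panda")[0]
print(panda.hypernyms())
```

The fixed sense inventory is exactly the weakness noted above: a new usage of "good" never enters the lists, and overlap between synsets gives only a coarse notion of similarity.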
another solution: representing words as discrete symbols (one-hot vectors), but this suffers from the curse of dimensionality and offers no natural notion of similarity, as the sketch below shows:
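A minimal sketch of the orthogonality problem (the toy vocabulary is made up for illustration; real vocabularies run to hundreds of thousands of words, which is the dimensionality part of the problem):

```python
import numpy as np

vocab = ["motel", "hotel", "king", "queen"]          # toy vocabulary
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word: str) -> np.ndarray:
    """One 1 at the word's index, 0 everywhere else."""
    v = np.zeros(len(vocab))
    v[index[word]] = 1.0
    return v

# "motel" and "hotel" are near-synonyms, yet their one-hot vectors are
# orthogonal: the dot product of any two distinct words is always 0.
print(one_hot("motel") @ one_hot("hotel"))  # 0.0
```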
Representing words by their context
A better representation should encode similarity in the vectors themselves.
The goal of word-vector encoding is to capture word similarity: all optimization objectives and all practical uses revolve around similarity. As with any encoder, be clear about what the encoding target is!
Distributional semantics: A word’s meaning is given by the words that frequently appear close-by.
You shall know a word by the company it keeps. (J. R. Firth, 1957)
Word vectors/word embeddings: a dense vector for each word, chosen so that it is similar to vectors of words that appear in similar contexts.
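To make that concrete, a small sketch of comparing dense vectors with cosine similarity; the 4-dimensional values below are invented for illustration (real embeddings are typically 100 to 300 dimensions):

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity: near 1.0 for similar directions, 0.0 for orthogonal."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical dense embeddings (values made up for illustration).
hotel = np.array([0.8, 0.1, 0.6, 0.2])
motel = np.array([0.7, 0.2, 0.5, 0.1])
cat   = np.array([0.1, 0.9, 0.0, 0.7])

print(cosine(hotel, motel))  # high: used in similar contexts
print(cosine(hotel, cat))    # lower: contexts rarely overlap
```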
Word2vec: Overview
Word2vec (Mikolov et al. 2013) is a framework for learning word vectors. Main idea:
- We have a large corpus of text
- Every word in a fixed vocabulary is represented by a vector
- Go through each position $t$ in the text, which has a center word $c$ and context ("outside") words $o$
- Use the similarity of the word vectors for $c$ and $o$ to calculate the probability of $o$ given $c$ (or vice versa); see the sketch after this list
- Keep adjusting the word vectors to maximize this probability
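The probability in the fourth bullet is skip-gram's softmax over dot products, $P(o \mid c) = \frac{\exp(u_o^\top v_c)}{\sum_{w \in V} \exp(u_w^\top v_c)}$, where each word gets a center vector $v$ and a context vector $u$. A minimal sketch with tiny random matrices (the sizes and names are illustrative, not Word2vec's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 5, 4            # toy sizes, for illustration only

# Two vector tables, as in word2vec: V for center words, U for context words.
V = rng.normal(scale=0.1, size=(vocab_size, dim))  # rows are v_c
U = rng.normal(scale=0.1, size=(vocab_size, dim))  # rows are u_o

def prob_context_given_center(o: int, c: int) -> float:
    """Skip-gram softmax: P(o | c) = exp(u_o . v_c) / sum_w exp(u_w . v_c)."""
    scores = U @ V[c]                           # dot product with every context vector
    exp_scores = np.exp(scores - scores.max())  # subtract max for numerical stability
    return float(exp_scores[o] / exp_scores.sum())

print(prob_context_given_center(o=2, c=0))
```

Training then adjusts $U$ and $V$ by gradient descent to maximize this probability over all observed (center, context) pairs.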
Example windows and process for computing $P(w_{t+j} \mid w_t)$:
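A small sketch of that process: slide a window of radius $m$ over a sentence and emit the $P(w_{t+j} \mid w_t)$ terms to compute (the sentence follows the lecture's example; the radius is illustrative):

```python
# Sliding a window of radius m over a sentence to produce the
# (center word w_t, context word w_{t+j}) pairs skip-gram trains on.
sentence = "problems turning into banking crises as".split()
m = 2  # window radius, illustrative

for t, center in enumerate(sentence):
    for j in range(-m, m + 1):
        if j != 0 and 0 <= t + j < len(sentence):
            print(f"P({sentence[t + j]} | {center})")
```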