[Paper Notes] Enhancing Pre-Trained Language Representations with Rich Knowledge for MRC

Original post

changreal

2020-02-21 05:55

KT-NET: Knowledge and Text fusion NET

KBs: WordNet + NELL; distributed representations of KBs (KB embeddings).

WordNet: records lexical relations, e.g. (organism, hypernym of, animal).

NELL: stores beliefs about entities, e.g. (Coca Cola, headquartered in, Atlanta).

Datasets: ReCoRD, SQuAD1.1

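How it differs from other models that use extra knowledge (e.g. Kn-Reader)

KT-NET first learns embeddings of the KB concepts, then retrieves the relevant learned KB embeddings and integrates them into the MRC system (that is, the structured knowledge and the context are fused together). The relevant KB knowledge used this way is global, which is more useful for an MRC system.

Earlier KB-based models first retrieve the relevant KB entries, then encode them and integrate them into the MRC system, so the relevant KB knowledge they use is only local.

Evaluation metrics: EM, F1, and the EM+F1 score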
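To make the metrics concrete, here is a minimal sketch of SQuAD-style EM and token-level F1 for a single prediction. The normalization steps (lowercasing, dropping punctuation and English articles) follow the usual SQuAD convention and are an assumption here; the notes do not say exactly how the paper's evaluation script normalizes answers. The function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


em = exact_match("the Coca Cola company", "Coca Cola")  # 0.0
f1 = f1_score("the Coca Cola company", "Coca Cola")     # about 0.8
combined = em + f1                                      # the EM+F1 score
```

Over a dataset, each metric is averaged across questions (taking the best score over the gold answers when there are several), and EM+F1 is just the sum of the two averages.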

The knowledge-based models and papers related to this work are all worth a read.

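Contributions

1. Pre-trained LMs + knowledge: a potential direction for future research, enhancing advanced LMs with knowledge from KBs.

2. Designed KT-NET for MRC.

Effect of BERT with KBs

Taken from ReCoRD (2018): after knowledge from WordNet and NELL is introduced, CST accuracy improves.

Real-world entities, synsets, concepts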

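The KT-NET model

Model overview

① First learn embeddings for the 2 KBs;

② Retrieve the potentially relevant KB embeddings;

③ Encode, fusing the selected embeddings with BERT's hidden states;

④ Make context- and knowledge-aware predictions.

To encode knowledge, knowledge graph embedding techniques are used to learn vector representations of the KB concepts.

Given P and Q, a set of relevant KB concepts C(w) is retrieved for every token w (w ∈ P ∪ Q), where each concept c ∈ C(w) has a learned vector embedding c. This yields the pre-trained KB embeddings, which are then fed into the 4 major components.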

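Then, one layer after another:

BERT encoding layer: computes deep, context-aware representations of the question and the passage;
Knowledge integration layer: makes the representations not only context-aware but also knowledge-aware. An attention mechanism selects the most relevant KB embeddings from the KB memory, which are then integrated with the BERT-encoded representations;
Self-matching layer: fuses the BERT and KB representations to enable richer interactions between them;
Output layer: makes knowledge-aware predictions.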
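The knowledge integration step can be sketched for a single token as follows. This is only an illustration under simplifying assumptions: plain dot-product attention stands in for whatever scoring function the paper actually uses, the token vector and the KB embeddings are assumed to share one dimension, and every token is assumed to have at least one candidate concept. The names `integrate_knowledge`, `h`, and `concept_embs` are made up for this sketch.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def integrate_knowledge(h, concept_embs):
    """Attend over a token's candidate KB embeddings and append the summary to h.

    h            : BERT representation of one token (list of floats)
    concept_embs : embeddings of the token's candidate concepts C(w)
    """
    # Assumption: dot-product relevance between the token and each concept.
    weights = softmax([dot(h, c) for c in concept_embs])
    dim = len(concept_embs[0])
    # Attention-weighted sum of the concept embeddings.
    k = [sum(w * c[d] for w, c in zip(weights, concept_embs)) for d in range(dim)]
    # List concatenation: the knowledge-aware representation [h; k].
    return h + k, weights


h = [1.0, 0.0]
concepts = [[2.0, 0.0], [0.0, 2.0]]
u, attn = integrate_knowledge(h, concepts)
```

Here the first concept points in the same direction as the token vector, so it receives the larger attention weight and dominates the knowledge summary.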

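Details

Both KBs store knowledge as triples: (subject, relation, object).

Knowledge embedding

Given a triple (s, r, o), learn vector embeddings of the subject s, relation r, and object o.

The BILINEAR model is then used to score the triple: $f(s, r, o) = \mathbf{s}^{\top} \mathrm{diag}(\mathbf{r})\, \mathbf{o}$.

Triples already in the KB should therefore get higher validity scores. A margin-based ranking loss is then used to learn the embeddings, which gives a vector representation for every entity in both KBs.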
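A small sketch of the bilinear score and a margin-based ranking loss, using made-up 3-dimensional embeddings. The embedding values, the margin of 1.0, and the choice of corrupting the object to build a negative triple are all assumptions for illustration, not values from the paper.

```python
def bilinear_score(s, r, o):
    """f(s, r, o) = s^T diag(r) o, i.e. the sum of element-wise products."""
    return sum(si * ri * oi for si, ri, oi in zip(s, r, o))


def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss: a true triple should outscore a corrupted one by `margin`."""
    return max(0.0, margin - pos_score + neg_score)


# Toy embeddings (hypothetical numbers, not learned values).
entity = {
    "Coca Cola": [0.9, 0.1, 0.4],
    "Atlanta": [0.8, 0.2, 0.5],
    "animal": [-0.7, 0.6, -0.3],
}
relation = {"headquartered in": [1.0, 0.5, 1.0]}

# A triple that is in the KB versus one with a corrupted object.
pos = bilinear_score(entity["Coca Cola"], relation["headquartered in"], entity["Atlanta"])
neg = bilinear_score(entity["Coca Cola"], relation["headquartered in"], entity["animal"])
loss = margin_ranking_loss(pos, neg)
```

With these numbers the true triple already beats the corrupted one by more than the margin, so the loss is zero; training would adjust the embeddings only for pairs where that is not yet the case.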

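Retrieval

WordNet: return the word's synsets as candidates;

NELL: first recognize the named entities in P and Q, link the recognized entities to NELL entities by string matching, then collect the related NELL concepts as candidates, which yields a set of potentially relevant concepts.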
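The two retrieval paths can be sketched with tiny in-memory lookups. Everything here is a stand-in: the dictionaries are hypothetical toy indexes, the NELL concept names are illustrative, and the sketch skips the named-entity recognition step and matches entity strings directly against the text.

```python
# Hypothetical toy indexes; real WordNet and NELL lookups are far larger.
WORDNET_SYNSETS = {
    "animal": ["animal.n.01"],
    "bank": ["bank.n.01", "bank.n.02"],
}
NELL_ENTITY_CONCEPTS = {
    "coca cola": ["concept:company", "concept:beverage"],
    "atlanta": ["concept:city"],
}


def retrieve_wordnet(token):
    """Return the synsets of a word as candidate concepts."""
    return WORDNET_SYNSETS.get(token.lower(), [])


def retrieve_nell(text):
    """Link entity mentions to NELL entities by plain string matching.

    Simplification: no NER step, just a case-insensitive substring test.
    """
    lowered = text.lower()
    candidates = []
    for name, concepts in NELL_ENTITY_CONCEPTS.items():
        if name in lowered:
            candidates.extend(concepts)
    return candidates


wordnet_candidates = retrieve_wordnet("Bank")
nell_candidates = retrieve_nell("Coca Cola is headquartered in Atlanta.")
```

The union of both candidate lists for a token plays the role of C(w); the attention step later decides which candidates actually matter.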

As in the figure: for each passage/question token, the 3 most relevant KB concepts are given (selected with attention).

The 4 components

Experiments

Preprocessing: use BERT's BasicTokenizer, use NLTK to look up synonyms (WordNet synsets), and use the FullTokenizer built into BERT to segment words into wordpieces.
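For readers unfamiliar with wordpieces, here is a toy version of greedy longest-match-first segmentation, the general idea behind wordpiece tokenization. It is not BERT's tokenizer itself: the vocabulary is invented, and the real FullTokenizer also handles casing, punctuation, and length limits.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split one word into the longest vocabulary pieces, left to right.

    Pieces that do not start the word carry a "##" prefix.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Invented vocabulary for illustration only.
toy_vocab = {"head", "##quarter", "##ed", "atlanta"}
pieces = wordpiece("headquartered", toy_vocab)
```

Since KB concepts are retrieved per word while BERT works on wordpieces, some mapping between the two granularities is needed; the notes do not spell out how the paper does this.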

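Consider all the words in the sentences (nouns, verbs, adjectives, adverbs). For each word s_i, take its final hidden-layer representation, then compute the cosine similarity between each question word s_i and each passage word s_j.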
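The similarity analysis boils down to a question-by-passage matrix of cosine similarities. A minimal sketch, with made-up 2-dimensional vectors standing in for the final hidden-layer representations:

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def similarity_matrix(question_vecs, passage_vecs):
    """Entry [i][j] is the cosine similarity of question word i and passage word j."""
    return [[cosine(q, p) for p in passage_vecs] for q in question_vecs]


# Hypothetical word representations, not real model outputs.
question_vecs = [[1.0, 0.0], [0.0, 1.0]]
passage_vecs = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
sims = similarity_matrix(question_vecs, passage_vecs)
```

Comparing such matrices before and after knowledge integration is what shows whether different question words attach to different passage words.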

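After fine-tuning on the MRC task, BERT learns similar representations for the question words. Once knowledge is integrated, however, different question words show different similarities to the words of a passage, and these similarities closely reflect the relations encoded in the KBs.

KT-NET can learn more accurate representations and thus achieves better question-passage matching.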

Techniques mentioned

Knowledge graph embedding techniques (Yang et al., 2015): used to encode knowledge and learn vector representations of KB concepts;

Element-wise multiplication;

Row-wise softmax;

BILINEAR model (Yang et al., 2015): measures validity with a bilinear function f(s, r, o) and uses a margin-based ranking loss to learn the embeddings.
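Row-wise softmax and element-wise multiplication from the list above typically show up together in a self-matching layer. The sketch below shows one common way to combine them; it assumes dot-product similarity between tokens and a simple [u; v; u * v] output, which may differ from the exact formulation in the paper.

```python
import math


def row_softmax(matrix):
    """Apply softmax independently to each row, so every row sums to 1."""
    result = []
    for row in matrix:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def self_match(U):
    """U is a list of token vectors; returns attention weights and fused vectors."""
    # Assumption: token-to-token similarity is a plain dot product.
    scores = [[sum(a * b for a, b in zip(ui, uj)) for uj in U] for ui in U]
    A = row_softmax(scores)
    dim = len(U[0])
    fused = []
    for ui, weights in zip(U, A):
        # Attended vector: weighted sum of all token vectors.
        vi = [sum(w * uj[d] for w, uj in zip(weights, U)) for d in range(dim)]
        # Element-wise multiplication of the token with its attended vector.
        product = [x * y for x, y in zip(ui, vi)]
        fused.append(ui + vi + product)  # concatenation [u; v; u * v]
    return A, fused


U = [[1.0, 0.0], [0.0, 1.0]]
A, fused = self_match(U)
```

Each fused vector is three times the input width, which is what the output layer would then consume in this simplified picture.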

Datasets that require external knowledge

ReCoRD: extractive MRC dataset

ARC, MCScript, OpenBookQA, CommonsenseQA: multi-choice MRC datasets

Structured knowledge from KBs: a series of papers (see the paper)

Some of the papers mentioned

(Bishan Yang and Tom Mitchell, 2017) Leveraging knowledge bases in LSTMs for improving machine reading;

(2018) Commonsense for generative multi-hop question answering tasks;

[Read] (2018) Knowledgeable reader: Enhancing cloze-style reading comprehension with external commonsense knowledge;

(2018, commonsense reasoning) Bridging the gap between human and machine commonsense reading comprehension
