[The formulas in the screenshots look ugly... I'll fix them when I have time...]
# Introduction to GCN
Generally divided into xxxxx

KGCN: Knowledge Graph Convolutional Networks for Recommender Systems

WWW 2019

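This paper uses both the structural and the semantic information of the KG to improve recommendation. Inspired by GCN, it proposes KGCN (Knowledge Graph Convolutional Networks). The core of KGCN is the same as in GCN: the information of node a's neighbors is propagated to node a (which reminds me of belief in probabilistic graphical models). This design has two advantages: 1) through aggregation, each entity captures its local proximity structure;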

Method

The paper aligns each item $v$ with an entity $e$ in the KG, so the two can be treated as the same thing below.

Propagation

$\pi_{r}^{u}=g(\mathbf{u}, \mathbf{r})$

$\tilde{\pi}_{r_{v, e}}^{u}=\frac{\exp \left(\pi_{r_{v, e}}^{u}\right)}{\sum_{e^{\prime} \in \mathcal{N}(v)} \exp \left(\pi_{r_{v, e^{\prime}}}^{u}\right)}$

$\mathbf{v}_{\mathcal{N}(v)}^{u}=\sum_{e \in \mathcal{N}(v)} \tilde{\pi}_{r_{v, e}}^{u} \mathbf{e}$

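$\pi_r^u$: importance of relation $r$ to user $u$; it acts as a personalized filter;
$\mathcal{N}(v)$: the set of entities directly connected to entity/item $v$;
$\mathbf{v}_{\mathcal{N}(v)}^{u}$: representation of the neighborhood of item $v$ (written $\mathbf{v}_{\mathcal{S}(v)}^{u}$ when it is computed over the sampled neighbors $\mathcal{S}(v)$)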
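
The three propagation formulas can be sketched in NumPy as follows. This is a minimal sketch, assuming that $g$ is an inner product (the notes leave $g$ unspecified); the function name and the array layout are made up for illustration.

```python
import numpy as np

def neighborhood_representation(user_vec, relation_vecs, neighbor_vecs):
    """Neighborhood representation of one item v as seen by one user u.

    user_vec:      (d,)   embedding of user u
    relation_vecs: (n, d) embedding of the relation r_{v,e} for each neighbor e
    neighbor_vecs: (n, d) embedding of each neighbor entity e
    """
    # pi_r^u = g(u, r); ASSUMPTION: g is an inner product.
    scores = relation_vecs @ user_vec
    # Softmax over the neighbors of v (shifted by the max for stability).
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # v_{N(v)}^u = sum over e of normalized score times e
    return weights @ neighbor_vecs, weights

rng = np.random.default_rng(0)
u = rng.normal(size=4)
rels = rng.normal(size=(3, 4))
nbrs = rng.normal(size=(3, 4))
v_nbr, w = neighborhood_representation(u, rels, nbrs)
print(w, v_nbr.shape)
```

Because the weights depend on $\mathbf{u}$, two users looking at the same item get different neighborhood vectors, which is the "personalized filter" effect.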

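Because the size of $N(e)$ varies a lot across entities $e$, the paper uniformly samples a fixed-size neighbor set $S(e)$ for each entity instead of using all of its neighbors, as shown below; $K$ sets the size of the single-layer receptive field.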

$\mathcal{S}(v) \triangleq\{e | e \sim \mathcal{N}(v)\} \text { and }|\mathcal{S}(v)|=K$
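
A possible way to draw such a fixed-size neighbor set, using only the standard library. The notes do not say what happens when an entity has fewer than $K$ neighbors; sampling with replacement in that case is an assumption of this sketch, and `sample_neighbors` is a hypothetical helper name.

```python
import random

def sample_neighbors(neighbors, k, rng=random):
    """Uniformly draw exactly k neighbors of one entity."""
    if not neighbors:
        raise ValueError("entity has no neighbors")
    if len(neighbors) >= k:
        # Enough neighbors: sample without replacement.
        return rng.sample(neighbors, k)
    # ASSUMPTION: too few neighbors, so sample with replacement.
    return [rng.choice(neighbors) for _ in range(k)]

rng = random.Random(0)
print(sample_neighbors([1, 2, 3, 4, 5], 3, rng))
print(sample_neighbors([7, 8], 4, rng))
```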

Aggregation

The paper proposes three ways to aggregate entity $v$ with its neighborhood $S(v)$:

Sum aggregator

$agg_{s u m}=\sigma\left(\mathbf{W} \cdot\left(\mathbf{v}+\mathbf{v}_{\mathcal{S}(v)}^{u}\right)+\mathbf{b}\right)$

Concat aggregator

$agg_{\text {concat }}=\sigma\left(\mathbf{W} \cdot \operatorname{concat}\left(\mathbf{v}, \mathbf{v}_{\mathcal{S}(v)}^{u}\right)+\mathbf{b}\right)$

Neighbor aggregator

$agg_{neighbor}=\sigma\left(\mathbf{W} \cdot \mathbf{v}_{\mathcal{S}(v)}^{u}+\mathbf{b}\right)$
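
The three aggregators differ only in how $\mathbf{v}$ and $\mathbf{v}_{\mathcal{S}(v)}^{u}$ are combined before the linear map. A minimal NumPy sketch, assuming $\sigma$ is ReLU (the notes only write $\sigma$); note that the concat variant needs a $d \times 2d$ weight matrix.

```python
import numpy as np

def sigma(x):
    # ASSUMPTION: ReLU stands in for the unspecified nonlinearity.
    return np.maximum(x, 0.0)

def agg_sum(v, v_nbr, W, b):
    # W has shape (d, d)
    return sigma(W @ (v + v_nbr) + b)

def agg_concat(v, v_nbr, W, b):
    # W has shape (d, 2d)
    return sigma(W @ np.concatenate([v, v_nbr]) + b)

def agg_neighbor(v, v_nbr, W, b):
    # Ignores v itself; W has shape (d, d)
    return sigma(W @ v_nbr + b)

v = np.array([1.0, -2.0])
v_nbr = np.array([1.0, 1.0])
I, b = np.eye(2), np.zeros(2)
print(agg_sum(v, v_nbr, I, b))
print(agg_concat(v, v_nbr, np.hstack([I, I]), b))
print(agg_neighbor(v, v_nbr, I, b))
```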

Prediction target

$\hat{y}_{u v}$ is the degree to which user $u$ will engage with (roughly: like) item $v$. $Y$ is the interaction history.
$\hat{y}_{u v}=\mathcal{F}(u, v | \Theta, \mathrm{Y}, \mathcal{G})$

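Loss function

The loss uses negative sampling. $J$ is the cross-entropy. The number of negative samples drawn for user $u$, $T^u$, is set by that user's number of historical interactions, $T^u=|\{v: y_{uv}=1\}|$. $P$ is the sampling distribution, which is uniform in the paper.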
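
The negative sampling step could look like the sketch below: for each user, draw $T^u$ items uniformly from the items that user has not interacted with. The data layout (a dict from user to a set of positive items) and the choice to sample with replacement are assumptions for illustration.

```python
import random

def sample_negatives(interactions, all_items, rng=random):
    """interactions: dict mapping user -> set of items with y_uv = 1."""
    negatives = {}
    for user, positives in interactions.items():
        # Candidate negatives: items with no recorded interaction.
        candidates = [v for v in all_items if v not in positives]
        t_u = len(positives)  # T^u = |{v : y_uv = 1}|
        # Uniform P; ASSUMPTION: drawn with replacement.
        negatives[user] = (
            [rng.choice(candidates) for _ in range(t_u)] if candidates else []
        )
    return negatives

inter = {"u1": {1, 2}, "u2": {3}}
neg = sample_negatives(inter, range(6), random.Random(0))
print(neg)
```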

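Pseudocode

Experiments

The method is evaluated on three datasets: MovieLens-20M (movie), Book-Crossing (book), and Last.FM (music). Their items are aligned with entities in the Microsoft Satori KG. Items that match several entities or no entity during alignment are dropped.

Effect of the neighbor sample size

Effect of the number of iterations

Effect of the embedding dimension

Other

Questions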

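Is this the first work to combine KG and GCN? Yes.
Why can this capture structural information? (I asked a classmate who works on graph models; the answer was that the model captures neighbor information, and that information is what is called structural information.)
With a KG, won't the adjacency matrix be huge? (The paper extracts a sub-KG for each user, so the adjacency matrix should not get that large.)
The paper says it samples a fixed number of neighbors; how is the sampling done? (Explained later in the paper.)
In the weight computation of the update step, how is the representation $\mathbf{u}$ obtained?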

tips

The paper cites several works that handle a varying number of neighbor nodes; they are listed in its related work section.

KGAT: Knowledge Graph Attention Network for Recommendation

KDD 2019, August 4–8, 2019: https://arxiv.org/pdf/1905.07854.pdf
github: https://github.com/xiangwang1223/knowledge_graph_attention_network

From Tat-Seng Chua's group, which has a long track record in recommendation. Their related papers include:

Explainable Reasoning over Knowledge Graphs for Recommendation. In AAAI2019.
Unifying Knowledge Graph Learning and Recommendation: Towards a Better Understanding of User Preferences. In WWW 2019

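The parts highlighted in yellow and gray are related in ways that KGAT can discover but traditional methods cannot capture.

Previous papers that use the CKG fall into two groups:
1) Path-based methods extract paths to train the model. This is effectively a two-stage pipeline, so the first-stage path extraction strongly affects final performance. Extracting paths is also labor-intensive.
2) Regularization-based methods add a KG-related term to the loss to capture the KG structure. They encode the KG only implicitly, so “neither the long-range connectivities are guaranteed to be captured, nor the results of high-order modeling are interpretable.”

Hence the paper proposes the Knowledge Graph Attention Network (KGAT), “a model that can exploit high-order information in KG in an efficient, explicit, and end-to-end manner.”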

Method

User-Item Bipartite Graph: the historical interactions are built into a bipartite graph $G_1$
KG: $G_2$
CKG: $G = G_1 + G_2$, obtained by matching entities with items to merge $G_1$ and $G_2$ into $G$
Embedding: trained on the CKG with TransR
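
For reference, TransR scores a triple $(h, r, t)$ by projecting head and tail into the relation space with $\mathbf{W}_r$ and measuring $\|\mathbf{W}_r \mathbf{e}_h + \mathbf{e}_r - \mathbf{W}_r \mathbf{e}_t\|_2^2$; a lower score means a more plausible triple. The sketch below shows only the scoring function, not the training loop, and the function name is a placeholder.

```python
import numpy as np

def transr_score(e_h, e_r, e_t, W_r):
    """Squared L2 distance in relation space; lower is more plausible.

    e_h, e_t: (d,) entity embeddings
    e_r:      (k,) relation embedding
    W_r:      (k, d) projection matrix of relation r
    """
    diff = W_r @ e_h + e_r - W_r @ e_t
    return float(diff @ diff)

I = np.eye(2)
good = transr_score(np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0]), I)
bad = transr_score(np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.0, 0.0]), I)
print(good, bad)
```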

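The whole GCN process splits into Information Propagation and Information Aggregation.

Information Propagation
The information that $h$ can obtain from its neighbor set $N_h$:

Knowledge-aware Attention is computed in two steps:
1) compute the embeddings with TransR

2) compute the attention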
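
The formulas for this step are missing from these notes. As far as I recall from the KGAT paper, the attention score is $\pi(h,r,t) = (\mathbf{W}_r \mathbf{e}_t)^\top \tanh(\mathbf{W}_r \mathbf{e}_h + \mathbf{e}_r)$, normalized with a softmax over the triples in $N_h$, and the propagated message is $\mathbf{e}_{N_h} = \sum_{(h,r,t) \in N_h} \pi(h,r,t)\, \mathbf{e}_t$. Treat those formulas as an assumption of the sketch below, and check them against the paper.

```python
import numpy as np

def knowledge_aware_attention(e_h, triples):
    """triples: list of (W_r, e_r, e_t), one per (h, r, t) in N_h.

    Returns the normalized attention weights and the message e_{N_h}.
    """
    # Unnormalized score: (W_r e_t)^T tanh(W_r e_h + e_r)
    scores = np.array(
        [(W_r @ e_t) @ np.tanh(W_r @ e_h + e_r) for W_r, e_r, e_t in triples]
    )
    # Softmax over all triples that share the head h.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    tails = np.stack([e_t for _, _, e_t in triples])
    return weights, weights @ tails

I = np.eye(2)
e_h = np.array([0.5, -0.5])
triples = [
    (I, np.zeros(2), np.array([1.0, 0.0])),
    (I, np.ones(2), np.array([0.0, 1.0])),
]
att, e_Nh = knowledge_aware_attention(e_h, triples)
print(att, e_Nh)
```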
Information Aggregation
Three aggregators:
- GCN Aggregator (differs from the Sum aggregator above in its activation function)
- GraphSage Aggregator (differs from the Concat aggregator above in its activation function)
- Bi-Interaction Aggregator
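
A NumPy sketch of the three aggregators as I recall them from the KGAT paper: all use LeakyReLU, and the Bi-Interaction variant adds an element-wise product term. The formulas are not given in these notes, so treat them (and the slope value 0.2) as assumptions.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # ASSUMPTION: slope 0.2 is an arbitrary illustrative value.
    return np.where(x > 0, x, slope * x)

def gcn_agg(e_h, e_Nh, W):
    # LeakyReLU(W (e_h + e_Nh)); W has shape (d', d)
    return leaky_relu(W @ (e_h + e_Nh))

def graphsage_agg(e_h, e_Nh, W):
    # LeakyReLU(W (e_h || e_Nh)); W has shape (d', 2d)
    return leaky_relu(W @ np.concatenate([e_h, e_Nh]))

def bi_interaction_agg(e_h, e_Nh, W1, W2):
    # Sum term plus element-wise product term, each with its own weights.
    return leaky_relu(W1 @ (e_h + e_Nh)) + leaky_relu(W2 @ (e_h * e_Nh))

e_h = np.array([1.0, -1.0])
e_Nh = np.array([1.0, -1.0])
I = np.eye(2)
print(gcn_agg(e_h, e_Nh, I))
print(graphsage_agg(e_h, e_Nh, np.hstack([I, I])))
print(bi_interaction_agg(e_h, e_Nh, I, I))
```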

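To extend the propagation above to multiple hops, the formulas simply gain a layer superscript $(l)$.

3. Prediction
The representations from all layers are concatenated into the final representation:

Prediction: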
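
The two missing formulas can be sketched as follows: concatenate the per-layer embeddings $\mathbf{e}^{(0)}, \dots, \mathbf{e}^{(L)}$ of the user and of the item, then score the pair. Using an inner product for the score is how I remember the KGAT paper; take it as an assumption here.

```python
import numpy as np

def final_representation(layer_embs):
    """Concatenate e^(0), ..., e^(L) into one vector."""
    return np.concatenate(layer_embs)

def predict(user_layers, item_layers):
    # ASSUMPTION: the matching score is an inner product.
    e_u = final_representation(user_layers)
    e_i = final_representation(item_layers)
    return float(e_u @ e_i)

user_layers = [np.array([1.0, 2.0]), np.array([3.0])]
item_layers = [np.array([1.0, 0.0]), np.array([2.0])]
score = predict(user_layers, item_layers)
print(score)
```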

4. Loss: BPR loss

Here $O = \{(u,i,j) \mid (u,i) \in R^{+},(u,j) \in R^{-}\}$, where $R^{-}$ contains the pairs $(u, j)$ for which the history has no interaction between user $u$ and item $j$.
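
The standard BPR loss sums $-\ln \sigma(\hat{y}(u,i) - \hat{y}(u,j))$ over the triples in $O$, which pushes observed items above unobserved ones. The sketch below computes only that term, in a numerically stable form; any regularization or KG embedding loss the paper adds on top is left out.

```python
import math

def bpr_loss(score_pairs):
    """score_pairs: iterable of (y_ui, y_uj) for each (u, i, j) in O."""
    total = 0.0
    for y_ui, y_uj in score_pairs:
        x = y_ui - y_uj
        # -ln(sigmoid(x)) = ln(1 + exp(-x)), written to avoid overflow.
        total += max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
    return total

tie = bpr_loss([(0.0, 0.0)])
right_order = bpr_loss([(5.0, 0.0)])
wrong_order = bpr_loss([(0.0, 5.0)])
print(tie, right_order, wrong_order)
```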

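(Negative sampling draws items the user has not interacted with, but that does not mean the user dislikes them. Presumably this is accepted because the item set is large...? And interactions are sparse anyway?)
So what kind of training would be reasonable? The original training scheme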

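Experimental setup & datasets

Recommendation datasets: Amazon-book, Last-FM, Yelp2018.
For Amazon-book and Last-FM, items are aligned with entities in Freebase (FB). Beyond the aligned entities, their 2-hop neighbors are also included.
For Yelp2018, information (e.g., category, location, and attribute) is extracted from the local business information network to serve as the KG.
To ensure quality, KG entities that occur fewer times than the threshold (10) are filtered out.
Simple hold-out validation: 80%, 10%, 10%, selected at random.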
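
A random 80/10/10 split as described in the last item could be done like this. It is a sketch on a flat list of interactions; whether the paper splits globally or per user is not stated in these notes, and the function name and seed are made up.

```python
import random

def split_interactions(interactions, percents=(80, 10, 10), seed=0):
    """Shuffle once, then cut into three consecutive parts."""
    items = list(interactions)
    random.Random(seed).shuffle(items)
    # Integer arithmetic avoids floating-point rounding at the cut points.
    n_first = len(items) * percents[0] // 100
    n_second = len(items) * percents[1] // 100
    return (
        items[:n_first],
        items[n_first:n_first + n_second],
        items[n_first + n_second:],
    )

part_a, part_b, part_c = split_interactions(range(100))
print(len(part_a), len(part_b), len(part_c))
```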

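Results

Overall results on the three datasets: KGAT outperforms the other methods
Effect of the recursion depth
Effect of the aggregator
Effect of attention: the first row drops the KG emb and propagates by averaging; the second drops KGE
Explainability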

RGCN: Modeling Relational Data with Graph Convolutional Networks

https://arxiv.org/pdf/1703.06103.pdf, 2018

Propagation+ Aggregation
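
The notes give no formula here. As I recall the R-GCN paper, one layer computes $h_i^{(l+1)} = \sigma\big(\sum_{r} \sum_{j \in N_i^r} \frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)} + W_0^{(l)} h_i^{(l)}\big)$, i.e. one weight matrix per relation plus a self-connection. The sketch below assumes $c_{i,r} = |N_i^r|$ and ReLU for $\sigma$, and it skips the basis decomposition the paper uses to limit parameters.

```python
import numpy as np

def rgcn_layer(H, edges, W_rel, W_self):
    """One relational graph convolution layer.

    H:      (n, d_in) float array of node features
    edges:  list of (src j, relation r, dst i)
    W_rel:  dict mapping r -> (d_out, d_in) matrix
    W_self: (d_out, d_in) matrix for the self-connection
    """
    out = H @ W_self.T  # W_0 h_i for every node
    # c_{i,r}: number of incoming edges of relation r at node i (assumed).
    counts = {}
    for _, r, i in edges:
        counts[(i, r)] = counts.get((i, r), 0) + 1
    for j, r, i in edges:
        out[i] += (W_rel[r] @ H[j]) / counts[(i, r)]
    return np.maximum(out, 0.0)  # ReLU

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edges = [(1, "a", 0), (2, "a", 0), (2, "b", 1)]
W_rel = {"a": np.eye(2), "b": 2.0 * np.eye(2)}
H_next = rgcn_layer(H, edges, W_rel, np.eye(2))
print(H_next)
```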

[To read] Heterogeneous Graph Attention Network

https://arxiv.org/pdf/1903.07293.pdf

[To read] GraphRec: Graph Neural Networks for Social Recommendation

https://arxiv.org/pdf/1902.07243.pdf

KGCN vs KGAT vs RGCN

KGCN:

Scenario: recommendation
A different graph structure is extracted for each user.
GCN-style; each user gets its own weights for the different relations.

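| | RGCN | KGCN | KGAT |
|---|---|---|---|
| Scenario | KGE | RS | RS |
| Idea | Different weights for different relations | Each user gets its own weights for the different relations; a sub-KG is extracted per user | User-item interactions and KG signals are put into a single graph |
| Attention | none | | |
| Propagation | | | |

RS: recommender systems
KGE: KG embedding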

[In progress...] KG & GCN / heterogeneous-graph GCN
