Metapath-guided Heterogeneous Graph Neural Network for Intent Recommendation

Fan S, Zhu J, Han X, et al. Metapath-guided Heterogeneous Graph Neural Network for Intent Recommendation[J]. 2019.
https://github.com/googlebaba/KDD2019-MEIRec

Abstract

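Unlike traditional query recommendation and item recommendation, intent recommendation automatically recommends user intents based on the user's historical behavior as soon as the user opens the app, without requiring any input.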

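We propose the Metapath-guided Embedding method for Intent Recommendation (MEIRec), a metapath-guided heterogeneous Graph Neural Network that learns the embeddings of objects in intent recommendation. To reduce the number of parameters, we propose a uniform term embedding mechanism.
Offline experiments show some performance improvement. In online experiments on the Taobao platform, CTR improves by 1.54%, and the model attracts 2.66% new users to search queries.
In an intent recommendation system, historical information falls roughly into two categories. The first is attribute data, such as user profiles and object attributes. The second is interaction data (triple interaction among users, items, and queries), such as user click (item) logs, user search (query) logs, and query guide (item) logs.
In this paper, we define intent recommendation as automatically recommending personalized intents based on a user's historical behavior. Intent recommendation differs from traditional query recommendation/suggestion in that: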

It recommends based on historical behavior rather than on previous queries
It does not require the user to type a partial query

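Intent recommendation as currently deployed in industry generally extracts handcrafted features and feeds them to a classifier, so it relies heavily on domain knowledge and manual feature engineering.
The Heterogeneous Information Network (HIN) contains three types of objects (users, items, and queries) and three types of links (user click item, user search query, and query guide item).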

PRELIMINARIES

DEFINITION 1. Intent Recommendation.

Given a set $<U, I, Q, W, A, B>$ , where $U = \{u_1, \dots ,u_p\}$ denotes the set of $p$ users, $I =\{i_1, \dots ,i_q\}$ denotes the set of $q$ items, $Q = \{q_1, \dots , q_r \}$ denotes the set of $r$ queries, $W = \{w_1, \dots ,w_n \}$ denotes the set of $n$ terms, $A$ denotes the attributes associated with objects, and $B$ denotes the interaction behaviors between different types of objects. In our application, a query $q\in Q$ or an item $i\in I$ consists of several terms $w\in W$ . The purpose of intent recommendation is to recommend the most related intent (i.e., query) $q\in Q$ to a user $u\in U$ .

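In this paper we only care about metapaths that start from users or queries. For example, the “User−Item−Query (UIQ)” metapath indicates that a user clicks items and these items are guided by some queries; “Query−User−Item (QUI)” indicates that a query is searched by some users and these users have recently clicked some items.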

DEFINITION 2. Metapath-guided Neighbors.

Given an object $o$ and a metapath $\rho$ , the metapath-guided neighbors are defined as the neighbors visited along the metapath. The $i$-th step neighbors of object $o$ are written $N^i_\rho (o)$ , and $N^0_\rho (o)$ is $o$ itself.
Taking Figure 2(a) as an example, given the metapath “User−Item−Query (UIQ)” and $u_2$ , we obtain the metapath-guided neighbors $N^0_{UIQ}(u_2)=\{u_2\}$ , $N^1_{UIQ}(u_2)=\{i_1,i_2\}$ , $N^2_{UIQ}(u_2)=\{q_1, q_2, q_3\}$ . All metapath-guided neighbors of $u_2$ are $N_{UIQ}(u_2) = N^0_{UIQ}(u_2) \cup N^1_{UIQ}(u_2) \cup N^2_{UIQ}(u_2)=\{u_2,i_1,i_2,q_1, q_2, q_3\}$ .
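The definition can be made concrete with a small sketch. The toy graph below only contains the edges of $u_2$ described in these notes (it clicks $i_1$ and $i_2$; $i_1$ is guided by $q_1$, and $i_2$ by $q_2$ and $q_3$). The `EDGES` layout and the function name `metapath_neighbors` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Set

# Hypothetical toy HIN: adjacency lists keyed by "<source type><target type>".
# Only the edges around u2 from the running example are included.
EDGES: Dict[str, Dict[str, List[str]]] = {
    "UI": {"u2": ["i1", "i2"]},
    "IQ": {"i1": ["q1"], "i2": ["q2", "q3"]},
}

def metapath_neighbors(start: str, metapath: str) -> List[Set[str]]:
    """Return [N^0, N^1, ...] for `start` along `metapath` (e.g. "UIQ")."""
    steps: List[Set[str]] = [{start}]
    # Walk the metapath one relation at a time: U->I, then I->Q.
    for src_type, dst_type in zip(metapath, metapath[1:]):
        adjacency = EDGES.get(src_type + dst_type, {})
        frontier: Set[str] = set()
        for node in steps[-1]:
            frontier.update(adjacency.get(node, []))
        steps.append(frontier)
    return steps

steps = metapath_neighbors("u2", "UIQ")
all_neighbors = set().union(*steps)  # union of the per-step neighbor sets
```

Here `steps[k]` plays the role of $N^k_{UIQ}(u_2)$ and `all_neighbors` is their union.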

THE MEIREC MODEL

Overview

The goal of MEIRec is to design a heterogeneous GNN for enriching the representations of users and queries. In addition, it uses term embeddings to reduce the number of parameters that must be learned.

Uniform Term Embedding

Queries and items are split into terms, and an operation function then aggregates the term embeddings into a single vector. In this paper we use the average function for this aggregation.
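A minimal sketch of this averaging step, assuming a tiny made-up term table with 2-dimensional vectors. The terms, the values, and the name `average_embedding` are illustrative, not taken from the paper or its code.

```python
from typing import Dict, List

# Hypothetical term embedding table; in the model these vectors are learned.
TERM_EMBEDDINGS: Dict[str, List[float]] = {
    "red": [1.0, 0.0],
    "dress": [0.0, 1.0],
    "summer": [1.0, 1.0],
}

def average_embedding(terms: List[str]) -> List[float]:
    """Average the embeddings of the given terms, dimension by dimension."""
    vectors = [TERM_EMBEDDINGS[t] for t in terms]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

# A query and an item title share the same term table.
query_embedding = average_embedding(["red", "dress"])
item_embedding = average_embedding(["red", "summer", "dress"])
```

Because queries and items are both built from the same table, no per-query or per-item vector has to be stored.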

Metapath-guided Heterogeneous Graph Neural Network

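We first show how neighbor information is aggregated along the metapath UIQ. We use the uniform term embedding to obtain the initial embeddings of queries. Following the network structure in Figure 2(a), we obtain the first-step neighbor set of $u_2$ , namely $N^1_{UIQ}(u_2)=\{i_1, i_2\}$ . For each node $i_k$ in it, we extract its query neighbors, which together form the second-step neighbor set $N^2_{UIQ}(u_2)=\{q_1, q_2, q_3\}$ . After obtaining the first- and second-step neighbor sets of $u_2$ , we aggregate the embeddings of the second-step neighbors to obtain the embeddings of the first-step neighbors. In this example, the embedding of $q_1$ is aggregated to obtain the embedding of item $i_1$ , and the embeddings of $q_2$ and $q_3$ are aggregated to obtain the embedding of $i_2$ . Finally, the embeddings of the first-step neighbors $\{i_1, i_2\}$ are aggregated to obtain the embedding $U^{UIQ}_2$ of user $u_2$ . In the same way, we can obtain embeddings of $u_2$ along other metapaths, such as $U^{UQI}_2$ . All metapath embeddings are then aggregated to obtain the final embedding of $u_2$ (i.e., $U_2$ ).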

User Modeling/Query Modeling

$I^{UIQ}_j = g(E_{q_1}, E_{q_2}, \dots)$

$U^{UIQ}_i = g(I^{UIQ}_1, I^{UIQ}_2, \dots)$

$U_i = g(U^{\rho_1}_i, U^{\rho_2}_i, \dots)$

Similarly: $Q_i = g(Q^{\rho_1}_i, Q^{\rho_2}_i, \dots)$
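The three aggregation levels can be traced on the $u_2$ example with a short sketch. It assumes that $g$ is an element-wise mean at every level, which is a simplification; the vectors in `E` and the placeholder `U_uqi` are made up.

```python
from typing import Dict, List

Vector = List[float]

def g(vectors: List[Vector]) -> Vector:
    """Aggregation function; an element-wise mean is assumed here."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# Hypothetical initial query embeddings E_q (obtained from term embeddings).
E: Dict[str, Vector] = {"q1": [1.0, 0.0], "q2": [0.0, 2.0], "q3": [2.0, 0.0]}
# Neighbor structure of u2 in the running example.
item_queries = {"i1": ["q1"], "i2": ["q2", "q3"]}
user_items = {"u2": ["i1", "i2"]}

# Step 2 -> step 1: each item embedding comes from its query neighbors.
I_uiq = {item: g([E[q] for q in queries]) for item, queries in item_queries.items()}
# Step 1 -> step 0: the user embedding comes from its item neighbors.
U_uiq = g([I_uiq[item] for item in user_items["u2"]])
# Fuse embeddings from several metapaths; the UQI vector is a placeholder.
U_uqi = [0.0, 0.0]
U_final = g([U_uiq, U_uqi])
```

Swapping `g` for a sequence model on the user side would change only the function body, not the order in which the levels are combined.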

Optimization Objective

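In the model, we predict the probability $\hat{y}_{ij}$ that user $u_i$ searches query $q_j$ . By aggregating the neighbors of the user and the query, we obtain the fused embedding $U_i$ for user $u_i$ and the fused embedding $Q_j$ for query $q_j$ . We feed the static features used by traditional methods into a Multi-Layer Perceptron (MLP) to obtain the static-feature embedding $S_{ij}$ . We then concatenate the user, query, and static-feature embeddings, and finally feed the fused embedding into MLP layers to obtain the prediction score $\hat{y}_{ij}$ . We have: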
$\hat{y}_{ij}= sigmoid(f(U_i \oplus Q_j \oplus S_{ij}))$

$f(\cdot)$ denotes an MLP with a single output, and $\oplus$ denotes the embedding concatenation operation. The loss function is a point-wise loss:
$J=-\sum_{(i,j)\in Y\cup Y^{-}}\left(y_{ij}\log(\hat{y}_{ij})+(1-y_{ij})\log(1-\hat{y}_{ij})\right)$

$Y$ and $Y^{-}$ denote the sets of positive and negative samples, respectively.
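The scoring and loss steps can be sketched as follows. As a simplification, a single linear layer stands in for the MLP $f(\cdot)$, and the loss is written as a quantity to minimize (binary cross-entropy). The function names and all numeric values are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def predict(u: Sequence[float], q: Sequence[float], s: Sequence[float],
            weights: Sequence[float], bias: float = 0.0) -> float:
    """Concatenate U_i, Q_j, S_ij and score the result.

    Assumption: one linear layer replaces the MLP f(.) of the model.
    """
    fused = list(u) + list(q) + list(s)  # embedding concatenation
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return sigmoid(logit)

def pointwise_loss(samples: List[Tuple[float, float]]) -> float:
    """Binary cross-entropy summed over (label, prediction) pairs."""
    eps = 1e-12  # guards against log(0)
    return -sum(y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
                for y, p in samples)

y_hat = predict([0.0, 0.0], [0.0], [0.0], weights=[1.0, 1.0, 1.0, 1.0])
loss = pointwise_loss([(1.0, 0.5), (0.0, 0.5)])  # one positive, one negative
```

With all-zero inputs the logit is zero, so the predicted probability sits exactly at the decision boundary.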

Model Analysis

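We analyze the parameter space of MEIRec. Because it adopts Uniform Term Embedding, embeddings are learned only for terms rather than for every user, item, and query, so the parameter space is far smaller than that of traditional methods.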
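A rough back-of-the-envelope comparison, using the sizes quoted elsewhere in these notes (280,000 terms, embedding dimension 64, and the HIN-level object counts). It counts embedding-table entries only and assumes the per-object alternative would use the same dimension, so it is an illustration rather than the paper's own analysis.

```python
# Sizes quoted in these notes; using them together here is an assumption.
NUM_TERMS = 280_000
NUM_USERS = 400_000_000
NUM_ITEMS = 400_000_000
NUM_QUERIES = 100_000_000
EMBEDDING_DIM = 64

# One vector per term versus one vector per user, item, and query.
term_table_params = NUM_TERMS * EMBEDDING_DIM
per_object_params = (NUM_USERS + NUM_ITEMS + NUM_QUERIES) * EMBEDDING_DIM
ratio = per_object_params / term_table_params
```

Under these assumptions the term table holds about 17.9 million values, against 57.6 billion for per-object embeddings.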

OFFLINE EXPERIMENTS

Dataset

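The data comes from the online Taobao mobile application on Android and iOS. For users, 42 static features are extracted, including gender, age, and purchasing power; for queries, 39 static features are extracted, including length, term size, and CTR. We collect 10 days of interaction data to build the HIN, which includes 100 million queries, 400 million users, and 400 million items. The HIN also includes 4 billion search relations between users and queries, 20 billion click relations between users and items, and 4 billion guide relations between items and queries.

Next we describe how the training and validation samples are built. Each raw interaction record in the collected dataset has the form <user, recommended query, timestamp, label>, meaning that the recommended query was shown to the user at the timestamp; the label indicates whether the user clicked the recommended query. To better understand the performance of the proposed model, we validate it at different data scales. In the offline experiments, we use training data from time windows of different lengths (from 1 to 5 days) to predict the following day. This yields three datasets of different scales, labeled 1-day, 3-day, and 5-day. To obtain more reliable results, we vary the size of each training set from 40% to 100%. Detailed statistics of the data are shown in Table 1. In addition, AliWS is used to segment the text of query and item titles into terms to build a term dictionary, from which 280,000 terms are selected.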

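Our dataset has the following distinctive characteristics:

The dataset is large, with millions of users and queries in both the training and validation sets;
About one half to three quarters of the users in the validation set are new users;
The density shown in Table 1 ((#interactions of users and queries)/(#users∗#queries)) is very low, i.e., the data is very sparse.

These characteristics pose great challenges for model design.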
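To give a feel for the density formula, the sketch below plugs in the HIN-level counts quoted in the dataset description (4 billion user-query search relations, 400 million users, 100 million queries). These are not the Table 1 values, which refer to the actual training and validation sets, so the result is only an order-of-magnitude illustration.

```python
def interaction_density(num_interactions: int, num_users: int, num_queries: int) -> float:
    """Observed user-query interactions divided by all possible user-query pairs."""
    return num_interactions / (num_users * num_queries)

# HIN-level counts, used here purely as an illustration.
density = interaction_density(4_000_000_000, 400_000_000, 100_000_000)
```

With these counts only about one in ten million possible user-query pairs is observed.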

Baselines and Evaluation Metrics

Compared models

LR: A linear model with static features.
DNN: With the same input setting as LR, we implement a deep neural network as a 3-layer MLP.
GBDT: A scalable tree-based model for feature learning and classification. We feed static features into GBDT.
LR/DNN/GBDT+DW: We feed the static features of users and queries, together with the pre-trained embeddings learned by DeepWalk (DW) from structural information, into the LR/DNN/GBDT model.
LR/DNN/GBDT+MP: We feed the static features of users and queries, together with the pre-trained embeddings learned by MetaPath2vec (MP) from structural information, into the LR/DNN/GBDT model.
NeuMF: A state-of-the-art neural network method for top-N recommendation. Here we feed it only the structural information (interactions between users and queries), since it cannot take static features.
MEIRec: Our model, with both static features and structural information as input.

AUC is used to evaluate the performance of the different models.
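For reference, AUC can be read as the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. The sketch below computes it by direct pair counting, which is quadratic and only meant for small illustrations; the labels and scores are made up.

```python
from typing import Sequence

def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Four shown queries: two clicked (label 1) and two not clicked (label 0).
example_auc = auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

In this example three of the four positive-negative pairs are ordered correctly.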

Detailed Implementation

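The proposed method is implemented in TensorFlow. For MEIRec, the dimension of the term embedding is set to 64. A single-layer LSTM with 64 hidden units models the user-query-sequence and the user-item-sequence, and a single-layer CNN aggregates neighbor information. For GBDT, the tree number is set to 200. For DeepWalk / MetaPath2vec, the embedding dimension is set to 32. For all methods, model parameters are randomly initialized from a Gaussian distribution, and the model is optimized with mini-batch Adam. The batch size is set to 512 and the learning rate to 0.001. All experiments are run on an Nvidia Tesla P100 cluster.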

Performance Evaluation

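Table 2 shows that MEIRec clearly outperforms the other models.
At the method level, GBDT > DNN > LR > NeuMF. NeuMF cannot learn embeddings for new users and queries: because new objects appear in the validation set, their embeddings remain random, which makes NeuMF perform worst. GBDT does not suffer from this problem and is therefore widely used in real systems. At the feature level, static features + heterogeneous embeddings > static features + homogeneous embeddings > static features alone, which suggests that fusing more information generally yields better performance. Heterogeneous network embeddings (i.e., MetaPath2vec) perform better than homogeneous network embeddings (i.e., DeepWalk), which suggests that the heterogeneity of objects in the HIN should be taken into account.

As the data scale grows, MEIRec's advantage over the baselines becomes more pronounced (increasing from 2.1% to 4.3%). This further suggests that MEIRec scales better to large datasets.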

Effect of Aggregation Methods

MEIRec variants with different aggregation functions:

$MEIRec_{stats}$ : It only uses the static features
$MEIRec_{ave}$ : Both structural information and static features are used. We use the AVE function (i.e., average operation on aggregated embeddings) to aggregate the neighbors of both users and queries in this model.
$MEIRec_{lstm}$ : It uses the structural information and static features. We use LSTM to aggregate the neighbors of users and use AVE to aggregate the neighbors of queries in this model.
MEIRec: It is the proposed model MEIRec

On the user side, the LSTM function captures time-sequence information in user behaviors, such as the user click item sequence and the user search query sequence.
On the query side, unordered functions (i.e., CNN or AVE) are good enough to aggregate the neighbor information of a query.

Effect of Different Metapaths

The four metapaths are UQI, QIQ, QUI, and UIQ, and they are added to the model in that order.

Effect of the Number of Neighbors

For query side, we set the number of neighbors as a fixed value 5, and for user side, we vary the number of neighbors from 3 to 10.
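A fixed neighbor count implies that longer neighbor lists are cut and shorter ones are filled up. The notes do not say how this is done, so the sketch below is only one plausible scheme: keep the most recent entries (assumed to be at the end of the list) and left-pad with a reserved id. `PAD_ID` and `fix_neighbor_count` are hypothetical names.

```python
from typing import List

PAD_ID = 0  # hypothetical id reserved for padding

def fix_neighbor_count(neighbors: List[int], k: int) -> List[int]:
    """Keep the last k neighbors and left-pad shorter lists with PAD_ID."""
    kept = neighbors[-k:] if k > 0 else []
    return [PAD_ID] * (k - len(kept)) + kept

long_list = fix_neighbor_count([1, 2, 3, 4, 5, 6, 7], 5)  # truncated to 5
short_list = fix_neighbor_count([1, 2], 5)                # padded to 5
```

Every list comes out with exactly `k` entries, which is what a fixed-width input layout needs.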

ONLINE EXPERIMENTS

UCTR=Unique Click/Unique Visitor

CONCLUSION

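This paper studies the intent recommendation problem, which plays an important role in increasing user activity and stickiness in mobile e-commerce. To address the challenges of intent recommendation, we model the objects and interactions in an intent recommendation system with a HIN and propose a novel metapath-guided GNN method for intent recommendation, called MEIRec. MEIRec uses metapath-guided neighbors to exploit the rich structural information in the HIN. In addition, MEIRec uses a uniform term embedding, which not only significantly reduces the parameter space but also makes the model applicable to newly arriving users and queries. Extensive offline and online experiments demonstrate the effectiveness of the proposed model.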

Code

Input

0:81: wide_feat_list, 42 static features of user + 39 static features of query
81:276: user_item_seq_feat, user click log, 195 = 15*13, 13 = 10 (item_terms) + 1 (item_topcate X) + 1 (item_leafcate X) + 1 (time_delta X) -- rnn --> user_item_term_lstm_output (user_word_embedding)
276:292: query_feat, 16 = 10 (query length) + 3 (query topcate length X) + 3 (query leafcate length X) -- mean --> query_w2v_sum (query_embedding)
292:462: user_query_seq_feat, user search log, handled in the same way as user_item_seq_feat (user_query_seq_embedding)
462:562: query_item_query_feat, term average of the queries associated with the items guided by the query -- cnn/avg --> (query_item_query_embedding)
562:662: user_query_item_feat, n*10, where 10 is the number of term ids of the item associated with each query; reduce mean, then aggregate with an rnn in query order (user_item_query_embedding)
662:812: user_item_query_feat, n*10, where 10 is the number of term ids of the query associated with each item; reduce mean, then aggregate with an rnn in item order (user_query_item_embedding)
812:: query_user_item_feat, term average of the items clicked by the users who clicked the query -- cnn/avg --> (query_user_item_embedding)
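The column layout above can be applied to one flat input row as in the sketch below. The column ranges are copied from the list; the helper names and the total row width of 912 are assumptions, since the last block is open-ended.

```python
from typing import Dict, List

# Column ranges copied from the input layout; the last block is open-ended.
FEATURE_SLICES: Dict[str, slice] = {
    "wide_feat_list": slice(0, 81),
    "user_item_seq_feat": slice(81, 276),
    "query_feat": slice(276, 292),
    "user_query_seq_feat": slice(292, 462),
    "query_item_query_feat": slice(462, 562),
    "user_query_item_feat": slice(562, 662),
    "user_item_query_feat": slice(662, 812),
    "query_user_item_feat": slice(812, None),
}

def split_features(row: List[float]) -> Dict[str, List[float]]:
    """Cut one flat input row into its named feature groups."""
    return {name: row[cols] for name, cols in FEATURE_SLICES.items()}

def reshape_sequence(flat: List[float], step_width: int) -> List[List[float]]:
    """Turn a flat sequence block into rows of step_width values, one per step."""
    return [flat[i:i + step_width] for i in range(0, len(flat), step_width)]

row = [float(i) for i in range(912)]  # 912 is a made-up total width
groups = split_features(row)
# The click log block holds 15 steps of 13 values each (195 = 15*13).
click_steps = reshape_sequence(groups["user_item_seq_feat"], 13)
```

Keeping the ranges in one table makes it easy to check that the blocks are contiguous and cover the whole row.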

Intermediate layers

wide_feat_list -- wide_full_connect --> wide_hidden_layer1 (64 dims)
embedding -- tf.concat -- tf.nn.dropout (64*7 dims) -- single fully connected layer --> qu_term_concat (64 dims)
[wide_feat_list, qu_term_concat] -- tf.concat (64*2 dims) -- two fully connected layers (128-64-1) --> global_res

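Loss and optimization

loss = cross-entropy + L2 regularization on the fully connected weights w

Optimizer: Adam, with clip_by_global_norm (gradient clipping by global norm), exponential_decay (learning-rate decay), and ExponentialMovingAverage (moving-average smoothing of model parameters)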
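The three optimizer add-ons can be written out in plain Python to show what each one computes. This is a sketch of the underlying formulas, not the repository's TensorFlow code; the base learning rate 0.001 comes from the implementation details, while the decay settings and gradient values are made up.

```python
import math
from typing import List

def clip_by_global_norm(grads: List[List[float]], clip_norm: float) -> List[List[float]]:
    """Rescale all gradients together so their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    scale = clip_norm / max(global_norm, clip_norm)  # 1.0 when already small enough
    return [[g * scale for g in grad] for grad in grads]

def exponential_decay(base_lr: float, step: int, decay_steps: int, decay_rate: float) -> float:
    """Learning rate after `step` updates: base_lr * decay_rate ** (step / decay_steps)."""
    return base_lr * decay_rate ** (step / decay_steps)

def ema_update(shadow: float, value: float, decay: float) -> float:
    """One moving-average step: decay * shadow + (1 - decay) * value."""
    return decay * shadow + (1.0 - decay) * value

clipped = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)  # global norm is 5
lr = exponential_decay(0.001, step=1000, decay_steps=1000, decay_rate=0.5)
shadow = ema_update(shadow=1.0, value=0.0, decay=0.9)
```

Clipping by the global norm keeps the direction of the overall update and only shrinks its length, unlike clipping each gradient separately.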