Adversarially Regularized Autoencoders
Kim Y, Zhang K, Rush A M, et al. Adversarially regularized autoencoders[J]. arXiv preprint arXiv:1706.04223, 2017.
GitHub: https://github.com/jakezhaojb/ARAE
adversarially regularized autoencoder (ARAE)
Abstract
Deep latent variable models (i.e., models such as VAEs and GANs that generate from a random seed variable) make it straightforward to generate continuous samples. Applying them to discrete structures such as text or discrete images, however, poses serious challenges. This paper proposes a flexible method for training deep latent variable models of discrete structures.
Background and Notation
Discrete Autoencoder
The idea is to encode a discrete sequence and then decode it, with a softmax producing the discrete output:
$$L_{rec}(\phi,\psi) = -\log p_{\psi}(x \mid enc_{\phi}(x))$$
$$\hat{x} = \arg\max_{x} p_{\psi}(x \mid enc_{\phi}(x))$$
The encoder and decoder are problem-specific; RNNs are a common choice for both.
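As a concrete illustration, below is a minimal PyTorch sketch of such a discrete sequence autoencoder, assuming GRU encoder/decoder networks; the class name `SeqAutoencoder` and all hyperparameters are illustrative assumptions, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)  # logits over the vocabulary

    def encode(self, x):
        # enc_phi(x): the final hidden state serves as the continuous code z
        _, h = self.encoder(self.embed(x))
        return h.squeeze(0)

    def decode(self, z, x_in):
        # Teacher-forced decoding conditioned on z; returns per-step logits
        out, _ = self.decoder(self.embed(x_in), z.unsqueeze(0))
        return self.out(out)

model = SeqAutoencoder(vocab_size=10000)
x = torch.randint(0, 10000, (32, 20))        # a toy batch of token sequences
z = model.encode(x)
logits = model.decode(z, x[:, :-1])          # predict each next token
# L_rec = -log p_psi(x | enc_phi(x)) as a token-level cross-entropy
loss_rec = nn.functional.cross_entropy(
    logits.reshape(-1, logits.size(-1)), x[:, 1:].reshape(-1))
x_hat = logits.argmax(-1)                    # greedy argmax decoding
```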
Generative Adversarial Networks
WGAN:
$$\min_{\theta}\max_{w\in\mathcal{W}} E_{z\sim P_r}[f_w(z)] - E_{\tilde{z}\sim P_z}[f_w(\tilde{z})]$$
with weight clipping $w \in [-\epsilon, \epsilon]$ to enforce the Lipschitz constraint on the critic $f_w$.
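A hedged sketch of one critic update under this objective follows; the function name `critic_step`, the toy critic architecture, and the clipping value are illustrative assumptions.

```python
import torch
import torch.nn as nn

def critic_step(f_w, opt_w, z_real, z_fake, eps=0.01):
    # Maximize E[f_w(z)] - E[f_w(z~)] by minimizing its negation
    loss = -(f_w(z_real).mean() - f_w(z_fake).mean())
    opt_w.zero_grad()
    loss.backward()
    opt_w.step()
    # Weight clipping keeps f_w (roughly) Lipschitz: w in [-eps, eps]
    with torch.no_grad():
        for p in f_w.parameters():
            p.clamp_(-eps, eps)

# Usage with a toy critic over 256-dimensional inputs
f_w = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 1))
opt_w = torch.optim.SGD(f_w.parameters(), lr=1e-4)
critic_step(f_w, opt_w, torch.randn(32, 256), torch.randn(32, 256))
```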
Adversarially Regularized Autoencoder
ARAE combines a discrete autoencoder with a GAN-regularized latent representation; the model (depicted in a figure in the paper) learns the discrete-space distribution $P_{\psi}$. Intuitively, this approach uses a more flexible prior distribution to provide a smoother discrete code space. The model consists of a discrete autoencoder regularized with a prior distribution:
$$\min_{\phi,\psi} L_{rec}(\phi,\psi) + \lambda^{(1)} W(P_Q, P_z)$$
where $W$ is the Wasserstein distance between $P_Q$, the distribution over the discrete code space (i.e., the distribution of $enc_{\phi}(x)$ after encoding $x$), and $P_z$. Training the model amounts to solving the following objectives:
(1) $$\min_{\phi,\psi} L_{rec}(\phi,\psi) = E_{x\sim P_r}[-\log p_{\psi}(x \mid enc_{\phi}(x))]$$
(2) $$\max_{w\in\mathcal{W}} L_{cri}(w) = E_{x\sim P_r}[f_w(enc_{\phi}(x))] - E_{\hat{z}\sim P_z}[f_w(\hat{z})]$$
(3) $$\min_{\phi} L_{enc}(\phi) = E_{x\sim P_r}[f_w(enc_{\phi}(x))] - E_{\hat{z}\sim P_z}[f_w(\hat{z})]$$
Objective (1) minimizes the encoder/decoder reconstruction error, (2) trains the critic, and (3) trains the encoder/generator adversarially.
Empirically, we find that the prior $P_z$ strongly affects the results. The simplest choice is a fixed Gaussian $\mathcal{N}(0, I)$, but such a rigid constraint easily causes the model to collapse. Instead of fixing $P_z$, a generator is trained to learn a mapping from the Gaussian $\mathcal{N}(0, I)$ to $P_z$.
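A minimal sketch of such a learned prior, assuming an MLP generator $g_{\theta}$; the 100-dimensional noise and 300-unit hidden layers are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

# g_theta: an MLP mapping noise s ~ N(0, I) to codes z_hat ~ P_z
g_theta = nn.Sequential(
    nn.Linear(100, 300), nn.ReLU(),
    nn.Linear(300, 300), nn.ReLU(),
    nn.Linear(300, 256),   # output dimension matches the code z
)
s = torch.randn(32, 100)   # s ~ N(0, I)
z_hat = g_theta(s)         # a batch of samples from the learned prior P_z
```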
Algorithm 1 ARAE Training
for each training iteration do
(1) Train the encoder/decoder for reconstruction $(\phi, \psi)$
Sample $\{x^{(i)}\}_{i=1}^m \sim P_r$ and compute $z^{(i)} = enc_{\phi}(x^{(i)})$
Backprop loss $L_{rec} = -\frac{1}{m}\sum_{i=1}^m \log p_{\psi}(x^{(i)} \mid z^{(i)})$
(2) Train the critic $(w)$
Sample $\{x^{(i)}\}_{i=1}^m \sim P_r$ and $\{s^{(i)}\}_{i=1}^m \sim \mathcal{N}(0, I)$
Compute $z^{(i)} = enc_{\phi}(x^{(i)})$ and $\hat{z}^{(i)} = g_{\theta}(s^{(i)})$
Backprop loss $-\frac{1}{m}\sum_{i=1}^m f_w(z^{(i)}) + \frac{1}{m}\sum_{i=1}^m f_w(\hat{z}^{(i)})$
Clip critic $w$ to $[-\epsilon, \epsilon]$
(3) Train the encoder/generator adversarially $(\phi, \theta)$
Sample $\{x^{(i)}\}_{i=1}^m \sim P_r$ and $\{s^{(i)}\}_{i=1}^m \sim \mathcal{N}(0, I)$
Compute $z^{(i)} = enc_{\phi}(x^{(i)})$ and $\hat{z}^{(i)} = g_{\theta}(s^{(i)})$
Backprop loss $\frac{1}{m}\sum_{i=1}^m f_w(z^{(i)}) - \frac{1}{m}\sum_{i=1}^m f_w(\hat{z}^{(i)})$
end for
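Putting the three steps together, here is a hedged PyTorch sketch of one iteration of Algorithm 1. It assumes `ae` exposes `encode`/`decode` as in the autoencoder sketch above, `f_w` is the critic, `g` is the prior generator, and the optimizer arguments and default values are illustrative.

```python
import torch
import torch.nn.functional as F

def arae_iteration(ae, f_w, g, opt_ae, opt_w, opt_adv, x,
                   noise_dim=100, eps=0.01):
    # (1) Train the encoder/decoder for reconstruction (phi, psi)
    logits = ae.decode(ae.encode(x), x[:, :-1])            # teacher forcing
    loss_rec = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               x[:, 1:].reshape(-1))
    opt_ae.zero_grad(); loss_rec.backward(); opt_ae.step()

    # (2) Train the critic (w): maximize f_w(z) - f_w(z_hat)
    z = ae.encode(x).detach()                              # real codes
    z_hat = g(torch.randn(x.size(0), noise_dim)).detach()  # generated codes
    loss_cri = -(f_w(z).mean() - f_w(z_hat).mean())
    opt_w.zero_grad(); loss_cri.backward(); opt_w.step()
    with torch.no_grad():                                  # clip w to [-eps, eps]
        for p in f_w.parameters():
            p.clamp_(-eps, eps)

    # (3) Train the encoder/generator adversarially (phi, theta)
    z = ae.encode(x)
    z_hat = g(torch.randn(x.size(0), noise_dim))
    loss_adv = f_w(z).mean() - f_w(z_hat).mean()
    opt_adv.zero_grad(); loss_adv.backward(); opt_adv.step()
```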
Extension: Unaligned Transfer
For the transfer setting, the decoder is conditioned on an additional attribute variable, becoming $p_{\psi}(x \mid z, y)$ (I did not fully understand this part; I will revisit the code to see whether it becomes clearer), and the optimization also accounts for the classification loss:
$$\min_{\phi,\psi} L_{rec}(\phi,\psi) + \lambda^{(1)} W(P_Q, P_z) - \lambda^{(2)} L_{class}(\phi, u)$$
This paper sets $\lambda^{(2)} = 1$ and adds two steps to training: (2b) train the classifier, and (3b) train the encoder adversarially against the classifier.
Algorithm 2 ARAE Transfer Extension
Each loop additionally:
(2b) Train the attribute classifier $(u)$
Sample $\{x^{(i)}\}_{i=1}^m \sim P_r$, look up $y^{(i)}$, and compute $z^{(i)} = enc_{\phi}(x^{(i)})$
Backprop loss $-\frac{1}{m}\sum_{i=1}^m \log p_u(y^{(i)} \mid z^{(i)})$
(3b) Train the encoder adversarially $(\phi)$
Sample $\{x^{(i)}\}_{i=1}^m \sim P_r$, look up $y^{(i)}$, and compute $z^{(i)} = enc_{\phi}(x^{(i)})$
Backprop loss $-\frac{1}{m}\sum_{i=1}^m \log p_u(1 - y^{(i)} \mid z^{(i)})$
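A hedged sketch of these two extra steps, assuming a binary attribute label $y \in \{0, 1\}$ and a logistic classifier `clf_u` over codes; the function and variable names are illustrative, not from the paper's code.

```python
import torch
import torch.nn.functional as F

def transfer_steps(ae, clf_u, opt_u, opt_enc, x, y):
    # (2b) Train the attribute classifier u on codes z = enc_phi(x)
    z = ae.encode(x).detach()
    loss_u = F.binary_cross_entropy_with_logits(clf_u(z).squeeze(1), y.float())
    opt_u.zero_grad(); loss_u.backward(); opt_u.step()

    # (3b) Train the encoder adversarially: maximize p_u(1 - y | z),
    # i.e. push the codes to fool the attribute classifier
    z = ae.encode(x)
    loss_enc = F.binary_cross_entropy_with_logits(clf_u(z).squeeze(1),
                                                  1.0 - y.float())
    opt_enc.zero_grad(); loss_enc.backward(); opt_enc.step()
```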
Theoretical Properties
In a standard GAN, we implicitly minimize the divergence between the real distribution and the model distribution. In this setting, my understanding is that the divergence is implicitly minimized between the real and model distributions in the embedding space, and also between the data distribution $P_r$ and the latent-variable model $p_{\psi}(x) = \int_z p_{\psi}(x \mid z)\, p(z)\, dz$.
The heavily mathematical proofs are omitted here.
Experiments
Other Notes
Looking at the GitHub repo, the authors have since updated the WGAN method to WGAN-GP.