SVM Optimization: Duality

The earlier post "SVM: Basic Idea" presented the SVM optimization objective. It is repeated here:

\begin{aligned} \min_{W,b,\xi} \;\; & \frac{\left \| W \right \|^2}{2} + C \sum_i{\xi_i} \\ s.t. \;\; & y_i (W^T X_i + b) \geq 1-\xi_i, \; \forall i \\ & \xi_i \geq 0, \; \forall i \end{aligned}

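This is a quadratic programming problem. The KKT conditions could be applied to it directly, but the resulting system is still too complicated to solve easily.

First, transform the problem. Let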

L(W,b,\xi,\alpha,\beta) = f(W,b,\xi) + \sum_i{\alpha_i g(X_i)} + \sum_i{\beta_i h(\xi_i)}

where

g(X_i) = 1-\xi_i - y_i (W^T X_i + b) \leq 0

h(\xi_i) = -\xi_i \leq 0

\alpha_i \geq 0, \; \beta_i \geq 0, \; \forall i

f(W,b,\xi) = \frac{{\left \| W \right \|}^2}{2} + C \sum_i{\xi_i}

\because \alpha_i g(X_i) \leq 0, \; \beta_i h(\xi_i) \leq 0

\therefore \max_{\alpha,\beta}L(W,b,\xi,\alpha,\beta) = f(W,b,\xi) for any feasible (W,b,\xi), which means

\min_{W,b,\xi}{f(W,b,\xi)} = \min_{W,b,\xi}\max_{\alpha,\beta}L(W,b,\xi,\alpha,\beta), following the proof of the KKT conditions.
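The min-max form works because the inner maximization penalizes infeasible points. A short sketch of that reasoning, using only the definitions of L, g and h given above:

```latex
\max_{\alpha \geq 0,\, \beta \geq 0} L(W,b,\xi,\alpha,\beta) =
\begin{cases}
f(W,b,\xi), & g(X_i) \leq 0 \text{ and } h(\xi_i) \leq 0 \;\; \forall i \\
+\infty, & \text{otherwise}
\end{cases}
```

If some g(X_i) > 0, letting \alpha_i grow makes L arbitrarily large, and likewise for h(\xi_i) > 0 and \beta_i. At feasible points every term \alpha_i g(X_i) and \beta_i h(\xi_i) is at most 0, so the maximum is reached when they are all 0. The outer minimization therefore never selects an infeasible point.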

It is easy to see that when \alpha_i g(X_i) = 0, \; \beta_i h(\xi_i) = 0, \; \forall i (the KKT conditions) hold,

\min_{W,b,\xi}\max_{\alpha,\beta}L(W,b,\xi,\alpha,\beta) = \max_{\alpha,\beta}\min_{W,b,\xi}L(W,b,\xi,\alpha,\beta), where the right-hand side is the dual of the left-hand side.

Finally, the SVM optimization objective is equivalent to

\max_{\alpha,\beta}\min_{W,b,\xi}L(W,b,\xi,\alpha,\beta) \\ s.t. \;\; \forall i \left\{\begin{matrix} \alpha_i g(X_i) = 0, \; \beta_i h(\xi_i) = 0 \\ g(X_i) \leq 0 , \; h(\xi_i) \leq 0 \\ \alpha_i \geq 0, \; \beta_i \geq 0 \end{matrix}\right.

First solve \min_{W,b,\xi}L(W,b,\xi,\alpha,\beta) by setting the partial derivatives to zero:

\left\{\begin{matrix} \partial L/\partial W = W - \sum_{i}\alpha_i y_i X_i = 0 \; &\rightarrow& W = \sum_{i}\alpha_i y_i X_i \\ \partial L/\partial b = - \sum_{i}\alpha_i y_i = 0 \; &\rightarrow& \sum_{i}\alpha_i y_i = 0\\ \partial L/\partial \xi_i = C - \alpha_i - \beta_i = 0 \; &\rightarrow& \alpha_i = C-\beta_i \leq C \\ \end{matrix}\right.

Substituting these into the formula above gives

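\max_{\alpha}\min_{W,b,\xi}L = \max_{\alpha}\left( \sum_{i}\alpha_i - \left \| \sum_{i}\alpha_i y_i X_i \right \|^2/2 \right), which is the same as solving

\min_{\alpha}\left \| \sum_{i}\alpha_i y_i X_i \right \|^2/2 - \sum_{i}\alpha_i \\ s.t. \;\; \forall i \left\{\begin{matrix} \alpha_i g(X_i) = 0, \; \beta_i h(\xi_i) = 0 \\ g(X_i) \leq 0 , \; h(\xi_i) \leq 0 \\ \alpha_i \geq 0, \; \beta_i \geq 0 \\ 0 \leq \alpha_i \leq C, \;\; \sum_{i}\alpha_i y_i = 0 \end{matrix}\right.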
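The substitution step can be written out term by term. This is a sketch that uses only the three stationarity results above (W = \sum_i \alpha_i y_i X_i, \sum_i \alpha_i y_i = 0, and C - \alpha_i - \beta_i = 0):

```latex
\begin{aligned}
L &= \frac{\|W\|^2}{2} + C\sum_i \xi_i
     + \sum_i \alpha_i \left( 1 - \xi_i - y_i (W^T X_i + b) \right)
     - \sum_i \beta_i \xi_i \\
  &= \frac{\|W\|^2}{2} - W^T \sum_i \alpha_i y_i X_i - b \sum_i \alpha_i y_i
     + \sum_i (C - \alpha_i - \beta_i)\, \xi_i + \sum_i \alpha_i \\
  &= \frac{\|W\|^2}{2} - \|W\|^2 - 0 + 0 + \sum_i \alpha_i \\
  &= \sum_i \alpha_i - \frac{1}{2} \left\| \sum_i \alpha_i y_i X_i \right\|^2
\end{aligned}
```

Maximizing this expression over \alpha is the same as minimizing \left \| \sum_{i}\alpha_i y_i X_i \right \|^2/2 - \sum_{i}\alpha_i. Note that \beta and \xi have dropped out, and \beta_i \geq 0 survives only as the bound \alpha_i \leq C.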

The complementary slackness conditions \alpha_i g(X_i) = 0 and \beta_i h(\xi_i) = 0 can now be read case by case.

When \alpha_i = 0: \beta_i = C - \alpha_i = C > 0 \; \rightarrow \xi_i = 0, so y_i(W^T X_i + b) \geq 1-\xi_i=1

When 0 < \alpha_i < C: 0 < \beta_i = C - \alpha_i < C \; \rightarrow \xi_i = 0, so y_i(W^T X_i + b) = 1-\xi_i=1

When \alpha_i = C: \beta_i = C - \alpha_i = 0 \; \rightarrow \xi_i \geq 0, so y_i(W^T X_i + b) = 1-\xi_i \leq 1

That is, \left\{\begin{matrix} y_i(W^T X_i + b) \geq 1 &,& \alpha_i = 0 \\ y_i(W^T X_i + b) = 1 &,& 0 < \alpha_i < C \\ y_i(W^T X_i + b) \leq 1 &,& \alpha_i = C \\ \end{matrix}\right.
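The three cases above translate directly into a numerical check. The following is a minimal sketch, not part of the original derivation: the function name `kkt_violations`, the tolerance `tol`, and the sample numbers are all illustrative assumptions, and `margins[i]` is assumed to already hold y_i(W^T X_i + b).

```python
def kkt_violations(alphas, margins, C, tol=1e-6):
    """Return the indices whose (alpha_i, margin_i) pair breaks the case table.

    margins[i] is assumed to hold y_i * (W^T X_i + b); tol is a hypothetical
    numerical tolerance, since exact equality rarely holds in floating point.
    """
    bad = []
    for i, (a, m) in enumerate(zip(alphas, margins)):
        if a <= tol:              # alpha_i = 0      -> margin must be >= 1
            ok = m >= 1 - tol
        elif a >= C - tol:        # alpha_i = C      -> margin must be <= 1
            ok = m <= 1 + tol
        else:                     # 0 < alpha_i < C  -> margin must equal 1
            ok = abs(m - 1) <= tol
        if not ok:
            bad.append(i)
    return bad


# Made-up values for illustration: the last point has alpha = 0 but margin < 1.
C = 1.0
alphas = [0.0, 0.25, 1.0, 0.0]
margins = [2.3, 1.0, 0.4, 0.7]
print(kkt_violations(alphas, margins, C))
```

A check like this is commonly used as a stopping test: an iterative solver can stop once no index violates the table within the tolerance.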

So the final problem is \min_{\alpha}\left \| \sum_{i}\alpha_i y_i X_i \right \|^2/2 - \sum_{i}\alpha_i \\ s.t. \;\; 0 \leq \alpha_i \leq C, \; \sum_{i}\alpha_i y_i = 0 \\ KKT: \forall i \left\{\begin{matrix} y_i(W^T X_i + b) \geq 1 &,& \alpha_i = 0 \\ y_i(W^T X_i + b) = 1 &,& 0 < \alpha_i < C \\ y_i(W^T X_i + b) \leq 1 &,& \alpha_i = C \\ \end{matrix}\right.
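To make the final problem concrete, here is a small pure-Python sketch that minimizes the dual objective by repeatedly optimizing two multipliers at a time, in the spirit of SMO. This method is swapped in for illustration and is not the one the text prescribes. The names `solve_dual`, `dual_objective`, `tol`, `max_sweeps`, and the three-point data set are all assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dual_objective(X, y, alpha):
    """||sum_i alpha_i y_i X_i||^2 / 2 - sum_i alpha_i."""
    dim = len(X[0])
    W = [sum(alpha[i] * y[i] * X[i][d] for i in range(len(X))) for d in range(dim)]
    return 0.5 * dot(W, W) - sum(alpha)


def solve_dual(X, y, C, tol=1e-9, max_sweeps=1000):
    """Pairwise coordinate descent on the dual (simplified, SMO-style sketch).

    Changing alpha_i and alpha_j together keeps sum_i alpha_i y_i = 0;
    clipping to [lo, hi] keeps both inside [0, C].
    """
    n = len(X)
    K = [[dot(X[i], X[j]) for j in range(n)] for i in range(n)]  # linear kernel
    alpha = [0.0] * n  # alpha = 0 is feasible

    def u(k):  # W^T X_k expressed through the multipliers
        return sum(alpha[m] * y[m] * K[m][k] for m in range(n))

    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            for j in range(i + 1, n):
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if eta <= 0.0:
                    continue  # identical points: no curvature along this pair
                if y[i] != y[j]:
                    lo = max(0.0, alpha[j] - alpha[i])
                    hi = min(C, C + alpha[j] - alpha[i])
                else:
                    lo = max(0.0, alpha[i] + alpha[j] - C)
                    hi = min(C, alpha[i] + alpha[j])
                # The bias b cancels in this difference, so it is not needed here.
                e_diff = (u(i) - y[i]) - (u(j) - y[j])
                new_j = min(hi, max(lo, alpha[j] + y[j] * e_diff / eta))
                delta = new_j - alpha[j]
                if abs(delta) > tol:
                    alpha[i] -= y[i] * y[j] * delta
                    alpha[j] = new_j
                    changed = True
        if not changed:
            break
    return alpha


# Toy data (hypothetical): two points on the margin and one well inside its class.
X = [[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0]]
y = [1, -1, 1]
C = 1.0
alpha = solve_dual(X, y, C)
print(alpha, dual_objective(X, y, alpha))
```

For this toy set the problem can be solved by hand: the third point lies beyond the margin, so its multiplier is 0, and the remaining one-dimensional problem 4\alpha^2 - 2\alpha is minimized at \alpha = 1/4. Production solvers choose pairs with heuristics instead of sweeping over all of them.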

Once the optimal \alpha has been found, W and b follow from \left\{\begin{matrix} W = \sum_{i}\alpha_i y_i X_i \\ b = y_i - W^T X_i, \; \forall i \in \{i: 0 < \alpha_i < C\} \end{matrix}\right.
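The recovery step can be sketched in a few lines of pure Python. The function name `recover_w_b`, the tolerance `tol`, and the sample multipliers are illustrative assumptions. Averaging b over all indices with 0 < \alpha_i < C is also an assumption, made for numerical robustness. The formula itself says that any single such index gives the same b at the exact optimum.

```python
def recover_w_b(X, y, alpha, C, tol=1e-8):
    """Compute W = sum_i alpha_i y_i X_i and b = y_i - W^T X_i for 0 < alpha_i < C."""
    n, dim = len(X), len(X[0])
    W = [sum(alpha[i] * y[i] * X[i][d] for i in range(n)) for d in range(dim)]

    # Indices strictly inside (0, C): these points sit exactly on the margin.
    free = [i for i in range(n) if tol < alpha[i] < C - tol]
    if not free:
        raise ValueError("no alpha_i strictly inside (0, C); this formula does not fix b")

    # Assumption: average over the free indices instead of picking one.
    b_values = [y[i] - sum(W[d] * X[i][d] for d in range(dim)) for i in free]
    b = sum(b_values) / len(b_values)
    return W, b


# Hypothetical optimal multipliers for a three-point toy problem.
X = [[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0]]
y = [1, -1, 1]
alpha = [0.25, 0.25, 0.0]
W, b = recover_w_b(X, y, alpha, C=1.0)
print(W, b)
```

Only points with \alpha_i > 0 contribute to W, which is why they are called support vectors. Here the third point has \alpha_i = 0 and does not affect the result.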

 
