Deep learning: Softmax & SVM loss & gradient formulas and their Python implementation

Softmax and SVM are both used to classify data. Softmax is commonly used in the output layer of a neural network, while the SVM (hinge) loss is often paired directly with SGD to train a classifier. In either case the loss and the gradient must be computed, and in practice the two have a lot in common, so they are presented side by side here for comparison.

Formulas
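For a single training example x_i with score vector $s = W^\top x_i$ (in the code, the rows of scores = X.dot(W)), the definitions that the code below implements are:

Softmax (cross-entropy) loss and its gradient with respect to the scores:

$$L_i = -\log\frac{e^{s_{y_i}}}{\sum_j e^{s_j}}, \qquad \frac{\partial L_i}{\partial s_j} = p_j - \mathbf{1}[j = y_i], \qquad p_j = \frac{e^{s_j}}{\sum_k e^{s_k}}$$

SVM (hinge) loss with margin $\Delta$ (the code uses $\Delta = 1$):

$$L_i = \sum_{j \neq y_i} \max(0,\; s_j - s_{y_i} + \Delta), \qquad \frac{\partial L_i}{\partial s_j} = \mathbf{1}[s_j - s_{y_i} + \Delta > 0] \;\;(j \neq y_i), \qquad \frac{\partial L_i}{\partial s_{y_i}} = -\sum_{j \neq y_i} \mathbf{1}[s_j - s_{y_i} + \Delta > 0]$$

The full loss averages over the minibatch, $L = \frac{1}{N}\sum_i L_i$, and by the chain rule the weight gradient is $\nabla_W L = \frac{1}{N} X^\top \frac{\partial L}{\partial S}$, where $\frac{\partial L}{\partial S}$ stacks the per-example score gradients.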

 

Diagram

scores is the computed classification score (the s in the formulas above); y is the ground-truth label.
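As a small worked example (the numbers here are assumed purely for illustration), take a single example with three class scores s = [3.2, 5.1, -1.7] and ground truth y = 0. The softmax probabilities are roughly [0.13, 0.87, 0.00], so the softmax loss is -log(0.13) ≈ 2.04 and the gradient with respect to the scores is about [-0.87, 0.87, 0.00]. The SVM loss with Δ = 1 is max(0, 5.1 - 3.2 + 1) + max(0, -1.7 - 3.2 + 1) = 2.9 + 0 = 2.9, and its score gradient is [-1, 1, 0].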

 

Python implementation

"""
Vectorized softmax and SVM loss functions.

Inputs have dimension D, there are C classes, and we operate on minibatches
of N examples.

Inputs:
- W: A numpy array of shape (D, C) containing weights.
- X: A numpy array of shape (N, D) containing a minibatch of data.
- y: A numpy array of shape (N,) containing training labels; y[i] = c means
  that X[i] has label c, where 0 <= c < C.

Each function returns a tuple of:
- loss as a single float
- gradient with respect to weights W; an array of the same shape as W
"""
import numpy as np


def softmax_loss_vectorized(W, X, y):
    loss = 0.0
    dW = np.zeros_like(W)

    num_train = X.shape[0]
    score = X.dot(W)
    shift_score = score - np.max(score, axis=1, keepdims=True)  # shift scores for numerical stability
    shift_score_exp = np.exp(shift_score)
    shift_score_exp_sum = np.sum(shift_score_exp, axis=1, keepdims=True)
    score_norm = shift_score_exp / shift_score_exp_sum

    loss = np.sum(-np.log(score_norm[range(score_norm.shape[0]), y])) / num_train
    
    # Gradient of the loss w.r.t. the scores: softmax probabilities minus the
    # one-hot ground truth, then backpropagated to W through X.T.
    d_score = score_norm
    d_score[range(d_score.shape[0]), y] -= 1
    dW = X.T.dot(d_score) / num_train
    return loss, dW
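A quick sanity check for the softmax version (the sizes below are assumed purely for illustration): with near-zero random weights every class is roughly equally likely, so the loss should land close to log(C).

import numpy as np

np.random.seed(0)
N, D, C = 50, 10, 5                 # assumed toy minibatch size, feature dim, number of classes
W = 0.001 * np.random.randn(D, C)   # small random weights
X = np.random.randn(N, D)
y = np.random.randint(C, size=N)

loss, dW = softmax_loss_vectorized(W, X, y)
print(loss, np.log(C))              # the two numbers should be close (about 1.609)
print(dW.shape)                     # (D, C), same shape as W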


def svm_loss_vectorized(W, X, y):

    delta = 1  # margin
    num_training = X.shape[0]
    scores = X.dot(W)
    # Ground-truth class score for every example, kept as a column vector.
    scores_gt_cls = scores[range(num_training), y][..., np.newaxis]
    scores_dis = scores - scores_gt_cls + delta
    scores_dis[range(num_training), y] -= delta  # the ground-truth class contributes no margin
    scores_norm = np.maximum(0, scores_dis)  # hinge: only violated margins remain

    loss = np.sum(scores_norm) / num_training
 
    d_scores = scores_norm
    d_scores[d_scores > 0] = 1  # every class whose margin is violated contributes 1
    row_sum = np.sum(d_scores, axis=1)
    d_scores[range(num_training), y] -= row_sum  # ground-truth class gets minus the violation count
    dW = X.T.dot(d_scores) / num_training
    return loss, dW
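Both analytic gradients can be checked against centered finite differences. The helper numerical_gradient below is an assumed utility, not part of the functions above, and it reuses the W, X, y from the sanity check.

def numerical_gradient(loss_fn, W, X, y, h=1e-5):
    # Centered finite differences over every entry of W.
    grad = np.zeros_like(W)
    it = np.nditer(W, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = W[idx]
        W[idx] = old + h
        loss_plus, _ = loss_fn(W, X, y)
        W[idx] = old - h
        loss_minus, _ = loss_fn(W, X, y)
        W[idx] = old
        grad[idx] = (loss_plus - loss_minus) / (2 * h)
        it.iternext()
    return grad


# The printed maximum differences should be tiny (the hinge is non-differentiable
# only exactly at the margin, which random data essentially never hits).
for loss_fn in (softmax_loss_vectorized, svm_loss_vectorized):
    _, dW_analytic = loss_fn(W, X, y)
    dW_numeric = numerical_gradient(loss_fn, W, X, y)
    print(loss_fn.__name__, np.max(np.abs(dW_analytic - dW_numeric)))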

 

 
