模型訓練技巧——label smoothing

原創

CV-deeplearning

2020-05-08 04:56

1. 原理

通俗理解：將真實標籤減去一個很小的數，然後平均分配到其他類上，實現了標籤軟化。

2. 作用

(a圖)：當增加ε時，目標類別與其它類別之間的理論差距減小。(b圖)：最大預測與其它類別平均值之間差距的經驗分佈。很明顯，通過標籤平滑，分佈中心處於理論值並具有較少的極端值。

3. 代碼實現


import torch
import torch.nn as nn


class LSR(nn.Module):

    def __init__(self, e=0.1, reduction='mean'):
        super().__init__()

        self.log_softmax = nn.LogSoftmax(dim=1)
        self.e = e
        self.reduction = reduction
    
    def _one_hot(self, labels, classes, value=1):
        """
            Convert labels to one hot vectors
        
        Args:
            labels: torch tensor in format [label1, label2, label3, ...]
            classes: int, number of classes
            value: label value in one hot vector, default to 1
        
        Returns:
            return one hot format labels in shape [batchsize, classes]
        """

        one_hot = torch.zeros(labels.size(0), classes)

        #labels and value_added  size must match
        labels = labels.view(labels.size(0), -1)
        value_added = torch.Tensor(labels.size(0), 1).fill_(value)

        value_added = value_added.to(labels.device)
        one_hot = one_hot.to(labels.device)

        one_hot.scatter_add_(1, labels, value_added)

        return one_hot

    def _smooth_label(self, target, length, smooth_factor):
        """convert targets to one-hot format, and smooth
        them.

        Args:
            target: target in form with [label1, label2, label_batchsize]
            length: length of one-hot format(number of classes)
            smooth_factor: smooth factor for label smooth
        
        Returns:
            smoothed labels in one hot format
        """
        one_hot = self._one_hot(target, length, value=1 - smooth_factor)
        one_hot += smooth_factor / length

        return one_hot.to(target.device)

    def forward(self, x, target):

        if x.size(0) != target.size(0):
            raise ValueError('Expected input batchsize ({}) to match target batch_size({})'
                    .format(x.size(0), target.size(0)))

        if x.dim() < 2:
            raise ValueError('Expected input tensor to have least 2 dimensions(got {})'
                    .format(x.size(0)))

        if x.dim() != 2:
            raise ValueError('Only 2 dimension tensor are implemented, (got {})'
                    .format(x.size()))


        smoothed_target = self._smooth_label(target, x.size(1), self.e)
        x = self.log_softmax(x)
        loss = torch.sum(- x * smoothed_target, dim=1)

        if self.reduction == 'none':
            return loss
        
        elif self.reduction == 'sum':
            return torch.sum(loss)
        
        elif self.reduction == 'mean':
            return torch.mean(loss)
        
        else:
            raise ValueError('unrecognized option, expect reduction to be one of none, mean, sum')

4. 用法

from LabelSmoothing import LSR
# 標籤平滑，LSR可看做一種損失函數

#criterion = nn.CrossEntropyLoss()
criterion = LSR()
# LSR()與 nn.CrossEntropyLoss()在用法上完全相同

博主親試有效（略有提升），拿去用吧。

https://www.zhihu.com/question/65339831

發表評論

所有評論

還沒有人評論，想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.

模型訓練技巧——label smoothing

1. 原理

2. 作用

3. 代碼實現

4. 用法

微服務實踐k8s&dapr開發部署實驗（2）狀態管理

Win10 LTSC 2019 安裝後的一些步驟

Python 潮流週刊#52：Python 處理 Excel 的資源

模型訓練技巧——激活函數mish

Linux 高級命令彙總

Pascal VOC數據集轉化爲COCO數據集格式

mmdectionv1 SSD-mobilenetv2實戰

Cascade Mask R-CNN實戰——訓練自己的數據集

https://yachay.unat.edu.pe/blog/index.php?comment_area=format_blog&comment_component=blog&comment_co

linux以太網驅動總結