Common Activation Functions in Deep Learning: Introduction and Code Implementation

Purpose

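Introducing nonlinear units into a deep neural network makes training a non-convex optimization problem. The global optimum is hard to find, but gradient descent can still be used to search for a local minimum.

Nonlinear activations also increase the model's fitting capacity: in theory, a single hidden layer with enough neurons can approximate any continuous function on a bounded domain to arbitrary accuracy.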
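The role of the nonlinearity can be checked with a small sketch. The 2x2 weight matrices and the input below are hypothetical values chosen only for illustration; the point is that two stacked linear layers without an activation are equivalent to one linear layer, while inserting a relu between them breaks that equivalence.

```python
# Hypothetical 2x2 weight matrices, chosen only for illustration.
W1 = [[1.0, 2.0], [0.5, -1.0]]
W2 = [[-1.0, 1.0], [2.0, 0.5]]


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def relu(v):
    """Apply relu element-wise to a vector."""
    return [max(0.0, t) for t in v]


x = [1.0, -2.0]

# Two linear layers without an activation ...
two_linear_layers = matvec(W2, matvec(W1, x))
# ... give the same result as one linear layer with weights W2 @ W1.
one_linear_layer = matvec(matmul(W2, W1), x)

# With a relu in between, the result no longer matches that single linear map.
with_relu = matvec(W2, relu(matvec(W1, x)))

print(two_linear_layers, one_linear_layer, with_relu)
```

Without the activation, adding depth adds no expressive power, which is why every hidden layer is followed by a nonlinear unit.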

Properties

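Differentiability (for multivariate functions): a differentiable function guarantees that the gradients needed by gradient descent can be computed. In practice, differentiability almost everywhere is enough; relu, for example, is not differentiable at 0.
Monotonicity: keeps the direction of the gradient relatively stable.
Output range: when the output is bounded, gradient-based optimization tends to be more stable, because the feature representation is affected by the weights only to a limited degree. When the output is unbounded, the feature representation is not constrained in this way, but the larger gradients call for a smaller learning rate.
Non-saturation:

An activation function f is right-saturating when it satisfies:

$$\lim_{x \to +\infty} f'(x) = 0$$

An activation function f is left-saturating when it satisfies:

$$\lim_{x \to -\infty} f'(x) = 0$$

When an activation function saturates, its gradient approaches 0. This leads to vanishing gradients, and the model may fail to converge.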
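Saturation can be observed numerically. The sketch below is a plain-Python illustration, not tied to any framework: it evaluates the derivative of sigmoid and of relu at a few points. The helper names and the convention that relu's derivative at 0 is 0 are assumptions made for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s).
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    # Convention assumed here: the derivative at x == 0 is taken to be 0.
    return 1.0 if x > 0 else 0.0


for x in (-20.0, -5.0, 0.0, 5.0, 20.0):
    print(f"x={x:6.1f}  sigmoid'={sigmoid_grad(x):.2e}  relu'={relu_grad(x):.1f}")
```

The sigmoid derivative shrinks toward 0 on both sides (left- and right-saturating), while the relu derivative stays at 1 for every positive input and is 0 only on the negative side.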

sigmoid

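[Figure: the sigmoid function and its derivative]

The sigmoid activation is continuously differentiable, monotonic, and bounded in output. As the plot of its derivative shows, its biggest problem is saturation at both ends, which causes vanishing gradients (common remedies include switching to the relu activation or adding batch normalization, BN). In addition, its output is not zero-centered; a zero-centered output helps the model converge faster.
Today sigmoid is mostly used for binary classification and for the gates in gating mechanisms. For problems with more than two classes, use sigmoid when the classes are not mutually exclusive (one sample can carry several labels) and softmax when they are.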

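import tensorflow as tf

# Eager execution is on by default in TensorFlow 2.x.
# On TensorFlow 1.x, enable it first with tf.enable_eager_execution().

sigmoid_test = tf.nn.sigmoid([-3., -2., -1., 0.0, 1., 2., 3.], name='sigmoid_op')

print(sigmoid_test)

Output: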

tf.Tensor(
[0.04742587 0.11920292 0.26894143 0.5        0.7310586  0.880797
 0.95257413], shape=(7,), dtype=float32)
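The choice between sigmoid and softmax for classification can be illustrated with a framework-free sketch. The logits below are hypothetical, and the `sigmoid` and `softmax` helpers are written here only for this example. Applied per logit, sigmoid yields independent probabilities with no sum constraint, while softmax yields one distribution over mutually exclusive classes. For two classes the two coincide.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three labels that may co-occur (multi-label case).
logits = [2.0, 1.0, -1.0]
independent = [sigmoid(z) for z in logits]  # each in (0, 1), no sum constraint
exclusive = softmax(logits)                 # sums to 1 across the labels

print(independent, sum(independent))
print(exclusive, sum(exclusive))

# Binary case: sigmoid(z) matches the first softmax probability over [z, 0].
z = 1.5
print(sigmoid(z), softmax([z, 0.0])[0])
```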

tanh

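[Figure: the tanh function and its derivative]

The tanh activation outputs values in (-1, 1), centered on 0. Compared with sigmoid it has larger gradients near 0 (its derivative peaks at 1, versus 0.25 for sigmoid), and together with the zero-centered output this makes the model converge faster. It still saturates at both ends, however, so the vanishing gradient problem remains. tanh is widely used in RNN models.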

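import tensorflow as tf

# Eager execution is on by default in TensorFlow 2.x.
# On TensorFlow 1.x, enable it first with tf.enable_eager_execution().

tanh_test = tf.nn.tanh([-3., -2., -1., 0.0, 1., 2., 3.], name='tanh_op')

print(tanh_test)

Output: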

tf.Tensor(
[-0.9950547 -0.9640276 -0.7615942  0.         0.7615942  0.9640276
  0.9950547], shape=(7,), dtype=float32)
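The relationship between tanh and sigmoid can be made concrete with a short plain-Python sketch. It relies on the identity tanh(x) = 2·sigmoid(2x) − 1 and on the derivative 1 − tanh(x)², both standard results; the helper names are introduced only for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2.
    return 1.0 - math.tanh(x) ** 2


for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    # tanh is a rescaled and shifted sigmoid.
    rescaled = 2.0 * sigmoid(2.0 * x) - 1.0
    print(x, math.tanh(x), rescaled, tanh_grad(x), sigmoid_grad(x))
```

Near 0 the tanh derivative is four times the sigmoid derivative, but at larger magnitudes it also decays toward 0, which is the saturation described above.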

relu

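[Figure: the relu function and its derivative]

relu differs from a linear unit in that it outputs 0 on half of its domain, which keeps it easy to optimize and cheap to compute. As the plot shows, its gradient is not only large but also constant wherever the unit is active. More importantly, it does not saturate for positive inputs the way sigmoid and tanh do, which effectively mitigates vanishing gradients. relu is currently the default choice for hidden layers in neural networks.
Its main problem is that for inputs below 0 both the output and the gradient are 0, so a neuron whose input stays negative can no longer learn.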

import tensorflow as tf

# Eager execution is on by default in TensorFlow 2.x.
# On TensorFlow 1.x, enable it first with tf.enable_eager_execution().

relu_test = tf.nn.relu([-3., -2., -1., 0.0, 1., 2., 3.], name='relu_op')
print(relu_test)

Output:

tf.Tensor([0. 0. 0. 0. 1. 2. 3.], shape=(7,), dtype=float32)
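The "neuron can no longer learn" problem can be sketched for a single unit. The neuron `y = relu(w * x + b)`, its weight and bias values, and the inputs below are all hypothetical; the sketch only shows that once the pre-activation is negative for every input, the gradient with respect to the weight is 0 everywhere.

```python
def relu(x):
    return max(0.0, x)


def relu_grad(x):
    # Convention assumed here: the derivative at x == 0 is taken to be 0.
    return 1.0 if x > 0 else 0.0


# Hypothetical neuron y = relu(w * x + b) whose bias has drifted far negative.
w, b = 0.5, -10.0
inputs = [-3.0, -1.0, 0.0, 1.0, 3.0]

pre_activations = [w * x + b for x in inputs]
outputs = [relu(z) for z in pre_activations]
# Chain rule: dy/dw = relu'(z) * x, which is 0 whenever z <= 0.
grads_w = [relu_grad(z) * x for z, x in zip(pre_activations, inputs)]

print("outputs:", outputs)
print("any nonzero gradient:", any(g != 0 for g in grads_w))
```

With no nonzero gradient, gradient descent leaves `w` and `b` unchanged, so the neuron stays inactive for these inputs.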

leakyrelu

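[Figure: the leakyrelu function and its derivative]

leakyrelu is a variant of relu, designed mainly to address relu's zero output for negative inputs. As the plot shows, for inputs below 0 the output is small but nonzero, so the gradient does not vanish there.
One drawback of leakyrelu is that it is close to linear, which can hurt its performance on complex classification tasks.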

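import tensorflow as tf

# Eager execution is on by default in TensorFlow 2.x.
# On TensorFlow 1.x, enable it first with tf.enable_eager_execution().

# alpha: slope of the activation function for x < 0
leaky_relu_test = tf.nn.leaky_relu([-3., -2., -1., 0.0, 1., 2., 3.], alpha=0.2, name='leaky_relu_op')
print(leaky_relu_test)

Output: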

tf.Tensor([-0.6 -0.4 -0.2  0.   1.   2.   3. ], shape=(7,), dtype=float32)
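For reference, the same function can be written out by hand. This is a minimal plain-Python sketch of the usual leakyrelu definition with `alpha=0.2` to match the call above; the helper names are introduced only for this example, and the gradient at exactly 0 is taken to be `alpha` by convention.

```python
def leaky_relu(x, alpha=0.2):
    # Identity for positive inputs, a small slope alpha otherwise.
    return x if x > 0 else alpha * x


def leaky_relu_grad(x, alpha=0.2):
    # The gradient is alpha (not 0) on the negative side.
    return 1.0 if x > 0 else alpha


xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
print([leaky_relu(x) for x in xs])
print([leaky_relu_grad(x) for x in xs])
```

Because the gradient never drops to 0, a neuron with negative input still receives a (small) update, unlike with relu.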

elu

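[Figure: the elu function and its derivative]

elu differs from relu on the negative side: relu outputs 0, whereas the output of elu gradually approaches -α, which makes it more robust. Another advantage of elu is that it pushes the mean of the outputs toward 0 (in this respect it resembles BN, which normalizes a distribution to mean 0 and standard deviation 1).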

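import tensorflow as tf

# Eager execution is on by default in TensorFlow 2.x.
# On TensorFlow 1.x, enable it first with tf.enable_eager_execution().

elu_relu_test = tf.nn.elu([-10000., -100., -3., -2., -1., 0.0, 1., 2., 3.], name='elu_relu_op')
print(elu_relu_test)

Output: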

tf.Tensor(
[-1.         -1.         -0.95021296 -0.86466473 -0.63212055  0.
  1.          2.          3.        ], shape=(9,), dtype=float32)
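The behavior on the negative side can be reproduced with a hand-written version. This is a plain-Python sketch of the usual elu definition, `x` for positive inputs and `alpha * (exp(x) - 1)` otherwise; the `alpha` parameter and helper names are introduced for illustration. It also compares the mean output of relu and elu on a symmetric set of inputs.

```python
import math


def elu(x, alpha=1.0):
    # x for positive inputs, alpha * (exp(x) - 1) otherwise.
    # math.expm1 computes exp(x) - 1 accurately for small |x|.
    return x if x > 0 else alpha * math.expm1(x)


def relu(x):
    return max(0.0, x)


# Very negative inputs approach -alpha.
print(elu(-10000.0), elu(-100.0, alpha=2.0))

xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
mean_relu = sum(relu(x) for x in xs) / len(xs)
mean_elu = sum(elu(x) for x in xs) / len(xs)
print(mean_relu, mean_elu)
```

On these inputs the mean of the elu outputs is closer to 0 than the mean of the relu outputs, because the negative outputs partly cancel the positive ones.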

softmax

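A softmax unit is commonly used as the output layer of a network. It naturally represents the probability distribution of a discrete random variable with k possible values.

softmax exponentiates each element of a vector and divides by the sum of the exponentials, so every output lies in (0, 1) and the outputs sum to 1.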

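import tensorflow as tf

# Eager execution is on by default in TensorFlow 2.x.
# On TensorFlow 1.x, enable it first with tf.enable_eager_execution().

softmax_test = tf.nn.softmax([-3., -2., -1., 0.0, 1., 2., 3.], name='softmax_op')
print(softmax_test)
softmax_test_sum = tf.reduce_sum(softmax_test)
print(softmax_test_sum)

Output: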

tf.Tensor(
[0.0015683  0.00426308 0.01158826 0.03150015 0.0856263  0.23275642
 0.6326975 ], shape=(7,), dtype=float32)

tf.Tensor(1.0, shape=(), dtype=float32)
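The computation behind these numbers can be sketched in plain Python. The version below subtracts the largest logit before exponentiating, a common numerical-stability trick that does not change the result; the `softmax` helper is written here for illustration and is not how any particular library implements it.

```python
import math


def softmax(logits):
    # Subtracting the maximum leaves the result unchanged but keeps
    # math.exp from overflowing on large logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
print(probs, sum(probs))

# Shifting every logit by the same constant gives the same distribution.
# A naive math.exp(1000.0) would raise OverflowError.
large = softmax([1000.0, 1001.0, 1002.0])
small = softmax([0.0, 1.0, 2.0])
print(large == small)
```

Note that the outputs are not proportional to the inputs: neighboring logits that differ by 1 produce probabilities that differ by a factor of e.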

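Summary

The choice of activation function should still be driven by the actual project, weighing the strengths and weaknesses of each candidate.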
