Analysis of the Keras Documentation Examples - 1

Contents

Softmax multi-class classification with a multilayer perceptron (MLP)

  About Momentum

  About Nesterov Accelerated Gradient

Binary classification with a multilayer perceptron

A VGG-like convolutional neural network

  About VGG

  Formulas for convolution and pooling output sizes


Softmax multi-class classification with a multilayer perceptron (MLP)

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
import keras
import numpy as np


# Generate dummy data
x_train = np.random.random((1000, 20))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10)
x_test = np.random.random((100, 20))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.fit(x_train, y_train, epochs=20, batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
print("score: ", score)

Output:

Using TensorFlow backend.
Epoch 1/20
2019-02-28 10:37:16.147684: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

 128/1000 [==>...........................] - ETA: 0s - loss: 2.4224 - acc: 0.0859
1000/1000 [==============================] - 0s 149us/step - loss: 2.3773 - acc: 0.1000
Epoch 2/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3494 - acc: 0.0703
1000/1000 [==============================] - 0s 11us/step - loss: 2.3454 - acc: 0.0870
Epoch 3/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3077 - acc: 0.1094
1000/1000 [==============================] - 0s 11us/step - loss: 2.3437 - acc: 0.0870
Epoch 4/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3307 - acc: 0.0938
1000/1000 [==============================] - 0s 11us/step - loss: 2.3364 - acc: 0.0980
Epoch 5/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3100 - acc: 0.1094
1000/1000 [==============================] - 0s 10us/step - loss: 2.3171 - acc: 0.1100
Epoch 6/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3120 - acc: 0.1172
1000/1000 [==============================] - 0s 11us/step - loss: 2.3188 - acc: 0.1040
Epoch 7/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3345 - acc: 0.0859
1000/1000 [==============================] - 0s 12us/step - loss: 2.3167 - acc: 0.1030
Epoch 8/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3246 - acc: 0.0625
1000/1000 [==============================] - 0s 11us/step - loss: 2.3150 - acc: 0.0970
Epoch 9/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3137 - acc: 0.1562
1000/1000 [==============================] - 0s 11us/step - loss: 2.3083 - acc: 0.1070
Epoch 10/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.2797 - acc: 0.0781
1000/1000 [==============================] - 0s 12us/step - loss: 2.3075 - acc: 0.1090
Epoch 11/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3197 - acc: 0.1016
1000/1000 [==============================] - 0s 11us/step - loss: 2.3028 - acc: 0.1030
Epoch 12/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.2950 - acc: 0.1250
1000/1000 [==============================] - 0s 11us/step - loss: 2.2958 - acc: 0.1240
Epoch 13/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.2962 - acc: 0.1172
1000/1000 [==============================] - 0s 11us/step - loss: 2.3070 - acc: 0.1080
Epoch 14/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.2960 - acc: 0.1016
1000/1000 [==============================] - 0s 12us/step - loss: 2.3027 - acc: 0.1070
Epoch 15/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.2930 - acc: 0.1172
1000/1000 [==============================] - 0s 11us/step - loss: 2.2939 - acc: 0.1260
Epoch 16/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3049 - acc: 0.1016
1000/1000 [==============================] - 0s 11us/step - loss: 2.3043 - acc: 0.1080
Epoch 17/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3016 - acc: 0.0703
1000/1000 [==============================] - 0s 11us/step - loss: 2.3060 - acc: 0.1000
Epoch 18/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.2788 - acc: 0.1328
1000/1000 [==============================] - 0s 11us/step - loss: 2.2954 - acc: 0.1190
Epoch 19/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.2863 - acc: 0.1641
1000/1000 [==============================] - 0s 11us/step - loss: 2.2952 - acc: 0.1210
Epoch 20/20

 128/1000 [==>...........................] - ETA: 0s - loss: 2.3144 - acc: 0.0625
1000/1000 [==============================] - 0s 10us/step - loss: 2.2917 - acc: 0.1150

100/100 [==============================] - 0s 259us/step
score:  [2.301650047302246, 0.05999999865889549]

Here SGD is the Stochastic Gradient Descent optimizer.

It takes four parameters:

  • lr: the learning rate
  • momentum: the momentum parameter, which accelerates SGD in the relevant direction and dampens oscillations
  • decay: learning-rate decay applied after each update (see the sketch after this list)
  • nesterov: boolean, whether to use Nesterov momentum
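In classic Keras (2.x), decay shrinks the effective learning rate as training progresses, roughly as lr_t = lr / (1 + decay * t) with t the number of parameter updates. A minimal sketch of that schedule, assuming the Keras 2.x SGD formula (behavior may differ in other versions):

# Sketch of the assumed Keras 2.x SGD decay schedule: lr_t = lr0 / (1 + decay * t),
# where t counts parameter updates; lr0 and decay match the code above.
lr0, decay = 0.01, 1e-6
for t in (0, 10000, 1000000):
    print(t, lr0 / (1.0 + decay * t))  # 0.01, then ~0.0099, then 0.005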

 About Momentum

SGD easily gets stuck in ravines, i.e. regions where the surface is much steeper in one direction than in another; there it oscillates back and forth and approaches the minimum only slowly.

Momentum accelerates SGD and dampens these oscillations by adding the term \gamma v_{t-1}:

v_t=\gamma v_{t-1}+\eta \nabla_\theta J(\theta)

\theta=\theta-v_t

With this term, dimensions whose gradient keeps pointing the same way speed up, while dimensions whose gradient keeps changing direction are updated more slowly. This accelerates convergence and reduces oscillation.

Hyperparameter setting: typically \gamma \approx 0.9.
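A minimal NumPy sketch of the momentum update, using a hypothetical ill-conditioned quadratic J(\theta)=\frac{1}{2}\theta^T A\theta as the "ravine" (the objective and all constants here are illustrative, not from the Keras example):

import numpy as np

# Hypothetical ravine: J(theta) = 0.5 * theta^T A theta, gradient A @ theta;
# the curvatures 1 and 50 make one direction much steeper than the other.
A = np.diag([1.0, 50.0])
grad = lambda theta: A @ theta

theta = np.array([1.0, 1.0])
v = np.zeros_like(theta)
gamma, eta = 0.9, 0.01  # gamma ~ 0.9, as suggested above

for _ in range(200):
    v = gamma * v + eta * grad(theta)  # v_t = gamma * v_{t-1} + eta * grad J(theta)
    theta = theta - v                  # theta = theta - v_t

print(theta)  # close to the minimum at the origin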

About Nesterov Accelerated Gradient

Using \theta-\gamma v_{t-1} as an approximation of where the parameters will be after the next step, the gradient is evaluated not at the current position but at this approximate future position:

v_t=\gamma v_{t-1}+\eta \nabla_\theta J(\theta-\gamma v_{t-1})

\theta=\theta-v_t

Hyperparameter setting: typically \gamma \approx 0.9.

Comparison of the two (in the standard illustration):

Momentum (blue) first computes the gradient at the current position and then takes a big jump in the direction of the updated accumulated gradient.

NAG (Nesterov Accelerated Gradient) first takes a big jump in the direction of the previously accumulated gradient (gray), then measures the gradient there and applies a correction (red). This anticipatory update keeps it from going too fast.
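Continuing the same hypothetical sketch, the only change for NAG is the point at which the gradient is evaluated:

import numpy as np

# Same hypothetical ravine as in the momentum sketch above.
A = np.diag([1.0, 50.0])
grad = lambda theta: A @ theta

theta = np.array([1.0, 1.0])
v = np.zeros_like(theta)
gamma, eta = 0.9, 0.01

for _ in range(200):
    # Evaluate the gradient at the look-ahead point theta - gamma * v_{t-1}:
    v = gamma * v + eta * grad(theta - gamma * v)
    theta = theta - v

print(theta)  # close to the minimum at the origin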

Binary classification with a multilayer perceptron

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Generate dummy data
x_train = np.random.random((1000, 20))
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.random((100, 20))
y_test = np.random.randint(2, size=(100, 1))

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=128, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=128)
print("score: ", score)

Output:

Using TensorFlow backend.
Epoch 1/10
2019-02-28 14:24:11.425481: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

 128/1000 [==>...........................] - ETA: 1s - loss: 0.7517 - acc: 0.5078
1000/1000 [==============================] - 0s 186us/step - loss: 0.7252 - acc: 0.5090
Epoch 2/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.7346 - acc: 0.4609
1000/1000 [==============================] - 0s 10us/step - loss: 0.7191 - acc: 0.4820
Epoch 3/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.6853 - acc: 0.5078
1000/1000 [==============================] - 0s 10us/step - loss: 0.7109 - acc: 0.4860
Epoch 4/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.6962 - acc: 0.5000
1000/1000 [==============================] - 0s 11us/step - loss: 0.7083 - acc: 0.4890
Epoch 5/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.6950 - acc: 0.5391
1000/1000 [==============================] - 0s 10us/step - loss: 0.7050 - acc: 0.4990
Epoch 6/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.7180 - acc: 0.4609
1000/1000 [==============================] - 0s 10us/step - loss: 0.7037 - acc: 0.5040
Epoch 7/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.7023 - acc: 0.4453
1000/1000 [==============================] - 0s 10us/step - loss: 0.7014 - acc: 0.4850
Epoch 8/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.7039 - acc: 0.5000
1000/1000 [==============================] - 0s 10us/step - loss: 0.6987 - acc: 0.5040
Epoch 9/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.7013 - acc: 0.5078
1000/1000 [==============================] - 0s 10us/step - loss: 0.6934 - acc: 0.5360
Epoch 10/10

 128/1000 [==>...........................] - ETA: 0s - loss: 0.7071 - acc: 0.4922
1000/1000 [==============================] - 0s 10us/step - loss: 0.6983 - acc: 0.5210

100/100 [==============================] - 0s 279us/step
score:  [0.6979394555091858, 0.4399999976158142]

A VGG-like convolutional neural network

About VGG

The VGG network architecture of Karen Simonyan & Andrew Zisserman:

[Figure: VGG network architecture]

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.optimizers import SGD

# Generate dummy data
x_train = np.random.random((100, 100, 100, 3))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
x_test = np.random.random((20, 100, 100, 3))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10)

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

model.fit(x_train, y_train, batch_size=32, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=32)
print("score:", score)

Output:

Using TensorFlow backend.
Epoch 1/10
2019-02-28 14:50:59.772493: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

 32/100 [========>.....................] - ETA: 6s - loss: 2.2790
 64/100 [==================>...........] - ETA: 2s - loss: 2.3399
 96/100 [===========================>..] - ETA: 0s - loss: 2.3195
100/100 [==============================] - 7s 72ms/step - loss: 2.3264
Epoch 2/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.3259
 64/100 [==================>...........] - ETA: 2s - loss: 2.2703
 96/100 [===========================>..] - ETA: 0s - loss: 2.3011
100/100 [==============================] - 6s 61ms/step - loss: 2.3065
Epoch 3/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.2935
 64/100 [==================>...........] - ETA: 2s - loss: 2.2876
 96/100 [===========================>..] - ETA: 0s - loss: 2.2873
100/100 [==============================] - 6s 62ms/step - loss: 2.2882
Epoch 4/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.2825
 64/100 [==================>...........] - ETA: 2s - loss: 2.2777
 96/100 [===========================>..] - ETA: 0s - loss: 2.2668
100/100 [==============================] - 6s 62ms/step - loss: 2.2689
Epoch 5/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.3120
 64/100 [==================>...........] - ETA: 2s - loss: 2.2865
 96/100 [===========================>..] - ETA: 0s - loss: 2.2830
100/100 [==============================] - 6s 62ms/step - loss: 2.2771
Epoch 6/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.3145
 64/100 [==================>...........] - ETA: 2s - loss: 2.2907
 96/100 [===========================>..] - ETA: 0s - loss: 2.2718
100/100 [==============================] - 6s 62ms/step - loss: 2.2757
Epoch 7/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.2969
 64/100 [==================>...........] - ETA: 2s - loss: 2.2606
 96/100 [===========================>..] - ETA: 0s - loss: 2.2733
100/100 [==============================] - 6s 62ms/step - loss: 2.2728
Epoch 8/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.2306
 64/100 [==================>...........] - ETA: 2s - loss: 2.2661
 96/100 [===========================>..] - ETA: 0s - loss: 2.2564
100/100 [==============================] - 6s 62ms/step - loss: 2.2579
Epoch 9/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.2718
 64/100 [==================>...........] - ETA: 2s - loss: 2.2901
 96/100 [===========================>..] - ETA: 0s - loss: 2.2900
100/100 [==============================] - 6s 62ms/step - loss: 2.2870
Epoch 10/10

 32/100 [========>.....................] - ETA: 4s - loss: 2.3367
 64/100 [==================>...........] - ETA: 2s - loss: 2.2874
 96/100 [===========================>..] - ETA: 0s - loss: 2.2905
100/100 [==============================] - 6s 62ms/step - loss: 2.2886

20/20 [==============================] - 0s 20ms/step
score: 2.2975594997406006

The model structure in this program:

  • Input: 100*100*3 data
  • Layer 1: 32 3*3 convolution kernels, giving an output of 98*98*32
  • Layer 2: 32 3*3 convolution kernels, giving an output of 96*96*32
  • Layer 3: 64 3*3 convolution kernels, giving an output of 94*94*64
  • Layer 4: 64 3*3 convolution kernels, giving an output of 92*92*64
  • Layer 5: 2*2 max pooling (Keras defaults the stride to the pool size, i.e. 2), giving an output of 46*46*64
  • Layer 6: Flatten, giving 46*46*64 = 135424 values
  • Layer 7: fully connected layer with 256 neurons
  • Output layer: fully connected layer with 10 neurons, one per class (these sizes follow from the formulas in the next section, and can be checked as shown below)
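These shapes can be verified against Keras itself: calling model.summary() on the model built above prints each layer's output shape and parameter count.

model.summary()  # e.g. the first Conv2D reports output shape (None, 98, 98, 32)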

 

Formulas for convolution and pooling output sizes

Convolution output:

  • number of kernels K
  • kernel size F
  • stride S
  • padding P

For an input of size (W_1, H_1, D_1), the output size after convolution is:

W_2=\frac{W_1-F+2P}{S}+1

H_2=\frac{H_1-F+2P}{S}+1

D_2=K

 

Pooling output:

  • pool size F
  • stride S

For an input of size (W_1, H_1, D_1), the output size after pooling is:

W_2=\frac{W_1-F}{S}+1

H_2=\frac{H_1-F}{S}+1

D_2=D_1
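A small helper applying the two formulas to the VGG-like example above (the function names conv_out and pool_out are made up for this illustration):

def conv_out(w, f, p=0, s=1):
    """Width/height after a convolution: (W - F + 2P) / S + 1."""
    return (w - f + 2 * p) // s + 1

def pool_out(w, f, s=None):
    """Width/height after pooling: (W - F) / S + 1; Keras defaults S to F."""
    s = f if s is None else s
    return (w - f) // s + 1

w = 100                 # input is 100*100*3
w = conv_out(w, 3)      # Conv2D(32, (3, 3)) -> 98
w = conv_out(w, 3)      # Conv2D(32, (3, 3)) -> 96
w = conv_out(w, 3)      # Conv2D(64, (3, 3)) -> 94
w = conv_out(w, 3)      # Conv2D(64, (3, 3)) -> 92
w = pool_out(w, 2)      # MaxPool2D((2, 2))  -> 46
print(w * w * 64)       # Flatten -> 135424 values feeding Dense(256)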

 
