Building a Neural Network in a Modular Way
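Forward propagation: build the complete network structure from input to output. Describing forward propagation takes three functions:
(1) The first function, forward(), defines the network structure from input to output and implements the forward pass.
Its parameter x is the input, regularizer is the regularization weight, and the return value y is the prediction or classification result.

def forward(x, regularizer):
    w = ...
    b = ...
    y = ...
    return y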
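(2) The second function, get_weight(), sets up the parameter w.
Its parameter shape is the shape of w, regularizer is the regularization weight, and the return value is w.
Here tf.Variable() gives w its initial value, and tf.add_to_collection() adds the regularization loss of w to the collection 'losses', which is later summed into the total loss.

def get_weight(shape, regularizer):
    w = tf.Variable(...)
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w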
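To make the regularization term concrete, here is a minimal NumPy sketch of the value that tf.contrib.layers.l2_regularizer(scale)(w) evaluates to, namely scale * sum(w**2) / 2. The names l2_penalty, get_weight_np and the plain Python list losses are hypothetical stand-ins for illustration; they are not part of TensorFlow or of the original code.

```python
import numpy as np

# Hypothetical stand-in for the TensorFlow 'losses' collection.
losses = []

def l2_penalty(w, scale):
    """Return scale * sum(w**2) / 2, the L2 penalty of the weights w."""
    return scale * 0.5 * np.sum(np.square(w))

def get_weight_np(shape, scale, rng):
    """NumPy analogue of get_weight(): create w and record its L2 penalty."""
    w = rng.standard_normal(shape)
    losses.append(l2_penalty(w, scale))
    return w

rng = np.random.default_rng(0)
w1 = get_weight_np((2, 11), 0.01, rng)
w2 = get_weight_np((11, 1), 0.01, rng)

# Counterpart of tf.add_n(tf.get_collection('losses')).
total_regularization = sum(losses)
```

Every call to the weight-creating function adds one term to the list, so the total penalty covers all layers without forward() having to track them itself.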
(3) The third function, get_bias(), sets up the parameter b.
Its parameter shape is the shape of b, and the return value is b.
Here tf.Variable() gives b its initial value.

def get_bias(shape):
    b = tf.Variable(...)
    return b
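Backward propagation: train the network and optimize its parameters to improve the model's accuracy.
In backward(), tf.placeholder() creates placeholders for the dataset x and the ground-truth labels y_, forward.forward() builds the forward-propagation network, and global_step counts the training steps and is declared as a non-trainable variable.

def backward():
    x = tf.placeholder(...)
    y_ = tf.placeholder(...)
    y = forward.forward(x, REGULARIZER)
    global_step = tf.Variable(0, trainable=False)
    loss = ...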
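Note: when training a network, regularization, an exponentially decaying learning rate, and moving averages are three commonly used optimization techniques. The moving average and the exponentially decaying learning rate share the same global_step variable.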
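The role of the shared global_step can be sketched in plain Python. The two functions below follow the formulas documented for tf.train.exponential_decay and for tf.train.ExponentialMovingAverage when num_updates is given; the function names and the sample values (0.99 decay, 25 steps) are assumptions made for illustration, not part of the original code.

```python
def exponential_decay(base_lr, global_step, decay_steps, decay_rate, staircase=False):
    """lr = base_lr * decay_rate ** (global_step / decay_steps)."""
    if staircase:
        # Integer division: the rate drops in discrete steps.
        exponent = global_step // decay_steps
    else:
        exponent = global_step / decay_steps
    return base_lr * decay_rate ** exponent

def ema_update(shadow, value, decay, global_step):
    """One moving-average update whose decay depends on the step count."""
    effective_decay = min(decay, (1 + global_step) / (10 + global_step))
    return effective_decay * shadow + (1 - effective_decay) * value

global_step = 0
shadow = 0.0
for _ in range(25):
    global_step += 1  # one counter drives both schedules
    lr = exponential_decay(0.001, global_step, 10, 0.999, staircase=True)
    shadow = ema_update(shadow, 1.0, 0.99, global_step)
```

Because both schedules read the same counter, the learning rate and the effective averaging decay stay in step with the number of parameter updates actually performed.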
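Example: taking the example from the previous article, we add an exponentially decaying learning rate to speed up optimization, add regularization to improve generalization, and use the modular design described above to separate the red points from the blue points.
The code is split into three modules: dataset generation (generateds.py), forward propagation (forward.py), and backward propagation (backward.py).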
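(1) generateds.py

#coding:utf-8
import numpy as np

SEED = 2

def generateds():
    # Fixed seed so the dataset is reproducible.
    rdm = np.random.RandomState(SEED)
    # 300 points with two standard-normal coordinates each.
    X = rdm.randn(300, 2)
    # Label 1 if the point lies inside the circle x0^2 + x1^2 < 2, else 0.
    Y_ = [int(xi[0]*xi[0] + xi[1]*xi[1] < 2) for xi in X]
    # Plot colour: red for label 1, blue for label 0.
    Y_c = [['red' if yi else 'blue'] for yi in Y_]
    X = np.vstack(X).reshape(-1, 2)
    Y_ = np.vstack(Y_).reshape(-1, 1)
    return X, Y_, Y_c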
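The labelling rule can also be written without Python loops. The sketch below is an equivalent vectorized reformulation for checking shapes and labels, not the author's code; generateds_vectorized and its parameter n are hypothetical names, and Y_c is returned as a NumPy array rather than a list of lists.

```python
import numpy as np

SEED = 2

def generateds_vectorized(n=300):
    """Vectorized variant of generateds(): same seed, same labelling rule."""
    rdm = np.random.RandomState(SEED)
    X = rdm.randn(n, 2)
    # True where the point lies inside the circle x0^2 + x1^2 < 2.
    inside = (X ** 2).sum(axis=1) < 2
    Y_ = inside.astype(int).reshape(-1, 1)
    Y_c = np.where(inside, 'red', 'blue').reshape(-1, 1)
    return X, Y_, Y_c

X, Y_, Y_c = generateds_vectorized()
print(X.shape, Y_.shape, Y_c.shape)  # (300, 2) (300, 1) (300, 1)
```

The column shape (n, 1) of Y_ matches the [None, 1] placeholder that backward.py feeds it into.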
(2) forward.py

#coding:utf-8
import tensorflow as tf

def get_weight(shape, regularizer):
    # Weights start from a standard normal distribution.
    w = tf.Variable(tf.random_normal(shape), dtype=tf.float32)
    # Record the L2 penalty of w in the 'losses' collection.
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w

def get_bias(shape):
    # Biases start at the constant 0.01.
    b = tf.Variable(tf.constant(0.01, shape=shape))
    return b

def forward(x, regularizer):
    # Hidden layer: 2 inputs -> 11 units with ReLU activation.
    w1 = get_weight([2, 11], regularizer)
    b1 = get_bias([11])
    y1 = tf.nn.relu(tf.matmul(x, w1) + b1)

    # Output layer: 11 units -> 1 output, no activation.
    w2 = get_weight([11, 1], regularizer)
    b2 = get_bias([1])
    y = tf.matmul(y1, w2) + b2
    return y
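To follow the shapes through the network, the same 2-11-1 forward pass can be sketched in NumPy. This is an illustration under the assumption of random weights and a batch of 5 inputs; forward_np is a hypothetical helper, not part of forward.py.

```python
import numpy as np

def forward_np(x, w1, b1, w2, b2):
    """NumPy analogue of forward(): ReLU hidden layer, linear output layer."""
    y1 = np.maximum(x @ w1 + b1, 0.0)  # [N, 2] @ [2, 11] -> [N, 11]
    return y1 @ w2 + b2                # [N, 11] @ [11, 1] -> [N, 1]

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 2))    # batch of 5 two-dimensional inputs
w1 = rng.standard_normal((2, 11))
b1 = np.full(11, 0.01)
w2 = rng.standard_normal((11, 1))
b2 = np.full(1, 0.01)

y = forward_np(x, w1, b1, w2, b2)
print(y.shape)  # (5, 1)
```

The output layer has no activation, so y is an unbounded real value per input row, which is why backward.py compares it with the labels using a mean squared error.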
(3) backward.py

#coding:utf-8
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import generateds
import forward

DATA_NUM = 300
BATCH_SIZE = 30
REGULARIZER = 0.01
LR = 0.001
LR_DECAY_STEPS = DATA_NUM // BATCH_SIZE
LR_DECAY_RATE = 0.999
STEPS = 40000

def backward():
    # Placeholders for the inputs and the ground-truth labels.
    x = tf.placeholder(tf.float32, [None, 2])
    y_ = tf.placeholder(tf.float32, [None, 1])
    # Step counter, excluded from training.
    global_step = tf.Variable(0, trainable=False)

    X, Y_, Y_c = generateds.generateds()
    y = forward.forward(x, REGULARIZER)

    # Exponentially decaying learning rate, updated once per LR_DECAY_STEPS steps.
    lr = tf.train.exponential_decay(
        learning_rate=LR,
        global_step=global_step,
        decay_steps=LR_DECAY_STEPS,
        decay_rate=LR_DECAY_RATE,
        staircase=True
    )

    # Total loss = mean squared error + collected L2 penalties.
    loss_mse = tf.reduce_mean(tf.square(y - y_))
    loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))

    # Passing global_step lets the optimizer increment it on every update.
    train_step = tf.train.AdamOptimizer(lr).minimize(loss_total, global_step)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(STEPS):
            # Cycle through the dataset in mini-batches.
            start = (i * BATCH_SIZE) % DATA_NUM
            end = min(start + BATCH_SIZE, DATA_NUM)
            sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
            if i % 2000 == 0:
                loss_v = sess.run(loss_total, feed_dict={x: X, y_: Y_})
                print('After %d steps, loss is: %f' % (i, loss_v))

        # Evaluate the trained network on a grid covering [-3, 3) x [-3, 3).
        xx, yy = np.mgrid[-3:3:0.01, -3:3:0.01]
        grid = np.c_[xx.ravel(), yy.ravel()]
        probs = sess.run(y, feed_dict={x: grid})
        probs = probs.reshape(xx.shape)

        # Scatter the data points and draw the y = 0.5 contour as the boundary.
        plt.scatter(X[:, 0], X[:, 1], c=np.squeeze(Y_c))
        plt.contour(xx, yy, probs, levels=[.5])
        plt.show()

if __name__ == '__main__':
    backward()
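Result: running backward.py shows a scatter plot of the red and blue points, with the contour where the network output equals 0.5 drawn as the decision boundary.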
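The mini-batch slicing used in the training loop can be checked in isolation. The sketch below reuses the constants DATA_NUM and BATCH_SIZE from backward.py; batch_bounds is a hypothetical helper introduced only for this illustration.

```python
DATA_NUM = 300
BATCH_SIZE = 30

def batch_bounds(i, data_num=DATA_NUM, batch_size=BATCH_SIZE):
    """Return the [start, end) slice of the dataset used at training step i."""
    start = (i * batch_size) % data_num
    end = min(start + batch_size, data_num)
    return start, end

for i in (0, 1, 9, 10):
    print(i, batch_bounds(i))
# 0 (0, 30)
# 1 (30, 60)
# 9 (270, 300)
# 10 (0, 30)
```

With these constants one pass over the data takes DATA_NUM // BATCH_SIZE = 10 steps, which equals LR_DECAY_STEPS, so with staircase=True the learning rate is multiplied by LR_DECAY_RATE once per pass.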