Shallow Neural Network - Image Recognition - TF Implementation - Data Visualization [AI Learning]

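Building on the logistic regression from the previous post, we go one step further and use a shallow neural network to improve our learning method.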

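Shallow neural network:

A shallow neural network generally means a neural network with only one hidden layer.
The point of a neural network is to combine many learning units in order to improve prediction accuracy.

In other words, our earlier regression is equivalent to a single one of these neurons. By assembling many such regression equations into a learning network, the model can infer the result from more angles, so the result is also more accurate.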

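A shallow neural network generally looks like this:

The figure above illustrates the neural network we will implement today. In it, [i[n]] denotes the input layer, in other words the training data we feed in at one time (in the code below, each sample is a vector of 784 values, fed in batches of 128).

Without a neural network, the next step would go straight to the out node of the output layer. Here we add one more layer, the h layer (hidden). This layer is the neural network we are building; in this post the hidden layer will have 1000 nodes.

Combining what many nodes have learned improves accuracy, and softmax is then used to output the most likely result.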

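Note that this involves the concept of an activation function, which the earlier regression posts already touched on step by step. Its role is to introduce nonlinearity, so that the network can capture relationships between the inputs and the outputs that a purely linear model cannot. For details, see an explanation of the ReLU function.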
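As a small illustration of the activation function mentioned above, here is a plain-Python sketch of ReLU. The helper names `relu`, `two_linear_steps` and `with_relu`, and the coefficients in them, are made up for this example and are not part of TensorFlow. The sketch only shows that two stacked linear steps are still one linear step, while putting ReLU between them changes the shape of the function.

```python
def relu(z):
    """ReLU: keep positive values, replace negative values with 0."""
    return max(0.0, z)

# Illustrative values only (not taken from MNIST).
pre_activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
activated = [relu(z) for z in pre_activations]
print(activated)  # [0.0, 0.0, 0.0, 1.5, 3.0]

def two_linear_steps(x):
    """Two linear steps in a row: 3 * (2x + 1) - 4, which is just 6x - 1."""
    return 3.0 * (2.0 * x + 1.0) - 4.0

def with_relu(x):
    """The same two steps with ReLU in between: no longer a straight line."""
    return 3.0 * relu(2.0 * x + 1.0) - 4.0

for x in (-2.0, -1.0, 1.0):
    print(x, two_linear_steps(x), with_relu(x))
```

In the TensorFlow code of this post the same role is played by `tf.nn.relu`, applied to the output of the hidden layer.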

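Just as in yesterday's logistic regression, the network's final output still goes through a loss (cross-entropy), which is then optimized over many iterations, and the Softmax function produces the final output.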
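To make the softmax and cross-entropy step concrete, here is a minimal plain-Python sketch. The function names and the three-class numbers are assumptions chosen for illustration; the post itself relies on `tf.nn.softmax_cross_entropy_with_logits`, which combines both steps.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot_label, probs):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))

# Illustrative 3-class example (the post itself uses 10 classes).
logits = [2.0, 1.0, 0.1]
label = [1.0, 0.0, 0.0]
probs = softmax(logits)
loss = cross_entropy(label, probs)
wrong_loss = cross_entropy([0.0, 0.0, 1.0], probs)
print(probs, loss, wrong_loss)
```

The loss is small when the highest probability falls on the labelled class and grows when it falls elsewhere, which is what the optimizer pushes against.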

Code implementation

First, let's bring back the logistic regression code from before:

#-*- coding:utf-8 -*-
import time
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

MNIST = input_data.read_data_sets("./", one_hot=True)
cost_accum = []

learning_rate = 0.01
batch_size = 128
n_epochs = 25

X = tf.placeholder(tf.float32, [batch_size, 784])
Y = tf.placeholder(tf.float32, [batch_size, 10])

w = tf.Variable(tf.random_normal(shape=[784,10], stddev=0.01), name="weights")
b = tf.Variable(tf.zeros([1, 10]), name="bias")

logits = tf.matmul(X, w) + b

entropy = tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=logits)
loss = tf.reduce_mean(entropy)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    n_batches = int(MNIST.train.num_examples/batch_size)
    for i in range(n_epochs):
        for j in range(n_batches):
            X_batch, Y_batch = MNIST.train.next_batch(batch_size)
            # sess.run returns one value per fetch; keep only the loss
            _, loss_ = sess.run([optimizer, loss], feed_dict={ X: X_batch, Y: Y_batch})
            print("Loss of epochs[{0}] batch[{1}]: {2}".format(i, j, loss_))
            cost_accum.append(loss_)


    # Evaluate on the test set, one full batch at a time
    n_batches = int(MNIST.test.num_examples/batch_size)
    total_correct_preds = 0
    for i in range(n_batches):
        X_batch, Y_batch = MNIST.test.next_batch(batch_size)
        # Softmax is monotonic, so the argmax of the logits is the predicted class
        logits_ = sess.run(logits, feed_dict={X: X_batch})
        # Count the predictions that match the one-hot labels
        total_correct_preds += np.sum(np.argmax(logits_, 1) == np.argmax(Y_batch, 1))

    # Divide by the number of examples actually evaluated (full batches only)
    print("Accuracy {0}".format(total_correct_preds/float(n_batches*batch_size)))

plt.plot(range(len(cost_accum)), cost_accum, 'r')
plt.title('Logistic Regression Cost Curve')
plt.xlabel('epoch*batch')
plt.ylabel('loss')
print('show')
plt.show()
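The evaluation step in the listing above boils down to comparing the index of the largest predicted score with the index of the 1 in the one-hot label. The following plain-Python sketch shows that idea on made-up numbers; `argmax` and `count_correct` are hypothetical helpers, not TensorFlow functions. The last lines assume the standard MNIST test split of 10,000 images and show how many of them are covered when only full batches of 128 are evaluated.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def count_correct(pred_rows, label_rows):
    """Count rows where the predicted class equals the one-hot label's class."""
    return sum(argmax(p) == argmax(y) for p, y in zip(pred_rows, label_rows))

# Illustrative scores for 3 samples and 3 classes (made-up numbers).
preds = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
labels = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]
correct = count_correct(preds, labels)
print(correct, correct / float(len(preds)))

# Assumption: 10,000 test images, evaluated in full batches of 128 only.
num_examples, batch_size = 10000, 128
n_batches = num_examples // batch_size
print(n_batches, n_batches * batch_size)  # 78 9984
```

Because 16 test images are never evaluated in that setup, dividing the number of correct predictions by the number of images actually evaluated gives a slightly more faithful accuracy than dividing by 10,000.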

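Comparing with the figure earlier, what we are still missing is a function that turns the single learning model into many and assembles them into a hidden layer.

Using the definitions of w and b in the code above as a reference, we can sketch out a quick way to build such nodes: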

w = tf.Variable(tf.random_normal(shape=[784,10], stddev=0.01), name="weights")
b = tf.Variable(tf.zeros([1, 10]), name="bias")

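Here we need to add the output value of each node, which we call Z. Z replaces the logits expression from before (namely logits = tf.matmul(X, w) + b). We then add one more parameter, the activation function, for later use.
This gives the following function: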

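def add_layer(inputs, in_size, out_size, activation_function=None):
    # Initialize the variables
    # Weights drawn from a normal distribution: in_size rows, out_size columns
    # (in_size is 784 for the first layer and 1000 for the second)
    W = tf.Variable(tf.random_normal([in_size, out_size]))
    b = tf.Variable(tf.zeros([1, out_size]))  # bias initialized to zeros: 1 row, out_size columns

    Z = tf.matmul(inputs, W) + b  # linear step, needed in every layer

    # If no activation function is passed in, return Z unchanged to avoid an error
    if activation_function is None:
        outputs = Z
    else:
        outputs = activation_function(Z)

    return outputs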

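Then, in the original code, we replace the definitions of w and b with adding the hidden layer and building the output layer. The input layer does not need to be built, because it is already fixed by the training data we feed in.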
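It can help to trace the shapes through the two layers before wiring them up. The sketch below is plain Python, and `matmul_shape` is a hypothetical helper written for this illustration; it only applies the matrix multiplication rule to the sizes used in this post (batch of 128, 784 inputs, 1000 hidden nodes, 10 outputs).

```python
def matmul_shape(left, right):
    """Shape of left @ right for 2-D shapes; raises if inner dimensions differ."""
    if left[1] != right[0]:
        raise ValueError("inner dimensions differ: %s vs %s" % (left, right))
    return (left[0], right[1])

batch_size = 128
x_shape = (batch_size, 784)                            # input batch
hidden_shape = matmul_shape(x_shape, (784, 1000))      # after the hidden layer
output_shape = matmul_shape(hidden_shape, (1000, 10))  # after the output layer
print(hidden_shape, output_shape)  # (128, 1000) (128, 10)
```

The bias of each layer has shape (1, out_size), so it broadcasts across the batch dimension and leaves these shapes unchanged.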

# Initialization
X = tf.placeholder(tf.float32, [batch_size, 784])
Y = tf.placeholder(tf.float32, [batch_size, 10])
# Add the hidden layer
l1 = add_layer(X, 784, 1000, activation_function=tf.nn.relu)
pre = add_layer(l1, 1000, 10, activation_function=None)  # output layer
# w = tf.Variable(tf.random_normal(shape=[784,10], stddev=0.01), name="weights")
# b = tf.Variable(tf.zeros([1, 10]), name="bias")

# logits = tf.matmul(X, w) + b

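OK, the rework is done. After fixing up and debugging the code (I only added a few small display tweaks, nothing important), the code looks like this: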

#-*- coding:utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

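def add_layer(inputs, in_size, out_size, activation_function=None):
    # Initialize the variables
    # Weights drawn from a normal distribution: in_size rows, out_size columns
    # (in_size is 784 for the first layer and 1000 for the second)
    W = tf.Variable(tf.random_normal([in_size, out_size]))
    b = tf.Variable(tf.zeros([1, out_size]))  # bias initialized to zeros: 1 row, out_size columns

    Z = tf.matmul(inputs, W) + b  # linear step, needed in every layer

    # If no activation function is passed in, return Z unchanged to avoid an error
    if activation_function is None:
        outputs = Z
    else:
        outputs = activation_function(Z)

    return outputs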

MNIST = input_data.read_data_sets("./", one_hot=True)
cost_accum = []

learning_rate = 0.05
batch_size = 128
n_epochs = 1

X = tf.placeholder(tf.float32, [batch_size, 784])
Y = tf.placeholder(tf.float32, [batch_size, 10])
# Add the hidden layer
l1 = add_layer(X, 784, 1000, activation_function=tf.nn.relu)
pre = add_layer(l1, 1000, 10, activation_function=None)  # output layer
# w = tf.Variable(tf.random_normal(shape=[784,10], stddev=0.01), name="weights")
# b = tf.Variable(tf.zeros([1, 10]), name="bias")

# logits = tf.matmul(X, w) + b

entropy = tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=pre)
loss = tf.reduce_mean(entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
# optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    n_batches = int(MNIST.train.num_examples/batch_size)
    for i in range(n_epochs):
        for j in range(n_batches):
            X_batch, Y_batch = MNIST.train.next_batch(batch_size)
            # sess.run returns one value per fetch; keep only the loss
            _, loss_ = sess.run([optimizer, loss], feed_dict={ X: X_batch, Y: Y_batch})
            if j==0 :
                print("Loss of epochs[%s] batch[%s]: %s"%(i, j, loss_))
            cost_accum.append(loss_)


    # Evaluate on the test set, one full batch at a time
    n_batches = int(MNIST.test.num_examples/batch_size)
    total_correct_preds = 0
    for i in range(n_batches):
        X_batch, Y_batch = MNIST.test.next_batch(batch_size)
        # Run the output layer; the argmax of the logits is the predicted class
        preds = sess.run(pre, feed_dict={X: X_batch})
        # Count the predictions that match the one-hot labels
        total_correct_preds += np.sum(np.argmax(preds, 1) == np.argmax(Y_batch, 1))

    # Divide by the number of examples actually evaluated (full batches only)
    print("Accuracy {0}".format(total_correct_preds/float(n_batches*batch_size)))

plt.plot(range(len(cost_accum)), cost_accum, 'r')
plt.title('Shallow Neural Network Cost Curve')
plt.xlabel('epoch*batch')
plt.ylabel('loss')
print('show')
plt.show()

The result of running the code is shown below (after only one epoch of training, mind you):

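It is easy to see that the model fits much faster, that the loss fluctuates far less once training has settled, and that accuracy improves by one percentage point.
So what happens if we train for more epochs?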

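You can see that our results improve a great deal!
And what if we increase the number of neurons and go straight to 2000 nodes in the hidden layer?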

You can see that the improvement is small while training takes much longer, which is not a good trade.
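One rough way to see why the wider hidden layer costs more is to count trainable parameters. The sketch below is plain Python with hypothetical helper names; it counts one weight per input-output pair plus one bias per output node, matching the shapes used in `add_layer`. Parameter count is only a proxy for training cost, not a timing measurement.

```python
def layer_params(in_size, out_size):
    """Weights (in_size * out_size) plus one bias per output node."""
    return in_size * out_size + out_size

def network_params(hidden_nodes, n_inputs=784, n_outputs=10):
    """Total parameters of a network with a single hidden layer."""
    return (layer_params(n_inputs, hidden_nodes)
            + layer_params(hidden_nodes, n_outputs))

print(layer_params(784, 10))   # logistic regression: 7850
print(network_params(1000))    # 795010
print(network_params(2000))    # 1590010
```

Going from 1000 to 2000 hidden nodes roughly doubles the number of parameters to update at every step, which fits the observation that training became much slower for little gain.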