[Deep Learning] Neural Networks in Practice: Handwritten Digit Recognition + Image Verification

Original

2020-06-16 12:00

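When you first start with deep learning, neural networks are usually the first thing you learn about. I gave a brief introduction to neural networks in an earlier post; in this article I use the MNIST handwritten digit dataset as an example to show how to implement one.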

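1. Basic structure

A basic neural network consists of an input layer, hidden layers, and an output layer. This article uses a network with two hidden layers as the example.

Data enters at the input layer and then passes through the neurons, which extract features. A single neuron, expanded, is shown in the figure below.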

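The input is multiplied by the weights w and summed with the bias b, and the result passes through an activation function; only then is it the output of one neuron. (With a single neuron and a sigmoid activation, this reduces to logistic regression.) There are many activation functions, such as ReLU, sigmoid, and tanh; a later post will cover them in detail. The main reason for using an activation function is to introduce nonlinearity. Training then consists of using gradient descent to minimize the loss function computed at the output.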
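
The single-neuron computation described above can be sketched in plain Python. This is a minimal illustration, not part of the original script: the function names and the input, weight, and bias values are hypothetical and chosen only to show the weighted sum followed by an activation.

```python
import math

def sigmoid(z):
    """Squash z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Pass positive values through, clamp negative values to 0."""
    return max(0.0, z)

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# Hypothetical values, chosen only for illustration.
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1

print("sigmoid:", neuron(x, w, b, sigmoid))
print("relu:   ", neuron(x, w, b, relu))
print("tanh:   ", neuron(x, w, b, math.tanh))
```

With the sigmoid activation, this one neuron is exactly the model used in logistic regression.
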
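The example here uses plain fully connected layers and leaves the activation function out. Without a nonlinearity, the stacked layers are equivalent to a single linear layer, which limits the accuracy the model can reach.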
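
The effect of leaving out the activation function can be checked with a small sketch. It assumes two tiny fully connected layers with made-up weights (all names and numbers here are hypothetical, and it uses the column-vector form y = Wx + b rather than the row-vector form of the TensorFlow script). It shows that two linear layers give the same output as one merged linear layer.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both given as lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]

# Hypothetical layer parameters: 2 inputs -> 3 hidden units -> 2 outputs.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
b1 = [0.5, -0.5, 1.0]
W2 = [[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0]]
b2 = [0.0, 1.0]

def two_layers(x):
    """Two fully connected layers with no activation in between."""
    hidden = add(matvec(W1, x), b1)
    return add(matvec(W2, hidden), b2)

# Merge both layers into one: W = W2 W1, b = W2 b1 + b2.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)

def one_layer(x):
    """A single linear layer built from the merged parameters."""
    return add(matvec(W, x), b)

x = [2.0, -3.0]
print(two_layers(x))
print(one_layer(x))
```

Both calls print the same vector, so extra layers add no expressive power until a nonlinearity such as ReLU is placed between them.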

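2. Code implementation

We use the TensorFlow framework here. The MNIST handwritten digit example has been done to death, and the official TensorFlow site uses it as its getting-started tutorial, so today we do something a little different: we not only train the network but also feed it an image of our own for the model to recognize.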

import tensorflow as tf
import cv2
import numpy as np

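# Load the MNIST dataset. If the automatic download fails, download the files from the
# official site first and place them under input_data/ in the project directory.
# This script uses the TensorFlow 1.x API (tf.placeholder, tf.Session); the
# tensorflow.examples.tutorials module was removed in TensorFlow 2.x.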
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("input_data/", one_hot=True)

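# Network parameters
num_input = 784   # MNIST images are 28*28 pixels, so the input size is 784
n_hidden_1 = 256  # neurons in the first hidden layer
n_hidden_2 = 256  # neurons in the second hidden layer
num_output = 10   # output layer: one unit per digit class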

# Model class
class Model(object):
    def __init__(self, learning_rate, num_steps, batch_size, display_step):
        self.learning_rate = learning_rate  # learning rate
        self.num_steps = num_steps          # number of training steps
        self.batch_size = batch_size        # batch size
        self.display_step = display_step    # logging interval, in steps

        # Weights. Note: the weights must not all be initialized to zero.
        self.weights = {
            'h1': tf.Variable(tf.random_normal([num_input, n_hidden_1])),
            'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
            'out': tf.Variable(tf.random_normal([n_hidden_2, num_output])),
        }

        # Biases
        self.biases = {
            'h1': tf.Variable(tf.random_normal([n_hidden_1])),
            'h2': tf.Variable(tf.random_normal([n_hidden_2])),
            'out': tf.Variable(tf.random_normal([num_output])),
        }

    # Network definition: three fully connected layers with no activation function
    def neural_net(self, x):
        layer_1 = tf.add(tf.matmul(x, self.weights['h1']), self.biases['h1'])
        layer_2 = tf.add(tf.matmul(layer_1, self.weights['h2']), self.biases['h2'])
        out_layer = tf.add(tf.matmul(layer_2, self.weights['out']), self.biases['out'])
        return out_layer

    # Train the model
    def train(self):
        # Placeholders for the input images and the one-hot labels
        X = tf.placeholder(tf.float32, shape=[None, num_input])
        Y = tf.placeholder(tf.float32, shape=[None, num_output])
        # Build the model
        logits = self.neural_net(X)
        pred = tf.nn.softmax(logits)

        # Loss: softmax cross-entropy averaged over the batch
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
        # Accuracy: fraction of samples whose predicted class matches the label
        correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(Y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

        # Optimizer
        optimizer = tf.train.AdamOptimizer(self.learning_rate).minimize(loss)
        init = tf.global_variables_initializer()
        saver = tf.train.Saver()

        with tf.Session() as sess:
            sess.run(init)

            for step in range(1, self.num_steps + 1):
                batch_x, batch_y = mnist.train.next_batch(batch_size=self.batch_size)
                sess.run(optimizer, feed_dict={X: batch_x, Y: batch_y})

                if step % self.display_step == 0 or step == 1:
                    loss_v, acc = sess.run([loss, accuracy], feed_dict={X: batch_x, Y: batch_y})

                    print("Step {}, Minibatch Loss= {:.4f}, Training Accuracy= {:.3f}".format(
                        step, loss_v, acc))
            print("optimization finished!")
            saver.save(sess, './model/neural_net.ckpt')
            # Compute accuracy on the test set
            print("Testing Accuracy:", sess.run(accuracy, feed_dict={X: mnist.test.images,
                                                                     Y: mnist.test.labels}))

    # Evaluation: read a custom image and print the digit the model predicts for it
    def evaluate(self, img_dir):
        with tf.Session() as sess:
            # Load as grayscale, resize to 28x28 and normalize the pixel values
            image = cv2.imread(img_dir, cv2.IMREAD_GRAYSCALE)
            if image is None:
                raise IOError("cannot read image: " + img_dir)
            im = cv2.resize(image.astype(np.float32), (28, 28), interpolation=cv2.INTER_CUBIC)
            # Note: input_data scales the training pixels to [0, 1],
            # whereas this maps them to [-0.5, 0.5].
            img_gray = (im - (255 / 2.0)) / 255
            cv2.imshow('out', img_gray)
            cv2.waitKey(0)
            img = np.reshape(img_gray, [-1, 784])  # -1 lets NumPy infer this dimension
            # Restore the trained weights
            saver = tf.train.Saver()
            saver.restore(sess, save_path='./model/neural_net.ckpt')
            # Build the model and run the prediction
            X = tf.placeholder(tf.float32, shape=[None, num_input])
            logits = self.neural_net(X)
            pred = tf.nn.softmax(logits)
            prediction = tf.argmax(pred, 1)
            predint = prediction.eval(feed_dict={X: img}, session=sess)
            print(predint)


if __name__ == '__main__':
    model = Model(learning_rate=0.01, num_steps=5000, batch_size=128, display_step=100)
    model.train()
    model.evaluate("test.jpg")
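
To make the loss and accuracy lines in train() more concrete, here is a small plain-Python sketch of the same quantities: softmax, cross-entropy against a one-hot label, and argmax-based accuracy, each averaged over a batch. It is an illustration under stated assumptions rather than TensorFlow's implementation; the helper names and the three-class batch are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, one_hot):
    """Negative log-probability assigned to the true class."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def batch_metrics(batch_logits, batch_labels):
    """Mean cross-entropy loss and accuracy over a batch."""
    losses = [cross_entropy(l, y) for l, y in zip(batch_logits, batch_labels)]
    correct = [argmax(l) == argmax(y) for l, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses), sum(correct) / len(correct)

# Hypothetical batch: 3 samples, 3 classes (MNIST would have 10 classes).
logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3], [1.2, 0.2, 0.4]]
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

loss, acc = batch_metrics(logits, labels)
print("loss:", loss)
print("accuracy:", acc)
```

The third sample is misclassified (the largest logit is at index 0 but the label is class 2), so the accuracy for this batch is two thirds.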

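3. Final result

I trained for 50,000 iterations (the script above sets num_steps=5000), and the final accuracy on the test set reached about 88%. I then drew a "3" myself in Paint.

Using that image as input, the model correctly recognized it as 3.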
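
When feeding your own drawing to the model, it helps if the input resembles the training data. The MNIST images loaded by input_data are scaled to [0, 1] with light strokes on a dark background. The sketch below is one possible preprocessing step, not the method used in the script above. It assumes a 28x28 grayscale image given as nested lists, with dark ink on a light background; the function name to_mnist_input and the synthetic test image are hypothetical.

```python
def to_mnist_input(gray, invert=True):
    """Flatten a 28x28 grid of 0..255 pixels into 784 values in [0, 1].

    invert=True flips dark-on-light drawings so the strokes become bright,
    which matches the MNIST convention.
    """
    flat = [float(p) for row in gray for p in row]
    if len(flat) != 784:
        raise ValueError("expected a 28x28 image")
    if invert:
        flat = [255.0 - p for p in flat]
    return [p / 255.0 for p in flat]

# Synthetic stand-in for a drawing: white background with a black vertical stroke.
image = [[255] * 28 for _ in range(28)]
for r in range(4, 24):
    image[r][14] = 0

vec = to_mnist_input(image)
print(len(vec), min(vec), max(vec))
```

The resulting list can be wrapped as a single-row batch and fed to the X placeholder in place of img.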
