Understanding the paper "Deep Learning with Limited Numerical Precision" and verifying its conclusions

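Core idea: use fixed-point numbers in place of floating-point numbers in deep learning. The conventional way to round to fixed point is round-to-nearest; the paper introduces a different scheme, stochastic rounding. When a value cannot be represented exactly at the available precision, it is rounded at random to one of its two nearest representable values, and the closer neighbour is the more likely one: the probability of picking a neighbour is 1 minus the distance to it divided by the spacing between the two. For example, to keep one decimal place of x=1.735, the two nearest values are 1.7 and 1.8. Since |x-1.7|=0.035 and |x-1.8|=0.065 and the spacing is 0.1, x is rounded down to 1.7 with probability 65% and up to 1.8 with probability 35%.
The paper points out that this makes the expected rounding error zero (here 0.65*1.7 + 0.35*1.8 = 1.735), and that it works better than round-to-nearest.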
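As a rough illustration of the example above, the following sketch rounds x=1.735 to one decimal place many times and checks that the average stays close to x. The function name `stochastic_round`, the sample count and the seed are assumptions made for this illustration; none of them come from the paper.

```python
import math
import random


def stochastic_round(x, step, rng):
    """Round x to a multiple of `step`, rounding up with probability equal
    to the fractional position of x between its two neighbours."""
    scaled = x / step
    lower = math.floor(scaled)
    p_up = scaled - lower  # about 0.35 for x=1.735, step=0.1
    return (lower + (1 if rng.random() < p_up else 0)) * step


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [stochastic_round(1.735, 0.1, rng) for _ in range(200000)]
mean = sum(samples) / len(samples)
share_up = sum(1 for s in samples if s > 1.75) / len(samples)
print("mean of rounded values:", mean)
print("share rounded up to 1.8:", share_up)
```

Round-to-nearest would return 1.7 every time, so its error of -0.035 never averages out; the stochastic version is noisier per sample but unbiased on average.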
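Quantizing floating-point numbers to fixed point has two benefits:
1. Floating-point arithmetic is slow: every operation involves steps such as exponent alignment
2. Fixed point can match the floating-point results with half the bits (16 instead of 32 in the experiments below), which saves memory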

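The paper defines the two rounding modes as follows:

Notation:
IL: number of integer bits of the fixed-point number
FL: number of fractional bits of the fixed-point number
<IL,FL>: denotes a fixed-point format
WL: word length of the fixed-point number, i.e. WL=IL+FL
x: the original number
e: the smallest representable step, e=2^(-FL)
Round-to-nearest: rounding to the nearest representable value
Stochastic: stochastic rounding
w.p.: with probability ...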
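To make the notation concrete, here is a small sketch that computes the step and the representable range of a signed <IL,FL> format. The range [-2^(IL-1), 2^(IL-1) - 2^(-FL)] is the one used in the quantization code later in this post; the helper name `fixed_point_format` is made up for this illustration.

```python
def fixed_point_format(il, fl):
    """Return (minimum, maximum, step) of the signed <il, fl> fixed-point format."""
    eps = 2.0 ** (-fl)
    lowest = -(2.0 ** (il - 1))
    highest = 2.0 ** (il - 1) - eps
    return lowest, highest, eps


# The three 16-bit formats tried later in this post (WL = IL + FL = 16).
for fl in (8, 10, 14):
    il = 16 - fl
    print((il, fl), fixed_point_format(il, fl))
```

More fractional bits give a finer step but a narrower range, which is the trade-off behind trying FL=8, 10 and 14.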

[Image in the original post: the paper's definitions of round-to-nearest and stochastic rounding]
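If a value overflows the representable range, it is replaced (saturated) with the largest or smallest representable value.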
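For reference, the definitions are usually written along the following lines. This is a sketch reconstructed from the notation above, not a verbatim copy of the paper: ε stands for e, ⌊x⌋ is assumed to mean the largest integer multiple of ε that does not exceed x, and the handling of exact ties in round-to-nearest is an assumption.

```latex
\[
\mathrm{Round}_{\text{nearest}}(x,\langle IL,FL\rangle)=
\begin{cases}
\lfloor x\rfloor & \text{if } \lfloor x\rfloor \le x \le \lfloor x\rfloor+\frac{\epsilon}{2},\\
\lfloor x\rfloor+\epsilon & \text{if } \lfloor x\rfloor+\frac{\epsilon}{2} < x \le \lfloor x\rfloor+\epsilon
\end{cases}
\]
\[
\mathrm{Round}_{\text{stochastic}}(x,\langle IL,FL\rangle)=
\begin{cases}
\lfloor x\rfloor & \text{w.p. } 1-\frac{x-\lfloor x\rfloor}{\epsilon},\\
\lfloor x\rfloor+\epsilon & \text{w.p. } \frac{x-\lfloor x\rfloor}{\epsilon}
\end{cases}
\]
\[
\mathrm{Convert}(x,\langle IL,FL\rangle)=
\begin{cases}
-2^{IL-1} & \text{if } x \le -2^{IL-1},\\
2^{IL-1}-2^{-FL} & \text{if } x \ge 2^{IL-1}-2^{-FL},\\
\mathrm{Round}(x,\langle IL,FL\rangle) & \text{otherwise}
\end{cases}
\]
\[
\mathbb{E}\bigl[\mathrm{Round}_{\text{stochastic}}(x)\bigr]
=\lfloor x\rfloor\Bigl(1-\frac{x-\lfloor x\rfloor}{\epsilon}\Bigr)
+\bigl(\lfloor x\rfloor+\epsilon\bigr)\frac{x-\lfloor x\rfloor}{\epsilon}
=x
\]
```

The last line is the reason the expected rounding error is zero as long as no saturation occurs.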

The paper then designs experiments to test this claim:

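(1) Results on the MNIST dataset with a DNN:

The upper two plots compare 16-bit fixed point with round-to-nearest against 32-bit floating point: the training error is on the left (note: the training error is not given as a percentage) and the test error on the right. Naturally neither rounding mode matches floating point, but stochastic rounding comes closer to it.
The lower two plots compare 16-bit fixed point with stochastic rounding against floating point, again with the training set on the left and the test set on the right. Here the performance is on par with floating point.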

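(2) Results on the MNIST dataset with a CNN:
(3) Experimental setup:

Floating-point fully connected DNN on MNIST:
The DNN is structured as follows:
Two hidden layers with 1000 units each
Weight initialization: normal distribution with mean 0 and variance 0.01, i.e. N(0, 0.01)
Bias initialization: all zeros
Stochastic gradient descent with a minibatch size of 100

Result: the error on the test set is 1.4%
I checked this with tensorflow:
I also used L2 regularization and a gradually decaying learning rate. The paper trains for 30 epochs. The MNIST training set here has 55000 examples and one batch holds 100, so that is 30*55000/100=16500 iterations, with results printed every 55000/100=550 iterations. The results are broadly consistent with the paper.
After a lot of tuning, I finally set the regularization parameter to 0.005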
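The iteration count and the decaying learning rate can be sketched in a few lines. This assumes that tf.train.exponential_decay in its default (non-staircase) mode computes base_lr * decay_rate ** (step / decay_steps); the helper name `decayed_learning_rate` is made up for this illustration, while 0.8, 0.95 and 550 are the values used in the listing below.

```python
def decayed_learning_rate(base_lr, step, decay_steps, decay_rate):
    """Smooth exponential decay: base_lr * decay_rate ** (step / decay_steps)."""
    return base_lr * decay_rate ** (step / decay_steps)


train_examples = 55000
batch = 100
epochs = 30
steps_per_epoch = train_examples // batch  # 550 iterations per epoch
total_steps = epochs * steps_per_epoch     # 16500 iterations in total

for epoch in (0, 1, 10, 30):
    step = epoch * steps_per_epoch
    print(epoch, decayed_learning_rate(0.8, step, steps_per_epoch, 0.95))
```

With the CNN settings further down (base rate 0.1), step 1000 gives about 0.0911, which agrees with the learning rate printed in that run's log.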

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
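batch=100  # minibatch size
input_node = 28*28  # flattened 28x28 input image
output_node = 10  # ten digit classes
fc_node1 = 1000  # units in the first hidden layer
fc_node2=1000  # units in the second hidden layer
regularization_rate=0.0005  # L2 regularization strength
epoch=40000  # number of training iterations (minibatches), not passes over the data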
def trains(mnist):
    init = tf.truncated_normal_initializer(stddev=0.01)
    initb = tf.constant_initializer(0.0)
    x=tf.compat.v1.placeholder(dtype=tf.float32,shape=[None,input_node],name='x-input')
    y_=tf.compat.v1.placeholder(dtype=tf.float32,shape=[None,output_node],name='y-input')

    weight1=tf.get_variable("weight1",[input_node,fc_node1],initializer=init)
    bias1=tf.get_variable('bias1',[fc_node1],initializer=initb)
    weight2=tf.get_variable('weight2',[fc_node1,fc_node2],initializer=init)
    bias2=tf.get_variable('bias2',[fc_node2],initializer=initb)
    weight3 = tf.get_variable('weight3', [fc_node2, output_node], initializer=init)
    bias3=tf.get_variable('bias3',[output_node],initializer=initb)

    hidden1 = tf.nn.relu(tf.matmul(x, weight1) + bias1)
    hidden2=tf.nn.relu(tf.matmul(hidden1,weight2)+bias2)
    y=tf.matmul(hidden2,weight3)+bias3
    regularizer = tf.contrib.layers.l2_regularizer(regularization_rate)
    loss=regularizer(weight1)+regularizer(weight2)
    global_step = tf.Variable(0, trainable=False)
    losses=loss+tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1)))

    learning_rate = tf.compat.v1.train.exponential_decay(0.8, global_step, mnist.train.num_examples / batch, 0.95)

    train_step = tf.compat.v1.train.GradientDescentOptimizer(learning_rate).minimize(losses, global_step=global_step)
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    with tf.compat.v1.Session() as sess:
        tf.global_variables_initializer().run()
        validata_feed = {x: mnist.validation.images, y_: mnist.validation.labels}
        test_feed = {x: mnist.test.images, y_: mnist.test.labels}
        for i in range(epoch):
            if i % 1000 == 0:
                validata_acc = sess.run(accuracy, feed_dict=validata_feed)
                print("After %d training steps, validation-set accuracy: %g%%" % (i, validata_acc*100))
                print("Current learning rate: %g"%(sess.run(learning_rate)))
            if i%5000==0:
                test_acc = sess.run(accuracy, feed_dict=test_feed)
                print("After %d training steps, test-set accuracy: %g%%" % ( i, test_acc*100,))
            xs, ys = mnist.train.next_batch(batch)
            sess.run(train_step, feed_dict={x: xs, y_: ys})

        test_acc = sess.run(accuracy, feed_dict=test_feed)
        print("After %d training steps, test-set accuracy: %g%%" % (epoch, test_acc * 100,))


def main(argv=None):
    mnist = input_data.read_data_sets("C:/Users/tang/Desktop/deeplearning/mnist數據集", one_hot=True)
    trains(mnist)
    
if __name__ == '__main__':
    tf.app.run()

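The output is as follows:

After 0 epochs, validation-set accuracy: 11.63%
After 0 epochs, training-set accuracy: 11.4927%
After 1 epochs, validation-set accuracy: 96.34%
After 1 epochs, training-set accuracy: 96.6564%
After 2 epochs, validation-set accuracy: 96.94%
After 2 epochs, training-set accuracy: 97.6182%
......
After 28 epochs, validation-set accuracy: 98.53%
After 28 epochs, training-set accuracy: 99.9927%
After 29 epochs, validation-set accuracy: 98.54%
After 29 epochs, training-set accuracy: 99.9891%
After 30 epochs, test-set accuracy: 98.62%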

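The final test-set accuracy is 98.62%, i.e. an error of 1.38%.
The plot looks roughly like this (image not preserved):

Next I changed floating point to fixed point; it took two days of struggling to get the program written.

tensorflow is a static framework: once the model graph is built it cannot be changed dynamically, and the framework has no fixed-point data type. I eventually came up with a workaround: after every training step, read the weights out of the network, quantize them, and write them back. This uses fixed point only indirectly, because the computation itself is still carried out in floating point, which the framework makes unavoidable. Building the network by hand would take too long and would run far less efficiently than the framework, so this indirect approach is the only practical way to do the verification.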

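The quantization routine is as follows:

from random import random

FL = 8  # number of fractional bits
IL = 16-FL  # number of integer bits
round = False  # False: round-to-nearest, True: stochastic rounding
MIN = -(1 << (IL - 1))
MAX = -MIN - 2 ** (-FL)
# representable range: [-2^(IL-1), 2^(IL-1) - 2^(-FL)]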
def float_to_fixed(x):

    global MIN, MAX, FL
    #print(FL)
    if x <= MIN: return MIN
    if x >= MAX: return MAX
    sig = 1
    if x < 0:
        sig = -1
        x = -x
    q = int(x)
    x -= q
    e = 1
    for i in range(FL):
        x *= 2
        e /= 2
        if x >= 1:
            x -= 1
            q += e

    if round:  # stochastic rounding
        r = random()  # uniform random number in [0, 1)
        if r < x:
            q += e
    else:  # round-to-nearest
        if x >= 0.5:
            q += e
    q *= sig
    if q <= MIN: return MIN
    if q >= MAX: return MAX
    return q

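It took a day of tinkering to hook this up to tensorflow, but it ran far too slowly. I looked into why: Python itself is slow, and the quantization routine written in Python runs far slower than tensorflow, by a factor of about 50 at a guess. That evening, browsing Zhihu, I came across a gem: numba. Its jit can speed up compute-heavy code and loops in Python by tens of times. It compiles the Python code just in time to native machine code, reaching speeds comparable to C, and the more the code is written like C++, the bigger the speedup. The next day I quickly added this magic trick and it finally ran. Judging by CPU usage, it now roughly keeps pace with tensorflow.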
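As an aside, the same quantization can also be sketched with vectorized NumPy operations, which avoids the per-element Python loop altogether. This is an alternative illustration, not the approach used in this post: the function name `quantize_array` and its parameters are assumptions, and unlike the loop-based routine (which works on the magnitude), this sketch rounds exact ties towards positive infinity.

```python
import numpy as np


def quantize_array(arr, fl=8, wl=16, stochastic=False, rng=None):
    """Quantize a float array onto the <wl - fl, fl> fixed-point grid."""
    il = wl - fl
    eps = 2.0 ** (-fl)
    lo = -(2.0 ** (il - 1))
    hi = 2.0 ** (il - 1) - eps
    scaled = np.asarray(arr, dtype=np.float64) / eps
    floor = np.floor(scaled)
    frac = scaled - floor                      # position between the two neighbours
    if stochastic:
        rng = np.random.default_rng() if rng is None else rng
        round_up = rng.random(scaled.shape) < frac   # up with probability frac
    else:
        round_up = frac >= 0.5                       # round-to-nearest
    q = (floor + round_up) * eps
    return np.clip(q, lo, hi)                  # saturate on overflow


weights = np.array([0.3, 1000.0, -1000.0])
print(quantize_array(weights))                 # nearest rounding plus saturation
print(quantize_array(np.full(5, 0.3), stochastic=True,
                     rng=np.random.default_rng(0)))
```

Whether this is faster than the numba version would have to be measured; it is shown only because it keeps the whole array operation inside NumPy.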

The complete code follows:

import tensorflow as tf
from random import random
from tensorflow.examples.tutorials.mnist import input_data
import os
from numba import jit, int32, float32, boolean, cuda
from matplotlib import pyplot as plt

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
batch = 100
input_node = 28 * 28
output_node = 10
fc_node1 = 1000
fc_node2 = 1000
regularization_rate = 0.0001
epoch = 16500


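FL = 8  # number of fractional bits
IL = 16-FL  # number of integer bits
round = False  # False: round-to-nearest, True: stochastic rounding
# the paper's settings to verify: FL=8,10,14 with round=False and FL=8,10,14 with round=True, six cases in all
# representable range: [-2^(IL-1), 2^(IL-1) - 2^(-FL)]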
MIN = -(1 << (IL - 1))
MAX = -MIN - 2 ** (-FL)

@jit(float32(float32), nopython=True)
def float_to_fixed(x):

    global MIN, MAX, FL
    #print(FL)
    if x <= MIN: return MIN
    if x >= MAX: return MAX
    sig = 1
    if x < 0:
        sig = -1
        x = -x
    q = int(x)
    x -= q
    e = 1
    for i in range(FL):
        x *= 2
        e /= 2
        if x >= 1:
            x -= 1
            q += e

    if round:  # stochastic rounding
        r = random()  # uniform random number in [0, 1)
        if r < x:
            q += e
    else:  # round-to-nearest
        if x >= 0.5:
            q += e

    q *= sig
    if q <= MIN: return MIN
    if q >= MAX: return MAX
    return q


@jit(nopython=True)
def exchange2(arr):
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            arr[i, j] = float32(arr[i, j])
            arr[i, j] = float_to_fixed(arr[i, j])
    return None


@jit(nopython=True)
def exchange1(arr):
    for i in range(arr.shape[0]):
        arr[i] = float_to_fixed(arr[i])

    return None


train_error = []
test_error = []


def trains(mnist):
    init = tf.truncated_normal_initializer(stddev=0.01)
    initb = tf.constant_initializer(0.0)
    x = tf.compat.v1.placeholder(dtype=tf.float32, shape=[None, input_node], name='x-input')
    y_ = tf.compat.v1.placeholder(dtype=tf.float32, shape=[None, output_node], name='y-input')

    Swap_weight1 = tf.compat.v1.placeholder(dtype=tf.float32, shape=[input_node, fc_node1], name='Swap_weight1')
    Swap_bias1 = tf.compat.v1.placeholder(dtype=tf.float32, shape=[fc_node1], name='Swap_bias1')
    Swap_weight2 = tf.compat.v1.placeholder(dtype=tf.float32, shape=[fc_node1, fc_node2], name='Swap_weight2')
    Swap_bias2 = tf.compat.v1.placeholder(dtype=tf.float32, shape=[fc_node2], name='Swap_bias2')
    Swap_weight3 = tf.compat.v1.placeholder(dtype=tf.float32, shape=[fc_node2, output_node], name='Swap_weight3')
    Swap_bias3 = tf.compat.v1.placeholder(dtype=tf.float32, shape=[output_node], name='Swap_bias3')

    weight1 = tf.get_variable("weight1", [input_node, fc_node1], initializer=init)
    bias1 = tf.get_variable('bias1', [fc_node1], initializer=initb)
    weight2 = tf.get_variable('weight2', [fc_node1, fc_node2], initializer=init)
    bias2 = tf.get_variable('bias2', [fc_node2], initializer=initb)
    weight3 = tf.get_variable('weight3', [fc_node2, output_node], initializer=init)
    bias3 = tf.get_variable('bias3', [output_node], initializer=initb)

    sw1 = tf.assign(weight1, Swap_weight1)
    sw2 = tf.assign(weight2, Swap_weight2)
    sw3 = tf.assign(weight3, Swap_weight3)
    sb1 = tf.assign(bias1, Swap_bias1)
    sb2 = tf.assign(bias2, Swap_bias2)
    sb3 = tf.assign(bias3, Swap_bias3)

    hidden1 = tf.nn.relu(tf.matmul(x, weight1) + bias1)
    hidden2 = tf.nn.relu(tf.matmul(hidden1, weight2) + bias2)

    y = tf.matmul(hidden2, weight3) + bias3
    regularizer = tf.contrib.layers.l2_regularizer(regularization_rate)
    loss = regularizer(weight1) + regularizer(weight2)
    global_step = tf.Variable(0, trainable=False)
    losses = loss + tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1)))
    learning_rate = tf.compat.v1.train.exponential_decay(0.8, global_step, mnist.train.num_examples / batch, 0.95)

    train_step = tf.compat.v1.train.GradientDescentOptimizer(learning_rate).minimize(losses, global_step=global_step)
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    with tf.compat.v1.Session() as sess:
        tf.global_variables_initializer().run()
        test_feed = {x: mnist.test.images, y_: mnist.test.labels}

        for i in range(epoch):
            if i % 550 == 0:  # with batch=100, 550 iterations make up one epoch; 30 epochs in total
                # this accuracy is measured on the test set, although the message says "validation set"
                test_acc = sess.run(accuracy, feed_dict=test_feed)
                print("After %d epochs, validation-set accuracy: %g%%" % (i / 550, test_acc * 100))
                # print("Current learning rate: %g"%(sess.run(learning_rate)))
                train_acc = 0
                for j in range(10):
                    XS, YS = mnist.train.next_batch(5500)
                    train_acc += sess.run(accuracy, feed_dict={x: XS, y_: YS})
                print("After %d epochs, training-set accuracy: %g%%" % (i / 550, train_acc * 10))
                train_error.append(100 - train_acc * 10)
                test_error.append(100 - test_acc * 100)

            xs, ys = mnist.train.next_batch(batch)
            sess.run(train_step, feed_dict={x: xs, y_: ys})

            w1, w2, w3, b1, b2, b3 = sess.run([weight1, weight2, weight3, bias1, bias2, bias3])
            # print("befroe:",w1)
            exchange2(w1)
            # print("after:",w1)
            exchange2(w2)
            exchange2(w3)
            exchange1(b1)
            exchange1(b2)
            exchange1(b3)
            sess.run([sw1, sw2, sw3], feed_dict={Swap_weight1: w1, Swap_weight2: w2, Swap_weight3: w3})
            sess.run([sb1, sb2, sb3], feed_dict={Swap_bias1: b1, Swap_bias2: b2, Swap_bias3: b3})

        test_acc = sess.run(accuracy, feed_dict=test_feed)
        print("After %d epochs, test-set accuracy: %g%%" % (epoch / 550, test_acc * 100))


def main(argv=None):
    mnist = input_data.read_data_sets("C:/Users/tang/Desktop/deeplearning/mnist數據集", one_hot=True)
    trains(mnist)
    xlable ='(FL='
    xlable+=str(FL)+' and round to '
    if round :  # round=True selects stochastic rounding
        xlable+='Stochastic '
    else :
        xlable+='nearest '

    xlable+=') Training epoch'
    plt.plot(range(0, epoch//550), train_error, label="train error")
    plt.plot(range(0, epoch//550), test_error, label='test error')
    plt.xlabel(xlable)
    plt.ylabel('Error (%)')
    plt.legend()
    plt.show()


if __name__ == '__main__':
    tf.app.run()

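This ran on my clunker of a laptop and took close to an hour. Here is the output for FL=8 with round-to-nearest:

After 0 epochs, validation-set accuracy: 5.95%
After 0 epochs, training-set accuracy: 6.53818%
After 1 epochs, validation-set accuracy: 90.46%
After 1 epochs, training-set accuracy: 90.4982%
After 2 epochs, validation-set accuracy: 92.94%
After 2 epochs, training-set accuracy: 93.1145%
......
After 29 epochs, validation-set accuracy: 97.41%
After 29 epochs, training-set accuracy: 99.9436%
After 30 epochs, test-set accuracy: 97.47%

The plot looks roughly like this (image not preserved):

This result broadly agrees with the plot in the paper.
The training error is 0.05% and the test error 2.53%. It is only to be expected that this beats the paper, because all the intermediate computation in my program is done in 32-bit floating point.
Here is the result for 8 fractional bits with stochastic rounding:

After 0 epochs, validation-set accuracy: 13.98%
After 0 epochs, training-set accuracy: 14.1473%
After 1 epochs, validation-set accuracy: 96.21%
After 1 epochs, training-set accuracy: 96.5145%
After 2 epochs, validation-set accuracy: 97.08%
......
After 29 epochs, validation-set accuracy: 98.28%
After 29 epochs, training-set accuracy: 99.9527%
After 30 epochs, test-set accuracy: 98.3%

The plot looks roughly like this (image not preserved):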
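Next, the CNN setup on the MNIST dataset.
The CNN architecture is similar to the LeNet-5 model.
CNN structure:
Two convolutional layers, each followed by a max-pooling layer
The first convolutional layer has 8 filters of size 5x5
The second convolutional layer has 16 filters of size 5x5
Both pooling layers subsample over 2x2 windows with a stride of 2x2
The fully connected layer has 128 units, followed by a softmax output layer with 10 units
Activation function: ReLU
Exponentially decaying learning rate (initially 0.1, decay rate 0.95, updated at every batch)
Gradient descent with momentum (p=0.9)
L2 regularization with a weight decay of 0.0005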
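To see how these layers shrink the 28x28 input, the sketch below tracks the spatial size through the two conv/pool stages and reports the length of the flattened vector fed to the fully connected layer. It assumes SAME padding throughout, as in the listing further down; this post does not say what padding the paper uses, and the helper names are made up for this illustration.

```python
import math


def same_output_size(size, stride):
    """Spatial size after a SAME-padded layer with the given stride."""
    return math.ceil(size / stride)


def flattened_size(image_size, conv_filters):
    """Length of the flattened feature vector after conv+pool stages."""
    size = image_size
    for _ in conv_filters:
        size = same_output_size(size, 1)  # 5x5 convolution, stride 1: size unchanged
        size = same_output_size(size, 2)  # 2x2 max pooling, stride 2: size halved
    return size * size * conv_filters[-1]


print(flattened_size(28, (8, 16)))   # filter counts listed above
print(flattened_size(28, (16, 32)))  # filter counts used in the listing below
```

Under these assumptions the input goes 28 -> 14 -> 7, so the fully connected layer sees 7*7 values per filter of the last convolution.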

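The paper's final error is 0.77%

I did not use momentum and stuck with plain stochastic gradient descent. The code below also uses 16 and 32 filters and an L2 rate of 0.0001, rather than the values listed above.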
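For readers unfamiliar with the difference, here is a toy sketch of the two update rules on f(w) = w**2. The accumulator form of momentum shown here is one common formulation and is an assumption of this illustration; the function names are hypothetical and nothing here is taken from the paper or from the listing below.

```python
def sgd_step(w, grad, lr):
    """Plain gradient descent: step against the current gradient only."""
    return w - lr * grad


def momentum_step(w, accum, grad, lr, momentum=0.9):
    """Momentum: step against a decaying sum of past gradients."""
    accum = momentum * accum + grad
    return w - lr * accum, accum


# Minimise f(w) = w**2, whose gradient is 2*w, for two steps with each rule.
w_sgd = w_mom = 1.0
accum = 0.0
for _ in range(2):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd, 0.1)
    w_mom, accum = momentum_step(w_mom, accum, 2 * w_mom, 0.1)
print("plain SGD:", w_sgd)
print("momentum: ", w_mom)
```

After the first step both rules agree; from the second step on, momentum moves further because the earlier gradient is still in the accumulator.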

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
batch = 100
image_size = 28
conv1_size = 5
conv1_deep = 16
conv2_size = 5
conv2_deep = 32
fc_node = 128
output_node = 10
regularization_rate = 0.0001
epoch = 40000


def trains(mnist):
    init = tf.truncated_normal_initializer(stddev=0.01)
    initb = tf.constant_initializer(0.0)
    x = tf.placeholder(tf.float32, [batch, image_size, image_size, 1], name='x-input')
    y_ = tf.placeholder(tf.float32, [None, output_node], name='y-input')
    # first convolutional layer
    weight1 = tf.get_variable("weight1", [conv1_size, conv1_size, 1, conv1_deep], initializer=init)
    bias1 = tf.get_variable("bias1", [conv1_deep], initializer=initb)
    conv1 = tf.nn.bias_add(tf.nn.conv2d(x, weight1, strides=[1, 1, 1, 1], padding="SAME"), bias1)
    conv1 = tf.nn.relu(conv1)
    pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
    # second convolutional layer
    weight2 = tf.get_variable("weight2", [conv2_size, conv2_size, conv1_deep, conv2_deep], initializer=init)
    bias2 = tf.get_variable('bias2', [conv2_deep], initializer=initb)
    conv2 = tf.nn.bias_add(tf.nn.conv2d(pool1, weight2, strides=[1, 1, 1, 1], padding="SAME"), bias2)
    conv2 = tf.nn.relu(conv2)
    pool2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
    # flatten the pooled feature maps into a vector
    P_shape = pool2.get_shape().as_list()
    node = P_shape[1] * P_shape[2] * P_shape[3]
    fc_input1 = tf.reshape(pool2, [P_shape[0], node])

    # fully connected layer
    weight3 = tf.get_variable("weight3", [node, fc_node], initializer=init)
    bias3 = tf.get_variable("bias3", [fc_node], initializer=initb)
    fc_output1 = tf.nn.relu(tf.matmul(fc_input1, weight3) + bias3)

    # output layer
    weight4 = tf.get_variable("weight4", [fc_node, output_node], initializer=init)
    bias4 = tf.get_variable("bias4", [output_node], initializer=initb)
    y=tf.matmul(fc_output1, weight4) + bias4

    regularizer = tf.contrib.layers.l2_regularizer(regularization_rate)
    loss = regularizer(weight3)+regularizer(weight4)
    global_step = tf.Variable(0, trainable=False)
    losses = loss + tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1)))
    learning_rate = tf.compat.v1.train.exponential_decay(0.1, global_step, mnist.train.num_examples / batch, 0.95)

    train_step = tf.compat.v1.train.GradientDescentOptimizer(learning_rate=learning_rate)\
        .minimize(losses,global_step=global_step)
    sum = tf.reduce_sum(tf.cast(tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)), dtype=tf.float32))
    with tf.compat.v1.Session() as sess:
        tf.global_variables_initializer().run()
        for i in range(epoch+1):
            if i % 1000 == 0:
                SUM=0
                for j in range(mnist.validation.num_examples//batch):
                    xs, ys = mnist.validation.next_batch(batch)
                    xs = np.reshape(xs, newshape=[batch, image_size, image_size, 1])
                    SUM+= sess.run(sum, feed_dict={x:xs,y_:ys})
                print("After %d training steps, validation-set accuracy: %g%%" % (i, SUM/mnist.validation.num_examples * 100))
                print("Current learning rate: %g" % (sess.run(learning_rate)))

            if i % 5000 == 0:
                SUM=0
                for j in range(mnist.test.num_examples//batch):
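                    xs, ys = mnist.test.next_batch(batch)  # the listing as posted called mnist.train.next_batch here, so its "test" figures came from training batches
                    xs = np.reshape(xs, newshape=[batch, image_size, image_size, 1])
                    SUM += sess.run(sum, feed_dict={x: xs, y_: ys})
                print("After %d training steps, test-set accuracy: %g%%" % (i, SUM/mnist.test.num_examples * 100,))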

            xs, ys = mnist.train.next_batch(batch)
            xs = np.reshape(xs, newshape=[batch, image_size, image_size, 1])
            sess.run(train_step, feed_dict={x: xs, y_: ys})

def main(argv=None):
    mnist = input_data.read_data_sets("C:/Users/tang/Desktop/deeplearning/mnist數據集", one_hot=True)
    trains(mnist)


if __name__ == '__main__':
    tf.app.run()

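The output is as follows:

After 0 training steps, validation-set accuracy: 7.72%
Current learning rate: 0.1
After 0 training steps, test-set accuracy: 8.47%
After 1000 training steps, validation-set accuracy: 89.78%
Current learning rate: 0.0910956
After 2000 training steps, validation-set accuracy: 98.16%
Current learning rate: 0.0829841
After 3000 training steps, validation-set accuracy: 98.48%
Current learning rate: 0.0755949
After 4000 training steps, validation-set accuracy: 98.72%
Current learning rate: 0.0688636
After 5000 training steps, validation-set accuracy: 98.86%
Current learning rate: 0.0627317
After 5000 training steps, test-set accuracy: 98.86%
After 6000 training steps, validation-set accuracy: 98.9%
Current learning rate: 0.0571459
After 7000 training steps, validation-set accuracy: 99.02%
Current learning rate: 0.0520574
After 8000 training steps, validation-set accuracy: 98.7%
Current learning rate: 0.047422
After 9000 training steps, validation-set accuracy: 98.96%
Current learning rate: 0.0431993
After 10000 training steps, validation-set accuracy: 99%
Current learning rate: 0.0393527
After 10000 training steps, test-set accuracy: 98.82%
After 11000 training steps, validation-set accuracy: 99.06%
Current learning rate: 0.0358486
After 12000 training steps, validation-set accuracy: 99.06%
Current learning rate: 0.0326565
After 13000 training steps, validation-set accuracy: 99.14%
Current learning rate: 0.0297486
After 14000 training steps, validation-set accuracy: 99.16%
Current learning rate: 0.0270997
After 15000 training steps, validation-set accuracy: 99.12%
Current learning rate: 0.0246866
After 15000 training steps, test-set accuracy: 99.03%

Possibly because of poor tuning, the error is 0.97%, which is not as good as the paper's. The run also took too long. One caveat: in the run logged above the "test" batches were drawn with mnist.train.next_batch, so this figure was measured on training data and the true test error may differ.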
