Speech Recognition Study Log 2018-7-26 [Implementing a CNN in TensorFlow]

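I have read enough about CNNs for now, so let's go straight to the code. I won't write up an introduction to CNNs here; refer to this blog post, which explains them very clearly. One point worth making explicit is that a convolution kernel is four-dimensional: [height, width, number of input channels, number of filters], which is the [filter_height, filter_width, in_channels, out_channels] order that tf.nn.conv2d expects. The well-known figure below helps in understanding this: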

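To make the four-dimensional kernel shape concrete, here is a minimal sketch in plain Python (no TensorFlow needed) that counts the trainable parameters implied by a kernel shape. The helper name `conv_param_count` is hypothetical and exists only for this illustration; the two shapes are the ones used in the code later in this post.

```python
def conv_param_count(kernel_shape):
    """Return (weights, biases) for a kernel of shape
    [filter_height, filter_width, in_channels, out_channels].

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    filter_height, filter_width, in_channels, out_channels = kernel_shape
    # Every filter spans the full window and all input channels.
    weights = filter_height * filter_width * in_channels * out_channels
    # One bias per filter, i.e. per output channel.
    biases = out_channels
    return weights, biases


print(conv_param_count([5, 5, 1, 32]))   # (800, 32)
print(conv_param_count([5, 5, 32, 64]))  # (51200, 64)
```

The last dimension (the number of filters) becomes the channel count of the layer's output, which is why the second kernel's third dimension has to be 32.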
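Below is the code. It again uses the MNIST dataset; see my earlier posts for how to download the dataset and what it contains, which I won't keep repeating. The network has two convolutional layers and one fully connected layer, followed by the output layer. My other exercises are at https://github.com/IMLHF/tensorflow_practice.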
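Before the full listing, a quick sanity check of the layer sizes may help. The sketch below, in plain Python, traces the height/width of a 28x28 MNIST image through two rounds of "5x5 convolution with stride 1, then 2x2 max pooling with stride 2", using the standard output-size rules for 'SAME' and 'VALID' padding. The function names are hypothetical and chosen only for this example.

```python
import math


def conv_output_size(input_size, kernel_size, stride, padding):
    """Output size along one spatial axis for a conv or pooling window.

    Hypothetical helper for illustration only.
    """
    if padding == "SAME":
        # The border is padded, so only the stride shrinks the output.
        return math.ceil(input_size / stride)
    if padding == "VALID":
        # No padding: only windows lying entirely inside the input count.
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def trace_shapes(size=28):
    """Follow the side length of a square image through two conv+pool stages."""
    sizes = [size]
    for _ in range(2):
        size = conv_output_size(size, 5, 1, "SAME")  # 5x5 conv, stride 1
        size = conv_output_size(size, 2, 2, "SAME")  # 2x2 max pool, stride 2
        sizes.append(size)
    return sizes


sizes = trace_shapes()
print(sizes)                        # [28, 14, 7]
print(sizes[-1] * sizes[-1] * 64)   # 3136, i.e. 7 * 7 * 64
```

This is where the `7 * 7 * 64` input size of the fully connected layer comes from: two poolings halve 28 to 14 and then to 7, and the second convolution outputs 64 channels. With 'VALID' padding the first convolution alone would already shrink 28 to 24.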

# -*- coding: utf-8 -*-
# GitHub: https://github.com/IMLHF/tensorflow_practice
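# Written for TensorFlow 1.x (uses tf.placeholder, tf.Session and the bundled MNIST tutorial loader)
from tensorflow.examples.tutorials.mnist import input_data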
import tensorflow as tf

__mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

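__learning_rate = 0.0001
__training_epochs = 600  # number of training steps (one mini-batch each), not full passes over the data
__batch_size = 100  # size of each training mini-batch
__display_step = 10  # print progress every __display_step steps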

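'''
tf.nn.conv2d: 2-D convolution
  input    input images, 4-D, shape [batch, in_height, in_width, in_channels]
  filter   convolution kernel, 4-D, shape [filter_height, filter_width, in_channels, out_channels]
  strides  list giving the stride along each dimension of the input
  padding  'SAME' or 'VALID'. 'SAME' zero-pads the border, so with stride 1 the output
           keeps the input's height and width; 'VALID' adds no padding and only uses
           window positions that lie entirely inside the input
Returns a Tensor
'''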
def conv2d(__x_input, __conv_kernel):
  return tf.nn.conv2d(__x_input, __conv_kernel, strides=[1, 1, 1, 1], padding='SAME')


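'''
tf.nn.max_pool: max pooling
  value    input, 4-D, shape [batch_num, height, width, channels]
  ksize    pooling window size, list [batch, height, width, channels]
  strides  pooling window stride, list [batch, height, width, channels]
           Pooling is normally not applied across the batch or the channels,
           so the batch and channels entries of ksize and strides are both 1
  padding  'SAME' or 'VALID'. 'SAME' pads the border so that windows at the edge are
           still pooled; 'VALID' adds no padding and only uses windows that lie
           entirely inside the input
Returns a Tensor
'''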
def max_pooling_2x2(__max_pooling_target):
  return tf.nn.max_pool(__max_pooling_target, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')



# Define the weights and biases
__weights = {
    # Initial values are drawn from a truncated normal distribution
    # [filter_height, filter_width, in_channels, out_channels]
    # 5x5x1x32: the kernel's receptive field is 5x5, the input image has 1 channel, and this layer has 32 filters
    '__w_conv1': tf.Variable(tf.truncated_normal(shape=[5, 5, 1, 32], stddev=0.1)),
    # [filter_height, filter_width, in_channels, out_channels]
    # 5x5x32x64: again a 5x5 receptive field; the previous layer used 32 filters,
    # so its output has 32 channels, and this layer has 64 filters
    '__w_conv2': tf.Variable(tf.truncated_normal(shape=[5, 5, 32, 64], stddev=0.1)),
    '__w_fc1': tf.Variable(tf.truncated_normal(shape=[7 * 7 * 64, 1024], stddev=0.1)),
    '__w_fc2': tf.Variable(tf.truncated_normal(shape=[1024, 10], stddev=0.1)),
}
__biases={
    '__b_conv1': tf.Variable(tf.constant(0.1, shape=[32])),
    '__b_conv2': tf.Variable(tf.constant(0.1, shape=[64])),
    '__b_fc1': tf.Variable(tf.constant(0.1, shape=[1024])),
    '__b_fc2': tf.Variable(tf.constant(0.1, shape=[10]))
}

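# region Build the CNN
def conv_net(__x_t, __keep_probability_t):
  # In the shape, -1 means the size of that dimension is inferred (here, the batch size)
  __x_image = tf.reshape(__x_t, [-1, 28, 28, 1])
  # conv1 + ReLU: [-1, 28, 28, 1] -> [-1, 28, 28, 32]
  __h_conv1 = tf.nn.relu(conv2d(__x_image, __weights['__w_conv1']) + __biases['__b_conv1'])
  # 2x2 max pooling: -> [-1, 14, 14, 32]
  __h_pool1 = max_pooling_2x2(__h_conv1)
  # conv2 + ReLU: -> [-1, 14, 14, 64]
  __h_conv2 = tf.nn.relu(conv2d(__h_pool1, __weights['__w_conv2']) + __biases['__b_conv2'])
  # 2x2 max pooling: -> [-1, 7, 7, 64]
  __h_pool2 = max_pooling_2x2(__h_conv2)
  # Flatten to [-1, 7 * 7 * 64] for the fully connected layer
  __h_pool2_flat = tf.reshape(__h_pool2, [-1, 7 * 7 * 64])
  __h_fc1 = tf.nn.relu(tf.matmul(__h_pool2_flat, __weights['__w_fc1']) + __biases['__b_fc1'])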

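  # tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)
  # Reduces overfitting by randomly zeroing part of the input during training
  #   x          input tensor
  #   keep_prob  probability that each element is kept; kept elements are scaled by 1 / keep_prob
  # Returns a Tensor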
  __h_fc1_drop = tf.nn.dropout(__h_fc1, __keep_probability_t)

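  # tf.nn.softmax(logits, dim=-1, name=None): softmax function
  #   logits  unnormalized scores, typically the output of the last linear layer
  #   dim     the dimension softmax is applied over; defaults to -1, the last dimension
  # Returns a Tensor
  # Note: conv_net itself returns the raw logits; softmax is applied by the caller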
  __y_conv_logits = tf.matmul(__h_fc1_drop, __weights['__w_fc2']) + __biases['__b_fc2']
  return __y_conv_logits
#endregion


__X_input = tf.placeholder(tf.float32, [None, 784])
__Y_true = tf.placeholder(tf.float32, [None, 10])
__keep_probability = tf.placeholder(tf.float32)

__logits=conv_net(__X_input,__keep_probability)
__out_softmax=tf.nn.softmax(__logits)
# __loss_cross_entropy = tf.reduce_mean(
#     -tf.reduce_sum(__Y_true * tf.log(__out_softmax), axis=1))
__loss_cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=__logits, labels=__Y_true))

__train_op = tf.train.AdamOptimizer(__learning_rate).minimize(__loss_cross_entropy)

__accuracy = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(__out_softmax, 1), tf.argmax(__Y_true, 1)), tf.float32))

init=tf.global_variables_initializer()

with tf.Session() as __session_t:
  __session_t.run(init)
  for i in range(__training_epochs):
    # Datasets.train.next_batch fetches the next mini-batch of samples
    # Returns [images, labels]
    __x_batch,__y_batch = __mnist.train.next_batch(__batch_size)

    if i % __display_step == 0:
      __train_accuracy = __session_t.run(
          __accuracy,
          feed_dict={__X_input: __x_batch,
                     __Y_true: __y_batch,
                     __keep_probability: 1.0})
      print("step %d, training accuracy %g" % (i, __train_accuracy))

    __train_op.run(feed_dict={__X_input: __x_batch, __Y_true: __y_batch, __keep_probability: 0.5})

  print("test accuracy %g" % __session_t.run(
      __accuracy,
      feed_dict={__X_input: __mnist.test.images,
                 __Y_true: __mnist.test.labels,
                 __keep_probability: 1.0}))
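The script computes its loss with `tf.nn.softmax_cross_entropy_with_logits` on the raw logits and leaves the hand-written `-sum(y * log(softmax))` version commented out. One commonly cited reason to prefer working from the logits is numerical stability. The plain-Python sketch below illustrates the idea with the log-sum-exp trick; it is an illustration under that assumption, not TensorFlow's actual implementation, and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Softmax with the usual max-shift so exp() does not overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def naive_cross_entropy(logits, one_hot):
    """Direct translation of -sum(y * log(softmax(z)))."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs))


def stable_cross_entropy(logits, one_hot):
    """Same loss via log softmax(z_i) = z_i - logsumexp(z)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(y * (z - log_sum_exp) for y, z in zip(one_hot, logits))


# On ordinary logits the two versions agree.
print(naive_cross_entropy([2.0, 1.0, 0.1], [1.0, 0.0, 0.0]))
print(stable_cross_entropy([2.0, 1.0, 0.1], [1.0, 0.0, 0.0]))

# With extreme logits the true class's probability underflows to 0.0.
try:
    print(naive_cross_entropy([1000.0, 0.0], [0.0, 1.0]))
except ValueError as err:
    # Python raises on log(0.0); with float tensors the result would be inf or NaN.
    print("naive version failed:", err)
print(stable_cross_entropy([1000.0, 0.0], [0.0, 1.0]))  # 1000.0
```

The stable version never takes the logarithm of a probability, so a confidently wrong prediction yields a large but finite loss instead of breaking training.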

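The training loop feeds a keep probability of 0.5 when running the training op and 1.0 when measuring accuracy. A minimal plain-Python sketch of what dropout does with that value is given below, assuming the "inverted dropout" behaviour in which kept elements are scaled by 1 / keep_prob. The function name and the fixed seed are hypothetical choices for this illustration.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each element with probability keep_prob, scaled by 1 / keep_prob;
    every other element becomes 0.0.

    Hypothetical helper for illustration only.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so repeated runs give the same result
activations = [1.0] * 10000

dropped = dropout(activations, 0.5, rng)
# Roughly half the entries are 0.0 and the rest are 2.0,
# so the mean stays close to the original 1.0.
print(sum(dropped) / len(dropped))

# keep_prob = 1.0 disables dropout, which is what evaluation needs.
print(dropout([1.0, 2.0, 3.0], 1.0, rng))  # [1.0, 2.0, 3.0]
```

Because the surviving activations are scaled up during training, the expected input to the next layer is the same with and without dropout, so no extra rescaling is needed when the keep probability is set to 1.0 for testing.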