python3__Deep Learning__Convolutional Neural Networks (CNN): VGGNet / Finetuning

Contents

1. Convolutional Layer Implementation

2. Fully Connected Layer Implementation

3. Convolutional Group Implementation

4. Fully Connected Group Implementation

5. Complete Code

6. Model Parameter Reuse and Model Saving

7. Model Finetuning Reuse

7.1 Basic Concepts

7.2 Cats vs. Dogs (Finetuning with VGG)

VGG16_model.py

create_and_read_TFRecord2.py

main.py


VGGNet is a convolutional neural network presented at ICLR 2015 (International Conference on Learning Representations, 2015). It achieved very high recognition accuracy on ImageNet, works well as the basis of DCNN (Deep Convolutional Network) engineering projects, and is widely reused afterwards via Fine-tuning. VGGNet extends AlexNet with more hidden layers (typically 16 or 19 weight layers), and its parameter count is roughly 3 times that of AlexNet.

VGGNet consists of 5 convolutional groups followed by 3 fully connected layers, with the groups separated by max-pooling (for VGG16: conv1 2×64 → conv2 2×128 → conv3 3×256 → conv4 3×512 → conv5 3×512 → fc6 4096 → fc7 4096 → fc8 n_class). The concrete implementation starts in Section 1 (object-oriented style, written in Python).

1. Convolutional Layer Implementation

A convolutional layer here consists of the convolution itself plus an activation function. The layer name is used to define a variable scope, which namespaces the variables per convolutional layer: under these naming rules, variables created in different scopes are automatically prefixed with different names.

tf.variable_scope(name): defines a variable scope

Both name_scope and variable_scope add a prefix to the variables created inside them, but tf.name_scope has no effect on variables created with tf.get_variable. Used together with tf.get_variable, tf.variable_scope implements TensorFlow's "variable sharing" mechanism: whether a previously created variable is reused depends on the scope's reuse flag. If reuse is True the existing variable is returned; otherwise a new variable is created, and an error is raised if it already exists.
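A minimal sketch of this sharing mechanism (the scope and variable names here are illustrative, not part of the model below):

import tensorflow as tf

with tf.variable_scope("fc_demo"):
    w = tf.get_variable("weights", [784, 256], dtype=tf.float32)
    # w.name == "fc_demo/weights:0" -- the scope name becomes the prefix

# reuse=True returns the existing variable instead of creating a new one;
# requesting "weights" again without reuse=True would raise an error
with tf.variable_scope("fc_demo", reuse=True):
    w_shared = tf.get_variable("weights")

print(w is w_shared)  # True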

    def conv(self, name, input_data, out_channel):
        """
        定義卷積組
        :param name:
        :param input_data:
        :param out_channel:
        :return:
        """
        in_channel = input_data.get_shape()[-1]

        # define the variable scope for this layer
        with tf.variable_scope(name):
            kernel = tf.get_variable("weights", [3, 3, in_channel, out_channel], dtype=tf.float32, trainable=False)
            biases = tf.get_variable("biases", [out_channel], dtype=tf.float32, trainable=False)
            conv_res = tf.nn.conv2d(input_data, kernel, [1, 1, 1, 1], padding="SAME")
            res = tf.nn.bias_add(conv_res, biases)
            out = tf.nn.relu(res, name=name)
        return out
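
Note that this is the minimal version of the layer: the complete code in Section 5 additionally appends [kernel, biases] to self.parameters, and load_weight later relies on the order of that list to restore the pretrained weights.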

2. Fully Connected Layer Implementation

The fully connected layer is defined much like an ordinary convolutional network layer. Note that the fully connected layer first flattens its input, so the first step is to obtain the total flattened size, i.e., to determine whether the input still has the convolutional shape [-1, width, height, dim] or is already a flat array.

    def fc(self, name, input_data, out_channel, trainable=True):
        """
        定義全連接組(展開圖像數據)
        :param name:
        :param input_data:
        :param out_channel:
        :return:
        """
        shape = input_data.get_shape().as_list()
        # get the flattened input size
        if len(shape) == 4:
            size = shape[-1]*shape[-2]*shape[-3]
        else:
            size = shape[1]
        input_data_flat = tf.reshape(input_data, [-1, size])

        # define the variable scope for this layer
        with tf.variable_scope(name):
            weights = tf.get_variable("weights", [size, out_channel], dtype=tf.float32, trainable=trainable)
            biases = tf.get_variable("biases", [out_channel], dtype=tf.float32, trainable=trainable)
            res = tf.nn.bias_add(tf.matmul(input_data_flat, weights), biases)
            out = tf.nn.relu(res, name=name)
        return out
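
For example, with the 224×224×3 inputs used later for finetuning, pool5 has shape [None, 7, 7, 512], so fc6 flattens size = 7*7*512 = 25088 values per image; with the 28×28 MNIST input of Section 5, pool5 is [None, 1, 1, 512] and size = 512.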

3. Convolutional Group Implementation

The convolutional part consists of 5 groups. Group 1: 2 convolutions + 1 pooling; group 2: 2 convolutions + 1 pooling; group 3: 3 convolutions + 1 pooling; group 4: 3 convolutions + 1 pooling; group 5: 3 convolutions + 1 pooling.

    def convlayers(self):
        """
        定義卷積模型
        :return:
        """
        # conv1
        self.conv1_1 = self.conv("conv1_1", self.imgs, 64)
        self.conv1_2 = self.conv("conv1_2", self.conv1_1, 64)
        self.pool1 = self.maxpool("pool1", self.conv1_2)

        # conv2
        self.conv2_1 = self.conv("conv2_1", self.pool1, 128)
        self.conv2_2 = self.conv("conv2_2", self.conv2_1, 128)
        self.pool2 = self.maxpool("pool2", self.conv2_2)

        # conv3
        self.conv3_1 = self.conv("conv3_1", self.pool2, 256)
        self.conv3_2 = self.conv("conv3_2", self.conv3_1, 256)
        self.conv3_3 = self.conv("conv3_3", self.conv3_2, 256)
        self.pool3 = self.maxpool("pool3", self.conv3_3)

        # conv4
        self.conv4_1 = self.conv("conv4_1", self.pool3, 512)
        self.conv4_2 = self.conv("conv4_2", self.conv4_1, 512)
        self.conv4_3 = self.conv("conv4_3", self.conv4_2, 512)
        self.pool4 = self.maxpool("pool4", self.conv4_3)

        # conv5
        self.conv5_1 = self.conv("conv5_1", self.pool4, 512)
        self.conv5_2 = self.conv("conv5_2", self.conv5_1, 512)
        self.conv5_3 = self.conv("conv5_3", self.conv5_2, 512)
        self.pool5 = self.maxpool("pool5", self.conv5_3)
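
Every maxpool here uses a 2×2 window with stride 2, so each group halves the spatial resolution: a 224×224 input shrinks to 112→56→28→14→7 across the five pools, while the channel count grows 64→128→256→512→512.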

4. Fully Connected Group Implementation

The fully connected part consists of 3 fully connected layers. n_class is the final number of classes (10 in the MNIST example of Section 5, 2 for cats vs. dogs in Section 7).

    def fclayers(self):
        """
        定義全連接模型
        :return:
        """
        self.fc6 = self.fc("fc6", self.pool5, 4096, trainable=False)
        self.fc7 = self.fc("fc7", self.fc6, 4096, trainable=False)
        self.fc8 = self.fc("fc8", self.fc7, n_class)

5. Complete Code

import tensorflow as tf
import time
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
n_class = 10


class Vgg16(object):
    def __init__(self, imgs):
        self.parameters = []
        self.imgs = imgs
        self.convlayers()
        self.fclayers()
        self.probs = tf.nn.softmax(self.fc8)

    def saver(self):
        """
        定義模型存儲器
        :return:
        """
        return tf.train.Saver()

    def variable_summaries(self, var, name):
        """
        生成變量監控信息並定義生成監控信息日誌的操作
        :param var: 輸入變量
        :param name: 變量名稱
        :return:
        """
        with tf.name_scope('summaries'):
            tf.summary.histogram(name, var)
            mean = tf.reduce_mean(var)
            tf.summary.scalar('mean/' + name, mean)
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
            tf.summary.scalar('stddev/' + name, stddev)

    def conv(self, name, input_data, out_channel):
        """
        定義卷積組
        :param name:
        :param input_data:
        :param out_channel:
        :return:
        """
        in_channel = input_data.get_shape()[-1]

        # define the variable scope for this layer
        with tf.variable_scope(name):
            kernel = tf.get_variable("weights", [3, 3, in_channel, out_channel], dtype=tf.float32, trainable=False)
            biases = tf.get_variable("biases", [out_channel], dtype=tf.float32, trainable=False)
            self.variable_summaries(biases, "biases")  # added vs. Section 1: record summaries
            conv_res = tf.nn.conv2d(input_data, kernel, [1, 1, 1, 1], padding="SAME")
            res = tf.nn.bias_add(conv_res, biases)
            out = tf.nn.relu(res, name=name)
            self.parameters += [kernel, biases]
        return out

    def fc(self, name, input_data, out_channel, trainable=True):
        """
        定義全連接組(展開圖像數據)
        :param name:
        :param input_data:
        :param out_channel:
        :return:
        """
        shape = input_data.get_shape().as_list()
        # get the flattened input size
        if len(shape) == 4:
            size = shape[-1]*shape[-2]*shape[-3]
        else:
            size = shape[1]
        input_data_flat = tf.reshape(input_data, [-1, size])

        # define the variable scope for this layer
        with tf.variable_scope(name):
            weights = tf.get_variable("weights", [size, out_channel], dtype=tf.float32, trainable=trainable)
            self.variable_summaries(weights, "weights")  # added vs. Section 1: record summaries
            biases = tf.get_variable("biases", [out_channel], dtype=tf.float32, trainable=trainable)
            self.variable_summaries(biases, "biases")  # added vs. Section 1: record summaries
            res = tf.nn.bias_add(tf.matmul(input_data_flat, weights), biases)
            out = tf.nn.relu(res, name=name)
            self.parameters += [weights, biases]
        return out

    def maxpool(self, name, input_data):
        """
        定義池化層
        :param name:
        :param input_data:
        :return:
        """
        with tf.variable_scope(name):
            out = tf.nn.max_pool(input_data, [1, 2, 2, 1], [1, 2, 2, 1], padding="SAME", name=name)
        return out

    def convlayers(self):
        """
        定義卷積模型
        :return:
        """
        # conv1
        self.conv1_1 = self.conv("conv1_1", self.imgs, 64)
        self.conv1_2 = self.conv("conv1_2", self.conv1_1, 64)
        self.pool1 = self.maxpool("pool1", self.conv1_2)

        # conv2
        self.conv2_1 = self.conv("conv2_1", self.pool1, 128)
        self.conv2_2 = self.conv("conv2_2", self.conv2_1, 128)
        self.pool2 = self.maxpool("pool2", self.conv2_2)

        # conv3
        self.conv3_1 = self.conv("conv3_1", self.pool2, 256)
        self.conv3_2 = self.conv("conv3_2", self.conv3_1, 256)
        self.conv3_3 = self.conv("conv3_3", self.conv3_2, 256)
        self.pool3 = self.maxpool("pool3", self.conv3_3)

        # conv4
        self.conv4_1 = self.conv("conv4_1", self.pool3, 512)
        self.conv4_2 = self.conv("conv4_2", self.conv4_1, 512)
        self.conv4_3 = self.conv("conv4_3", self.conv4_2, 512)
        self.pool4 = self.maxpool("pool4", self.conv4_3)

        # conv5
        self.conv5_1 = self.conv("conv5_1", self.pool4, 512)
        self.conv5_2 = self.conv("conv5_2", self.conv5_1, 512)
        self.conv5_3 = self.conv("conv5_3", self.conv5_2, 512)
        self.pool5 = self.maxpool("pool5", self.conv5_3)

    def fclayers(self):
        """
        定義全連接模型
        :return:
        """
        self.fc6 = self.fc("fc6", self.pool5, 4096, trainable=False)
        self.fc7 = self.fc("fc7", self.fc6, 4096, trainable=False)
        self.fc8 = self.fc("fc8", self.fc7, n_class)

    def load_weight(self, weight_file, sess):
        weights = np.load(weight_file)
        keys = sorted(weights.keys())
        for i, k in enumerate(keys):
            sess.run(self.parameters[i].assign(weights[k]))
        print("--------------------all done--------------------")


if "__main__" == __name__:
    mnist = input_data.read_data_sets("./mnist", one_hot=True)

    x = tf.placeholder(tf.float32, [None, 784])
    y_ = tf.placeholder(tf.float32, [None, 10])


    imgs = tf.reshape(x, [-1, 28, 28, 1])
    vgg = Vgg16(imgs)
    y = vgg.probs

    with tf.name_scope("cross_entropy"):
        cross_entropy = -tf.reduce_sum(y_*tf.log(y))
    with tf.name_scope("train_step"):
        train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)

    # model evaluation
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    with tf.name_scope("accuracy"):
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    # model saving
    saver = vgg.saver()
    merged = tf.summary.merge_all()

    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)

        # write the graph and summaries for TensorBoard
        writer = tf.summary.FileWriter("./log", sess.graph)

        iters = 5
        for i in range(iters):
            batch_xs, batch_ys = mnist.train.next_batch(200)

            acc = sess.run(accuracy, feed_dict={x: batch_xs, y_: batch_ys})
            print(acc)

            summary, _ = sess.run([merged, train_step], feed_dict={x: batch_xs, y_: batch_ys})
            # sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
            writer.add_summary(summary, i)

        saver.save(sess, "./train/VGG16.ckpt")

6. Model Parameter Reuse and Model Saving

Weights download link: www.cs.toronto.edu/~frossard/vgg16/vgg16_weights.npz

Class names file download link: www.cs.toronto.edu/~frossard/vgg16/imagenet_classes.py

Model file: the code above is saved as the VGG16_model.py script.
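
Before reusing the weights it can help to inspect the key layout of the downloaded file, since load_weight assigns them to self.parameters in sorted-key order. A quick check (assuming vgg16_weights.npz sits in the working directory):

import numpy as np

weights = np.load("vgg16_weights.npz")
print(sorted(weights.keys()))
# ['conv1_1_W', 'conv1_1_b', 'conv1_2_W', ..., 'fc7_W', 'fc7_b', 'fc8_W', 'fc8_b']
print(weights["conv1_1_W"].shape)  # (3, 3, 3, 64)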

# reusing the pretrained weights
import tensorflow as tf
import numpy as np
from scipy.misc import imread, imresize
from imagenet_classes import class_names
from VGG16_model import Vgg16
n_class = 1000




if "__main__" == __name__:

    # placeholder for the input images
    imgs = tf.placeholder(tf.float32, [None, 224, 224, 3])

    vgg = Vgg16(imgs)
    prob = vgg.probs
    
    # ===== save the model
    saver = vgg.saver()
    with tf.Session() as sess:

        # load the pretrained vgg16 weight file
        vgg.load_weight("./vgg16_weights.npz", sess)
        saver.save(sess, "./model/vgg.ckpt")
    
    # ===== restore the model
    saver = vgg.saver()
    with tf.Session() as sess: 
        saver.restore(sess, "./model/vgg.ckpt")
        
        # load an image file
        img1 = imread("001.jpg", mode="RGB")
        img1 = imresize(img1, (224, 224))
    
        # predict
        prob = sess.run(vgg.probs, feed_dict={vgg.imgs: [img1]})[0]
    
        # pair class indices with probabilities and keep the 5 most likely
        preds = (np.argsort(prob)[::-1])[0:5]
    
        for p in preds:
            # print the class name and its probability
            print(class_names[p], prob[p])

7. Model Finetuning Reuse

7.1 Basic Concepts

In practice we rarely want to use VGGNet as-is, classifying into (or approximating) the 1000 categories its pretrained parameters cover; instead the model usually needs to specialize in one particular classification task. This is an important piece of work, and it requires Finetuning the model for reuse. (The same holds for AlexNet, VGGNet, ResNet.)

Finetuning means slightly adjusting an already trained model: it amounts to reusing the first few layers of someone else's model to extract low-level features, so that the network discriminates better on the training set actually being targeted. For example, Vgg16 can classify 1000 kinds of objects, but in practice far fewer are needed; one can pick the target objects, finetune the existing network on them, and retrain it so that it specializes.

The main reasons for finetuning an existing network rather than training from scratch are the difficulty of collecting images for a particular class, the cost of training a model, and the difficulty of tuning parameters. Finetuning saves training cost effectively, and when the dataset is not very large it is usually the better choice. The table below summarizes when it applies:

Dataset size | Similarity to the old dataset | Suitable finetuning strategy
Small | Similar | Overfitting risk; train only a classifier on high-level features
Large | Similar | Can finetune the whole network
Small | Dissimilar | Do not finetune the whole network; do not rely on high-level features (use lower-level ones)
Large | Dissimilar | Can retrain from scratch, or finetune the whole network

7.2 Cats vs. Dogs (Finetuning with VGG)

1. Change the output of the final fully connected layer, i.e., the number of classes.

2. Set trainable=False when defining the weights and biases of the layers to keep (this ensures the already trained weights are not modified during training).

3. Modify the weight loading function: the "*.npz" weight file stores the weights as a key-value dictionary. Python can read the entries by their sorted index, so unwanted indices can be skipped and the model loaded selectively. Since the final fully connected layer has been changed, its "weights" and "biases" are not loaded.

4. Define the "data path loading function" get_file(file_dir).

5. Define the "image reading function" get_batch(image_list, label_list, img_width, img_height, batch_size, capacity), which reads images in batches of the given size.

6. One-hot encode the labels (the onehot function below).

7. Retrain and save the model.

8. Reuse the model.

VGG16_model.py

# training the model
import tensorflow as tf
import time
import numpy as np
import os
import create_and_read_TFRecord2 as reader2
n_class = 2


# trainable parameter: when finetuning retrains the model, layers that should not be trained
# can be given trainable=False so their weights are not modified by the training
class Vgg16(object):
    def __init__(self, imgs):
        self.parameters = []
        self.imgs = imgs
        self.convlayers()
        self.fclayers()
        self.probs = tf.nn.softmax(self.fc8)

    def saver(self):
        """
        定義模型存儲器
        :return:
        """
        return tf.train.Saver()

    def variable_summaries(self, var, name):
        """
        生成變量監控信息並定義生成監控信息日誌的操作
        :param var: 輸入變量
        :param name: 變量名稱
        :return:
        """
        with tf.name_scope('summaries'):
            tf.summary.histogram(name, var)
            mean = tf.reduce_mean(var)
            tf.summary.scalar('mean/' + name, mean)
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
            tf.summary.scalar('stddev/' + name, stddev)

    def conv(self, name, input_data, out_channel):
        """
        定義卷積組
        :param name:
        :param input_data:
        :param out_channel:
        :return:
        """
        in_channel = input_data.get_shape()[-1]

        # define the variable scope for this layer
        with tf.variable_scope(name):
            kernel = tf.get_variable("weights", [3, 3, in_channel, out_channel], dtype=tf.float32, trainable=False)
            biases = tf.get_variable("biases", [out_channel], dtype=tf.float32, trainable=False)
            conv_res = tf.nn.conv2d(input_data, kernel, [1, 1, 1, 1], padding="SAME")
            res = tf.nn.bias_add(conv_res, biases)
            out = tf.nn.relu(res, name=name)
            self.parameters += [kernel, biases]
        return out

    def fc(self, name, input_data, out_channel, trainable=True):
        """
        定義全連接組(展開圖像數據)
        :param name:
        :param input_data:
        :param out_channel:
        :return:
        """
        shape = input_data.get_shape().as_list()
        # get the flattened input size
        if len(shape) == 4:
            size = shape[-1]*shape[-2]*shape[-3]
        else:
            size = shape[1]
        input_data_flat = tf.reshape(input_data, [-1, size])

        # define the variable scope for this layer
        with tf.variable_scope(name):
            weights = tf.get_variable("weights", [size, out_channel], dtype=tf.float32, trainable=trainable)
            biases = tf.get_variable("biases", [out_channel], dtype=tf.float32, trainable=trainable)
            res = tf.nn.bias_add(tf.matmul(input_data_flat, weights), biases)
            out = tf.nn.relu(res, name=name)
            self.parameters += [weights, biases]
        return out

    def maxpool(self, name, input_data):
        """
        定義池化層
        :param name:
        :param input_data:
        :return:
        """
        with tf.variable_scope(name):
            out = tf.nn.max_pool(input_data, [1, 2, 2, 1], [1, 2, 2, 1], padding="SAME", name=name)
        return out

    def convlayers(self):
        """
        定義卷積模型
        :return:
        """
        # conv1
        self.conv1_1 = self.conv("conv1_1", self.imgs, 64)
        self.conv1_2 = self.conv("conv1_2", self.conv1_1, 64)
        self.pool1 = self.maxpool("pool1", self.conv1_2)

        # conv2
        self.conv2_1 = self.conv("conv2_1", self.pool1, 128)
        self.conv2_2 = self.conv("conv2_2", self.conv2_1, 128)
        self.pool2 = self.maxpool("pool2", self.conv2_2)

        # conv3
        self.conv3_1 = self.conv("conv3_1", self.pool2, 256)
        self.conv3_2 = self.conv("conv3_2", self.conv3_1, 256)
        self.conv3_3 = self.conv("conv3_3", self.conv3_2, 256)
        self.pool3 = self.maxpool("pool3", self.conv3_3)

        # conv4
        self.conv4_1 = self.conv("conv4_1", self.pool3, 512)
        self.conv4_2 = self.conv("conv4_2", self.conv4_1, 512)
        self.conv4_3 = self.conv("conv4_3", self.conv4_2, 512)
        self.pool4 = self.maxpool("pool4", self.conv4_3)

        # conv5
        self.conv5_1 = self.conv("conv5_1", self.pool4, 512)
        self.conv5_2 = self.conv("conv5_2", self.conv5_1, 512)
        self.conv5_3 = self.conv("conv5_3", self.conv5_2, 512)
        self.pool5 = self.maxpool("pool5", self.conv5_3)

    def fclayers(self):
        """
        定義全連接模型
        :return:
        """
        self.fc6 = self.fc("fc6", self.pool5, 4096, trainable=False)
        self.fc7 = self.fc("fc7", self.fc6, 4096, trainable=False)
        self.fc8 = self.fc("fc8", self.fc7, n_class)

    def load_weight(self, weight_file, sess):
        weights = np.load(weight_file)
        keys = sorted(weights.keys())
        for i, k in enumerate(keys):
            # keys in sorted order; the replaced fc8 layer's weights and biases sit at indices 30 and 31:
            # <class 'list'>: ['conv1_1_W', 'conv1_1_b', 'conv1_2_W', 'conv1_2_b',
            # 'conv2_1_W', 'conv2_1_b', 'conv2_2_W', 'conv2_2_b', 'conv3_1_W',
            # 'conv3_1_b', 'conv3_2_W', 'conv3_2_b', 'conv3_3_W', 'conv3_3_b',
            # 'conv4_1_W', 'conv4_1_b', 'conv4_2_W', 'conv4_2_b', 'conv4_3_W',
            # 'conv4_3_b', 'conv5_1_W', 'conv5_1_b', 'conv5_2_W', 'conv5_2_b',
            # 'conv5_3_W', 'conv5_3_b', 'fc6_W', 'fc6_b', 'fc7_W', 'fc7_b', 'fc8_W',
            # 'fc8_b']
            if i not in [30, 31]:
                sess.run(self.parameters[i].assign(weights[k]))
        print("-------------------- all done --------------------")


# ========== retrain and save the model
if "__main__" == __name__:

    # load the training set
    X_train, y_train = reader2.get_file(".\\cat_and_dog_r")
    image_batch, label_batch = reader2.get_batch(X_train, y_train, 224, 224, 25, 256)

    # define the model inputs
    x_imgs = tf.placeholder(tf.float32, [None, 224, 224, 3])
    y_imgs = tf.placeholder(tf.int32, [None, 2])

    # define the model
    vgg = Vgg16(x_imgs)
    fc3_cat_and_dog = vgg.probs  # model prediction (softmax output)

    # softmax_cross_entropy_with_logits expects pre-softmax logits, so feed fc8 rather than the softmax output
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=vgg.fc8, labels=y_imgs))
    optimizer = tf.train.GradientDescentOptimizer(0.0001).minimize(loss)

    # create the session
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)

        # load the pretrained weights
        vgg.load_weight(".\\vgg16_weights.npz", sess)

        # create the model saver
        saver = vgg.saver()

        # create the thread coordinator and start the input queue runners
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)

        # track training time
        start_time = time.time()

        for i in range(1000):
            image, label = sess.run([image_batch, label_batch])
            labels = reader2.onehot(label)

            # train, then print the model loss
            sess.run(optimizer, feed_dict={x_imgs: image, y_imgs: labels})
            res = sess.run(loss, feed_dict={x_imgs: image, y_imgs: labels})
            print("now the loss is: ", res)

            # print the elapsed time
            end_time = time.time()
            print("time:", (end_time-start_time))
            start_time = time.time()
            print("--------------------epoch %d is finished--------------------")

        saver.save(sess, ".\\vgg_finetuning_model\\")
        print("Optimization Finished!")
        coord.request_stop()
        coord.join(threads=threads)

create_and_read_TFRecord2.py

import numpy as np
import os
import tensorflow as tf
img_width = 224
img_height = 224


def get_file(file_dir):
    """
    數據的輸入
    :param file_dir: 數據文件地址
    :return:
    """
    images = []
    temp = []

    # walk the directory tree: the top-level pass yields the class sub-folders, later passes yield their image files
    for root, sub_folders, files in os.walk(file_dir):
        for name in files:
            images.append(os.path.join(root, name))
        for name in sub_folders:
            temp.append(os.path.join(root, name))

    labels = []
    for one_folder in temp:
        n_img = len(os.listdir(one_folder))
        letter = one_folder.split("\\")[-1]  # get the class name from the folder
        if "cat" == letter:
            labels = np.append(labels, n_img*[0])  # cat -> 0
        else:
            labels = np.append(labels, n_img*[1])  # dog -> 1

    # shuffle the (image, label) pairs together
    temp = np.array([images, labels])
    temp = temp.transpose()
    np.random.shuffle(temp)
    image_list = list(temp[:, 0])
    label_list = list(temp[:, 1])
    label_list = [int(float(i)) for i in label_list]

    return image_list, label_list


def get_batch(image_list, label_list, img_width, img_height, batch_size, capacity):
    """
    加載對應圖片和標籤
    :param image_list:
    :param label_list:
    :param img_width:
    :param img_height:
    :param batch_size:
    :param capacity:
    :return:
    """
    image = tf.cast(image_list, tf.string)
    label = tf.cast(label_list, tf.int32)

    input_queue = tf.train.slice_input_producer([image, label])

    label = input_queue[1]
    image_contents = tf.read_file(input_queue[0])
    # decode the image into a uint8, 3-channel tensor
    image = tf.image.decode_jpeg(image_contents, channels=3)

    # crop/pad the image to the target size
    image = tf.image.resize_image_with_crop_or_pad(image, img_width, img_height)
    # per-image standardization
    image = tf.image.per_image_standardization(image)
    image_batch, label_batch = tf.train.batch([image, label], batch_size=batch_size, num_threads=64, capacity=capacity)
    label_batch = tf.reshape(label_batch, [batch_size])

    return image_batch, label_batch
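
# Note: this queue-based pipeline only produces data after
# tf.train.start_queue_runners is called with a tf.train.Coordinator,
# as done in the training loop of VGG16_model.py.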


def onehot(labels):
    """
    標籤編碼
    :param labels:
    :return:
    """
    n_sample = len(labels)
    n_class = max(labels) + 1

    # build the label matrix and set the hot entries
    onehot_labels = np.zeros((n_sample, n_class))
    onehot_labels[np.arange(n_sample), labels] = 1

    return onehot_labels
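
# Example: onehot([0, 1, 1, 0]) returns
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]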

main.py

# Cats vs. Dogs using the finetuned VGGNet
import tensorflow as tf
import numpy as np
from VGG16_model import Vgg16
from imagenet_classes import class_names
import os
from scipy.misc import imread, imresize


if "__main__" == __name__:
    imgs = tf.placeholder(tf.float32, [None, 224, 224, 3])

    sess = tf.Session()

    vgg = Vgg16(imgs)
    fc3_cat_and_dog = vgg.probs

    saver = vgg.saver()
    saver.restore(sess, ".\\vgg_finetuning_model\\")

    for root, sub_folders, files in os.walk(".\\cat_and_dog_t"):
        i = 0
        cat = 0
        dog = 0
        for name in files:
            i += 1
            filepath = os.path.join(root, name)

            try:
                img1 = imread(filepath, mode="RGB")
                img1 = imresize(img1, (224, 224))
            except:
                print("remove", filepath)
                continue  # skip unreadable images instead of reusing the previous one
            pred = sess.run(fc3_cat_and_dog, feed_dict={vgg.imgs: [img1]})[0]
            max_index = np.argmax(pred)
            if 0 == max_index:
                cat += 1
            else:
                dog += 1
            if i % 50 == 0:
                acc = (dog * 1.) / (cat + dog)  # fraction classified as "dog"
                print(acc)
                print("---------- img number is %d ----------" % i)

 
