A small goal: build a classifier that beats LeNet-5! (model1_cifar10)

I haven't come up with a new model architecture yet...

First I want to see how far the result of training on CIFAR-10 is from LeNet-5. The baseline mainly follows the one referenced here.

However, I don't have a 1080 Ti... I only have a 1060 with 6 GB. Sob.

First, write a script that loads and preprocesses CIFAR-10. The script is cifar10_loader.py in the earlier repo; here I only go over the main functions:

 

import pickle
import glob
import cv2
import tqdm
import os
import sys
import logging
import random
import numpy as np
import math
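class Cifa10_data: # holds the CIFAR-10 data; the training script mainly uses this class
    # cropSize: side length to crop images to; 28 here (28*28, originals are 32*32) to stay consistent with MNIST
    # rotate_ratio: fraction of images randomly chosen for rotation
    # flip_ratio: fraction of images randomly chosen for horizontal mirroring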
    def __init__(self,base_dir,batch_size,rotate_ratio,flip_ratio,cropSize,validate_batch_num=3):
        self.train_data_tensor,self.test_data_tensor,\
        self.train_label_tensor,self.test_label_tensor=load_cifar10(base_dir,rotate_ratio,
                                                                            flip_ratio,
                                                                                cropSize)
        self.batch_size=batch_size
        self.batchs_for_one_epoch_train=self.train_data_tensor.shape[0]//batch_size
        self.batchs_for_one_epoch_test=self.test_data_tensor.shape[0]//batch_size
        self.train_batch_counter=0
        self.test_batch_counter=0
        self.label_map=load_label_map(base_dir)
        self.valid_batches=validate_batch_num
        self.shuffle_train()

    def next_Batch_train(self):
        if(self.train_batch_counter+1)<self.batchs_for_one_epoch_train:
            start_idx=self.train_batch_counter*self.batch_size
            end_idx=(self.train_batch_counter+1)*self.batch_size
            self.train_batch_counter+=1
        else:
            self.train_batch_counter=0
            start_idx=0
            end_idx=self.batch_size
            self.shuffle_train()

        return self.train_data_tensor[start_idx:end_idx],self.train_label_tensor[start_idx:end_idx]

    def next_Batch_test(self):
        if(self.test_batch_counter+1)<self.batchs_for_one_epoch_test:
            start_idx=self.test_batch_counter*self.batch_size
            end_idx=(self.test_batch_counter+1)*self.batch_size
            self.test_batch_counter+=1
        else:
           return None

        return self.test_data_tensor[start_idx:end_idx],self.test_label_tensor[start_idx:end_idx]
    def get_validate_datas(self):
        start_idx=0
        end_idx=self.valid_batches*self.batch_size
        return self.test_data_tensor[start_idx:end_idx],self.test_label_tensor[start_idx:end_idx]
    def shuffle_train(self):
        perm=list(range(self.train_data_tensor.shape[0]))
        np.random.shuffle(perm)
        self.train_data_tensor=self.train_data_tensor[perm]
        self.train_label_tensor=self.train_label_tensor[perm]

def file_loader(file_path):
    with open(file_path, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    # CIFAR-10 rows are flat 1-D arrays; decode each one back into a 32*32 color image
    images=map(lambda x:rotate_image(
                                        cv2.cvtColor(
                                            np.array(x).reshape((32,32,3)
                                                                ,order="F"
                                                                ),
                                            cv2.COLOR_RGB2BGR
                                        ),
                                        270,
                                        True
                                    ),
               dict[b'data']
               )

    labels=dict[b'labels']
    return list(images),labels

def load_cifar10(base_dir:str,rotate_ratio=0.1,flip_ratio=0.1,croppedSize=None):
    train_flie_list=glob.glob(os.path.join(base_dir,"data_batch_*"))
    test_file_list=glob.glob(os.path.join(base_dir,"test_batch"))

    train_image=[]
    train_label=[]
    test_image=[]
    test_label=[]
    logging.info("train data file loading....")
    for file_path in tqdm.tqdm(train_flie_list):
        images,labels=file_loader(file_path)
        train_image.extend(images)
        train_label.extend(labels)

    logging.info("test file loading....")
    for file_path in tqdm.tqdm(test_file_list):
        images,labels=file_loader(file_path)
        test_image.extend(images)
        test_label.extend(labels)

    logging.info("data preprocessing")
    train_data_tensor,train_label_tensor=preprocess(train_image,train_label,True,rotate_ratio,flip_ratio,croppedSize)
    test_data_tensor,test_label_tensor=preprocess(test_image,test_label,False,rotate_ratio,flip_ratio,croppedSize)
    return train_data_tensor,test_data_tensor,train_label_tensor,test_label_tensor



def rotate_image(img,rotate,keep_size=False):

    height, width = img.shape[:2]
    if not keep_size:
        heightNew = int(width * math.fabs(math.sin(math.radians(rotate))) + height * math.fabs(math.cos(math.radians(rotate))))
        widthNew = int(height * math.fabs(math.sin(math.radians(rotate))) + width * math.fabs(math.cos(math.radians(rotate))))
    else:
        heightNew=height
        widthNew=width
    matRotation = cv2.getRotationMatrix2D((width / 2, height / 2), rotate, 1)

    matRotation[0, 2] += (widthNew - width) / 2
    matRotation[1, 2] += (heightNew - height) / 2

    imgRotation = cv2.warpAffine(img, matRotation, (widthNew, heightNew), borderValue=(255, 255, 255))
    return imgRotation


def preprocess(images_list,label_list,is_train=True,rotate_ratio=0.1,flip_ratio=0.1,cropSzie=None):
    rotate_angle=[30,60,90]
    flip_code=[1]
    if cropSzie is None:
        offset=0
    else:
        offset=(images_list[0].shape[0]-cropSzie)//2

    cropped_size=images_list[0].shape[0]-offset
    cropSzie=images_list[0].shape[0]-2*offset
    if not is_train:
        image_element_tensor=[item[offset:cropped_size,offset:cropped_size,:].reshape(1,cropSzie,cropSzie,3) for item in images_list]
        return np.concatenate(image_element_tensor,axis=0).astype(np.float32),build_onehot(label_list,10).astype(np.float32)
    else:
        smaple_idx_list=random.sample(range(0,len(images_list)),int(len(images_list)*rotate_ratio))
        smaple_flip_idx_list=random.sample(range(0,len(images_list)),int(len(images_list)*flip_ratio))
        rotated_images=list(map(lambda x:rotate_image(images_list[x],np.random.choice(rotate_angle),True),smaple_idx_list))
        rotate_image_labels=[label_list[item] for item in smaple_idx_list]
        fliped_images=list(map(lambda x:cv2.flip(images_list[x],np.random.choice(flip_code)),smaple_flip_idx_list))
        fliped_image_labels=[label_list[item] for item in smaple_flip_idx_list]
        images_list.extend(rotated_images)
        label_list.extend(rotate_image_labels)
        images_list.extend(fliped_images)
        label_list.extend(fliped_image_labels)

        image_element_tensor=[item[offset:cropped_size,offset:cropped_size,:].reshape(1,cropSzie,cropSzie,3) for item in images_list]
        return np.concatenate(image_element_tensor,axis=0).astype(np.float32),build_onehot(label_list,10).astype(np.float32)


def build_onehot(labels,label_num):
    label_tensor=np.zeros((len(labels),label_num),dtype=int)  # np.int was removed in NumPy 1.24
    for i in range(len(labels)):
        label_tensor[i,labels[i]]=1
    return label_tensor

def load_label_map(base_dir):
    file_path=os.path.join(base_dir,"batches.meta")
    with open(file_path, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return [str(item, encoding = "utf-8") for item in dict[b'label_names']]

if __name__ == "__main__":
    logger = logging.getLogger()    # initialize logging class
    logger.setLevel(logging.DEBUG)  # default log level
    format = logging.Formatter("%(asctime)s - %(message)s")    # output format
    sh = logging.StreamHandler(stream=sys.stdout)    # output to standard output
    sh.setFormatter(format)
    logger.addHandler(sh)


    data_loader=Cifa10_data("C:\\Users\\rebel\\.keras\\datasets\\cifar-10-batches-py",128,0.25,0.25,28,3)
    print(data_loader.test_data_tensor.shape)
    print(data_loader.train_data_tensor.shape)
    print(data_loader.get_validate_datas()[0].shape)

Note that this uses the Python version of the CIFAR-10 data.
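For reference, the Python-version batches store each image as one row of 3072 bytes: 1024 red values, then 1024 green, then 1024 blue, each plane laid out row by row. Below is a minimal NumPy sketch of decoding one such row directly into a 32x32x3 array. The helper name `row_to_hwc` is made up for illustration and is not part of cifar10_loader.py, and the row is synthetic rather than real CIFAR-10 data.

```python
import numpy as np

def row_to_hwc(row):
    """Turn one flat CIFAR-10 row (R, G, B planes of 1024 values each) into a (32, 32, 3) RGB array."""
    row = np.asarray(row, dtype=np.uint8)
    # (channel, height, width) -> (height, width, channel)
    return row.reshape(3, 32, 32).transpose(1, 2, 0)

# Synthetic row: red plane all 10, green plane all 20, blue plane all 30,
# except the top-right red pixel, which is set to 99.
row = np.concatenate([np.full(1024, 10), np.full(1024, 20), np.full(1024, 30)]).astype(np.uint8)
row[31] = 99  # row 0, column 31 of the red plane
img = row_to_hwc(row)
print(img.shape)   # (32, 32, 3)
print(img[0, 31])  # [99 20 30]
```

If OpenCV's BGR channel order is needed afterwards, `img[..., ::-1]` reverses the channels. A decode like this can also serve as a cross-check of the image orientation produced by `file_loader`.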

 

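Then I made modifications on top of model1.

It's worth mentioning that the model in the previous post had a bug in its image standardization... It should have divided by the standard deviation, but divided by the variance instead... (already fixed in the repo)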

mean,var=tf.nn.moments(x_image,[1,2],keep_dims=True)
x_image=tf.subtract(x_image,mean)
x_image=tf.divide(x_image,tf.sqrt(var)) # bug fix here: divide by the standard deviation
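To see why the divisor matters, here is a small NumPy sketch. NumPy stands in for the TensorFlow ops, and the batch is random data made up for the example. It standardizes each image per channel over the height and width axes, the same axes `[1,2]` used above. Dividing by the standard deviation gives unit variance; dividing by the variance does not.

```python
import numpy as np

rng = np.random.default_rng(0)
# Fake batch of 4 images, 28x28x3, with pixel values in [0, 255).
x = rng.uniform(0, 255, size=(4, 28, 28, 3)).astype(np.float32)

# Population mean and variance per image and channel, like tf.nn.moments over axes [1, 2].
mean = x.mean(axis=(1, 2), keepdims=True)
var = x.var(axis=(1, 2), keepdims=True)

by_std = (x - mean) / np.sqrt(var)  # the fixed version
by_var = (x - mean) / var           # the buggy version

print(by_std.var(axis=(1, 2))[0])   # close to 1 for every channel
print(by_var.var(axis=(1, 2))[0])   # close to 1/var, far below 1
```

With raw pixel values in the 0 to 255 range the variance is in the thousands, so the buggy version squeezes the inputs into a very narrow range around zero.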

1. First change the model input. Since these are color images, the input dimensions need to become:

 

cropSize=28
x=tf.placeholder(shape=[None,cropSize,cropSize,3],dtype=tf.float32)
y=tf.placeholder(shape=[None,10],dtype=tf.float32)
keep=tf.placeholder(tf.float32)
#change 1:normalize input
mean,var=tf.nn.moments(x,[1,2],keep_dims=True)
x_image__=tf.subtract(x,mean)
x_image1=tf.divide(x_image__,tf.sqrt(var))

 

2. Since data loading now goes through the Cifa10_data class, the code that reads training and test data also needs small changes.
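The post does not show the modified reading code, so the following is only a sketch of what the loops might look like. `TinyLoader` is a made-up stand-in that copies the method names and return conventions of `Cifa10_data` (`next_Batch_train` always returns a batch, `next_Batch_test` returns `None` once the test set is exhausted). It does not reproduce the exact batch counts of the real class, and the real training and evaluation ops are replaced by counters.

```python
import numpy as np

class TinyLoader:
    """Hypothetical stand-in mimicking the Cifa10_data batching interface."""
    def __init__(self, n_train, n_test, batch_size):
        self.batch_size = batch_size
        self.train_x = np.zeros((n_train, 28, 28, 3), np.float32)
        self.train_y = np.zeros((n_train, 10), np.float32)
        self.test_x = np.zeros((n_test, 28, 28, 3), np.float32)
        self.test_y = np.zeros((n_test, 10), np.float32)
        self.train_pos = 0
        self.test_pos = 0

    def next_Batch_train(self):
        # Wrap around to the start when fewer than batch_size samples remain.
        if self.train_pos + self.batch_size > len(self.train_x):
            self.train_pos = 0
        s = self.train_pos
        self.train_pos += self.batch_size
        return self.train_x[s:s + self.batch_size], self.train_y[s:s + self.batch_size]

    def next_Batch_test(self):
        # Return None once the test set has been consumed.
        if self.test_pos + self.batch_size > len(self.test_x):
            return None
        s = self.test_pos
        self.test_pos += self.batch_size
        return self.test_x[s:s + self.batch_size], self.test_y[s:s + self.batch_size]

loader = TinyLoader(n_train=1000, n_test=300, batch_size=128)

train_batches = 0
for step in range(10):               # training: one batch per step
    x, y = loader.next_Batch_train()
    train_batches += 1               # a real script would run the optimizer here

test_batches = 0
batch = loader.next_Batch_test()
while batch is not None:             # testing: iterate until the loader returns None
    x, y = batch
    test_batches += 1                # a real script would accumulate accuracy here
    batch = loader.next_Batch_test()

print(train_batches, test_batches)   # 10 2
```

The point of the `while batch is not None` loop is that the test reader signals the end of the data with `None` instead of wrapping around, so the caller has to check before unpacking.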

 

Then the total number of steps is set to 10000.

 

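Following the earlier approach, training is split into two stages: Adam for the first stage and SGD for the second. I just ran it once and, sure enough, Adam is hard to please: the gradients often blew up somewhere around steps 3000 to 6000 and the loss turned into NaN...

After repeatedly adjusting the learning rate, training finally ran to completion:

acc: 74.2%, with dropout keep 0.6 in stage one and keep 0.9 in stage two, flip_ratio and rotate_ratio both 0.05, and learning rates of 4e-5 for Adam and 4e-6 for SGD.

During training I noticed a fairly large gap between train acc and validation acc, which suggested some overfitting, so I changed the settings again:

dropout keep 0.5 in stage one and keep 1.0 in stage two

acc: 75.4%

Stage one ran for 7000 steps and stage two for 3000 steps; training took about 13 min.

Hmm... still can't beat it.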

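While tuning stages one and two, it occurred to me that I could add a warm-up phase: train a few steps with a relatively large learning rate and only then start stages one and two. Would that work better?

So the schedule became:

Stage 1: Adam, learning rate 4e-4, 2000 steps, keep=0.3

Stage 2: Adam, learning rate 4e-5, 5000 steps, keep=0.5

Stage 3: SGD, learning rate 4e-6, 3000 steps, keep=1.0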
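The three stages above can be written down as a small lookup table. This is only a sketch: `STAGES` and `stage_for_step` are names invented here, and the actual training script in the repo may organize its stages differently.

```python
# (optimizer name, learning rate, number of steps, dropout keep probability)
STAGES = [
    ("adam", 4e-4, 2000, 0.3),  # stage 1: warm-up
    ("adam", 4e-5, 5000, 0.5),  # stage 2
    ("sgd",  4e-6, 3000, 1.0),  # stage 3
]

def stage_for_step(step):
    """Return (optimizer, learning_rate, keep_prob) for a 0-based global step."""
    start = 0
    for name, lr, n_steps, keep in STAGES:
        if step < start + n_steps:
            return name, lr, keep
        start += n_steps
    raise ValueError("step %d is past the end of the schedule" % step)

total_steps = sum(n for _, _, n, _ in STAGES)
print(total_steps)           # 10000
print(stage_for_step(0))     # ('adam', 0.0004, 0.3)
print(stage_for_step(2000))  # ('adam', 4e-05, 0.5)
print(stage_for_step(9999))  # ('sgd', 4e-06, 1.0)
```

Keeping the stage boundaries in one table makes it easy to confirm that the step counts add up to the intended total before starting a run.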

At the same time I raised flip_ratio to 0.1 to bring in more horizontally mirrored images.

 

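But my hand slipped... so the second stage ended up chained onto the first, meaning the first 2000 steps were trained twice... In practice that run trained for 12000 steps in total.

Final acc: 77.48%, 1.2 percentage points above the baseline, with a training time of 16 min.

After correcting the slip, final acc: 77.23%, with a training time of 13 min.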

 

In the end this shows that my small model (model1) beats LeNet-5!

The code is at: https://github.com/lordrebel/beatLenet5 under model1_cifar10
