2020.3.31 miniImagenet Dataset Processing

Original

2020-06-23 08:57

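The miniImagenet dataset is structured as follows: all images sit in a single folder, alongside three CSV files: train.csv, val.csv, and test.csv. Each CSV file has two columns; the first is the filename and the second is the label. To use the dataset, I split the images into train, val, and test folders, and within each split saved every image into a subfolder named after its label.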

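The directory structure after processing is as follows:

ok/
    train/<label>/<image files>
    val/<label>/<image files>
    test/<label>/<image files>

The processing code: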

import csv
import os
from PIL import Image

train_csv_path = "C:/Users/MMatx/Desktop/研究生/mini-imagenet/mini-imagenet/train.csv"
val_csv_path = "C:/Users/MMatx/Desktop/研究生/mini-imagenet/mini-imagenet/val.csv"
test_csv_path = "C:/Users/MMatx/Desktop/研究生/mini-imagenet/mini-imagenet/test.csv"


def load_labels(csv_path):
    """Return a dict mapping image filename to label for one split."""
    labels = {}
    with open(csv_path, newline="") as csvfile:
        csv_reader = csv.reader(csvfile)
        next(csv_reader)  # skip the header row
        for row in csv_reader:
            labels[row[0]] = row[1]
    return labels


train_label = load_labels(train_csv_path)
val_label = load_labels(val_csv_path)
test_label = load_labels(test_csv_path)

img_path = "C:/Users/MMatx/Desktop/研究生/mini-imagenet/mini-imagenet/images"
new_img_path = "C:/Users/MMatx/Desktop/研究生/mini-imagenet/mini-imagenet/ok"

for png in os.listdir(img_path):
    # Decide which split the image belongs to and look up its label
    # in that split's own dictionary.
    if png in train_label:
        split, label = "train", train_label[png]
    elif png in val_label:
        split, label = "val", val_label[png]
    elif png in test_label:
        split, label = "test", test_label[png]
    else:
        continue  # file is not listed in any CSV

    # Create <new_img_path>/<split>/<label> if it does not exist yet.
    temp_path = os.path.join(new_img_path, split, label)
    os.makedirs(temp_path, exist_ok=True)

    # Open the image and save it under the label folder.
    with Image.open(os.path.join(img_path, png)) as im:
        im.save(os.path.join(temp_path, png))

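Python knowledge involved:

1. Reading and writing CSV files in Python

Reading a CSV file with Python's csv module proceeds row by row. Each row is returned as a list, so any field can be accessed by indexing into that list with its column position.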
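As a minimal sketch of the row-by-row reading described above, the snippet below parses a small in-memory CSV with the same two-column layout. The sample rows and the helper name read_labels are made up for illustration and are not taken from the dataset files.

```python
import csv
import io

# Hypothetical sample in the same layout as train.csv: filename, label.
sample = "filename,label\nimg_0001.jpg,class_a\nimg_0002.jpg,class_b\n"


def read_labels(csvfile):
    """Map each filename (column 0) to its label (column 1)."""
    reader = csv.reader(csvfile)
    next(reader)  # the first row is the header, not data
    return {row[0]: row[1] for row in reader}


labels = read_labels(io.StringIO(sample))
print(labels)
```

The same function works on a real file object, for example one returned by open(path, newline="").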

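2. Checking whether a file or directory exists in Python

(1) Check whether a file or directory exists: os.path.exists(path)

(2) Create a new directory, including any missing parent directories: os.makedirs(path)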
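The two calls can be combined into one step. The sketch below, run in a temporary directory, uses the exist_ok=True argument of os.makedirs so that no separate existence check is needed. The helper name ensure_dir and the folder names are illustrative assumptions.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


root = tempfile.mkdtemp()  # throwaway directory for the demonstration
target = os.path.join(root, "train", "class_a")

existed_before = os.path.exists(target)
ensure_dir(target)
ensure_dir(target)  # calling again does not raise
existed_after = os.path.isdir(target)
print(existed_before, existed_after)
```

Without exist_ok=True, os.makedirs raises FileExistsError when the directory is already there, which is why the original script checks os.path.exists first.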

3. Saving an image into a new folder

Use from PIL import Image:

img=Image.open(path)

img.save(new_path)
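Opening and saving through PIL decodes and re-encodes each image. If the goal is only to move files into label folders, a plain file copy is a possible alternative. This is a sketch of a different technique, not the method used in the script above; it uses shutil.copy2 from the standard library, and the helper name copy_image and the dummy file contents are assumptions for illustration.

```python
import os
import shutil
import tempfile


def copy_image(src_path, dst_dir):
    """Copy src_path into dst_dir unchanged and return the new file's path."""
    os.makedirs(dst_dir, exist_ok=True)
    return shutil.copy2(src_path, dst_dir)


root = tempfile.mkdtemp()
src = os.path.join(root, "img_0001.jpg")
with open(src, "wb") as f:
    f.write(b"dummy image bytes")  # stand-in for real image data

dst = copy_image(src, os.path.join(root, "ok", "train", "class_a"))
with open(dst, "rb") as f:
    copied = f.read()
print(os.path.basename(dst))
```

A byte-for-byte copy keeps the original file untouched, whereas Image.save writes a newly encoded file.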

4. Python's built-in glob module supports wildcard file searches
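A short sketch of wildcard matching with glob, run against a temporary directory populated with made-up filenames:

```python
import glob
import os
import tempfile

root = tempfile.mkdtemp()
for name in ("a.jpg", "b.jpg", "notes.txt"):
    # Create empty placeholder files to search through.
    open(os.path.join(root, name), "w").close()

# glob.escape guards against special characters in the directory path itself.
pattern = os.path.join(glob.escape(root), "*.jpg")
names = sorted(os.path.basename(p) for p in glob.glob(pattern))
print(names)
```

glob.glob returns matches in arbitrary order, so the result is sorted here to make it predictable.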
