YOLOv3 code study (1): small utility snippets

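I started learning from the blog post below; these notes summarize what I picked up.
https://blog.csdn.net/qq_34199326/article/details/84206079
My strongest impression from reading the source is that it is all about dimensions: dimensions, dimensions, dimensions. You will see unsqueeze and repeat all the time.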

1. Reading an image with OpenCV and converting it into a format the model can process (the first function in darknet.py).

import cv2
import torch
from torch.autograd import Variable

def get_test_input():
    img = cv2.imread("dog-cycle-car.png")
    img = cv2.resize(img, (416, 416))          # resize to the input dimension, img is [416, 416, 3]
    print(img.shape)
    # [:, :, ::-1] reverses the third (channel) axis, BGR -> RGB; transpose gives [3, 416, 416].
    # .copy() is needed because torch.from_numpy rejects arrays with negative strides.
    img_ = img[:, :, ::-1].transpose((2, 0, 1)).copy()
    img_ = torch.from_numpy(img_).float()      # convert to a float tensor
    img_ = Variable(img_)                      # no-op since PyTorch 0.4, kept from the original
    return img_

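2. Reading the YOLOv3 network from the .cfg file (the second function in darknet.py).

The .cfg file is read line by line into a list; comments, empty lines, and leading/trailing whitespace are removed.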

with open(cfgfile, 'r') as file:
    lines = file.read().split('\n')  # store the lines in a list
lines = [x for x in lines if len(x) > 0]  # drop empty lines, keep non-empty ones
lines = [x for x in lines if x[0] != '#']  # drop comments
lines = [x.rstrip().lstrip() for x in lines]  # strip whitespace on both sides

The complete function is below. It is short and effective, though it took me quite a while to understand.

def parse_cfg(cfgfile):
    """
    Takes a configuration file.

    Returns a list of blocks. Each block describes a block in the neural
    network to be built. A block is represented as a dictionary in the list.
    """

    with open(cfgfile, 'r') as file:
        lines = file.read().split('\n')  # store the lines in a list
    lines = [x for x in lines if len(x) > 0]  # drop empty lines, keep non-empty ones
    lines = [x for x in lines if x[0] != '#']  # drop comments
    lines = [x.rstrip().lstrip() for x in lines]  # strip whitespace on both sides

    block = {}
    blocks = []

    for line in lines:
        if line[0] == "[":  # this marks the start of a new block
            if len(block) != 0:  # a non-empty block holds the values of the previous block
                blocks.append(block)  # add it to the blocks list
                block = {}  # re-init the block

            block["type"] = line[1:-1].rstrip()  # [1:-1] extracts "net" from "[net]"
        else:
            key, value = line.split("=")
            block[key.rstrip()] = value.lstrip()
    blocks.append(block)  # after the loop, append the last block, which has not been added yet

    # Could the first element be an empty dict? No: a block is appended only when it is non-empty.
    return blocks

The returned blocks is a list in which every element is a dictionary.
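To see what that structure looks like, here is a minimal sketch that applies the same parsing logic to an in-memory string instead of a file. The helper name parse_cfg_text and the two-block sample config are made up for illustration; they are not part of darknet.py.

```python
def parse_cfg_text(text):
    """Same logic as parse_cfg, but reads from a string (hypothetical helper)."""
    lines = [x.strip() for x in text.split("\n")]
    lines = [x for x in lines if x and not x.startswith("#")]

    block, blocks = {}, []
    for line in lines:
        if line.startswith("["):
            if block:                       # store the previous block first
                blocks.append(block)
                block = {}
            block["type"] = line[1:-1].strip()
        else:
            key, value = line.split("=", 1)
            block[key.strip()] = value.strip()
    blocks.append(block)                    # the last block is still pending
    return blocks


# Made-up sample, not the real yolov3.cfg.
sample_cfg = """
[net]
# input size
width = 416
height = 416

[convolutional]
filters=32
size=3
"""

blocks = parse_cfg_text(sample_cfg)
for b in blocks:
    print(b)
```

Note that every value stays a string, so code that builds the network has to convert with int() or float() where needed.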

3. Some file path operations

Get the absolute path of the current directory with osp.realpath('.')

import os
import os.path as osp
imlist = [osp.join(osp.realpath('.'), images, img) for img in os.listdir(images)]

os.listdir returns a list of the files and folders under the given path.
Create the directory if it does not exist:

if not os.path.exists(args.det):
    os.makedirs(args.det)
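The two path idioms above can be tried without touching a real project. The sketch below works inside a temporary directory; the folder names imgs and det and the file names are placeholders chosen for this example. It also shows os.makedirs with exist_ok=True, which is an alternative to the explicit existence check.

```python
import os
import os.path as osp
import tempfile

# Work inside a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as root:
    images = osp.join(root, "imgs")   # hypothetical input folder
    det = osp.join(root, "det")       # hypothetical output folder

    os.makedirs(images)
    for name in ("b.jpg", "a.jpg"):
        open(osp.join(images, name), "w").close()

    # os.listdir returns names in arbitrary order, so sort for reproducibility.
    imlist = [osp.join(osp.realpath(images), name)
              for name in sorted(os.listdir(images))]

    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(det, exist_ok=True)
    os.makedirs(det, exist_ok=True)

    names = [osp.basename(p) for p in imlist]
    all_absolute = all(osp.isabs(p) for p in imlist)
    det_exists = osp.isdir(det)

print(names, all_absolute, det_exists)
```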

4. try and except

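# Try the stored path first, then fall back to the same name with a
# .png and finally a .jpeg extension.
try:
    img = Image.open(self.images[index]).convert('RGB')
except Exception:
    try:
        img = Image.open(self.images[index][:-3] + 'png').convert('RGB')
    except Exception:
        img = Image.open(self.images[index][:-3] + 'jpeg').convert('RGB')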

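A single try can also be followed by several except clauses at the same level, as long as each one handles a different exception type. This pattern appears in detect.py.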

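try:
    imlist = [osp.join(osp.realpath('.'), images, img) for img in os.listdir(images)]
except NotADirectoryError:
    # `images` is the name of a single image file, not a folder of images,
    # so the list holds just that one absolute path.
    imlist = [osp.join(osp.realpath('.'), images)]
except FileNotFoundError:
    print("No file or directory with the name {}".format(images))
    exit()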
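The folder-or-single-file fallback can be exercised end to end with a small sketch. The helper name collect_images is hypothetical, and unlike detect.py it returns an empty list instead of calling exit() so that all three branches can be run in one script.

```python
import os
import os.path as osp
import tempfile


def collect_images(images):
    """Return image paths for a folder or a single file (hypothetical helper)."""
    try:
        return [osp.join(osp.realpath(images), name)
                for name in sorted(os.listdir(images))]
    except NotADirectoryError:
        # `images` names a single file, so the list holds just that path.
        return [osp.realpath(images)]
    except FileNotFoundError:
        print("No file or directory with the name {}".format(images))
        return []


with tempfile.TemporaryDirectory() as root:
    single = osp.join(root, "dog.png")
    open(single, "w").close()

    from_dir = collect_images(root)                     # folder branch
    from_file = collect_images(single)                  # NotADirectoryError branch
    missing = collect_images(osp.join(root, "nope"))    # FileNotFoundError branch

print(len(from_dir), len(from_file), missing)
```

Each except clause names a different exception type, so only the matching handler runs.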

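5. Using map

I still do not fully understand map.
map(function, iterable, iterable, ...)

The call below adds an empty batch dimension to every image, giving shape [1, 3, h, w]; prep_image is defined in util.py.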

map( prep_image, load_imgs, [inp_dim for x in range(len(imlist))] )

In video.py, map is combined with lambda to draw the results on the frame: the lambda is the function, and each element of output is passed in as x.

list(map(lambda x: write(x, frame), output))
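Both uses of map can be reproduced with plain numbers. In this sketch the function scale and the lists are stand-ins invented for the example (scale plays the role of prep_image, and the appending lambda plays the role of write); it also shows why the video.py call is wrapped in list().

```python
def scale(value, factor):
    """Multiply one value by its matching factor."""
    return value * factor


values = [1, 2, 3]
factors = [10 for _ in range(len(values))]   # one extra argument per element

# With two iterables, map calls scale(values[i], factors[i]).
scaled = list(map(scale, values, factors))

# map is lazy in Python 3: nothing runs until the result is consumed.
log = []
pending = map(lambda x: log.append(x * 2), values)
ran_before_list = len(log)     # no call has happened yet
results = list(pending)        # forces the calls; log is filled as a side effect

print(scaled)
print(log)
print(results)
```

When map is used only for its side effects, as in the drawing call, the returned list is just a list of None values; a plain for loop does the same job and is usually easier to read.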

6. Grouping single images into batches

# Create the batches: split all test images into batches of batch_size
leftover = 0
if len(im_dim_list) % batch_size:  # the image count is not divisible by batch_size
    leftover = 1

# Split the images into num_batches batches; leftover = 1 means there is a remainder.
# If batch_size != 1, each batch of images is stored as one element of im_batches,
# built with the expression inside the if statement. If batch_size == 1, each image
# already is one element of im_batches.
if batch_size != 1:
    # Number of batches = number of images // batch_size + leftover.
    # With 11 images and batch_size = 2, this gives 6 batches.
    num_batches = len(imlist) // batch_size + leftover

    # The list comprehension wraps torch.cat, so each batch is concatenated separately.
    im_batches = [torch.cat(im_batches[i * batch_size : min((i + 1) * batch_size, len(im_batches))])
                  for i in range(num_batches)]

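The slice [i*batch_size : min((i + 1)*batch_size, len(im_batches))] caps the end index at the list length, which covers the case where the number of images is not divisible by batch_size. Python slices clamp an out-of-range end index anyway, so the min is a safeguard rather than a requirement.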
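The batch arithmetic can be checked without PyTorch. In the sketch below plain integers stand in for image tensors and the torch.cat step is left out, so each batch is simply a sub-list; the function name make_batches is made up for this example.

```python
def make_batches(items, batch_size):
    """Split items into consecutive chunks of at most batch_size elements."""
    leftover = 1 if len(items) % batch_size else 0
    num_batches = len(items) // batch_size + leftover
    return [items[i * batch_size : min((i + 1) * batch_size, len(items))]
            for i in range(num_batches)]


images = list(range(11))             # 11 placeholder "images"
batches = make_batches(images, 2)

print(len(batches))                  # number of batches
print(batches[0], batches[-1])       # first full batch and the short last batch
```

With 11 items and batch_size = 2 the last batch holds a single item, which is the remainder case the leftover flag exists for.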

7. Resizing images (in util.py).

Resize the image while keeping its aspect ratio, then pad the remaining area with a constant value.

OpenCV version

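import cv2
import numpy as np

def letterbox_image(img, inp_dim):
    """
    letterbox_image() scales the image while keeping its aspect ratio and fills
    the empty area with (128, 128, 128).
    After scaling, one side exactly matches the target length and the other is
    less than or equal to it. The scaled data is then copied to the center of
    the canvas, which is returned.
    :param img: an image read with OpenCV (cv2.imread)
    :param inp_dim: target size as (w, h), e.g. (416, 416)
    :return: the padded canvas of shape [h, w, 3]
    """
    img_w, img_h = img.shape[1], img.shape[0]
    w, h = inp_dim  # inp_dim is the target size, e.g. 416 x 416
    new_w = int(img_w * min(w/img_w, h/img_h))
    new_h = int(img_h * min(w/img_w, h/img_h))
    # Scale to new_w x new_h with the aspect ratio unchanged, using bicubic
    # interpolation; a 768 x 576 image becomes 416 x 312.
    resized_image = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_CUBIC)
    # Create the canvas: an array of the final size [h, w, 3] (here 416 x 416 x 3)
    # with every element set to 128.
    canvas = np.full((inp_dim[1], inp_dim[0], 3), 128)
    # Assign the scaled image to the [new_h, new_w, 3] region of the canvas whose
    # center is aligned with the canvas center; the result is the letterboxed image.
    canvas[(h - new_h)//2 : (h - new_h)//2 + new_h, (w - new_w)//2 : (w - new_w)//2 + new_w, :] = resized_image
    # print(" Resize success")
    return canvas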

PIL.Image version

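from PIL import Image

def letter_image(img, net_w, net_h):
    img_w, img_h = img.size
    # Pick the side that limits the scale, using integer arithmetic.
    if float(net_w) / float(img_w) < float(net_h) / float(img_h):
        new_w = net_w
        new_h = (img_h * net_w) // img_w
    else:
        new_w = (img_w * net_h) // img_h
        new_h = net_h
    # Image.LANCZOS is the high-quality resampling filter; the original used its
    # old alias Image.ANTIALIAS, which was removed in Pillow 10.
    resized = img.resize((new_w, new_h), Image.LANCZOS)
    lbImage = Image.new("RGB", (net_w, net_h), (127, 127, 127))
    # Paste the resized image into the centered box (left, upper, right, lower).
    lbImage.paste(resized, ((net_w - new_w) // 2, (net_h - new_h) // 2, (net_w + new_w) // 2, (net_h + new_h) // 2))
    return lbImage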

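Inverse of the resize
The resize has three steps:
a. Scale by the longer side. The longer side has the larger denominator and therefore the smaller ratio, so take the min to get the scale factor min(w/img_w, h/img_h); the scaled width is int(img_w * min(w/img_w, h/img_h)), and the height is computed the same way.
b. Align the center of the scaled image with the center of the target image: [(h - new_h)//2 : (h - new_h)//2 + new_h, (w - new_w)//2 : (w - new_w)//2 + new_w, :]. In other words, compute the size of the padded margin, (h - new_h)//2 and (w - new_w)//2.
c. Padding simply means copying the scaled image onto the canvas.

The inverse follows the same steps:
a. First compute the scale factor torch.min(inp_dim / im_dim, 1)[0].view(-1, 1). Here inp_dim / im_dim is a 2-D tensor built from the widths and heights of several images, and the minimum is taken along dimension 1.
b. Recover the region of the original image with output[:, [1,3]] -= ( inp_dim - scaling_factor*im_dim[:,0].view(-1,1) )/2, that is, subtract the padded margin. The y coordinates in columns [2, 4] are handled the same way with im_dim[:,1].
c. Scale back to the original image size with output[:, 1:5] /= scaling_factor.
For bounding boxes, the coordinates also have to be clamped, because a predicted box may extend into the padded area.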
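Steps a and b boil down to a few lines of arithmetic. The sketch below computes only the geometry (new size and padding offsets) without touching any pixels; the function name letterbox_geometry is made up for this example, and it reuses the 768 x 576 image mentioned in the code comments above.

```python
def letterbox_geometry(img_w, img_h, w, h):
    """Return (new_w, new_h, pad_x, pad_y) for fitting img_w x img_h into w x h."""
    scale = min(w / img_w, h / img_h)   # step a: the longer side limits the scale
    new_w = int(img_w * scale)
    new_h = int(img_h * scale)
    pad_x = (w - new_w) // 2            # step b: margins that center the image
    pad_y = (h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


new_w, new_h, pad_x, pad_y = letterbox_geometry(768, 576, 416, 416)
print(new_w, new_h, pad_x, pad_y)

# Step c would copy the resized image into canvas[pad_y:pad_y + new_h, pad_x:pad_x + new_w].
```

For a landscape image the width fills the canvas and the padding is split between top and bottom; for a portrait image it is the other way around.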

# Clamp box coordinates that fall outside the original image to the image bounds
for i in range(output.shape[0]):
    output[i, [1, 3]] = torch.clamp(output[i, [1, 3]], 0.0, im_dim[i, 0])
    output[i, [2, 4]] = torch.clamp(output[i, [2, 4]], 0.0, im_dim[i, 1])
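The inverse transform can be traced for a single box in plain Python. This is a sketch under simplifying assumptions: one image, a square network input, boxes given as (x1, y1, x2, y2), and floats instead of tensors. The names to_letterbox and from_letterbox are invented for the example, and the image size 832 x 624 is chosen so that the scale factor is exactly 0.5.

```python
def letterbox_params(img_w, img_h, inp_dim):
    """Scale factor and padding margins, kept as floats as in the tensor code."""
    scale = min(inp_dim / img_w, inp_dim / img_h)
    pad_x = (inp_dim - scale * img_w) / 2
    pad_y = (inp_dim - scale * img_h) / 2
    return scale, pad_x, pad_y


def to_letterbox(box, img_w, img_h, inp_dim):
    """Map a box from original-image coordinates to network-input coordinates."""
    scale, pad_x, pad_y = letterbox_params(img_w, img_h, inp_dim)
    x1, y1, x2, y2 = box
    return (x1 * scale + pad_x, y1 * scale + pad_y,
            x2 * scale + pad_x, y2 * scale + pad_y)


def from_letterbox(box, img_w, img_h, inp_dim):
    """Inverse: subtract the padding, divide by the scale, clamp to the image."""
    scale, pad_x, pad_y = letterbox_params(img_w, img_h, inp_dim)
    x1, y1, x2, y2 = box
    coords = [(x1 - pad_x) / scale, (y1 - pad_y) / scale,
              (x2 - pad_x) / scale, (y2 - pad_y) / scale]
    limits = [float(img_w), float(img_h), float(img_w), float(img_h)]
    return tuple(min(max(c, 0.0), lim) for c, lim in zip(coords, limits))


original = (100.0, 200.0, 300.0, 400.0)
in_network = to_letterbox(original, 832, 624, 416)
recovered = from_letterbox(in_network, 832, 624, 416)

# A box that reaches into the padded bands gets clamped to the image bounds.
clamped = from_letterbox((10.0, 20.0, 400.0, 410.0), 832, 624, 416)

print(in_network)
print(recovered)
print(clamped)
```

The round trip returns the original box, while the second box shows why clamping is needed: its top and bottom edges lie in the padding and would otherwise map outside the image.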
