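We are used to writing coordinates with x (the horizontal axis) first and y (the vertical axis) second. In image processing, this habit needs particular care.
The reason is that a computer stores an image as a matrix, indexed by row first and column second. An image of width × height × color channels = 480×256×3 is therefore stored in a three-dimensional tensor of shape 256×480×3. Image-processing code, including OpenCV, generally follows the same convention: height × width × color channels.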
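To make the row-first layout concrete, here is a minimal sketch using NumPy. The pixel coordinates (x = 400, y = 100) are arbitrary values chosen for illustration; they are not from the original example.

```python
import numpy as np

# An image that is 480 pixels wide and 256 pixels high, with 3 color channels.
width, height, channels = 480, 256, 3

# The array shape is (rows, columns, channels), i.e. (height, width, channels).
img = np.zeros((height, width, channels), dtype=np.uint8)

# A pixel at x = 400 (column), y = 100 (row) is indexed as img[y, x].
x, y = 400, 100
img[y, x] = (255, 255, 255)

rows, cols = img.shape[:2]
print(img.shape)   # (256, 480, 3)
print(rows, cols)  # 256 480
```

Indexing as img[x, y] here would fail, because 400 is beyond the 256 available rows.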
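Here is the catch: the cv2.resize API is a small exception. Its dsize argument is given as (width, height), with no channel dimension, which is the reverse of the order in the array's shape.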
From the official documentation, Geometric Image Transformations:
    resize
    Resizes an image.

    C++: void resize(InputArray src, OutputArray dst, Size dsize, double fx=0, double fy=0, int interpolation=INTER_LINEAR )
    Python: cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) → dst
    C: void cvResize(const CvArr* src, CvArr* dst, int interpolation=CV_INTER_LINEAR )
    Python: cv.Resize(src, dst, interpolation=CV_INTER_LINEAR) → None

    Parameters:
    src – input image.
    dst – output image; it has the size dsize (when it is non-zero) or the size computed from src.size(), fx, and fy; the type of dst is the same as of src.
    dsize – output image size; if it equals zero, it is computed as: dsize = Size(round(fx*src.cols), round(fy*src.rows)). Either dsize or both fx and fy must be non-zero.
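The following passages show that the dsize argument of cv2.resize is given as x-axis × y-axis, that is, width × height. The output image takes the size dsize, and the dsize formula lists src.cols (the width) before src.rows (the height):
    dst – output image; it has the size dsize (when it is non-zero) or the size computed from src.size(), fx, and fy; the type of dst is the same as of src.
    dsize – output image size; if it equals zero, it is computed as: dsize = Size(round(fx*src.cols), round(fy*src.rows)).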
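The documented formula for dsize can be sketched in plain Python. The helper name dsize_from_shape is hypothetical and not part of OpenCV, and Python's built-in round is only assumed to stand in for the rounding OpenCV performs internally; the point is the ordering of the result, columns before rows.

```python
def dsize_from_shape(shape, fx, fy):
    """Hypothetical helper mirroring the documented dsize formula."""
    # A NumPy image shape is (rows, cols, ...), i.e. (height, width, ...).
    rows, cols = shape[:2]
    # Size is (width, height): columns first, then rows.
    return (round(fx * cols), round(fy * rows))


# Doubling the width of a 256-row by 480-column image:
print(dsize_from_shape((256, 480, 3), fx=2, fy=1))  # (960, 256)
```

Note that the returned tuple is the reverse of the first two entries of the array shape, which is exactly the mismatch this post is about.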
I wrote a small code example to verify this:
    import random

    import cv2
    import numpy as np

    # Random 8-bit image: 256 rows (height) x 480 columns (width) x 3 channels.
    seq = [random.randint(0, 255) for _ in range(256 * 480 * 3)]
    mat = np.array(seq, dtype=np.uint8).reshape(256, 480, 3)
    print('mat.shape = {}'.format(mat.shape))

    # Write the image to disk and read it back.
    cv2.imwrite('origin_pic.jpg', mat)
    origin_pic = cv2.imread('./origin_pic.jpg')
    print('origin_pic.shape = {}'.format(origin_pic.shape))

    # dsize is (width, height): double the width, keep the height.
    resize_pic = cv2.resize(src=origin_pic,
                            dsize=(origin_pic.shape[1] * 2, origin_pic.shape[0]))
    print('resize_pic.shape = {}'.format(resize_pic.shape))

    cv2.imshow('resize_pic', resize_pic)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
Output:
    mat.shape = (256, 480, 3)
    origin_pic.shape = (256, 480, 3)
    resize_pic.shape = (256, 960, 3)
This confirms the parameter description in the documentation: dsize is (width, height).