OpenCV Camera Calibration and Distance Estimation (Monocular)

Camera Calibration Basics

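In the pinhole camera model, a view is formed by projecting points in 3D space onto the image plane through a perspective transformation. The projection is:

$s \cdot m' = A\cdot[R|t] \cdot M'$, or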

$s\cdot \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} fx & 0 & cx \\ 0 & fy & cy \\ 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_{1} \\ r_{21} & r_{22} & r_{23} & t_{2} \\ r_{31} & r_{32} & r_{33} & t_{3} \end{bmatrix} \cdot \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

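Here (X, Y, Z) are the world coordinates of a point and (u, v) are the coordinates of its projection on the image plane, in pixels. A is called the camera matrix, or the matrix of intrinsic parameters. (cx, cy) is the principal point (usually at the image center), and fx, fy are the focal lengths expressed in pixels. Consequently, if an image from the camera is upsampled or downsampled by some factor, all of these parameters (fx, fy, cx, and cy) must be scaled (multiplied or divided) by the same factor. The intrinsic matrix does not depend on the scene viewed, so once estimated it can be reused as long as the focal length stays fixed. The joint rotation-translation matrix [R|t] is called the matrix of extrinsic parameters. It describes the motion of the camera relative to a static scene or, conversely, the rigid motion of an object in front of the camera. In other words, [R|t] transforms the coordinates of a point (X, Y, Z) into a coordinate system that is fixed with respect to the camera. The transformation above is equivalent to the following (when z ≠ 0):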

$\begin{bmatrix}x \\ y \\z \end{bmatrix} = R \cdot \begin{bmatrix}X \\ Y \\Z \end{bmatrix} + t$

$x' = x / z$

$y' = y / z$

$u=fx \cdot x' + cx$

$v=fy \cdot y' + cy$
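
The equations above can be sketched in a few lines of NumPy. This is only an illustration, not OpenCV's implementation: the helper names `project_pinhole` and `scale_intrinsics`, the intrinsic values, and the test points are all made up for the example. It also shows the scaling rule mentioned earlier, where resampling the image by a factor scales fx, fy, cx, and cy by that same factor.

```python
import numpy as np

def project_pinhole(points_world, K, R, t):
    """Project Nx3 world points to Nx2 pixel coordinates (no distortion)."""
    points_world = np.asarray(points_world, dtype=float)
    # World -> camera coordinates: [x, y, z]^T = R * [X, Y, Z]^T + t
    cam = points_world @ R.T + t
    # Perspective division, valid only for z != 0
    x_n = cam[:, 0] / cam[:, 2]
    y_n = cam[:, 1] / cam[:, 2]
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    return np.stack([fx * x_n + cx, fy * y_n + cy], axis=1)

def scale_intrinsics(K, factor):
    """Scale fx, fy, cx, cy by the image resampling factor."""
    K_scaled = K.astype(float).copy()
    K_scaled[:2, :] *= factor  # first two rows hold fx, cx and fy, cy
    return K_scaled

# Hypothetical intrinsics and pose, chosen only for illustration
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
pts = np.array([[0.0, 0.0, 0.0], [0.5, -0.25, 0.0]])

uv = project_pinhole(pts, K, R, t)
uv_half = project_pinhole(pts, scale_intrinsics(K, 0.5), R, t)
print(uv)       # pixel coordinates at full resolution
print(uv_half)  # pixel coordinates after downsampling by 2
```

With these values the second point lands at (520, 140) at full resolution, and every coordinate is halved when the intrinsics are scaled by 0.5.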

Real lenses usually have some distortion, mostly radial distortion with slight tangential distortion. The model above is therefore extended to:

$\begin{bmatrix}x \\ y \\z \end{bmatrix} = R \cdot \begin{bmatrix}X \\ Y \\Z \end{bmatrix} + t$

$x' = x / z$

$y' = y / z$

$x'' = x' \cdot (1 + k_1 \cdot r^2 + k_2 \cdot r^4) + 2 \cdot p_1 \cdot x'\cdot y' + p_2 \cdot (r^2+2x'^2)$

$y'' = y' \cdot (1 + k_1 \cdot r^2 + k_2 \cdot r^4) + p_1 \cdot (r^2+2 \cdot y'^2) + 2 \cdot p_2 \cdot x'\cdot y'$

where $r^2 = x'^2 + y'^2$

$u = fx \cdot x'' + cx$

$v = fy \cdot y'' + cy$

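k1 and k2 are the radial distortion coefficients, and p1 and p2 are the tangential distortion coefficients. Higher-order coefficients are omitted from this model (the distCoeffs parameter described below lists the optional higher-order terms that OpenCV can also estimate). The distortion coefficients do not depend on the scene viewed, so they also belong to the intrinsic parameters, and they stay the same regardless of the captured image resolution.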
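
As a rough sketch of the distortion step, the function below applies the k1, k2, p1, p2 model from the equations above to normalized coordinates (x', y'). The function name `distort_normalized` and all coefficient and intrinsic values are invented for illustration and do not come from a real calibration.

```python
import numpy as np

def distort_normalized(x_n, y_n, k1, k2, p1, p2):
    """Apply radial (k1, k2) and tangential (p1, p2) distortion to x', y'."""
    r2 = x_n**2 + y_n**2
    radial = 1.0 + k1 * r2 + k2 * r2**2
    x_d = x_n * radial + 2.0 * p1 * x_n * y_n + p2 * (r2 + 2.0 * x_n**2)
    y_d = y_n * radial + p1 * (r2 + 2.0 * y_n**2) + 2.0 * p2 * x_n * y_n
    return x_d, y_d

# Hypothetical intrinsics and distortion coefficients
fx, fy, cx, cy = 800.0, 800.0, 320.0, 240.0
k1, k2, p1, p2 = -0.2, 0.05, 0.001, -0.0005

x_d, y_d = distort_normalized(0.25, -0.125, k1, k2, p1, p2)
u = fx * x_d + cx
v = fy * y_d + cy
print(u, v)

# Sanity checks: zero coefficients leave the point unchanged,
# and a pure k1 term only rescales the point radially.
x_same, y_same = distort_normalized(0.3, 0.2, 0.0, 0.0, 0.0, 0.0)
x_rad, y_rad = distort_normalized(0.5, 0.0, -0.2, 0.0, 0.0, 0.0)
```

Because the coefficients act on normalized coordinates rather than pixels, the same values apply at any image resolution; only fx, fy, cx, cy change.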

OpenCV Calibration Function

double cv::calibrateCamera(InputArrayOfArrays objectPoints,
                           InputArrayOfArrays imagePoints,
                           Size imageSize,
                           InputOutputArray cameraMatrix,
                           InputOutputArray distCoeffs,
                           OutputArrayOfArrays rvecs,
                           OutputArrayOfArrays tvecs,
                           OutputArray stdDeviationsIntrinsics,
                           OutputArray stdDeviationsExtrinsics,
                           OutputArray perViewErrors,
                           int flags = 0,
                           TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON))

Finds the camera intrinsic and extrinsic parameters from several views of a calibration pattern.

Parameters

objectPoints	In the new interface it is a vector of vectors of calibration pattern points in the calibration pattern coordinate space (e.g. std::vector<std::vector<cv::Vec3f>>). The outer vector contains as many elements as the number of the pattern views. If the same calibration pattern is shown in each view and it is fully visible, all the vectors will be the same. Although, it is possible to use partially occluded patterns, or even different patterns in different views. Then, the vectors will be different. The points are 3D, but since they are in a pattern coordinate system, then, if the rig is planar, it may make sense to put the model to a XY coordinate plane so that Z-coordinate of each input object point is 0. In the old interface all the vectors of object points from different views are concatenated together.
imagePoints	In the new interface it is a vector of vectors of the projections of calibration pattern points (e.g. std::vector<std::vector<cv::Vec2f>>). imagePoints.size() must equal objectPoints.size(), and imagePoints[i].size() must equal objectPoints[i].size() for each i. In the old interface all the vectors of image points from different views are concatenated together.
imageSize	Size of the image used only to initialize the intrinsic camera matrix.
cameraMatrix	Output 3x3 floating-point camera matrix $A = \begin{bmatrix} fx & 0 & cx \\ 0 & fy & cy \\ 0 & 0 & 1 \end{bmatrix}$. If CV_CALIB_USE_INTRINSIC_GUESS and/or CALIB_FIX_ASPECT_RATIO are specified, some or all of fx, fy, cx, cy must be initialized before calling the function.
distCoeffs	Output vector of distortion coefficients (k1,k2,p1,p2[,k3[,k4,k5,k6[,s1,s2,s3,s4[,τx,τy]]]]) of 4, 5, 8, 12 or 14 elements.
rvecs	Output vector of rotation vectors (see Rodrigues) estimated for each pattern view (e.g. std::vector<cv::Mat>). That is, each k-th rotation vector together with the corresponding k-th translation vector (see the next output parameter description) brings the calibration pattern from the model coordinate space (in which object points are specified) to the world coordinate space, that is, a real position of the calibration pattern in the k-th pattern view (k = 0..M-1).
tvecs	Output vector of translation vectors estimated for each pattern view.
stdDeviationsIntrinsics	Output vector of standard deviations estimated for intrinsic parameters. Order of deviation values: (fx,fy,cx,cy,k1,k2,p1,p2,k3,k4,k5,k6,s1,s2,s3,s4,τx,τy). If one of the parameters is not estimated, its deviation is equal to zero.
stdDeviationsExtrinsics	Output vector of standard deviations estimated for extrinsic parameters. Order of deviations values: (R1,T1,…,RM,TM) where M is number of pattern views, Ri,Ti are concatenated 1x3 vectors.
perViewErrors	Output vector of the RMS re-projection error estimated for each pattern view.
flags	Different flags that may be zero or a combination of the following values:
CALIB_USE_INTRINSIC_GUESS cameraMatrix contains valid initial values of fx, fy, cx, cy that are optimized further. Otherwise, (cx, cy) is initially set to the image center (imageSize is used), and focal distances are computed in a least-squares fashion. Note that if intrinsic parameters are known, there is no need to use this function just to estimate extrinsic parameters. Use solvePnP instead.
CALIB_FIX_PRINCIPAL_POINT The principal point is not changed during the global optimization. It stays at the center or at a different location specified when CALIB_USE_INTRINSIC_GUESS is set too.
CALIB_FIX_ASPECT_RATIO The function considers only fy as a free parameter. The ratio fx/fy stays the same as in the input cameraMatrix. When CALIB_USE_INTRINSIC_GUESS is not set, the actual input values of fx and fy are ignored, only their ratio is computed and used further.
CALIB_ZERO_TANGENT_DIST Tangential distortion coefficients (p1,p2) are set to zeros and stay zero.
CALIB_FIX_K1,...,CALIB_FIX_K6 The corresponding radial distortion coefficient is not changed during the optimization. If CALIB_USE_INTRINSIC_GUESS is set, the coefficient from the supplied distCoeffs matrix is used. Otherwise, it is set to 0.
CALIB_RATIONAL_MODEL Coefficients k4, k5, and k6 are enabled. To provide the backward compatibility, this extra flag should be explicitly specified to make the calibration function use the rational model and return 8 coefficients. If the flag is not set, the function computes and returns only 5 distortion coefficients.
CALIB_THIN_PRISM_MODEL Coefficients s1, s2, s3 and s4 are enabled. To provide the backward compatibility, this extra flag should be explicitly specified to make the calibration function use the thin prism model and return 12 coefficients. If the flag is not set, the function computes and returns only 5 distortion coefficients.
CALIB_FIX_S1_S2_S3_S4 The thin prism distortion coefficients are not changed during the optimization. If CALIB_USE_INTRINSIC_GUESS is set, the coefficient from the supplied distCoeffs matrix is used. Otherwise, it is set to 0.
CALIB_TILTED_MODEL Coefficients tauX and tauY are enabled. To provide the backward compatibility, this extra flag should be explicitly specified to make the calibration function use the tilted sensor model and return 14 coefficients. If the flag is not set, the function computes and returns only 5 distortion coefficients.
CALIB_FIX_TAUX_TAUY The coefficients of the tilted sensor model are not changed during the optimization. If CALIB_USE_INTRINSIC_GUESS is set, the coefficient from the supplied distCoeffs matrix is used. Otherwise, it is set to 0.
criteria	Termination criteria for the iterative optimization algorithm.

Returns

the overall RMS re-projection error.

The function estimates the intrinsic camera parameters and extrinsic parameters for each of the views. The algorithm is based on [206] and [17]. The coordinates of 3D object points and their corresponding 2D projections in each view must be specified. That may be achieved by using an object with a known geometry and easily detectable feature points. Such an object is called a calibration rig or calibration pattern, and OpenCV has built-in support for a chessboard as a calibration rig (see findChessboardCorners). Currently, initialization of intrinsic parameters (when CALIB_USE_INTRINSIC_GUESS is not set) is only implemented for planar calibration patterns (where Z-coordinates of the object points must be all zeros). 3D calibration rigs can also be used as long as initial cameraMatrix is provided.

The algorithm performs the following steps:

1. Compute the initial intrinsic parameters (the option only available for planar calibration patterns) or read them from the input parameters. The distortion coefficients are all set to zeros initially unless some of CALIB_FIX_K? are specified.
2. Estimate the initial camera pose as if the intrinsic parameters were already known. This is done using solvePnP.
3. Run the global Levenberg-Marquardt optimization algorithm to minimize the reprojection error, that is, the total sum of squared distances between the observed feature points imagePoints and the projected (using the current estimates for camera parameters and the poses) object points objectPoints. See projectPoints for details.
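
To make the quantity minimized in the last step concrete, here is a small sketch of an RMS reprojection error computed from observed and projected pixel coordinates. The function name `rms_reprojection_error` and the sample points are hypothetical, and this is a common textbook definition (root of the mean squared point distance), not a copy of OpenCV's internal code.

```python
import numpy as np

def rms_reprojection_error(observed, projected):
    """Root-mean-square distance between observed and projected 2D points."""
    observed = np.asarray(observed, dtype=float).reshape(-1, 2)
    projected = np.asarray(projected, dtype=float).reshape(-1, 2)
    # Squared Euclidean distance per point, then mean over all points
    sq_dist = np.sum((observed - projected) ** 2, axis=1)
    return float(np.sqrt(sq_dist.mean()))

# Made-up example: one point is off by (3, 4) pixels, the other matches exactly
observed = np.array([[100.0, 100.0], [200.0, 150.0]])
projected = np.array([[103.0, 104.0], [200.0, 150.0]])
err = rms_reprojection_error(observed, projected)
print(err)
```

Here the squared distances are 25 and 0, so the result is the square root of 12.5.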

Note

If you use a non-square (=non-NxN) grid and findChessboardCorners for calibration, and calibrateCamera returns bad values (zero distortion coefficients, an image center very far from (w/2-0.5,h/2-0.5), and/or large differences between fx and fy (ratios of 10:1 or more)), then you have probably used patternSize=cvSize(rows,cols) instead of using patternSize=cvSize(cols,rows) in findChessboardCorners .

Reference

https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html

Miscellaneous Notes

---

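A single camera calibration is only an approximation of the physical camera model. More precisely, it approximates the physical model only within the region of space sampled during calibration. If the objects you image lie in a region that differs from the one sampled during calibration, you may never reach sufficient accuracy. When you change the camera-to-object distance substantially, it is best to recalibrate.

What camera calibration tells us:
1. The extrinsic matrix. It describes how a real-world point (world coordinates) is rotated and translated into another real-world frame (camera coordinates).
2. The intrinsic matrix. It describes how that point, after step 1, passes through the camera lens and becomes a pixel through pinhole imaging and sensor digitization.
3. The distortion matrix (distortion coefficients). It describes why that pixel does not land where theory says it should, and instead ends up shifted and warped.

In short, camera calibration finds the mathematical relationship between objects in the image and in the real world and quantifies it, so that real-world measurements can be taken from images.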

---

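Each pose of the calibration board corresponds to one homography matrix. Because the rotation matrix is orthonormal, each homography yields 2 constraints, that is, 2 equations.

The intrinsic matrix has 5 degrees of freedom: the principal point (u0, v0), the focal lengths (fx, fy), and the skew between the X and Y axes. Since each homography contributes 2 equations, at least 3 homographies (3 × 2 = 6 ≥ 5) are needed to solve for them, which means at least 3 board poses.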
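
The two constraints per homography can be checked numerically. For a planar board (Z = 0) the homography is H = K [r1 r2 t] up to scale, so with B = K^-T K^-1 the orthonormality of r1 and r2 gives h1^T B h2 = 0 and h1^T B h1 = h2^T B h2. The sketch below builds a synthetic H and evaluates both expressions; the helper `rotation_from_rodrigues` and all numeric values are assumptions made for illustration.

```python
import numpy as np

def rotation_from_rodrigues(rvec):
    """Rotation matrix from an axis-angle vector (Rodrigues formula)."""
    rvec = np.asarray(rvec, dtype=float)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    Kx = np.array([[0.0, -k[2], k[1]],
                   [k[2], 0.0, -k[0]],
                   [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * Kx + (1.0 - np.cos(theta)) * (Kx @ Kx)

# Hypothetical intrinsics and board pose
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = rotation_from_rodrigues([0.3, -0.2, 0.1])
t = np.array([0.05, -0.02, 0.6])

# Homography of the plane Z = 0: H = K [r1 r2 t]
H = K @ np.column_stack([R[:, 0], R[:, 1], t])
h1, h2 = H[:, 0], H[:, 1]

K_inv = np.linalg.inv(K)
B = K_inv.T @ K_inv  # symmetric, 6 entries, 5 degrees of freedom up to scale

c_orth = h1 @ B @ h2                 # equals r1 . r2, should be 0
c_norm = h1 @ B @ h1 - h2 @ B @ h2   # equals |r1|^2 - |r2|^2, should be 0
print(c_orth, c_norm)
```

In an actual calibration K is unknown, so these two equations per view become linear constraints on the entries of B, which is why several views are stacked before solving.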

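Distortion is modeled as well, usually with 4 parameters, so the LM (Levenberg-Marquardt) method is used for the nonlinear optimization. In practice about 20 board poses are used. Varied poses bring the objective function closer to convex, which helps the iterative optimization of all parameters and makes the result more accurate.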

---

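Through the lens, an object in 3D space is usually mapped to an inverted, reduced image (a microscope magnifies, of course, but ordinary cameras reduce), which the sensor then captures.

Ideally, the optical axis of the lens (the line through the lens center perpendicular to the sensor plane) would pass through the exact center of the image. In practice, limited assembly precision always leaves some error, and the intrinsic parameters are needed to describe it.

Ideally, the camera would reduce sizes by the same ratio in the x and y directions. In practice, if the lens is not perfectly round, or the sensor pixels are not perfectly packed squares, the two ratios can differ. The intrinsic parameters include two values that describe the scaling in these two directions. They let you convert a length measured in pixels into a length in 3D space measured in other units (such as meters), and they also capture any mismatch between the x and y scales.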
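
As a worked example of that pixel-to-metric conversion, the similar-triangles relation of the pinhole model gives real length = pixel length × depth / focal length (in pixels). The sketch below assumes an undistorted image and an object roughly parallel to the image plane; the function names and numbers are hypothetical. Run in reverse with a known object width, the same relation gives a simple monocular distance estimate.

```python
def pixel_to_metric_length(length_px, depth_m, focal_px):
    """Metric length of a segment seen at a known depth (fronto-parallel)."""
    return length_px * depth_m / focal_px

def distance_from_known_width(real_width_m, width_px, focal_px):
    """Depth of an object of known real width from its width in pixels."""
    return focal_px * real_width_m / width_px

fx = 800.0  # hypothetical focal length in pixels, e.g. from calibration

# A segment spanning 100 px at 2 m depth
width_m = pixel_to_metric_length(100.0, 2.0, fx)

# An object known to be 0.25 m wide that spans 100 px in the image
dist_m = distance_from_known_width(0.25, 100.0, fx)

print(width_m, dist_m)
```

With these numbers the segment is 0.25 m long, and the 0.25 m object is 2 m away. Using fy instead of fx gives the corresponding vertical relation.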

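Ideally, the lens would map a straight line in 3D space to a straight line (a projective transformation). Real lenses are not this perfect: straight lines become curved after passing through the lens, so the camera's distortion parameters are needed to describe the deformation.

As for why 20 images are needed, that is only an empirical value; too many is bad, and so is too few. From a purely statistical standpoint more images should help, but in practice too many images can make the optimized parameters worse. The detected chessboard corner coordinates contain errors that can hardly be assumed Gaussian, the nonlinear iterative optimization used in calibration is not guaranteed to find the optimum, and more images may increase the chance of the algorithm getting stuck in a local optimum.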

---

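To summarize: the extrinsics map another coordinate system (another camera's or the world's) to this camera's coordinate system, and the intrinsics map this camera's coordinate system to the image coordinate system.

In practice, my personal experience with monocular calibration is that the board should not be tilted too steeply, it should appear in as many parts of the camera's field of view as possible, and it is worth varying the distance, though there is no need to move it very far away. I do not know whether there is a better procedure; the error was acceptable when I did it this way, so I have stuck with it for now.

The results in Zhang Zhengyou's calibration paper show that accuracy is good from 11 images onward, and that the best results are obtained when the board is rotated about an axis by 45 degrees.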
