RGBDTAM: A Cost-Effective and Accurate RGB-D Tracking and Mapping System

0. Introduction

IROS 2017. A niche, not widely known RGB-D SLAM system that I stumbled upon by chance while searching; it turns out to be open source, so these notes study it.

Main contributions:

  • 1. The system combines a semi-dense photometric error and a dense geometric error as the optimization error for VO, and demonstrates that this choice is the most accurate combination;
  • 2. It formulates the multi-view constraints and their error models in both the tracking and mapping threads;

From the paper: "In the case of the geometric error, all the pixels have a high signal/noise ratio. There are some degenerated cases, though, where some degrees of freedom are not constrained, and those justify the combination of both residuals. As they are complementary, the minimization of both errors achieves the best performance. The photometric error is useless in texture-less scenarios, and the geometric one is useless in structure-less scenarios."

System overview:

  • 1. vo_system launches three threads: tracking, semi-dense mapping, and dense mapping (3D superpixels):
```cpp
///Launch semidense tracker thread
boost::thread thread_semidense_tracker(&ThreadSemiDenseTracker,&images,&semidense_mapper,&semidense_tracker,&dense_mapper,&Map,&vis_pub,&pub_image);
///Launch semidense mapper thread
boost::thread thread_semidense_mapper(&ThreadSemiDenseMapper,&images,&images_previous_keyframe,&semidense_mapper,&semidense_tracker,&dense_mapper,&Map,&pub_cloud);
///Launch viewer updater.
boost::thread thread_viewer_updater(&ThreadViewerUpdater, &semidense_tracker,&semidense_mapper,&dense_mapper);
```
  • 2. The mapping thread adds new keyframes to the map $\mathcal{M}$ (see the struct sketch after this list):
    Keyframes: $\{\mathcal{K}_1, \ldots, \mathcal{K}_j, \ldots, \mathcal{K}_m\}$, where $\mathcal{K}_j = \{T_w^j, P^j\}$
    Point cloud of keyframe $j$: $P_w^j = \{p_w^1, \ldots, p_w^i, \ldots, p_w^n\}$
  • 3. The code uses OpenMP for acceleration and boost for thread management.
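A minimal sketch of the map layout just described, with hypothetical struct names (the actual classes in the RGBDTAM codebase differ):

```cpp
#include <vector>
#include <Eigen/Dense>

// Hypothetical illustration of the map layout described above;
// names do not match the actual RGBDTAM classes.
struct Keyframe {                        // K_j = {T_w^j, P^j}
    Eigen::Matrix4d T_wj;                // keyframe pose T_w^j
    std::vector<Eigen::Vector3d> P_wj;   // point cloud {p_w^1, ..., p_w^n}
};

struct SemidenseMap {                    // M = {K_1, ..., K_m}
    std::vector<Keyframe> keyframes;
};
```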

1. Related systems (direct RGB-D odometry)

1. KinectFusion: uses only the depth information;
2. Kintinuous: an extension of KinectFusion with an improved memory mechanism, enabling reconstruction of larger scenes and adding loop closure and pose optimization;
3. DVO-SLAM: based on graph optimization with keyframe constraints and on dense photometric and geometric error minimization; real-time on the CPU, but the images fed to the system are downsampled (not 640×480);
4. ElasticFusion: ICP + photometric reprojection error.


2. Tracking thread

The tracking thread minimizes the photometric error $r_{ph}$ together with the geometric error $r_g$:

$$\{\hat{T}, \hat{a}, \hat{b}\} = \underset{T, a, b}{\arg\min}\; r_{ph} + \lambda r_g$$

where $a$ and $b$ are the gain and brightness of the current image, $T$ is the incremental estimate of the current camera pose, and $\lambda$ is a learned constant that weights the photometric term against the geometric one. The tracking thread optimizes only the three quantities $T$, $a$, $b$.
The optimization is performed in the Lie algebra, parameterizing the increment as:

$$T = \begin{bmatrix} \exp_{\mathrm{SO}(3)}(\delta\omega) & \delta t \\ 0_{1\times3} & 1 \end{bmatrix}$$
The Gauss-Newton solution updates the pose by right multiplication:
$$T_w^f \leftarrow T_w^f\, \hat{T}^{-1}$$
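A minimal sketch of this update step using Sophus for the exponential map (function and variable names are illustrative, not RGBDTAM's actual code):

```cpp
#include <sophus/se3.hpp>

// Build the increment T = [exp_SO3(delta_omega) delta_t; 0 1] from the
// Gauss-Newton solution and apply the right-multiplicative update
// T_w^f <- T_w^f * T_hat^{-1}.
Sophus::SE3d applyUpdate(const Sophus::SE3d& T_wf,
                         const Eigen::Vector3d& delta_omega,
                         const Eigen::Vector3d& delta_t) {
    Sophus::SE3d T_hat(Sophus::SO3d::exp(delta_omega), delta_t);
    return T_wf * T_hat.inverse();
}
```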

2.1. Photometric error ($r_{ph}$)

We minimize the photometric error only for those pixels belonging to Canny edges. Their inverse depth is estimated using the mapping method.

Photometric error:

$$r_{ph} = \sum_{i=1}^{n} w_p\left(\frac{\left(I_k\left(\pi\left(T_w^k p_w^i\right)\right) - a\, I_f\left(\pi\left(T_w^f T^{-1} p_w^i\right)\right) + b\right)^2}{\sigma_{ph}^2}\right)$$
where (see the code sketch after this list):

  • $I_k\left(\pi\left(T_w^k p_w^i\right)\right)$ is the intensity (gray value) of the 3D point $p_w^i$ reprojected into the keyframe $I_k$;
  • $I_f\left(\pi\left(T_w^f T^{-1} p_w^i\right)\right)$ is the intensity of the 3D point $p_w^i$ reprojected into the current frame $I_f$;
  • $\pi(\cdot)$ is the reprojection function;
  • $a$ and $b$ are the gain and brightness of the current frame with respect to the current keyframe; estimating them compensates for global illumination changes;
  • $w_p$ is the Geman-McClure robust cost function, used to suppress the influence of occlusions and dynamic objects;
  • $\sigma_{ph}^2$ is the variance of the photometric residual; the paper does not define it at this point, but Section 2.3 estimates $\sigma_{ph}$ robustly from the previous frame's residuals via the median absolute deviation.
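A sketch of one photometric term under these definitions. The exact Geman-McClure scaling is not given in the paper, so the form below is an assumption, and the intensity sampling at the two projections is left to the caller:

```cpp
#include <cmath>

// Assumed Geman-McClure robust cost, rho(x) = x / (1 + x); the paper
// names the function but not its exact scaling.
double gemanMcClure(double x) { return x / (1.0 + x); }

// One term of r_ph: I_k and I_f are the intensities sampled at the two
// projections pi(T_w^k p_w^i) and pi(T_w^f T^{-1} p_w^i) (precomputed
// by the caller); a and b are the gain/brightness parameters.
double photometricTerm(double I_k, double I_f,
                       double a, double b, double sigma_ph) {
    const double r = I_k - a * I_f + b;   // residual as in the equation above
    return gemanMcClure((r * r) / (sigma_ph * sigma_ph));
}
```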

2.2. Covariance-weighted geometric error ($r_g$)

(1) Covariance-weighted geometric error (a code sketch of one residual term follows the symbol list):
$$r_g = \sum_{i=1}^{n} w_p\left(\frac{\left(\frac{1}{e_z^T T_w^f T^{-1} p_w^i} - D_f\left(\pi\left(T_w^f T^{-1} p_w^i\right)\right)\right)^2}{\sigma_g^2}\right)$$

where:

  • $\frac{1}{e_z^T T_w^f T^{-1} p_w^i}$ is the predicted inverse depth of the map point in the current frame (the residual is the predicted inverse depth minus the measured inverse depth, which constrains the optimization of $T$);
  • $D_f$ is the measured inverse depth from the sensor;
  • $e_z = [0, 0, 1]^T$ is the unit 3-vector selecting the depth ($z$) component;
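A sketch of one (unweighted) geometric residual under these definitions, with illustrative names and the point assumed already transformed into the current frame:

```cpp
#include <Eigen/Dense>

// One geometric residual: p_f = T_w^f * T^{-1} * p_w^i is the map point
// expressed in the current camera frame; D_f_at_proj is the sensor
// inverse depth D_f sampled at the projection pi(p_f). Normalization by
// sigma_g and the robust weight w_p are applied in the full cost.
double geometricResidual(const Eigen::Vector3d& p_f, double D_f_at_proj) {
    const double predicted_inv_depth = 1.0 / p_f.z();  // 1 / (e_z^T p_f)
    return predicted_inv_depth - D_f_at_proj;
}
```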

From the paper: "In order to achieve CPU real-time performance, we use four pyramid levels (from 80 × 60 to 640 × 480). For the first level we use all pixels. For the second, third and fourth levels we use one in every two, three and four pixels respectively, horizontally and vertically."
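Read literally, that gives the following subsampling pattern (a sketch; it assumes "first level" means the coarsest 80×60 one, which the paper's wording leaves ambiguous):

```cpp
// Per-level resolutions and subsampling strides as described above:
// all pixels at the coarsest level, then every 2nd, 3rd and 4th pixel
// (horizontally and vertically) at the finer levels.
void forEachTrackedPixel() {
    const int widths[4]  = {80, 160, 320, 640};
    const int heights[4] = {60, 120, 240, 480};
    const int strides[4] = {1, 2, 3, 4};
    for (int level = 0; level < 4; ++level)
        for (int v = 0; v < heights[level]; v += strides[level])
            for (int u = 0; u < widths[level]; u += strides[level]) {
                // accumulate the residuals of pixel (u, v) at this level
            }
}
```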

(2) Covariance propagation for structured-light cameras:

Why specifically structured-light depth cameras? Because such a sensor behaves like a stereo pair (projector + camera): its depth comes from triangulating a disparity $d$ over a baseline $b$ with focal length $f$.
Depth:
$$z = \frac{fb}{d}$$
Inverse depth:
$$\rho = \frac{d}{fb}$$

Standard deviation of the depth:
$$\sigma_z = \left|\frac{\partial z}{\partial d}\right| \sigma_d = \frac{fb}{d^2}\, \sigma_d = \frac{z^2}{fb}\, \sigma_d$$

Standard deviation of the inverse depth:

$$\sigma_\rho = \frac{\partial \rho}{\partial d}\, \sigma_d = \frac{\sigma_d}{fb}$$
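A direct transcription of the two propagation formulas (a hypothetical helper, not RGBDTAM's actual code):

```cpp
// Propagate the disparity noise sigma_d of a structured-light camera
// (modeled as a stereo pair with focal length f and baseline b) to the
// depth z and to the inverse depth, following the formulas above.
struct DepthUncertainty { double sigma_z, sigma_rho; };

DepthUncertainty propagateDisparityNoise(double f, double b,
                                         double z, double sigma_d) {
    const double fb = f * b;
    return { (z * z / fb) * sigma_d,   // sigma_z   = z^2/(f b) * sigma_d
             sigma_d / fb };           // sigma_rho = sigma_d / (f b)
}
```

Note the asymmetry: $\sigma_z$ grows quadratically with depth while $\sigma_\rho$ is constant, which is one reason the geometric residual is formulated in inverse depth.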

2.3. Scaling parameters

As we combine residuals of different magnitudes, we need to scale them according to their covariances. For the geometric error we propagate its uncertainty using equations 8 and 9 (the standard deviations derived above). For the photometric error we use the median absolute deviation (MAD) of the residuals of the previous frame to extract a robust estimation of the standard deviation:
$$\sigma_{ph} = 1.482 \cdot \operatorname{median}\left(\left|r_{ph} - \operatorname{median}(r_{ph})\right|\right)$$
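A sketch of this robust scale estimate (a hypothetical helper over the previous frame's residual vector):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Median via nth_element (sufficient for a scale estimate).
static double median(std::vector<double> v) {
    const size_t mid = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    return v[mid];
}

// sigma_ph = 1.482 * median(|r - median(r)|) over the photometric
// residuals of the previous frame.
double robustSigmaPh(const std::vector<double>& residuals) {
    const double m = median(residuals);
    std::vector<double> dev;
    dev.reserve(residuals.size());
    for (double r : residuals) dev.push_back(std::fabs(r - m));
    return 1.482 * median(dev);
}
```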

3. Mapping thread

New keyframes are added to the map. Each pixel has two sources for its inverse depth estimate: the sensor measurement $\rho_1$ and multi-view triangulation $\rho_2$. The inverse depth $\rho_2$ for every high-gradient pixel $u^*$ in a keyframe $I_j$ is estimated by minimizing its photometric error $r_{ph}^o$ with respect to several overlapping views $I_o$:

$$\hat{\rho}_2 = \underset{\rho_2}{\arg\min}\; r_{ph}^o$$
$$r_{ph}^o = \sum_o \left\| I_j\left(s_{u^*}\right) - I_o\left(G\left(s_{u^*}, T_w^j, T_w^o, \rho_2\right)\right) \right\|_2^2$$

where (the fusion step is sketched in code after this list):

  • the sensor estimate $\rho_1$ and the triangulated estimate $\rho_2$ are fused by inverse-variance weighting: $\rho = \frac{\sum_{j=1}^{2} \rho_j / \sigma_j^2}{\sum_{j=1}^{2} 1 / \sigma_j^2}, \quad \sigma^2 = \frac{1}{\sum_{j=1}^{2} 1 / \sigma_j^2}$;
  • $s_{u^*}$: the image coordinates of pixel $u^*$;
  • $G(\cdot)$: the warping function that maps a pixel of $I_j$ into $I_o$;
  • $\sigma_j$: the standard deviations derived above;
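The fusion formula in code form (an illustrative helper, not RGBDTAM's actual implementation):

```cpp
// Inverse-variance fusion of the sensor measurement (rho1, var1) and
// the multi-view triangulation (rho2, var2), following the formula
// above: the fused variance is 1 / sum(1/sigma_j^2).
struct FusedInverseDepth { double rho, var; };

FusedInverseDepth fuse(double rho1, double var1,
                       double rho2, double var2) {
    const double w1 = 1.0 / var1, w2 = 1.0 / var2;
    const double var = 1.0 / (w1 + w2);
    return { (w1 * rho1 + w2 * rho2) * var, var };
}
```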

4. Loop closure and map reuse

Omitted in these notes.
