Paper notes: "Accurate, Low-Latency Visual Perception for Autonomous Racing"

(1) Abstract

Key components: YOLOv3-based object detection, pose estimation, and time synchronization on the dual stereo/mono camera setup
The key components of DUT18D include YOLOv3-based object detection, pose estimation and time synchronization on its dual stereovision/monovision camera setup

(Latency of the hardware and software stack)
Of critical importance in autonomous driving is the latency of the hardware and software stack

Visual perception accounts for up to 60% of the end-to-end latency
we find that perception occupies up to 60% of the end-to-end latency

Despite its importance, low-latency visual perception of environment landmarks remains riddled with practical challenges across the entire stack, from noisy image capture and data transmission to accurate positioning in an unmapped environment. To our knowledge, there remains no available prior work detailing the full-stack design of a high-accuracy, low-latency perception system for autonomous driving.

The visual perception system on DUT18D was designed to perceive and position landmarks on a map using multiple CNNs for object detection and depth estimation.

(2) Key Points

An open design and evaluation of a thoroughly-tested, low-latency vision stack for high-performance autonomous racing.

New techniques for domain adaptation of pre-trained CNN-based object detectors, useful loss-function modifications for landmark pose estimation, and microsecond time synchronization of multiple cameras.

Open-source C++ modules for mobile-GPU-accelerated ONNX-DNN inference, landmark pose estimation, and a complete plug-and-play visual perception system for Formula Student racecars.

A publicly available 10K+ pose-estimation/bounding-box dataset of traffic cones of multiple colors and sizes.

(3) Goal

The goal of the perception system is to accurately localize the environment landmarks (traffic cones) that demarcate the racetrack.

(4) Requirements

  1. Accurate mapping: precise localization of the landmarks
  2. Latency: the time from a landmark entering the field of view to its position being estimated
  3. Straight-line distance: the longest straight-line range over which accuracy is maintained
  4. Horizontal Field-of-View (FOV): a wide viewing angle

(5) Rationale for Using a Monocular Camera

The rationale for using the monocular camera for short-range rather than long-range detections is that for a reasonable mounting height, a landmark’s 3D location on a relatively flat surface is a much stronger function of pixel space location for short-range objects than long-range objects. This relieves some of the challenges for estimating landmark pose from a monocular camera.
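
To make this flat-ground intuition concrete, here is a minimal Python sketch (not from the paper; the intrinsics, 1 m mounting height, and zero camera tilt are assumed example values) that back-projects a pixel onto a flat ground plane and compares how far the recovered 3D point moves per pixel for a near versus a far landmark.

```python
# Minimal sketch: pixel -> ground-plane point for an untilted camera.
# K, cam_height, and the pixel rows below are made-up example values.
import numpy as np

K = np.array([[900.0, 0.0, 640.0],   # assumed focal lengths / principal point
              [0.0, 900.0, 360.0],
              [0.0, 0.0, 1.0]])
cam_height = 1.0                      # assumed mounting height above the ground (m)

def ground_point_from_pixel(u, v):
    """Intersect the viewing ray of pixel (u, v) with the ground plane,
    for a camera at height cam_height looking along +z (x right, y down)."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray direction in camera frame
    t = cam_height / ray[1]                         # scale so the ray reaches the ground
    return ray * t                                  # 3D point on the ground (camera frame)

# A 10-pixel change near the bottom of the image moves the 3D point only
# slightly; the same pixel change near the horizon moves it by tens of meters.
near = ground_point_from_pixel(640, 690)[2] - ground_point_from_pixel(640, 700)[2]
far = ground_point_from_pixel(640, 370)[2] - ground_point_from_pixel(640, 380)[2]
print(f"depth change per 10 px: near ~ {near:.2f} m, far ~ {far:.2f} m")
```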

(6) Software Pipeline

1. Data Acquisition: Synchronized image streams are captured, disparity-matched for the stereovision pipeline, and transferred to the Jetson Xavier. A critical component of this is time-synchronizing all devices.

Time-synchronization solution: a hardware-timestamped signal generated by the Nerian FPGA, synchronized to the Xavier's master clock using the IEEE PTP protocol.
Monocular camera

  1. Localization accuracy and latency are in tension.
    Solution: run detection at low resolution and depth estimation at high resolution.
  2. How is the monocular camera time-synchronized? (A timestamp-matching sketch follows after this list.)
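
To illustrate what the shared PTP time base enables, here is a minimal sketch (the `Frame` type, the microsecond timestamps, and the 500 µs pairing tolerance are assumptions, not the paper's code): once all devices stamp frames against the same PTP-disciplined clock, stereo and monocular frames can be associated simply by nearest hardware timestamp.

```python
# Minimal sketch: pair frames from two cameras by nearest PTP hardware timestamp.
from dataclasses import dataclass
from bisect import bisect_left

@dataclass
class Frame:
    stamp_us: int        # hardware timestamp in microseconds (shared PTP time base)
    data: object = None

def match_frames(stereo_frames, mono_frames, tol_us=500):
    """Pair each stereo frame with the mono frame closest in time, keeping only
    pairs whose timestamps differ by at most tol_us microseconds."""
    mono_stamps = [f.stamp_us for f in mono_frames]   # assumed sorted by time
    pairs = []
    for s in stereo_frames:
        i = bisect_left(mono_stamps, s.stamp_us)
        # candidates: the mono frames just before and just after the stereo stamp
        candidates = [j for j in (i - 1, i) if 0 <= j < len(mono_frames)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(mono_stamps[k] - s.stamp_us))
        if abs(mono_stamps[j] - s.stamp_us) <= tol_us:
            pairs.append((s, mono_frames[j]))
    return pairs
```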

2. 2D Space Localization (where landmarks lie in the image): Using a neural-network-based approach, landmarks are detected and outlined by bounding boxes in the images.

  1. Handling landmarks of different sizes (see the rescaling sketch after this list)
    A drawback of this process is that the distribution of landmark bounding-box (BB) sizes (in pixels) in the training set was no longer representative of what the network would see in the wild. To mitigate this, each set of training images from a specific sensor/lens/perspective combination was uniformly rescaled so that its landmark size distribution matched that of the camera system on the vehicle.
  2. "Tuning the hyperparameters in front of each of the terms in the loss function": the YOLO loss is a weighted sum of several terms (box-coordinate, objectness, and classification losses), so this refers to adjusting the scalar weight placed in front of each term.
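
A minimal sketch of the rescaling idea (the helper names and the use of the median box diagonal as the size statistic are assumptions, not the authors' tooling): compute one global scale factor per sensor/lens/perspective set so that its box-size distribution lines up with the on-vehicle camera's, then resize the images and boxes together.

```python
# Minimal sketch: match a training set's bounding-box size distribution to the
# on-vehicle camera by a single uniform rescale of that set.
import numpy as np
import cv2

def median_bb_size(boxes_per_image):
    """Median bounding-box diagonal length (px); boxes are (N, 4) [x1, y1, x2, y2]."""
    sizes = [np.hypot(b[:, 2] - b[:, 0], b[:, 3] - b[:, 1])
             for b in boxes_per_image if len(b)]
    return float(np.median(np.concatenate(sizes)))

def rescale_set(images, boxes_per_image, source_median_px, target_median_px):
    """Uniformly rescale one sensor/lens/perspective set so its median landmark
    size matches the target camera's. images: list of HxWx3 arrays."""
    scale = target_median_px / source_median_px
    out_imgs, out_boxes = [], []
    for img, boxes in zip(images, boxes_per_image):
        h, w = img.shape[:2]
        out_imgs.append(cv2.resize(img, (int(round(w * scale)), int(round(h * scale)))))
        out_boxes.append(boxes * scale)   # box coordinates scale with the image
    return out_imgs, out_boxes
```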

3. 3D Space Localization (where landmarks lie in 3D space): For the stereovision pipeline, the depth of each landmark is extracted by a clustering-based approach. A neural-network-based approach is used to compute depth from the monocular camera.

Monocular camera

  1. Algorithm: keypoints are regressed (RektNet) from each YOLO detection and used in a Perspective-n-Point (PnP) algorithm.
  2. Handling a single keypoint outlier (see the PnP sketch after this list)
    To make the algorithm robust to single-keypoint outliers, if the reprojection error of the PnP estimate using all keypoints is above a threshold, all subsets of the keypoints with one point removed are evaluated. The subset with the lowest error is used as the final estimate.
  3. Improvements to the keypoint network
    (1) Replace the fully connected output layer with a convolutional layer:
    the fully connected output layer was replaced with a convolutional layer
    (2) Add a geometric term to the loss function (exploiting the collinearity of the keypoints):
    an additional term in the loss function to leverage the geometric relationship between points
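
A minimal sketch of the leave-one-out PnP step (the 3D cone model points, the 3-pixel threshold, and OpenCV's generic `cv2.solvePnP` in place of the authors' solver are assumptions):

```python
# Minimal sketch: PnP pose from cone keypoints with leave-one-out outlier rejection.
import numpy as np
import cv2

def reprojection_error(obj_pts, img_pts, K, dist, rvec, tvec):
    """Mean pixel distance between detected and reprojected keypoints."""
    proj, _ = cv2.projectPoints(obj_pts, rvec, tvec, K, dist)
    return float(np.mean(np.linalg.norm(proj.reshape(-1, 2) - img_pts, axis=1)))

def robust_pnp(obj_pts, img_pts, K, dist=None, err_thresh_px=3.0):
    """obj_pts: (N, 3) cone model points; img_pts: (N, 2) detected keypoints.
    If the all-points estimate reprojects poorly, retry every subset with one
    keypoint removed and keep the lowest-error estimate."""
    dist = np.zeros(5) if dist is None else dist
    ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, dist)
    best = (reprojection_error(obj_pts, img_pts, K, dist, rvec, tvec), rvec, tvec)
    if best[0] > err_thresh_px:
        for drop in range(len(obj_pts)):
            keep = [i for i in range(len(obj_pts)) if i != drop]
            ok, r, t = cv2.solvePnP(obj_pts[keep], img_pts[keep], K, dist)
            if not ok:
                continue
            err = reprojection_error(obj_pts[keep], img_pts[keep], K, dist, r, t)
            if err < best[0]:
                best = (err, r, t)
    return best   # (reprojection error, rvec, tvec); tvec gives the cone position
```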

Stereo camera

  1. 3D points from detections are first passed through a hue and saturation filter, where all pixels with H or S values below 0.3 are discarded.
    H and S are the hue and saturation channels of the pixel's HSV color representation (normalized to [0, 1] here); low-saturation pixels are gray or washed out (e.g., road surface), so the filter keeps mostly the strongly colored cone pixels.
  2. To reduce the latency of each detection, the point cloud within the bounding box is downsampled if more than 200 points remain.
    That is, the number of 3D points handed to the clustering step is capped at roughly 200 per detection, which bounds its runtime (see the filter/downsampling sketch after this list).
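
A minimal sketch of the color filter and point-cloud cap (the 0.3 thresholds and the 200-point limit come from the notes above; the random subsampling strategy and the array layout are assumptions):

```python
# Minimal sketch: hue/saturation filtering and point-cloud downsampling inside a
# detection's bounding box before clustering.
import numpy as np
import cv2

MAX_POINTS = 200   # from the notes: downsample if more than 200 points remain

def filter_and_downsample(bgr_crop, points_3d, seed=0):
    """bgr_crop: HxWx3 uint8 image patch inside the bounding box.
    points_3d: HxWx3 array of reprojected stereo 3D points for the same patch."""
    hsv = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV).astype(np.float32)
    h = hsv[..., 0] / 179.0            # OpenCV 8-bit hue range is [0, 179]
    s = hsv[..., 1] / 255.0            # saturation range is [0, 255]
    mask = (h >= 0.3) & (s >= 0.3)     # discard pixels with H or S below 0.3
    pts = points_3d[mask]
    if len(pts) > MAX_POINTS:          # random subsample to keep clustering cheap
        rng = np.random.default_rng(seed)
        pts = pts[rng.choice(len(pts), MAX_POINTS, replace=False)]
    return pts
```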

Latency breakdown

[Figure: latency measurements]
