The usual idea in target tracking is to track key points on the target. TLD also tracks points, though not SIFT-style keypoints. Point tracking is done with optical flow, specifically the pyramidal Lucas-Kanade tracker; I may cover it properly another time. For the theory, I recommend the Lucas-Kanade Method section in Chapter 10 of *Learning OpenCV*. Here I only introduce the OpenCV function and skip the principles and implementation details.
First, the point-tracking function, calcOpticalFlowPyrLK. Its job is to find the positions in the current frame of the points tracked in the previous frame. It is called like this:

calcOpticalFlowPyrLK(img_last, img_curr, points_last, points_curr, status, errs)

The parameters should be self-explanatory. To fill in the details: status is 1 if the corresponding point was found and 0 if it was not, and errs holds the tracking error. The input can be a single point or a set of points; for a point set, status and errs are each a vector with one entry per point.
Now for tracking the target itself. TLD uses the Median-Flow tracker proposed by the author himself, extended with tracking-failure detection.
Selecting points to track via the Forward-Backward error
As mentioned above, TLD does not track keypoints; it tracks something simpler: points that persist stably. Which points are stable? The basic idea of the Median-Flow tracker is to track each point backward again and look at the residual, using the median of all points' residuals as the cutoff for keeping a point. The yellow point in the figure above, for example, is discarded because its residual is too large. Since stable points can be filtered out this way, there is no need to go to great lengths hunting for keypoints; in principle every pixel could serve as an initial tracking point. All pixels would still be far too many, so the author instead takes the intersections of a grid as the initial tracking points (the yellow dots inside the box in the figure below).
Flow chart of the Median-Flow tracker
Now for the author's tracking function proper, TLD::track, called as:

track(img1, img2, points1, points2)

img1 is the previous frame and img2 the current frame. points1 and points2 are both outputs of this function: points1 holds the grid intersections obtained by dividing the last tracked target region lastbox into a grid (the yellow points on the left of the figure above), and points2 holds those points of points1 that show up stably in the current frame (the points in the right-hand figure).
Below we walk through track, following the flow chart above and adding the extra steps that TLD layers on top.

The TLD::track function
void TLD::track(const Mat& img1, const Mat& img2, vector<Point2f>& points1, vector<Point2f>& points2){
  // 1. Generate grid points inside the last known bounding box
  bbPoints(points1, lastbox);
  if (points1.size() < 1){
    printf("BB= %d %d %d %d, Points not generated\n", lastbox.x, lastbox.y, lastbox.width, lastbox.height);
    tvalid = false;
    tracked = false;
    return;
  }
  vector<Point2f> points = points1;
  // 2-4. Track points forward and backward, estimate errors, filter outliers
  tracked = tracker.trackf2f(img1, img2, points, points2);
  if (tracked){
    // 5. Predict the new bounding box from the surviving point pairs
    bbPredict(points, points2, lastbox, tbb);
    // 6. Failure detection: large FB error, or box drifted out of the image
    if (tracker.getFB() > 10 || tbb.x > img2.cols || tbb.y > img2.rows || tbb.br().x < 1 || tbb.br().y < 1){
      tvalid = false;
      tracked = false;
      printf("Too unstable predictions FB error=%f\n", tracker.getFB());
      return;
    }
    // 7. Estimate confidence and validity with the nearest-neighbor classifier
    Mat pattern;
    Scalar mean, stdev;
    BoundingBox bb;
    bb.x = max(tbb.x, 0);
    bb.y = max(tbb.y, 0);
    bb.width  = min(min(img2.cols - tbb.x, tbb.width),  min(tbb.width,  tbb.br().x));
    bb.height = min(min(img2.rows - tbb.y, tbb.height), min(tbb.height, tbb.br().y));
    getPattern(img2(bb), pattern, mean, stdev);
    vector<int> isin;
    float dummy;
    classifier.NNConf(pattern, isin, dummy, tconf); // Conservative Similarity
    tvalid = lastvalid;
    if (tconf > classifier.thr_nn_valid){
      tvalid = true;
    }
  }
  else
    printf("No points tracked\n");
}
1. Initialize points to grid

Split bb into a 10×10 grid and store the grid intersections in points; the function is TLD::bbPoints.
void TLD::bbPoints(vector<cv::Point2f>& points, const BoundingBox& bb){
  int max_pts = 10;
  int margin_h = 0;
  int margin_v = 0;
  // Note: both operands are int, so the division truncates before ceil()
  int stepx = ceil((bb.width  - 2*margin_h) / max_pts);
  int stepy = ceil((bb.height - 2*margin_v) / max_pts);
  for (int y = bb.y + margin_v; y < bb.y + bb.height - margin_v; y += stepy){
    for (int x = bb.x + margin_h; x < bb.x + bb.width - margin_h; x += stepx){
      points.push_back(Point2f(x, y));
    }
  }
}
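To see the grid generation in isolation, here is a sketch of the same sampling logic with plain pairs instead of cv::Point2f (`Box` and `gridPoints` are made-up names mirroring BoundingBox and bbPoints):

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

struct Box { int x, y, width, height; };

// Sample grid intersections inside the box, as TLD::bbPoints does.
// max_pts = 10 aims at a 10x10 grid.
std::vector<std::pair<int,int>> gridPoints(const Box& bb) {
    const int max_pts = 10, margin = 0;
    // As in the original code: integer division, so ceil() has no effect here
    int stepx = (int)std::ceil((bb.width  - 2*margin) / max_pts);
    int stepy = (int)std::ceil((bb.height - 2*margin) / max_pts);
    std::vector<std::pair<int,int>> pts;
    for (int y = bb.y + margin; y < bb.y + bb.height - margin; y += stepy)
        for (int x = bb.x + margin; x < bb.x + bb.width - margin; x += stepx)
            pts.push_back({x, y});
    return pts;
}
```

For a 50×50 box the step is 50/10 = 5 in both directions, giving exactly 100 grid points; for box sizes not divisible by 10 the truncating division can yield a few extra rows or columns.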
2. Track points
3. Estimate tracking error
4. Filter out outliers

These three steps all happen inside trackf2f; the call chain is tld.processFrame → track → [tracked = tracker.trackf2f(img1, img2, points, points2)].
bool LKTracker::trackf2f(const Mat& img1, const Mat& img2, vector<Point2f> &points1, vector<cv::Point2f> &points2){
  // Forward tracking: img1 -> img2
  calcOpticalFlowPyrLK(img1, img2, points1, points2, status, similarity, window_size, level, term_criteria, lambda, 0);
  // Backward tracking: img2 -> img1
  calcOpticalFlowPyrLK(img2, img1, points2, pointsFB, FB_status, FB_error, window_size, level, term_criteria, lambda, 0);
  // Forward-Backward error: distance between each original point and its back-tracked position
  for (int i = 0; i < points1.size(); ++i){
    FB_error[i] = norm(pointsFB[i] - points1[i]);
  }
  normCrossCorrelation(img1, img2, points1, points2);
  return filterPts(points1, points2);
}
Here normCrossCorrelation(img1, img2, points1, points2) exists because the optical-flow result alone is not fully trusted: comparing the small patches around the two points of each pair gives a further way to drop unstable points. The similarity measure this time is not the plain correlation coefficient but the normalized cross-correlation (NCC). The formula (see the Wikipedia article) looks complicated, but it is really the correlation coefficient mentioned earlier, computed after subtracting each patch's mean.
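Concretely, for two equal-sized patches the zero-mean NCC is the sum of products of mean-subtracted pixels, divided by the product of the patches' norms; this is what CV_TM_CCOEFF_NORMED computes when template and image have the same size. A self-contained sketch on flat float arrays (the function name `ncc` is my own):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Zero-mean normalized cross-correlation of two equal-sized patches.
// Returns a value in [-1, 1]; 1 means identical up to brightness/contrast.
float ncc(const std::vector<float>& a, const std::vector<float>& b) {
    float ma = 0, mb = 0;
    for (size_t i = 0; i < a.size(); ++i) { ma += a[i]; mb += b[i]; }
    ma /= a.size();
    mb /= b.size();
    float num = 0, da = 0, db = 0;   // numerator and the two norms
    for (size_t i = 0; i < a.size(); ++i) {
        num += (a[i] - ma) * (b[i] - mb);
        da  += (a[i] - ma) * (a[i] - ma);
        db  += (b[i] - mb) * (b[i] - mb);
    }
    return num / std::sqrt(da * db);
}
```

Identical patches score 1, and a patch against its negated ramp scores -1, which is why thresholding at the median similarity (as filterPts does below) keeps the better half of the matches.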
void LKTracker::normCrossCorrelation(const Mat& img1, const Mat& img2, vector<Point2f>& points1, vector<Point2f>& points2) {
  Mat rec0(10, 10, CV_8U);
  Mat rec1(10, 10, CV_8U);
  Mat res(1, 1, CV_32F);
  for (int i = 0; i < points1.size(); i++) {
    if (status[i] == 1) {
      // 10x10 patches around each forward-tracked point pair
      getRectSubPix(img1, Size(10,10), points1[i], rec0);
      getRectSubPix(img2, Size(10,10), points2[i], rec1);
      // NCC between the two patches
      matchTemplate(rec0, rec1, res, CV_TM_CCOEFF_NORMED);
      similarity[i] = ((float *)(res.data))[0];
    } else {
      similarity[i] = 0.0;
    }
  }
  rec0.release();
  rec1.release();
  res.release();
}
With everything computed, we can finally filter: filterPts(points1, points2).
bool LKTracker::filterPts(vector<Point2f>& points1, vector<Point2f>& points2){
  // First pass: keep points whose NCC similarity is above the median
  simmed = median(similarity);
  size_t i, k;
  for (i = k = 0; i < points2.size(); ++i){
    if (!status[i])
      continue;
    if (similarity[i] > simmed){
      points1[k] = points1[i];
      points2[k] = points2[i];
      FB_error[k] = FB_error[i];
      k++;
    }
  }
  if (k == 0)
    return false;
  points1.resize(k);
  points2.resize(k);
  FB_error.resize(k);
  // Second pass: keep points whose FB error is at most the median
  fbmed = median(FB_error);
  for (i = k = 0; i < points2.size(); ++i){
    if (!status[i])
      continue;
    if (FB_error[i] <= fbmed){
      points1[k] = points1[i];
      points2[k] = points2[i];
      k++;
    }
  }
  points1.resize(k);
  points2.resize(k);
  if (k > 0)
    return true;
  else
    return false;
}
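The FB-error pass above boils down to: take the median of the residuals and keep everything at or below it, which by construction retains about half the points. A scalar sketch under assumed names (`medianValue`, `keepByFbError` are mine, not OpenTLD's):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Median via nth_element (upper median for even sizes).
static float medianValue(std::vector<float> v) {
    std::nth_element(v.begin(), v.begin() + v.size()/2, v.end());
    return v[v.size()/2];
}

// Indices surviving the FB-error pass: residual at most the median.
std::vector<size_t> keepByFbError(const std::vector<float>& fb_error) {
    float fbmed = medianValue(fb_error);
    std::vector<size_t> kept;
    for (size_t i = 0; i < fb_error.size(); ++i)
        if (fb_error[i] <= fbmed)
            kept.push_back(i);
    return kept;
}
```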
5. Update bounding box

bbPredict(points, points2, lastbox, tbb): points and points2 are the point pairs that survived the filtering above. From them we estimate the displacement and the scale change of the box; with both in hand, the position tbb of lastbox in the current frame follows directly.

Displacement estimate

The displacement is estimated as the median of the x and y displacements over all point pairs, as in the figure above. The scale change is estimated as the median, over all pairs of points within the same frame, of the ratio of their distance in the current frame to their distance in the previous frame; for a single pair of points the estimate works as in the figure below:
Scale estimate
void TLD::bbPredict(const vector<cv::Point2f>& points1, const vector<cv::Point2f>& points2,
                    const BoundingBox& bb1, BoundingBox& bb2) {
  int npoints = (int)points1.size();
  vector<float> xoff(npoints);
  vector<float> yoff(npoints);
  printf("tracked points : %d\n", npoints);
  // Per-point displacements between the two frames
  for (int i = 0; i < npoints; i++){
    xoff[i] = points2[i].x - points1[i].x;
    yoff[i] = points2[i].y - points1[i].y;
  }
  float dx = median(xoff);
  float dy = median(yoff);
  float s;
  // Scale: median ratio of pairwise point distances across the two frames
  if (npoints > 1){
    vector<float> d;
    d.reserve(npoints*(npoints-1)/2);
    for (int i = 0; i < npoints; i++){
      for (int j = i+1; j < npoints; j++){
        d.push_back(norm(points2[i]-points2[j]) / norm(points1[i]-points1[j]));
      }
    }
    s = median(d);
  }
  else {
    s = 1.0;
  }
  // Grow/shrink the box symmetrically about its center
  float s1 = 0.5*(s-1)*bb1.width;
  float s2 = 0.5*(s-1)*bb1.height;
  printf("s= %f s1= %f s2= %f \n", s, s1, s2);
  bb2.x = round(bb1.x + dx - s1);
  bb2.y = round(bb1.y + dy - s2);
  bb2.width  = round(bb1.width*s);
  bb2.height = round(bb1.height*s);
  printf("predicted bb: %d %d %d %d\n", bb2.x, bb2.y, bb2.br().x, bb2.br().y);
}
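The pairwise-ratio scale estimate can be checked on synthetic data. Below is a sketch with plain float pairs instead of cv::Point2f; `medianOf` stands in for OpenTLD's tld_utils median() (a hypothetical minimal re-implementation, not the original):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Median of a copy of the input (upper median for even sizes).
float medianOf(std::vector<float> v) {
    std::nth_element(v.begin(), v.begin() + v.size()/2, v.end());
    return v[v.size()/2];
}

// Scale estimate as in bbPredict: for every pair of points, the ratio of
// their distance in the current frame to their distance in the previous
// frame; the median of these ratios is the scale change.
float scaleEstimate(const std::vector<std::pair<float,float>>& p1,
                    const std::vector<std::pair<float,float>>& p2) {
    std::vector<float> r;
    for (size_t i = 0; i < p1.size(); ++i)
        for (size_t j = i + 1; j < p1.size(); ++j) {
            float dx1 = p1[i].first - p1[j].first, dy1 = p1[i].second - p1[j].second;
            float dx2 = p2[i].first - p2[j].first, dy2 = p2[i].second - p2[j].second;
            r.push_back(std::sqrt(dx2*dx2 + dy2*dy2) / std::sqrt(dx1*dx1 + dy1*dy1));
        }
    return medianOf(r);
}
```

If every point moves twice as far from every other point, all ratios equal 2 and so does the median, independent of any common translation of the whole point cloud.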
6. Failure detection

This step is simple. The paper declares a failure of the tracker when the median residual exceeds 10 pixels (matching the tracker.getFB() > 10 test in the code), the residual being the distance between the back-tracked point and the original point. The code additionally guards against the target drifting outside the image.
if (tracker.getFB() > 10 || tbb.x > img2.cols || tbb.y > img2.rows || tbb.br().x < 1 || tbb.br().y < 1){
  tvalid = false;
  tracked = false;
  printf("Too unstable predictions FB error=%f\n", tracker.getFB());
  return;
}
7. Estimate Confidence and Validity
Mat pattern;
Scalar mean, stdev;
BoundingBox bb;
// Clip tbb to the image before extracting the pattern
bb.x = max(tbb.x, 0);
bb.y = max(tbb.y, 0);
bb.width  = min(min(img2.cols - tbb.x, tbb.width),  min(tbb.width,  tbb.br().x));
bb.height = min(min(img2.rows - tbb.y, tbb.height), min(tbb.height, tbb.br().y));
getPattern(img2(bb), pattern, mean, stdev);
vector<int> isin;
float dummy;
classifier.NNConf(pattern, isin, dummy, tconf);
tvalid = lastvalid;
if (tconf > classifier.thr_nn_valid){
  tvalid = true;
}
The comments make this clear enough. You can ignore the part that decides whether the trajectory is valid for now; just know that the Conservative Similarity of the nearest-neighbor classifier (Section 5.2 of the paper) is used as the score of the tracked target, and later this score is compared against the detector's.