The usual idea in target tracking is to track key points on the target. TLD also tracks points, though not SIFT-style keypoints. Point tracking is done with optical flow, specifically the pyramidal Lucas-Kanade tracker; I may cover it properly another time. For the theory, I recommend the Lucas-Kanade Method section in Chapter 10 of *Learning OpenCV*. Here I only introduce the OpenCV function and skip the principles and implementation details.
First, the point-tracking function, calcOpticalFlowPyrLK. Its job is to find the positions in the current frame of the points tracked in the previous frame. It is called like this:

calcOpticalFlowPyrLK(img_last, img_curr, points_last, points_curr, status, errs)

The parameters should be self-explanatory. To fill in the details: status is 1 if the corresponding point was found and 0 if it was not, and errs holds the tracking error. The input can be a single point or a set of points; for a point set, status and errs are each a vector with one entry per point.
Now for tracking the target itself. TLD uses the Median-Flow tracker proposed by the author himself, extended with tracking-failure detection.
Selecting points to track via the Forward-Backward error
As mentioned above, TLD does not track keypoints; it tracks something simpler: points that persist stably. Which points are stable? The basic idea of the Median-Flow tracker is to track each point backward again and look at the residual, using the median of all points' residuals as the cutoff for keeping a point. The yellow point in the figure above, for example, is discarded because its residual is too large. Since stable points can be filtered out this way, there is no need to go to great lengths hunting for keypoints; in principle every pixel could serve as an initial tracking point. All pixels would still be far too many, so the author instead takes the intersections of a grid as the initial tracking points (the yellow dots inside the box in the figure below).
Flow chart of the Median-Flow tracker
Now for the author's tracking function proper, TLD::track, called as:

track(img1, img2, points1, points2)

img1 is the previous frame and img2 the current frame. points1 and points2 are both outputs of this function: points1 holds the grid intersections obtained by dividing the last tracked target region lastbox into a grid (the yellow points on the left of the figure above), and points2 holds those points of points1 that show up stably in the current frame (the points in the right-hand figure).
Below we walk through track, following the flow chart above and adding the extra steps that TLD layers on top.

The TLD::track function
void TLD::track(const Mat& img1, const Mat& img2, vector<Point2f>& points1, vector<Point2f>& points2){
  // 1. Generate grid points inside the last known bounding box
  bbPoints(points1, lastbox);
  if (points1.size() < 1){
    printf("BB= %d %d %d %d, Points not generated\n", lastbox.x, lastbox.y, lastbox.width, lastbox.height);
    tvalid = false;
    tracked = false;
    return;
  }
  vector<Point2f> points = points1;
  // 2-4. Track points forward and backward, estimate errors, filter outliers
  tracked = tracker.trackf2f(img1, img2, points, points2);
  if (tracked){
    // 5. Predict the new bounding box from the surviving point pairs
    bbPredict(points, points2, lastbox, tbb);
    // 6. Failure detection: large FB error, or box drifted out of the image
    if (tracker.getFB() > 10 || tbb.x > img2.cols || tbb.y > img2.rows || tbb.br().x < 1 || tbb.br().y < 1){
      tvalid = false;
      tracked = false;
      printf("Too unstable predictions FB error=%f\n", tracker.getFB());
      return;
    }
    // 7. Estimate confidence and validity with the nearest-neighbor classifier
    Mat pattern;
    Scalar mean, stdev;
    BoundingBox bb;
    bb.x = max(tbb.x, 0);
    bb.y = max(tbb.y, 0);
    bb.width  = min(min(img2.cols - tbb.x, tbb.width),  min(tbb.width,  tbb.br().x));
    bb.height = min(min(img2.rows - tbb.y, tbb.height), min(tbb.height, tbb.br().y));
    getPattern(img2(bb), pattern, mean, stdev);
    vector<int> isin;
    float dummy;
    classifier.NNConf(pattern, isin, dummy, tconf); // Conservative Similarity
    tvalid = lastvalid;
    if (tconf > classifier.thr_nn_valid){
      tvalid = true;
    }
  }
  else
    printf("No points tracked\n");
}
1. Initialize points to grid

Split bb into a 10×10 grid and store the grid intersections in points; the function is TLD::bbPoints.
void TLD::bbPoints(vector<cv::Point2f>& points, const BoundingBox& bb){
  int max_pts = 10;
  int margin_h = 0;
  int margin_v = 0;
  // Note: both operands are int, so the division truncates before ceil()
  int stepx = ceil((bb.width  - 2*margin_h) / max_pts);
  int stepy = ceil((bb.height - 2*margin_v) / max_pts);
  for (int y = bb.y + margin_v; y < bb.y + bb.height - margin_v; y += stepy){
    for (int x = bb.x + margin_h; x < bb.x + bb.width - margin_h; x += stepx){
      points.push_back(Point2f(x, y));
    }
  }
}
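To see the grid generation in isolation, here is a sketch of the same sampling logic with plain pairs instead of cv::Point2f (`Box` and `gridPoints` are made-up names mirroring BoundingBox and bbPoints):

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

struct Box { int x, y, width, height; };

// Sample grid intersections inside the box, as TLD::bbPoints does.
// max_pts = 10 aims at a 10x10 grid.
std::vector<std::pair<int,int>> gridPoints(const Box& bb) {
    const int max_pts = 10, margin = 0;
    // As in the original code: integer division, so ceil() has no effect here
    int stepx = (int)std::ceil((bb.width  - 2*margin) / max_pts);
    int stepy = (int)std::ceil((bb.height - 2*margin) / max_pts);
    std::vector<std::pair<int,int>> pts;
    for (int y = bb.y + margin; y < bb.y + bb.height - margin; y += stepy)
        for (int x = bb.x + margin; x < bb.x + bb.width - margin; x += stepx)
            pts.push_back({x, y});
    return pts;
}
```

For a 50×50 box the step is 50/10 = 5 in both directions, giving exactly 100 grid points; for box sizes not divisible by 10 the truncating division can yield a few extra rows or columns.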
2. Track points
3. Estimate tracking error
4. Filter out outliers

These three steps all happen inside trackf2f; the call chain is tld.processFrame → track → [tracked = tracker.trackf2f(img1, img2, points, points2)].
bool LKTracker::trackf2f(const Mat& img1, const Mat& img2, vector<Point2f> &points1, vector<cv::Point2f> &points2){
  // Forward tracking: img1 -> img2
  calcOpticalFlowPyrLK(img1, img2, points1, points2, status, similarity, window_size, level, term_criteria, lambda, 0);
  // Backward tracking: img2 -> img1
  calcOpticalFlowPyrLK(img2, img1, points2, pointsFB, FB_status, FB_error, window_size, level, term_criteria, lambda, 0);
  // Forward-Backward error: distance between each original point and its back-tracked position
  for (int i = 0; i < points1.size(); ++i){
    FB_error[i] = norm(pointsFB[i] - points1[i]);
  }
  normCrossCorrelation(img1, img2, points1, points2);
  return filterPts(points1, points2);
}
Here normCrossCorrelation(img1, img2, points1, points2) exists because the optical-flow result alone is not fully trusted: comparing the small patches around the two points of each pair gives a further way to drop unstable points. The similarity measure this time is not the plain correlation coefficient but the normalized cross-correlation (NCC). The formula (see the Wikipedia article) looks complicated, but it is really the correlation coefficient mentioned earlier, computed after subtracting each patch's mean.
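Concretely, for two equal-sized patches the zero-mean NCC is the sum of products of mean-subtracted pixels, divided by the product of the patches' norms; this is what CV_TM_CCOEFF_NORMED computes when template and image have the same size. A self-contained sketch on flat float arrays (the function name `ncc` is my own):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Zero-mean normalized cross-correlation of two equal-sized patches.
// Returns a value in [-1, 1]; 1 means identical up to brightness/contrast.
float ncc(const std::vector<float>& a, const std::vector<float>& b) {
    float ma = 0, mb = 0;
    for (size_t i = 0; i < a.size(); ++i) { ma += a[i]; mb += b[i]; }
    ma /= a.size();
    mb /= b.size();
    float num = 0, da = 0, db = 0;   // numerator and the two norms
    for (size_t i = 0; i < a.size(); ++i) {
        num += (a[i] - ma) * (b[i] - mb);
        da  += (a[i] - ma) * (a[i] - ma);
        db  += (b[i] - mb) * (b[i] - mb);
    }
    return num / std::sqrt(da * db);
}
```

Identical patches score 1, and a patch against its negated ramp scores -1, which is why thresholding at the median similarity (as filterPts does below) keeps the better half of the matches.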
void LKTracker::normCrossCorrelation(const Mat& img1, const Mat& img2, vector<Point2f>& points1, vector<Point2f>& points2) {
  Mat rec0(10, 10, CV_8U);
  Mat rec1(10, 10, CV_8U);
  Mat res(1, 1, CV_32F);
  for (int i = 0; i < points1.size(); i++) {
    if (status[i] == 1) {
      // 10x10 patches around each forward-tracked point pair
      getRectSubPix(img1, Size(10,10), points1[i], rec0);
      getRectSubPix(img2, Size(10,10), points2[i], rec1);
      // NCC between the two patches
      matchTemplate(rec0, rec1, res, CV_TM_CCOEFF_NORMED);
      similarity[i] = ((float *)(res.data))[0];
    } else {
      similarity[i] = 0.0;
    }
  }
  rec0.release();
  rec1.release();
  res.release();
}
With everything computed, we can finally filter: filterPts(points1, points2).
bool LKTracker::filterPts(vector<Point2f>& points1, vector<Point2f>& points2){
  // First pass: keep points whose NCC similarity is above the median
  simmed = median(similarity);
  size_t i, k;
  for (i = k = 0; i < points2.size(); ++i){
    if (!status[i])
      continue;
    if (similarity[i] > simmed){
      points1[k] = points1[i];
      points2[k] = points2[i];
      FB_error[k] = FB_error[i];
      k++;
    }
  }
  if (k == 0)
    return false;
  points1.resize(k);
  points2.resize(k);
  FB_error.resize(k);
  // Second pass: keep points whose FB error is at most the median
  fbmed = median(FB_error);
  for (i = k = 0; i < points2.size(); ++i){
    if (!status[i])
      continue;
    if (FB_error[i] <= fbmed){
      points1[k] = points1[i];
      points2[k] = points2[i];
      k++;
    }
  }
  points1.resize(k);
  points2.resize(k);
  if (k > 0)
    return true;
  else
    return false;
}
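The FB-error pass above boils down to: take the median of the residuals and keep everything at or below it, which by construction retains about half the points. A scalar sketch under assumed names (`medianValue`, `keepByFbError` are mine, not OpenTLD's):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Median via nth_element (upper median for even sizes).
static float medianValue(std::vector<float> v) {
    std::nth_element(v.begin(), v.begin() + v.size()/2, v.end());
    return v[v.size()/2];
}

// Indices surviving the FB-error pass: residual at most the median.
std::vector<size_t> keepByFbError(const std::vector<float>& fb_error) {
    float fbmed = medianValue(fb_error);
    std::vector<size_t> kept;
    for (size_t i = 0; i < fb_error.size(); ++i)
        if (fb_error[i] <= fbmed)
            kept.push_back(i);
    return kept;
}
```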
5. Update bounding box

bbPredict(points, points2, lastbox, tbb): points and points2 are the point pairs that survived the filtering above. From them we estimate the displacement and the scale change of the box; with both in hand, the position tbb of lastbox in the current frame follows directly.

Displacement estimate

The displacement is estimated as the median of the x and y displacements over all point pairs, as in the figure above. The scale change is estimated as the median, over all pairs of points within the same frame, of the ratio of their distance in the current frame to their distance in the previous frame; for a single pair of points the estimate works as in the figure below:
Scale estimate
void TLD::bbPredict(const vector<cv::Point2f>& points1, const vector<cv::Point2f>& points2,
                    const BoundingBox& bb1, BoundingBox& bb2) {
  int npoints = (int)points1.size();
  vector<float> xoff(npoints);
  vector<float> yoff(npoints);
  printf("tracked points : %d\n", npoints);
  // Per-point displacements between the two frames
  for (int i = 0; i < npoints; i++){
    xoff[i] = points2[i].x - points1[i].x;
    yoff[i] = points2[i].y - points1[i].y;
  }
  float dx = median(xoff);
  float dy = median(yoff);
  float s;
  // Scale: median ratio of pairwise point distances across the two frames
  if (npoints > 1){
    vector<float> d;
    d.reserve(npoints*(npoints-1)/2);
    for (int i = 0; i < npoints; i++){
      for (int j = i+1; j < npoints; j++){
        d.push_back(norm(points2[i]-points2[j]) / norm(points1[i]-points1[j]));
      }
    }
    s = median(d);
  }
  else {
    s = 1.0;
  }
  // Grow/shrink the box symmetrically about its center
  float s1 = 0.5*(s-1)*bb1.width;
  float s2 = 0.5*(s-1)*bb1.height;
  printf("s= %f s1= %f s2= %f \n", s, s1, s2);
  bb2.x = round(bb1.x + dx - s1);
  bb2.y = round(bb1.y + dy - s2);
  bb2.width  = round(bb1.width*s);
  bb2.height = round(bb1.height*s);
  printf("predicted bb: %d %d %d %d\n", bb2.x, bb2.y, bb2.br().x, bb2.br().y);
}
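The pairwise-ratio scale estimate can be checked on synthetic data. Below is a sketch with plain float pairs instead of cv::Point2f; `medianOf` stands in for OpenTLD's tld_utils median() (a hypothetical minimal re-implementation, not the original):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Median of a copy of the input (upper median for even sizes).
float medianOf(std::vector<float> v) {
    std::nth_element(v.begin(), v.begin() + v.size()/2, v.end());
    return v[v.size()/2];
}

// Scale estimate as in bbPredict: for every pair of points, the ratio of
// their distance in the current frame to their distance in the previous
// frame; the median of these ratios is the scale change.
float scaleEstimate(const std::vector<std::pair<float,float>>& p1,
                    const std::vector<std::pair<float,float>>& p2) {
    std::vector<float> r;
    for (size_t i = 0; i < p1.size(); ++i)
        for (size_t j = i + 1; j < p1.size(); ++j) {
            float dx1 = p1[i].first - p1[j].first, dy1 = p1[i].second - p1[j].second;
            float dx2 = p2[i].first - p2[j].first, dy2 = p2[i].second - p2[j].second;
            r.push_back(std::sqrt(dx2*dx2 + dy2*dy2) / std::sqrt(dx1*dx1 + dy1*dy1));
        }
    return medianOf(r);
}
```

If every point moves twice as far from every other point, all ratios equal 2 and so does the median, independent of any common translation of the whole point cloud.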
6. Failure detection

This step is simple. The paper declares a failure of the tracker when the median residual exceeds 10 pixels (matching the tracker.getFB() > 10 test in the code), the residual being the distance between the back-tracked point and the original point. The code additionally guards against the target drifting outside the image.
if (tracker.getFB() > 10 || tbb.x > img2.cols || tbb.y > img2.rows || tbb.br().x < 1 || tbb.br().y < 1){
  tvalid = false;
  tracked = false;
  printf("Too unstable predictions FB error=%f\n", tracker.getFB());
  return;
}
7. Estimate Confidence and Validity
Mat pattern;
Scalar mean, stdev;
BoundingBox bb;
// Clip tbb to the image before extracting the pattern
bb.x = max(tbb.x, 0);
bb.y = max(tbb.y, 0);
bb.width  = min(min(img2.cols - tbb.x, tbb.width),  min(tbb.width,  tbb.br().x));
bb.height = min(min(img2.rows - tbb.y, tbb.height), min(tbb.height, tbb.br().y));
getPattern(img2(bb), pattern, mean, stdev);
vector<int> isin;
float dummy;
classifier.NNConf(pattern, isin, dummy, tconf);
tvalid = lastvalid;
if (tconf > classifier.thr_nn_valid){
  tvalid = true;
}
The comments make this clear enough. You can ignore the part that decides whether the trajectory is valid for now; just know that the Conservative Similarity of the nearest-neighbor classifier (Section 5.2 of the paper) is used as the score of the tracked target, and later this score is compared against the detector's.