This article introduces Adaboost, the adaptive boosting algorithm in machine learning. It draws mainly on Li Hang's "Statistical Learning Methods" and on a very talented classmate's blog post; the code I used in my experiments was adapted from that post. I have not been working with machine learning for very long, so if anything here is wrong, corrections from readers are most welcome!
Machine learning is a promising field, and it has been a hot topic in recent years: Google's "super brain", Baidu's Institute of Deep Learning, and so on. Not long ago Google publicly unveiled its self-driving car, and Microsoft showed real-time translation for Skype.
Adaboost is a powerful machine learning method that builds one "strong classifier" out of several "weak classifiers" — a team of mediocre players cooperating until they play like an expert. Given a dataset to classify, it is clearly much easier to build a simple classifier with merely passable accuracy than a carefully engineered one with excellent accuracy. A "weak classifier" is exactly such a simple classifier. How weak can it be? It only has to classify slightly better than random guessing (50% accuracy). This requirement is almost trivially easy to satisfy: even if your classifier's accuracy is below 50%, simply inverting its output gives you one above 50%. A "strong classifier", by contrast, is one whose accuracy is high. But can the mediocre players really combine into an expert? This question has been settled by the theorists, and the answer is yes: weak classifiers can be combined into a strong one. Adaboost — Adaptive Boosting — is one concrete method for doing so.
As an example, to decide whether a person is a man or a woman, we could construct a classifier described in words as:
IF hair length is at most 15 cm THEN man
ELSE woman
This classifier could hardly be simpler: one rule, applied in the crudest possible way. From everyday experience it is easy to believe it achieves some accuracy, but we are still not satisfied.
What if we add one more classifier? (This one sounds a bit like self-mockery...)
IF height is greater than 170 cm THEN man
ELSE woman
We can come up with other weak classifiers in the same spirit. Adaboost's job is to knit them together: each weak classifier casts a weighted vote, and the votes jointly decide whether the person is a man. The problem therefore reduces to choosing the weight of each weak classifier.
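To make the weighted vote concrete, here is a toy sketch combining the two rules above. The weights 0.8 and 0.5 are made-up numbers for illustration; learning such weights from data is exactly what Adaboost does.

```cpp
// Two hypothetical weak classifiers; each votes +1 (man) or -1 (woman).
int stump_hair (double hair_cm) { return hair_cm <= 15.0 ? 1 : -1; }
int stump_height (double height_cm) { return height_cm > 170.0 ? 1 : -1; }

// The strong classifier is the sign of the weighted sum of the votes.
// The weights 0.8 and 0.5 are invented here, not learned.
int strong (double hair_cm , double height_cm)
{
    double score = 0.8 * stump_hair (hair_cm) + 0.5 * stump_height (height_cm);
    return score > 0 ? 1 : -1;
}
```

When both rules agree the vote is unanimous; when they disagree, the hair rule wins because its weight is larger, e.g. strong(40, 180) returns −1.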
The basic idea of Adaboost is as follows:
Assign each sample an initial weight (for example 1/SAMPLE_NUM). In every round, first use the weighted sample set to select an optimal weak classifier (the one for which the total weight of the misclassified samples is smallest); then use that weak classifier's results to compute its weight in the final strong classifier, and adjust the sample weights: samples it misclassified become heavier, samples it classified correctly become lighter. Exit when a stopping condition is met (the iteration limit is reached or every sample is classified correctly). The outputs of the weak classifiers, linearly combined according to their weights, form the final strong classifier.
/* Initialize the weight of every sample: w_0 = 1/N; fix the number of weak classifiers T
 * i = 0
 * while i < T
 *     using the current sample weights, pick the weak classifier with the smallest classification error rate
 *     (the error rate err is the sum of the weights of the samples that classifier gets wrong)
 *     classifier weight \alpha_i = 0.5 * log((1-err)/err)
 *     update the sample weights:
 *         if this round's best weak classifier classifies the sample correctly, \beta = exp(-\alpha)
 *         else \beta = exp(\alpha)
 *         w_(i+1) = w_i * \beta
 *     normalize w_(i+1) so that the weights sum to 1
 * the strong classifier is sign( \sum_i \alpha_i * f_i(x) )
 */
Or, as shown in the figure (the notation differs slightly):
This is how several weak classifiers are combined into a strong classifier.
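Before the full OpenCV program, the loop above can be sketched as a minimal, self-contained Adaboost over 1-D decision stumps on toy data. All names here (Stump, train, strong_predict) are illustrative and independent of the code that follows.

```cpp
#include <cmath>
#include <vector>

// A stump compares one value against a threshold; parity (+1/-1) flips
// the direction of the comparison, alpha is the stump's voting weight.
struct Stump { double thresh; int parity; double alpha; };

int stump_predict (const Stump& s , double x)
{
    return (s.parity * x < s.parity * s.thresh) ? 1 : -1;
}

std::vector<Stump> train (const std::vector<double>& x ,
                          const std::vector<int>& y , int rounds)
{
    size_t n = x.size ();
    std::vector<double> w (n , 1.0 / n);   // initial sample weights: 1/N
    std::vector<Stump> strong;
    for (int t = 0; t < rounds; t++)
    {
        // 1. exhaustively pick the stump with the smallest weighted error
        Stump best = { 0 , 1 , 0 };
        double besterr = 1.0;
        for (size_t i = 0; i < n; i++)
            for (int p = -1; p <= 1; p += 2)
            {
                Stump s = { x[i] , p , 0 };
                double err = 0;
                for (size_t j = 0; j < n; j++)
                    if (stump_predict (s , x[j]) != y[j]) err += w[j];
                if (err < besterr) { besterr = err; best = s; }
            }
        // 2. the stump's weight; the small constant guards against err == 0
        best.alpha = 0.5 * std::log ((1.0 - besterr) / (besterr + 1e-8));
        strong.push_back (best);
        // 3. reweight: wrong samples get exp(+alpha), right ones exp(-alpha)
        double sum = 0;
        for (size_t j = 0; j < n; j++)
        {
            bool ok = stump_predict (best , x[j]) == y[j];
            w[j] *= std::exp (ok ? -best.alpha : best.alpha);
            sum += w[j];
        }
        for (size_t j = 0; j < n; j++) w[j] /= sum; // normalize to sum 1
    }
    return strong;
}

// strong classifier: sign of the alpha-weighted vote of all stumps
int strong_predict (const std::vector<Stump>& strong , double x)
{
    double score = 0;
    for (size_t i = 0; i < strong.size (); i++)
        score += strong[i].alpha * stump_predict (strong[i] , x);
    return score > 0 ? 1 : -1;
}
```

On a separable 1-D toy set a single stump already reaches zero error; the OpenCV program below is the same idea, with projections of 2-D points serving as the features.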
Next comes the example code. It uses a few OpenCV library functions (mainly for visualization) to separate points at different positions in a 2-D plane; the weak classifiers are simple if-else rules.
The weak classifiers are decision stumps of the form h_j(x) = +1 if p_j·f_j(x) < p_j·θ_j and −1 otherwise, where f_j(x) is one element of the feature vector x, and p_j takes the value +1 or −1, flipping the direction of the inequality (see Li Hang's "Statistical Learning Methods" for details).
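As a quick illustration of how p_j flips the inequality (the function name h and the threshold 5 are arbitrary; the program's h_fun below implements the same rule):

```cpp
// Decision stump: +1 when parity*x < parity*thresh, else -1.
// With parity = -1 the test effectively becomes x > thresh.
int h (double x , double thresh , int parity)
{
    return (parity * x < parity * thresh) ? 1 : -1;
}
```

h(3, 5, +1) is positive because 3 < 5; with parity −1, h(7, 5, −1) is positive because −7 < −5, i.e. 7 > 5.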
The program consists of weakclassifier.h, weakclassifier.cpp, and the test file main.cpp. First, weakclassifier.h:
#ifndef _WEAKCLASSIFIER_H
#define _WEAKCLASSIFIER_H
#include <opencv2/opencv.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
using namespace cv;
#define P_SAMPLE_NUM 300 // number of positive samples
#define N_SAMPLE_NUM 400 // number of negative samples
#define SAMPLE_NUM (P_SAMPLE_NUM+N_SAMPLE_NUM)
#define MAX_FEATURE 40
#define X_MAX 300
#define Y_MAX 300
//define the class weakclassifier
//#define CIRCLE_TEST
// decision function of a weak classifier
int h_fun (double x , double thresh , int parity);
// sample data:
// _features holds the feature values
// _label is the sample's class label (+1 or -1)
struct SampleData
{
double _features[MAX_FEATURE];
int _label;
double _eigen; // value of the currently selected feature (used for sorting)
double _weight; // Adaboost sample weight
};
struct Weakclassifier
{
public:
double _threshold;
int _feature;
int _parity;
double _error;
double _alpha;
};
struct StrongClassifier
{
int _nweak; // number of weak classifiers, which is also the number of training iterations
vector<Weakclassifier> _weak; // the weak classifiers
//SampleData _samples[SAMPLE_NUM];
SampleData* _samples; // pointer to the sample data
// image used for visualization
IplImage* frame;
StrongClassifier (SampleData* samples, int nweak)
{
this->_nweak = nweak;
this->_samples = samples;
this->frame = cvCreateImage (cvSize (X_MAX , Y_MAX) , IPL_DEPTH_8U , 3);
//show the image
for (int i = 0; i < Y_MAX; i++)
for (int j = 0; j < X_MAX; j++)
CV_IMAGE_ELEM (frame , uchar , i , 3 * j) =
CV_IMAGE_ELEM (frame , uchar , i , 3 * j + 1) =
CV_IMAGE_ELEM (frame , uchar , i , 3 * j + 2) = 255;
}
void getEigenVal (const int& feature);
void sortSamples ();
// update the sample weights
void updateWeight (const Weakclassifier& best);
// train the strong classifier
void train ();
// find the best weak classifier for a given feature in each iteration
bool getWeakclassifier (Weakclassifier& weakc,const int& feature);
// strong-classifier prediction for one sample (used to count training errors)
int getClassifyResult (const SampleData& s);
// visualize the result
void drawResult ();
void display ()
{
cvShowImage ("show" , frame);
}
};
void GenerateSampleData (SampleData sample_ori[]);
bool Compare_fun (SampleData s1 , SampleData s2);
#endif // _WEAKCLASSIFIER_H
weakclassifier.cpp:
#include "weakclassifier.h"
//define the class weakclassifier
int h_fun (double x , double thresh , int parity)
{
return (parity*x<parity*thresh) ? 1 : -1;
}
bool Compare_fun (SampleData s1 , SampleData s2)
{
return s1._eigen < s2._eigen;
}
void swapSampleData (SampleData& s1 , SampleData& s2)
{
SampleData tmp = s2;
s2 = s1;
s1 = tmp;
return;
}
void generateFeatures (SampleData& s,const int x,const int y)
{
#ifdef CIRCLE_TEST
for (int i = 0; i<MAX_FEATURE-1;i++)
{
s._features[i]=
std::cos (CV_PI*i/MAX_FEATURE)*x+
std::sin (CV_PI*i/MAX_FEATURE)*y;
}
s._features[MAX_FEATURE-1] = (x-150)*(x-150)+(y-150)*(y-150);
#else
for (int i = 0; i<MAX_FEATURE;i++)
{
s._features[i] =
std::cos (CV_PI*i/MAX_FEATURE)*x+
std::sin (CV_PI*i/MAX_FEATURE)*y;
}
#endif
return;
}
void StrongClassifier::sortSamples ()
{
std::sort (_samples , _samples+SAMPLE_NUM , Compare_fun);
/*
SampleData* Psamples = &(this->_samples[0]);
for (int i = 0; i<SAMPLE_NUM;i++)
{
double mineigen = Psamples[i]._eigen;
int index = i;
for (int j = i; j<SAMPLE_NUM; j++)
{
if (mineigen>Psamples[j]._eigen)
{
index = j;
mineigen = Psamples[j]._eigen;
}
}
if (index!=i)
swapSampleData (Psamples[index] , Psamples[i]);
}
*/
return;
}
// copy the selected feature's value into _eigen for every sample
void StrongClassifier::getEigenVal (const int& feature)
{
for (int i = 0; i<SAMPLE_NUM;i++)
{
this->_samples[i]._eigen = this->_samples[i]._features[feature];
}
return;
}
bool StrongClassifier::getWeakclassifier (Weakclassifier& weakc , const int& feature)
{
/* 1. sort the samples in ascending order of the chosen feature */
this->getEigenVal (feature);
this->sortSamples ();
// total weight of the positive samples and of the negative samples
double pos_weight = 0;
double neg_weight = 0;
for (int i = 0; i<SAMPLE_NUM; i++)
{
if (_samples[i]._label==1)
pos_weight += _samples[i]._weight;
else
neg_weight += _samples[i]._weight;
}
// scan candidate thresholds between consecutive sorted samples
double loss_pos_weight = 0 , loss_neg_weight = 0;
double besterror = 0.5;
int bestparity = 0;
double bestthresh = -1;
for (int i = 1; i<SAMPLE_NUM; i++)
{
if (_samples[i-1]._label==1)
loss_pos_weight +=_samples[i-1]._weight;
else
loss_neg_weight += _samples[i-1]._weight;
// FP + FN
if ((loss_pos_weight + neg_weight - loss_neg_weight) < besterror)
{
besterror = loss_pos_weight + neg_weight - loss_neg_weight;
bestparity = -1;
//the optimal threshold is the midpoint of the (i-1)-th and i-th sorted values
bestthresh = (_samples[i]._eigen + _samples[i-1]._eigen) / 2;
}
// FN+FP
else if (loss_neg_weight + pos_weight - loss_pos_weight < besterror)
{
besterror = loss_neg_weight + pos_weight - loss_pos_weight;
bestparity = 1;
bestthresh = (_samples[i]._eigen + _samples[i-1]._eigen) / 2;
}
}
CV_Assert (besterror>=0);
weakc._threshold = bestthresh;
weakc._error = besterror;
weakc._parity = bestparity;
weakc._feature = feature;
weakc._alpha = 0.5*std::log ((1.0-besterror)/(besterror+1E-8));
return true;
}
// train the strong classifier
void StrongClassifier::train ()
{
int classifier_num = this->_nweak; // number of weak classifiers to use
Weakclassifier besth , h_tmp;
for (int i = 0; i<classifier_num; i++) // at most this many weak classifiers; each corresponds to one feature
{
/* 1. find the weak classifier with the smallest weighted error */
double curerror = 0.5;
for (int j = 0; j<MAX_FEATURE; j++)
{
this->getWeakclassifier (h_tmp , j);
if (h_tmp._error<curerror)
{
curerror = h_tmp._error;
besth = h_tmp;
}
}
CV_Assert (curerror<0.5);
this->_weak.push_back (besth); // the best weak classifier of this iteration
std::cout<<"****************************************************"<<endl;
std::cout<<"Best Classifier :" <<i<<" Complete! " <<endl;
std::cout<<"Threshold: "<<besth._threshold<<" "<<"Parity: "<<besth._parity<<" "
<<endl<<"Error "<<besth._error<<" "<<"Alpha "<<besth._alpha<<" "<<
"Feature Index:"<<besth._feature<<endl;
//update the weight
this->updateWeight (besth);
int errorcount = 0;
for (int j = 0; j<SAMPLE_NUM; j++)
{
SampleData* Ps = &(_samples[j]);
if (this->getClassifyResult (*Ps)!=Ps->_label)
errorcount++;
}
cout<<"There are "<<errorcount<< " errors!"<<endl;
cout<<"--------------------------------------"<<endl;
/* draw the current decision regions */
this->drawResult ();
int c = waitKey (); // wait for ESC (27) before starting the next round
while (c!=27)
{
c = waitKey ();
}
if (errorcount==0)
{
break;
}
}
return;
}
void StrongClassifier::drawResult ()
{
for (int i = 0; i < Y_MAX; i++)
for (int j = 0; j < X_MAX; j++)
CV_IMAGE_ELEM (frame , uchar , i , 3 * j) =
CV_IMAGE_ELEM (frame , uchar , i , 3 * j + 1) =
CV_IMAGE_ELEM (frame , uchar , i , 3 * j + 2) = 0;
for (int y = 0; y<Y_MAX; y += 1)
{
for (int x = 0; x<X_MAX; x += 1)
{
SampleData s;
generateFeatures (s , x , y);
int label = this->getClassifyResult (s);
if (label==1)
{
CV_IMAGE_ELEM (frame , uchar , y , 3 * x + 1) = 255;
}
else
{
CV_IMAGE_ELEM (frame , uchar , y , 3 * x + 2) = 255;
}
}
}
cvShowImage ("Img" , this->frame);
}
int StrongClassifier::getClassifyResult (const SampleData& s)
{
double res = 0;
int curWeakNum = this->_weak.size ();
Weakclassifier* Pweak;
for (int i = 0; i<curWeakNum; i++)
{
Pweak = &(this->_weak[i]);
res += Pweak->_alpha*h_fun (s._features[Pweak->_feature],
Pweak->_threshold , Pweak->_parity);
}
int label = res>0 ? 1 : -1;
return label;
}
void StrongClassifier::updateWeight (const Weakclassifier& best)
{
double weight_sum = 0;
double weight[SAMPLE_NUM];
double weight_tmp;
for (int i = 0; i<SAMPLE_NUM;i++)
{
SampleData* Ps = _samples+i;
int label = h_fun (Ps->_features[best._feature] , best._threshold , best._parity);
CV_Assert (Ps->_label==1 || Ps->_label==-1);
if (label!=Ps->_label) // misclassified
{
weight_tmp = Ps->_weight*std::sqrt ((1-best._error)/best._error);
}
else
{
weight_tmp = Ps->_weight*std::sqrt (best._error/(1-best._error));
}
weight_sum += weight_tmp;
weight[i] = weight_tmp;
}
for (int i = 0; i<SAMPLE_NUM;i++)
{
SampleData* Ps = _samples+i;
Ps->_weight = weight[i]/weight_sum;
}
return;
}
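A side note on updateWeight: the square-root factors are exactly the exp(±α) update from the pseudocode, since α = ½ ln((1−err)/err) gives

```latex
e^{\alpha}  = \exp\!\left(\tfrac{1}{2}\ln\tfrac{1-\mathrm{err}}{\mathrm{err}}\right) = \sqrt{\tfrac{1-\mathrm{err}}{\mathrm{err}}},
\qquad
e^{-\alpha} = \sqrt{\tfrac{\mathrm{err}}{1-\mathrm{err}}}
```

so a misclassified sample's weight is multiplied by e^{α} and a correctly classified one's by e^{−α}, before normalization.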
The test driver, main.cpp:
#include "weakclassifier.h"
#include <iostream>
#include <time.h>
using std::cout;
using std::cin;
using namespace cv;
double point_x[SAMPLE_NUM] = { };
double point_y[SAMPLE_NUM] = { };
void generateTrainMat (Mat& trainMat)
{
trainMat.create (SAMPLE_NUM , MAX_FEATURE+1 , CV_64FC1);
srand (time (0));
int counter = 0;
int random_x , random_y;
while (counter < P_SAMPLE_NUM)
{
random_x = rand () % 300 - 150;
random_y = rand () % 300 - 150;
#ifdef CIRCLE_TEST
if (random_x * random_x + random_y * random_y > 2500
&& random_x*random_x+random_y*random_y<3600)
continue;
#else
if (random_x * random_x + random_y * random_y > 2500)
continue;
#endif
point_x[counter] = random_x + 150;
point_y[counter] = random_y + 150;
trainMat.at<double> (counter , 0) = 1;
#ifdef CIRCLE_TEST
for (int j = 0; j<MAX_FEATURE-1; j++)
trainMat.at<double> (counter , j+1) =
std::cos (CV_PI*j/MAX_FEATURE)*point_x[counter]+
std::sin (CV_PI*j/MAX_FEATURE)*point_y[counter];
trainMat.at<double> (counter , MAX_FEATURE) =
random_x * random_x + random_y * random_y;
#else
for (int j = 0; j<MAX_FEATURE; j++)
trainMat.at<double> (counter , j+1) =
std::cos (CV_PI*j/MAX_FEATURE)*point_x[counter]+
std::sin (CV_PI*j/MAX_FEATURE)*point_y[counter];
#endif
counter++;
}
while (counter < SAMPLE_NUM)
{
random_x = rand () % 300 - 150;
random_y = rand () % 300 - 150;
#ifdef CIRCLE_TEST
if (random_x * random_x + random_y * random_y < 2500 ||
random_x*random_x+random_y*random_y>3600)
continue;
#else
if (random_x * random_x + random_y * random_y < 2500)
continue;
#endif
point_x[counter] = random_x + 150;
point_y[counter] = random_y + 150;
trainMat.at<double> (counter , 0) = -1;
#ifdef CIRCLE_TEST
for (int j = 0; j<MAX_FEATURE-1; j++)
trainMat.at<double> (counter , j+1) =
std::cos (CV_PI*j/MAX_FEATURE)*point_x[counter]+
std::sin (CV_PI*j/MAX_FEATURE)*point_y[counter];
trainMat.at<double> (counter , MAX_FEATURE) =
random_x * random_x + random_y * random_y;
#else
for (int j = 0; j<MAX_FEATURE; j++)
trainMat.at<double> (counter , j+1) =
std::cos (CV_PI*j/MAX_FEATURE)*point_x[counter]+
std::sin (CV_PI*j/MAX_FEATURE)*point_y[counter];
#endif
counter++;
}
}
void displayPic (const StrongClassifier& cl)
{
// display
int i = 0;
for (i = 0; i<P_SAMPLE_NUM;i++)
{
cvCircle (cl.frame , cvPoint (int (point_x[i]) , int (point_y[i])) ,
3 , cvScalar (0 , 255 , 0) , 1);
}
for (; i<SAMPLE_NUM; i++)
{
cvCircle (cl.frame , cvPoint (int (point_x[i]) , int (point_y[i])) ,
3 , cvScalar (0 , 0 , 255) , 1);
}
return;
}
void generateSampleData (SampleData* samples , const Mat& trainMat)
{
for (int i = 0; i<SAMPLE_NUM; i++)
{
samples[i]._label = (int) (trainMat.at<double> (i , 0));
for (int j = 1; j<=MAX_FEATURE; j++)
{
samples[i]._features[j-1] = trainMat.at<double> (i , j);
}
samples[i]._weight = 1.0/SAMPLE_NUM;
}
}
int main ()
{
Mat trainMat;
generateTrainMat (trainMat);
SampleData samples[SAMPLE_NUM];
generateSampleData (samples , trainMat);
StrongClassifier classifier(samples,100);
displayPic (classifier);
classifier.display ();
classifier.train ();
waitKey ();
}
Results:
First, the sample dataset.
The classification result after one iteration:
After 5 iterations:
After 28 iterations:
After 31 iterations, the number of misclassified samples reaches 0.