Predicting a Stock Index with the SVM Algorithm

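This code is four years old. At the time I wondered whether I could use this algorithm to make a fortune in the stock market; looking back, that is rather laughable. Still, I thought about it and I did it. Nothing came of it, but it may be useful to those who come later, so I am sharing it here and laughing it off.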


%SVM_SMO applied to a regression problem;
%almost all of the basic modules can be kept;
%uses the RBF training data as the input;
%copies SVM_SMO_C as the algorithm scheme;
 
%--------------------------------------------------------------------------
%build the training set;
%this is a regression problem;
%think about the linear epsilon-insensitive loss (efxt is the slack);
%--------------------------------------------------------------------------

%--------------------------------------------------------------------------
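%20140218 update
%Items to update today:
%1, duality gap, the version without the bias; tried it, little effect;
%2, how to update the bias; little effect, but it has already been updated;
%3, use of the delta_W matrix; it seems to do little, but keep using it;
%4, use the ratio stopping criterion to choose the first point; this seems to help
%somewhat. I call it the "gnaw the hard bone" method: if an outer loop makes the
%ratio rise instead of fall, repeat that outer loop, i.e. keep the first point
%unchanged and continue until the ratio falls; the experimental results are acceptable;
%5, in the inner loop, however, we need the point that makes the dual objective grow
%fastest, so earlier points have to be recorded; I call this the "if it tastes good,
%eat more of it" rule;
%6, in summary: as the overall principle, gnaw the hard bone; in the concrete
%implementation, find the way that makes the dual objective grow fastest;
%7, in addition, the choice of e and of the allowed tolerance affects the convergence
%speed, and so does the decision standard of the stopping criterion;
%8, also, abs(a(i))*(e-|f(xi)-yi|)+C*efxt(i) has not been tested; it is another
%scheme worth testing;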
%--------------------------------------------------------------------------

%--------------------------------------------------------------------------
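%20140305 update
%Today's experiment: predict the stock index with heterogeneous inputs and outputs, and see what accuracy can be reached;
%first, solve the problem of entering and cleaning up the data;
%then handle the details, such as the value of C, the number of samples, the variance, the accuracy, the relative accuracy, and so on...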
%--------------------------------------------------------------------------

clear all;
close all;

fprintf('Build the training set.\n');
st = cputime;

for round=1:5
    
N=400;
NN=5000;
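NS=N;                               %suppose that all data points are SVs.
NS_1=N;                             %initial guess for the number of SVs inside the bounds;
si=1;                               %variance;
m0=5;                               %dimension of the input vector; it can be seen as the depth of history traced back;
qwe=4;                              %target column: 1, opening price; 2, highest price; 3, lowest price; 4, closing price;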
del_d=0;                           %delay days, at this time, 20;

%nt=3000;                    %total points: 10,000;
%n=1e3;                      %training set scale: 1000;
%ep=100*25;                  %epoch number: 2500;
%vf=5;                       %frequency of verification: validation on every 5 epoch training;
%sigma_v=1;
%beta=0.5;

load sz000001.mat;
A=data;
%x=A(:,1);
[a1,a2]=size(A);
%swap the first and fourth columns;

%{
swap=A(:,1);
A(:,1)=A(:,4);
A(:,4)=swap;
%}

for i=1:a2
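    miu_a(i)=mean(A(:,i));                      %mean of column i; note that this is the mean of the whole data set, so the training set itself does not necessarily have zero mean;
    A(:,i)=A(:,i)-miu_a(i);                     %subtract the mean, i.e. remove the DC component;
    max_a(i)=max(abs(A(:,i)));                          %largest absolute value; this is the normalization factor;
    A(:,i)=A(:,i)./max_a(i);                                 %divide by that number;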
end

for i=1:N;
    for j=1:m0;
        X(i,j)=A(i+a1-N-del_d-1,j);
    end
end

for i=1:N
    y(i)=A(i+a1-N-del_d,qwe);
end
figure;
plot(y,'r.');

fprintf('initial the SMO parameters.\n');

    a=zeros(1,N);%normrnd(0,1,1,N)./10;%  zeros(1,N);%                 %the first initial value;
    b=0;                            %bias.
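    C=1;                          %the bound; C is the penalty coefficient, also called the regularization coefficient? Does it indicate how large the risk is? What happens when this number is large?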
    FX=zeros(1,N);                  %f(x(i)) i=1toN, since {a(i)}=0, so f(x)=0; f(x)=w'*x-b; or f(x)=SUM ai*K(xi,xj)-b=0;
    E=FX-y;                         %y: the expected response; normrnd(0,1,1,N)./1;%
                                    %??E should be followed by training set;
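    efxt=zeros(1,N);                %efxt holds the gap-related slack values: how far each point outside the tube deviates from the tube wall; these are what we want to eliminate as far as possible (i.e. reduce the multipliers whose absolute value equals C);
    e=(1e-3)*16;                    %the tube half-width: an error beyond this limit falls outside the tube, one below it stays inside;
    e_f=0;                          %e changes dynamically, so e_f marks the current stage to show how it changes;
    eps=e/2;                        %threshold for the stopping judgement; the tolerance, which generally has to be smaller than e;
    thre=0.001;                     %threshold of the stopping criterion;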
    times=0;                        %at the start, times=0;the number of
                                    %external loops;
    presv=0;                        %at the beginning, suppose there is no
                                    %support vector found;
    Gram=eye(N,N);                  %build the gram matrix, for recording the calculation results;
    delta_W=zeros(N,N);%abs(normrnd(0,1,N,N));
    ot=0;                           %stop the operation according the run times;
    totaltimes=0;                   %how many times of calculation operated?
    in=1;                           %set the initial value of in;in start from 1;
    ic=1;
    i0=1;
    %ratio_old=0;
    ratio=1;
    
while(1)                           %the first loop, when ratio<eps, loop stop;   
%select the first point;
%--------------------------------------------------------------------------
if times==0   %(mod(times,10)==0)                      %first selection or calculation;
    i1=1;                          %select the point in random, so chose 1;
else                               %if this is not the first selection:
    %----------------------------------------------------------------------
    %when other selection:
    %1, choose the alphas in (0,C) which violate the KKT conditions;
    %2, then the alpha=0 and alpha=C points which violate the KKT conditions;
    %3, if the chosen alpha is at the end of the sequence, restart from 1;
    %----------------------------------------------------------------------
    %{
    if ratio>=ratio_old
        times=0;
    else
        ratio_old=ratio;
    end
    %}
  %if ratio>ratio_old              %the intent here: if ratio has made no progress, repeat the previous round until it does;
     %ratio_old=ratio;
    while (in<=N)                  %search all alphas in (0,C); traverse every in-bound point that violates the KKT conditions;
                                   %could I simplify the alpha parameters as below?
            if (abs(a(in))<C) && (a(in)~=0)
                ain=abs(y(in)-FX(in))-e;
                if abs(ain)>eps
                    i1=in;
                    presv=1;
                end
            end
                in=in+1;                   %if no KKT-violating SV has been found yet, continue...
            if (presv)                 %we've found SV and not exceed the NS times;
                break;
            end
    end   
    if presv==0                    %if we did not find an SV which violates the KKT conditions, or times_i1 exceeded NS times;
       in=1;
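        while (ic<=N)              %traverse all points outside the tube;
                                   %if the alpha is on the boundary, i.e. a=0 or C;
            if abs(a(ic))==C       %look for points that violate the KKT conditions; in practice these points mostly satisfy them, so this loop seems to do little; still, the values change after every iteration, so it does help;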
               aic=abs(y(ic)-FX(ic))-e;%-efxt(ic);%-efxt(ic);%
                if aic<-eps        %(aic>eps)||(aic<-eps)
                    i1=ic;         %if break the KKT rule, i1=i;
                    presv=1;       %we found a SV already;
                end
            end
                ic=ic+1;
            if (presv)
                break;
            end           
        end
    end   
    if presv==0
       while (i0<=N)
                                   %if the alpha is on the boundary, i.e. a=0 or C;
           if a(i0)==0
              ai0=abs(y(i0)-FX(i0))-e;
              if ai0>eps
                 i1=i0;         %if break the KKT rule, i1=i;                   
                 presv=1;       %we found a SV already;                
              end
           end
              i0=i0+1;
           if (presv)
              break;
           end          
       end
    end      
    if (presv==0)                  %if we did not find an SV which violates the KKT conditions, pick a random point;
       ic=1;
       i0=1;       
       i1=floor(rand*N)+1;
    end   
    presv=0;                       %back to the initial value;i.e.no sv found;
 %end
end

%--------------------------------------------------------------------------    
%{
--------------------------------------------------------
now I have the first point, and search the second point.
--------------------------------------------------------
%}
times_2=0;                            %how many times of internal loop?
i2old=i1;                             %the initial value of i2;
while(1)                              %the important question is how to
                                      %stop the loop and this is internal
                                      %loop, too.
 
if times==0 %(mod(times,10)==0)                 %at the first time, most points have;every 5 loops, we calculate all data points.
                                      %a(i)=0;
   if i2old<N                        %i2old from i1:N; i.e. 1:N.
        i2=i2old+1;                   %i2 from 2:N+1;
        i2old=i2;                     %remember the choice;
   end   
else                                  %at the other loops,there should be
                                      %some points a(i)>0;    
%--------------------------------------------------------------------------
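%Finding the second point is the real brain-teaser: we need to search the in-bound
%points and, as far as possible, pick the one that contributes most to the dual objective;
%this can be done by computing each pair's contribution to delta_W and recording it;
%this is what I call the "if it tastes good, eat more of it" method;
%the point can also be chosen by computing the size of the duality gap;
%the operations usually end up in a concise form, but at the start every possibility has to be exhausted;
%the duality-gap approach did not work well and has been dropped;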
%--------------------------------------------------------------------------
    max_delta=0;
    min_E=E(i1);                         
    max_E=E(i1);
    %found_f=0;
  if times_2<=NS_1
    if E(i1)>=0
       for i=1:N
            if delta_W(i1,i)<=0 && a(i)~=0 && abs(a(i))<C           %this way nearly all in-bound points get traversed once; it gives a chance to the points whose last visit made the dual function decrease;
                if (E(i)<min_E) %&&(i~=i2old_x)
                    min_E=E(i);
                    i2=i;
                    %found_f=1;
                end
            end    
       end
    else
       for i=1:N
            if delta_W(i1,i)<=0 && a(i)~=0 && abs(a(i))<C %&&       regardless of magnitude or sign;
                if (E(i)>max_E) %&&(i~=i2old_x)
                    max_E=E(i);
                    i2=i;
                    %found_f=1;
                end            
            end    
       end
    end
  else
      
        for i=1:N
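            if a(i)~=0 && abs(a(i))<C                               %then pick the best in-bound point and continue; but there is a question here:
                                                                    %do we need to pick points outside the tube?? If not, 2*NS_1 inner
                                                                    %iterations should be enough and more would be useless; I just tried that
                                                                    %and it did not work, the inner loop must run enough times;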
                if delta_W(i1,i)>max_delta && i~=i1;
                    max_delta=delta_W(i1,i);
                    i2=i;
                    %found_f=1;
                    %times_2=0;
                end
            end
        end
    end

end
   times_2=times_2+1;               %totally N for times=0;   
if times_2>NS        %+1            %finish one complete loop;        
   break;
end
    totaltimes=totaltimes+1;        %total number of calculations; a calculation starts here, so increment the counter;
   
    %---------------------------------------------------------------
    %now I have two points, start the calculation;
    %---------------------------------------------------------------
%{
if i2>=max_i                         %the i2 should larger than max_i;
    max_i=i2;                        %update the value of max_i;
else                                 %if the i2 less than max_i;
    max_i=1;                         %max_i go back to 1;
    times=1;                         %times increased since the first loop
                                     %is complete;
    break;                           %jump out the loop;
end
%}
%now we get the i2, so we could start the optimization;
      
%select the second point;
 
%SMO optimization algorithm;
%{
possible parameters;
SV(i); collect all support vectors into one group and calculate the E1&E2
using these elements;
x1,x2;
x_sv(i) and y_sv(i) compare to the SV(i);
---------------------------------------------
maybe we don't need to collect the SV group.
we just choose the {a(i)>0}is okay,
when the algorithm stops, we could update the
SVs and give the final decision hyper plane.
---------------------------------------------
a1,a2;
a2new,a2new-unc,a2old;a1new,a1old;
y1,y2;
E1,E2;
L,H;
K11,K22,K12;
%}
 
%and we need the kernel function;
%{
    K(x,z)=exp(-1/(2*sigma^2)*norm(x-z)^2);
%}
%--------------------------------------------------------------------------
%
%RUN the SMO algorithm
%
%--------------------------------------------------------------------------
x1=X(i1,:);
x2=X(i2,:);
y1=y(i1);
y2=y(i2);
a1old=a(i1);
a2old=a(i2);
E1=E(i1);                        %this value got from memory;
E2=E(i2);                        %ditto;
star=a1old+a2old;
%FX1=FX(i1);
FX1old=FX(i1);
%FX2=FX(i2);
FX2old=FX(i2);
bold=b;


K11=1;%K(x1,x1); for the Gaussian kernel function;
K22=1;%K(x2,x2); ditto;

if Gram(i1,i2)~=0
    K12=Gram(i1,i2);
else
    K12=K(x1,x2,si);
    Gram(i1,i2)=K12;   
    Gram(i2,i1)=K12;
end

k=K11+K22-2*K12;                 %parameter couldn't be 0 when it worked as divider.
 
if k==0                          %the possibility is rare.
    k=0.01;
end

delta=2*e/k;
a2new=a2old+(E1-E2)/k;
a1new=star-a2new;

if (a1new*a2new<0)
    if (abs(a2new)>=delta)||(abs(a1new)>=delta)
        a2new=a2new-sign(a2new)*delta;
    else
        judcon=abs(a2new)-abs(a1new);
        a2new=(sign(judcon)+1)/2*star;
    end   
end

L=max(star-C,-C);
H=min(C,star+C);

a2new=min(max(a2new,L),H);
a1new=star-a2new;

if times~=0 && L>=H                         %first time, we should let L=H=0;
    break;
end

if a1new>C
   a1new=C;
end
if a1new<-C
   a1new=-C;
end
%a1new=max(0,a1new);
%{
if abs(a2new-a2old)<(eps*(a2new+a2old-eps))          %if the difference is little, jump out of the loop;
    break;
end
%}
a(i1)=a1new;                           %now we could update the{a(i)}i=1toN
a(i2)=a2new;                           %update the {a(i)};
%now we start updating the bias;
%--------------------------------------------------------------------------
%bold=b;
%{
a1e=a1old-a1new;%-a1old;
%a1e_2=a1e*K12;
a2e=a2old-a2new;%-a2old;
%a2e_2=a2e*K12;

b1new=-E1+a1e+a2e*K12+b;
b2new=-E2+a1e*K12+a2e+b;
%bnew=(b1new+b2new)/2;

 
if abs(a1new)<C && a1new~=0                  %if a1new is in the bounds;
    b=b1new;
else
    if abs(a2new)<C && a2new~=0
        b=b2new;
    else
        %if (a1new==0||abs(a1new)==C)&&(a2new==0||abs(a2new)==C)&&(L~=H)
            b=(b1new+b2new)/2;
        %end
    end
end
%}
%--------------------------------------------------------------------------
%try to update the FX1 and FX2 by faster method;
%--------------------------------------------------------------------------
%{
if abs(a1new)<C && a1new~=0
    FX1=FX1+(a2new-a2old)*K12+a1new-a1old;
else
    FX1=0;
  for i=1:N
    if a(i)~=0
        if Gram(i,i1)~=0;
            Ki1=Gram(i,i1);
        else
            Ki1=K(X(i,:),x1,si);
            Gram(i,i1)=Ki1;
            Gram(i1,i)=Ki1;
        end
        FX1=FX1+a(i)*Ki1;
    end
  end
end   

if abs(a2new)<C && a2new~=0
    FX2=FX2+(a1new-a1old)*K12+a2new-a2old;
else
    FX2=0;
  for i=1:N
    if a(i)~=0
        if Gram(i,i2)~=0;
            Ki2=Gram(i,i2);
        else
            Ki2=K(X(i,:),x2,si);
            Gram(i,i2)=Ki2;
            Gram(i2,i)=Ki2;
        end
        FX2=FX2+a(i)*Ki2;
    end
  end
    
end
%}

FX1=0;
FX2=0;
for i=1:N
    if a(i)~=0
        if  Gram(i,i1)~=0;
            Ki1=Gram(i,i1);
        else
            Ki1=K(X(i,:),x1,si);
            Gram(i,i1)=Ki1;
            Gram(i1,i)=Ki1;
        end
        FX1=FX1+a(i)*Ki1;
        if  Gram(i,i2)~=0;
            Ki2=Gram(i,i2);
        else
            Ki2=K(X(i,:),x2,si);
            Gram(i,i2)=Ki2;
            Gram(i2,i)=Ki2;
        end
        FX2=FX2+a(i)*Ki2;
    end
end

%ve=a*Gram(:,i1);
%ve2=a*Gram(:,i2);


%}
if (a2new>0) && (a2new<C)%abs(a2new)<C && a2new~=0
    b=y2-FX2-e;
else
    if (a2new<0) && (a2new>-C)
        b=y2-FX2+e;
    else
        if (a1new>0) && (a1new<C)%abs(a1new)<C && a1new~=0                  %if a1new is in the bounds;
            b=y1-FX1-e;
        else
             if (a1new<0) && (a1new>-C)
                b=y1-FX1+e;
            else

        %if (a1new==0||abs(a1new)==C)&&(a2new==0||abs(a2new)==C)&&(L~=H)
                b=(y1-FX1+y2-FX2)/2;
        %end
            end
        end
    end
end

FX(i1)=FX1+b;                           %FX=SUM ai*yi*K(xi,xj)-b;
FX(i2)=FX2+b;
E(i1)=FX(i1)-y1;                        %store the E(i) into the E matrix;
E(i2)=FX(i2)-y2;

%this is the change in the dual objective; it does help a little;
delta_W(i1,i2)=a1new*(FX1-y1-0.5*a1new)-a1old*(FX1old-y1-0.5*a1old-bold)+a2new*(FX2-y2-0.5*a2new)-a2old*(FX2old-y2-0.5*a2old-bold)+K12*(a1old*a2old-a1new*a2new)+e*(abs(a1new)-abs(a1old)+abs(a2new)-abs(a2old));
%delta_W(i1,i2)=abs(delta_W(i1,i2));
%delta_W(i2,i1)=delta_W(i1,i2);%??????
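%%for the same two points i1 and i2 taken in a different order, the change in the dual function may differ, so the two entries are kept separate and not merged; the effect is moderate but acceptable;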

%{
E1=E(i1);
E2=E(i2);

b1new=E1+a1e+a2e*K12+bold;
b2new=E2+a1e*K12+a2e+bold;
b=(b1new+b2new)/2;
%}
%N, the number of training set;
 
%now we calculate the stop formula and judge whether the algorithm should
%stop.
end
    %---------------------------------------------------------------
    %one internal loops complete, times++ for external loop;
    %---------------------------------------------------------------
%C=max(a)+1;
%--------------------------------------------------------------------------
% I hope that calculation could be complete less than (n-1)*(n-2)/2 times;
%--------------------------------------------------------------------------
   
    if totaltimes>2500*N
        fprintf( 'how many knives?\n');
        fprintf( '-----------------------------------------\n' );
        break;
    end
   
times=times+1;                        %times++, i.e. outer loop ++, once the inner loop has finished;
%{
if times>=N                           %i.e. we don't want C(N-1,2) times calculation;
    times=1;
    ot=ot+1;
end
 
    if ot>2                           %how many times of C(N-1,2) operated;
        fprintf('out of times ...\n');
        fprintf('-----------------------------------------\n');
        break;
    end
  %}
 
 
%do we need to update E(i) and FX(i)? I think so. FX(i) is necessary, E(i)
%is not.
%now I hesitate.
 
i=1;
l=0;
j=0;
x_sv=zeros(NS,m0);
%collect all support vectors;
for i=1:N
    if a(i)~=0
        l=l+1;
        SV(l)=a(i);
        y_sv(l)=y(i);
        x_sv(l,:)=X(i,:);
        ptr(l)=i;                   %remember the pointer;
        if abs(a(i))~=C;                 %count the SVs inside the bounds (0,C)
            j=j+1;
        end
    end   
end
NS=l;                               %the number of support vectors.
NS_1=j;                             %number of support vectors on the tube wall;
%{
if NS<=(N/10)
    p=0;
    for j=1:NS;
        for i=1:NS;
            d_max=norm(x_sv(:,i)-x_sv(:,j));
            if d_max>p;
                p=d_max;
            end
        end
    end
    d_max=p;
    si=d_max^2/(2*NS);
end
%}
%{
lold=1;
while (a(lold)==C)
    lold=lold+1;
end
FV=0;
for l=1:NS
    FV=FV+SV(l)*y_sv(l)*K(x_sv(:,l),x_sv(:,lold));
end
b=FV-y_sv(lold);
lold=lold+1;
if lold>NS
    lold=1;
end
%}
%compute all SVM outputs;
FX=zeros(1,N);
for i=1:N;
   %if a(i)~=0;
    for l=1:NS
        if  Gram(ptr(l),i)~=0;
            Kli=Gram(ptr(l),i);
        else
            Kli=K(x_sv(l,:),X(i,:),si);
            Gram(ptr(l),i)=Kli;
            Gram(i,ptr(l))=Kli;
        end
        FX(i)=FX(i)+SV(l)*Kli;
    end
   %end
end
sv_x=x_sv;                         %this vector is for demonstration.
x_sv=zeros(NS,m0);                  %clear past data for preparation.
FX_old=FX;

%--------------------------------------------------------------------------
%update the bias; average all |a|<C and a~=0;
%--------------------------------------------------------------------------
i=0;
bias=0;
for l=1:NS;
    if SV(l)>0 && SV(l)<C
        i=i+1;
        bias=bias+y_sv(l)-FX(ptr(l))-e;
    end
    if SV(l)<0 && SV(l)>-C              %there used to be a big bug here; that was really careless!
        i=i+1;
        bias=bias+y_sv(l)-FX(ptr(l))+e;
    end
end

if i~=0;
    b=bias/i;
end

FX=FX+b*ones(1,N);
E=FX-y;

efxt=max(0,abs(E)-e);
%{
figure;
plot(FX);
hold on;
plot(y,'r.');
%}
%{
for i=1:N   
    efxt(i)=max(0,abs(E(i))-e);
end
%}
%{
    if a(i)>=0
    efxt(i)=efxt(i);
    else
        efxt(i)=-efxt(i);
    end
    %}
%-------------------------------------
%the new criteria: meet the KKT or not
%20140204 the effect is not good.
%think about the C-SVM method.
%-------------------------------------
%{
meetKKT=0;
i=1;
while i<=N
    jg=y(i)*FX(i);
    if a(i)>0
        if jg~=1;
            meetKKT=1;
        end
    else
        if a(i)==0
            if jg<1
                meetKKT=1;
            end      
        end
    end
    if (meetKKT)                  %if break the KKT rule, quit the loop;
                                  %i.e. need more loops.
        break;
    end
    i=i+1;
end
fprintf('Now we can see whether we could stop calculation or not...\n');
if (meetKKT)   
    fprintf('NOT now\n');
else
    fprintf('Complete.\n');
    break;
end
%}
%--------------------------------------------------------------
%Now we revise the stop criteria.
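%for a regression problem the stopping condition certainly has to change; the primal objective and the dual objective have both changed;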
%--------------------------------------------------------------
%{
%stopping condition 1;
AAK_1=0;                    %a(i)>0 or a(i)=-C;
AAK_2=0;                    %-C<a(i)<0;
%AAK_3=0;                    %a(i)=-C;
AAK=0;
alpha_e=0;                  %alpha(i)*e;
alpha_b_y=0;                %alpha(i)*(b-y(i));


for i=1:N
    if a(i)~=0
        for j=1:N
            if (a(j)~=0)
                if Gram(i,j)~=0
                    Kij=Gram(i,j);
                else
                    Kij=K(X(i,:),X(j,:),si);
                    Gram(i,j)=Kij;
                    Gram(j,i)=Kij;
                end
                if (a(i)>0)||(a(i)==-C)
                    AAK_1=AAK_1+a(i)*a(j)*Kij;
                else
                    if (a(i)<0) && (a(i)>-C)
                        AAK_1=AAK_1-a(i)*a(j)*Kij;
                    end
                end                    
            end
        end
    end
end
%{
for i=1:N
    if (a(i)<0) && (a(i)>-C)
        for j=1:N
            if (a(j)~=0)
                if Gram(i,j)~=0
                    Kij=Gram(i,j);
                else
                    Kij=K(X(i,:),X(j,:),si);
                    Gram(i,j)=Kij;
                    Gram(j,i)=Kij;
                end
                AAK_2=AAK_2+a(i)*a(j)*Kij;
            end
        end
    end
end
AAK_2=-AAK_2;                %settle the sign here first; at the end everything can simply be summed;
%}
%{
for i=1:N
    if a(i)==-C
        for j=1:N
            if (a(j)~=0)
                if Gram(i,j)~=0
                    Kij=Gram(i,j);
                else
                    Kij=K(X(i,:),X(j,:),si);
                    Gram(i,j)=Kij;
                    Gram(j,i)=Kij;
                end
                AAK_3=AAK_3+a(i)*a(j)*Kij;
            end
        end
    end
end
%}
for i=1:N
    if a(i)==-C
        alpha_e=alpha_e+C*e;
    else
        alpha_e=alpha_e+a(i)*e;
    end
end

for i=1:N
    if (a(i)>0) || (a(i)==-C)
        alpha_b_y=alpha_b_y+a(i)*(b-y(i));
    else
        if (a(i)<0) && (a(i)>-C)
            alpha_b_y=alpha_b_y+a(i)*(y(i)-b);
        end
    end
end


for i=1:N
    if (a(i)~=0)
        for j=1:N
            if (a(j)~=0)
                if Gram(i,j)~=0
                    Kij=Gram(i,j);
                else
                    Kij=K(X(i,:),X(j,:),si);
                    Gram(i,j)=Kij;
                    Gram(j,i)=Kij;
                end
                AAK=AAK+a(i)*a(j)*Kij;
            end
        end
    end
end
 
sumay=a*y';
sumabsa=sum(abs(a));
sume=C*sum(efxt);

Gap=AAK_1+alpha_e+alpha_b_y+sume;%+AAK_2
%}
%stopping condition 2;
%Gap=0;
%alpha_e=0;
Gap=abs(a)*(e-abs(FX-y))';%_old-y))';%??open question: should this use FX or FX_old? the duality gap needs to be looked up again;
                                    %if we use the FX_old, the result is not good, though the runtime is small;
                                    %so, we must use the FX; i.e. there is
                                    %no bias in the formula;
%{
for i=1:N
    Gap=Gap+abs(a(i))*(e-abs(FX(i)-y(i)));
end
%}
%{
for i=1:N
    if a(i)>0 || a(i)==-C
        Gap=Gap+a(i)*(FX(i)-y(i));
    else
        if a(i)<0 && a(i)>-C
            Gap=Gap+a(i)*(y(i)-FX(i));
        end    
    end
end
for i=1:N
    if a(i)==-C
        alpha_e=alpha_e+C*e;
    else
        alpha_e=alpha_e+a(i)*e;
    end
end
%}
%AAK=0;
%the optimized version of the code follows;
AAK_1=0;
for i=1:NS
    for j=1:NS
        if  Gram(ptr(i),ptr(j))~=0
            Kij=Gram(ptr(i),ptr(j));
        else
            Kij=K(sv_x(i,:),sv_x(j,:),si);
            Gram(ptr(i),ptr(j))=Kij;
            Gram(ptr(j),ptr(i))=Kij;
        end
        AAK_1=AAK_1+SV(i)*SV(j)*Kij;
    end
end

%{
for i=1:N
    if (a(i)~=0)
        for j=1:N
            if (a(j)~=0)
                if Gram(i,j)~=0
                    Kij=Gram(i,j);
                else
                    Kij=K(X(i,:),X(j,:),si);
                    Gram(i,j)=Kij;
                    Gram(j,i)=Kij;
                end
                    AAK=AAK+a(i)*a(j)*Kij;
            end
        end
    end
end
%}
sumay=a*y';
sumabsa=sum(abs(a));
sume=C*sum(efxt);
%Gap=Gap+alpha_e;
Gap_raw=Gap+sume;
Gap=abs(Gap_raw);
ratio=Gap/abs(Gap_raw+sumay-e*sumabsa-0.5*AAK_1+1);%(sume+0.5*AAK+1);%(Gap+sumay-e*sumabsa-0.5*AAK+1);%(sumay-e*sumabsa-0.5*AAK+1);
%ratio=(sume+sumay-e*sumabsa)/(sume+0.5*AAK+1);

%dynamically tune the threshold; in effect this slowly tightens the tube;

 if ratio<0.08 && e_f==0
      e=(1e-3)*8;
      e_f=0.5;
      eps=e/2;
 else
   
   if ratio<0.04 && e_f==0.5
      e=(1e-3)*5;
      e_f=1;
      eps=e/2;
   else
      if ratio<0.01 && e_f==1
         e=(1e-3)*2;
         e_f=2;
         eps=e/2;
      else
        if ratio<0.005 && e_f==2
            e=(1e-3);
            e_f=3;
            eps=e/2;
            %{
            else
        if ratio<0.005 && e_f==3
            e=(1e-3);
            e_f=4;
            eps=e/2;
        end
            %}
        end
      end
   end
 end
   
   if ratio<0.001 && e_f==3
       break;
   end
            
%}
%{
%stopping condition 3;
W=0;
AAK=0;
for i=1:N
    if (a(i)~=0)
        for j=1:N
            if (a(j)~=0)
                if Gram(i,j)~=0
                    Kij=Gram(i,j);
                else
                    Kij=K(X(i,:),X(j,:),si);
                    Gram(i,j)=Kij;
                    Gram(j,i)=Kij;
                end
                AAK=AAK+a(i)*a(j)*Kij;
            end
        end
    end
end
 
sumay=a*y';
sumabsa=sum(abs(a));
sume=C*sum(efxt);
%W=suma-0.5*AAYYK-e*sumabsa;
 
ratio=(AAK+e*sumabsa-sumay+sume)/(0.5*AAK+sume+1);
%}
%ratio=(0.5*AAK+1/N*sume/C+sumay-e*sumabsa)/(AAK+1/N*sume/C+1);
%((suma-2*W+sume)/(suma-W+sume+1));
 
%fprintf('Now we can see the RATIO...\n');
fprintf('Ratio = %f\n',ratio);
 
if (ratio<thre)
    break;                          %if ratio meet the stop threshold, loop
                                    %stops.
end
%{
if (ratio<ratio_old)
    sv_f=sv_x;
    NS_f=NS;
    SV_f=SV;
    ratio_old=ratio;
end
%}
%{
if times>NS_1                         %if no progress, let working set as all points;
    times=0;
end
%}
end    
%{
for i=1:N;
    %if a(i)~=0;
        for l=1:NS
            if Gram(ptr(l),i)~=0;
                Kli=Gram(ptr(l),i);
            else
                Kli=K(sv_x(l,:),X(i,:),si);
                Gram(ptr(l),i)=Kli;
                Gram(i,ptr(l))=Kli;
            end
            FX(i)=FX(i)+SV(l)*Kli;
        end
    %end
end
%sv_x=x_sv;                         %this vector is for demonstration.
%x_sv=zeros(NS,20);                  %clear past data for preparation.
FX=FX+b*ones(1,N);   
%}
hold on;
plot(FX,'b.');
%E_y=E./(10+y)*100;
%{
p=0;
for i=1:NS;
    for j=1:NS;
        di(j)=norm(sv_x(i,:)-sv_x(j,:));
        if di(j)>p;
            p=di(j);            
        end
    end        
end

si=p^2/(2*NS);
%}

for i=1:N;
    for j=1:m0;
        X(i,j)=A(i+a1-N-del_d,j);
        %X(i,j)=v(i+j-1+N);
    end   
end

for i=1:N;
    if i<N
        y(i)=A(i+a1-N-del_d+1,qwe);
    else     
        y(i)=0;
    end    
end
figure;
plot(y,'r.');
hold on;
%{
sv_x=sv_f;
NS=NS_f;
SV=SV_f;
%}
FX=zeros(1,N);
for i=1:N;
   %if a(i)~=0;
    for l=1:NS
        %{
        if Gram(ptr(l),i)~=0;
            Kli=Gram(ptr(l),i);
        else
            Kli=K(sv_x(l,:),X(i,:),si);
            Gram(ptr(l),i)=Kli;
            Gram(i,ptr(l))=Kli;
        end
        %}
        Kli=K(sv_x(l,:),X(i,:),si);
        FX(i)=FX(i)+SV(l)*Kli;
    end
   %end
end
%sv_x=x_sv;                         %this vector is for demonstration.
%x_sv=zeros(NS,20);                  %clear past data for preparation.
FX=FX+b*ones(1,N);
plot(FX,'b.');
%E_1=FX-y;
%E_y_1=E_1./(10+y)*100;
%figure;
%plot(E_y);

%figure;
%plot(E_y_1,'r');

fprintf('run time = %4.2f seconds\n',cputime-st);

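FX_pre=miu_a(qwe)+max_a(qwe)*FX(N);     %undo the normalization with the mean and scale of the target column qwe, since y is built from A(:,qwe);
%{
y_pre=miu_a(qwe)+max_a(qwe)*y(N)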
err=FX_pre-y_pre
ratio_e=err/y_pre
%}
HS_300(round)=FX_pre;
end
stock_index=median(HS_300)

%--------------------------------------------------------------------------
%2014/03/06
%calculating the error between the prediction and real value;
%
%
%--------------------------------------------------------------------------
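
The dated note above announces an error calculation but stops at its heading. A minimal sketch of that check, following the commented-out `y_pre`, `err` and `ratio_e` lines in the script: map the normalized values back to price units with the mean and scale of the target column, then take the relative error. The helper names `denorm` and `rel_err` are hypothetical, and the numbers are made up for illustration; they are not results from the script.

```matlab
% Hypothetical helpers (these names are not from the original script).
% Undo the script's normalization: value = mean + scale * normalized value.
denorm  = @(v, mu, scale) mu + scale .* v;
% Signed relative error of a prediction against the realized value.
rel_err = @(pred, actual) (pred - actual) ./ actual;

% Made-up numbers for illustration only, not results from the script.
mu = 2000; scale = 500;            % stand-ins for miu_a(qwe) and max_a(qwe)
fx_last = 0.10;                    % stand-in for FX(N), the normalized forecast
y_next  = 0.12;                    % stand-in for the normalized value realized later

FX_pre  = denorm(fx_last, mu, scale);   % 2050
y_pre   = denorm(y_next,  mu, scale);   % 2060
ratio_e = rel_err(FX_pre, y_pre);       % about -0.0049
fprintf('prediction %.2f, actual %.2f, relative error %.4f\n', FX_pre, y_pre, ratio_e);
```

Note that the script sets `y(N)` to 0 for the last window because the next day's value is not in the data yet, so the realized value has to come from a later download before this check means anything.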

The data was downloaded from a stock-trading application and saved as a .mat file. Here is the link:

Link: https://pan.baidu.com/s/1Ux3IF1CTrHeLdQt2Ivm_MA Password: fzvn
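
Besides the data file, the script calls a kernel function `K(x1,x2,si)` that does not appear in the post. A minimal sketch of what such a `K.m` might look like, assuming the Gaussian kernel written in the script's own comment, `K(x,z)=exp(-1/(2*sigma^2)*norm(x-z)^2)`, and assuming `si` is the variance sigma^2 (the script's comment on `si` says variance; with `si=1` either reading gives the same value). This is a guess at the missing helper, not the author's file.

```matlab
function k = K(x, z, si)
%K  Gaussian (RBF) kernel between two vectors: a sketch, not the author's file.
%   Assumption: si is the variance sigma^2, following the script's comment on
%   si; if si is meant as sigma itself, replace (2*si) with (2*si^2).
    d = x(:) - z(:);                  % difference as a column, for row or column inputs
    k = exp(-(d' * d) / (2 * si));    % exp(-||x-z||^2 / (2*sigma^2))
end
```

Saved as `K.m` next to the script, `K(x,x,si)` returns 1 for any `x`, which agrees with the script's `K11=1`, `K22=1` and its identity initial Gram matrix. It has to be a function file rather than a workspace variable, because the script starts with `clear all`.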

If you strike it rich, remember to come back and drop by!



 

