Logistic regression case study: predicting whether a student is admitted (with an animated gradient descent demo)

Original

2020-07-03 17:11

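I have been studying machine learning for a while, but most of my effort went into watching videos and reading books, and I neglected how important it is to actually write code. Recently I started paying off that debt bit by bit, and today I implemented an application of logistic regression. As the saying goes, the palest ink beats the best memory, so it is worth writing down.
First, a logistic regression problem is really a classification problem: the label of each sample is either yes or no (that is, 1 or 0).
So when we choose a hypothesis function, we want its output to lie between 0 and 1, with the predicted value representing P(y=1), the probability that the label is 1. This is where the sigmoid function comes in handy: it is a nonlinear mapping whose result always falls in the interval (0,1).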
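A minimal sketch of the sigmoid mapping, assuming NumPy. The function name `sigmoid` and the sample inputs are illustrative and do not come from the original code, which computes the same expression inline as `h_x`.

```python
import numpy as np

def sigmoid(z):
    """Map a real number (or an array of them) into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs: a strongly negative, a zero and a strongly positive score
scores = np.array([-5.0, 0.0, 5.0])
probs = sigmoid(scores)
print(probs)
```

A score of 0 maps to exactly 0.5, which is why the decision boundary drawn later in the post is the line where the linear score W·x equals 0.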
For a detailed introduction to the logistic regression (LR) model, see this blog post, which covers it fairly thoroughly: https://blog.csdn.net/weixin_39910711/article/details/81607386
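For reference, the model and the loss that the code in this post works with can be written as below. This is reconstructed from the `h_x`, `loss` and `dW` lines of the code, with α standing for the `rate` variable and m for the number of samples; the notation itself is an assumption and may differ from the linked post.

```latex
h_W(x) = \frac{1}{1 + e^{-W^{\top} x}}

J(W) = -\frac{1}{m} \sum_{i=1}^{m}
       \left[ y^{(i)} \log h_W\!\left(x^{(i)}\right)
            + \left(1 - y^{(i)}\right) \log\!\left(1 - h_W\!\left(x^{(i)}\right)\right) \right]

\nabla_W J(W) = \frac{1}{m} X^{\top} \left( h_W(X) - Y \right),
\qquad
W \leftarrow W - \alpha \, \nabla_W J(W)
```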

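Minimizing the logistic regression loss is an unconstrained optimization problem, and there are many ways to solve it, including gradient descent, stochastic gradient descent, Newton's method and the conjugate gradient method. You can look up the details of each method yourself. Here I work through a homework exercise: predicting whether a student is admitted to a school from the scores of two exams. The dataset can be downloaded here. The code is written in Python and run in Jupyter Notebook. The result is shown as an animated plot, which makes it easy to follow the algorithm iteration by iteration until it converges. The full code is as follows: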

%matplotlib qt5
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
data=pd.read_csv('ex4Data/ex4x.dat', names=['exam1', 'exam2'],sep=r'\s+')
label=pd.read_csv('ex4Data/ex4y.dat',names=['admitted'],sep=r'\s+')
#label.head()   show the first five rows of the dataset
data.insert(0,'ones',1)
is_ipython = 'qt5' in matplotlib.get_backend()
if is_ipython:
    from IPython import display
plt.ion()
dataArr=np.array(data)
labelArr=np.array(label)
n=np.shape(dataArr)[0]
xcord0=[] 
ycord0=[]
xcord1=[]
ycord1=[]
# split the samples by label so the two classes can be drawn with different markers
for i in range(n):
    if int(labelArr[i,0])==1:
        xcord1.append(dataArr[i,1])
        ycord1.append(dataArr[i,2])
    else:
        xcord0.append(dataArr[i,1])
        ycord0.append(dataArr[i,2])
fig = plt.figure() 
fig.set_size_inches(12,6)
ax = fig.add_subplot(121)
ax.scatter(xcord0,ycord0,s=30,c='red',marker='+', label='Not Admitted')
ax.scatter(xcord1,ycord1,s=30,c='green',label='Admitted') 
ax.legend(loc=1)     # place the legend in the upper right corner
x = np.linspace(1, 100, 2)
def plotFit(W):
    ax.set_xlim(1, 100)
    ax.set_ylim(1, 100)
    ax.set_xlabel('Exam 1 Score')
    ax.set_ylabel('Exam 2 Score')
    ax.set_title('Decision Boundary')
    y=(-W[0]-W[1]*x)/W[2]
    lines = ax.plot(x,y,'b')
    plt.pause(0.1)
    lines[0].remove()   # erase this frame's boundary line before the next one is drawn
    
def plotloss(ilist,loss_list):
    # reuse the loss axes once it exists; calling add_subplot(122) every time can stack a new axes on each call
    ax = fig.axes[1] if len(fig.axes)>1 else fig.add_subplot(122)
    ax.set_xlabel('iteration')
    ax.set_ylabel('loss')
    ax.set_title('loss trend')
    ax.plot(np.array(ilist),np.array(loss_list),'r.-')
    plt.pause(0.1)  # pause a bit so that plots are updated
    if is_ipython:
        display.clear_output(wait=True)
        display.display(plt.gcf())
def main():
    X=np.array(data)  
    Y=np.array(label).reshape(-1,1)  # reshape into a single column, whatever the number of rows
   # W=np.random.randn(3,1).reshape((-1,1))
    W=np.zeros((3,1))
    m=len(X)
    loss_list=[]
    ilist=[]
    rate=0.001
    for i in range(200):
        z=np.dot(X,W)   
        h_x=1/(1+np.exp(-z))
        loss=(-Y)*np.log(h_x)-(1-Y)*np.log(1-h_x)
        loss=np.sum(loss)/m
        dW=X.T.dot(h_x-Y)/m     # gradient of the loss with respect to W
        W=W-rate*dW
        #print("after iteration %i: %f"%(i,loss))
        if(i%5==0):
            plotFit(W)
            loss_list.append(loss)
            ilist.append(i)
            plotloss(ilist,loss_list)
    y=(-W[0]-W[1]*x)/W[2]
    lines = ax.plot(x,y,'b')
    print('Optimal W:')
    print(W)
    print(loss)
if __name__ == '__main__':
    main()

The final result after convergence is shown in the figure below.
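Once W has been learned, it can be used for the actual goal of the exercise: estimating a student's chance of admission. Below is a minimal plot-free sketch of the same gradient descent loop followed by a prediction step. The helper names `fit_logistic` and `predict_proba`, the eight sample rows and the demo learning rate of 1e-4 are all assumptions made for illustration; they are not the ex4 dataset or its results.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, Y, rate=0.001, n_iter=200):
    """Batch gradient descent on the mean cross-entropy loss, without plotting."""
    m = X.shape[0]
    W = np.zeros((X.shape[1], 1))
    losses = []
    for _ in range(n_iter):
        h = sigmoid(X @ W)
        losses.append(float(np.mean(-Y * np.log(h) - (1 - Y) * np.log(1 - h))))
        W = W - rate * (X.T @ (h - Y)) / m   # same update rule as in main()
    return W, losses

def predict_proba(W, exam1, exam2):
    """Estimated P(admitted = 1) for one student; the leading 1.0 is the bias term."""
    x = np.array([1.0, exam1, exam2])
    return float(sigmoid(x @ W)[0])

# Made-up scores and labels, for illustration only (not the ex4 data)
scores = np.array([[30, 40], [35, 45], [40, 30], [45, 50],
                   [60, 65], [65, 70], [70, 60], [75, 80]], dtype=float)
admitted = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float).reshape(-1, 1)
X = np.hstack([np.ones((len(scores), 1)), scores])   # prepend the column of ones

# A small rate is assumed here so the loss decreases steadily on unscaled scores
W, losses = fit_logistic(X, admitted, rate=1e-4, n_iter=200)
p = predict_proba(W, 55, 60)
print('loss: %f -> %f' % (losses[0], losses[-1]))
print('P(admitted) for scores (55, 60): %f' % p)
```

A student would be classified as admitted when the returned probability is at least 0.5, which corresponds to lying on the positive side of the decision boundary in the left-hand plot.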
