ML in Practice: Adaptive Linear Neurons (Adaline)

Original

2018-08-28 04:17

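Principle

The earlier post "The First Step Is the Hardest" introduced a basic single-layer neural network, the perceptron. Adaline is a follow-up to that most basic version.
The improvements are:
1. Bernard Widrow introduced a cost function.
2. The weights are updated based on a linear activation function, rather than the discrete unit step function used in the perceptron.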

Cost Function

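Sum of Squared Errors (SSE)
$$J(w) = \frac{1}{2} \sum_i \left(y^{(i)} - \phi(z^{(i)})\right)^2$$

where $\phi(z^{(i)})$ is the output of the activation function, defined here simply as the identity:
$$\phi(w^T x) = w^T x$$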

Ideally, the cost function is U-shaped (convex), so gradient descent can be used to find the weights with the minimum cost.
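To make the cost concrete, here is a minimal sketch that evaluates the SSE for a given weight vector. The helper name `sse_cost` and the toy arrays are assumptions made for illustration; the weight layout (bias in `w[0]`) follows the implementation later in this post.

```python
import numpy as np

def sse_cost(w, X, y):
    """Sum of squared errors J(w) for weights w = [bias, w_1, ..., w_m]."""
    z = X.dot(w[1:]) + w[0]   # net input, one value per sample
    phi = z                   # identity (linear) activation
    return 0.5 * np.sum((y - phi) ** 2)

# Hypothetical toy data: 3 samples, 2 features, labels in {-1, 1}
X_toy = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
y_toy = np.array([1.0, -1.0, 1.0])
w_zero = np.zeros(3)

# With all-zero weights every output is 0, so J = (1 + 1 + 1) / 2
print(sse_cost(w_zero, X_toy, y_toy))  # 1.5
```

Because the activation is linear, the cost is a smooth quadratic function of the weights, which is what makes the gradient well defined everywhere.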

Gradient Descent

η is the step size (learning rate), and the partial derivative of the cost gives the direction of the step.
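For reference, the update rule can be written out by differentiating the SSE cost with respect to each weight. This is a short derivation sketch; the index notation (superscript (i) for samples, subscript j for features, x_0 = 1 for the bias) is an assumption chosen to match the code in the implementation section.

```latex
\frac{\partial J}{\partial w_j}
  = \frac{\partial}{\partial w_j} \frac{1}{2} \sum_i \left(y^{(i)} - \phi(z^{(i)})\right)^2
  = -\sum_i \left(y^{(i)} - \phi(z^{(i)})\right) x_j^{(i)}

\Delta w_j = -\eta \frac{\partial J}{\partial w_j}
  = \eta \sum_i \left(y^{(i)} - \phi(z^{(i)})\right) x_j^{(i)},
\qquad w_j := w_j + \Delta w_j
```

The step is taken against the gradient, which is why the update adds η times the summed errors weighted by the feature values. Unlike the perceptron, all samples contribute to a single update per epoch.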

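Feature Scaling

When η is too large, gradient descent overshoots the minimum. One fix is to reduce η; another is feature scaling. This experiment uses standardization to scale the feature values.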
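The effect can be seen on a small example. The sketch below runs the same batch gradient descent on raw and on standardized features with an identical learning rate. The function `run_gd` and the toy data are hypothetical and only meant to illustrate the behaviour, not to reproduce the experiment in this post.

```python
import numpy as np

def run_gd(X, y, eta, n_iter):
    """Batch gradient descent on the SSE cost; returns the cost per epoch."""
    w = np.zeros(1 + X.shape[1])          # w[0] is the bias
    costs = []
    for _ in range(n_iter):
        errors = y - (X.dot(w[1:]) + w[0])
        w[1:] += eta * X.T.dot(errors)
        w[0] += eta * errors.sum()
        costs.append(0.5 * np.sum(errors ** 2))
    return costs

# Hypothetical toy data with features on a large scale
X_raw = np.array([[10.0, 200.0], [20.0, 150.0], [30.0, 400.0], [40.0, 350.0]])
y_toy = np.array([-1.0, -1.0, 1.0, 1.0])

# Standardize each column: zero mean, unit standard deviation
X_std = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

costs_raw = run_gd(X_raw, y_toy, eta=0.01, n_iter=10)
costs_std = run_gd(X_std, y_toy, eta=0.01, n_iter=10)

print(costs_raw[-1] > costs_raw[0])  # True: the cost blows up (overshoot)
print(costs_std[-1] < costs_std[0])  # True: the cost goes down
```

With the raw features the same η takes steps that are far too long along the large-valued feature, while after standardization all features have a comparable scale and the step size is safe.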

Implementation

import numpy as np


class AdalineGD(object):
    """ADAptive LInear NEuron classifier.

    Parameters
    ----------
    eta : float
        Learning rate (between 0.0 and 1.0).
    n_iter : int
        Number of passes over the training dataset.

    Attributes
    ----------
    w_ : 1d-array
        Weights after fitting; w_[0] is the bias.
    cost_ : list
        Sum-of-squared-errors cost in every epoch.
    """

    def __init__(self, eta=0.01, n_iter=50):
        self.eta = eta
        self.n_iter = n_iter

    def fit(self, X, y):
        """Fit training data.

        Parameters
        ----------
        X : array-like, shape = [n_samples, n_features]
            Training vectors, where n_samples is the number of samples and
            n_features is the number of features.
        y : array-like, shape = [n_samples]
            Target values.

        Returns
        -------
        self : object
        """
        self.w_ = np.zeros(1 + X.shape[1])
        self.cost_ = []
        for i in range(self.n_iter):
            output = self.net_input(X)
            errors = (y - output)
            # X.T.dot(errors) is a matrix-vector product; the result is a
            # vector with one entry per feature
            self.w_[1:] += self.eta * X.T.dot(errors)
            self.w_[0] += self.eta * errors.sum()
            cost = (errors**2).sum() / 2.0
            self.cost_.append(cost)
        return self

    def net_input(self, X):
        """Calculate net input"""
        # np.dot(X, w) gives one value per sample (a scalar for a single sample)
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def activation(self, X):
        """Compute linear activation"""
        return self.net_input(X)

    def predict(self, X):
        """Return class label after unit step"""
        return np.where(self.activation(X) >= 0.0, 1, -1)

The key part is the weight update:

self.w_[1:] += self.eta * X.T.dot(errors)
self.w_[0] += self.eta * errors.sum()

and the newly added activation function:

def activation(self, X):
    """Compute linear activation"""
    return self.net_input(X)

Testing

>>> import matplotlib.pyplot as plt
>>> # X (feature matrix) and y (class labels) are assumed to be loaded already
>>> fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(8, 4))
>>> ada1 = AdalineGD(eta=0.01, n_iter=50).fit(X, y)
>>> ax[0].plot(range(1, len(ada1.cost_) + 1),
...            np.log10(ada1.cost_), marker='o')
>>> ax[0].set_xlabel('Epochs')
>>> ax[0].set_ylabel('log(Sum-squared-error)')
>>> ax[0].set_title('Adaline - Learning rate 0.01')
>>> ada2 = AdalineGD(eta=0.0001, n_iter=50).fit(X, y)
>>> ax[1].plot(range(1, len(ada2.cost_) + 1),
...            ada2.cost_, marker='o')
>>> ax[1].set_xlabel('Epochs')
>>> ax[1].set_ylabel('Sum-squared-error')
>>> ax[1].set_title('Adaline - Learning rate 0.0001')
>>> plt.show()

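In the left plot, the learning rate is too large, so gradient descent overshoots the minimum and the cost never comes down.
Feature scaling, which here means standardizing the features, addresses this:

# Subtract the mean and divide by the standard deviation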
>>> X_std = np.copy(X)
>>> X_std[:,0] = (X[:,0] - X[:,0].mean()) / X[:,0].std()
>>> X_std[:,1] = (X[:,1] - X[:,1].mean()) / X[:,1].std()
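The column-by-column code above can also be written for all columns at once. The following is a minimal sketch, with the helper name `standardize` and the demo array being assumptions for illustration; it also checks that the result matches the per-column version.

```python
import numpy as np

def standardize(X):
    """Return a float copy of X with each column at mean 0 and std 1."""
    X = np.asarray(X, dtype=float)
    # mean and std are taken per column (axis=0) and broadcast over the rows
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical demo data: 3 samples, 2 features on different scales
X_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_demo_std = standardize(X_demo)

# Same result as standardizing each column separately
X_cols = np.copy(X_demo)
X_cols[:, 0] = (X_demo[:, 0] - X_demo[:, 0].mean()) / X_demo[:, 0].std()
X_cols[:, 1] = (X_demo[:, 1] - X_demo[:, 1].mean()) / X_demo[:, 1].std()

print(np.allclose(X_demo_std, X_cols))             # True
print(np.allclose(X_demo_std.mean(axis=0), 0.0))   # True
print(np.allclose(X_demo_std.std(axis=0), 1.0))    # True
```

One thing to watch with the per-column form: if the original array has an integer dtype, assigning the standardized values into a copy of it truncates them to integers, so converting to float first (as the sketch does) is the safer choice.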

Then pass X_std to the model's fit function instead of X:

ada.fit(X_std, y)
