Andrew Ng's Machine Learning Course, Exercise 1: Linear Regression (Python implementation)

Machine Learning(Andrew) ex1-Linear Regression

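Study notes by 椰汁

I recently finished Andrew Ng's machine learning course and am now reviewing and organizing my course notes and assignments. I will keep posting updates.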

Linear regression with one variable

2.1 Plotting the Data

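First we read in the data and take a look at its contents.

The data is a txt file in which every line stores one example: the first column is the population of a city and the second column is the profit of a restaurant in that city.
The file is read line by line and each line is split to obtain the two values, which are stored in lists. ~~Since this is a first attempt, I did not use third-party libraries such as pandas or numpy; they will be introduced gradually later.~~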

    # Part 1: read the data from the txt file (plotted as a scatter plot below)
    population = []
    profit = []
    with open("ex1data1.txt", 'r') as f:  # the file is closed automatically
        for line in f:
            col1, col2 = line.strip().split(',')
            population.append(float(col1))
            profit.append(float(col2))
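
For comparison, the same parsing can be done in one call with numpy. This is only a sketch: the two rows in `sample` are made-up values in the same `population,profit` layout, and an in-memory buffer stands in for the real file so the snippet runs on its own. With the actual data you would pass the file name `"ex1data1.txt"` instead.

```python
import io

import numpy as np

# Made-up rows in the same "population,profit" layout (not the real data).
sample = io.StringIO("1.0,2.0\n3.5,4.25\n")

# np.loadtxt accepts a file name or a file-like object.
data = np.loadtxt(sample, delimiter=",")

population = data[:, 0]  # first column
profit = data[:, 1]      # second column
print(population, profit)
```

The result is a pair of ndarrays rather than Python lists, which is convenient once the computation is vectorized.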

Next we visualize the data. This requires the matplotlib plotting library, for which there is a decent tutorial.

    # import matplotlib like this
    import matplotlib.pyplot as plt

    plt.title("Scatter plot of training data")
    plt.xlabel("population of city")
    plt.ylabel("profit")
    plt.scatter(population, profit, marker='x')
    plt.show()

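xlabel and ylabel set the labels of the x and y axes of the current figure; scatter draws a scatter plot, where the first two arguments are the sequences (lists, tuples, etc.) of values along the two axes and marker chooses the shape drawn for each point; show displays the figure.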

2.2 Gradient Descent

$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^{2}$
$h_{\theta}(x)=\theta^{T}x=\theta_{0}+\theta_{1}x_{1}$

Implementing the cost function:

    m = len(population)  # number of training examples
    c = 0
    theta = [0, 0]
    for j in range(m):
        c += 1.0 / (2 * m) * pow(theta[0] + theta[1] * population[j] - profit[j], 2)

When theta is initialized to all zeros, the result is 32.072733877455676.
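
To make the formula easy to check by hand, here is a minimal sketch that wraps the same loop in a function and runs it on a tiny made-up data set. The name `compute_cost` and the three data points are assumptions for illustration only; they are not part of the assignment.

```python
def compute_cost(xs, ys, theta):
    """Return J(theta) = 1/(2m) * sum((theta0 + theta1 * x - y)^2)."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        total += (theta[0] + theta[1] * x - y) ** 2
    return total / (2 * m)


# Made-up points lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

cost_zero = compute_cost(xs, ys, [0.0, 0.0])   # (4 + 16 + 36) / (2 * 3)
cost_exact = compute_cost(xs, ys, [0.0, 2.0])  # perfect fit, so the cost is 0
print(cost_zero, cost_exact)
```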

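$\theta_{j}=\theta_{j}-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x^{(i)}_{j}\quad\text{(simultaneously update }\theta_{j}\text{ for all }j\text{)}$
This is my first time coding gradient descent, so the implementation is not vectorized; it simply iterates with for loops.
Before coding gradient descent, we first need to be clear about the parameters: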

    alpha = 0.01  # learning rate
    iterations = 1500  # number of gradient descent iterations
    theta = [0, 0]  # initial theta

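Next comes the gradient descent implementation itself. The first loop runs over the iterations. ~~The second loop should iterate over theta, but the data here is two-dimensional, so there are only two thetas and that loop is omitted.~~ The innermost loop iterates over the whole data set.
Note the point stressed repeatedly in the course: all parameters must be updated simultaneously. The newly computed values are held in temporary variables, and the original theta is overwritten only after every component has been computed.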

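    # Part 2: gradient descent, recording the cost after every iteration
    m = len(population)
    alpha = 0.01
    iterations = 1500
    theta = [0, 0]
    t = []     # iteration indices
    cost = []  # cost J(theta) after each iteration
    for i in range(iterations):
        temp0 = theta[0]
        temp1 = theta[1]
        for j in range(m):
            error = theta[0] + theta[1] * population[j] - profit[j]
            temp0 -= (alpha / m) * error
            temp1 -= (alpha / m) * error * population[j]
        # simultaneous update: theta is overwritten only after both values are ready
        theta[0] = temp0
        theta[1] = temp1
        t.append(i)
        cost.append(sum((theta[0] + theta[1] * px - pr) ** 2
                        for px, pr in zip(population, profit)) / (2 * m))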
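
The two per-example updates in the inner loop can also be written as a single matrix expression. The following is a rough vectorized sketch, not the code used in this post: the function name `gradient_descent_1d`, the synthetic data, and the values of `alpha` and `iterations` are all assumptions chosen so that the example converges quickly.

```python
import numpy as np


def gradient_descent_1d(x, y, alpha, iterations):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(x)
    X = np.column_stack((np.ones(m), x))  # prepend the x0 = 1 column
    theta = np.zeros(2)
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / m
        # Both components come from the old theta, so the update is simultaneous.
        theta = theta - alpha * gradient
    return theta


# Synthetic data generated from y = 1 + 2x (no noise).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

theta = gradient_descent_1d(x, y, alpha=0.1, iterations=2000)
print(theta)  # should be close to [1, 2]
```

Because the whole gradient is computed from the old `theta` before anything is assigned, the simultaneous-update requirement is satisfied without temporary variables.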

2.3 Debugging

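Now we draw the fitted line. First pick two x values near the two ends of the data range, then compute the corresponding y values with the fitted theta, and finally use plot to draw the two points and the line joining them.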

    # Part 3: plot the regression line and, further below, the cost curve
    x = [5.0, 22.5]
    y = [5.0 * theta[1] + theta[0], 22.5 * theta[1] + theta[0]]
    plt.plot(x, y, color="red")
    plt.title("Linear Regression")
    plt.xlabel("population of city")
    plt.ylabel("profit")
    plt.scatter(population, profit, marker='x')
    plt.show()

2.4 Visualizing J(θ)

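To judge the quality of the fit, we can plot how the cost changes over the course of gradient descent. Unlike the assignment, I did not plot the cost against theta; I simply plot the cost over the iterations, because I find this plot just as helpful for understanding.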

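    # t holds the iteration indices and cost the cost at each iteration; both must be
    # computed and recorded inside the gradient descent loop.
    plt.title("Visualizing J(θ)")
    plt.xlabel("iterations")
    plt.ylabel("cost")
    plt.plot(t, cost, color="red")
    plt.show()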

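As the plot shows, the cost decreases steadily and then levels off, so our parameter choices are reasonable. In particular, the learning rate must not be too large, otherwise the updates move further and further away from the optimum.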
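
The effect of the learning rate can be seen on a toy problem. This is a hedged sketch with made-up data; the function name `cost_history` and the two `alpha` values are assumptions picked for this particular data set, where a rate of 0.1 converges and 0.3 is already too large.

```python
import numpy as np


def cost_history(x, y, alpha, iterations):
    """Run batch gradient descent and return the cost after each iteration."""
    m = len(x)
    X = np.column_stack((np.ones(m), x))
    theta = np.zeros(2)
    history = []
    for _ in range(iterations):
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
        residual = X @ theta - y
        history.append(residual @ residual / (2 * m))
    return history


# Synthetic data generated from y = 1 + 2x (no noise).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

small = cost_history(x, y, alpha=0.1, iterations=20)  # cost shrinks
large = cost_history(x, y, alpha=0.3, iterations=20)  # cost blows up
print(small[0], small[-1])
print(large[0], large[-1])
```

Plotting `small` and `large` against the iteration index gives the same kind of debugging curve as above, one falling and one rising.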

Linear regression with multiple variables

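First we read in the data. This time the features are the house size (first column) and the number of bedrooms (second column); the third column is the price.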

    # read in the data
    house_size = []
    bedroom_number = []
    house_price = []
    with open("ex1data2.txt", 'r') as f:  # the file is closed automatically
        for line in f:
            col1, col2, col3 = (float(v) for v in line.strip().split(","))
            house_size.append(col1)
            bedroom_number.append(col2)
            house_price.append(col3)

3.1 Feature Normalization

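Because there are several variables, they need to be normalized so that each feature's influence on the result does not depend on its order of magnitude.
There are many normalization methods; this blog post covers them.
Here I use mean normalization (a variant of min-max scaling), which maps the data into [-1, 1]:
$x_{\text{normalized}}=\frac{x-\text{mean}}{\text{ptp}},\quad \text{mean}=\text{average},\quad \text{ptp}=\max-\min$
From here on I use numpy (the tutorial is here). The implementation is below. Since the values of y are also very large, I normalize y as well.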

    # feature normalization
    import numpy as np

    x1 = np.array(house_size).reshape(-1, 1)
    x2 = np.array(bedroom_number).reshape(-1, 1)
    y = np.array(house_price).reshape(-1, 1)
    data = np.concatenate((x1, x2, y), axis=1)  # one ndarray makes normalization easier

    mean = np.mean(data, axis=0)  # mean of each column
    ptp = np.ptp(data, axis=0)  # max minus min of each column
    nor_data = (data - mean) / ptp  # normalize
    X = np.insert(nor_data[..., :2], 0, 1, axis=1)  # add the x0 = 1 column
    y = nor_data[..., -1]
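
Because y is normalized too, anything the model predicts is in normalized units and has to be mapped back with the same `mean` and `ptp`. The sketch below shows the round trip under that assumption; the helper names `mean_normalize` and `denormalize` and the three rows of data are made up for illustration.

```python
import numpy as np


def mean_normalize(data):
    """Column-wise (x - mean) / (max - min); also return the statistics used."""
    mean = np.mean(data, axis=0)
    ptp = np.ptp(data, axis=0)
    return (data - mean) / ptp, mean, ptp


def denormalize(values, mean, ptp):
    """Invert mean_normalize using the stored statistics."""
    return values * ptp + mean


# Made-up rows: size, bedrooms, price.
data = np.array([[1000.0, 2.0, 200000.0],
                 [2000.0, 3.0, 300000.0],
                 [3000.0, 4.0, 500000.0]])

nor_data, mean, ptp = mean_normalize(data)
restored = denormalize(nor_data, mean, ptp)
print(nor_data)
```

To turn a normalized price prediction back into a price, apply `denormalize` with the last entries of `mean` and `ptp` only.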

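A more commonly used method is zero-mean normalization (z-score standardization):
$x_{\text{normalized}}=\frac{x-\mu}{\sigma}$, where $\mu$ is the mean and $\sigma$ is the standard deviation.
Implementation: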

    mean = np.mean(data, axis=0)
    std = np.std(data, axis=0, ddof=1)  # the divisor is m - ddof; ddof defaults to 0
    nor_data = (data - mean) / std
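
The effect of `ddof` is easy to verify on a small array. This is just an illustrative sketch with made-up numbers whose mean is 5 and whose squared deviations sum to 32.

```python
import numpy as np

values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

std_population = np.std(values)      # ddof=0: sqrt(32 / 8)
std_sample = np.std(values, ddof=1)  # ddof=1: sqrt(32 / 7)
print(std_population, std_sample)
```

For feature scaling either choice works, as long as the same statistics are reused when new inputs are normalized.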

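I did not adopt this method because it makes the first parameter come out as 0 when solving with the normal equation, and I do not yet understand why.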
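
One likely explanation for the zero first parameter: when every feature and y are centered on their means, the least-squares intercept is exactly zero, whichever scale is used in the denominator. The sketch below checks this numerically on synthetic data; the random data, its coefficients, and the seed are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with a clearly non-zero intercept before normalization.
features = rng.normal(size=(50, 2))
target = 3.0 + features @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=50)

# z-score normalization of the features and the target together.
data = np.column_stack((features, target))
nor = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

X = np.column_stack((np.ones(len(nor)), nor[:, :2]))  # add the x0 = 1 column
y = nor[:, 2]

theta = np.linalg.pinv(X.T @ X) @ X.T @ y  # normal equation
print(theta)  # the first entry is zero up to rounding error
```

The same thing happens with the mean normalization used earlier, since it also subtracts the mean from y.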

3.2 Gradient Descent

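Here we use a fully vectorized implementation. ndarray.dot() computes the dot (inner) product, which for 2-D arrays is matrix multiplication, and can be replaced by the @ operator; ndarray.T is the transpose of the array.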

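    def gradient_descent(X, theta, y, alpha, iterations):
        m = X.shape[0]
        c = []  # cost recorded after each iteration
        for i in range(iterations):
            # in-place update: theta must be a float ndarray, and the caller's array changes too
            theta -= (alpha / m) * X.T.dot(X.dot(theta) - y)
            c.append(cost(X, theta, y))  # cost() is a helper that returns J(theta)
        return theta, c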
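
The `cost` helper called inside `gradient_descent` is not listed in this post. A minimal vectorized version, assuming it computes the same J(θ) as in the single-variable part, might look like the following; the small `X` and `y` are made-up values used only to check the arithmetic.

```python
import numpy as np


def cost(X, theta, y):
    """Vectorized J(theta) = 1/(2m) * ||X @ theta - y||^2 (assumed definition)."""
    m = X.shape[0]
    residual = X.dot(theta) - y
    return residual.dot(residual) / (2 * m)


# Made-up data on the line y = 2x, with the x0 = 1 column already added.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

j_zero = cost(X, np.zeros(2), y)           # (4 + 16 + 36) / 6
j_fit = cost(X, np.array([0.0, 2.0]), y)   # perfect fit
print(j_zero, j_fit)
```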

Each gradient descent step here also computes the current cost, which is used for debugging. We then plot the cost curve.

    # visualize the descent
    plt.title("Visualizing J(θ)")
    plt.xlabel("iterations")
    plt.ylabel("cost")
    plt.plot(range(iterations), c, color="red")
    plt.show()

With alpha=0.1 and iterations=5000, the result is:
theta = [-2.47585640e-17 9.52353893e-01 -6.58737388e-02]

3.3 Normal Equations

The normal equation computes theta in a single step. It follows from a matrix derivation:
$\theta=(X^TX)^{-1}X^Ty$
With numpy it is easy to implement:

    def normal_equation(X, y):
        return np.linalg.pinv(X.T.dot(X)).dot(X.T).dot(y)

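numpy.linalg.pinv() (the pseudo-inverse) is recommended for the inversion, since it still returns a result when $X^TX$ is singular.
This way of solving for theta is only practical for relatively small problems, in particular when the number of features is small, because inverting the matrix is computationally expensive. Gradient descent is more widely applicable.
Comparing the two, the theta values they produce differ slightly.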
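
The small difference is expected, because gradient descent only approaches the least-squares solution as the number of iterations grows. The sketch below illustrates this on synthetic data; the data, the seed, `alpha=0.5`, and the iteration counts are assumptions chosen for this example and are not the settings used above.

```python
import numpy as np


def gradient_descent(X, theta, y, alpha, iterations):
    """Plain batch gradient descent (returns a new array, no cost tracking)."""
    m = X.shape[0]
    for _ in range(iterations):
        theta = theta - (alpha / m) * X.T.dot(X.dot(theta) - y)
    return theta


def normal_equation(X, y):
    return np.linalg.pinv(X.T.dot(X)).dot(X.T).dot(y)


rng = np.random.default_rng(0)
features = rng.uniform(-0.5, 0.5, size=(40, 2))  # already on a normalized scale
X = np.column_stack((np.ones(40), features))
y = X @ np.array([0.1, 0.9, -0.3]) + rng.normal(scale=0.01, size=40)

theta_exact = normal_equation(X, y)
theta_few = gradient_descent(X, np.zeros(3), y, alpha=0.5, iterations=50)
theta_many = gradient_descent(X, np.zeros(3), y, alpha=0.5, iterations=5000)

gap_few = np.max(np.abs(theta_few - theta_exact))
gap_many = np.max(np.abs(theta_many - theta_exact))
print(gap_few, gap_many)  # the gap shrinks as the iterations increase
```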

The complete code will be kept in sync on my GitHub.

Corrections are welcome.
I referred to Cowry while studying; many thanks.
