Unsupervised Learning: Clustering

Original

laboirousbee

2019-05-01 07:37

K-Means:

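K-Means looks for the center of each cluster; each center represents the region of the data assigned to it.
It alternates between two steps. Assign: each point is assigned to its nearest center. Optimize: each center is moved to the mean of the points assigned to it. These two steps are repeated until nothing changes any more.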
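
The two steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not how scikit-learn implements it: the function name `kmeans`, the random choice of starting centers, and the `max_iter` and `seed` parameters are assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain K-Means on an (n_samples, n_features) array (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points picked at random (an assumption of this sketch).
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Assign: label each point with the index of its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Optimize: move each center to the mean of its assigned points
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
labels, centers = kmeans(X, k=2)
print(labels)
print(centers)
```

Which grouping comes out depends on the starting centers, so the printed labels can differ from one seed to another.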

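Limitation: local minima. K-Means depends heavily on where the initial cluster centers are placed, so different starting positions can converge to different results.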
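
A small toy case makes the dependence on initialization concrete. The data (four corners of a wide rectangle), the helper name `lloyd`, and both sets of starting centers are made up for this illustration; the helper does not handle empty clusters.

```python
import numpy as np

def lloyd(X, centers, n_iter=20):
    """Run assign/optimize steps from the given starting centers (toy helper)."""
    centers = np.asarray(centers, dtype=float)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0)
                            for j in range(len(centers))])
    # Final assignment and within-cluster sum of squared distances.
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    inertia = d[np.arange(len(X)), labels].sum()
    return labels, centers, inertia

# Four corners of a rectangle that is 4 wide and 1 tall.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])

# Start near the left and right pairs: converges to the left/right split.
_, _, good = lloyd(X, [[0.0, 0.5], [4.0, 0.5]])
# Start between the pairs: gets stuck in the top/bottom split.
_, _, bad = lloyd(X, [[2.0, 0.0], [2.0, 1.0]])
print(good, bad)  # 1.0 16.0
```

Both runs have converged, but the second one sits in a local minimum with a much larger within-cluster sum of squares. A common remedy is to run K-Means from several starting positions and keep the best result.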

>>> from sklearn.cluster import KMeans
>>> import numpy as np
>>> X = np.array([[1, 2], [1, 4], [1, 0],
...               [10, 2], [10, 4], [10, 0]])
>>> kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
>>> kmeans.labels_
array([1, 1, 1, 0, 0, 0], dtype=int32)
>>> kmeans.predict([[0, 0], [12, 3]])
array([1, 0], dtype=int32)
>>> kmeans.cluster_centers_
array([[10.,  2.],
       [ 1.,  2.]])

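Appendix:

1. scikit-learn clustering page: https://mp.csdn.net/postedit?not_checkout=1

2. Various ways to choose the number of clusters k: https://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set

3. matplotlib colormaps: https://matplotlib.org/examples/color/colormaps_reference.html

4. The sparse CSR matrix type (as defined in the SciPy library): https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.sparse.csr_matrix.html . To convert a pandas DataFrame to a sparse matrix, first convert it to a SparseDataFrame, then use pandas' to_coo() method (SparseDataFrame was removed in pandas 1.0, so this applies to older versions).

5. This is a simple recommendation engine that shows the most basic idea of "collaborative filtering": https://www.netflixprize.com/assets/GrandPrize2009_BPC_BigChaos.pdf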
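
For item 4, newer pandas versions no longer have SparseDataFrame, but a similar conversion is available through the `DataFrame.sparse` accessor. The sketch below assumes pandas 0.25 or later with SciPy installed; the `ratings` table and its column names are hypothetical, and 0 is assumed to mean "no rating".

```python
import pandas as pd

# Hypothetical user-by-movie ratings table; 0 means "not rated".
ratings = pd.DataFrame(
    {
        "movie_a": [5.0, 0.0, 0.0],
        "movie_b": [0.0, 3.0, 0.0],
        "movie_c": [0.0, 0.0, 4.0],
    },
    index=["user_1", "user_2", "user_3"],
)

# Store every column sparsely, treating 0 as the implicit fill value.
sparse_ratings = ratings.astype(pd.SparseDtype("float64", 0.0))

# DataFrame.sparse.to_coo() returns a SciPy COO matrix;
# tocsr() converts it to the CSR format mentioned in item 4.
csr = sparse_ratings.sparse.to_coo().tocsr()
print(csr.shape, csr.nnz)
```

If missing ratings are stored as NaN instead of 0, they would need to be filled (for example with `fillna(0)`) before the conversion in this sketch.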
