The DBSCAN Algorithm

DBSCAN is a density-based clustering algorithm. Its main procedure is given by the following pseudocode:

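DBSCAN(D, eps, MinPts)
   C = 0                                          //cluster label counter
   for each unvisited point P in dataset D        //iterate over all points
      mark P as visited                           //mark as visited
      NeighborPts = regionQuery(P, eps)           //compute the eps-neighborhood of this point
      if sizeof(NeighborPts) < MinPts             //too few neighbors: P cannot be a core point
         mark P as NOISE                          //mark as noise (for now)
      else                                        //P is a core point: create a new cluster from it
         C = next cluster
         expandCluster(P, NeighborPts, C, eps, MinPts)    //expand the cluster from this core point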
          
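expandCluster(P, NeighborPts, C, eps, MinPts)
   add P to cluster C                                     //expand the cluster: the core point joins first
   for each point P' in NeighborPts                       //then process every point in the core point's neighborhood
      if P' is not visited                                //if the point has not been visited yet,
         mark P' as visited                               //visit it
         NeighborPts' = regionQuery(P', eps)              //compute its own eps-neighborhood
         if sizeof(NeighborPts') >= MinPts                //if P' is itself a core point, grow the set of points to process
            NeighborPts = NeighborPts joined with NeighborPts'
      if P' is not yet member of any cluster              //if P' has no cluster yet (e.g. it was earlier marked as noise), add it to this cluster
         add P' to cluster C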
          
regionQuery(P, eps)                                       //compute the neighborhood
   return all points within P's eps-neighborhood (including P)
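
The pseudocode above can be turned into a short runnable program. What follows is only a minimal sketch in Python, not a reference implementation. It assumes points are tuples of numbers, uses Euclidean distance, and finds neighbors by brute force (O(n^2) overall). The names dbscan, region_query, labels, NOISE and UNVISITED are made up for this illustration. The "visited" flag and the cluster membership from the pseudocode are merged into a single labels list, which should behave the same way: a point marked as noise can still be claimed later as a border point of a cluster.

```python
import math

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points that have not been processed yet


def region_query(points, i, eps):
    """Return the indices of all points within eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (1, 2, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE              # not a core point; may become a border point later
            continue
        cluster += 1                       # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)            # points still to be processed for this cluster
        k = 0
        while k < len(queue):
            j = queue[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster        # former noise becomes a border point
            if labels[j] is not UNVISITED:
                continue                   # already handled, do not expand again
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10),
            (50, 50)]
    print(dbscan(data, eps=1.5, min_pts=3))
```

On this toy data set the first four points form one cluster, the next three form a second one, and the isolated point (50, 50) is left as noise. For larger data sets the brute-force region_query would normally be replaced by a spatial index.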

Compare with the pseudocode given on Baidu Baike:

