Machine Learning in Action: Drawing a Decision Tree with matplotlib

matplotlib annotations

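This post uses matplotlib's annotation feature to draw a tree diagram. Annotations can color the text, offer several box shapes to choose from, and let us reverse the arrow so that it points at either the data or the node. Enough talk, on to the code: we start by drawing tree nodes with text annotations. First, though, we need to fix garbled Chinese characters in matplotlib, which the following code takes care of: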

import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

Put this at the top of the .py file. Next comes the code that draws tree nodes with text annotations:

decisionNode = dict(boxstyle="sawtooth", fc='0.8')
leafNode = dict(boxstyle="round4", fc='0.8')
arrow_args = dict(arrowstyle="<-")

def plotNode(nodeTxt, centrePt, parentPt, nodeType):
    creatPlot.ax1.annotate(nodeTxt, xy = parentPt, xycoords = "axes fraction", \
                           xytext = centrePt,  textcoords = 'axes fraction', \
                           va = 'center', ha = 'center', bbox = nodeType, \
                           arrowprops = arrow_args)

def creatPlot():
    fig = plt.figure(1, facecolor='white')
    creatPlot.ax1 = plt.subplot(111, frameon=False)
    plotNode(u'決策節點', (0.5,0.1), (0.1, 0.5), decisionNode)
    plotNode(u'葉節點', (0.8, 0.1), (0.3, 0.8), leafNode)
    plt.show()

Running it produces the figure below:

Pretty cool, right? I think so too... Okay, it does look a bit crude, but it will get more polished as we go.

Constructing the annotation tree

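We first need the number of leaf nodes and the depth of the tree, so that we can work out the offset of each subtree. To make testing easier, there is also a function that generates a sample tree. Here is the code: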

def getNumLeaves(myTree):
    numLeaves = 0
    firstStr = list(myTree.keys())[0]
    nextDict = myTree[firstStr]
    for key in nextDict.keys():
        if type(nextDict[key]).__name__ == 'dict':
            numLeaves += getNumLeaves(nextDict[key])
        else:
            numLeaves += 1
    return numLeaves

def getDepthTree(myTree):
    depthTree = 0
    firststr = list(myTree.keys())[0]
    nextDict = myTree[firststr]
    for key in nextDict.keys():
        if type(nextDict[key]).__name__ == 'dict':
            thisDepth = 1 + getDepthTree(nextDict[key])
        else:
            thisDepth = 1
        if thisDepth > depthTree:
            depthTree = thisDepth
    return depthTree

def retrieveTrees():
    listOfTrees = [{'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}]
    return listOfTrees[0]

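Both the leaf count and the depth are computed recursively: each function checks whether a child is a dict and, if it is, calls itself on that subtree; otherwise the child is a leaf. The function retrieveTrees builds a sample tree so that we can check the code. The test code is as follows: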

if __name__ == '__main__':
    myTree = retrieveTrees()
    print(type(myTree.keys()))
    depthTree = getDepthTree(myTree)
    leafNum = getNumLeaves(myTree)
    print("tree depth = %d, leaf num = %d" % (depthTree, leafNum))

Under Python 3, this prints:

<class 'dict_keys'>
tree depth = 2, leaf num = 3


Then add the following code:

def plotMidText(cntrPt, parentPt, txtString):
    xMid = (parentPt[0] - cntrPt[0]) / 2.0 + cntrPt[0]
    yMid = (parentPt[1] - cntrPt[1]) / 2.0 + cntrPt[1]
    creatPlot.ax1.text(xMid, yMid, txtString)

def plotTree(myTree, parentPt, nodeTxt):
    numLeafs = getNumLeaves(myTree)
    depth = getDepthTree(myTree)
    firstStr = list(myTree.keys())[0]
    cntrPt = (plotTree.xOff + (1.0 + float(numLeafs)) / 2.0 / plotTree.totalW, \
              plotTree.yOff)
    plotMidText(cntrPt, parentPt, nodeTxt)
    plotNode(firstStr, cntrPt, parentPt, decisionNode)
    secondDict = myTree[firstStr]
    plotTree.yOff = plotTree.yOff - 1.0 / plotTree.totalD
    for key in secondDict.keys():
        if type(secondDict[key]).__name__ == 'dict':
            plotTree(secondDict[key], cntrPt, str(key))
        else:
            plotTree.xOff = plotTree.xOff + 1.0 / plotTree.totalW
            plotNode(secondDict[key], (plotTree.xOff, plotTree.yOff), \
                     cntrPt, leafNode)
            plotMidText((plotTree.xOff, plotTree.yOff), cntrPt, str(key))
    plotTree.yOff = plotTree.yOff + 1.0 / plotTree.totalD

def creatPlot(inTree):
    fig = plt.figure(1, facecolor='white')
    fig.clf()
    axprops = dict(xticks = [], yticks = [])
    creatPlot.ax1 = plt.subplot(111, frameon=False, **axprops)
    plotTree.totalW = float(getNumLeaves(inTree))
    plotTree.totalD = float(getDepthTree(inTree))
    plotTree.xOff = -0.5 / plotTree.totalW
    plotTree.yOff = 1.0
    plotTree(inTree, (0.5, 1.0), '')
    plt.show()

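The clf method called on the figure stands for "clear figure". In Python, ** in a function call unpacks a dict into keyword arguments (http://blog.csdn.net/whhit111/article/details/47759267), so **axprops passes xticks=[] and yticks=[] to plt.subplot. The drawing proceeds in three steps: 1. compute the starting point of the annotation; 2. compute the midpoint where the text goes; 3. add the text to the arrow. Like the depth and leaf-count functions above, plotTree is recursive.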
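To make the offset arithmetic in plotTree easier to follow, here is a minimal sketch that reproduces the same coordinate calculation without drawing anything. The names count_leaves, tree_depth and layout are made up for this illustration and are not part of the original code; the sketch also uses isinstance and a small state dict in place of the function attributes plotTree.xOff and plotTree.yOff.

```python
def count_leaves(tree):
    """Count the leaves of a nested-dict tree {feature: {value: subtree_or_label}}."""
    children = next(iter(tree.values()))
    return sum(count_leaves(c) if isinstance(c, dict) else 1
               for c in children.values())


def tree_depth(tree):
    """Number of decision levels in the tree (a single split has depth 1)."""
    children = next(iter(tree.values()))
    return max(1 + tree_depth(c) if isinstance(c, dict) else 1
               for c in children.values())


def layout(tree):
    """Return (label, (x, y)) pairs in the order plotTree would draw them."""
    total_w = float(count_leaves(tree))
    total_d = float(tree_depth(tree))
    # Same starting offsets as in creatPlot: half a leaf slot left of the axes.
    state = {"x_off": -0.5 / total_w, "y_off": 1.0}
    positions = []

    def walk(subtree):
        label = next(iter(subtree))
        n = count_leaves(subtree)
        # Centre the decision node above the leaves it spans.
        centre = (state["x_off"] + (1.0 + n) / 2.0 / total_w, state["y_off"])
        positions.append((label, centre))
        state["y_off"] -= 1.0 / total_d          # go down one level
        for child in subtree[label].values():
            if isinstance(child, dict):
                walk(child)
            else:
                state["x_off"] += 1.0 / total_w  # next leaf slot
                positions.append((child, (state["x_off"], state["y_off"])))
        state["y_off"] += 1.0 / total_d          # back up to the parent level

    walk(tree)
    return positions


if __name__ == "__main__":
    sample = {'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}
    for label, (x, y) in layout(sample):
        print("%-12s x = %.3f, y = %.3f" % (label, x, y))
```

For the sample tree this places the root at (0.5, 1.0) and spreads the three leaves over x = 1/6, 1/2 and 5/6, which is why the starting xOff is set to -0.5 / totalW: each leaf then lands in the middle of its own slot of width 1 / totalW.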

The test code is simple:

if __name__ == '__main__':
    myTree = retrieveTrees()
    #print(type(myTree.keys()))
    #depthTree = getDepthTree(myTree)
    #leafNum = getNumLeaves(myTree)
    #print("tree depth = %d, leaf num = %d" % (depthTree, leafNum))
    creatPlot(myTree)

If it draws the figure below, it worked:


