Machine Learning: Visualizing Decision Trees

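Chapter 3 of "Machine Learning in Action" (机器学习实战) presents a way to visualize a decision tree stored in JSON format, but the figures it produces are rather crude.
So I redrew the trees with the Python pygraphviz library, and the result is easier on the eyes.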

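pygraphviz depends on Graphviz, so Graphviz has to be downloaded and installed before pygraphviz can be used.
On macOS, installing it with Homebrew (brew install graphviz) is enough.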
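
Before drawing anything, it can help to confirm that both pieces are actually in place. The following is a minimal sketch, not part of the original script: the function name check_environment is hypothetical, and it only checks that a `dot` executable is on the PATH and that a module named pygraphviz can be located, without importing it.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether the Graphviz `dot` binary and the pygraphviz module can be found.

    Hypothetical helper for illustration; it does not verify versions.
    """
    return {
        # Graphviz installs a `dot` executable; which() returns None if it is not on PATH
        "dot": shutil.which("dot") is not None,
        # find_spec() locates the module without importing it
        "pygraphviz": importlib.util.find_spec("pygraphviz") is not None,
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If `dot` is reported as missing, install Graphviz first; pygraphviz is built against the Graphviz libraries, so installing it in the other order tends to fail.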

Suppose the decision tree algorithm has finished and produced trees like the following two, in JSON format:

  1. ml.json (有胡子 = "has a beard", 长头发 = "long hair", 女 = "female", 男 = "male")
{
  "有胡子": {
    "0": {
      "长头发": {
        "0": "女",
        "1": "女"
      }
    },
    "1": "男"
  }
}
  2. lens.json
{
  "tearRate": {
    "reduced": "no lenses",
    "normal": {
      "astigmatic": {
        "yes": {
          "prescript": {
            "hyper": {
              "age": {
                "pre": "no lenses",
                "young": "hard",
                "presbyopic": "no lenses"
              }
            },
            "myope": "hard"
          }
        },
        "no": {
          "age": {
            "pre": "soft",
            "young": "soft",
            "presbyopic": {
              "prescript": {
                "hyper": "soft",
                "myope": "no lenses"
              }
            }
          }
        }
      }
    }
  }
}
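
Strict JSON only allows string keys, so a tree written out with json.dump stores the feature values 0 and 1 as "0" and "1". The sketch below shows one way to read such a file back into the nested dict that the drawing code expects. It assumes the tree is stored as strict JSON; the helper names restore_int_keys and load_tree are hypothetical and do not appear in the original script.

```python
import json


def restore_int_keys(node):
    """Recursively turn digit-only string keys such as "0" back into ints.

    Hypothetical helper: leaves (class labels) are returned unchanged.
    """
    if not isinstance(node, dict):
        return node
    restored = {}
    for key, value in node.items():
        # "0" -> 0, while keys such as "tearRate" or "reduced" stay as strings
        new_key = int(key) if isinstance(key, str) and key.isdecimal() else key
        restored[new_key] = restore_int_keys(value)
    return restored


def load_tree(json_text):
    """Parse JSON text into a nested dict, restoring integer feature values."""
    return restore_int_keys(json.loads(json_text))


mf_json = '{"有胡子": {"0": {"长头发": {"0": "女", "1": "女"}}, "1": "男"}}'
tree = load_tree(mf_json)
print(tree)
```

For drawing alone the conversion is optional, because the keys only end up as edge labels. It matters when the same dict is later used to classify samples whose feature values are integers.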

The decision tree graphs generated with the method in this post are shown below:

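NOTE: A simplified way of generating the tree visualization also exists, but I find its output less attractive than the one described in this post.
This post is an optimization of that method.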

mf.png

lens.png

The Python code, treeGraph.py, is as follows:

#!/usr/bin/env python3

import pygraphviz as pgv
import uuid

def buildTreeGraphOpt(myTree, parent, treeLevel, label, theGraph):
    """
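    Draw the tree with pygraphviz (which depends on Graphviz); optimized version.
    :param myTree: the decision tree built from the training data, a dict
    :param parent: ID of the Graphviz node that the current subtree hangs under
    :param treeLevel: current depth in the nested dict. Even levels hold feature names, each drawn
    as a node; odd levels hold feature values, which are passed down and become the label of the
    edge between the parent node and the node created on the next level
    :param label: on an even level, the feature value from the level above, drawn as the label of
    the edge that links the parent node to the current node
    :param theGraph: the pygraphviz AGraph object to draw into
    :return: None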
    """
    currentGraph = theGraph
    currTreeLevel = treeLevel
    nextTreeLevel = treeLevel + 1
    currentLabel = label
    # print(f"\nlevel={treeLevel},parent={parent}, label={label},myTree={myTree} ")

    if not isinstance(myTree, dict):  # leaf node, reached by the recursive call from an odd level
        leaf = myTree
        # print(f"\nnot dict, myTree={myTree}")
        keyNodeId = f"{leaf}_{currTreeLevel}_{uuid.uuid1()}"
        currentGraph.add_node(keyNodeId, label=leaf)
        if parent is not None and currentLabel is not None and currentLabel != "":
            # print(f"-------------------------parent={parent},currentLabel={currentLabel}, current node={keyNodeId}")
            currentGraph.add_edge(parent, keyNodeId, label=currentLabel)
        return

    for k in myTree.keys():
        v = myTree[k]
        if currTreeLevel % 2 == 0:
            keyNodeId = f"{k}_{currTreeLevel}_{uuid.uuid1()}"
            currentGraph.add_node(keyNodeId, label=k)
            if parent is not None and currentLabel is not None and currentLabel != "":
                currentGraph.add_edge(parent, keyNodeId, label=currentLabel)
            if isinstance(v, dict):
                buildTreeGraphOpt(v, keyNodeId, nextTreeLevel, None, currentGraph)
            else:
                return  # already at a leaf
        else:
            currentLabel = k
            buildTreeGraphOpt(v, parent, nextTreeLevel, currentLabel, currentGraph)

def generatePicForTreeOpt(picFileName, theTree):
    """
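    Visualize a decision tree (a dict) and save the image to the file named by picFileName.
    :param picFileName: output image path; the extension (e.g. .png) selects the format
    :param theTree: the decision tree as a nested dict
    :return: None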
    """
    # Valid rankdir values are TB, BT, LR and RL; 'TB' draws the tree top to bottom
    G = pgv.AGraph(directed=True, rankdir='TB')
    G.graph_attr['epsilon'] = '0.001'
    buildTreeGraphOpt(theTree, None, 0, None, G)
    G.layout('dot')
    G.draw(picFileName)

def main():
    print(__file__)
    tree1 = {'tearRate': {'reduced': 'no lenses', 'normal': {'astigmatic': {'yes': {
        'prescript': {'hyper': {'age': {'pre': 'no lenses', 'young': 'hard', 'presbyopic': 'no lenses'}},
                      'myope': 'hard'}}, 'no': {'age': {'pre': 'soft', 'young': 'soft', 'presbyopic': {
        'prescript': {'hyper': 'soft', 'myope': 'no lenses'}}}}}}}}
    tree2 = {'有胡子': {0: {'长头发': {0: '女', 1: '女'}}, 1: '男'}}

    picFilename1 = "lens.png"
    picFilename2 = "mf.png"

    generatePicForTreeOpt(picFilename1, tree1)
    generatePicForTreeOpt(picFilename2, tree2)

if __name__ == '__main__':
    main()
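
The traversal idea above (feature names become nodes, feature values become edge labels) can also be written without pygraphviz by emitting DOT source directly. This is an alternative sketch, not the method used in treeGraph.py: the function name tree_to_dot and the node IDs n0, n1, ... are hypothetical, and it assumes every dict on a feature level has exactly one key whose value is a dict of branches, as in the two sample trees.

```python
import itertools


def tree_to_dot(tree):
    """Return DOT source for a nested-dict decision tree (feature -> value -> subtree).

    Hypothetical helper; assumes each feature-level dict holds exactly one feature.
    """
    counter = itertools.count()
    lines = ["digraph tree {"]

    def quote(text):
        # Escape backslashes and double quotes so any label is a valid DOT string
        return '"' + str(text).replace("\\", "\\\\").replace('"', '\\"') + '"'

    def visit(node, parent, edge_label):
        node_id = f"n{next(counter)}"  # unique ID, so repeated labels stay separate nodes
        if isinstance(node, dict):
            feature, branches = next(iter(node.items()))
            lines.append(f"  {node_id} [label={quote(feature)}];")
        else:
            # Leaf: the class label, drawn as a box to tell it apart from features
            lines.append(f"  {node_id} [label={quote(node)}, shape=box];")
            branches = {}
        if parent is not None:
            lines.append(f"  {parent} -> {node_id} [label={quote(edge_label)}];")
        for value, subtree in branches.items():
            visit(subtree, node_id, value)

    visit(tree, None, None)
    lines.append("}")
    return "\n".join(lines)


dot_source = tree_to_dot({'有胡子': {0: {'长头发': {0: '女', 1: '女'}}, 1: '男'}})
print(dot_source)
```

Saving the printed text as tree.dot and running `dot -Tpng tree.dot -o tree.png` renders it with the Graphviz command-line tool. A running counter replaces the uuid-based IDs of the original script; both serve the same purpose of keeping nodes with identical labels distinct.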

Permalink: https://www.jack-yin.com/coding/essay/3124.html | 边城网事

Posted by 边城网事 on 2019-09-09 under Machine Learning, Python, 随记 (Notes). You may quote this post on your own site or blog as long as the original URL and author are retained.