Implementing a decision tree with sklearn (minimal code, plus a fuller example: predicting whether to wear glasses)

Minimal code:

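# A simple decision tree classifier
from sklearn import tree
features = [[300, 2], [450, 2], [200, 8], [150, 9]]    # one row of two numeric features per sample
labels = ['apple', 'apple', 'orange', 'orange']         # class label of each sample
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)                         # build the tree from the training data
print(clf.predict([[400, 6]]))                          # classify a new sample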
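To see what the classifier above actually learned, the fitted tree can be printed as text rules. The following is a minimal sketch, not part of the original post: the feature names `weight` and `texture` are assumptions made only for readability (the post does not name the two columns), and `random_state=0` is added so repeated runs build the same tree.

```python
from sklearn import tree

features = [[300, 2], [450, 2], [200, 8], [150, 9]]
labels = ['apple', 'apple', 'orange', 'orange']

# random_state is fixed so that repeated runs build the same tree
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(features, labels)

# 'weight' and 'texture' are hypothetical names for the two feature columns
rules = tree.export_text(clf, feature_names=['weight', 'texture'])
print(rules)

# Class order used by predict_proba, followed by the class probabilities
print(clf.classes_)
proba = clf.predict_proba([[400, 6]])
print(proba)
```

With this training set either feature alone separates the two classes, so a single split is enough. The sample `[400, 6]` falls on the apple side of a split on the first feature and on the orange side of a split on the second, so its predicted class may depend on which feature the tree happens to pick.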

Prediction code:

Dataset download link

Code:

# -*- coding: UTF-8 -*-
from sklearn.preprocessing import LabelEncoder
from sklearn import tree
import pandas as pd

if __name__ == '__main__':
    with open('data/lenses.txt', 'r') as fr:                                  # load the file
        lenses = [inst.strip().split('\t') for inst in fr.readlines()]        # split each line into tab-separated fields
    lenses_target = []                                                        # class label of each sample (last field)
    for each in lenses:
        lenses_target.append(each[-1])

    lensesLabels = ['age', 'prescript', 'astigmatic', 'tearRate']             # feature names
    lenses_list = []                                                          # temporary list holding one feature column
    lenses_dict = {}                                                          # feature columns by name, used to build the DataFrame
    for each_label in lensesLabels:                                           # collect each feature column into the dict
        for each in lenses:
            lenses_list.append(each[lensesLabels.index(each_label)])
        lenses_dict[each_label] = lenses_list
        lenses_list = []
    # print(lenses_dict)                                                      # print the dict
    lenses_pd = pd.DataFrame(lenses_dict)                                     # build the pandas.DataFrame
    print(lenses_pd)                                                          # print the raw DataFrame
    le = LabelEncoder()                                                       # LabelEncoder maps string categories to integers
    for col in lenses_pd.columns:                                             # encode every column
        lenses_pd[col] = le.fit_transform(lenses_pd[col])
    print(lenses_pd)                                                          # print the encoded data

    clf = tree.DecisionTreeClassifier(max_depth=4)                            # create the DecisionTreeClassifier
    clf = clf.fit(lenses_pd.values.tolist(), lenses_target)                   # build the decision tree from the data
    print(lenses_target)
    print(clf.predict([[1, 1, 1, 0]]))                                        # predict from an already-encoded sample
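The prediction above passes integer codes directly, so their meaning depends on how each column happened to be encoded. A possible variation, sketched here under stated assumptions, keeps one fitted encoder so a new sample can be given as strings. It swaps in scikit-learn's `OrdinalEncoder` in place of the per-column `LabelEncoder` used in the post. The six rows below are illustrative stand-ins in the same layout the code above assumes (four features, then the class); they are not the full dataset, and `random_state=0` is an added assumption for repeatability.

```python
import pandas as pd
from sklearn import tree
from sklearn.preprocessing import OrdinalEncoder

# Illustrative rows only (assumed layout: four features, then the class label)
rows = [
    ['young', 'myope', 'no', 'reduced', 'no lenses'],
    ['young', 'myope', 'no', 'normal', 'soft'],
    ['young', 'myope', 'yes', 'reduced', 'no lenses'],
    ['young', 'myope', 'yes', 'normal', 'hard'],
    ['pre', 'hyper', 'no', 'reduced', 'no lenses'],
    ['pre', 'hyper', 'no', 'normal', 'soft'],
]
feature_names = ['age', 'prescript', 'astigmatic', 'tearRate']
X = pd.DataFrame([r[:-1] for r in rows], columns=feature_names)
y = [r[-1] for r in rows]

# One encoder for all feature columns; it remembers the categories of each column
encoder = OrdinalEncoder()
X_encoded = encoder.fit_transform(X)

clf = tree.DecisionTreeClassifier(max_depth=4, random_state=0)
clf.fit(X_encoded, y)

# A new sample is written as strings and encoded with the same fitted encoder
sample = pd.DataFrame([['young', 'myope', 'yes', 'normal']], columns=feature_names)
prediction = clf.predict(encoder.transform(sample))
print(prediction)

# Text view of the learned rules (thresholds refer to the encoded values)
print(tree.export_text(clf, feature_names=feature_names))
```

Reusing the fitted encoder for new samples keeps the string-to-integer mapping consistent between training and prediction, which is easy to get wrong when the codes are typed in by hand.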

[Figure: predicting glasses]
