
Decision Tree in Matlab

I saw the help in Matlab, but they have provided an example without explaining how to use the parameters in the 'classregtree' function. Any help explaining the use of 'classregtree' and its parameters will be appreciated.

The documentation page of the function classregtree is self-explanatory...

Let's go over some of the most common parameters of the classification tree model:

  • x : data matrix; rows are instances, columns are predictor attributes
  • y : column vector holding the class label of each instance
  • categorical : specifies which attributes are of discrete type (as opposed to continuous)
  • method : whether to build a classification or a regression tree (depends on the class type)
  • names : gives names to the attributes
  • prune : enables/disables reduced-error pruning
  • minparent/minleaf : specifies the minimum number of instances a node must contain for it to be further split
  • nvartosample : used in random trees (consider K randomly chosen attributes at each node)
  • weights : specifies weighted instances
  • cost : specifies the cost matrix (the penalty for the various errors)
  • splitcriterion : criterion used to select the best attribute at each split. I'm only familiar with the Gini index, which is a variation of the Information Gain criterion.
  • priorprob : explicitly specifies the prior class probabilities, instead of having them calculated from the training data
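
All of these are passed as name/value pairs. As a hedged sketch (not part of the original answer), here is how a few options that the complete example below does not use might look, reusing the x, y and vars constructed there:

%# sketch: a cost-sensitive tree with a leaf-size constraint
k = numel(unique(y));                        %# number of classes
costMat = ones(k) - eye(k);                  %# uniform misclassification cost
t2 = classregtree(x, y, 'method','classification', 'names',vars, ...
    'minleaf',5, ...                         %# each leaf keeps at least 5 instances
    'cost',costMat, ...                      %# penalty matrix for errors
    'splitcriterion','gdi');                 %# 'gdi' = Gini diversity index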

A complete example to illustrate the process:

%# load data
load carsmall

%# construct predicting attributes and target class
vars = {'MPG' 'Cylinders' 'Horsepower' 'Model_Year'};
x = [MPG Cylinders Horsepower Model_Year];  %# mixed continuous/discrete data
y = cellstr(Origin);                        %# class labels

%# train classification decision tree
t = classregtree(x, y, 'method','classification', 'names',vars, ...
                'categorical',[2 4], 'prune','off');
view(t)

%# test
yPredicted = eval(t, x);
cm = confusionmat(y,yPredicted);           %# confusion matrix
N = sum(cm(:));
err = ( N-sum(diag(cm)) ) / N;             %# testing error

%# prune tree to avoid overfitting
tt = prune(t, 'level',3);
view(tt)

%# predict a new unseen instance
inst = [33 4 78 NaN];
prediction = eval(tt, inst)    %# pred = 'Japan'

[Figure: graph of the decision tree displayed by view(t)]
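
Rather than hard-coding the pruning level, the classregtree class also has a test method that can choose it by cross-validation; a minimal sketch:

%# sketch: pick the pruning level by 10-fold cross-validation
[cost,secost,ntnodes,bestlevel] = test(t, 'crossvalidate', x, y);
tt2 = prune(t, 'level',bestlevel);          %# prune to the best level found
view(tt2)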


Update:

The above classregtree class was made obsolete, and is superseded by the ClassificationTree and RegressionTree classes in R2011a (see the fitctree and fitrtree functions, new in R2014a).

Here is the updated example, using the new functions/classes:

t = fitctree(x, y, 'PredictorNames',vars, ...
    'CategoricalPredictors',{'Cylinders', 'Model_Year'}, 'Prune','off');
view(t, 'mode','graph')

y_hat = predict(t, x);
cm = confusionmat(y,y_hat);

tt = prune(t, 'Level',3);
view(tt)

predict(tt, [33 4 78 NaN])
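
The new classes expose the same cross-validated pruning idea through cvloss; a minimal sketch, assuming the tree keeps its pruning sequence (i.e. 'Prune' left at its default 'on'):

%# sketch: cross-validated choice of pruning level with the new API
tp = fitctree(x, y, 'PredictorNames',vars, ...
    'CategoricalPredictors',{'Cylinders', 'Model_Year'});  %# 'Prune' is 'on' by default
[E,SE,nLeaf,bestLevel] = cvloss(tp, 'Subtrees','all', 'TreeSize','min');
tBest = prune(tp, 'Level',bestLevel);        %# prune to the level with min CV error
view(tBest, 'mode','graph')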
