
Can I manually create an RWeka decision (Recursive Partitioning) tree?

I have constructed a J48 decision tree using RWeka. I would like to compare its performance to that of an existing (externally computed) decision tree. I'm new to RWeka and I'm having trouble manually creating an RWeka decision tree. Ideally, I would like to show the two side by side and plot them using the RWeka visualization (it is very informative and clean).

Right now, I plan to export the RWeka-computed decision tree to Graphviz and manipulate it into the structure I want. Before I start, I want to make sure that I can't simply specify the rules I want and so define a decision tree manually.

I don't want to compute the decision tree (I've done that); I want to manually construct/specify a decision tree (for uniform comparison in my presentation).

Thank you in advance.

The RWeka package itself cannot do that. However, RWeka uses the partykit package for displaying its trees, and partykit can do what you want. See vignette("partykit", package = "partykit") for how to construct a recursive partynode object with pre-specified partysplit objects and then turn it into a constparty. The vignette has a hands-on example for this.
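To illustrate the link between the two packages, here is a minimal sketch of converting a J48 fit into a partykit object. It assumes that RWeka (and the Java runtime it depends on) is installed, and it uses the WeatherPlay data shipped with partykit purely as a stand-in for your own data; the object names j48_fit and j48_party are made up for this example.

```r
library(partykit)
library(RWeka)  # assumption: RWeka and a working Java installation are available

# Example data shipped with partykit (stand-in for your own data)
data("WeatherPlay", package = "partykit")

# Fit a J48 tree, then coerce it to a partykit "party" object
j48_fit   <- J48(play ~ ., data = WeatherPlay)
j48_party <- as.party(j48_fit)

# The coerced tree is drawn by the same plot method as a hand-built one
class(j48_party)
plot(j48_party)
```

Because both the learned tree and a manually specified tree end up as party objects, they can be plotted in the same style.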

Here is some example code for the partykit package that @Achim Zeileis suggested.

library(partykit)

Load the data:

data("WeatherPlay", package = "partykit")
WeatherPlay
#  outlook temperature humidity windy play
#  1 sunny 85 85 false no
#  2 sunny 80 90 true no
#  3 overcast 83 86 false yes
#  4 rainy 70 96 false yes
#  5 rainy 68 80 false yes
#  6 rainy 65 70 true no
#  7 overcast 64 65 true yes
...

Initialize decisions: the first argument (e.g. 1L) is the integer index of the column to which the split applies, in a data frame that is not yet specified. index maps the levels of a factor to child nodes (discrete splits), and breaks gives the cutoff (continuous splits).

sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)
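As a quick check that the splits behave as intended, the following sketch applies them to the WeatherPlay data. It uses partykit's character_split() to print the branch labels and kidids_split() to see which branch each row would take; the name kid_h is only illustrative.

```r
library(partykit)
data("WeatherPlay", package = "partykit")

# Same splits as above
sp_o <- partysplit(1L, index = 1:3)   # outlook: one branch per factor level
sp_h <- partysplit(3L, breaks = 75)   # humidity: <= 75 vs. > 75
sp_w <- partysplit(4L, index = 1:2)   # windy: one branch per factor level

# Human-readable description of each split once data are supplied
character_split(sp_o, data = WeatherPlay)
character_split(sp_h, data = WeatherPlay)

# Branch (child) number chosen for every observation by the humidity split
kid_h <- kidids_split(sp_h, data = WeatherPlay)
table(humidity_le_75 = WeatherPlay$humidity <= 75, branch = kid_h)
```

The splits store only column positions, so the same split objects can be applied to any data frame whose columns are in the same order.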

Incorporate decisions into nodes:

pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))
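Before attaching data, it can help to inspect the bare node structure. The sketch below rebuilds the same nodes so it runs on its own, lists the terminal node ids, and uses fitted_node() to show which terminal node each WeatherPlay row falls into; node_id is an illustrative variable name.

```r
library(partykit)
data("WeatherPlay", package = "partykit")

# Same splits and node structure as above
sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)
pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))

# Ids of the leaves, and size of the tree
nodeids(pn, terminal = TRUE)
width(pn)   # number of terminal nodes
depth(pn)   # depth of the tree

# Terminal node reached by each observation
node_id <- fitted_node(pn, data = WeatherPlay)
table(node = node_id, play = WeatherPlay$play)
```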

Fit data to tree:

t2 <- party(pn,
  data = WeatherPlay,
  fitted = data.frame(
    "(fitted)" = fitted_node(pn, data = WeatherPlay), # terminal node of each row
    "(response)" = WeatherPlay$play,                  # response variable
    check.names = FALSE),
  terms = terms(play ~ ., data = WeatherPlay))

t3 <- as.constparty(t2)
plot(t3)
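For the comparison the question asks about, one possible approach is sketched below: compute the accuracy of the manually specified tree with predict(), and draw two trees next to each other using grid viewports. The helper plot_side_by_side() is hypothetical (it is not part of partykit), and the J48 step is only attempted if RWeka can be loaded.

```r
library(partykit)
library(grid)
data("WeatherPlay", package = "partykit")

# Rebuild the manually specified tree so this block runs on its own
sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)
pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))
t3 <- as.constparty(party(pn,
  data = WeatherPlay,
  fitted = data.frame(
    "(fitted)" = fitted_node(pn, data = WeatherPlay),
    "(response)" = WeatherPlay$play,
    check.names = FALSE),
  terms = terms(play ~ ., data = WeatherPlay)))

# Confusion matrix and accuracy of the manual tree on the data
pred     <- predict(t3, newdata = WeatherPlay, type = "response")
conf_mat <- table(predicted = pred, observed = WeatherPlay$play)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Hypothetical helper: draw two party objects in a 1 x 2 grid layout
plot_side_by_side <- function(left, right) {
  grid.newpage()
  pushViewport(viewport(layout = grid.layout(nrow = 1, ncol = 2)))
  pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 1))
  plot(left, newpage = FALSE)
  popViewport()
  pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 2))
  plot(right, newpage = FALSE)
  popViewport(2)
}

# Compare against a J48 tree only if RWeka (and Java) can be loaded
if (requireNamespace("RWeka", quietly = TRUE)) {
  j48_party <- as.party(RWeka::J48(play ~ ., data = WeatherPlay))
  plot_side_by_side(t3, j48_party)
}
```

The same predict() and table() steps can be applied to the J48 fit, so both trees are scored on identical data.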

Source: http://cran.r-project.org/web/packages/partykit/vignettes/partykit.pdf
