
Can I manually create an RWeka decision (Recursive Partitioning) tree?

I have constructed a J48 decision tree using RWeka. I would like to compare its performance to an existing (externally computed) decision tree. I'm new to RWeka and I'm having trouble manually creating an RWeka decision tree. Ideally, I would like to show the two side by side and plot them using the RWeka visualization (it is very informative and clean).

Right now, my plan is to export the RWeka-computed decision tree to Graphviz and manipulate it into the structure I want. Before I start, I want to check that I can't simply specify the rules myself and build the decision tree manually.

I don't want to compute the decision tree (I've done that); I want to manually construct/specify a decision tree (for uniform comparison in my presentation).

Thank you in advance.

The RWeka package itself cannot do that. However, RWeka uses the partykit package for displaying its trees, and partykit can do what you want. See vignette("partykit", package = "partykit") for how to construct a recursive partynode object with pre-specified partysplits and then turn it into a constparty. The vignette has a hands-on example for this.

Here is some example code for the package partykit that @Achim Zeileis suggested.

library(partykit)

Load the data:

data("WeatherPlay", package = "partykit")
WeatherPlay
#     outlook temperature humidity windy play
#  1    sunny          85       85 false   no
#  2    sunny          80       90  true   no
#  3 overcast          83       86 false  yes
#  4    rainy          70       96 false  yes
#  5    rainy          68       80 false  yes
#  6    rainy          65       70  true   no
#  7 overcast          64       65  true  yes
...

Initialize the splits (decisions): the first argument (e.g. the integer 1L) is the column number, in a data frame that is only supplied later, of the variable the split applies to. For a factor (discrete split), index assigns each level to a child node; for a numeric variable (continuous split), breaks gives the cutoff.

sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)
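To check that a split routes observations the way you intend before building the tree, you can ask partykit which child each row would be sent to. The following is only a small sketch; the object names kid_h and kid_o are made up for illustration.

```r
library(partykit)
data("WeatherPlay", package = "partykit")

# Same splits as above: outlook (column 1) and humidity (column 3)
sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)

# kidids_split() returns, for every row, the index of the child node
# that the split sends the observation to
kid_h <- kidids_split(sp_h, data = WeatherPlay)
kid_o <- kidids_split(sp_o, data = WeatherPlay)

# Cross-tabulate against the raw variables to see the routing
table(kid = kid_h, humidity_le_75 = WeatherPlay$humidity <= 75)
table(kid = kid_o, outlook = WeatherPlay$outlook)
```

With the default settings, values less than or equal to the cutoff go to the first child and larger values to the second.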

Incorporate decisions into nodes:

pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))
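Before attaching any data, it can help to check that the hand-built node structure is what you meant. This sketch rebuilds the same nodes so it runs on its own and then inspects them; the variable names all_ids and leaf_ids are only illustrative.

```r
library(partykit)
data("WeatherPlay", package = "partykit")

sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)

pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))

all_ids  <- nodeids(pn)                   # ids of every node in the tree
leaf_ids <- nodeids(pn, terminal = TRUE)  # ids of the terminal nodes only

# Printing with data shows variable names and levels instead of column numbers
print(pn, data = WeatherPlay)

# Terminal node that each observation ends up in
fitted_node(pn, data = WeatherPlay)
```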

Attach the data to the tree, together with the node each observation falls into and the response:

t2 <- party(pn,
  data = WeatherPlay,
  fitted = data.frame(
    "(fitted)" = fitted_node(pn, data = WeatherPlay), # terminal node id per row
    "(response)" = WeatherPlay$play,                  # response variable
    check.names = FALSE),
  terms = terms(play ~ ., data = WeatherPlay))

t3 <- as.constparty(t2)
plot(t3)

source: http://cran.r-project.org/web/packages/partykit/vignettes/partykit.pdf
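For the comparison asked about in the question, one possible approach is to predict with the hand-built tree and, if RWeka is available, convert the J48 model with as.party() so both trees are drawn by the same partykit plotting code. This is a sketch rather than a definitive recipe: it evaluates on the training data only, the names manual_tree, manual_pred and manual_acc are made up here, and the J48 part assumes a working RWeka/Java installation.

```r
library(partykit)
data("WeatherPlay", package = "partykit")

# Rebuild the manual tree from the steps above
sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)
pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))
manual_tree <- as.constparty(party(pn,
  data = WeatherPlay,
  fitted = data.frame(
    "(fitted)" = fitted_node(pn, data = WeatherPlay),
    "(response)" = WeatherPlay$play,
    check.names = FALSE),
  terms = terms(play ~ ., data = WeatherPlay)))

# Predicted class per observation and a confusion table
manual_pred <- predict(manual_tree, newdata = WeatherPlay)
table(predicted = manual_pred, actual = WeatherPlay$play)
manual_acc <- mean(manual_pred == WeatherPlay$play)

# Optional: the same for a J48 tree (assumes RWeka and Java are installed)
if (requireNamespace("RWeka", quietly = TRUE)) {
  j48 <- RWeka::J48(play ~ ., data = WeatherPlay)
  j48_acc <- mean(predict(j48, newdata = WeatherPlay) == WeatherPlay$play)
  plot(as.party(j48))  # drawn in the same style as plot(manual_tree)
}
plot(manual_tree)
```

For a fair comparison on real data, the accuracy should be computed on held-out observations rather than on the data used to grow the J48 tree.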

