How do I convert an "RWeka" decision tree into a "party" tree in R?

I am using the RWeka package in R to fit M5' trees to a dataset using "M5P". I then want to convert the generated tree into a "party" tree so that I can access variable importances. The issue is that I can't get the function as.party to work without the following error:

"Error: all(sapply(split, head, 1) %in% c("<=", ">")) is not TRUE"

This error only arises when I apply the function within a for loop, but the for loop is necessary because I am running 5-fold cross-validation.

Below is the code I have been running:

library(RWeka)     # provides M5P() and Weka_control()
library(partykit)  # provides as.party()

n <- nrow(data)
k <- 5

# Randomly assign each observation to one of the k folds
indCV <- sample(rep(1:k, each = ceiling(n / k)), n)

for (i in 1:k) {
  # Training data: all observations where indCV is not equal to i
  training_data <- data.frame(x[indCV != i, ])
  training_response <- y[indCV != i]

  # Test data: the fifth of the observations where indCV is equal to i
  test_data <- x[indCV == i, ]
  test_response <- y[indCV == i]

  # Fit the model to the training data
  # (note: for Weka's M5P, N = TRUE requests an unpruned tree)
  fit <- M5P(training_response ~ ., data = training_data, control = Weka_control(N = TRUE))

  # Convert to party
  p <- as.party(fit)
}

The RWeka package has an example for converting M5P trees into party objects. If you run example("M5P", package = "RWeka"), the tree visualizations are actually drawn by partykit. After running the examples, see plot(m3) and as.party(m3).
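As a rough sketch of that workflow outside of example(), the following fits an M5P tree and converts it. It assumes the cpu.arff data set that ships with RWeka, a working Java setup, and that partykit is installed; the names DF3 and m3 are chosen to mirror the package example and may not match it exactly.

```r
library(RWeka)
library(partykit)

# Data set bundled with RWeka (assumed to be the one used in example("M5P"))
DF3 <- read.arff(system.file("arff", "cpu.arff", package = "RWeka"))

# Fit an M5' model tree with default settings
m3 <- M5P(class ~ ., data = DF3)

# Convert the Weka tree to partykit's representation and inspect it
p <- as.party(m3)
class(p)
print(p)
```

If the conversion succeeds here but fails inside the cross-validation loop, comparing the structure of the training data in the failing fold with DF3 may help narrow down the cause.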

However, while for J48 you can get a fully fledged constparty object, the same is not true for M5P. In the latter case, the tree structure itself can be converted to party, but the linear models within the nodes are not completely straightforward to convert into lm objects. Thus, if you want to use the party representation to compute measures that depend only on the tree structure (e.g., variables used for splitting, number of splits, split points, etc.), then you can do so. But if you want to compute measures that depend on the models or the predictions (e.g., mean squared errors), then the party class won't be of much help.
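A minimal sketch of such structure-only measures is shown below. It assumes the cpu.arff data bundled with RWeka, that all splits are numeric (so each split has a single break point), and that the variable indices stored in the splits refer to the columns of p$data; these are assumptions about the converted object rather than documented guarantees.

```r
library(RWeka)
library(partykit)

DF3 <- read.arff(system.file("arff", "cpu.arff", package = "RWeka"))
p <- as.party(M5P(class ~ ., data = DF3))

# Inner (splitting) nodes are all nodes that are not terminal
inner <- setdiff(nodeids(p), nodeids(p, terminal = TRUE))

# Splitting variable for each inner node (assumes varid indexes p$data)
split_vars <- unlist(nodeapply(p, ids = inner, FUN = function(node) {
  names(p$data)[varid_split(split_node(node))]
}))

# Split point for each inner node (assumes numeric, binary splits)
split_points <- unlist(nodeapply(p, ids = inner, FUN = function(node) {
  breaks_split(split_node(node))
}))

length(inner)      # number of splits
table(split_vars)  # how often each variable is used for splitting
data.frame(node = inner, variable = split_vars, splitpoint = split_points)
```

Split counts like these summarize the tree structure only; they are not a model-based importance measure, since the linear models in the leaves are ignored.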
