Closures in R, calling functions within a function, recursive functions
I am new to R and I am trying out a classification decision tree using ctree() from the party package. All seems to be fine: I get the expected result and a descriptive plot.
Now, if I want to extract the results from the summary of the fit, I have to traverse each node and extract the information. Fortunately this has already been written by @baydoganm here. I want to extend this code and write the results to a data frame instead of printing them.
Reproducible code:
library(party)
ct <- ctree(Species ~ ., data = iris)
traverse <- function(treenode) {
  if (treenode$terminal) {
    bas <- paste(treenode$nodeID, treenode$prediction)
    print(bas)  # here the results are printed
    return(0)
  }
  traverse(treenode$left)
  traverse(treenode$right)
}
traverse(ct@tree)  # function call
This works fine and I get the output on the console. Now, if I want to write the results to a data frame, I run into problems.
What I have tried so far: writing to a list using mutable closures, but I am not sure how to get it working.
l <- list()
count = 0
traverse1 <- function(treenode, l) {
  if ((treenode$terminal == T)) {
    count <<- count + 1
    print(count)
    node = c(treenode$nodeID)
    pred = c(treenode$prediction)
    l[[count]] <- data.frame(node, pred)  # write results in the dataframe
  }
  traverse1(treenode$left, l)
  traverse1(treenode$right, l)
}
test <- traverse1(ct@tree, l)  # function call
I get only the results of my last call to the function, and the rest are NULL.
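One likely reason for this behaviour is that R modifies function arguments as local copies, so each assignment to l inside traverse1() is lost when that call returns. A minimal sketch of the closure idea follows. It assumes a hypothetical toy tree built from nested lists that only mimics the field names used above (terminal, nodeID, prediction, left, right); it is not the real ct@tree object, and collect_leaves() and leaf() are made-up helper names.

```r
# Hypothetical stand-in for ct@tree: nested lists with the same field names.
leaf <- function(id, pred) list(terminal = TRUE, nodeID = id, prediction = pred)
toy_tree <- list(
  terminal = FALSE, nodeID = 1,
  left  = leaf(2, "setosa"),
  right = list(terminal = FALSE, nodeID = 3,
               left  = leaf(4, "versicolor"),
               right = leaf(5, "virginica"))
)

collect_leaves <- function(root) {
  rows <- list()  # lives in the environment of collect_leaves()
  visit <- function(node) {
    if (node$terminal) {
      # `<<-` updates `rows` in the enclosing function, not a local copy
      rows[[length(rows) + 1]] <<- data.frame(node = node$nodeID,
                                               pred = node$prediction)
      return(invisible(NULL))  # stop recursing at a terminal node
    }
    visit(node$left)
    visit(node$right)
  }
  visit(root)
  do.call(rbind, rows)  # one data frame, one row per terminal node
}

leaves <- collect_leaves(toy_tree)
print(leaves)
```

Keeping rows inside collect_leaves() avoids touching the global environment, and the explicit return at terminal nodes keeps the recursion from descending into children that do not exist.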
Smart way: use assign() to write to the global environment:
library(party)
library(stringr)  # provides str_split(), used below
ct <- ctree(Species ~ ., data = iris)
tt <- NULL
traverse <- function(treenode) {
  if (treenode$terminal) {
    bas <- paste(treenode$nodeID, treenode$prediction)
    # append to the global vector tt
    assign("tt", c(tt, bas), envir = .GlobalEnv)
    print(bas)  # here the results are printed
    return(0)
  }
  traverse(treenode$left)
  traverse(treenode$right)
}
traverse(ct@tree)  # function call
data.frame(node.id = unlist(lapply(str_split(tt, " "), function(x) x[[1]])),
           prediction = unlist(lapply(str_split(tt, " "), function(x) x[[2]])))
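The splitting step can also be done in base R without stringr. The sketch below assumes tt holds strings of the form "nodeID prediction" separated by a single space; the values shown are made up for illustration and are not output from the tree above.

```r
# Hypothetical contents of tt, in the "nodeID prediction" form built by paste()
tt <- c("2 setosa", "5 versicolor", "7 virginica")

# strsplit() returns one character vector per element; rbind them into a matrix
parts <- do.call(rbind, strsplit(tt, " ", fixed = TRUE))

res <- data.frame(node.id = parts[, 1],
                  prediction = parts[, 2])
print(res)
```

Note that this only works as written when every string splits into the same number of pieces.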
Dirty way: use sink() to save your printed output.
sink(file = "test.csv", append = TRUE)  # redirect console output to a file
traverse(ct@tree)  # function call
sink()  # restore normal console output
tt <- read.csv("test.csv", header = FALSE)
If you use the new, improved ctree() implementation from the partykit package, then this has all the information you need in its fitted component:
library("partykit")
ct <- ctree(Species ~ ., data = iris)
head(fitted(ct))
##   (fitted) (weights) (response)
## 1        2         1     setosa
## 2        2         1     setosa
## 3        2         1     setosa
## 4        2         1     setosa
## 5        2         1     setosa
## 6        2         1     setosa
So for a classification tree you can easily construct the table of absolute frequencies of the response using xtabs() (or table()). And for a regression tree, tapply() could easily be used to get means, medians, etc.
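For the regression case, a minimal sketch of the tapply() idea is shown here. It uses a small hypothetical data frame that only imitates the column names of fitted(ct); the numbers are invented and do not come from a fitted tree.

```r
# Hypothetical stand-in for fitted(ct) of a regression tree
fit <- data.frame(c(2, 2, 3, 3, 3), c(10, 14, 20, 22, 27))
names(fit) <- c("(fitted)", "(response)")  # non-syntactic names need backticks

# Summarise the response within each terminal node
node_means   <- tapply(fit$`(response)`, fit$`(fitted)`, mean)
node_medians <- tapply(fit$`(response)`, fit$`(fitted)`, median)

print(node_means)
print(node_medians)
```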
For the classification tree fitted above, let's look at absolute and relative frequencies in tabular form:
tab <- xtabs(~ `(fitted)` + `(response)`, data = fitted(ct))
tab
##         (response)
## (fitted) setosa versicolor virginica
##        2     50          0         0
##        5      0         45         1
##        6      0          4         4
##        7      0          1        45
ptab <- prop.table(tab, 1)
ptab
##         (response)
## (fitted)     setosa versicolor  virginica
##        2 1.00000000 0.00000000 0.00000000
##        5 0.00000000 0.97826087 0.02173913
##        6 0.00000000 0.50000000 0.50000000
##        7 0.00000000 0.02173913 0.97826087
An alternative route to obtain the frequency table tab would be: table(predict(ct, type = "node"), iris$Species).
If you want to turn any of these into a data frame, as.data.frame() works just fine (probably plus some relabeling of the variables...):
as.data.frame(ptab)
##    X.fitted. X.response.       Freq
## 1          2      setosa 1.00000000
## 2          5      setosa 0.00000000
## 3          6      setosa 0.00000000
## 4          7      setosa 0.00000000
## 5          2  versicolor 0.00000000
## 6          5  versicolor 0.97826087
## 7          6  versicolor 0.50000000
## 8          7  versicolor 0.02173913
## 9          2   virginica 0.00000000
## 10         5   virginica 0.02173913
## 11         6   virginica 0.50000000
## 12         7   virginica 0.97826087
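The relabeling mentioned above could look roughly like this. The sketch builds a small table from made-up vectors rather than from the fitted tree, and the column names node, species and proportion are just one possible choice.

```r
# Made-up node assignments and classes, standing in for the tree's output
node    <- c(2, 2, 5, 5, 5, 7)
species <- c("setosa", "setosa", "versicolor", "versicolor",
             "virginica", "virginica")

tab  <- table(node, species, dnn = c("(fitted)", "(response)"))
ptab <- prop.table(tab, 1)  # row-wise proportions

df <- as.data.frame(ptab)
names(df) <- c("node", "species", "proportion")  # replace the mangled names
print(df)
```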