Accessing individual results from decision tree function JRip (RWeka library)
I am using library(RWeka) and running the JRip function on a data set. Does anybody know of a way to access the resulting rule set programmatically, so that I can access each rule individually?
Here is an example for illustrative purposes only:
> library(RWeka)
> library(datasets)
> head(npk)
  block N P K yield
1     1 0 1 1  49.5
2     1 1 1 0  62.8
3     1 0 0 0  46.8
4     1 1 0 1  57.0
5     2 1 0 0  59.8
6     2 1 1 1  58.5
> tree_rip <- JRip(block ~ ., data = npk)
> tree_rip
JRIP rules:
===========
(yield <= 48.8) => block=4 (5.0/2.0)
(yield <= 52) => block=5 (4.0/1.0)
=> block=3 (15.0/11.0)
Number of Rules : 3
I would like to access the results in a data frame/table fashion. The closest I have come is retrieving a single string blob in the following manner:
> tree_rip$classifier
[1] "Java-Object{JRIP rules:\n===========\n\n(yield <= 48.8) => block=4 (5.0/2.0)\n(yield <= 52) => block=5 (4.0/1.0)\n => block=3 (15.0/11.0)\n\nNumber of Rules : 3\n}"
I need something that would allow me to get each result separately, just as it is printed when I call tree_rip, so that I can not only get the number of rules found but also access them one by one.
At the very least something like this (but ideally accessing each result variable separately for every row):
[1] (yield <= 48.8) => block=4 (5.0/2.0)
[2] (yield <= 52) => block=5 (4.0/1.0)
...
Thanks!
This proved surprisingly difficult for me, as I do not normally use R's integration with Java. At any rate, I looked at the following results in an effort to find out how the REPL had produced the output you were seeing:
str(tree_rip)
# omitting about 15 lines of output
# - attr(*, "class")= chr [1:3] "JRip" "Weka_rules" "Weka_classifier"
getAnywhere(print.JRIP)
# no object named ‘print.JRIP’ was found
getAnywhere(print.Weka_rules)
# no object named ‘print.Weka_rules’ was found
help(package = "RWeka")
getAnywhere(print.Weka_classifier)
# this did succeed ... so I thought `.jcall` should also succeed
.jcall(tree_rip$classifier, "S", "toString")
# Error: could not find function ".jcall"
RWeka:::.jcall(tree_rip$classifier, "S", "toString")
# Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
# object '.jcall' not found
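The two failures above are consistent with how R separates loading a namespace from attaching a package: a bare call is resolved through the search path, and `:::` only looks inside the named package's own namespace, not in what that package imports. Here is a minimal sketch of the distinction, using the base package `tools` as a stand-in for rJava (my substitution, so that it runs without Java); `file_ext` is simply a convenient exported function to probe with.

```r
# Load the namespace of a base package without attaching it.
# 'tools' ships with every R installation; it stands in for rJava here.
loadNamespace("tools")

# TRUE once the namespace has been loaded.
is_loaded <- "tools" %in% loadedNamespaces()

# TRUE only if the package is on the search path, e.g. after library(tools).
is_attached <- "package:tools" %in% search()

# A bare call such as file_ext("rules.txt") is looked up via the search path,
# so whether it is visible depends on the package being attached.
bare_visible <- exists("file_ext", mode = "function")

# A qualified call only needs the namespace to be loaded.
ext <- tools::file_ext("rules.txt")

print(c(loaded = is_loaded, attached = is_attached, bare_visible = bare_visible))
print(ext)
```

By the same reasoning, `rJava::.jcall(...)` should work in the session above without attaching anything, since `.jcall` is exported by rJava.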
... I discovered that one needs to attach pkg:rJava in order to get access to the .jcall function (calling it as rJava::.jcall should also work). Apparently this is one of those situations where the supporting package is loaded but not attached. (Analogous to assuming [incorrectly] that grid.text should be available when only pkg:lattice is attached.) So this gives you the desired set of strings:
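library(rJava)
rule_text <- .jcall(tree_rip$classifier, "S", "toString")
# scan() skips the blank lines; rows 1, 2 and 6 are the header, its underline and the rule count
as.matrix(scan(text = rule_text, sep = "\n", what = ""))[-c(1:2, 6), , drop = FALSE]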
#------------
[,1]
[1,] "(yield <= 48.8) => block=4 (5.0/2.0)"
[2,] "(yield <= 52) => block=5 (4.0/1.0)"
[3,] " => block=3 (15.0/11.0)"
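Since the question asks for a data frame, the strings can be split further with base R regular expressions. This is only a sketch: `txt` is copied from the output shown in the question so that it runs without Java (in a live session it would come from the `.jcall(...)` call), and the column names are my own. In particular, reading the two numbers in parentheses as covered instances and misclassified instances is my understanding of Weka's output, not something this thread confirms. Selecting lines that contain "=>" also avoids hard-coding the row indices, so it should not depend on there being exactly three rules.

```r
# Text of the fitted rules, copied from the question so this runs without Java.
# In a live session: txt <- rJava::.jcall(tree_rip$classifier, "S", "toString")
txt <- "JRIP rules:\n===========\n\n(yield <= 48.8) => block=4 (5.0/2.0)\n(yield <= 52) => block=5 (4.0/1.0)\n => block=3 (15.0/11.0)\n\nNumber of Rules : 3\n"

lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]

# Rule lines are the only ones that contain "=>".
rule_lines <- grep("=>", lines, fixed = TRUE, value = TRUE)

# Left of "=>" is the condition (empty for the default rule), right is the outcome.
condition <- trimws(sub("=>.*$", "", rule_lines))
rhs       <- trimws(sub("^.*=>", "", rule_lines))

# Outcome looks like "block=4 (5.0/2.0)": attribute, class, and two counts.
m      <- regmatches(rhs, regexec("^([^=]+)=(.*) \\(([0-9.]+)/([0-9.]+)\\)$", rhs))
fields <- do.call(rbind, m)  # column 1 is the full match, columns 2-5 the groups

rules <- data.frame(
  condition = condition,
  attribute = fields[, 2],
  class     = fields[, 3],
  covered   = as.numeric(fields[, 4]),  # assumed meaning: instances covered
  errors    = as.numeric(fields[, 5]),  # assumed meaning: instances misclassified
  stringsAsFactors = FALSE
)

print(rules)
```

Individual rules are then available as `rules[1, ]`, `rules$condition[2]`, and so on, and `nrow(rules)` gives the number of rules. A line that does not match the pattern would break the `rbind` step, so the regular expression may need adjusting for other models.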