R Weka J48 Decision Tree: Cannot handle numeric class

I found this document on the web: https://www.erpublication.org/admin/vol_issue1/upload%20Image/IJETR032129.pdf

On page 4, it builds a decision tree using the RWeka package and the J48 function in R. The author's example has both numerical and categorical values.

So I ran a test with just one column trying to predict the other. Here is a sample:

VALUE CHURNED_F
    2         1
    2         0
    2         0
    2         0
    2         0
    1         0

This is my code:

m2 <- J48(`CHURNED_F` ~ ., data = head(train[, -c(1)]))

But I get this error:

Error in .jcall(o, "Ljava/lang/Class;", "getClass") : 
  weka.core.UnsupportedAttributeTypeException: weka.classifiers.trees.j48.C45PruneableClassifierTree: Cannot handle numeric class!

I don't understand the error. First of all, it is a categorical class. Second, the example in the document uses both categorical and numerical columns without any problem.

How can I get this to work?

J48 requires the class to be categorical or, in R terms, a factor. I believe that your "CHURNED_F" variable is numeric. You can check the type of each of your variables with the str() function:

str(train)  
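
To check only the class column rather than the whole data frame, something like the following may help. This is a minimal sketch that assumes the six sample rows from the question, rebuilt here by hand as a data frame called train:

```r
# Assumption: the sample from the question, with CHURNED_F stored as plain numbers
train <- data.frame(VALUE = c(2, 2, 2, 2, 2, 1),
                    CHURNED_F = c(1, 0, 0, 0, 0, 0))

str(train)                               # lists the type of every column
class_type <- class(train$CHURNED_F)     # "numeric", which J48 rejects as a class
class_is_factor <- is.factor(train$CHURNED_F)
print(class_type)
print(class_is_factor)
```

If is.factor() returns FALSE for the class column, J48 will treat it as a numeric target and raise the error shown in the question.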

The code below builds a J48 tree. Here I ensure "CHURNED_F" is a factor.

library(RWeka)
train <- data.frame(VALUE = c(2,2,2,2,2,1), CHURNED_F = factor(c(1,0,0,0,0,0)))
m2 <- J48(CHURNED_F ~., data = train)
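
If the data frame already exists, as in the question, the numeric column can probably be converted in place instead of rebuilding the data. A minimal sketch, assuming the sample rows from the question and that RWeka (with a working Java installation) is available; the model fit is skipped otherwise:

```r
# Assumption: existing data with a numeric 0/1 class column, as in the question
train <- data.frame(VALUE = c(2, 2, 2, 2, 2, 1),
                    CHURNED_F = c(1, 0, 0, 0, 0, 0))

# Convert the class column to a factor in place; the levels become "0" and "1"
train$CHURNED_F <- factor(train$CHURNED_F)

# Fit the tree only if RWeka can be loaded
if (requireNamespace("RWeka", quietly = TRUE)) {
  m2 <- RWeka::J48(CHURNED_F ~ ., data = train)
  print(m2)
}
```

The predictor VALUE can stay numeric; only the class on the left-hand side of the formula has to be a factor.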

It means that your answer column must be a factor (a categorical value) instead of a numeric value. You can change it with this method:

Wine$X1=factor(Wine$X1,levels = c(1,2,3),labels = c("Uno","Dos","Tres"))

Wine is my data set. X1 is my answer column. 1, 2 and 3 are the answers. Uno, Dos and Tres are the labels I want after converting the numeric values.
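
The same relabeling can be applied to the 0/1 column from the question. A small sketch; the data frame is rebuilt from the question's sample, and the labels "No" and "Yes" are hypothetical, chosen only for illustration:

```r
# Assumption: the question's sample data with a numeric 0/1 class column
train <- data.frame(VALUE = c(2, 2, 2, 2, 2, 1),
                    CHURNED_F = c(1, 0, 0, 0, 0, 0))

# Map 0 -> "No" and 1 -> "Yes" (hypothetical labels)
train$CHURNED_F <- factor(train$CHURNED_F,
                          levels = c(0, 1),
                          labels = c("No", "Yes"))

print(levels(train$CHURNED_F))
print(table(train$CHURNED_F))
```

Any value that is not listed in levels becomes NA, so it is worth checking the result with table() or summary() after the conversion.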
