Error in tm_map method
I am new to R as well as to the tm package. My task is to perform text document classification using decision trees. I am following someone else's project, which has the full code on page 14. There are 2 types of documents, which I loaded using DirSource without any problems. My next step was merging the corpora into one collection:
# Merge corpora into one collection
docs <- c(wheat.train, crude.train, wheat.test, crude.test)
Then I wanted to do some pre-processing:
#pre-processing
docs.p <- docs
docs.p <- tm_map (docs.p, stripWhitespace)
But I got this error:
Error in UseMethod("tm_map", x) :
no applicable method for 'tm_map' applied to an object of class "list"
I understand that the author is using an earlier version of tm, and that tm_map currently takes a single corpus as its argument, not a collection of corpora. My question is: how can I create a collection of corpora that can be pre-processed?
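One way to check this diagnosis is to look at the class of the merged object before calling tm_map. The sketch below uses two small made-up corpora (the names a, b and merged are only for illustration) and assumes that corpus objects in the installed tm version inherit from the class "Corpus":

```r
library(tm)

# Two tiny stand-in corpora (hypothetical data, not the original documents)
a <- Corpus(VectorSource("first document"))
b <- Corpus(VectorSource("second document"))
merged <- c(a, b)

# tm_map is an S3 generic, so it dispatches on the class of its first argument
class(merged)

# FALSE here would mean tm_map has no method for the merged object
is_corpus <- inherits(merged, "Corpus")
is_corpus
```

If class(merged) reports "list", that matches the error message above.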
It worked for me using list instead of c, and then lapply:
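library(tm)
ex1 <- "bla bla blah "
ex2 <- "dunno what else to say "
wheat <- Corpus(VectorSource(ex1))
crude <- Corpus(VectorSource(ex2))
# Keep the corpora as separate list elements
docs <- list(wheat, crude)
# Apply tm_map to each corpus in turn; the result is a list of processed corpora
docs.p <- lapply(docs, tm_map, stripWhitespace)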
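If a single merged corpus is needed rather than a list, a possible alternative is to build the corpora with VCorpus. This is only a sketch, and it assumes the installed tm version provides a c() method for VCorpus objects, so that c() returns one corpus instead of a plain list. The example texts are made up:

```r
library(tm)

# Example texts with extra whitespace (hypothetical data)
wheat <- VCorpus(VectorSource("bla   bla  blah"))
crude <- VCorpus(VectorSource("dunno   what else to say"))

# Assumption: c() has a VCorpus method in this tm version,
# so the result is one corpus rather than a list
docs <- c(wheat, crude)
stopifnot(inherits(docs, "VCorpus"))

# tm_map can now be applied to the merged corpus directly
docs.p <- tm_map(docs, stripWhitespace)
as.character(docs.p[[1]])
```

The stopifnot call fails early if the assumption about c() does not hold, in which case the list and lapply approach above still applies.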