
Error in tm_map method

I am new to R and to the tm package. My task is to perform text document classification using decision trees. I am following someone's project, and the full code is on page 14 of it. There are 2 types of documents, which I loaded using DirSource without any problems. My next step was to merge these corpora into one collection:

# Merge corpora into one collection
docs <- c(wheat.train, crude.train, wheat.test, crude.test)

Then I would like to do some pre-processing:

# Pre-processing
docs.p <- docs
docs.p <- tm_map(docs.p, stripWhitespace)

But I got this error:

    Error in UseMethod("tm_map", x) : 
  no applicable method for 'tm_map' applied to an object of class "list"

I understand that the author was using an earlier version of tm, and that tm_map currently takes a single corpus as its argument, not a collection of corpora. My question is: how can I create a collection of corpora on which pre-processing can be performed?

It worked for me using list instead of c, and then lapply to apply tm_map to each corpus:

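library(tm)

# Two small example texts with extra whitespace
ex1 <- "bla bla blah   "
ex2 <- "dunno    what else to say    "

wheat <- Corpus(VectorSource(ex1))
crude <- Corpus(VectorSource(ex2))

# Keep the corpora in a plain list and call tm_map on each one
docs <- list(wheat, crude)
docs.p <- lapply(docs, tm_map, stripWhitespace)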
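
If you need more than one pre-processing step, the same list-plus-lapply idea can be wrapped in a small function. This is only a sketch: the example texts and the helper name `preprocess` are made up for illustration, and I use VCorpus explicitly here rather than Corpus. It also checks that every list element really is a corpus, since tm_map dispatches on the class of its first argument.

```r
library(tm)

# Hypothetical stand-ins for the wheat and crude corpora from the question
wheat <- VCorpus(VectorSource("Wheat   prices  ROSE  "))
crude <- VCorpus(VectorSource("Crude    oil   FELL"))

docs <- list(wheat = wheat, crude = crude)

# tm_map needs a corpus, not a list, so check each element's class first
is_corpus <- vapply(docs, inherits, logical(1), what = "Corpus")

# Hypothetical helper: run several transformations on one corpus
preprocess <- function(corpus) {
  corpus <- tm_map(corpus, stripWhitespace)
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus
}

# Apply the helper to every corpus in the list
docs.p <- lapply(docs, preprocess)

# Pull out the text of the first document in each corpus
cleaned <- vapply(docs.p, function(corpus) content(corpus[[1]]), character(1))
print(cleaned)
```

Note that stripWhitespace collapses runs of whitespace into a single blank but does not trim it, so a trailing space can remain in the cleaned text.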
