
Error in creating TermDocumentMatrix using tm package in R

I am unable to create a term-document matrix with the tm package in R. Calling TermDocumentMatrix on a preprocessed corpus throws the following error:

Error in UseMethod("TermDocumentMatrix", x) : 
  no applicable method for 'TermDocumentMatrix' applied to an object of class "character"

Below is the script I am using, with R v3.4.1 and tm package v0.7-1.

data <- readLines("Data/en_US/en_US_sample.txt", n = 100)
data <- Corpus(VectorSource(data))
data <- tm_map(data, removePunctuation)
data <- tm_map(data, removeNumbers)
data <- tm_map(data, content_transformer(tolower))
data <- tm_map(data, removeWords, stopwords("en"))
data <- tm_map(data, stripWhitespace)
words <- TermDocumentMatrix("data")

I believe TermDocumentMatrix requires the corpus to be in some specific text document format, so I tried coercing my corpus to PlainTextDocument using tm_map, but that doesn't solve the problem. When I load my text data using Corpus on a VectorSource, the object created has the class SimpleCorpus, which might be the problem, but I am not sure.

Any help would be much appreciated. Thanks!

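You did everything right, except that on the last line you accidentally passed the character string "data" (note the quotes) to the TermDocumentMatrix() function instead of the object data.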
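
For illustration, here is a minimal sketch of the same pipeline with the final call corrected. It assumes the tm package is installed; the two sample sentences and the variable names `sample_lines`, `corpus` and `tdm` are made up for this example and stand in for the lines read from the file in the question.

```r
library(tm)

# Hypothetical sample text, used here instead of reading the file in the question
sample_lines <- c("The quick brown fox, aged 3, jumps!",
                  "A lazy dog sleeps all day.")

# Same preprocessing steps as in the question
corpus <- Corpus(VectorSource(sample_lines))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removeWords, stopwords("en"))
corpus <- tm_map(corpus, stripWhitespace)

# Confirm that the object is a corpus and not a character vector
print(class(corpus))

# Pass the corpus object itself (no quotes), not the string "corpus"
tdm <- TermDocumentMatrix(corpus)
inspect(tdm)
```

Quoting the name hands TermDocumentMatrix() a length-one character vector, and since the generic has no method for class "character", S3 dispatch fails with the error shown in the question. Checking class() on the argument before the call is a quick way to catch this kind of slip.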
