I would like to use the tm package to transform the columns of a data frame, i.e. apply functions such as content_transformer and removePunctuation to those columns.
For example, using the data frame below:
df <- data.frame(a=c("I love TEXTMINING","Here I GO, Again!!"))
I would like to use content_transformer to convert df$a to lower case and removePunctuation to strip the punctuation, so that the output looks like this:
a
1 i love textmining
2 here i go again
Is there a way to perform the above specifically using the functions in the tm package?
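Here is an example using the tm package: build a corpus from the column, apply the transformations with tm_map, then convert the result back to a character vector.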
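library(tm)
df <- data.frame(a = c("I love TEXTMINING", "Here I GO, Again!!"))
# as.character() guards against the column being a factor (the default in R < 4.0)
corpus <- Corpus(VectorSource(as.character(df$a)))
corpus <- tm_map(corpus, removeNumbers)
# tolower is a base R function, so it has to be wrapped in content_transformer
corpus <- tm_map(corpus, content_transformer(tolower))
# Optional: uncomment to drop English stop words as well
# corpus <- tm_map(corpus, removeWords, stopwords('english'))
corpus <- tm_map(corpus, removePunctuation)
# Convert the corpus back to a plain character vector
answer <- unlist(as.list(corpus))
answer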
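If a corpus is not needed for anything else, a shorter route may be enough. The sketch below assumes that the installed tm version provides a character method for removePunctuation (recent releases do), so it can be called on the column directly; stringsAsFactors = FALSE is added here so that df$a is a character vector on any R version.

```r
library(tm)

# stringsAsFactors = FALSE keeps the column as character on R < 4.0 as well
df <- data.frame(a = c("I love TEXTMINING", "Here I GO, Again!!"),
                 stringsAsFactors = FALSE)

# tolower() is base R; removePunctuation() comes from tm and is applied
# here to a plain character vector rather than to a corpus
df$a <- removePunctuation(tolower(df$a))

df
#                   a
# 1 i love textmining
# 2   here i go again
```

The same pattern extends to several columns, for example with lapply over the character columns of the data frame, while the corpus-based version above remains the better fit when a document-term matrix is built afterwards.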