How do I remove words from the wordcloud R package's stopword list so that they are included in the output?

Question

I'm using the package "wordcloud" (description: "Word Cloud") from the R Packages repository. When I create a word cloud from some arbitrary text, some words are omitted automatically, apparently because they are considered words that should not be part of a word cloud.

Code:

library(RColorBrewer)
library(NLP)
library(wordcloud)
library(tm)


wordcloud("foo bar oh oh by by bye bingo hell no", scale=c(3,1), colors=brewer.pal(6,"Dark2"),random.order=FALSE)

Output: (image not preserved)

I want to keep words like "oh" and "by" in the word cloud. How can I do that?

Edit: I would prefer to do this by removing these words from the set of stopwords used by the wordcloud package, rather than by working with frequencies.

Answer 1

Here's one way:

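library(wordcloud)
library(tm)

txt <- "foo bar oh oh by by bye bingo hell no"
corp <- Corpus(VectorSource(txt))
# Build the term-document matrix yourself; wordLengths lifts the default
# minimum word length so that two-letter words are kept
tdm <- TermDocumentMatrix(corp, control = list(wordLengths = c(-Inf, Inf)))
m <- as.matrix(tdm)
# Word frequencies, most frequent first
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
# Pass words and frequencies explicitly, so wordcloud() does not preprocess the text
wordcloud(d$word, d$freq, min.freq = 1)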
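
If you would rather edit the stopword list itself, as the question's edit asks, one option is to do the stopword removal yourself with a customized list and then pass the frequencies to wordcloud(). The following is only a sketch under some assumptions: the names keep_words and my_stopwords are made up for illustration, it assumes tm's English stopword list is the relevant one, and it relaxes wordLengths because TermDocumentMatrix() discards words shorter than three characters by default.

```r
library(tm)

txt <- "foo bar oh oh by by bye bingo hell no"

# Hypothetical names: the words to protect and the reduced stopword list
keep_words <- c("oh", "by")
my_stopwords <- setdiff(stopwords("english"), keep_words)

# Remove only the remaining stopwords from the text
corp <- VCorpus(VectorSource(txt))
corp <- tm_map(corp, removeWords, my_stopwords)

# Keep short words by lowering the minimum word length to 1
tdm <- TermDocumentMatrix(corp, control = list(wordLengths = c(1, Inf)))
freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)

# Plot only if the wordcloud package is installed
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(names(freq), freq, min.freq = 1, random.order = FALSE)
}
```

Words that are still on the stopword list are still dropped, so add any others you want to retain to keep_words.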

Answer 2

There are two ways to use wordcloud():

one with a single string containing all the words as the main argument: this is what you do now
one with a vector of words and a corresponding vector of frequencies

With the first form, wordcloud() calls tm, builds a corpus, removes the stopwords, and computes the term frequencies; it is during this processing that you lose the two-letter words.

A simple workaround is to use the second form, which does not require the tm package, by processing your string yourself before passing it to wordcloud():

library(stringr)
library(wordcloud)
library(RColorBrewer)

## The initial string
mystring <- "foo bar oh oh by by bye bingo hell no"
## Split it and count frequencies
tabl <- table(str_split(mystring,pattern=" "))
## Make the wordcloud: all words are there!
wordcloud(names(tabl),tabl,scale=c(3,1), colors=brewer.pal(6,"Dark2"),random.order=FALSE)
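
The same idea can probably be done without stringr as well. This is a minimal sketch in base R, assuming the text is plain words separated by whitespace; the variable names words and freq are just for illustration.

```r
# Base R sketch: split on whitespace and count occurrences (no stringr, no tm)
mystring <- "foo bar oh oh by by bye bingo hell no"
words <- strsplit(mystring, "[[:space:]]+")[[1]]
freq <- sort(table(words), decreasing = TRUE)

# Plot only if the wordcloud package is installed
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(names(freq), as.vector(freq), min.freq = 1,
                       random.order = FALSE)
}
```

Because the words and their frequencies are supplied explicitly, nothing is filtered as a stopword; the trade-off is that punctuation and letter case are not normalized for you.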
