Removing specific words from word cloud in R
I have made a word cloud in R for 2 songs. When I display the items in the tdm, I get the frequency of words for song 1 and song 2. I am also able to print the word cloud perfectly.
My problem is that I do not want words in the tdm whose frequency is less than 2. How can I do that?
I wrote this code and got the following output:
> tdm=TermDocumentMatrix(corpus)
> tdm=as.matrix(tdm)
> tdm
         song 1 song 2
act           0      2
action        0      2
actions       0      1
activity      5      4
I only want the word activity, as it occurs more than once in both songs. I mean I want to remove the words act, action, and actions. How can I do that?
You didn't provide data, so something like this should work:
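library(tm)  # provides TermDocumentMatrix() and the crude corpus
data("crude")
tdm <- TermDocumentMatrix(crude)
x <- as.matrix(tdm)[, 1:2]  # keep 2 documents to mimic the two songs
x[rowSums(apply(x, 2, ">", 1)) == 2, ]  # rows with a count > 1 in both columns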
Explanation: The line x <- as.matrix(tdm)[, 1:2] just takes 2 columns so the data looks like yours. It is not part of the solution itself; I needed data shaped like yours since you didn't provide any.
The line apply(x, 2, ">", 1) says: give me logical values for the statement "is this greater than 1?". Then I wrap this with rowSums (logical values count as TRUE=1 and FALSE=0). Row sums equal to 2 (I had > 1 before, but this is sloppy) are the condition you're looking for, because a sum of 2 means the word passed the test in both columns.
Then I use a logical index with this output: x[GRAB_THE_ROWS, ].
You can tear each step apart and run the code for yourself, as seen below:
(step_1 <- apply(x, 2, ">", 1))
(step_2 <- rowSums(step_1))
(step_3 <- step_2 == 2)
x[step_3, ]
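For the matrix shown in the question, the same idea can be tried without the tm package. The following is only a minimal sketch: the matrix is a hand-typed reconstruction of the question's output (an assumption, since the real corpus was not posted), and the names keep and filtered are made up for illustration. It uses x > 1, which should give the same logical matrix as apply(x, 2, ">", 1), and adds drop = FALSE because only one row survives here and R would otherwise simplify the result to a plain vector.

```r
# Hypothetical reconstruction of the matrix printed in the question
x <- matrix(
  c(0, 2,
    0, 2,
    0, 1,
    5, 4),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("act", "action", "actions", "activity"),
                  c("song 1", "song 2"))
)

# TRUE for words whose count is greater than 1 in every column (song)
keep <- rowSums(x > 1) == ncol(x)

# drop = FALSE keeps the result a matrix even when a single row is left
filtered <- x[keep, , drop = FALSE]
print(filtered)
```

Comparing against ncol(x) instead of a hard-coded 2 means the same filter should still work if more songs are added as columns.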