
Unable to tweak findAssocs() in the tm package in R

I was trying to find associations between the top 10 most frequent words and the rest of the frequent words in the input text.

When I look at the output of findAssocs() for a single word:

findAssocs(dtm, "good", corlimit=0.4)

The output is clearly labelled with the word 'good', for which the associations were sought.

$good
 better     got    hook    next content     fit  person 
   0.44    0.44    0.44    0.44    0.43    0.43    0.43 

But when I try to automate this process for a character vector holding the top 10 words:

t10 <- c("busi", "entertain", "topic", "interact", "track", "content", "paper", "media", "game", "good")

the output is a list of correlations for each of those elements, but without the word for which the associations were sought. The sample output is below (please notice that the word at t10[i] is not printed, unlike the individual output above, where 'good' was clearly printed):

t10_words <- list()  # the result list must exist before the loop assigns into it
for (i in seq_along(t10)) {
   t10_words[i] <- as.list(findAssocs(dtm, t10[i], corlimit=0.4))
}


> t10_words
[[1]]
   littl descript  disrupt    enter    model 
    0.50     0.48     0.48     0.48     0.48 

[[2]]
  immers    anyth   effect     full holodeck      iot  problem      say startrek     such  suspect      wow 
    0.68     0.48     0.48     0.48     0.48     0.48     0.48     0.48     0.48     0.48     0.48     0.48 

[[3]]
         area        captur          give        overal          like          alon          avid         begin 
         0.51          0.47          0.47          0.47          0.44          0.43          0.43          0.43 
      circuit         cloud collaboration      communic     communiti        concis        confus         defin 
         0.43          0.43          0.43          0.43          0.43          0.43          0.43          0.43 
      discord        doesnt          drop enablesupport        esport         event         everi       everyon 
         0.43          0.43          0.43          0.43          0.43          0.43          0.43          0.43 

How do I print the output along with the word for which the associations were sought?

Can somebody please help me with this?

Thanks.

After running your for loop, add the following piece of code:

names(t10_words) <- t10

This will name the list elements with the words specified in t10.
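The names go missing in the loop because single-bracket assignment (t10_words[i] <- ...) copies the value into position i but not its name. As a possible alternative to renaming afterwards, findAssocs() accepts a character vector of terms and returns a list named after them, so the loop may not be needed at all. The sketch below uses a made-up toy corpus (docs, terms and the variable names are illustrative assumptions, not the asker's data) and also shows a loop variant that assigns by name with [[ ]].

```r
library(tm)

# Toy corpus, invented for illustration only.
docs <- c("good better game fun", "good better media",
          "game fun paper", "paper topic media")
dtm <- DocumentTermMatrix(VCorpus(VectorSource(docs)))

terms <- c("good", "game")

# Vectorised call: one named list element per requested term.
assocs <- findAssocs(dtm, terms, corlimit = 0.4)
print(names(assocs))

# Loop variant: assigning with [[term]] stores the result under its name.
assocs_loop <- list()
for (term in terms) {
  assocs_loop[[term]] <- findAssocs(dtm, term, corlimit = 0.4)[[term]]
}
print(names(assocs_loop))
```

With the asker's data, the vectorised form would presumably be findAssocs(dtm, t10, corlimit=0.4), which prints each block under $busi, $entertain and so on.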


 