
Nested for-loop in R language

I have this code to find duplicates in a data frame using cosine similarity. The first loop runs nrow times and takes one tweet each time; the second loop then compares that tweet's cosine similarity scores against the other tweets.

Here is my code:

for (i in 1:nrow(temp)) {
  dup=0
  one_Tweets = tweets$Tweet[i]
  cos_similarity = data.frame("v1"=NULL) # NULL So that don't write previous value
  cos_similarity=data.frame(sim <- round( sim.strings(AllTweets,one_Tweets), digits = 3) )
  names(cos_similarity) = c( "v1")

  for (b in i+1:nrow(temp)) {
    Tweet_cos=cos_similarity$v1[b]
    if ( Tweet_cos >= 0.900) {
      count = count+1
      tweets$flag[b]= 1
    }else { #if ( Tweet_cos <0.900) {
      tweets$flag[b]= 2
    }
    Tweet_cos=0
  }
  dup=tweets$duplicate[i]= tweets$duplicate[i]+count 
  count = 0
}

I have a problem with the first loop: it is entered only once, even though the data frame contains 10,000 tweets.

And I get this error:

Error in if (Tweet_cos >= 0.9) { : missing value where TRUE/FALSE needed

I don't have enough reputation yet to post this as a comment, but I think you are getting this error because of NA/NULL values in Tweet_cos. To debug, remove this part of the code:

  for (b in i+1:nrow(temp)) {
    Tweet_cos=cos_similarity$v1[b]
    if ( Tweet_cos >= 0.900) {
      count = count+1
      tweets$flag[b]= 1
    }else { #if ( Tweet_cos <0.900) {
      tweets$flag[b]= 2
    }
    Tweet_cos=0
  }
  dup=tweets$duplicate[i]= tweets$duplicate[i]+count 
  count = 0

Replace all of it with print(cos_similarity$v1). You should see some NA/NULL values, which by definition cannot be compared with 0.9, hence the error.
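One likely source of those NA values can be sketched as follows. The vector below is a made-up stand-in for cos_similarity$v1 (its values and length are assumptions for illustration). In R, `:` binds tighter than `+`, so `i+1:n` is read as `i + (1:n)` and runs past the end of the vector; indexing past the end returns NA, and `if` cannot branch on NA.

```r
# Hypothetical stand-in for cos_similarity$v1 (values are made up)
v1 <- c(1.000, 0.950, 0.120, 0.910, 0.300)
n <- length(v1)
i <- 2

# `:` has higher precedence than `+`, so this is i + (1:n)
bad_idx <- i + 1:n         # indexes 3, 4, 5, 6, 7
# Parentheses give the intended range
good_idx <- (i + 1):n      # indexes 3, 4, 5

# Indexing past the end of a vector yields NA
out_of_range <- v1[bad_idx]

# Comparing NA with a number gives NA, which `if` rejects with an error
cmp <- out_of_range >= 0.9
res <- tryCatch(
  if (cmp[4]) "duplicate" else "unique",
  error = function(e) conditionMessage(e)
)
print(res)
```

Note that `(i + 1):n` counts downward when `i` equals `n`, so the outer loop would need to stop at `n - 1` or guard against that case.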

If there are too many iterations, print the values of i and b at the point where the error occurs, and print cos_similarity$v1 only for that iteration.
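A minimal sketch of that check, assuming a made-up score vector in place of cos_similarity$v1 and keeping the same `i + 1:n` indexing as the question. The name `first_bad` is hypothetical; it just records the first pair of indexes where the value is NA instead of letting `if` fail.

```r
# Hypothetical similarity scores standing in for cos_similarity$v1
v1 <- c(1.000, 0.950, 0.120, 0.910, 0.300)
n <- length(v1)

first_bad <- NULL
for (i in 1:n) {
  for (b in i + 1:n) {            # same indexing as in the question
    Tweet_cos <- v1[b]
    if (is.na(Tweet_cos)) {       # guard before the >= comparison
      first_bad <- c(i = i, b = b)
      message("NA at i = ", i, ", b = ", b)
      break
    }
  }
  if (!is.null(first_bad)) break  # stop at the first problem
}
print(first_bad)
```

With this indexing the first NA already shows up while i is still 1, which would be consistent with the outer loop appearing to run only once before the error.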

Please consider sharing a small sample of your data so that others can reproduce your problem.
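As a rough illustration of what such a sample could look like, the sketch below builds a tiny data frame and prints it with `dput()`, which emits R code that recreates the object. The tweets are invented for the example; only the column names Tweet and duplicate come from the question.

```r
# Made-up sample data, small enough to paste into a question
tweets <- data.frame(
  Tweet = c("good morning all",
            "good morning all",
            "rain again today",
            "good morning all!"),
  duplicate = 0,
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object
dput(tweets)
```

Readers can then paste the `dput()` output into their own session and run the loop against the same rows.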
