Having trouble in R with nested for loops and an external counter

I'm an R novice with experience in Python and C++, trying to do something that makes sense to me in those languages but apparently doesn't work in R. I have a JSON array with nested objects that I need to pull data from, and I need to keep the extracted values synchronized across separate vectors so I can build a new data frame and plot the data.

My data looks like this: {URL:[data], ... {VisitHistory:{0:[number], 1:[number]}}}

I'm trying to put this into tabular format, where I get one row for each entry in the VisitHistory array, and each of those rows has the same URL.

Here's what I have so far:

url<-c()
views<-c()
date<-c()
iter<-1

#bring in data
output<-fromJSON(file='filename')

#generate lists for each variable of interest
for(n in 1:length(output)) {
  for(x in 1:length(output[[n]]$th)) {
    url[iter]<-c(output[[n]]$url)
    if(!is.null(output[[n]]$th[[x]]$sh[[1]])) {

      views[iter]<-c(output[[n]]$th[[x]]$sh[[1]])
    }
    else {
      views[iter]<-c(-1)
    }
    date[iter]<-c(output[[n]]$th[[x]]$ts[[1]])

    iter<-iter+1
  }
  iter<-iter+1
}

I'm trying to use iter to make sure that url, views, and date all stay synchronized in their respective vectors until I merge them into a data frame. However, doing the assignments in that block with iter as the index makes the loop run indefinitely, and I can't figure out why.

I appreciate your help!

Have you tried printing iter inside the loop to see whether it is actually advancing through the iterations or stalling on something? Maybe your file is just huge. I am not providing a solution here, just a way to help you debug this.

Also, you are growing the vectors inside the for loop, so memory is allocated dynamically on every assignment, which makes things slow. Try allocating a fixed-size matrix or sequence (with seq or rep) for the variables at the beginning, and break out of the loops once iter has exhausted that size. If that works, then you know that time is the issue. E.g.:

# Avoid dynamic allocation, which is slow
# by preallocating memory.
url<-rep(0, 10)
views<-rep(0, 10)
date<-rep(0, 10)
iter<-1

#bring in data
output<-fromJSON(file='filename')

#generate lists for each variable of interest
for(n in 1:length(output)) {
  for(x in 1:length(output[[n]]$th)) {
    print(iter) # print the progression
    url[iter]<-c(output[[n]]$url)
    if(!is.null(output[[n]]$th[[x]]$sh[[1]])) {

      views[iter]<-c(output[[n]]$th[[x]]$sh[[1]])
    }
    else {
      views[iter]<-c(-1)
    }
    date[iter]<-c(output[[n]]$th[[x]]$ts[[1]])

    iter<-iter+1
    if(iter > 10) break
  }
  iter<-iter+1
  if(iter > 10) break
}

You might also want to consider defining a function that does what you want for a single element and applying it to your list, for example with the plyr package. But first try what I added above and see if that works. Also, to find the maximum number of iterations for the preallocation, you can do something like:

maxiter <- 0
for(i in 1:length(output)){
  maxiter <- maxiter + length(output[[i]]$th)
}
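
As a rough sketch of how that total could be used, the vectors can be preallocated with their final types and filled with a single counter. The sample output list below is made up for illustration; only the field names url, th, sh, and ts come from the question's code. The loops iterate over the lists directly instead of using 1:length(...), so a record with an empty th is skipped rather than iterated as 1:0.

```r
# Hypothetical sample data mimicking the structure in the question
output <- list(
  list(url = "http://a.example",
       th = list(list(sh = list(5), ts = list("2014-01-01")),
                 list(ts = list("2014-01-02")))),   # no sh for this visit
  list(url = "http://b.example",
       th = list(list(sh = list(7), ts = list("2014-01-03"))))
)

# Count the rows needed: one per th entry across all records
total <- sum(vapply(output, function(item) length(item$th), integer(1)))

# Preallocate with the final types so nothing is regrown or coerced later
url   <- character(total)
views <- numeric(total)
date  <- character(total)

iter <- 1
for (item in output) {
  for (entry in item$th) {
    url[iter]   <- item$url
    views[iter] <- if (length(entry$sh) > 0) entry$sh[[1]] else -1
    date[iter]  <- entry$ts[[1]]
    iter <- iter + 1  # the only increment: one step per row written
  }
}

visits <- data.frame(url, views, date, stringsAsFactors = FALSE)
```

Because the counter advances exactly once per row written, the three vectors stay aligned and end up completely filled.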

Also, why are you incrementing iter in the outer loop? You only need to increment it in the innermost loop. The extra increment skips one index after each URL, which leaves NA gaps in the vectors.
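
One way to avoid the counter altogether, in the spirit of the function-plus-apply suggestion above, is to build a small data frame per record and bind the pieces together. This is only a sketch using base R's lapply() and do.call(rbind, ...) rather than plyr (plyr's ldply() fills a similar role if you prefer that package). The helper name visits_for and the sample data are hypothetical; the field names url, th, sh, and ts are taken from the question's code.

```r
# Hypothetical sample data mimicking the structure in the question
output <- list(
  list(url = "http://a.example",
       th = list(list(sh = list(5), ts = list("2014-01-01")),
                 list(ts = list("2014-01-02")))),   # no sh for this visit
  list(url = "http://b.example",
       th = list(list(sh = list(7), ts = list("2014-01-03"))))
)

# Turn one record into a data frame with one row per th entry
visits_for <- function(item) {
  data.frame(
    url   = rep(item$url, length(item$th)),  # same URL repeated on every row
    views = vapply(item$th,
                   function(e) if (length(e$sh) > 0) as.numeric(e$sh[[1]]) else -1,
                   numeric(1)),
    date  = vapply(item$th, function(e) as.character(e$ts[[1]]), character(1)),
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every record and stack the results
visits <- do.call(rbind, lapply(output, visits_for))
```

Each record is handled independently, so there is no shared index to keep in sync, and a record with an empty th simply contributes zero rows.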
