
Having trouble in R with nested for loops and an external counter

I'm an R novice with experience in Python and C++, trying to do something that makes sense to me in those languages but apparently doesn't work in R. I've got a JSON array of nested objects that I need to pull data from, and I need to keep the extracted values synchronized across separate vectors so I can build a new data frame and plot the data.

My data looks like this: {URL:[data], ... {VisitHistory:{0:[number], 1:[number]}}}

I'm trying to put this into tabular format, where I get one row for each entry in the VisitHistory array, with each of those rows sharing the same URL.
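For example, if one URL had two entries in VisitHistory, I'd want two rows like this (made-up values):

url       views  date
some-url  12     1001
some-url  7      1002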

Here's what I have so far:

library(rjson)  # provides fromJSON(file = ...)

# vectors to collect each column; iter keeps them aligned
url <- c()
views <- c()
date <- c()
iter <- 1

# bring in data
output <- fromJSON(file = 'filename')

# generate vectors for each variable of interest
for (n in 1:length(output)) {
  for (x in 1:length(output[[n]]$th)) {
    url[iter] <- output[[n]]$url
    if (!is.null(output[[n]]$th[[x]]$sh[[1]])) {
      views[iter] <- output[[n]]$th[[x]]$sh[[1]]
    } else {
      views[iter] <- -1
    }
    date[iter] <- output[[n]]$th[[x]]$ts[[1]]
    iter <- iter + 1
  }
  iter <- iter + 1
}

I'm trying to use iter to make sure that url, views, and date all stay synchronized in their respective vectors until I merge them into their own data frame. However, with iter as the assignment index in that block, the loop seems to run forever, and I can't figure out why.

I appreciate your help!

Have you tried printing the iter variable inside the loop to see if it is actually going through the iterations or halting on something? Maybe your file is just huge. I am not providing a solution, just a way to help you debug this.

Also, you are growing the vectors dynamically inside the for loop, which makes things slow. Try preallocating fixed-size vectors (e.g. with rep or seq) at the beginning and breaking out of the loops once iter has exhausted their size. If that works, then you know that running time, not an infinite loop, is the issue. E.g.:

# avoid dynamic allocation, which is slow, by preallocating memory
# (10 is just a small test size)
url <- rep(0, 10)
views <- rep(0, 10)
date <- rep(0, 10)
iter <- 1

# bring in data
output <- fromJSON(file = 'filename')

# generate vectors for each variable of interest
for (n in 1:length(output)) {
  for (x in 1:length(output[[n]]$th)) {
    print(iter)  # print the progression
    url[iter] <- output[[n]]$url
    if (!is.null(output[[n]]$th[[x]]$sh[[1]])) {
      views[iter] <- output[[n]]$th[[x]]$sh[[1]]
    } else {
      views[iter] <- -1
    }
    date[iter] <- output[[n]]$th[[x]]$ts[[1]]
    iter <- iter + 1
    if (iter > 10) break
  }
  iter <- iter + 1
  if (iter > 10) break
}

You might also want to consider defining a function for what you want and applying it over the list you have, e.g. with the plyr package; a rough sketch of that follows the snippet below. But first try what I added above and see if that works. Also, to find the total number of iterations for the preallocation, you can do something like:

maxiter <- 0
for (i in 1:length(output)) {
  maxiter <- maxiter + length(output[[i]]$th)
}
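Equivalently, assuming every element of output has a th list, base R can compute that total in one line:

maxiter <- sum(sapply(output, function(o) length(o$th)))

And here is a rough sketch of the function-plus-apply approach mentioned above, in base R (plyr's ldply would look much the same); it assumes the same field names (url, th, sh, ts) as your code:

# build one small data frame per top-level element, then stack them;
# data.frame() recycles the single url across all rows for that element
rows <- lapply(output, function(o) {
  data.frame(
    url   = o$url,
    views = sapply(o$th, function(t) if (!is.null(t$sh[[1]])) t$sh[[1]] else -1),
    date  = sapply(o$th, function(t) t$ts[[1]])
  )
})
result <- do.call(rbind, rows)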

Also, why are you incrementing the iter variable in the outer loop? You only need to increment it in the innermost loop. The extra increment skips an index on every pass through the outer loop, which leaves gaps (NA entries) in your vectors.
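For reference, a minimal sketch with the increment only in the inner loop, using maxiter from above for the preallocation and building the data frame at the end (same field names as your code assumed):

url <- rep(NA_character_, maxiter)
views <- rep(-1, maxiter)  # -1 is your original "missing" marker
date <- rep(NA, maxiter)
iter <- 1

for (n in 1:length(output)) {
  for (x in 1:length(output[[n]]$th)) {
    url[iter] <- output[[n]]$url
    if (!is.null(output[[n]]$th[[x]]$sh[[1]])) {
      views[iter] <- output[[n]]$th[[x]]$sh[[1]]
    }
    date[iter] <- output[[n]]$th[[x]]$ts[[1]]
    iter <- iter + 1  # increment once per row, only here
  }
}

result <- data.frame(url = url, views = views, date = date)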
