Making a faster for loop in R when rbinding data frames

I am trying to bind data frames that come from JSON data.

I tried rbind.fill inside a for loop, which works for small data but takes too long once there are more than 100k rows. In particular, I would like to know whether there is a vectorized way to do this that is faster than starting from an empty data frame and growing it.

big[1,1] holds a JSON string.

fromJSON(big[1,1]) returns a 6 x 2 data frame.

library(jsonlite) #Assumed source of fromJSON
library(plyr) #Provides rbind.fill

fromJSON(big[1,1]) #It is a 6 x 2 dataframe
row=nrow(big) #Number of rows, which is also the number of 'rt's

result=data.frame(latitude=integer(), longitude=integer()) #Make an empty dataframe which will store values
for (i in 1:row){
  result=rbind.fill(result,fromJSON(big[i,1])) #Bind the dataframes
}
result[,2]=result[,2]/100000 #Rescale the second column (longitude)
result #It would be 6*row x 2 dataframe

This is untested, but maybe something similar would work:

result_list <- lapply(big[, 1], "fromJSON")
result <- do.call("rbind.fill", result_list)
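To make the idea above concrete, here is a minimal sketch on toy data. It assumes that fromJSON comes from the jsonlite package and that each cell holds a JSON array of objects with latitude and longitude fields; the sample strings are hypothetical, since the original JSON is not shown.

```r
library(jsonlite)  # assumed source of fromJSON()
library(plyr)      # provides rbind.fill()

# Hypothetical stand-in for `big`: one JSON array of points per row.
big <- data.frame(
  rt = c(
    '[{"latitude": 3756000, "longitude": 12697000}, {"latitude": 3757000, "longitude": 12698000}]',
    '[{"latitude": 3551000, "longitude": 12904000}]'
  ),
  stringsAsFactors = FALSE
)

# Parse every JSON string once; each list element is a small data frame.
result_list <- lapply(big[, 1], fromJSON)

# Bind all pieces in a single call instead of growing `result` inside a loop.
result <- rbind.fill(result_list)

# Rescale the second column in one vectorized step, as in the question.
result[, 2] <- result[, 2] / 100000
```

rbind.fill accepts a list of data frames as its first argument, so the do.call wrapper is optional here.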

Probably not the most elegant answer, but you could read the arrays into a list and then use the Reduce function to bind all the rows together.

resultlist <- vector("list", row) #Preallocate one slot per row

for (i in 1:row){
  resultlist[[i]] <- fromJSON(big[i,1])
}

result <- Reduce(rbind.fill, resultlist)

I expect this to be much faster, since the data frame is no longer enlarged on every iteration of the loop.
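One caveat worth checking: a fold with Reduce still binds the pieces two at a time, so an intermediate data frame is rebuilt at every step. The sketch below contrasts that with a single bind over the whole list. It assumes jsonlite's fromJSON and that every piece has the same columns, so base rbind is enough; the JSON strings are hypothetical stand-ins for big[, 1].

```r
library(jsonlite)  # assumed source of fromJSON()

# Hypothetical stand-in for the JSON strings in big[, 1].
json_strings <- c(
  '[{"latitude": 1, "longitude": 100000}, {"latitude": 2, "longitude": 200000}]',
  '[{"latitude": 3, "longitude": 300000}]',
  '[{"latitude": 4, "longitude": 400000}, {"latitude": 5, "longitude": 500000}]'
)
n <- length(json_strings)

# Preallocate the list so the loop only fills existing slots.
pieces <- vector("list", n)
for (i in seq_len(n)) {
  pieces[[i]] <- fromJSON(json_strings[i])
}

# Pairwise fold: an intermediate data frame is built at every step.
folded <- Reduce(rbind, pieces)

# Single bind: all pieces are combined in one call.
bound <- do.call(rbind, pieces)
```

Both routes return the same rows; they differ only in how much copying happens along the way. If the pieces can have different columns, rbind.fill(pieces) from plyr would be the analogous single call.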

