R efficient looping suggestion

I have a data frame with about 500,000 rows. One of its columns, say column A, contains positive integer values. Let there be another column, B.

I now need to create a second data frame whose number of rows equals sum(dataframe$A). This part is done.

A performance problem arises when I need to fill this new data frame with data. I am trying to create a column A2 for the second data frame as follows:

A2<-vector() 
for (i in 1:nrow(dataframe)){
  A2<-c(A2,rep(dataframe$B[i],dataframe$A[i]))
}

The loop is obviously very slow for the large number of rows being processed. Any suggestions on how to achieve this task with faster processing?

Thanks for any responses.

You simply do not need the loop at all. rep is already vectorized: its times argument can be a vector with one count per element.

A2 <- rep(dataframe$B, dataframe$A)

That should work. As a reproducible example, here is your approach using the built-in mtcars dataset.

x <- vector()
for(i in 1:nrow(mtcars)) {x <- c(x, rep(mtcars$cyl[i], mtcars$gear[i]))}
> x
  [1] 6 6 6 6 6 6 6 6 4 4 4 4 6 6 6 8 8 8 6 6 6 8 8 8 4 4 4 4 4 4 4 4 6 6 6 6 6
 [38] 6 6 6 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 8
 [75] 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 8 8 8 8 8 6 6 6 6 6 8 8
[112] 8 8 8 4 4 4 4

And the vectorized version is:

x2 <- rep(mtcars$cyl, mtcars$gear)
> x2
  [1] 6 6 6 6 6 6 6 6 4 4 4 4 6 6 6 8 8 8 6 6 6 8 8 8 4 4 4 4 4 4 4 4 6 6 6 6 6
 [38] 6 6 6 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 8
 [75] 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 8 8 8 8 8 6 6 6 6 6 8 8
[112] 8 8 8 4 4 4 4

On large inputs, the vectorized call will be orders of magnitude faster than the loop, because each c() call inside the loop copies the entire vector built so far.
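To check both claims (same result, much faster) on something closer to the question's setup, here is a minimal sketch. The data frame df, its size n, and the randomly generated contents of columns A and B are illustrative assumptions, not the asker's real data; timings will vary by machine.

```r
# Toy stand-in for the real data; df, n, and the column contents are hypothetical.
set.seed(1)
n  <- 10000
df <- data.frame(A = sample(1:5, n, replace = TRUE),      # positive integer counts
                 B = sample(letters, n, replace = TRUE),  # values to repeat
                 stringsAsFactors = FALSE)

# Loop version from the question: every c() call copies the growing vector.
loop_time <- system.time({
  A2_loop <- vector()
  for (i in seq_len(nrow(df))) {
    A2_loop <- c(A2_loop, rep(df$B[i], df$A[i]))
  }
})

# Vectorized version: one call, with a per-element count passed as `times`.
vec_time <- system.time(A2_vec <- rep(df$B, df$A))

# Both approaches give the same vector, whose length is sum(df$A).
stopifnot(identical(A2_loop, A2_vec), length(A2_vec) == sum(df$A))

# Elapsed seconds for each approach.
print(c(loop = loop_time[["elapsed"]], vectorized = vec_time[["elapsed"]]))
```

The gap widens as the number of rows grows, since the loop's copying cost grows roughly quadratically with the output length while the single rep call stays linear.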
