
How to replace a for loop to make processing faster?

I have a lot of files to process. The data looks like this:

          V2              V3   V4
1 ID_0071817               1    1
2          1 201912312200+00 0.36
3          2 201912312300+00 0.36
4          3 202001010000+00 0.38
5 ID_0089011               1 1.00
6          1 202001010200+00 0.36


What I am doing now is:


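for (j in seq_len(nrow(data))) {
  # a "1" in column 2 marks a header row whose column 1 holds the ID
  if (data[j, 2] == "1") ID <- data[j, 1]
  data[j, 4] <- ID  # copy the most recent ID into column 4
}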


This produces:


          V2              V3   V4       V4.1
1 ID_0071817               1    1 ID_0071817
2          1 201912312200+00 0.36 ID_0071817
3          2 201912312300+00 0.36 ID_0071817
4          3 202001010000+00 0.38 ID_0071817
5 ID_0089011               1 1.00 ID_0089011
6          1 202001010200+00 0.36 ID_0089011


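The problem is that this approach is far too slow for the whole dataset. A single file takes 5 minutes, and I have several thousand of them.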


We can try cumsum, as shown below:

transform(
  df,
  V4.1 = V2[V3 == 1][cumsum(V3 == 1)]
)

This gives

          V2              V3   V4       V4.1
1 ID_0071817               1 1.00 ID_0071817
2          1 201912312200+00 0.36 ID_0071817
3          2 201912312300+00 0.36 ID_0071817
4          3 202001010000+00 0.38 ID_0071817
5 ID_0089011               1 1.00 ID_0089011
6          1 202001010200+00 0.36 ID_0089011
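
To see why the cumsum expression works, it may help to split it into its intermediate steps. The sketch below uses the sample vectors from this post; the names is_header, ids, group and result are illustrative and not part of the original answer. It assumes the first row is always a header row: otherwise cumsum starts at 0, and indexing with 0 drops elements, so the result would be shorter than the data.

```r
# Sample columns from the post
V2 <- c("ID_0071817", "1", "2", "3", "ID_0089011", "1")
V3 <- c("1", "201912312200+00", "201912312300+00",
        "202001010000+00", "1", "202001010200+00")

# TRUE on the rows that carry an ID in V2
is_header <- V3 == "1"

# One ID per group, in order of appearance
ids <- V2[is_header]

# Running count of header rows seen so far: 1 1 1 1 2 2
group <- cumsum(is_header)

# Look up each row's ID by its group number
result <- ids[group]
print(result)
```

Each row is mapped to the number of header rows seen so far, and that number is used as an index into the vector of IDs, so no explicit loop over rows is needed.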

Data

> dput(df)
structure(list(
  V2 = c("ID_0071817", "1", "2", "3", "ID_0089011", "1"),
  V3 = c("1", "201912312200+00", "201912312300+00", "202001010000+00",
         "1", "202001010200+00"),
  V4 = c(1, 0.36, 0.36, 0.38, 1, 0.36)
), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6"))

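Since the question mentions several thousand files, one possible way to reuse the cumsum approach is to wrap it in a small function and apply it to each table. This is only a sketch: add_id is a hypothetical helper name, and the list of in-memory data frames stands in for tables that would in practice be read from disk (for example with list.files and read.table, whose arguments depend on a file format not shown in the post). It also assumes every table starts with a header row.

```r
# Hypothetical helper (not from the original post): adds the V4.1 column
add_id <- function(df) {
  is_header <- df$V3 == "1"                        # rows that carry an ID
  df$V4.1 <- df$V2[is_header][cumsum(is_header)]   # ID of the current group
  df
}

# Sample data from the post
df <- data.frame(
  V2 = c("ID_0071817", "1", "2", "3", "ID_0089011", "1"),
  V3 = c("1", "201912312200+00", "201912312300+00",
         "202001010000+00", "1", "202001010200+00"),
  V4 = c(1, 0.36, 0.36, 0.38, 1, 0.36),
  stringsAsFactors = FALSE
)

# Stand-in for data frames read from many files (assumption)
tables <- list(a = df, b = df[5:6, ])

# Apply the same vectorised step to every table
processed <- lapply(tables, add_id)
print(processed$b)
```

Keeping the per-table logic in one function means the only loop left is over files, not over rows.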
