簡體   English   中英

R:在單個列中按組統計值的連續出現

[英]R: count consecutive occurrences of values in a single column and by group

我試圖創建一個相等數量的連續值,一個出現次數。 但是,即使行保持連續,我也希望一旦引入新的ID后就重置計數。

我的數據示例如下:

dataset <- data.frame(ID = 
c("a","a","a","a","a","a","a","b","b","b","b","b","b","b")
dataset$YesNO <- c(1,1,0,0,0,1,1,1,1,1,0,0,0,0)

所以我想用以下結果創建一個新列:

c(1,2,1,2,3,1,2,1,2,3,1,2,3,4)

我使用了在該論壇上找到的以下代碼:

dataset$Counter <- sequence(rle(as.character(dataset$YesNo))$lengths)

但是,這不會重置新ID號的計數。 而是繼續進行連續計數,結果輸出為:

c(1,2,1,2,3,1,2,3,4,5,1,2,3,4)

我缺少根據ID重置它的步驟。

謝謝!

您可以這樣做:

dataset$Counter <- with(dataset,
                        ave(YesNO, ID, FUN = function(x) sequence(rle(as.character(x))$lengths)))

輸出:

   ID YesNO Counter
1   a     1       1
2   a     1       2
3   a     0       1
4   a     0       2
5   a     0       3
6   a     1       1
7   a     1       2
8   b     1       1
9   b     1       2
10  b     1       3
11  b     0       1
12  b     0       2
13  b     0       3
14  b     0       4

使用rleid (來自data.table包)獲取分組變量,然后使用ave在該分組的通用值內應用seq_along

library(data.table)
transform(dataset, Counter = ave(YesNO, rleid(ID, YesNO), FUN = seq_along))

贈送:

   ID YesNO Counter
1   a     1       1
2   a     1       2
3   a     0       1
4   a     0       2
5   a     0       3
6   a     1       1
7   a     1       2
8   b     1       1
9   b     1       2
10  b     1       3
11  b     0       1
12  b     0       2
13  b     0       3
14  b     0       4

另外一個dplyr可能性:

dataset %>%
 group_by(ID, grp = with(rle(YesNO), rep(seq_along(lengths), lengths))) %>%
 mutate(Counter = seq_along(grp)) %>%
 ungroup() %>%
 select(-grp)

   ID    YesNO Counter
   <fct> <dbl>   <int>
 1 a        1.       1
 2 a        1.       2
 3 a        0.       1
 4 a        0.       2
 5 a        0.       3
 6 a        1.       1
 7 a        1.       2
 8 b        1.       1
 9 b        1.       2
10 b        1.       3
11 b        0.       1
12 b        0.       2
13 b        0.       3
14 b        0.       4

要么:

dataset %>%
 group_by(ID, grp = with(rle(YesNO), rep(seq_along(lengths), lengths))) %>%
 mutate(Counter = 1:n()) %>%
 ungroup() %>%
 select(-grp)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM