Conditionally remove the nth row of a group in `dplyr` in R
I have data of this type, where `Sequ` is a grouping variable:
df <- data.frame(
  Sequ = c(1,1,1,1,
           2,2,2,
           3,3,3,
           4,4,4,4,
           5,5,5,
           6,6,6,6),
  Speaker = c("A","B",NA,"A",
              "B",NA,"C",
              "A",NA,"A",
              "A","C",NA,"A",
              "A",NA,"C",
              "B","A",NA,"C")
)
For each `Sequ`, I want to remove the second row if its `Speaker` value is not `NA`. I tried this, but it removes the entire `Sequ`:
library(dplyr)
df %>%
  group_by(Sequ) %>%
  filter(!is.na(nth(Speaker, 2)))
How can I get the desired output:
df
1 1 A
2 1 <NA>
3 1 A
4 2 B
5 2 <NA>
6 2 C
7 3 A
8 3 <NA>
9 3 A
10 4 A
11 4 <NA>
12 4 A
13 5 A
14 5 <NA>
16 5 C
17 6 B
18 6 <NA>
19 6 C
In base R:
subset(df, ave(Sequ, Sequ, FUN=seq_along) != 2 | is.na(Speaker))
Sequ Speaker
1 1 A
3 1 <NA>
4 1 A
5 2 B
6 2 <NA>
7 2 C
8 3 A
9 3 <NA>
10 3 A
11 4 A
13 4 <NA>
14 4 A
15 5 A
16 5 <NA>
17 5 C
18 6 B
20 6 <NA>
21 6 C
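To see why this works, it can help to look at the intermediate vector on its own. The sketch below uses a smaller, made-up data frame (`small`, `pos`, `keep` and `result` are illustrative names, not part of the question) and shows that `ave(Sequ, Sequ, FUN = seq_along)` yields each row's position within its group:

```r
# Toy data, invented for illustration only
small <- data.frame(
  Sequ    = c(1, 1, 1, 2, 2, 2),
  Speaker = c("A", "B", NA, "B", NA, "C")
)

# Position of each row within its Sequ group: 1 2 3 1 2 3
pos <- ave(small$Sequ, small$Sequ, FUN = seq_along)

# Keep a row unless it is the second in its group AND Speaker is not NA
keep <- pos != 2 | is.na(small$Speaker)
result <- small[keep, ]
```

Only the second row of the first group is dropped here, because the second row of the other group has `NA` in `Speaker`.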
With `dplyr`:
library(dplyr)
df %>%
  group_by(Sequ) %>%
  filter(row_number() != 2 | is.na(Speaker))
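The attempt in the question behaves differently because `nth(Speaker, 2)` returns a single value per group, so the condition is the same for every row of that group and `filter()` keeps or drops the group as a whole. `row_number()`, by contrast, gives one value per row. As a rough sketch of how the same idea might extend to an arbitrary position, the hypothetical helper `drop_nth_unless_na()` below (a name invented here, not a `dplyr` function) takes the position `n` as an argument; the toy data frame `small` is also made up for illustration:

```r
library(dplyr)

# Hypothetical helper, not part of dplyr: drops the n-th row of each
# Sequ group unless that row's Speaker is NA.
drop_nth_unless_na <- function(data, n) {
  data %>%
    group_by(Sequ) %>%
    filter(row_number() != n | is.na(Speaker)) %>%
    ungroup()
}

# Toy data, invented for illustration only
small <- data.frame(
  Sequ    = c(1, 1, 1, 2, 2, 2),
  Speaker = c("A", "B", NA, "B", NA, "C")
)

res <- drop_nth_unless_na(small, 2)
```

The helper assumes the columns are named `Sequ` and `Speaker`, as in the question. The trailing `ungroup()` is a design choice so that later steps are not silently applied per group.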