R: How to Remove Rows with condition from another Group_By dataframe when Row Count is >1
I have the following sample dataset:
structure(list(Vno = c(1111, 1111, 2222, 3333, 3333, 4444, 5555,
5555), ID = c("A001", "X011", "B002", "C003", "Y033", "D004",
"E005", "X055"), Name = c("John", "S/O JJJ", "S/O LLL", "Jane",
"D/O MMM", "S/O ZZZ", "Nicole", "D/O ZZZ")), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
Output:
> df
# A tibble: 8 x 3
Vno ID Name
<dbl> <chr> <chr>
1 1111 A001 John
2 1111 X011 S/O JJJ
3 2222 B002 S/O LLL
4 3333 C003 Jane
5 3333 Y033 D/O MMM
6 4444 D004 S/O ZZZ
7 5555 E005 Nicole
8 5555 X055 D/O ZZZ
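When a group_by(Vno) group has more than one row, the expected output drops the rows whose Name starts with "S/O" or "D/O". However, my attempt below also removes rows with "S/O" or "D/O" that are the only row in their group: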
pt_byVno <- df %>%
group_by(Vno) %>%
filter(!grepl('S/O|D/O',Name)) %>%
print
Vno ID Name
<dbl> <chr> <chr>
1 1111 A001 John
2 3333 C003 Jane
3 5555 E005 Nicole
The desired output is:
# A tibble: 5 x 3
Vno ID Name
<dbl> <chr> <chr>
1 1111 A001 John
2 2222 B002 S/O LLL
3 3333 C003 Jane
4 4444 D004 S/O ZZZ
5 5555 E005 Nicole
I'd appreciate help from any R experts here. Thanks!
You can keep the rows whose group has only one row, or whose Name does not match 'S/O|D/O'.
library(dplyr)
df %>% group_by(Vno) %>% filter(n() == 1 | !grepl('S/O|D/O', Name))
# Vno ID Name
# <dbl> <chr> <chr>
#1 1111 A001 John
#2 2222 B002 S/O LLL
#3 3333 C003 Jane
#4 4444 D004 S/O ZZZ
#5 5555 E005 Nicole
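A variant of the same idea, as a minimal sketch: since the question asks about names that start with "S/O" or "D/O", the pattern can be anchored with ^ so that a name containing those characters elsewhere is not matched. The anchoring and the final ungroup() are assumptions on my part, not something the question requires; the data is rebuilt from the question's dput() output.

```r
library(dplyr)

# Sample data from the question
df <- tibble(
  Vno  = c(1111, 1111, 2222, 3333, 3333, 4444, 5555, 5555),
  ID   = c("A001", "X011", "B002", "C003", "Y033", "D004", "E005", "X055"),
  Name = c("John", "S/O JJJ", "S/O LLL", "Jane",
           "D/O MMM", "S/O ZZZ", "Nicole", "D/O ZZZ")
)

# n() is the number of rows in the current Vno group, so single-row
# groups are always kept. In larger groups, only rows whose Name does
# not start with "S/O" or "D/O" are kept.
result <- df %>%
  group_by(Vno) %>%
  filter(n() == 1 | !grepl("^(S/O|D/O)", Name)) %>%
  ungroup()  # drop the grouping so later steps work on a plain tibble

result
```

Note that if a group has more than one row and every Name in it matches the pattern, the whole group is dropped; the sample data has no such case, so decide separately what should happen there.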