R - check unique values by all members in group
I have data like this:
taxes_sol <- structure(list(type_tax = c(
"good1", "good2", "good1", "good2",
"good1", "good2", "good1", "good2",
"good1", "good2", "good1", "good2"
), sol = c("x1", "x1", "x2", "x2", "x3", "x3", "x4", "x4", "x5", "x5", "x6", "x6"
), tax = c("0.11", "0.16", "0.09", "0.15", "0.11", "0.17",
"0.09", "0.15", "0.21", "0.33", "0.11", "0.16"
)), row.names = c(NA, -12L), class = c("tbl_df", "tbl", "data.frame"))
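I want to keep only the solutions whose taxes differ from those of the solutions before them. In this case, only solutions "x1", "x2", "x3" and "x5" should be kept ("x4" repeats "x2" and "x6" repeats "x1").
So I tried distinct() on type_tax and tax:
taxes_sol %>%
distinct(type_tax, tax, .keep_all = TRUE)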
But this does not return good1 for solution "x3".
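A small check makes the reason visible. The sketch below rebuilds the same data with tibble() (an equivalent construction, assumed here for brevity) and lists the rows whose (type_tax, tax) pair already occurred in an earlier row; these are the rows distinct() discards. The name `dropped` is chosen only for illustration.

```r
library(dplyr)

# Same data as above, rebuilt compactly
taxes_sol <- tibble(
  type_tax = rep(c("good1", "good2"), times = 6),
  sol      = rep(paste0("x", 1:6), each = 2),
  tax      = c("0.11", "0.16", "0.09", "0.15", "0.11", "0.17",
               "0.09", "0.15", "0.21", "0.33", "0.11", "0.16")
)

# Rows whose (type_tax, tax) pair was already seen in an earlier row
dropped <- taxes_sol[duplicated(taxes_sol[c("type_tax", "tax")]), ]
dropped
```

The good1 row of "x3" is in that list because "x1" already has good1 = 0.11, so the comparison happens row by row and not per solution.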
One option is to use duplicated():
library(dplyr)
taxes_sol %>%
mutate(flag = !duplicated(tax)) %>%
group_by(sol) %>%
filter(any(flag)) %>%
select(-flag)
# A tibble: 8 x 3
# Groups: sol [4]
# type_tax sol tax
# <chr> <chr> <chr>
#1 good1 x1 0.11
#2 good2 x1 0.16
#3 good1 x2 0.09
#4 good2 x2 0.15
#5 good1 x3 0.11
#6 good2 x3 0.17
#7 good1 x5 0.21
#8 good2 x5 0.33
I'm not 100% sure this gives you what you want, but using dplyr:
taxes_sol %>%
group_by(type_tax, tax) %>%
mutate(counter = row_number()) %>%
group_by(sol) %>%
filter(any(counter == 1)) %>%
select(-counter)
gives you
# A tibble: 8 x 3
# Groups: sol [4]
type_tax sol tax
<chr> <chr> <chr>
1 good1 x1 0.11
2 good2 x1 0.16
3 good1 x2 0.09
4 good2 x2 0.15
5 good1 x3 0.11
6 good2 x3 0.17
7 good1 x5 0.21
8 good2 x5 0.33
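One edge case may be worth checking before relying on this. The sketch below adds a hypothetical seventh solution "x7" (not in the original data) that reuses good1 = 0.11 from "x1" and good2 = 0.15 from "x2". Every (type_tax, tax) pair of "x7" has been seen before, so the filter drops it, even though no earlier solution has that exact combination. Whether that is the desired behaviour depends on what "different" is supposed to mean.

```r
library(dplyr)

# Same data as in the question, rebuilt compactly
taxes_sol <- tibble(
  type_tax = rep(c("good1", "good2"), times = 6),
  sol      = rep(paste0("x", 1:6), each = 2),
  tax      = c("0.11", "0.16", "0.09", "0.15", "0.11", "0.17",
               "0.09", "0.15", "0.21", "0.33", "0.11", "0.16")
)

# Hypothetical extra solution, for illustration only
extra <- tibble(
  type_tax = c("good1", "good2"),
  sol      = "x7",
  tax      = c("0.11", "0.15")
)

kept <- bind_rows(taxes_sol, extra) %>%
  group_by(type_tax, tax) %>%
  mutate(counter = row_number()) %>%   # 1 = first time this pair appears
  group_by(sol) %>%
  filter(any(counter == 1)) %>%        # keep solutions with at least one new pair
  ungroup() %>%
  select(-counter)

unique(kept$sol)
```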
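distinct() does not compare groups with each other; it only compares individual rows on the columns you give it. One way to compare groups is to widen the data first, so that each solution becomes a single row, and then compare the columns that define the group. After that you can lengthen the data again to get it back into its original form: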
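library(dplyr)
library(tidyr)
taxes_sol %>%
# one row per solution, one column per type_tax
pivot_wider(names_from = type_tax, values_from = tax) %>%
# drop solutions whose (good1, good2) combination was already seen
distinct(good1, good2, .keep_all = TRUE) %>%
# back to the original long form
pivot_longer(-sol, names_to = "type_tax", values_to = "tax")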
pivot_wider() does this to your data:
sol good1 good2
<chr> <chr> <chr>
1 x1 0.11 0.16
2 x2 0.09 0.15
3 x3 0.11 0.17
4 x4 0.09 0.15
5 x5 0.21 0.33
6 x6 0.11 0.16
Your final answer looks like this:
sol type_tax tax
<chr> <chr> <chr>
1 x1 good1 0.11
2 x1 good2 0.16
3 x2 good1 0.09
4 x2 good2 0.15
5 x3 good1 0.11
6 x3 good2 0.17
7 x5 good1 0.21
8 x5 good2 0.33
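The same idea can also be sketched without reshaping, by collapsing each solution's taxes into one string and keeping only the solutions whose string has not appeared before. This is an alternative to the pivot approach, not part of it; the `key` column and the "=" and "|" separators are names chosen here for illustration, and the sketch assumes those characters do not occur in the data.

```r
library(dplyr)

# Same data as in the question, rebuilt compactly
taxes_sol <- tibble(
  type_tax = rep(c("good1", "good2"), times = 6),
  sol      = rep(paste0("x", 1:6), each = 2),
  tax      = c("0.11", "0.16", "0.09", "0.15", "0.11", "0.17",
               "0.09", "0.15", "0.21", "0.33", "0.11", "0.16")
)

result <- taxes_sol %>%
  group_by(sol) %>%
  # one key per solution, e.g. "good1=0.11|good2=0.16";
  # sort() makes the key independent of row order within a solution
  mutate(key = paste(sort(paste(type_tax, tax, sep = "=")), collapse = "|")) %>%
  ungroup() %>%
  # keep every row of the solutions where a key shows up for the first time
  filter(sol %in% sol[!duplicated(key)]) %>%
  select(-key)

result
```

Unlike the pivot version, this keeps the original column order (type_tax, sol, tax) and also works when solutions have different numbers of type_tax rows.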