How to change the values in several columns of a data frame, when they have the same value in another specific column?
So I have a data frame like this, containing several species from three different countries:
Species | Country_A | Country_B | Country_C
-----------------------------------------------------
Tilapia guineensis | yes | no | no
Tilapia guineensis | no | yes | no
Tilapia zillii | no | no | yes
Tilapia zillii | no | no | no
Eutrigla gurnardus | no | yes | no
Eutrigla gurnardus | yes | no | no
Sprattus sprattus | yes | no | no
Sprattus sprattus | yes | no | no
Sprattus sprattus | no | yes | no
Aetobatus narinari | no | no | yes
Aetobatus narinari | yes | no | no
Aetobatus narinari | yes | no | no
Aetobatus narinari | yes | no | no
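Basically, I would like to set a country column to "yes" for a species if any row of that species has "yes" for that country, and end up with one row per species. Sorry if this is confusing. What I would like to get is something like this: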
Species | Country_A | Country_B | Country_C
-----------------------------------------------------
Tilapia guineensis | yes | yes | no
Tilapia zillii | no | no | yes
Eutrigla gurnardus | yes | yes | no
Sprattus sprattus | yes | yes | no
Aetobatus narinari | yes | no | yes
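Thanks in advance for any answers.
Maybe try reshaping the data to long format, keeping only the desired values, handling the duplicated rows, and reshaping back to wide: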
library(dplyr)
library(tidyr)
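#Code
# Reshape to long, keep only the 'yes' entries, drop repeated
# Species/country pairs, then reshape to wide, filling gaps with 'no'
new <- df %>% pivot_longer(-Species) %>%
  filter(value != 'no') %>%
  group_by(Species) %>%
  filter(!duplicated(name)) %>%
  pivot_wider(names_from = name, values_from = value, values_fill = 'no')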
Output:
# A tibble: 5 x 4
# Groups: Species [5]
Species Country_A Country_B Country_C
<chr> <chr> <chr> <chr>
1 Tilapia guineensis yes yes no
2 Tilapia zillii no no yes
3 Eutrigla gurnardus yes yes no
4 Sprattus sprattus yes yes no
5 Aetobatus narinari yes no yes
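One thing to keep in mind with this pipeline: a species whose rows are all 'no' has nothing left after `filter(value != 'no')`, so it would not appear in the result at all. Below is a minimal sketch of a variant that keeps such species by summarising each Species/country pair instead of filtering. The `df_small` sample and the `Hypothetical sp.` row are invented here for illustration and are not part of the original data.

```r
library(dplyr)
library(tidyr)

# Small sample: the first four rows of the question's data, plus one
# invented (hypothetical) species that is 'no' in every country
df_small <- data.frame(
  Species   = c("Tilapia guineensis", "Tilapia guineensis",
                "Tilapia zillii", "Tilapia zillii", "Hypothetical sp."),
  Country_A = c("yes", "no", "no", "no", "no"),
  Country_B = c("no", "yes", "no", "no", "no"),
  Country_C = c("no", "no", "yes", "no", "no")
)

res_long <- df_small %>%
  pivot_longer(-Species) %>%                     # columns: Species, name, value
  group_by(Species, name) %>%
  summarise(value = if (any(value == "yes")) "yes" else "no",
            .groups = "drop") %>%                # one row per Species/country
  pivot_wider(names_from = name, values_from = value)

res_long
```

Because no rows are filtered out, `values_fill` is not needed here: every Species/country pair is still present when reshaping back to wide.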
Some data used:
#Data
df <- structure(list(Species = c("Tilapia guineensis", "Tilapia guineensis",
"Tilapia zillii", "Tilapia zillii", "Eutrigla gurnardus", "Eutrigla gurnardus",
"Sprattus sprattus", "Sprattus sprattus", "Sprattus sprattus",
"Aetobatus narinari", "Aetobatus narinari", "Aetobatus narinari",
"Aetobatus narinari"), Country_A = c("yes", "no", "no", "no",
"no", "yes", "yes", "yes", "no", "no", "yes", "yes", "yes"),
Country_B = c("no", "yes", "no", "no", "yes", "no", "no",
"no", "yes", "no", "no", "no", "no"), Country_C = c("no",
"no", "yes", "no", "no", "no", "no", "no", "no", "yes", "no",
"no", "no")), row.names = c(NA, -13L), class = "data.frame")
We can do this easily with a grouped summarise in dplyr:
library(dplyr)
# For each species, a column becomes 'yes' if any of its rows is 'yes'
df %>%
  group_by(Species) %>%
  summarise(across(everything(), ~ if (any(. == 'yes')) 'yes' else 'no'),
            .groups = 'drop')
-output
# A tibble: 5 x 4
# Species Country_A Country_B Country_C
# <chr> <chr> <chr> <chr>
#1 Aetobatus narinari yes no yes
#2 Eutrigla gurnardus yes yes no
#3 Sprattus sprattus yes yes no
#4 Tilapia guineensis yes yes no
#5 Tilapia zillii no no yes
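If dplyr is not available, the same per-species collapse can be sketched in base R with `aggregate()`. This is only an illustrative alternative, not part of the answers above. The helper name `any_yes` and the `df_small` sample (the first four rows of the question's data) are assumptions made for the example.

```r
# Small sample taken from the first four rows of the question's data
df_small <- data.frame(
  Species   = c("Tilapia guineensis", "Tilapia guineensis",
                "Tilapia zillii", "Tilapia zillii"),
  Country_A = c("yes", "no", "no", "no"),
  Country_B = c("no", "yes", "no", "no"),
  Country_C = c("no", "no", "yes", "no")
)

# Hypothetical helper: "yes" if any value in the group is "yes", else "no"
any_yes <- function(x) if (any(x == "yes")) "yes" else "no"

# Apply the helper to every country column, grouped by Species
res_base <- aggregate(df_small[-1],
                      by  = list(Species = df_small$Species),
                      FUN = any_yes)

res_base
```

As with `summarise()`, the result has one row per species, and species with no "yes" anywhere are kept as all-"no" rows.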