How to change the values in several columns of a data frame, when they have the same value in another specific column?
So I have a data frame like this, containing several species from three different countries:
Species | Country_A | Country_B | Country_C
-----------------------------------------------------
Tilapia guineensis | yes | no | no
Tilapia guineensis | no | yes | no
Tilapia zillii | no | no | yes
Tilapia zillii | no | no | no
Eutrigla gurnardus | no | yes | no
Eutrigla gurnardus | yes | no | no
Sprattus sprattus | yes | no | no
Sprattus sprattus | yes | no | no
Sprattus sprattus | no | yes | no
Aetobatus narinari | no | no | yes
Aetobatus narinari | yes | no | no
Aetobatus narinari | yes | no | no
Aetobatus narinari | yes | no | no
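Basically, I want to set a country column to "yes" for a species if any row for that species has "yes" in that country, and end up with one row per species. Sorry if this is confusing. What I want to get is this: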
Species | Country_A | Country_B | Country_C
-----------------------------------------------------
Tilapia guineensis | yes | yes | no
Tilapia zillii | no | no | yes
Eutrigla gurnardus | yes | yes | no
Sprattus sprattus | yes | yes | no
Aetobatus narinari | yes | no | yes
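Thanks in advance for any answers.
Maybe try reshaping the data to long format, keeping only the desired values, dropping the duplicated rows, and reshaping back to wide: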
library(dplyr)
library(tidyr)
#Code
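new <- df %>%
  pivot_longer(-Species) %>%          # one row per Species/country pair
  filter(value != 'no') %>%           # keep only the "yes" entries
  group_by(Species) %>%
  filter(!duplicated(name)) %>%       # one "yes" per country within a species
  pivot_wider(names_from = name, values_from = value, values_fill = 'no')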
Output:
# A tibble: 5 x 4
# Groups: Species [5]
Species Country_A Country_B Country_C
<chr> <chr> <chr> <chr>
1 Tilapia guineensis yes yes no
2 Tilapia zillii no no yes
3 Eutrigla gurnardus yes yes no
4 Sprattus sprattus yes yes no
5 Aetobatus narinari yes no yes
Some data used:
#Data
df <- structure(list(Species = c("Tilapia guineensis", "Tilapia guineensis",
"Tilapia zillii", "Tilapia zillii", "Eutrigla gurnardus", "Eutrigla gurnardus",
"Sprattus sprattus", "Sprattus sprattus", "Sprattus sprattus",
"Aetobatus narinari", "Aetobatus narinari", "Aetobatus narinari",
"Aetobatus narinari"), Country_A = c("yes", "no", "no", "no",
"no", "yes", "yes", "yes", "no", "no", "yes", "yes", "yes"),
Country_B = c("no", "yes", "no", "no", "yes", "no", "no",
"no", "yes", "no", "no", "no", "no"), Country_C = c("no",
"no", "yes", "no", "no", "no", "no", "no", "no", "yes", "no",
"no", "no")), row.names = c(NA, -13L), class = "data.frame")
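One caveat with the reshaping approach above: because it filters out every "no" before reshaping back, a species that has no "yes" in any country would disappear from the result. That does not happen with the data shown here, but a minimal sketch of a variant that keeps such species could look like the following. The `toy` data frame and its species names are hypothetical and only serve to illustrate the case.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: "Species B" has no "yes" in any country
toy <- data.frame(
  Species   = c("Species A", "Species A", "Species B"),
  Country_A = c("yes", "no", "no"),
  Country_B = c("no", "no", "no"),
  stringsAsFactors = FALSE
)

# Collapse to one value per Species/country pair instead of filtering rows out,
# so species without any "yes" are kept
res <- toy %>%
  pivot_longer(-Species) %>%
  group_by(Species, name) %>%
  summarise(value = if (any(value == "yes")) "yes" else "no",
            .groups = "drop") %>%
  pivot_wider(names_from = name, values_from = value)
```

Here `res` should keep a row for "Species B" with "no" in both country columns, rather than dropping it.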
We can do this easily with a group-by operation in dplyr:
library(dplyr)
df %>%
  group_by(Species) %>%
  # for each country column: "yes" if any row of the species has "yes"
  summarise(across(everything(), ~ if (any(. == 'yes')) 'yes' else 'no'),
            .groups = 'drop')
-output
# A tibble: 5 x 4
# Species Country_A Country_B Country_C
# <chr> <chr> <chr> <chr>
#1 Aetobatus narinari yes no yes
#2 Eutrigla gurnardus yes yes no
#3 Sprattus sprattus yes yes no
#4 Tilapia guineensis yes yes no
#5 Tilapia zillii no no yes
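If adding a package is not an option, the same "any yes within a species" rule can probably be expressed in base R with `aggregate()`. This is only a sketch: the helper name `any_yes` is made up for illustration, and `df` is rebuilt here from the question's table so that the snippet runs on its own.

```r
# Same data as in the question, rebuilt compactly
df <- data.frame(
  Species = rep(c("Tilapia guineensis", "Tilapia zillii", "Eutrigla gurnardus",
                  "Sprattus sprattus", "Aetobatus narinari"),
                times = c(2, 2, 2, 3, 4)),
  Country_A = c("yes", "no", "no", "no", "no", "yes", "yes",
                "yes", "no", "no", "yes", "yes", "yes"),
  Country_B = c("no", "yes", "no", "no", "yes", "no", "no",
                "no", "yes", "no", "no", "no", "no"),
  Country_C = c("no", "no", "yes", "no", "no", "no", "no",
                "no", "no", "yes", "no", "no", "no"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "yes" if any value in the group is "yes", else "no"
any_yes <- function(x) if (any(x == "yes")) "yes" else "no"

# Apply the helper to every country column, grouped by species
res <- aggregate(df[-1], by = list(Species = df$Species), FUN = any_yes)
```

As with the `summarise()` version, the rows of `res` come back sorted by species name rather than in their original order.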