Mutate based on conditions?

df <- data.frame(x1 = c("a","a","a","a","b","b","b","b"),ind = c("O","O","C","C","O","O","O","O"), num = c(6,12,18,24,6,12,18,24))
set.seed(1)
df <- df[sample(nrow(df)),]


library(dplyr)

df2 <- df %>% group_by(x1) %>%
  arrange(x1, num)
> df2
# A tibble: 8 x 3
# Groups:   x1 [2]
  x1    ind     num
  <fct> <fct> <dbl>
1 a     O         6
2 a     O        12
3 a     C        18
4 a     C        24
5 b     O         6
6 b     O        12
7 b     O        18
8 b     O        24

I want to add some new columns to this data. The first should, for each unique value of the column x1, take the minimum of the column num over the rows where the column ind is equal to C. For the value a this should return 18. The second does the same, but for rows where ind is equal to O instead. If no matching rows are found, it should just return NA. So the two columns should look like this:

  x1    ind     num min_O min_C
  <fct> <fct> <dbl> <dbl> <dbl>
1 a     O         6     6    18
2 a     O        12     6    18
3 a     C        18     6    18
4 a     C        24     6    18
5 b     O         6     6    NA
6 b     O        12     6    NA
7 b     O        18     6    NA
8 b     O        24     6    NA

I've tried variations of grouping by the x1 and ind columns but couldn't get it to work, because I want the minimum only over rows where ind equals a particular value. I am sure there is an easy way!

This looks a bit cumbersome, but it does the job:

library(dplyr)
library(tidyr)

df2 %>% 
 group_by(x1, ind) %>% 
 pivot_wider(names_from = ind, values_from = num, values_fn = min, names_prefix = 'min_') %>% 
 left_join(df2, by = 'x1')

# A tibble: 8 x 5
# Groups:   x1 [2]
  x1    min_O min_C ind     num
  <chr> <dbl> <dbl> <chr> <dbl>
1 a         6    18 O         6
2 a         6    18 O        12
3 a         6    18 C        18
4 a         6    18 C        24
5 b         6    NA O         6
6 b         6    NA O        12
7 b         6    NA O        18
8 b         6    NA O        24
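
To make it easier to see what the pivot step contributes, here is a minimal sketch that builds the per-group summary table first and joins it back afterwards. It assumes a tidyr version where values_fn accepts a bare function, as in the answer above; the intermediate name mins is only illustrative.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  x1  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  ind = c("O", "O", "C", "C", "O", "O", "O", "O"),
  num = c(6, 12, 18, 24, 6, 12, 18, 24)
)

# One row per x1, one column per value of ind, each holding the minimum num.
# Groups with no rows for a given ind value are filled with NA.
mins <- df %>%
  pivot_wider(id_cols = x1, names_from = ind, values_from = num,
              values_fn = min, names_prefix = "min_")

# Attach the per-group minima to every original row.
res <- df %>%
  arrange(x1, num) %>%
  left_join(mins, by = "x1")
```

Joining the summary onto the original rows keeps x1, ind and num in their original column order, with the new columns appended at the end.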

Another way could be:

library(tidyr)
library(dplyr)
df %>%
  arrange(x1,num) %>% 
  group_by(x1) %>%
  mutate(min_C = min(num[ind == "C"]), 
         min_O = min(num[ind == "O"]),
         across(starts_with("min"), ~ ifelse(.x == Inf, NA_real_, .x)))

which returns

# A tibble: 8 x 5
# Groups:   x1 [2]
  x1    ind     num min_C min_O
  <chr> <chr> <dbl> <dbl> <dbl>
1 a     O         6    18     6
2 a     O        12    18     6
3 a     C        18    18     6
4 a     C        24    18     6
5 b     O         6    NA     6
6 b     O        12    NA     6
7 b     O        18    NA     6
8 b     O        24    NA     6

but also raises a warning, since there are no C rows in group b.

If you leave out the across(...) part, min_C for group b is Inf instead of NA.
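
The warning comes from calling min() on an empty vector, which returns Inf. One possible way to avoid both the warning and the Inf-to-NA cleanup is a small wrapper that checks for the empty case first. This is only a sketch: min_or_na is a hypothetical helper name, not part of dplyr or base R.

```r
library(dplyr)

df <- data.frame(
  x1  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  ind = c("O", "O", "C", "C", "O", "O", "O", "O"),
  num = c(6, 12, 18, 24, 6, 12, 18, 24)
)

# Hypothetical helper: minimum of x, or NA when x has no elements,
# so min() is never called on an empty vector.
min_or_na <- function(x) if (length(x) == 0) NA_real_ else min(x)

res <- df %>%
  arrange(x1, num) %>%
  group_by(x1) %>%
  mutate(min_O = min_or_na(num[ind == "O"]),
         min_C = min_or_na(num[ind == "C"])) %>%
  ungroup()
```

Because the helper returns NA_real_ for an empty subset, both new columns stay numeric and no across(...) step is needed.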
