Mutate based on conditions?

library(dplyr)

df <- data.frame(x1 = c("a","a","a","a","b","b","b","b"),ind = c("O","O","C","C","O","O","O","O"), num = c(6,12,18,24,6,12,18,24))
set.seed(1)
df <- df[sample(nrow(df)),]  # shuffle the rows

# sort within each x1 group by num
df2 <- df %>% group_by(x1) %>%
  arrange(x1,num) 
> df2
# A tibble: 8 x 3
# Groups:   x1 [2]
  x1    ind     num
  <fct> <fct> <dbl>
1 a     O         6
2 a     O        12
3 a     C        18
4 a     C        24
5 b     O         6
6 b     O        12
7 b     O        18
8 b     O        24

I want to add two new columns to this data. For each unique value of the column x1, the first should hold the minimum of the column num over the rows where ind is equal to C. For the value a this should return 18. The second should do the same for the rows where ind is equal to O. If a group has no matching rows, the result should be NA. The two columns should look like this:

  x1    ind     num min_O min_C
  <fct> <fct> <dbl> <dbl> <dbl>
1 a     O         6     6    18
2 a     O        12     6    18
3 a     C        18     6    18
4 a     C        24     6    18
5 b     O         6     6    NA
6 b     O        12     6    NA
7 b     O        18     6    NA
8 b     O        24     6    NA

I've tried variations of grouping by the x1 and ind columns but couldn't get it to work, because I need the minimum only over the rows where ind equals a particular value. I am sure there is an easy way!

This looks a bit cumbersome, but it does the job:

library(dplyr)
library(tidyr)

df2 %>% 
 group_by(x1, ind) %>% 
 pivot_wider(names_from = ind, values_from = num, values_fn = min, names_prefix = 'min_') %>% 
 left_join(df2, by = 'x1')

# A tibble: 8 x 5
# Groups:   x1 [2]
  x1    min_O min_C ind     num
  <chr> <dbl> <dbl> <chr> <dbl>
1 a         6    18 O         6
2 a         6    18 O        12
3 a         6    18 C        18
4 a         6    18 C        24
5 b         6    NA O         6
6 b         6    NA O        12
7 b         6    NA O        18
8 b         6    NA O        24
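The same idea can be spelled out in two explicit steps, which may make it easier to follow: first reduce the data to one minimum per (x1, ind) pair, then widen that summary and join it back. This is only a sketch under the assumption that dplyr (1.0 or later) and tidyr are installed; the names mins and result are illustrative and not part of the original answer.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  x1  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  ind = c("O", "O", "C", "C", "O", "O", "O", "O"),
  num = c(6, 12, 18, 24, 6, 12, 18, 24)
)

# Step 1: one row per (x1, ind) pair holding the minimum of num.
# Step 2: spread ind into columns min_C / min_O; pairs that never occur
#         (here: x1 == "b", ind == "C") are filled with NA by pivot_wider().
mins <- df %>%
  group_by(x1, ind) %>%
  summarise(num = min(num), .groups = "drop") %>%
  pivot_wider(names_from = ind, values_from = num, names_prefix = "min_")

# Step 3: attach the per-group minima to every original row.
result <- df %>%
  arrange(x1, num) %>%
  left_join(mins, by = "x1")
```

Because the missing combination is filled in by pivot_wider() rather than computed by min(), no empty vector is ever passed to min().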

Another way could be

library(tidyr)
library(dplyr)
df %>%
  arrange(x1,num) %>% 
  group_by(x1) %>%
  mutate(min_C = min(num[ind == "C"]), 
         min_O = min(num[ind == "O"]),
         across(starts_with("min"), ~ ifelse(.x == Inf, NA_real_, .x)))

which returns

# A tibble: 8 x 5
# Groups:   x1 [2]
  x1    ind     num min_C min_O
  <chr> <chr> <dbl> <dbl> <dbl>
1 a     O         6    18     6
2 a     O        12    18     6
3 a     C        18    18     6
4 a     C        24    18     6
5 b     O         6    NA     6
6 b     O        12    NA     6
7 b     O        18    NA     6
8 b     O        24    NA     6

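but it also raises a warning ("no non-missing arguments to min; returning Inf"), since there are no C rows in group b.

Without the across(...) step, min_C for group b would be Inf instead of NA, because min() of an empty vector returns Inf.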
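One possible way to avoid both the warning and the Inf clean-up is to check for an empty subset before calling min(). The sketch below assumes dplyr is installed; safe_min is a hypothetical helper introduced here for illustration, not a function from dplyr or from the original answer.

```r
library(dplyr)

df <- data.frame(
  x1  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  ind = c("O", "O", "C", "C", "O", "O", "O", "O"),
  num = c(6, 12, 18, 24, 6, 12, 18, 24)
)

# Hypothetical helper: minimum of x, or NA when x has no elements,
# so min() is never called on an empty vector.
safe_min <- function(x) if (length(x) == 0) NA_real_ else min(x)

result <- df %>%
  arrange(x1, num) %>%
  group_by(x1) %>%
  mutate(min_O = safe_min(num[ind == "O"]),
         min_C = safe_min(num[ind == "C"])) %>%
  ungroup()
```

Here min_C is 18 for group a and NA for group b, and the columns come out in the order shown in the question (min_O, then min_C).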
