I am trying to carry out a simple operation (everything is explained in the reprex). I want to summarise a grouped tibble according to whether a certain element appears in each group: when it does, I want its corresponding ranking. The element appears at most once in each group, so it should be easy, but I am banging my head against the wall.
Any help is appreciated!
Thanks!
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- tibble(year = rep(seq(2017, 2020), 6),
             ranking = seq(24)) %>%
  arrange(year) %>%
  mutate(x = c(letters[1:12], letters[1:12]))
## I want to summarise the tibble this way: for every year, if there is "d" in
## the group, I want the ranking of d. If not, I want to put e.g. -1.
## This is my attempt, but it does not work.
df_summary <- df %>%
  group_by(year) %>%
  summarise(my_summary = if_else("d" %in% x, ranking, -1))
#> Error: Problem with `summarise()` input `my_summary`.
#> ✖ `true` must be length 1 (length of `condition`), not 6.
#> ℹ Input `my_summary` is `if_else("d" %in% x, ranking, -1)`.
#> ℹ The error occurred in group 1: year = 2017.
Created on 2020-09-18 by the reprex package (v0.3.0)
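The error message points at the cause: `if_else()` is vectorised, so `true` and `false` must have length 1 or the length of the condition. Here `"d" %in% x` is a single TRUE/FALSE per group, while `ranking` holds all 6 values of the group. A minimal sketch of one possible fix, assuming (as stated above) that "d" appears at most once per year, is to use base R's scalar `if ... else` and subset `ranking` down to the matching row:

```r
library(dplyr)

# Same data as in the reprex above.
df <- tibble(year = rep(seq(2017, 2020), 6),
             ranking = seq(24)) %>%
  arrange(year) %>%
  mutate(x = c(letters[1:12], letters[1:12]))

df_summary <- df %>%
  group_by(year) %>%
  # Scalar if/else: the condition is length 1, and each branch returns
  # exactly one value per group. -1L keeps the column an integer.
  summarise(my_summary = if ("d" %in% x) ranking[x == "d"] else -1L)

df_summary
```

With this data, 2017 and 2019 contain "d" (rankings 13 and 15), while 2018 and 2020 fall back to -1. If "d" could appear more than once in a group, `ranking[x == "d"]` would return several values, so this sketch relies on the at-most-once assumption.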
Following the same idea as your attempt, with a check that the value appears exactly once in the group (otherwise you get -1):
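df %>%
  # 1 where x is "d", 0 elsewhere
  mutate(is_d = (x == "d") * 1) %>%
  group_by(year) %>%
  # if "d" occurs exactly once, the weighted sum picks out its ranking
  summarise(Val = if_else(sum(is_d) == 1,
                          sum(is_d * ranking),
                          -1))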
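# Years that contain "d", with the ranking of "d"
has_d <- df %>%
  filter(x == "d") %>%
  select(year, ranking)
# Years without "d" get -1, then both sets are stacked and sorted
df_summary <- df %>%
  distinct(year) %>%
  anti_join(has_d, by = "year") %>%
  mutate(ranking = -1L) %>%
  bind_rows(has_d) %>%
  arrange(year)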
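The join-based approach above can possibly be shortened. As a sketch only, not a replacement for the answer above: a `left_join()` leaves `NA` for years without "d", and `coalesce()` then fills those with -1. This again assumes "d" appears at most once per year, otherwise the join would return several rows for that year.

```r
library(dplyr)

# Same data as in the question's reprex.
df <- tibble(year = rep(seq(2017, 2020), 6),
             ranking = seq(24)) %>%
  arrange(year) %>%
  mutate(x = c(letters[1:12], letters[1:12]))

# Years that contain "d", with the ranking of "d"
has_d <- df %>%
  filter(x == "d") %>%
  select(year, ranking)

df_summary <- df %>%
  distinct(year) %>%
  # years missing from has_d get ranking = NA
  left_join(has_d, by = "year") %>%
  # replace the NA rankings with -1 (integer, to match ranking's type)
  mutate(ranking = coalesce(ranking, -1L)) %>%
  arrange(year)

df_summary
```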