
How to find the first observation of a column that matches a condition

I have a data frame:

library(tibble)

df <- tibble(a=c(7,6,10,12,12), b=c(3,5,8,8,7), c=c(4,4,12,15,20), week=c(1,2,3,4,5))

# A tibble: 5 x 4
      a     b     c  week
  <dbl> <dbl> <dbl> <dbl>
1     7     3     4     1
2     6     5     4     2
3    10     8    12     3
4    12     8    15     4
5    12     7    20     5

For each of the columns a, b and c, I want the first week in which the observation is equal to or exceeds 10. That is, for column a it would be week 3, for column b it would be NA (no value reaches 10), and for column c it would be week 3 as well.

A desired outcome could look like this:

tibble(abc=c("a", "b", "c"), value=c(10, NA, 12), week=c(3, NA, 3))

# A tibble: 3 x 3
  abc   value  week
  <chr> <dbl> <dbl>
1 a        10     3
2 b        NA    NA
3 c        12     3

One way is to reshape the data to long format and, for each column name, select the first row whose value is greater than or equal to 10. Columns with no such row drop out at that step, so we restore them with complete(), which fills the missing combinations with NA.

library(dplyr)
library(tidyr)

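df %>%
  # one row per (week, column name) pair
  pivot_longer(cols = -week, names_to = 'abc') %>%
  group_by(abc) %>%
  # keep the first row per column where value >= 10 (no row if none qualifies)
  slice(which(value >= 10)[1]) %>%
  ungroup() %>%
  # add back columns that had no qualifying row, filled with NA
  complete(abc = names(df)[-4])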

# A tibble: 3 x 3
#  abc    week value
#  <chr> <dbl> <dbl>
#1 a         3    10
#2 b        NA    NA
#3 c         3    12
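
The pipeline above relies on the idiom which(condition)[1]. As a minimal base R sketch, using plain vectors copied from the question's columns a and b (the names idx_a and idx_b are only illustrative), it behaves like this:

```r
# Vectors copied from columns a, b and week of the question's data
a <- c(7, 6, 10, 12, 12)
b <- c(3, 5, 8, 8, 7)
week <- c(1, 2, 3, 4, 5)

# which() returns the positions where the condition is TRUE;
# [1] keeps the first one, or NA when there is no match
idx_a <- which(a >= 10)[1]   # 3
idx_b <- which(b >= 10)[1]   # NA, because integer(0)[1] is NA

week[idx_a]  # 3
week[idx_b]  # NA
```

Indexing with NA yields NA, which is why a column without any qualifying value ends up as NA rather than raising an error.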

Another way is to first calculate what we want and then transform the dataset into long format.

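df %>%
  # for each of a, b, c: the first week and the first value with value >= 10
  summarise(across(a:c, list(week = ~week[which(. >= 10)[1]],
                             value = ~.[. >= 10][1]))) %>%
  # split names such as a_week into the abc column and the week/value columns
  pivot_longer(cols = everything(),
               names_to = c('abc', '.value'),
               names_sep = "_")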
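
For comparison, the same "compute first, then reshape" idea can be sketched in base R without the tidyverse. This is not part of the original answer: first_at_least() is a hypothetical helper introduced only for illustration, and a plain data.frame stands in for the tibble.

```r
# Same data as in the question, as a base R data.frame
df <- data.frame(a = c(7, 6, 10, 12, 12),
                 b = c(3, 5, 8, 8, 7),
                 c = c(4, 4, 12, 15, 20),
                 week = c(1, 2, 3, 4, 5))

# Hypothetical helper: first value >= threshold and the week it occurs in.
# Both elements are NA when no value reaches the threshold.
first_at_least <- function(x, week, threshold = 10) {
  i <- which(x >= threshold)[1]
  c(value = x[i], week = week[i])
}

cols <- c("a", "b", "c")

# sapply() returns a 2 x 3 matrix (rows: value, week); t() makes it 3 x 2
res <- t(sapply(df[cols], first_at_least, week = df$week))

out <- data.frame(abc = cols, res)
rownames(out) <- NULL
out
```

Here out has one row per column name with the columns abc, value and week, and the row for b holds NA in both value and week.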
