
How can I identify the first row, within each group, whose value in one column is lower than the first row's value in a different column in R?

I have a data set that looks like this:

   unique score value day
1       2    52 33.75   1
2       2    39 36.25   2
3       3    47 41.25   1
4       3    26 41.00   2
5       3    17 32.25   3
6       3    22 28.00   4
7       3    11 19.00   5
8       3     9 14.75   6
9       3    20 15.50   7
10      4    32 18.00   1
11      4    20 20.25   2
12      5    32 26.00   1
13      5    31 28.75   2
14      5    25 27.00   3
15      5    27 28.75   4
16      6    44 31.75   1
17      6    25 30.25   2
18      6    31 31.75   3
19      6    37 34.25   4
20      6    28 30.25   5

I would like to identify the first row in each group (unique) where the score is lower than the value on day 1.

I have tried this:

result<-df %>% 
group_by(unique.id) %>% 
filter(dailyMyoActivity < globaltma[globalflareday==1])

But it doesn't do exactly what I want. Is there a way of doing this?

This could help:

library(dplyr)

df %>% group_by(unique) %>% mutate(Index=ifelse(score<value & day==1,1,0))

# A tibble: 20 x 5
# Groups:   unique [5]
   unique score value   day Index
    <int> <int> <dbl> <int> <dbl>
 1      2    52  33.8     1     0
 2      2    39  36.2     2     0
 3      3    47  41.2     1     0
 4      3    26  41       2     0
 5      3    17  32.2     3     0
 6      3    22  28       4     0
 7      3    11  19       5     0
 8      3     9  14.8     6     0
 9      3    20  15.5     7     0
10      4    32  18       1     0
11      4    20  20.2     2     0
12      5    32  26       1     0
13      5    31  28.8     2     0
14      5    25  27       3     0
15      5    27  28.8     4     0
16      6    44  31.8     1     0
17      6    25  30.2     2     0
18      6    31  31.8     3     0
19      6    37  34.2     4     0
20      6    28  30.2     5     0

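Then you filter by Index == 1. Note that with this sample data every Index is 0, because the condition compares score and value within the day 1 row itself, so the filter returns no rows.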

If I understood your rationale correctly, and if your dataset is already ordered by day, this dplyr solution may come in handy:

library(dplyr)

df %>% 
  group_by(unique) %>% 
  filter(score < value[day==1]) %>% 
  slice(1)

Output

# A tibble: 3 x 4
# Groups:   unique [3]
#   unique score value   day
#    <int> <int> <dbl> <int>
# 1      3    26  41       2
# 2      5    25  27       3
# 3      6    25  30.2     2
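
If the rows are not guaranteed to be sorted by day, a minimal sketch is to sort inside the pipeline first. This assumes the column names from the sample data and exactly one day 1 row per group; the reversed copy named shuffled is only there to mimic unsorted input.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  unique = c(2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L),
  score  = c(52L, 39L, 47L, 26L, 17L, 22L, 11L, 9L, 20L, 32L, 20L, 32L, 31L, 25L, 27L, 44L, 25L, 31L, 37L, 28L),
  value  = c(33.75, 36.25, 41.25, 41, 32.25, 28, 19, 14.75, 15.5, 18, 20.25, 26, 28.75, 27, 28.75, 31.75, 30.25, 31.75, 34.25, 30.25),
  day    = c(1L, 2L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L)
)

# Reverse the row order to simulate input that is not sorted by day
shuffled <- df[rev(seq_len(nrow(df))), ]

first_below <- shuffled %>%
  arrange(unique, day) %>%               # restore day order within each group
  group_by(unique) %>%
  filter(score < value[day == 1]) %>%    # compare each score with the group's day 1 value
  slice(1) %>%                           # keep the earliest matching day
  ungroup()

first_below
```

Groups 2 and 4 do not appear in the result because none of their scores falls below the day 1 value.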

Given that you asked to identify the first row that fulfills the criterion score < value, a new column giving the row number has been added.

result <- df %>% 
  mutate(row_nr = row_number()) %>% 
  group_by(unique) %>% 
  filter(score < value) %>% 
  slice(1)
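
The filter above compares score with value in the same row. If the comparison should instead be against the group's day 1 value, as the question describes, the row number column can be combined with that criterion. A rough sketch, assuming the sample data's column names and rows already ordered by day:

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  unique = c(2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L),
  score  = c(52L, 39L, 47L, 26L, 17L, 22L, 11L, 9L, 20L, 32L, 20L, 32L, 31L, 25L, 27L, 44L, 25L, 31L, 37L, 28L),
  value  = c(33.75, 36.25, 41.25, 41, 32.25, 28, 19, 14.75, 15.5, 18, 20.25, 26, 28.75, 27, 28.75, 31.75, 30.25, 31.75, 34.25, 30.25),
  day    = c(1L, 2L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L)
)

result_day1 <- df %>%
  mutate(row_nr = row_number()) %>%      # position in the full, ungrouped data
  group_by(unique) %>%
  filter(score < value[day == 1]) %>%    # threshold is the group's day 1 value
  slice(1) %>%
  ungroup()

result_day1$row_nr
# 4 14 17
```

With the same-row criterion, group 4 would also be returned (20 < 20.25 on day 2); with the day 1 criterion it is not, since its day 1 value is 18.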

We could also use slice

library(dplyr)
df1 %>%
   group_by(unique) %>%
   slice(which(score < value[day == 1])[1])
# A tibble: 3 x 4
# Groups:   unique [3]
#  unique score value   day
#   <int> <int> <dbl> <int>
#1      3    26  41       2
#2      5    25  27       3
#3      6    25  30.2     2

data

df1 <- structure(list(unique = c(2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L), score = c(52L, 39L, 
47L, 26L, 17L, 22L, 11L, 9L, 20L, 32L, 20L, 32L, 31L, 25L, 27L, 
44L, 25L, 31L, 37L, 28L), value = c(33.75, 36.25, 41.25, 41, 
32.25, 28, 19, 14.75, 15.5, 18, 20.25, 26, 28.75, 27, 28.75, 
31.75, 30.25, 31.75, 34.25, 30.25), day = c(1L, 2L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 1L, 2L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L)), 
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20"))
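
As the output above shows, slice() drops groups with no qualifying row (2 and 4 here). If those groups should be reported as well, one possible base R sketch returns the first qualifying day per group, or NA when there is none. It assumes rows are ordered by day within each group and that each group has exactly one day 1 row; the name first_day is made up for this illustration.

```r
# Sample data from the question
df1 <- data.frame(
  unique = c(2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L),
  score  = c(52L, 39L, 47L, 26L, 17L, 22L, 11L, 9L, 20L, 32L, 20L, 32L, 31L, 25L, 27L, 44L, 25L, 31L, 37L, 28L),
  value  = c(33.75, 36.25, 41.25, 41, 32.25, 28, 19, 14.75, 15.5, 18, 20.25, 26, 28.75, 27, 28.75, 31.75, 30.25, 31.75, 34.25, 30.25),
  day    = c(1L, 2L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L)
)

# One result per group: the day of the first score below the day 1 value
first_day <- sapply(split(df1, df1$unique), function(g) {
  threshold <- g$value[g$day == 1]        # the group's day 1 value
  g$day[which(g$score < threshold)[1]]    # NA when no row qualifies
})

first_day
#  2  3  4  5  6 
# NA  2 NA  3  2 
```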
