How can I filter max value in a group and min value in another group?

I'm struggling with filtering options in R. I have this dataset:

patient_id    period     TREAT_CAT               Outcome
1            -3228 days pre-treatment                Pink
1            -3170 days pre-treatment                Pink
1              100 days post-treatment               Blue
1              200 days post-treatment               Pink
2            -2900 days pre-treatment                Blue
2                0 days post-treatment               Pink
2              100 days post-treatment               Pink

df <- structure(list(patient_id = c(1, 1, 1, 1, 2, 2, 2), 
    period = structure(c(-3228, -3170, 100, 200, -2900, 
    0, 100), class = "difftime", units = "days"), 
    TREAT_CAT = structure(c(1L, 1L, 2L, 2L, 1L, 2L, 2L), levels = c("pre-treatment", "post-treatment"), class = "factor"), 
    Outcome = c("Pink", "Pink", "Blue", "Pink", 
    "Blue", "Pink", "Pink")), row.names = c(19L, 26L, 24L, 3L, 7L, 29L, 20L), class = "data.frame")

For the pre-treatment group I would like to keep the row whose "period" is closest to 0, and for the post-treatment group the row whose "period" is farthest from 0.

I've tried something like this:

df2 <- df %>%
  group_by(patient_id) %>% 
  filter((TREAT_CAT=="pre-treatment" & period == min(period)) | (TREAT_CAT=="post-treatment" & period == max(period))) %>% 
  filter(n() == 2)

But obviously it gives me the value farthest from 0 for both periods. I've also tried max(period) for both groups, but that doesn't work: the patient's overall max(period) only occurs in the post-treatment group, so the result is an empty data frame.

I would expect something like:

patient_id    period     TREAT_CAT               Outcome
1            -3170 days pre-treatment                Pink
1              200 days post-treatment               Pink
2            -2900 days pre-treatment                Blue
2              100 days post-treatment               Pink

Could you please help?

Thanks in advance

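I assume that the pre-treatment period is always negative (and the post-treatment period never is). In that case the largest period in each group is the one closest to 0 before treatment and the one farthest from 0 after it, so you can use slice_max() on each group: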

library(dplyr)

# Within each patient and treatment category, keep the row with the largest period
df %>% 
  group_by(patient_id, TREAT_CAT) %>% 
  slice_max(period)

# A tibble: 4 × 4
# Groups:   patient_id, TREAT_CAT [4]
  patient_id period     TREAT_CAT      Outcome
       <dbl> <drtn>     <fct>          <chr>  
1          1 -3170 days pre-treatment  Pink   
2          1   200 days post-treatment Pink   
3          2 -2900 days pre-treatment  Blue   
4          2   100 days post-treatment Pink 
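
If the sign assumption cannot be guaranteed, one possible variant is to compare distances from day 0 instead of raw periods. The following is only a sketch: the helper column `dist` and the object name `result` are introduced here for illustration and are not part of the original question, and the data frame is rebuilt from the question's table so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt so the snippet is self-contained.
df <- data.frame(
  patient_id = c(1, 1, 1, 1, 2, 2, 2),
  period = as.difftime(c(-3228, -3170, 100, 200, -2900, 0, 100), units = "days"),
  TREAT_CAT = factor(
    c("pre-treatment", "pre-treatment", "post-treatment", "post-treatment",
      "pre-treatment", "post-treatment", "post-treatment"),
    levels = c("pre-treatment", "post-treatment")
  ),
  Outcome = c("Pink", "Pink", "Blue", "Pink", "Blue", "Pink", "Pink")
)

result <- df %>%
  # Hypothetical helper column: distance from day 0 as a plain number,
  # so the sign of the period no longer matters.
  mutate(dist = abs(as.numeric(period, units = "days"))) %>%
  group_by(patient_id, TREAT_CAT) %>%
  # Pre-treatment: keep the smallest distance; post-treatment: the largest.
  filter(
    (TREAT_CAT == "pre-treatment" & dist == min(dist)) |
      (TREAT_CAT == "post-treatment" & dist == max(dist))
  ) %>%
  ungroup() %>%
  select(-dist)

print(result)
```

Because min() and max() are evaluated inside each patient_id / TREAT_CAT group, the pre-treatment rows are no longer compared against the post-treatment ones, which is what went wrong in the attempt from the question. On the sample data this keeps the same four rows as slice_max().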
