Get column value associated to another column maximum in dplyr's across

After grouping by species and taking the maximum Sepal.Length (column 1) for each group, I need to grab the values of columns 2 to 4 that are associated with the maximum value of column 1 (by group). I'm able to do so for each column one at a time, but not in a single across call. Any tips?

library(dplyr)
library(datasets)
data(iris)

Summarize by species, taking the values associated with the maximum Sepal.Length (by group), column by column:

iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    max_sep_length = max(Sepal.Length),
    sep_w_associated_to = Sepal.Width[which.max(Sepal.Length)],
    pet_l_associated_to = Petal.Length[which.max(Sepal.Length)],
    pet_w_associated_to = Petal.Width[which.max(Sepal.Length)]
  )

Now I would like to obtain the same result using across, but the outcome is different from what I expected (the data frame iris_summary now has the same number of rows as iris, and I can't understand why):

iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    max_sepa_length = max(Sepal.Length),
    across(
      .cols = Sepal.Width : Petal.Width,
      .funs = ~ .x[which.max(Sepal.Length)]
    )
  )

If we want to do the same with across, here is one option:

iris %>% 
  group_by(Species) %>% 
  summarise(across(everything(), ~ .[which.max(Sepal.Length)]))
  Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
  <fct>             <dbl>       <dbl>        <dbl>       <dbl>
1 setosa              5.8         4            1.2         0.2
2 versicolor          7           3.2          4.7         1.4
3 virginica           7.9         3.8          6.4         2  
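
A likely reason the attempt in the question returns one row per observation, assuming the dplyr 1.x signature `across(.cols, .fns, ..., .names)`: the function was passed as `.funs`, which across does not match to `.fns`, so no function is applied and the selected columns come back unchanged. The sketch below spells the argument `.fns` and keeps the maximum as a separate column; the `_at_max` suffix and the name `max_sep_length` are illustrative choices, not part of the original post.

```r
library(dplyr)

iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    # New name, so Sepal.Length itself stays available to across() below
    max_sep_length = max(Sepal.Length),
    across(
      .cols = Sepal.Width:Petal.Width,
      # Note the spelling: .fns, not .funs
      .fns = ~ .x[which.max(Sepal.Length)],
      # Suffix is an illustrative naming choice
      .names = "{.col}_at_max"
    )
  )

iris_summary
```

This should give one row per species, with the same values as the column-by-column version.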

Or use slice_max:

library(dplyr) # dplyr >= 1.1.0 accepts `by` in slice_max(); otherwise use `group_by(Species)` first
iris %>% 
   slice_max(Sepal.Length, n = 1, by = 'Species')
  Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1          5.8         4.0          1.2         0.2     setosa
2          7.0         3.2          4.7         1.4 versicolor
3          7.9         3.8          6.4         2.0  virginica
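
One caveat: slice_max keeps ties by default (`with_ties = TRUE`), so a group whose maximum occurs more than once would yield several rows, whereas `which.max` always picks the first. A small sketch of forcing a single row per group, written with `group_by()` so it does not depend on the `by` argument; the object name `top_rows` is only for illustration:

```r
library(dplyr)

top_rows <- iris %>%
  group_by(Species) %>%
  # with_ties = FALSE keeps exactly one row per group even if the maximum repeats
  slice_max(Sepal.Length, n = 1, with_ties = FALSE) %>%
  ungroup()

top_rows
```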

In base R you could do:

merge(aggregate(Sepal.Length~Species, iris, max), iris)

     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa          5.8         4.0          1.2         0.2
2 versicolor          7.0         3.2          4.7         1.4
3  virginica          7.9         3.8          6.4         2.0
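
The merge joins on both Species and Sepal.Length, so it would return every tied row if a maximum were repeated. As a sketch of an alternative that mirrors the `which.max` logic from the question (first maximum only, original column order kept), one could split by group instead; `max_rows` is a name made up for this example:

```r
# Split iris by species, keep the row with the largest Sepal.Length in each piece,
# then bind the pieces back together
max_rows <- do.call(
  rbind,
  lapply(split(iris, iris$Species), function(d) d[which.max(d$Sepal.Length), ])
)

max_rows
```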
