Mapping a pipe over multiple columns in the tidyverse

Question

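I'm working with a table for which I need to count the rows that meet some criterion, and I end up repeating essentially the same pipe many times, with only the variable name changing.

Say I want to know how many cars in mtcars are better than the Valiant on each variable (here, "better" simply means a larger value). Here is example code for two variables: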

library(tidyverse)

reference <- mtcars %>%
  slice(6)

mpg <- mtcars %>%
  filter(mpg > reference$mpg) %>%
  count() %>%
  pull()

cyl <- mtcars %>%
  filter(cyl > reference$cyl) %>%
  count() %>%
  pull()

tibble(mpg, cyl)

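But suppose I need to do this for about 100 variables, so there must be a better way to repeat the process.

What is the best way to rewrite the code above (perhaps using map() or anything else that works nicely with pipes) so that the result is a tibble with the counts for all the variables in mtcars?

I feel the solution should be simple, but I'm stuck. Thank you!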

Answer 1

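You can use summarise() + across() to count the observations in each column that are greater than a particular value. Inside the formula, `.` is the current column and `.[6]` is its sixth element, i.e. the Valiant's value.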

library(dplyr)

mtcars %>%
  summarise(across(everything(), ~ sum(. > .[6])))

#   mpg cyl disp hp drat wt qsec vs am gear carb
# 1  18  14   15 22   30 11    1  0 13   17   25
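
If the table also contains non-numeric columns, or the reference row is easier to identify by name than by position, the same idea can be sketched as follows. This is only a sketch: `ref` and `counts` are illustrative names that do not appear in the original answer, and it assumes a reasonably recent dplyr (1.0.0 or later) so that across(), where() and cur_column() are available.

```r
library(dplyr)

# Reference row looked up by row name rather than by position (assumption:
# the data frame has meaningful row names, as mtcars does).
ref <- mtcars["Valiant", ]

counts <- mtcars %>%
  summarise(
    # where(is.numeric) skips any non-numeric columns;
    # cur_column() is the name of the column currently being summarised.
    across(where(is.numeric), ~ sum(.x > ref[[cur_column()]]))
  )

counts
```

For mtcars every column is numeric, so this should give the same counts as the positional version above.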

Base R solutions:

# (1)
colSums(mtcars > mtcars[rep(6, nrow(mtcars)), ])

# (2)
colSums(sweep(as.matrix(mtcars), 2, mtcars[6, ], ">"))

# mpg  cyl disp   hp drat   wt qsec   vs   am gear carb
#  18   14   15   22   30   11    1    0   13   17   25
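
Both base variants compare every row of mtcars against row 6 and then sum the resulting logical matrix column by column. A column-by-column version of the same computation, given here as a sketch (the name `counts_base` is illustrative), uses vapply(), which additionally checks that each column yields exactly one integer:

```r
# For each column, count the values strictly greater than its 6th value.
# sum() of a logical vector is an integer, hence integer(1) as the template.
counts_base <- vapply(mtcars, function(col) sum(col > col[6]), integer(1))

counts_base
```

The result is a named integer vector, like the colSums() output but without building an intermediate matrix.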

Answer 2

Alternatively (the second call uses `reference` as defined in the question):

library(tidyverse)

map_dfc(mtcars, ~sum(.x[6] < .x))

map2_dfc(mtcars, reference, ~sum(.y < .x))
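
In purrr 1.0.0 and later, map_dfc() is marked as superseded. A sketch of an equivalent that avoids it, assuming the purrr and tibble packages are installed (`counts_vec` and `counts_tbl` are illustrative names): map_int() returns a named integer vector, and tibble::as_tibble_row() turns that vector into a one-row tibble.

```r
library(purrr)
library(tibble)

# One integer per column: how many values exceed the 6th value of that column.
counts_vec <- map_int(mtcars, ~ sum(.x > .x[6]))

# Convert the named vector into a one-row tibble, one column per variable.
counts_tbl <- as_tibble_row(counts_vec)

counts_tbl
```

map_int() also fails loudly if any column does not produce a single integer, which can be handy with around 100 variables.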

Answer 3

You could, for example, do it in a loop, like this:

library(tidyverse)

reference <- mtcars %>% 
  slice(6)

# Empty list to save outcome
list_outcome <- list()

# Get the column names to loop over
loop_var <- colnames(reference)
for (i in loop_var) {
  nr <- mtcars %>%
    # .data[[i]] refers to the column of mtcars whose name is stored in i
    filter(.data[[i]] > reference[[i]]) %>%
    count() %>%
    pull()
  # Save each iteration's result in the list under the variable's name
  list_outcome[[i]] <- data.frame(Variable = i, Value = nr)
}

# combine all the data frames in the list to one final data frame
df_result <- do.call(rbind, list_outcome)
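
The loop produces one row per variable. If that long layout is what you want, here is a sketch that gets the same shape without an explicit loop, assuming tidyr is available. The column names Variable and Value mirror the loop's output; `long_counts` is an illustrative name.

```r
library(dplyr)
library(tidyr)

long_counts <- mtcars %>%
  # One count per column: values greater than the 6th value of that column
  summarise(across(everything(), ~ sum(.x > .x[6]))) %>%
  # Reshape the single wide row into one row per variable
  pivot_longer(everything(), names_to = "Variable", values_to = "Value")

long_counts
```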

3 solutions

Solution 1
2 2022-10-03 07:10:59

Solution 2
2 Accepted 2022-10-03 07:49:27

Solution 3
1 2022-10-03 06:47:12
