dplyr: Why do some operations work "rowwise" without calling rowwise(), while others don't?

Question

I am still trying to figure out how exactly rowwise works in R/dplyr.

For example, I have this code:

library(dplyr)
df = data.frame(
  group = c("a", "a", "a", "b", "b", "c"),
  var1 = 1:6,
  var2 = 7:12
)

df %>%
  mutate(
    concatNotRW = paste0(var1, "-", group), # works row by row
    meanNotRW = mean(c(var1, var2)), # does not work row by row
    charsNotRW = strsplit(concatNotRW, "-") # works row by row
  ) %>%
  rowwise() %>%
  mutate(
    concatRW = paste0(var1, "-", group), # all work on rows
    meanRW = mean(c(var1, var2)),
    charsRW = strsplit(concatRW, "-")
  ) -> res

The res dataframe looks like this:

  group  var1  var2 concatNotRW meanNotRW charsNotRW concatRW meanRW charsRW  
  <chr> <int> <int> <chr>           <dbl> <list>     <chr>     <dbl> <list>   
1 a         1     7 1-a               6.5 <chr [2]>  1-a           4 <chr [2]>
2 a         2     8 2-a               6.5 <chr [2]>  2-a           5 <chr [2]>
3 a         3     9 3-a               6.5 <chr [2]>  3-a           6 <chr [2]>
4 b         4    10 4-b               6.5 <chr [2]>  4-b           7 <chr [2]>
5 b         5    11 5-b               6.5 <chr [2]>  5-b           8 <chr [2]>
6 c         6    12 6-c               6.5 <chr [2]>  6-c           9 <chr [2]>

What I do not understand is why paste0 can take each cell of a row and paste them together (essentially performing a rowwise operation), yet mean can't do that. What am I missing, and are there any rules on what already works rowwise without the call to rowwise()? I did not find much info in the rowwise() vignette here: https://dplyr.tidyverse.org/articles/rowwise.html

Answer 1

paste accepts vectors through its variadic argument (...) and returns a vector as long as its inputs, whereas mean uses the variadic argument only for other options (trim etc.) and returns a single value. Here we need rowMeans. As for strsplit, it returns a list with one element of split pieces per input string.

library(dplyr)
df %>%
  mutate(
    concatNotRW = paste0(var1, "-", group),
    meanNotRW = rowMeans(across(c(var1, var2))),
    charsNotRW = strsplit(concatNotRW, "-") 
  )

> mean(c(1:5, 6:10))
[1] 5.5

Note that what we pass to mean here is a single vector, created by concatenating the two vectors 1:5 and 6:10 with c,

whereas

> paste(1:5, 6:10)
[1] "1 6"  "2 7"  "3 8"  "4 9"  "5 10"

In this call, 1:5 and 6:10 are two separate vectors passed into paste.
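One way to see the difference is to compare the length of each result. The following is only a small base R sketch that rebuilds the question's columns as plain vectors (the names pasted, overall, per_row and pieces are made up for this illustration); inside mutate(), a length-1 result such as the one from mean is simply recycled to every row.

```r
# Columns from the question, as plain vectors
group <- c("a", "a", "a", "b", "b", "c")
var1 <- 1:6
var2 <- 7:12

# paste0 works element by element: one string per input position
pasted <- paste0(var1, "-", group)

# mean collapses the single combined vector into one number
overall <- mean(c(var1, var2))

# rowMeans gives one mean per row of the two columns
per_row <- rowMeans(cbind(var1, var2))

# strsplit returns a list with one element per input string
pieces <- strsplit(pasted, "-")

length(pasted)   # 6
length(overall)  # 1
length(per_row)  # 6
length(pieces)   # 6
```

So the functions that looked like they "worked on rows" are the ones that return one element per input element.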

To split the column into two columns, we can use separate:

library(tidyr)
df %>%
  mutate(
    concatNotRW = paste0(var1, "-", group),
    meanNotRW = rowMeans(across(c(var1, var2)))
  ) %>%
  separate(concatNotRW, into = c("ind", "chars"))

  group var1 var2 ind chars meanNotRW
1     a    1    7   1     a         4
2     a    2    8   2     a         5
3     a    3    9   3     a         6
4     b    4   10   4     b         7
5     b    5   11   5     b         8
6     c    6   12   6     c         9

Whether an operation works row by row without rowwise depends on the function. If the function is vectorized, it operates on the whole column at once and doesn't need rowwise. Here both paste and mean accept vectors, but paste is vectorized element-wise across its variadic inputs, while mean takes a single vector and reduces it to a single value. Now suppose we have a function that checks each value with if/else: it is not vectorized, because if/else expects a single logical value. In that case, we can either use rowwise or Vectorize the function.
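As a rough illustration of that last case, here is a base R sketch. The helper size_label and the threshold 3 are invented for this example and are not part of the question.

```r
# Hypothetical helper: `if` needs a single TRUE/FALSE, so this
# function is only meant to be called with one value at a time
size_label <- function(x) {
  if (x > 3) "big" else "small"
}

# Vectorize() wraps the function so it is applied element by element
size_label_v <- Vectorize(size_label)

labels <- size_label_v(1:6)
labels

# For a simple condition like this, the already vectorized ifelse()
# gives the same result without any wrapper
labels2 <- ifelse(1:6 > 3, "big", "small")
```

Inside a pipeline this would be either mutate(label = size_label_v(var1)) on the plain data frame, or rowwise() followed by mutate(label = size_label(var1)), since rowwise hands the function one row's value at a time.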

Solution 1
4 Accepted 2023-01-07 16:41:05
