
When is operation complexity such that dplyr rowwise is needed?

According to the documentation the dplyr rowwise operator can be used to "support arbitrary complex operations that need to be applied to each row". I find this a little vague. For example, addition does not appear to rise to the level of complexity required for a rowwise:

library(dplyr)

df <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))
df %>%
  mutate(
    c = a + b
  )

  a b  c
1 1 5  6
2 2 6  8
3 3 7 10
4 4 8 12

But a very similar function, sum, does. For example:

df %>%
  mutate(
    d = sum(a,b)
  ) %>%
  rowwise() %>%
  mutate(
    e = sum(a,b)
  )

  a b  d  e
1 1 5 36  6
2 2 6 36  8
3 3 7 36 10
4 4 8 36 12

My question is, when exactly do we need to use rowwise in the course of dplyr operations? Is it any time the operation is not a basic arithmetic one, or are there other rules that determine whether an operation automatically treats its inputs row-wise rather than column-wise?

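I think the short answer is that sum and max are not "vectorised" in the element-wise sense: they accept multiple vectors and return a single aggregated value, which mutate then recycles across every row (hence the 36 in column d). Operators such as + work element by element, so they need no rowwise. I usually try to use functions that don't require rowwise, since it is slow and the risk of error is high. A solution to your simple case could be: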

library(hablar)
library(dplyr)

df <- data.frame( a =  c(1,2,3,4), b = c(5,6,7,8)) 

df %>% mutate(c = row_sum(a:b))
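
If you would rather not add a package, the same distinction can be sketched with dplyr and base R alone. This is only an illustration and assumes dplyr 1.0.0 or later (for across and c_across); the column names plus, row_tot, row_max and rw_tot are made up for the example.

```r
library(dplyr)

df <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

# sum() collapses all of its arguments into one number, so without
# rowwise() that single value is recycled down the whole column.
sum(df$a, df$b)  # 36

res <- df %>%
  mutate(
    plus    = a + b,                 # `+` is vectorised: works element by element
    row_tot = rowSums(across(a:b)),  # vectorised row sum, no rowwise() needed
    row_max = pmax(a, b)             # element-wise counterpart of max()
  ) %>%
  rowwise() %>%                      # from here on, each row is its own group
  mutate(rw_tot = sum(c_across(a:b))) %>%
  ungroup()                          # drop the row-wise grouping again

res
```

Here plus, row_tot and rw_tot all hold 6, 8, 10, 12. As a rough rule, rowwise is only needed when the function collapses its input to a single value (sum, max, mean) and no element-wise equivalent such as rowSums or pmax is available.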
