What are the differences between R's new native pipe `|>` and the magrittr pipe `%>%`?

Question

In R 4.1 a native pipe operator was introduced that is "more streamlined" than previous implementations. I already noticed one difference between the native `|>` and the magrittr pipe `%>%`, namely `2 %>% sqrt` works but `2 |> sqrt` doesn't and has to be written as `2 |> sqrt()`. Are there more differences and pitfalls to be aware of when using the new pipe operator?

Answer 1

Another difference between the two is how the piped-in value is referenced: `.` can be used as a placeholder in magrittr's pipe.

c("dogs", "cats", "rats") %>% grepl("at", .)
#[1] FALSE  TRUE  TRUE

But this is not possible with R's native pipe.

c("dogs", "cats", "rats") |> grepl("at", .)

Error in grepl(c("dogs", "cats", "rats"), "at", .): object '.' not found

Here are different ways to reference the piped-in value with the native pipe:

1. Write a separate function

find_at = function(x) grepl("at", x)
c("dogs", "cats", "rats") |> find_at()
#[1] FALSE  TRUE  TRUE

2. Use an anonymous function
a) Use the "old" syntax
```
c("dogs", "cats", "rats") |> {function(x) grepl("at", x)}()
```
b) Use the new anonymous function syntax
```
c("dogs", "cats", "rats") |> {\(x) grepl("at", x)}()
```
3. Specify the first parameter by name. This relies on the fact that the native pipe pipes into the first unnamed parameter, so if you provide a name for the first parameter it "overflows" into the second (and so on if you specify more than one parameter by name).

c("dogs", "cats", "rats") |> grepl(pattern="at")
#> [1] FALSE  TRUE  TRUE
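To see why naming `pattern` works, it may help to look at what the parser produces and how R then matches the arguments. This is only a sketch: `animals`, `piped` and `matched` are names made up for illustration, and it assumes R >= 4.1.0.

```r
# Requires R >= 4.1.0 for the native pipe.
animals <- c("dogs", "cats", "rats")

# The parser inserts the left-hand side as the first argument of the call,
# so this is the same expression as grepl(animals, pattern = "at").
piped <- quote(animals |> grepl(pattern = "at"))
print(piped)

# Ordinary argument matching then binds `pattern` by name and gives the
# remaining unnamed argument to the next free formal of grepl(), which is `x`.
matched <- match.call(grepl, piped)
print(matched)
```

Nothing pipe-specific happens in the second step; it is R's usual rule of matching named arguments first and positional ones afterwards.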

Examples 1 and 2 taken from https://www.jumpingrivers.com/blog/new-features-r410-pipe-anonymous-functions/
Example 3 taken from https://mobile.twitter.com/rlangtip/status/1409904500157161477

Answer 2

The base R pipe `|>` added in R 4.1.0 "just" does functional composition. I.e. we can see that its use really is just the same as the functional call:

> 1:5 |> sum()             # simple use of |>
[1] 15
> deparse(substitute( 1:5 |> sum() ))
[1] "sum(1:5)"
>
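The same appears to hold for longer chains. As a small sketch (the variable names `chained` and `nested` are only illustrative, and R >= 4.1.0 is assumed), a two-step pipeline can be compared with the hand-nested call:

```r
# Requires R >= 4.1.0 for the native pipe.
# quote() returns the parsed expression without evaluating it.
chained <- quote(1:5 |> rev() |> cumsum())
nested  <- quote(cumsum(rev(1:5)))

# Both are the same call object once parsed.
identical(chained, nested)   # TRUE

# Evaluating either one gives the same result.
eval(chained)
```

Since the two expressions are identical after parsing, no pipe function is left to be called at run time.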

That has some consequences:

- it makes it a little faster
- it makes it a little simpler and more robust
- it makes it a little more restrictive: `sum()` here needs the parens for a proper call
- it limits uses of the 'implicit' data argument

This leads to possible use of `=>`, which is currently "available but not active" (you need to set the environment variable `_R_USE_PIPEBIND_` for it, and it may change for R 4.2.0).

(This was first offered as an answer to a duplicate of this question and I just copied it over as suggested.)

Edit: As the follow-up question on 'what is `=>`' comes up, here is a quick follow-up. Note that this operator is subject to change.

> Sys.setenv("_R_USE_PIPEBIND_"=TRUE)
> mtcars |> subset(cyl == 4) |> d => lm(mpg ~ disp, data = d)

Call:
lm(formula = mpg ~ disp, data = subset(mtcars, cyl == 4))

Coefficients:
(Intercept)         disp  
     40.872       -0.135  

> deparse(substitute(mtcars |> subset(cyl==4) |> d => lm(mpg ~ disp, data = d)))
[1] "lm(mpg ~ disp, data = subset(mtcars, cyl == 4))"
>

The `deparse(substitute(...))` is particularly nice here.

Answer 3

The native pipe is implemented as a syntax transformation, so `2 |> sqrt()` has no discernible overhead compared to `sqrt(2)`, whereas `2 %>% sqrt()` comes with a small penalty.

library(microbenchmark)  # provides microbenchmark()
library(magrittr)        # provides %>%
microbenchmark(sqrt(1), 
               2 |> sqrt(), 
               3 %>% sqrt())
# Unit: nanoseconds
#          expr  min     lq    mean median   uq   max neval
#       sqrt(1)  117  126.5  141.66  132.0  139   246   100
#       sqrt(2)  118  129.0  156.16  134.0  145  1792   100
#  3 %>% sqrt() 2695 2762.5 2945.26 2811.5 2855 13736   100

You can see how the expression `2 |> sqrt()` passed to `microbenchmark` is parsed as `sqrt(2)`. This can also be seen in

quote(2 |> sqrt())
# sqrt(2)

Answer 4

| Topic | Magrittr 2.0.3 | Base 4.2.0 |
|---|---|---|
| Operator | `%>%` | `\|>` |
| Function call | `%>% sum()` | `\|> sum()` |
| | `%>% sum` | *Needs brackets* |
| | `` %>% `$`(cyl) `` | *Some functions are not supported* |
| Placeholder | `.` | `_` |
| | `%>% lm(mpg ~ disp, data = . )` | `\|> lm(mpg ~ disp, data = _ )` |
| | `%>% lm(mpg ~ disp, . )` | *Needs named argument* |
| | `%>% setNames(., .)` | *Can only appear once* |
| | `%>% {sum(sqrt(.))}` | *Nested calls are not allowed* |
| Environment | *Additional function environment* | `"x" \|> assign(1)` |
| Speed | *Overhead of function call* | *Syntax transformation* |

Needs brackets

library(magrittr)

1:3 |> sum
#Error: The pipe operator requires a function call as RHS

1:3 |> sum()
#[1] 6

1:3 %>% sum
#[1] 6

1:3 %>% sum()
#[1] 6

Some functions are not supported, but some can still be called by placing them in brackets, calling them via `::`, calling them inside a function, or defining a link to the function.

mtcars |> `$`(cyl)
#Error: function '$' not supported in RHS call of a pipe

mtcars |> (`$`)(cyl)
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

mtcars |> base::`$`(cyl)
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

mtcars |> (\(.) .$cyl)()
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

fun <- `$`
mtcars |> fun(cyl)
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

mtcars %>% `$`(cyl)
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

Placeholder needs named argument

2 |> setdiff(1:3, _)
#Error: pipe placeholder can only be used as a named argument

2 |> setdiff(1:3, y = _)
#[1] 1 3

2 |> (\(.) setdiff(1:3, .))()
#[1] 1 3

2 %>% setdiff(1:3, .)
#[1] 1 3

2 %>% setdiff(1:3, y = .)
#[1] 1 3

Placeholder can only appear once

1:3 |> setNames(object = _, nm = _)
#Error in setNames(object = "_", nm = "_") : 
#  pipe placeholder may only appear once

1:3 |> (\(.) setNames(., .))()
#1 2 3 
#1 2 3 

1:3 |> list() |> setNames(".") |> with(setNames(., .))
#1 2 3 
#1 2 3 

1:3 %>% setNames(object = ., nm = .)
#1 2 3
#1 2 3

1:3 %>% setNames(., .)
#1 2 3 
#1 2 3

Nested calls are not allowed

1:3 |> sum(sqrt(x=_))
#Error in sum(1:3, sqrt(x = "_")) : invalid use of pipe placeholder

1:3 |> (\(.) sum(sqrt(.)))()
#[1] 4.146264

1:3 %>% {sum(sqrt(.))}
#[1] 4.146264

No additional environment

assign("x", 1)
x
#[1] 1

"x" |> assign(2)
x
#[1] 2

"x" |> (\(x) assign(x, 3))()
x
#[1] 2

"x" %>% assign(4)
x
#[1] 2
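The base-R half of this behaviour can be checked without `assign()`. In the sketch below `called_from` is a hypothetical helper written for this illustration (it is not part of base R or magrittr), and R >= 4.1.0 is assumed.

```r
# Requires R >= 4.1.0 for |> and the \(x) lambda syntax.
# Hypothetical helper: returns the environment it was called from.
called_from <- function(ignored) parent.frame()

here <- environment()

# Plain native pipe: rewritten to called_from(0), so the call happens right here.
direct <- 0 |> called_from()
identical(direct, here)    # TRUE

# Anonymous function: the call now happens inside the lambda's own frame,
# which is why assign() in the example above did not change the outer x.
wrapped <- 0 |> (\(v) called_from(v))()
identical(wrapped, here)   # FALSE
```

The `%>%` case is left out on purpose so that the sketch runs without magrittr installed.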

Other possibilities:
A different pipe operator and a different placeholder could be realized with the Bizarro pipe `->.;`, which is not a pipe (see disadvantages) and which overwrites `.`

1:3 ->.; sum(.)
#[1] 6

mtcars ->.; .$cyl
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

1:3 ->.; setNames(., .)
#1 2 3 
#1 2 3 

1:3 ->.; sum(sqrt(x=.))
#[1] 4.146264

"x" ->.; assign(., 5)
x
#[1] 5

and it evaluates in a different order.

x <- data.frame(a=0)
f1 <- \(x) {message("IN 1"); x$b <- 1; message("OUT 1"); x}
f2 <- \(x) {message("IN 2"); x$c <- 2; message("OUT 2"); x}

x ->.; f1(.) ->.; f2(.)
#IN 1
#OUT 1
#IN 2
#OUT 2
#  a b c
#1 0 1 2

x |> f1() |> f2()
#IN 2
#IN 1
#OUT 1
#OUT 2
#  a b c
#1 0 1 2

f2(f1(x))
#IN 2
#IN 1
#OUT 1
#OUT 2
#  a b c
#1 0 1 2
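The "IN 2" before "IN 1" order of `|>` and of the nested call comes from R's lazy evaluation of arguments: the outer function starts running before its argument is evaluated. A minimal sketch, assuming R >= 4.1.0 and using made-up helpers (`record`, `g1`, `g2`, `g2_eager`), records the order and shows that `force()` changes it:

```r
# Requires R >= 4.1.0 for the native pipe.
calls <- character()
record <- function(label) calls <<- c(calls, label)  # append to the log

g1 <- function(x) { record("g1 starts"); x }
g2 <- function(x) { record("g2 starts"); x }                      # touches x last
g2_eager <- function(x) { force(x); record("g2_eager starts"); x } # evaluates x first

# Parsed as g2(g1(0)): g2 begins, and g1 runs only when x is needed.
0 |> g1() |> g2()
lazy_order <- calls    # "g2 starts" "g1 starts"

calls <- character()
# force(x) evaluates the argument first, so g1 is recorded before g2_eager.
0 |> g1() |> g2_eager()
eager_order <- calls   # "g1 starts" "g2_eager starts"
```

The Bizarro pipe avoids this because each step is a separate statement that is finished before the next one begins.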

Or define your own operator, which also evaluates in a different order.

":=" <- function(lhs, rhs) {
  e <- exists(".", parent.frame(), inherits = FALSE)
  . <- get0(".", envir = parent.frame(), inherits = FALSE)
  assign(".", lhs, envir=parent.frame())
  on.exit(if(identical(lhs, get0(".", envir = parent.frame(), inherits = FALSE))) {
            if(e) {
              assign(".", ., envir=parent.frame())
            } else {
              if(exists(".", parent.frame())) rm(., envir = parent.frame())
            }
          })
  eval(substitute(rhs), parent.frame())
}

. <- 0
"." := assign(., 1)
.
#[1] 1

1:3 := sum(.)
#[1] 6
.
#[1] 1

mtcars := .$cyl
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

1:3 := setNames(., .)
#1 2 3 
#1 2 3 

1:3 := sum(sqrt(x=.))
#[1] 4.146264

"x" := assign(., 6)
x
#[1] 6

1 := .+1 := .+2
#[1] 4

x <- data.frame(a=0)
x := f1(.) := f2(.)
#IN 1
#OUT 1
#IN 2
#OUT 2
#  a b c
#1 0 1 2

Speed

library(magrittr)

":=" <- function(lhs, rhs) {
  e <- exists(".", parent.frame(), inherits = FALSE)
  . <- get0(".", envir = parent.frame(), inherits = FALSE)
  assign(".", lhs, envir=parent.frame())
  on.exit(if(identical(lhs, get0(".", envir = parent.frame(), inherits = FALSE))) {
            if(e) {
              assign(".", ., envir=parent.frame())
            } else {
              if(exists(".", parent.frame())) rm(., envir = parent.frame())
            }
          })
  eval(substitute(rhs), parent.frame())
}

`%|%` <- function(lhs, rhs) {  #Overwrite and keep .
    assign(".", lhs, envir=parent.frame())
    eval(substitute(rhs), parent.frame())
}

x <- 42
bench::mark(min_time = 0.2, max_iterations = 1e8
, x
, identity(x)
, "|>" = x |> identity()
, "|> _" = x |> identity(x=_)
, "|> f()" = x |> (\(y) identity(y))()
, "%>%" = x %>% identity
, "->.;" = {x ->.; identity(.)}
, ":=" = x := identity(.)
, "%|%" = x %|% identity(.)
, "list." = x |> list() |> setNames(".") |> with(identity(.))
)

Result

#   expression       min   median `itr/sec` mem_alloc `gc/sec`   n_itr  n_gc
#   <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>   <int> <dbl>
# 1 x             9.89ns  10.94ns 66611556.        0B     11.7 5708404     1
# 2 identity(x) 179.98ns 200.12ns  4272195.        0B     49.6  603146     7
# 3 |>          179.98ns 201.05ns  4238021.        0B     41.1  722534     7
# 4 |> _        189.87ns 219.91ns  4067314.        0B     39.4  722803     7
# 5 |> f()      410.01ns 451.11ns  1889295.        0B     44.6  339126     8
# 6 %>%           1.27µs   1.39µs   632255.    5.15KB     43.2  117210     8
# 7 ->.;        289.87ns 330.97ns  2581693.        0B     27.0  477389     5
# 8 :=            6.46µs   7.12µs   131921.        0B     48.8   24330     9
# 9 %|%           2.05µs   2.32µs   394515.        0B     43.2   73094     8
#10 list.         2.42µs   2.74µs   340220.     8.3KB     42.3   64324     8

Answer 5

One difference is their placeholder: `_` in base R, `.` in magrittr.

Since R 4.2.0, the base R pipe has a placeholder for piped-in values, `_`, similar to `%>%`'s `.`, but its use is restricted to named arguments, and it can only be used once per call.

It is now possible to use a named argument with the placeholder `_` in the rhs call to specify where the lhs is to be inserted. The placeholder can only appear once on the rhs.

To reiterate Ronak Shah's example, you can now use `_` as a named argument on the right-hand side to refer to the left-hand side of the pipe:

c("dogs", "cats", "rats") |> 
    grepl("at", x = _)
#[1] FALSE  TRUE  TRUE

but it has to be named:

c("dogs", "cats", "rats") |> 
    grepl("at", _)
#Error: pipe placeholder can only be used as a named argument

and it cannot appear more than once (to overcome this issue, one can still use the solutions provided by Ronak Shah):

c("dogs", "cats", "rats") |> 
  expand.grid(x = _, y = _)
# Error in expand.grid(x = "_", y = "_") : pipe placeholder may only appear once

While this is possible with magrittr:

library(magrittr)
c("dogs", "cats", "rats") %>% 
  expand.grid(x = ., y = .)
#     x    y
#1 dogs dogs
#2 cats dogs
#3 rats dogs
#4 dogs cats
#5 cats cats
#6 rats cats
#7 dogs rats
#8 cats rats
#9 rats rats
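With the base pipe, one way to get the same grid is the anonymous-function workaround from the earlier answer. This is a sketch assuming R >= 4.1.0; the argument name `v` and the variable `grid` are arbitrary choices for illustration.

```r
# Requires R >= 4.1.0 for |> and the \(x) lambda syntax.
# Inside the lambda the piped-in value is an ordinary variable,
# so it can be used as many times as needed.
grid <- c("dogs", "cats", "rats") |>
  (\(v) expand.grid(x = v, y = v))()

nrow(grid)    # 9, one row per (x, y) combination
names(grid)   # "x" "y"
```

The cost is one extra function call, which is still far less than the overhead of `%>%` in the benchmark shown earlier.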

5 answers

Answer 1
41 2021-05-21 13:45:02

Answer 2
34 2021-06-03 12:22:47

Answer 3
28 2021-05-25 07:38:26

Answer 4
13 2022-05-02 12:09:26

Answer 5
8 2022-04-25 18:24:01
