R: transition from magrittr to native pipe and translation of a function
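Please see the reprex at the end of this post. For various reasons, I am transitioning from `%>%` to the native pipe. I struggle a bit at times, and I would like comments on a couple of functions. In the first case (rewriting the `complete_data()` function with `|>`), I do not understand why one of my approaches works and the other does not.
In the second case (the `move_row()` function), I found a workaround, but it does not generalise well to the other functions I have. With magrittr, I can build a chain of pipes containing `nrow(.)` to pass the number of rows of whatever tibble I have at that point to a function. How can I do the same with the native pipe? Many thanks!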
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
## First look at these functions. They just try to discard incomplete rows in
## a tibble
complete_data <- function(data){
  res <- data %>% filter(complete.cases(.))
  return(res)
}
## By trial and error, I wrote this
complete_data_native <- function(data){
  res <- data |> (\(data) filter(data, complete.cases(data)))()
  return(res)
}
## this was my first attempt, but why does it fail?
complete_data_native_wrong <- function(data){
  res <- data |> (\(x) filter(x, complete.cases(x)))()
  return(res)
}
df <- structure(list(x = c(1, 2, NA, 4), y = c(NA, NA, 3, 4)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -4L))
df
#> # A tibble: 4 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 NA
#> 2 2 NA
#> 3 NA 3
#> 4 4 4
df |> complete_data()
#> # A tibble: 1 × 2
#> x y
#> <dbl> <dbl>
#> 1 4 4
df |> complete_data_native()
#> # A tibble: 1 × 2
#> x y
#> <dbl> <dbl>
#> 1 4 4
df |> complete_data_native_wrong() ### why does this fail
#> # A tibble: 3 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 NA
#> 2 2 NA
#> 3 4 4
## Now another function. Given a tibble, it moves a row from ini_pos to fin_pos
move_row <- function(df, ini_pos, fin_pos){
  row_pick <- slice(df, ini_pos)
  if (fin_pos=="last"){
    res <- df %>%
      slice(-ini_pos) %>%
      add_row(row_pick, .before = nrow(.))
  } else{
    res <- df %>%
      slice(-ini_pos) %>%
      add_row(row_pick, .before = fin_pos)
  }
  return(res)
}
move_row_native_attempt <- function(df, ini_pos, fin_pos){
  ll <- nrow(df) ## it gets the job done, but I do not want this
  row_pick <- slice(df, ini_pos)
  if (fin_pos=="last"){
    res <- df |>
      slice(-ini_pos) |>
      add_row(row_pick, .before = ll) ## I want to use the native pipe
                                      ## to write the equivalent of nrow(.)
                                      ## with magrittr placeholder but I cannot do that
  } else{
    res <- df |>
      slice(-ini_pos) |>
      add_row(row_pick, .before = fin_pos)
  }
  return(res)
}
df
#> # A tibble: 4 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 NA
#> 2 2 NA
#> 3 NA 3
#> 4 4 4
df |> move_row(1,"last")
#> # A tibble: 4 × 2
#> x y
#> <dbl> <dbl>
#> 1 2 NA
#> 2 NA 3
#> 3 1 NA
#> 4 4 4
df |> move_row_native_attempt(1,"last") ## gets the job done, but it is not what I want. See comments in the function definition
#> # A tibble: 4 × 2
#> x y
#> <dbl> <dbl>
#> 1 2 NA
#> 2 NA 3
#> 3 4 4
#> 4 1 NA
print(sessionInfo())
#> R version 4.2.1 (2022-06-23)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 11 (bullseye)
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
#> [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
#> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] dplyr_1.0.9
#>
#> loaded via a namespace (and not attached):
#> [1] knitr_1.39 magrittr_2.0.3 tidyselect_1.1.2 R6_2.5.1
#> [5] rlang_1.0.2 fastmap_1.1.0 fansi_1.0.3 stringr_1.4.0
#> [9] highr_0.9 tools_4.2.1 xfun_0.31 utf8_1.2.2
#> [13] DBI_1.1.3 cli_3.3.0 withr_2.5.0 htmltools_0.5.2
#> [17] ellipsis_0.3.2 assertthat_0.2.1 yaml_2.3.5 digest_0.6.29
#> [21] tibble_3.1.7 lifecycle_1.0.1 crayon_1.5.1 purrr_0.3.4
#> [25] vctrs_0.4.1 fs_1.5.2 glue_1.6.2 evaluate_0.15
#> [29] rmarkdown_2.14 reprex_2.0.1 stringi_1.7.6 compiler_4.2.1
#> [33] pillar_1.7.0 generics_0.1.2 pkgconfig_2.0.3
Created on 2022-06-29 by the reprex package (v2.0.1)
`complete_data_native_wrong()`:
complete_data_native_wrong <- function(data){
  res <- data |> (\(x) filter(x, complete.cases(x)))()
  return(res)
}
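Data masking is the reason this lovely function does not work as expected.
"So, what exactly is happening?", you ask.
`dplyr::filter()` looks for a column named `x`, does find one, and then passes the contents of that column to `complete.cases()`. The same thing happens when you use `y` instead of `x`. `complete.cases()` ends up operating on a vector rather than on the data.frame, hence the result: only the rows where that one column is `NA` are dropped.
"But... how do I make sure `dplyr::filter()` doesn't do that?", you ask.
That is where the bang-bang operator `!!` comes in. It injects the value of the function argument `x` into the expression before `filter()` evaluates it, so the column of the same name is never looked up. We can now have `complete_data_native_right()`: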
complete_data_native_right <- function(data){
  res <- data |> (\(x) filter(x, complete.cases(!!x)))()
  # res <- data |> (\(y) filter(y, complete.cases(!!y)))()
  return(res)
}
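To make the masking effect concrete, here is a minimal base-R sketch of what `complete.cases()` receives in each case. It rebuilds the question's data as a plain `data.frame` and uses `with()` as a stand-in for dplyr's data mask; that substitution is an assumption made here for illustration (both look names up in the data before the calling environment), and the names `whole_frame`, `column_only`, and `masked` are hypothetical.

```r
# Same values as the question's tibble, as a plain data.frame (base R only).
df <- data.frame(x = c(1, 2, NA, 4), y = c(NA, NA, 3, 4))

# Row-wise check across all columns: what the question intended.
whole_frame <- complete.cases(df)

# Check on the x column alone: what filter() effectively evaluated.
column_only <- complete.cases(df$x)

# with() resolves names in the data first, so the column `x` shadows
# the function argument `x`, much like the data mask does in filter().
masked <- (\(x) with(x, complete.cases(x)))(df)

print(whole_frame)                     # FALSE FALSE FALSE  TRUE
print(column_only)                     #  TRUE  TRUE FALSE  TRUE
print(identical(masked, column_only))  # TRUE
```

The second vector keeps rows 1, 2 and 4, which is exactly the three-row result the question reports for `complete_data_native_wrong()`.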
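`move_row_native_attempt()`: for this one, you can use the shorthand (anonymous) function notation without any problems, because `row_pick` and `nrow(x)` are ordinary arguments and no column name collides with them: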
move_row_native_attempt <- function(df, ini_pos, fin_pos){
  row_pick <- slice(df, ini_pos)
  if (fin_pos=="last"){
    res <- df |>
      slice(-ini_pos) |>
      (\(x) add_row(x, row_pick, .before = nrow(x)))()
  } else{
    res <- df |>
      slice(-ini_pos) |>
      add_row(row_pick, .before = fin_pos)
  }
  return(res)
}
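One detail is worth checking with a small experiment. Inside the anonymous function, the row count is taken after a row has already been dropped, which corresponds to magrittr's `nrow(.)` and not to a count stored before the pipeline starts (the `ll` variable in the question). The sketch below is a base-R analogue only: it uses indexing and `rbind()` in place of `slice()` and `add_row()`, a substitution made here so the block is self-contained, and `d` and `res` are illustrative names.

```r
# Same values as the question's tibble, as a plain data.frame (base R only).
df <- data.frame(x = c(1, 2, NA, 4), y = c(NA, NA, 3, 4))
row_pick <- df[1, ]

# Drop the first row, then re-insert it before the last remaining row.
# nrow(d) is 3 here, because d is the data *after* the row was dropped.
res <- df[-1, ] |>
  (\(d) rbind(d[seq_len(nrow(d) - 1), ], row_pick, d[nrow(d), ]))()

print(res$x)  # 2 NA 1 4
print(res$y)  # NA 3 NA 4
```

This reproduces the row order shown in the question for `move_row(1, "last")`, where the moved row lands in the third position rather than at the very end.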
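I think this is simply because the data frame has a column named `x`, and `filter` uses that `x` rather than the argument `x` of your inline function. If you change the variable name in the function declaration from `x` to `z`, I think it works. See below.
Still, I think the fact that `iris |> filter(complete.cases(_))` throws an error is a strike against the base pipe. The restriction, as far as I can tell, is that `_` can only be used as a named argument of the function being piped into, and cannot be nested inside another call the way `.` can.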
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
complete_data_native_wrong <- function(data){
  res <- data |> (\(z) filter(z, complete.cases(z)))() # change to z
  return(res)
}
df <- structure(
list(x = c(1, 2, NA, 4),
y = c(NA, NA, 3, 4)),
class = c("tbl_df",
"tbl", "data.frame"),
row.names = c(NA, -4L)
)
df |> complete_data_native_wrong()
#> # A tibble: 1 × 2
#> x y
#> <dbl> <dbl>
#> 1 4 4
Created on 2022-06-29 by the reprex package (v2.0.1)
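As a rough illustration of the placeholder restriction mentioned above, the following base-R sketch shows one use of `_` that the parser accepts and the anonymous-function workaround for the nested case. It assumes R 4.2 or later (the version that introduced `_`); the `lm()` call on `mtcars` and the names `fit` and `complete` are chosen here purely for illustration and do not come from the answers.

```r
# Requires R >= 4.2 for the `_` placeholder.
df <- data.frame(x = c(1, 2, NA, 4), y = c(NA, NA, 3, 4))

# Accepted: `_` appears once, as a named argument of the piped-into call.
fit <- mtcars |> lm(mpg ~ wt, data = _)

# Not accepted by the parser: `_` nested inside another call.
# df |> filter(complete.cases(_))

# Workaround: wrap the step in an anonymous function and call it.
complete <- df |> (\(d) d[complete.cases(d), ])()

print(class(fit))      # "lm"
print(nrow(complete))  # 1
```

Choosing an argument name that does not match any column (here `d`) also sidesteps the masking problem when the same pattern is used with `filter()`.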