Adjust function created in R
Can you help me get the function right? First, I'll show you an example:
df1 <- structure(
list(
X1 = c(1, 1, 1, 1),
X2 = c("4","3","1","2"),
X3 = c("1", "2","3","2"),
X4 = c("1", "2","3","2"),
XM1 = c(200, 300, 200, 200),
XMR0 = c(300, 300, 300, 300),
XMR01 = c(300, 300, 300, 300),
XMR02 = c(300,300,300,300),
XMR03 = c(300,300,300,300),
XMR04 = c(300,250,350,350)),row.names = c(NA, 4L), class = "data.frame")
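library(dplyr)  # provides %>%, transmute(), across() and the selection helpers

f1 <- function(data){
  data %>%
    # keep the X<digits> columns and XM1, then add XM1 minus each XMR* column
    transmute(across(matches("^X\\d+$")),
              XM1, across(starts_with("XMR"), ~ XM1 - .x,
                          .names = "{.col}_PV"))
}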
f1(df1)
> f1(df1)
  X1 X2 X3 X4 XM1 XMR0_PV XMR01_PV XMR02_PV XMR03_PV XMR04_PV
1  1  4  1  1 200    -100     -100     -100     -100     -100
2  1  3  2  2 300       0        0        0        0       50
3  1  1  3  3 200    -100     -100     -100     -100     -150
4  1  2  2  2 200    -100     -100     -100     -100     -150
Now I have a similar data frame, but with different column names.
df1 <- structure(
list(
Id = c(1, 1, 1, 1),
date1 = c("2022-01-06","2022-01-06","2022-01-06","2022-01-06"),
date2 = c("2022-01-02","2022-01-03","2022-01-09","2022-01-10"),
Week = c("Sunday","Monday","Sunday","Monday"),
Category = c("EFG", "ABC","EFG","ABC"),
DR1 = c(200, 300, 200, 200),
DRM0 = c(300, 300, 300, 300),
DRM01 = c(300, 300, 300, 300),
DRM02 = c(300,300,300,300),
DRM03 = c(300,300,300,300),
DRM04 = c(300,250,350,350)),row.names = c(NA, 4L), class = "data.frame")
So I would like to create a function that could be called f2. Compared to f1 above, what would my function look like now?
Expected output
  Id      date2   Week Category DR1 DRM0_PV DRM01_PV DRM02_PV DRM03_PV DRM04_PV
1  1 2022-01-02 Sunday      EFG 200    -100     -100     -100     -100     -100
2  1 2022-01-03 Monday      ABC 300       0        0        0        0       50
3  1 2022-01-09 Sunday      EFG 200    -100     -100     -100     -100     -150
4  1 2022-01-10 Monday      ABC 200    -100     -100     -100     -100     -150
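We can add a few extra arguments to the function as inputs:
- colnm - the name of the column to subtract from, given as a string (ensym converts it to a symbol, which is then evaluated with !!; because of ensym, an unquoted column name also works as input)
- pat - the prefix pattern of the column names that across loops over
- cols_del - the columns to delete. It defaults to NULL, so if no fourth argument is given, no columns are removed.

f1 <- function(data, colnm, pat, cols_del = NULL){
  # accept the column either as a string or as a bare name
  colnm <- rlang::ensym(colnm)
  data %>%
    # re-create colnm so it survives .keep = "unused", then add colnm minus each pat* column
    mutate(!! colnm, across(starts_with(pat), ~ !! colnm - .x,
                            .names = "{.col}_PV" ), .keep = "unused") %>%
    # drop the requested columns, if any
    select(-any_of(cols_del))
}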
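The code loops across the columns whose names start with the value of pat, subtracts each of them from the colnm column, and keeps only the columns that are unused. That is, because the new columns are created with .names, the looped columns are not returned in the data. The 'DR1' or 'XM1' column would be dropped as well, since it is used in the calculation, so it is selected explicitly (!! colnm). In the last step, the columns given in cols_del are removed from the output with any_of, which ignores names that are not present.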
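The effect of .keep = "unused" may be easier to see on a tiny example. The sketch below uses a hypothetical toy data frame (toy, base, m1, m2 and extra are made-up names, not from the question) and compares the result with and without listing the base column as its own expression inside mutate.

```r
library(dplyr)

# Hypothetical toy data: `base` plays the role of DR1/XM1,
# `m1` and `m2` play the role of the looped DRM*/XMR* columns
toy <- data.frame(
  id = 1:2,
  base = c(10, 20),
  m1 = c(1, 2),
  m2 = c(3, 4),
  extra = c("a", "b")
)

# `base` is only used inside the lambda, so .keep = "unused" drops it
without_base <- toy %>%
  mutate(across(starts_with("m"), ~ base - .x, .names = "{.col}_PV"),
         .keep = "unused")

# Listing `base` as its own expression re-creates it, so it is kept
with_base <- toy %>%
  mutate(base,
         across(starts_with("m"), ~ base - .x, .names = "{.col}_PV"),
         .keep = "unused")

names(without_base)
names(with_base)
```

In both cases m1 and m2 disappear because they were used to build the new _PV columns, while id and extra are untouched and therefore kept.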
- Testing
> f1(df1, "DR1", "DRM", "date1")
  Id      date2   Week Category DR1 DRM0_PV DRM01_PV DRM02_PV DRM03_PV DRM04_PV
1  1 2022-01-02 Sunday      EFG 200    -100     -100     -100     -100     -100
2  1 2022-01-03 Monday      ABC 300       0        0        0        0       50
3  1 2022-01-09 Sunday      EFG 200    -100     -100     -100     -100     -150
4  1 2022-01-10 Monday      ABC 200    -100     -100     -100     -100     -150
- Using the original 'df1'
> f1(df1, "XM1", "XMR")
  X1 X2 X3 X4 XM1 XMR0_PV XMR01_PV XMR02_PV XMR03_PV XMR04_PV
1  1  4  1  1 200    -100     -100     -100     -100     -100
2  1  3  2  2 300       0        0        0        0       50
3  1  1  3  3 200    -100     -100     -100     -100     -150
4  1  2  2  2 200    -100     -100     -100     -100     -150
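Since colnm goes through ensym, the column to subtract from can also be passed as a bare name. The following is a minimal, self-contained sketch of that; df2 is a hypothetical cut-down version of the second data frame, and f1 is repeated so the block runs on its own.

```r
library(dplyr)

# Hypothetical, shortened version of the second data frame
df2 <- data.frame(
  Id = c(1, 1),
  date1 = c("2022-01-06", "2022-01-06"),
  DR1 = c(200, 300),
  DRM0 = c(300, 300),
  DRM01 = c(300, 250)
)

# Same function as in the answer, repeated here for self-containment
f1 <- function(data, colnm, pat, cols_del = NULL) {
  colnm <- rlang::ensym(colnm)
  data %>%
    mutate(!!colnm,
           across(starts_with(pat), ~ !!colnm - .x, .names = "{.col}_PV"),
           .keep = "unused") %>%
    select(-any_of(cols_del))
}

quoted <- f1(df2, "DR1", "DRM", "date1")   # column name as a string
unquoted <- f1(df2, DR1, "DRM", "date1")   # column name as a bare symbol

# Both calling styles should give the same data frame
identical(quoted, unquoted)
```

Note that pat and cols_del are still plain character inputs; only colnm is captured with ensym.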