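Pass only one argument to a function from a choice of a few and conditionally pipe within dplyr
I'm looking for a way to conditionally pass just one argument (one of three choices) to a function. Depending on the choice, I simply want to create a variable in the dataset. Suppose we have the following dataset: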
set.seed(10)
test <- data.frame(time_stamp = sample(seq(as.Date('1999/01/01'), as.Date('2012/01/01'), by="day"), 12))
test
# time_stamp
# 1 2000-05-05
# 2 2009-03-09
# 3 2008-04-24
# 4 2011-03-22
# 5 2003-05-27
# 6 2003-01-01
# 7 2008-10-22
# 8 2003-10-13
# 9 2011-02-26
# 10 2008-08-27
# 11 2011-12-30
# 12 2001-07-18
When I run the function, the output I want is:
test_fun(type = "halfs")
#or more simply
test_fun(halfs)
# time_stamp half_var
# 1 2000-05-05 H1 2000
# 2 2009-03-09 H1 2009
# 3 2008-04-24 H1 2008
# 4 2011-03-22 H1 2011
# 5 2003-05-27 H1 2003
# 6 2003-01-01 H1 2003
# 7 2008-10-22 H2 2008
# 8 2003-10-13 H2 2003
# 9 2011-02-26 H1 2011
# 10 2008-08-27 H2 2008
# 11 2011-12-30 H2 2011
# 12 2001-07-18 H2 2001
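Depending on the chosen argument, I run an if statement inside the pipe. I thought I could do this by putting {} around the conditional, as mentioned here, but I can't figure it out. Here is the function: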
test_fun <- function(type = c("halfs", "quarts", "other")) {
test %>% {
if (type == "halfs") {
mutate(half_var = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
} else if (type == "quarts") {
mutate(quarts_var = case_when(month(time_stamp) <= 3 ~ paste('q1', year(time_stamp)),
month(time_stamp) > 3 & month(time_stamp) <= 6 ~ paste('q2', year(time_stamp)),
month(time_stamp) > 6 & month(time_stamp) <= 9 ~ paste('q3', year(time_stamp)),
month(time_stamp) > 9 ~ paste('q4', year(time_stamp))))
} else (type == "other") {
mutate(other = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
}
}
}
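I get an error about an unexpected bracket, but I think the problem relates to the if conditional inside the pipe (all the brackets are closed).
Another approach might be to use optional arguments as suggested here: test_fun <- function(halfs, quarts = NULL, other = NULL). Written this way it implies that halfs must be supplied, which is not the case. What I really want is something like test_fun <- function(halfs = NULL, quarts = NULL, other = NULL) or test_fun <- function(...). A workaround might be to supply the data as an argument, test_fun <- function(test, halfs = NULL, quarts = NULL, other = NULL), but I can't figure it out.
Any suggestions would be great.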
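The syntax error is real and has to be fixed first. else (type == "other") is not valid syntax; I think you meant else if (type == "other"). Because there is no if, R treats (type == "other") as the body of the else, so the opening brace that follows is unexpected.
Also, when you turn the right-hand side of the pipe into a code block, the piped value is no longer inserted as the first argument, so you need to place it yourself with the dot placeholder (.). Your mutate calls inside {} should be written as mutate(., half_var = ...):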
library(dplyr)
library(lubridate) # month(), year()
test_fun <- function(type = c("halfs", "quarts", "other")) {
test %>% {
if (type == "halfs") {
mutate(., half_var = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
} else if (type == "quarts") {
mutate(., quarts_var = case_when(month(time_stamp) <= 3 ~ paste('q1', year(time_stamp)),
month(time_stamp) > 3 & month(time_stamp) <= 6 ~ paste('q2', year(time_stamp)),
month(time_stamp) > 6 & month(time_stamp) <= 9 ~ paste('q3', year(time_stamp)),
month(time_stamp) > 9 ~ paste('q4', year(time_stamp))))
} else if (type == "other") {
mutate(., other = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
}
}
}
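One caveat with the corrected function: when type is left at its default, type == "halfs" compares a length-3 vector, which if() rejects in recent R versions (4.2.0 and later). The following is a minimal sketch of one way around that, not the answerer's code. It assumes the data is passed in explicitly, validates type with match.arg(), and dispatches with switch() so no braces are needed inside a pipe. The names add_period, half_label and dates are hypothetical, and quarter() is taken from lubridate.

```r
library(dplyr)
library(lubridate)

# Hypothetical variant of test_fun: the data frame is an explicit argument
# and `type` is validated before use.
add_period <- function(data, type = c("halfs", "quarts", "other")) {
  type <- match.arg(type)  # first choice by default; errors on unknown values

  # Label a date as "H1 <year>" or "H2 <year>"
  half_label <- function(x) paste(ifelse(month(x) <= 6, "H1", "H2"), year(x))

  switch(type,
    halfs  = mutate(data, half_var = half_label(time_stamp)),
    quarts = mutate(data, quarts_var = paste0("q", quarter(time_stamp), " ",
                                              year(time_stamp))),
    other  = mutate(data, other = half_label(time_stamp))
  )
}

dates <- data.frame(time_stamp = as.Date(c("2000-05-05", "2008-10-22")))
add_period(dates, "halfs")
add_period(dates, "quarts")
```

Because the data frame is the first argument, the function also composes with the pipe, for example dates %>% add_period("quarts").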
These calculations are already directly available through yearmon and yearqtr in the zoo package, so:
library(dplyr)
library(zoo)
test %>%
mutate(yearmon = as.yearmon(time_stamp),
yearqtr = as.yearqtr(time_stamp),
yearhalf = paste0(as.integer(yearmon), " H", (cycle(yearmon) > 6) + 1))
giving:
time_stamp yearmon yearqtr yearhalf
1 2005-08-07 Aug 2005 2005 Q3 2005 H2
2 2002-12-27 Dec 2002 2002 Q4 2002 H2
3 2004-07-19 Jul 2004 2004 Q3 2004 H2
4 2008-01-03 Jan 2008 2008 Q1 2008 H1
5 2000-02-08 Feb 2000 2000 Q1 2000 H1
6 2001-12-05 Dec 2001 2001 Q4 2001 H2
7 2002-07-26 Jul 2002 2002 Q3 2002 H2
8 2002-07-15 Jul 2002 2002 Q3 2002 H2
9 2006-12-29 Dec 2006 2006 Q4 2006 H2
10 2004-07-29 Jul 2004 2004 Q3 2004 H2
11 2007-06-16 Jun 2007 2007 Q2 2007 H1
12 2006-05-13 May 2006 2006 Q2 2006 H1
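To see how the yearhalf expression is assembled, it may help to look at its pieces separately. This is a small sketch on two dates taken from the output above; the variable name ym is only for illustration.

```r
library(zoo)

ym <- as.yearmon(as.Date(c("2008-01-03", "2005-08-07")))

as.integer(ym)        # year part of each yearmon value
cycle(ym)             # month number within the year (1 to 12)
(cycle(ym) > 6) + 1   # 1 for January-June, 2 for July-December

# Combine the year and the half index into a label
paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1)
# "2008 H1" "2005 H2"
```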
It is not clear that a function is really needed for this, but if one is wanted:
test_fun <- function(x, type = c("month", "quarter", "half")) {
type <- match.arg(type)
ym <- as.yearmon(x)
if (type == "month") ym
else if (type == "quarter") as.yearqtr(x)
else paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1)
}
library(zoo)
test %>%
mutate(yearmonth = test_fun(time_stamp, "month"),
yearqtr = test_fun(time_stamp, "quarter"),
yearhalf = test_fun(time_stamp, "half"))
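The function above relies on match.arg() to resolve type. As a brief base R illustration of what that call does (the helper name pick is hypothetical): it returns the first choice when the argument is omitted, accepts unambiguous partial matches, and raises an error for anything else.

```r
# Hypothetical helper that only resolves its argument
pick <- function(type = c("month", "quarter", "half")) {
  match.arg(type)
}

pick()         # "month": the first choice is the default
pick("half")   # "half"
pick("q")      # "quarter": partial matching is allowed

# An unknown value is an error rather than a silent fall-through
result <- try(pick("year"), silent = TRUE)
inherits(result, "try-error")  # TRUE
```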
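Regarding the subject line of the question, which asks for a function of one argument: I'm not sure that is a good idea, because it means hardcoding which column to use, but if you really want to do that, it is done below. A second argument is provided in case you change your mind and want to specify the time_stamp column yourself; if it is not specified, it defaults to the appropriate column, provided the function is called within mutate.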
test_fun2 <- function(type = c("month", "quarter", "half"),
x = parent.frame()$.data$time_stamp) {
type <- match.arg(type)
ym <- as.yearmon(x)
if (type == "month") ym
else if (type == "quarter") as.yearqtr(x)
else paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1)
}
library(zoo)
test %>%
mutate(month = test_fun2("month"),
quarter = test_fun2("quarter"),
halfs = test_fun2("half"))
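The default for x in test_fun2 leans on two base R behaviours: default arguments are evaluated lazily, only when first used, and parent.frame() inside a default resolves to the environment of the caller. Here is a minimal base R sketch of that mechanism with hypothetical functions f and g. It illustrates the lookup only and makes no claim about how any particular dplyr version exposes its data.

```r
# The default for `x` is looked up in the caller's environment, at the time
# `x` is first used inside f.
f <- function(x = parent.frame()$y) {
  x
}

g <- function() {
  y <- 42   # local to g, not visible globally
  f()       # f's default finds y in g's frame
}

g()         # 42
f(x = 1)    # 1: an explicit argument bypasses the default
```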
If what you mean is that you want test_fun3 to return up to 3 columns, then:
test_fun3 <- function(x, month = FALSE, quarter = FALSE, half = FALSE) {
ym <- as.yearmon(x)
data <- data.frame(yearmon = ym,
quarter = as.yearqtr(x),
half = paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1))
data[c(month, quarter, half)]
}
test %>%
bind_cols(test_fun3(.$time_stamp, TRUE, TRUE))
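The column selection in test_fun3 works because indexing a data frame with a logical vector keeps the columns whose position is TRUE. A tiny base R sketch with a made-up data frame df:

```r
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)

# One logical flag per column: keep a and c, drop b
keep <- c(TRUE, FALSE, TRUE)
selected <- df[keep]
names(selected)  # "a" "c"
```

In test_fun3 the three flags month, quarter and half play the role of keep, so the call with TRUE, TRUE and the default half = FALSE returns the yearmon and quarter columns.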