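Mapply with data.frame/list as the Arguments for the Function
In short, I have a larger function that creates data.frames which are subsets of a larger data.frame and names them after the function's arguments. It builds data.frames for the raw data as well as for the Holt-Winters output and the forecast output, which means it creates multiple data.frames. A small example is below (although there are not enough intervals here to actually produce a ts-class data.frame):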
Group <- c("Primary_Group","Primary_Group","Primary_Group","Primary_Group","Primary_Group","Primary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group")
Day <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
Type <- c("A","A","A","B","B","B","A","A","A","B","B","B","A","A","A","B","B","B")
Value <- c(7,3,10,3,9,4,0,9,3,10,1,6,3,4,10,2,3,1)
df <- as.data.frame(cbind(Group,Day,Type,Value))
Fun <- function(Group, Type, A, B, G) {
  df <- Data[Data$Group == Group & Data$Type == Type, ]
  assign(paste(Group, Type, "_df", sep = ''), df, envir = parent.frame())
  df_holtwinters <- HoltWinters(ts(Data[Data$Group == Group & Data$Type == Type, ],
                                   frequency = 365), alpha = A, beta = B, gamma = G)
  assign(paste(Group, Type, "_hw", sep = ''), df_holtwinters, envir = parent.frame())
}
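You'll notice that Group and Type are characters, while A, B and G are numeric or NULL.
If I now have a data.frame made up of list values, how can I best loop the above function (perhaps using mapply) so that it uses the value of every column in the first row, then every column in the second row, and so on, creating multiple data frames?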
argGroup <- c("Primary_Group","Primary_Group","Secondary_Group","Secondary_Group","Tertiary_Group","Tertiary_Group")
argType <- c("A","B","A","B","A","B")
argA <- c(NA, NA, NA, NA, NA, NA)
argB <- c(0.05, 0.05, NA, NA, NA, NULL)
argG <- c(NA, NA, NA, NA, NA, NA)
argGroup[is.na(argGroup)] <- list(NULL)
argType[is.na(argType)] <- list(NULL)
argA[is.na(argA)] <- list(NULL)
argB[is.na(argB)] <- list(NULL)
argG[is.na(argG)] <- list(NULL)
Arguments <- cbind(argGroup, argType, argA, argB, argG)
Ideally, I would end up with the following data.frames being generated...
Primary_Group_A_df
Primary_Group_A_hw
Primary_Group_B_df
Primary_Group_B_hw
Secondary_Group_A_df
Secondary_Group_A_hw
Secondary_Group_B_df
Secondary_Group_B_hw
Tertiary_Group_A_df
Tertiary_Group_A_hw
Tertiary_Group_B_df
Tertiary_Group_B_hw
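It would also help to know the best (most automated) way to rbind all of the _df objects together and all of the _hw objects together.
Any help would be amazing and much appreciated. Thank you very much!
You lose the type information by using as.data.frame(cbind(...)); just use data.frame directly: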
Data <- data.frame(
Group = rep(c("Primary_Group", "Secondary_Group", "Tertiary_Group"), each = 6L),
Day = rep(1L:3L, 6L),
Type = rep(rep(c("A", "B"), each = 3L), 3L),
Value = c(7,3,10,3,9,4,0,9,3,10,1,6,3,4,10,2,3,1)
)
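To see the type loss concretely, here is a small sketch on a made-up three-row sample (via_cbind and direct are illustrative names, not objects from the question). cbind builds a single matrix first, so mixing numbers and strings forces every column to a non-numeric type:

```r
# Made-up sample columns, for illustration only
Day   <- c(1, 2, 3)
Type  <- c("A", "A", "B")
Value <- c(7, 3, 10)

# cbind() creates one character matrix, so the numeric columns are coerced
via_cbind <- as.data.frame(cbind(Day, Type, Value))

# data.frame() keeps each column's own type
direct <- data.frame(Day, Type, Value)

sapply(via_cbind, class)
sapply(direct, class)
```

Comparing the two sapply results shows Day and Value staying numeric only in the second data frame.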
After that, I think you could do something like the following:
split_data <- split(Data, as.list(Data[, c("Group", "Type")]))
dfs <- do.call(rbind, split_data)
dfs_hw <- lapply(split_data, function(sub_data) {
  Map(argA, argB, argG, f = function(A, B, G) {
    HoltWinters(ts(sub_data, frequency = 365), alpha = A, beta = B, gamma = G)
  })
})
dfs_hw <- do.call(rbind, unlist(dfs_hw, recursive = FALSE))
But I get an error from HoltWinters, so I can't say for sure. Also, I think dfs just contains Data again, only reordered.
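A quick way to check that suspicion about dfs, sketched on a small made-up data frame (toy, toy_split and toy_bound are hypothetical names, not the poster's objects):

```r
# Made-up sample with the same Group/Type layout as the question
toy <- data.frame(
  Group = rep(c("Primary_Group", "Secondary_Group"), each = 4L),
  Type  = rep(rep(c("A", "B"), each = 2L), 2L),
  Value = 1:8
)

# One data frame per Group/Type combination
toy_split <- split(toy, toy[c("Group", "Type")])
names(toy_split)

# Binding the pieces back together returns every original row, only reordered
toy_bound <- do.call(rbind, toy_split)
nrow(toy_bound) == nrow(toy)
```

So split followed by do.call(rbind, ...) is mainly useful when something is done to each piece in between.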
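Avoid flooding your global environment with many similarly structured objects. Consider using a container such as a list to hold the many data frames. One useful approach is by, which subsets a data frame by one or more factors (such as Group and Type) and returns a list of data frames. Also, rather than iterating by row, merge the arguments with the data so that the arguments are passed once per subset.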
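Specifically, call by twice, once for the DF list and once for the HW list. But first, merge the df and Arguments data frames by Group and Type. One challenge is that NULL cannot be stored in a data frame, so consider saving the string "NULL" and assigning temporary variables to pass into the HW arguments. Unfortunately, this converts the whole column to character type, so the non-NULL values need to be converted with as.numeric.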
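As an aside, the reason NULL is awkward here can be shown in a few lines, along with one possible alternative to the "NULL" string: keeping the arguments in plain lists and iterating with Map. This is only a sketch; fit_stub is a hypothetical stand-in for the real HoltWinters call, and groups and arg_b are made-up names.

```r
# c() silently drops NULL, so the vector ends up one element short
length(c(0.05, NA, NULL))   # 2

# A list keeps NULL as a real element
arg_b <- list(0.05, NULL, NULL)
length(arg_b)               # 3

# Hypothetical stand-in for the model-fitting call
fit_stub <- function(group, beta) {
  if (is.null(beta)) {
    paste(group, "beta estimated")
  } else {
    paste(group, "beta fixed at", beta)
  }
}

groups <- c("Primary_Group", "Secondary_Group", "Tertiary_Group")

# Map walks the inputs in parallel and passes NULL through untouched
results <- Map(fit_stub, groups, arg_b)
results$Secondary_Group
```

The trade-off is that lists of arguments cannot be merged onto the data the way a data frame can, which is why the approach below sticks with the string placeholder.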
Merge
Group <- c("Primary_Group","Primary_Group","Secondary_Group","Secondary_Group",
"Tertiary_Group","Tertiary_Group")
Type <- c("A","B","A","B","A","B")
argA <- c("NULL", "NULL", "NULL", "NULL", "NULL", "NULL")
argB <- c(0.05, 0.05, "NULL", "NULL", "NULL", "NULL")
argG <- c("NULL", "NULL", "NULL", "NULL", "NULL", "NULL")
Arguments <- data.frame(Group, Type, argA, argB, argG, stringsAsFactors=FALSE)
df <- merge(df, Arguments, by=c("Group", "Type"))
Data frame list (with named df elements)
# ORDER FOR NAMING LATER
df <- with(df, df[order(Type, Group),])
# DATAFRAME LIST
df_list <- by(df, df[c("Group", "Type")], identity)
# RENAME LIST
df_list <- setNames(df_list, unique(paste0(df$Group, "_", df$Type, "_df")))
# REFERENCE ELEMENTS
df_list$Primary_Group_A_df
df_list$Secondary_Group_A_df
df_list$Tertiary_Group_A_df
...
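The renaming above depends on the sort order matching the order in which by returns its elements. A variation that does not depend on ordering, sketched here on a made-up two-group sample (toy, key and toy_list are hypothetical names), is to build the names from the grouping key itself:

```r
# Made-up sample, for illustration only
toy <- data.frame(
  Group = rep(c("Primary_Group", "Secondary_Group"), each = 6L),
  Day   = rep(1:3, 4L),
  Type  = rep(rep(c("A", "B"), each = 3L), 2L),
  Value = c(7, 3, 10, 3, 9, 4, 0, 9, 3, 10, 1, 6)
)

# Combine both factors into one key such as "Primary_Group_A"
key <- interaction(toy$Group, toy$Type, sep = "_")

# split() names each element after its key, so no reordering is needed
toy_list <- split(toy, key)
names(toy_list) <- paste0(names(toy_list), "_df")

toy_list$Primary_Group_A_df
```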
HW list (with named hw elements)
# HW LIST
hw_list <- by(df, df[c("Group", "Type")], function(sub) {
  # CONDITIONALLY ASSIGN TEMP VARIABLES
  # (BEING SUBSETS: max(arg*)==min(arg*)==mean(arg*)==median(arg*))
  if(!is.na(max(sub$argA)) & max(sub$argA) == "NULL") { tmpA <- NULL }
  else { tmpA <- max(as.numeric(sub$argA)) }
  if(!is.na(max(sub$argB)) & max(sub$argB) == "NULL") { tmpB <- NULL }
  else { tmpB <- max(as.numeric(sub$argB)) }
  if(!is.na(max(sub$argG)) & max(sub$argG) == "NULL") { tmpG <- NULL }
  else { tmpG <- max(as.numeric(sub$argG)) }
  # PASS ARGS ONCE PER SUBSET
  return(HoltWinters(ts(sub, frequency = 365), alpha=tmpA, beta=tmpB, gamma=tmpG))
})
# RENAME LIST
hw_list <- setNames(hw_list, unique(paste0(df$Group, "_", df$Type, "_hw")))
# REFERENCE ELEMENTS
hw_list$Primary_Group_A_hw
hw_list$Secondary_Group_A_hw
hw_list$Tertiary_Group_A_hw
...
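On the question of binding all the _hw results together: fitted models are not rows, so one option is to pull a few fields out of each model and rbind those. The sketch below uses synthetic series and plain exponential smoothing (beta = FALSE, gamma = FALSE) so that it runs on short data; series_list, alpha_list, hw_demo and hw_summary are made-up names, and the choice of fields is an assumption about what is worth collecting.

```r
set.seed(1)

# Synthetic series, purely illustrative (not the data from the question)
series_list <- list(
  Primary_Group_A   = ts(50 + cumsum(rnorm(24))),
  Secondary_Group_A = ts(20 + cumsum(rnorm(24)))
)

# NULL means "estimate this parameter"; a number fixes it
alpha_list <- list(NULL, 0.3)

# Fit one model per series, passing each alpha once
hw_demo <- Map(function(x, a) {
  HoltWinters(x, alpha = a, beta = FALSE, gamma = FALSE)
}, series_list, alpha_list)

# One row per model: smoothing parameter, final level, and fit error
hw_summary <- do.call(rbind, lapply(hw_demo, function(m) {
  data.frame(
    alpha = unname(m$alpha),
    level = unname(m$coefficients[["a"]]),
    SSE   = m$SSE
  )
}))

hw_summary
```

The row names of hw_summary come from the list names, so each row can be traced back to its model.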
Output (using 3 as the HW frequency to align with the posted data)
> hw_list$Primary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.2169231
beta : 0.05
gamma: 0.1
Coefficients:
[,1]
a 2.89129621
b 0.08783715
s1 0.54815382
s2 -0.12485260
s3 0.21087038
> hw_list$Secondary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.752124
beta : 0
gamma: 0
Coefficients:
[,1]
a 3.691664e+00
b 3.333333e-01
s1 3.333333e-01
s2 -1.480388e-16
s3 -3.333333e-01
> hw_list$Tertiary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.3145406
beta : 0
gamma: 0
Coefficients:
[,1]
a 3.022946e+00
b -3.333333e-01
s1 -3.333333e-01
s2 -1.480388e-16
s3 3.333333e-01