In short, I have a larger function that creates data.frames that are subsets of a larger data.frame and are named after the arguments of the function. It's building data.frames for raw data AND the outputs and the predictive output of Holt-Winters...meaning it is creating multiple data.frames. A small example is the following (though there's not enough intervals here to actually generate a ts class data.frame):
Group <- c("Primary_Group","Primary_Group","Primary_Group","Primary_Group","Primary_Group","Primary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group")
Day <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
Type <- c("A","A","A","B","B","B","A","A","A","B","B","B","A","A","A","B","B","B")
Value <- c(7,3,10,3,9,4,0,9,3,10,1,6,3,4,10,2,3,1)
df <- as.data.frame(cbind(Group,Day,Type,Value))
Fun <- function(Group,Type, A, B, G){
df <- Data[Data$Group== Group & Data$Type== Type, ]
assign(paste(Group,Type,"_df",sep = ''), df, envir = parent.frame())
df_holtwinters <- HoltWinters(ts(Data[Data$Group== Group & Data$Type== Type, ],
frequency = 365), alpha = A, beta = B, gamma = G)
assign(paste(Group,Type,"_hw",sep = ''), df_holtwinters, envir = parent.frame())
}
You'll notice that the Group and Type are characters, while A, B, G are either numeric or NULL
.
If I now have a data.frame composed of lists values, how could I best loop the above function (likely with mapply
) to use the values from each column in row one...then each column from row 2 etc - creating several data frames.
argGroup <- c("Primary_Group","Primary_Group","Secondary_Group","Secondary_Group","Tertiary_Group","Tertiary_Group")
argType <- c("A","B","A","B","A","B")
argA <- c(NA, NA, NA, NA, NA, NA)
argB <- c(0.05, 0.05, NA, NA, NA, NULL)
argG <- c(NA, NA, NA, NA, NA, NA)
argGroup[is.na(argGroup)] <- list(NULL)
argType[is.na(argType)] <- list(NULL)
argA[is.na(argA)] <- list(NULL)
argB[is.na(argB)] <- list(NULL)
argG[is.na(argG)] <- list(NULL)
Arguments <- cbind(argType, argType, argA, argB, argG)
Ideally, I would get the following data.frames to generate...
Primary_Group_A_df
Primary_Group_A_hw
Primary_Group_B_df
Primary_Group_B_hw
Secondary_Group_A_df
Secondary_Group_A_hw
Secondary_Group_B_df
Secondary_Group_B_hw
Tertiary_Group_A_df
Tertiary_Group_A_hw
Tertiary_Group_B_df
Tertiary_Group_B_hw
It would also be helpful to understand how to best (most automated way) rbind
all the _df together and all the _hw together.
Any help would be amazing and very appreciated. Thanks so much!
You're losing type information by using as.data.frame(cbind(...))
, just use data.frame
directly:
Data <- data.frame(
Group = rep(c("Primary_Group", "Secondary_Group", "Tertiary_Group"), each = 6L),
Day = rep(1L:3L, 6L),
Type = rep(rep(c("A", "B"), each = 3L), 3L),
Value = c(7,3,10,3,9,4,0,9,3,10,1,6,3,4,10,2,3,1)
)
Afterwards, I presume you can do the following:
split_data <- split(Data, as.list(Data[, c("Group", "Type")]))
dfs <- do.call(rbind, split_data)
dfs_hw <- lapply(split_data, function(sub_data) {
Map(argA, argB, argG, f = function(A, B, G) {
HoltWinters(ts(sub_data, frequency = 365), alpha = A, beta = B, gamma = G)
})
})
dfs_hw <- do.call(rbind, unlist(dfs_hw, recursive = FALSE))
But I get an error from HoltWinters
, so I can't say for sure. Also, I think dfs
simply has Data
again, just reordered.
Avoid flooding your global environment with many similarly structured objects. Consider using a container such as a list to hold the many dataframes. One useful method is by
to subset your dataframe by one or more factor(s) such as Group and Type to return a list of dataframes. Also, don't iterate by rows but merge
arguments with data for one pass of arguments per subset.
Specifically, call by
twice for df and hw lists. But first, merge the df and Arguments data frames by Group and Type . One challenge is NULL
cannot be stored in a data frame, so consider saving "NULL"
string and assign temp variables to pass into the HW
arguments. Unfortunately, this will cast entire column as character type which you will need to convert with as.numeric
for non-NULL values.
Merge
Group <- c("Primary_Group","Primary_Group","Secondary_Group","Secondary_Group",
"Tertiary_Group","Tertiary_Group")
Type <- c("A","B","A","B","A","B")
argA <- c("NULL", "NULL", "NULL", "NULL", "NULL", "NULL")
argB <- c(0.05, 0.05, "NULL", "NULL", "NULL", "NULL")
argG <- c("NULL", "NULL", "NULL", "NULL", "NULL", "NULL")
Arguments <- data.frame(Group, Type, argA, argB, argG, stringsAsFactors=FALSE)
df <- merge(df, Arguments, by=c("Group", "Type"))
Dataframe List (with named df elements)
# ORDER FOR NAMING LATER
df <- with(df, df[order(Type, Group),])
# DATAFRAME LIST
df_list <- by(df, df[c("Group", "Type")], identity)
# RENAME LIST
df_list <- setNames(df_list, unique(paste0(df$Group, "_", df$Type, "_df")))
# REFERENCE ELEMENTS
df_list$Primary_Group_A_df
df_list$Secondary_Group_A_df
df_list$Tertiary_Group_A_df
...
HW List (with named hw elements)
# HW LIST
hw_list <- by(df, df[c("Group", "Type")], function(sub) {
# CONDITIONALLY ASSIGN TEMP VARIABLES
# (BEING SUBSETS: max(arg*)==min(arg*)==mean(arg*)==median(arg*))
if(!is.na(max(sub$argA)) & max(sub$argA) == "NULL") { tmpA <- NULL }
else { tmpA <- max(as.numeric(sub$argA)) }
if(!is.na(max(sub$argB)) & max(sub$argB) == "NULL") { tmpB <- NULL }
else { tmpB <- max(as.numeric(sub$argB)) }
if(!is.na(max(sub$argG)) & max(sub$argG) == "NULL") { tmpG <- NULL }
else { tmpG <- max(as.numeric(sub$argG)) }
# PASS ARGS ONCE PER SUBSET
return(HoltWinters(ts(sub, frequency = 365), alpha=tmpA, beta=tmpB, gamma=tmpG))
})
# RENAME LIST
hw_list <- setNames(hw_list, unique(paste0(df$Group, "_", df$Type, "_hw")))
# REFERENCE ELEMENTS
hw_list$Primary_Group_A_hw
hw_list$Secondary_Group_A_hw
hw_list$Tertiary_Group_A_hw
...
Output (using 3 for HW's frequency to align with posted data)
> hw_list$Primary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.2169231
beta : 0.05
gamma: 0.1
Coefficients:
[,1]
a 2.89129621
b 0.08783715
s1 0.54815382
s2 -0.12485260
s3 0.21087038
> hw_list$Secondary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.752124
beta : 0
gamma: 0
Coefficients:
[,1]
a 3.691664e+00
b 3.333333e-01
s1 3.333333e-01
s2 -1.480388e-16
s3 -3.333333e-01
> hw_list$Tertiary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.3145406
beta : 0
gamma: 0
Coefficients:
[,1]
a 3.022946e+00
b -3.333333e-01
s1 -3.333333e-01
s2 -1.480388e-16
s3 3.333333e-01
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.