How to handle data.frame output in j when grouping by and using lapply (data.table)
I can't figure out how to handle data.frame output in j while using data.table. I hope the MRE is self-explanatory:
library(data.table)
library(TTR)
? BBands
data(ttrc)
dt <- as.data.table(ttrc)
dt$symbol <- "a"
dt$symbol[1:200] <- "b"
window_sizes <- c(5, 22)
new_cols <- expand.grid("bbands", c("dn", "mavg", "up", "pctB"), window_sizes)
new_cols <- paste(new_cols$Var1, new_cols$Var2, new_cols$Var3, sep = "_")
output_bbands <- dt[, (new_cols) := lapply(window_sizes, function(w) BBands(Close, n = w)), by = symbol]
The code returns an error:
Error in `[.data.table`(dt, , lapply(window_sizes, function(w) BBands(Close, :
All items in j=list(...) should be atomic vectors or lists. If you are trying something like j=list(.SD,newcol=mean(colA)) then use := by group instead (much quicker), or cbind or merge afterward.
The function returns 4 columns. I would like to add all 4 columns to my dt.
Based on the OP's code, BBands should be applied grouped by 'symbol'. The error occurs only because the output is a list of matrices. E.g. if we do this on the whole data, the structure would be
str(lapply(window_sizes, function(w) BBands(dt$Close, n = w)))
List of 2
$ : num [1:5550, 1:4] NA NA NA NA 3.07 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "dn" "mavg" "up" "pctB"
$ : num [1:5550, 1:4] NA NA NA NA NA NA NA NA NA NA ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "dn" "mavg" "up" "pctB"
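To make the structure issue concrete without needing TTR, here is a minimal sketch. The function band_stats is a hypothetical stand-in for BBands (it is not part of any package) that, like BBands, returns a numeric matrix with named columns. It shows that each lapply result is a matrix, and that as.data.frame turns it into a list of plain column vectors.

```r
library(data.table)

# Hypothetical stand-in for TTR::BBands(): returns a numeric matrix
# with named columns (assumption made for illustration only)
band_stats <- function(x, n) {
  m <- frollmean(x, n)  # rolling mean; the first n - 1 values are NA
  cbind(dn = m - 1, mavg = m, up = m + 1)
}

res <- lapply(c(2, 3), function(w) band_stats(as.numeric(1:6), n = w))

is.matrix(res[[1]])            # each list element is a matrix
df <- as.data.frame(res[[1]])  # convert the matrix to a data.frame
is.list(df)                    # a data.frame is a list of column vectors
names(df)                      # column names are carried over from the matrix
```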
We can make a small change to the OP's code: convert each result to a data.frame and cbind them, keeping the data.table code similar and still grouped by 'symbol'
dt[, (new_cols) := do.call(cbind, lapply(window_sizes, function(w)
as.data.frame(BBands(Close, n = w)))), by = symbol]
head(dt)
Date Open High Low Close Volume symbol bbands_dn_5 bbands_mavg_5 bbands_up_5 bbands_pctB_5 bbands_dn_22 bbands_mavg_22 bbands_up_22 bbands_pctB_22
1: 1985-01-02 3.18 3.18 3.08 3.08 1870906 b NA NA NA NA NA NA NA NA
2: 1985-01-03 3.09 3.15 3.09 3.11 3099506 b NA NA NA NA NA NA NA NA
3: 1985-01-04 3.11 3.12 3.08 3.09 2274157 b NA NA NA NA NA NA NA NA
4: 1985-01-07 3.09 3.12 3.07 3.10 2086758 b NA NA NA NA NA NA NA NA
5: 1985-01-08 3.10 3.12 3.08 3.11 2166348 b 3.074676 3.098 3.121324 0.7572479 NA NA NA NA
6: 1985-01-09 3.12 3.17 3.10 3.16 3441798 b 3.065668 3.114 3.162332 0.9758734 NA NA NA NA
NOTE: If we just do the lapply on the whole column, the answer would be incorrect, because BBands is then applied without any grouping, so the rolling windows run across the boundary between symbols.
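The difference between the grouped and ungrouped versions can be sketched on toy data. The helper roll_stats below is a hypothetical stand-in for BBands (a matrix of rolling mean and rolling sum), and the toy table and column names are assumptions chosen for illustration. The first row of symbol "b" should have no rolling value yet, but the ungrouped version mixes in the last row of symbol "a".

```r
library(data.table)

# Hypothetical stand-in for TTR::BBands(): a matrix of rolling statistics
roll_stats <- function(x, n) {
  cbind(mean = frollmean(x, n), sum = frollsum(x, n))
}

toy <- data.table(symbol = rep(c("a", "b"), each = 4),
                  Close  = c(1, 2, 3, 4, 10, 20, 30, 40))
windows <- c(2, 3)

# Same naming scheme as in the question: statistic varies fastest, then window
grid <- expand.grid(c("mean", "sum"), windows)
cols <- paste("roll", grid$Var1, grid$Var2, sep = "_")

# Grouped: windows restart within each symbol
grouped <- copy(toy)[, (cols) := do.call(cbind, lapply(windows, function(w)
  as.data.frame(roll_stats(Close, n = w)))), by = symbol]

# Ungrouped: windows run over the whole Close column
ungrouped <- cbind(toy, setNames(do.call(cbind.data.frame,
  lapply(windows, function(w) roll_stats(toy$Close, n = w))), cols))

grouped$roll_mean_2[5]    # NA: first row of symbol "b" has no full window
ungrouped$roll_mean_2[5]  # 7: mean of 4 (symbol "a") and 10 (symbol "b")
```

Note that copy() is used so that := does not modify toy by reference before the ungrouped comparison is built.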
You can combine the list of results into one data frame using do.call and then cbind it to the original data frame. Note that this version applies BBands to the whole Close column, without grouping by symbol.
cbind(dt, setNames(do.call(cbind.data.frame,
lapply(window_sizes, function(w) BBands(dt$Close, n = w))), new_cols))