How to handle data.frame output in j when grouping by and using lapply (data.table)

I can't figure out how to handle data.frame output in j when using data.table. I hope the MRE is self-explanatory:

  library(data.table)
  library(TTR)
  ? BBands
  data(ttrc)
  dt <- as.data.table(ttrc)
  dt$symbol <- "a"
  dt$symbol[1:200] <- "b"
  window_sizes <- c(5, 22)
  new_cols <- expand.grid("bbands", c("dn", "mavg", "up", "pctB"), window_sizes)
  new_cols <- paste(new_cols$Var1, new_cols$Var2, new_cols$Var3, sep = "_")
  output_bbands <- dt[, (new_cols) := lapply(window_sizes, function(w) BBands(Close, n = w)), by = symbol]

The code returns an error:

Error in `[.data.table`(dt, , lapply(window_sizes, function(w) BBands(Close,  : 
  All items in j=list(...) should be atomic vectors or lists. If you are trying something like j=list(.SD,newcol=mean(colA)) then use := by group instead (much quicker), or cbind or merge afterward.

BBands returns 4 columns for each window size. I would like to add all of them to my dt.

Based on the OP's code, BBands should be applied grouped by 'symbol'; the error occurs only because the output is a list of matrices. For example, if we run it on the whole data, the structure is:

str(lapply(window_sizes, function(w) BBands(dt$Close, n = w)))
List of 2
 $ : num [1:5550, 1:4] NA NA NA NA 3.07 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "dn" "mavg" "up" "pctB"
 $ : num [1:5550, 1:4] NA NA NA NA NA NA NA NA NA NA ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "dn" "mavg" "up" "pctB"
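The difference between a matrix and a data.frame matters here, and it can be checked in base R without TTR. The sketch below uses a hypothetical stand-in, fake_bands, that only mimics the shape of the BBands output (a numeric matrix with four named columns); it is not the real indicator.

```r
# fake_bands is a hypothetical helper for illustration only.
# It returns a numeric matrix with the same column names as BBands().
fake_bands <- function(x, n) {
  cbind(dn = x - n, mavg = x, up = x + n, pctB = 0.5)
}

res <- lapply(c(5, 22), function(w) fake_bands(1:10, n = w))

# Each element is a matrix: one object with a dim attribute,
# not a list of four separate columns.
is.matrix(res[[1]])   # TRUE
is.list(res[[1]])     # FALSE

# Converting to a data.frame gives a list of four column vectors,
# which is the shape that j can spread across four new columns.
as_df <- as.data.frame(res[[1]])
is.list(as_df)        # TRUE
length(as_df)         # 4
```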

We can make a small change to the OP's code: convert each matrix to a data.frame and cbind the results, keeping the data.table code similar and still grouped by 'symbol':

dt[, (new_cols) := do.call(cbind, lapply(window_sizes, function(w) 
           as.data.frame(BBands(Close, n = w)))), by = symbol] 
head(dt)
          Date Open High  Low Close  Volume symbol bbands_dn_5 bbands_mavg_5 bbands_up_5 bbands_pctB_5 bbands_dn_22 bbands_mavg_22 bbands_up_22 bbands_pctB_22
 1: 1985-01-02 3.18 3.18 3.08  3.08 1870906      b          NA            NA          NA            NA           NA             NA           NA             NA
 2: 1985-01-03 3.09 3.15 3.09  3.11 3099506      b          NA            NA          NA            NA           NA             NA           NA             NA
 3: 1985-01-04 3.11 3.12 3.08  3.09 2274157      b          NA            NA          NA            NA           NA             NA           NA             NA
 4: 1985-01-07 3.09 3.12 3.07  3.10 2086758      b          NA            NA          NA            NA           NA             NA           NA             NA
 5: 1985-01-08 3.10 3.12 3.08  3.11 2166348      b    3.074676         3.098    3.121324     0.7572479           NA             NA           NA             NA
 6: 1985-01-09 3.12 3.17 3.10  3.16 3441798      b    3.065668         3.114    3.162332     0.9758734           NA             NA           NA             NA

NOTE: If we just run lapply on the whole column, the result would be incorrect, because BBands would then be applied without any grouping.
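The effect of dropping the grouping can be seen on a toy table. This is a minimal sketch that uses data.table's frollmean as a simpler stand-in for BBands (an assumption made to keep the example small); the table and its values are made up.

```r
library(data.table)

# Toy data: two symbols with three rows each (values are invented).
toy <- data.table(
  symbol = rep(c("a", "b"), each = 3),
  Close  = c(1, 2, 3, 10, 20, 30)
)

# Rolling mean over the whole column: the window at the first "b" row
# still includes the last "a" row.
toy[, ma_all := frollmean(Close, 2)]

# Rolling mean per symbol: each group starts with its own NA warm-up.
toy[, ma_grp := frollmean(Close, 2), by = symbol]

print(toy)
```

In row 4 (the first "b" row) ma_all averages 3 and 10, mixing the two symbols, while ma_grp is NA because the window restarts inside the group. The same leakage would affect any rolling indicator computed on the ungrouped column.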

You can combine the list of results into one data frame using do.call, then cbind it to the original data. Note that this version applies BBands to the whole Close column, without grouping by symbol:

cbind(dt, setNames(do.call(cbind.data.frame, 
      lapply(window_sizes, function(w) BBands(dt$Close, n = w))), new_cols))
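To see how the do.call / setNames combination behaves, here is a small base R sketch with two made-up matrices standing in for the BBands results; the names m5, m22 and new_names are hypothetical and exist only for this illustration.

```r
# Two stand-in matrices sharing the same column names (values are invented).
m5  <- cbind(dn = 1:3, up = 4:6)
m22 <- cbind(dn = 7:9, up = 10:12)

# cbind.data.frame turns every matrix column into its own data.frame column.
wide <- do.call(cbind.data.frame, list(m5, m22))
ncol(wide)    # 4

# The column names coming from the matrices repeat, so setNames is used
# to replace them with unique ones, in the same left-to-right order.
new_names <- c("dn_5", "up_5", "dn_22", "up_22")
wide <- setNames(wide, new_names)
names(wide)
```

The order of new_names has to match the order in which the matrices are bound: all columns for the first window, then all columns for the second.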
