Passing column names to data.table programmatically
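I would like to write a function that runs regressions by group in a data.table and then organizes the results nicely. Here is a sample of what I would like to do: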
require(data.table)
dtb = data.table(y=1:10, x=10:1, z=sample(1:10), weights=1:10, thedate=1:2)
models = c("y ~ x", "y ~ z")
res = lapply(models, function(f) {dtb[,as.list(coef(lm(f, weights=weights, data=.SD))),by=thedate]})
#do more stuff with res
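I would like to wrap all of this into a function, because the "#do more stuff" part might be long. The problem I face is how to pass the various column names to data.table. For example, how do I pass the column name weights? How do I pass thedate? I envision a prototype that looks like this: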
myfun = function(dtb, models, weights, dates)
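Let me be clear: passing the formulas to my function is not the problem. If the weights I wanted to use and the column describing the date were always called weights and thedate, then my function could simply look like this: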
myfun = function(dtb, models) {
res = lapply(models, function(f) {dtb[,as.list(coef(lm(f, weights=weights, data=.SD))),by=thedate]})
#do more stuff with res
}
However, the column names corresponding to thedate and weights are not known in advance. I would like to pass them to my function like so:
#this will not work
myfun = function(dtb, models, w, d) {
res = lapply(models, function(f) {dtb[,as.list(coef(lm(f, weights=w, data=.SD))),by=d]})
#do more stuff with res
}
Thanks
Here is a solution that relies on having the data in long format (which makes more sense to me, in this case):
library(reshape2)
dtlong <- data.table(melt(dtb, measure.var = c('x','z')))
foo <- function(f, d, by, w) {
  # get the name of the w argument (weights)
  w.char <- deparse(substitute(w))
  # convert `list(a,b)` to `c('a','b')`
  # obviously, this would have to change depending on how `by` was defined
  by <- unlist(lapply(as.list(as.list(match.call())[['by']])[-1], as.character))
  # create the call, substituting the weights column name as required
  .c <- substitute(as.list(coef(lm(f, data = .SD, weights = w))), list(w = as.name(w.char)))
  # actually perform the calculations
  d[, eval(.c), by = by]
}
foo(f= y~value, d= dtlong, by = list(variable, thedate), w = weights)
   variable thedate (Intercept)       value
1:        x       1   11.000000 -1.00000000
2:        x       2   11.000000 -1.00000000
3:        z       1    1.009595  0.89019190
4:        z       2    7.538462 -0.03846154
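The step in foo that does the real work is the substitute() call, which swaps a placeholder for a column symbol before data.table ever sees the expression. The following is only a minimal base R sketch of that mechanism on its own; the string "weights" stands in for whatever column name is supplied, and nothing is evaluated against a table here.

```r
# Column name supplied as a string (illustrative input)
w.char <- "weights"

# The second argument of substitute() maps the placeholder `w`
# to the symbol `weights`; `f` and `.SD` are left untouched.
cl <- substitute(
  as.list(coef(lm(f, data = .SD, weights = w))),
  list(w = as.name(w.char))
)

print(cl)
# as.list(coef(lm(f, data = .SD, weights = weights)))
```

The result is an unevaluated call, which is why it can later be handed to data.table as eval(.c) inside j.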
One possible solution:
fun = function(dtb, models, w_col_name, date_name) {
res = lapply(models, function(f) {dtb[,as.list(coef(lm(f, weights=eval(parse(text=w_col_name)), data=.SD))),by=eval(parse(text=paste0("list(",date_name,")")))]})
}
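For comparison, here is a sketch of the same idea without eval(parse(...)), assuming the column names arrive as character strings. The function name myfun comes from the question; the argument names w and d, the variable j_expr, and the sample columns wt and grp are illustrative choices, not part of any answer above. It relies on two behaviours: substitute() with as.name() to splice the weights column into j, and data.table accepting a character vector held in a variable for by (which assumes no column of the table is itself named d).

```r
library(data.table)

# Sketch only: `w` and `d` are column names passed as strings.
myfun <- function(dtb, models, w, d) {
  lapply(models, function(f) {
    f <- as.formula(f)  # character -> formula
    # Build the j expression with the weights column spliced in as a symbol
    j_expr <- substitute(
      as.list(coef(lm(f, weights = W, data = .SD))),
      list(W = as.name(w))
    )
    # `d` is not a column of dtb, so data.table treats it as a
    # variable holding the grouping column name(s)
    dtb[, eval(j_expr), by = d]
  })
}

set.seed(1)
dtb <- data.table(y = 1:10, x = 10:1, z = sample(1:10), wt = 1:10, grp = 1:2)
res <- myfun(dtb, c("y ~ x", "y ~ z"), w = "wt", d = "grp")
```

Each element of res should then hold one row per group, with the grouping column followed by the fitted coefficients.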
Couldn't you just add (inside that anonymous function call):
f <- as.formula(f)
... as a separate line before the dtb[,as.list(coef(lm(f, ...) call? That is the usual way to convert a character element into a formula object.
> res = lapply(models, function(f) {f <- as.formula(f)
dtb[,as.list(coef(lm(f, weights=weights, data=.SD))),by=thedate]})
>
> str(res)
List of 2
 $ :Classes ‘data.table’ and 'data.frame': 2 obs. of 3 variables:
  ..$ thedate    : int [1:2] 1 2
  ..$ (Intercept): num [1:2] 11 11
  ..$ x          : num [1:2] -1 -1
  ..- attr(*, ".internal.selfref")=<externalptr>
 $ :Classes ‘data.table’ and 'data.frame': 2 obs. of 3 variables:
  ..$ thedate    : int [1:2] 1 2
  ..$ (Intercept): num [1:2] 6.27 11.7
  ..$ z          : num [1:2] 0.0633 -0.7995
  ..- attr(*, ".internal.selfref")=<externalptr>
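If you need to build character versions of the formulas from component names, just use paste or paste0 and pass the result in as the models character vector. Tested code can be provided on receipt of a testable example.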
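As a small illustration of the paste0 suggestion, the sketch below builds the models vector from component names. The variable names response and predictors are illustrative; the column names are the ones from the question's sample data.

```r
# Component names (illustrative; taken from the question's sample columns)
response   <- "y"
predictors <- c("x", "z")

# paste0 is vectorised, so this gives one formula string per predictor
models <- paste0(response, " ~ ", predictors)
print(models)
# [1] "y ~ x" "y ~ z"

# Convert to formula objects if needed
formulas <- lapply(models, as.formula)
```

The resulting models vector has the same form as the one defined by hand in the question.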