Using lapply over list of data.tables to assign list member name as variable

I have a list of data.tables:

library(data.table)

set.seed(27)
test <- list()
test$a <- data.table(x = rnorm(n = 10),
                     y = rnorm(n = 10))
test$b <- data.table(x = rnorm(n = 10),
                     y = rnorm(n = 10))

Each member of the list has a unique name:

test

In preparation for appending these tables into a single 'long' format table, I want to add a third column holding the name of the list member each table came from. I want to do this with a function, since I will need to do it regularly.

Current code, which runs but gives the wrong result:

lName.asVariable <- function(dataTableList) {
dataTableList <- lapply(X = dataTableList, FUN = function(x)(x[, Site :=names(dataTableList)]))
}

test <- lName.asVariable(test)
test

Which outputs:

$a
               x           y Site
 1:  1.907162564 -1.28512736    a
 2:  1.144876890  0.03482725    b
 3: -0.764530737  1.57029534    a
 4: -1.457432503  0.15801005    b
...
$b
              x          y Site
 1: -0.57488122 -0.1520452    a
 2: -1.15190000 -0.9589459    b
 3:  0.08706853  1.8582198    a
 4: -0.07018075 -1.5747647    b
...

though what I want is:

$a
               x           y Site
 1:  1.907162564 -1.28512736    a
 2:  1.144876890  0.03482725    a
 3: -0.764530737  1.57029534    a
 4: -1.457432503  0.15801005    a
...
$b
              x          y Site
 1: -0.57488122 -0.1520452    b
 2: -1.15190000 -0.9589459    b
 3:  0.08706853  1.8582198    b
 4: -0.07018075 -1.5747647    b
...

After reading "extract names of objects from list", I think seq_along might be what I need, though the following code produces an error:

lName.asVariable <- function(dataTableList) {
    dataTableList <- lapply(X = seq_along(dataTableList), FUN = function(x)(x[, Site := names(dataTableList)]))
}

test <- lName.asVariable(test)
test

I'm not sharp enough, though, to work out how to use seq_along to refer to the data.table correctly. Is this even the right tactic?

seq_along produces a sequence of numbers from 1 to the length of your list. You can then use an intermediate indexing variable to refer to both the list item and its name:

lapply(seq_along(test), function(i) test[[i]][,Site:=names(test[i])])
[[1]]
               x           y Site
 1:  1.907162564 -1.28512736    a
 2:  1.144876890  0.03482725    a
 3: -0.764530737  1.57029534    a
 4: -1.457432503  0.15801005    a
 5: -1.093468881 -0.74579947    a
 6:  0.295241218 -1.06880297    a
 7:  0.006885942 -1.62743793    a
 8:  1.157410886 -1.06858164    a
 9:  2.134637891 -0.02583971    a
10:  0.237844613  0.31957639    a

[[2]]
              x          y Site
 1: -0.57488122 -0.1520452    b
 2: -1.15190000 -0.9589459    b
 3:  0.08706853  1.8582198    b
 4: -0.07018075 -1.5747647    b
 5: -2.99830401 -0.3981480    b
 6: -1.22399491  0.9686850    b
 7: -0.99707477  0.6711891    b
 8:  0.33571390  0.6788910    b
 9:  1.29534374 -0.1739613    b
10:  0.32775994  0.7890292    b

Note that the output of lapply loses the list names here, because it iterates over an unnamed integer sequence, so you would have to reinstate them manually.
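As a minimal sketch of one way to keep the names: the original attempt assigned names(dataTableList), which is the whole name vector, so it was recycled down the rows (a, b, a, b, ...). Passing each table together with its own name avoids that, and Map() keeps the names of its first argument. The helper name add_site and the small example tables are assumptions made for illustration, not part of the original post. copy() is used here so the input tables are left untouched.

```r
library(data.table)

# Small stand-in for the list in the question (illustrative data only)
tables <- list(
  a = data.table(x = 1:3, y = 4:6),
  b = data.table(x = 7:9, y = 10:12)
)

# Hypothetical helper: tag each table with the name of its list slot.
# Map() walks the tables and their names in parallel and returns a list
# that carries the names of its first argument.
add_site <- function(dt_list) {
  Map(function(dt, nm) copy(dt)[, Site := nm], dt_list, names(dt_list))
}

tagged <- add_site(tables)
names(tagged)
```

If the lapply(seq_along(...)) form is preferred, wrapping its result in setNames(result, names(tables)) should restore the names in the same way.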

If your end goal is to combine them into a single data.table, then in the latest version (1.9.5+) you can do it all in one step:

rbindlist(test, idcol = 'Site')
#    Site            x           y
# 1:    a  1.907162564 -1.28512736
# 2:    a  1.144876890  0.03482725
# 3:    a -0.764530737  1.57029534
# 4:    a -1.457432503  0.15801005
# 5:    a -1.093468881 -0.74579947
# 6:    a  0.295241218 -1.06880297
# 7:    a  0.006885942 -1.62743793
# 8:    a  1.157410886 -1.06858164
# 9:    a  2.134637891 -0.02583971
#10:    a  0.237844613  0.31957639
#11:    b -0.574881218 -0.15204521
#12:    b -1.151900001 -0.95894585
#13:    b  0.087068535  1.85821984
#14:    b -0.070180754 -1.57476470
#15:    b -2.998304014 -0.39814797
#16:    b -1.223994910  0.96868503
#17:    b -0.997074773  0.67118912
#18:    b  0.335713896  0.67889104
#19:    b  1.295343743 -0.17396132
#20:    b  0.327759944  0.78902925
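A small sketch of how idcol depends on the list names, using made-up tables rather than the data above. With a named list the id column takes the names; with an unnamed list (for example, the result of lapply over seq_along) it falls back to the integer position, which is one more reason to keep the names intact.

```r
library(data.table)

# Illustrative tables, not the data from the question
tables <- list(
  a = data.table(x = 1:2, y = 3:4),
  b = data.table(x = 5:6, y = 7:8)
)

# Named list: Site is filled from the list names
long <- rbindlist(tables, idcol = "Site")

# Unnamed list: Site is filled with the position of each table in the list
unnamed <- rbindlist(unname(tables), idcol = "Site")
```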

I don't know whether this approach works for you, but if you just want the result, I believe the following is an easy way:

library(data.table)

set.seed(27)
test <- list()
test$a <- data.table(x = rnorm(n = 10),
                     y = rnorm(n = 10))
test$b <- data.table(x = rnorm(n = 10),
                     y = rnorm(n = 10))
test
test$a$Site <- "a"
test$b$Site <- "b"
test

$a
               x           y Site
 1:  1.907162564 -1.28512736    a
 2:  1.144876890  0.03482725    a
 3: -0.764530737  1.57029534    a
 4: -1.457432503  0.15801005    a
 5: -1.093468881 -0.74579947    a
 6:  0.295241218 -1.06880297    a
 7:  0.006885942 -1.62743793    a
 8:  1.157410886 -1.06858164    a
 9:  2.134637891 -0.02583971    a
10:  0.237844613  0.31957639    a

$b
              x          y Site
 1: -0.57488122 -0.1520452    b
 2: -1.15190000 -0.9589459    b
 3:  0.08706853  1.8582198    b
 4: -0.07018075 -1.5747647    b
 5: -2.99830401 -0.3981480    b
 6: -1.22399491  0.9686850    b
 7: -0.99707477  0.6711891    b
 8:  0.33571390  0.6788910    b
 9:  1.29534374 -0.1739613    b
10:  0.32775994  0.7890292    b
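The manual assignments above could be generalised with a loop over the list names, so that nothing has to be typed per table. This is only a sketch on made-up tables; the variable names are placeholders.

```r
library(data.table)

# Illustrative tables, not the data from the question
tables <- list(
  a = data.table(x = 1:2),
  b = data.table(x = 3:4)
)

# Same idea as test$a$Site <- "a", repeated for every name in the list
for (nm in names(tables)) {
  tables[[nm]]$Site <- nm
}
```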
