R: Combine list of data frames into single data frame, add column with list index
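This question is very similar to this one, which combines a list of data frames into a single, longer data frame. However, I want to keep track of which list item each row came from by adding an extra column that holds the list index (an id or source).
Here is the data (code borrowed from the linked example):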
dfList <- list()
set.seed(1)
for (i in 1:3) {
  # each element: 5 random letters plus two columns of normal draws
  dfList[[i]] <- data.frame(a = sample(letters, 5, replace = TRUE), b = rnorm(5), c = rnorm(5))
}
The code below concatenates the data frames, but it does not add a column for the list index:
df <- do.call("rbind", dfList)
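How can I concatenate the data frames in the list while creating a column that records which list element each row came from? Something like the following:
Thanks very much in advance.
Try data.table::rbindlist: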
library(data.table) # v1.9.5+
rbindlist(dfList, idcol = "index")
# index a b c
# 1: 1 g 1.27242932 -0.005767173
# 2: 1 j 0.41464143 2.404653389
# 3: 1 o -1.53995004 0.763593461
# 4: 1 x -0.92856703 -0.799009249
# 5: 1 f -0.29472045 -1.147657009
# 6: 2 k -0.04493361 0.918977372
# 7: 2 a -0.01619026 0.782136301
# 8: 2 j 0.94383621 0.074564983
# 9: 2 w 0.82122120 -1.989351696
# 10: 2 i 0.59390132 0.619825748
# 11: 3 m -1.28459935 -0.649471647
# 12: 3 w 0.04672617 0.726750747
# 13: 3 l -0.23570656 1.151911754
# 14: 3 g -0.54288826 0.992160365
# 15: 3 b -0.43331032 -0.429513109
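If the list has names, rbindlist should use those names for the id column instead of integer positions. A minimal sketch, assuming a small made-up list (the names "first" and "second" and the column name "source" are illustrative, not from the question):

```r
library(data.table)

# Hypothetical named list; the element names are illustrative only
named <- list(
  first  = data.frame(a = c("x", "y"), b = c(1, 2)),
  second = data.frame(a = "z", b = 3)
)

# With a named list, idcol is filled from the names rather than positions
res <- rbindlist(named, idcol = "source")
res$source
```

This can be handy when each data frame comes from a file or a group whose label is more meaningful than its position.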
You can do this in base R:
df[["index"]] <- rep(seq_along(dfList), sapply(dfList, nrow))
df
## a b c index
## 1 g 1.27242932 -0.005767173 1
## 2 j 0.41464143 2.404653389 1
## 3 o -1.53995004 0.763593461 1
## 4 x -0.92856703 -0.799009249 1
## 5 f -0.29472045 -1.147657009 1
## 6 k -0.04493361 0.918977372 2
## 7 a -0.01619026 0.782136301 2
## 8 j 0.94383621 0.074564983 2
## 9 w 0.82122120 -1.989351696 2
## 10 i 0.59390132 0.619825748 2
## 11 m -1.28459935 -0.649471647 3
## 12 w 0.04672617 0.726750747 3
## 13 l -0.23570656 1.151911754 3
## 14 g -0.54288826 0.992160365 3
## 15 b -0.43331032 -0.429513109 3
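Another base R option is to tag each data frame with its position before binding, so the index does not depend on an already-built df. A rough sketch on small made-up data (the stand-in dfList here is hypothetical, not the one from the question):

```r
# Small deterministic stand-in for dfList (hypothetical example data)
dfList <- list(
  data.frame(a = c("x", "y"), b = c(1, 2)),
  data.frame(a = "z", b = 3),
  data.frame(a = c("u", "v", "w"), b = c(4, 5, 6))
)

# Attach the list position to each data frame, then bind everything once
tagged <- Map(function(d, i) cbind(d, index = i), dfList, seq_along(dfList))
df <- do.call(rbind, tagged)
rownames(df) <- NULL

df$index
```

Because the index is added per element, it stays correct even if the data frames are later filtered or reordered before the rbind step.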
You could also do this:
library(qdapTools)
list_df2df(setNames(dfList, 1:3), "index")
## index a b c
## 1 1 g 1.27242932 -0.005767173
## 2 1 j 0.41464143 2.404653389
## 3 1 o -1.53995004 0.763593461
## 4 1 x -0.92856703 -0.799009249
## 5 1 f -0.29472045 -1.147657009
## 6 2 k -0.04493361 0.918977372
## 7 2 a -0.01619026 0.782136301
## 8 2 j 0.94383621 0.074564983
## 9 2 w 0.82122120 -1.989351696
## 10 2 i 0.59390132 0.619825748
## 11 3 m -1.28459935 -0.649471647
## 12 3 w 0.04672617 0.726750747
## 13 3 l -0.23570656 1.151911754
## 14 3 g -0.54288826 0.992160365
## 15 3 b -0.43331032 -0.429513109
Here is a dplyr solution that does exactly what you asked for:
dfList <- list()
set.seed(1)
for (i in 1:3) {
  # each element: 5 random letters plus two columns of normal draws
  dfList[[i]] <- data.frame(a = sample(letters, 5, replace = TRUE), b = rnorm(5), c = rnorm(5))
}
df <- dplyr::bind_rows(dfList, .id = "index")
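One thing to watch: as far as I know, bind_rows stores the .id column as character, so "10" would sort before "2". If you want a numeric index, you can convert it afterwards. A small sketch on made-up data (the stand-in dfList is hypothetical):

```r
library(dplyr)

# Hypothetical stand-in data, not the dfList from the question
dfList <- list(
  data.frame(a = c("x", "y"), b = c(1, 2)),
  data.frame(a = "z", b = 3)
)

df <- bind_rows(dfList, .id = "index")

# Convert the id column to integer so it sorts and compares numerically
df$index <- as.integer(df$index)
df$index
```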