Adding a new column to each element in a list of tables or data frames
I have a list of files. I also have a list of "names" which I extract with substr() from the actual filenames of these files. I would like to add a new column to each of the files in the list. This column will contain the corresponding element of "names", repeated once for each row in the file.
For example:
df1 <- data.frame(x = 1:3, y=letters[1:3])
df2 <- data.frame(x = 4:6, y=letters[4:6])
filelist <- list(df1,df2)
ID <- c("1A","IB")
Pseudocode:
for( i in length(filelist)){
filelist[i]$SampleID <- rep(ID[i],nrow(filelist[i])
}
// basically create a new column in each of the data frames in filelist, and fill the column with repeated corresponding values of ID
My output should look like this.
filelist[1] should be:
x y SampleID
1 1 a 1A
2 2 b 1A
3 3 c 1A
filelist[2] should be:
x y SampleID
1 4 d IB
2 5 e IB
3 6 f IB
and so on.
Any idea how this could be done?
An alternative solution is to use cbind, taking advantage of the fact that R will recycle the values of a shorter vector.
For example:
x <- df2 # from above
cbind(x, NewColumn="Singleton")
# x y NewColumn
# 1 4 d Singleton
# 2 5 e Singleton
# 3 6 f Singleton
There is no need to use rep. R does that for you.
Therefore, you could put cbind(filelist[[i]], ID[[i]]) in your for loop or, as @Sven pointed out, you can use the cleaner mapply:
filelist <- mapply(cbind, filelist, "SampleID"=ID, SIMPLIFY=FALSE)
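As a rough end-to-end sketch of the recycling idea described above, the following reuses df1, df2 and ID from the question. The name tagged is only illustrative and does not appear in the original post.

```r
# Data from the question
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)
ID <- c("1A", "IB")

# cbind() recycles the length-one ID value down every row of each data frame;
# SIMPLIFY = FALSE keeps the result as a list of data frames.
tagged <- mapply(cbind, filelist, SampleID = ID, SIMPLIFY = FALSE)

# Each element keeps its rows and gains one constant column.
str(tagged[[1]])
```

Assigning to a new name rather than overwriting filelist makes it easy to compare the result with the input while experimenting.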
This is a corrected version of your loop:
for( i in seq_along(filelist)){
filelist[[i]]$SampleID <- rep(ID[i],nrow(filelist[[i]]))
}
There were 3 problems:
1. A ) was missing after the command in the body.
2. Elements of lists are accessed by [[, not by [. [ returns a list of length one. [[ returns the element only.
3. length(filelist) is just one value, so the loop runs for the last element of the list only. I replaced it with seq_along(filelist).
A more efficient approach is to use mapply for the task:
mapply(function(x, y) "[<-"(x, "SampleID", value = y) ,
filelist, ID, SIMPLIFY = FALSE)
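Going back to the loop problems listed above, they can be checked directly in the console. This is only a small sketch using the question's data; the values in the comments are what base R returns for these calls.

```r
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)

# length() gives a single number, so a loop over it visits only that index.
length(filelist)      # 2
seq_along(filelist)   # 1 2

# [ keeps the list wrapper; [[ extracts the data frame itself.
class(filelist[1])    # "list"
class(filelist[[1]])  # "data.frame"

# nrow() therefore only gives a row count for the [[ ]] result.
nrow(filelist[1])     # NULL
nrow(filelist[[1]])   # 3
```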
The purrr way, using map2:
library(dplyr)
library(purrr)
map2(filelist, ID, ~cbind(.x, SampleID = .y))
#[[1]]
# x y SampleID
#1 1 a 1A
#2 2 b 1A
#3 3 c 1A
#[[2]]
# x y SampleID
#1 4 d IB
#2 5 e IB
#3 6 f IB
Or you can also use:
map2(filelist, ID, ~.x %>% mutate(SampleID = .y))
If you name the list, we can use imap and add the new column based on its name.
names(filelist) <- c("1A","IB")
imap(filelist, ~cbind(.x, SampleID = .y))
#OR
#imap(filelist, ~.x %>% mutate(SampleID = .y))
which is similar to using Map:
Map(cbind, filelist, SampleID = names(filelist))
This one worked for me:
Create a new column for every data frame in a list; fill the values of the new column based on an existing column (in your case, the IDs).
Example:
# Create dummy data
df1<-data.frame(a = c(1,2,3))
df2<-data.frame(a = c(5,6,7))
# Create a list
l<-list(df1, df2)
> l
[[1]]
a
1 1
2 2
3 3
[[2]]
a
1 5
2 6
3 7
# add new column 'b'
# create 'b' values based on column 'a'
l2<-lapply(l, function(x)
cbind(x, b = x$a*4))
Results in:
> l2
[[1]]
a b
1 1 4
2 2 8
3 3 12
[[2]]
a b
1 5 20
2 6 24
3 7 28
In your case, something like:
filelist<-lapply(seq_along(filelist), function(i)
cbind(filelist[[i]], SampleID = ID[i]))
A tricky way:
library(plyr)
names(filelist) <- ID
result <- ldply(filelist, data.frame)
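For readers without plyr, here is a rough base-R sketch of a similar result: each data frame is tagged with its list name and the pieces are then stacked into one data frame. As far as I know, ldply stores the list names in a column called .id; the column name SampleID and the variable tagged below are my own choices, not part of the answer above.

```r
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)
ID <- c("1A", "IB")
names(filelist) <- ID

# Tag each data frame with its list name (recycled down the rows).
tagged <- Map(function(d, id) cbind(SampleID = id, d),
              filelist, names(filelist))

# Stack the tagged data frames and drop the generated row names.
result <- do.call(rbind, tagged)
rownames(result) <- NULL
result
```

Unlike the other answers, this returns one combined data frame rather than a list of data frames.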
data_lst <- list(
data_1 = data.frame(c1 = 1:3, c2 = 3:1),
data_2 = data.frame(c1 = 1:3, c2 = 3:1)
)
f <- function (data, name){
data$name <- name
data
}
Map(f, data_lst , names(data_lst))
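Note that data$name <- name creates a column that is literally called name. If the column name should be configurable, a variant along these lines may help. It is only a sketch: add_label and its col argument are hypothetical names introduced here, not part of the answer above.

```r
data_lst <- list(
  data_1 = data.frame(c1 = 1:3, c2 = 3:1),
  data_2 = data.frame(c1 = 1:3, c2 = 3:1)
)

# add_label() is a hypothetical helper; col is the name of the new column.
add_label <- function(data, label, col = "SampleID") {
  data[[col]] <- label  # a length-one value is recycled to every row
  data
}

# Map() pairs each data frame with its list name and keeps the list names.
out <- Map(add_label, data_lst, names(data_lst))
out$data_2
```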