How to combine multiple vectors into a data frame using the names of the elements of each vector (in R)
I would like to combine multiple vectors into a data frame, using the names of each vector's elements to guide the combination.
For instance, if I have vectors x1, x2, and x3:
sample(1:50,20)->x1; sample(1:50,20)->x2; sample(1:50,20)->x3
and each vector has names such as these:
nam <- paste("A",1:50, sep=""); names(x1)<-as.character(sample(nam,20)); names(x2)<-as.character(sample(nam,20)); names(x3)<-as.character(sample(nam,20))
I would like to generate a data frame in which the first column contains every name used in at least one vector, and each remaining column contains the values from one vector, with NA wherever that vector has no value for a particular name. Something like this:
A1 3 NA NA
A2 NA 4 5
A3 NA 3 NA
A4 NA 22 NA
....
That would mean that the name A1 is associated with a value (which is 3) only in x1, but not in x2 or x3. A2 is associated with a value in x2 and x3, but not in x1. And so on.
Any idea how to do this?
Thank you very much.
I came up with something like this:
sort(unique(names(c(x1,x2,x3))))->nam2
cbind(nam2,x1[match(nam2,names(x1))],x2[match(nam2,names(x2))],x3[match(nam2,names(x3))])
I would like to do this for more than 500 vectors stored in a list. Any idea how to put this into an lapply or something similar?
Thanks again.
Consider a chain merge after creating a list of data frames:
set.seed(61718) # PLACED AT VERY TOP FOR REPRODUCIBILITY
...
# USES ANY OBJECT WHOSE NAME IS "x" FOLLOWED BY DIGITS (HERE BEING c("x1", "x2", "x3"))
df_list <- lapply(ls(pattern="^x[0-9]+$"), function(d)
# CONVERTS VECTOR INTO DATAFRAME AND RENAMES COLUMNS
setNames(transform(data.frame(get(d)), letter=names(get(d))), c(d, "letter"))
)
# CHAIN MERGE
master_df <- Reduce(function(x,y) merge(x, y, by="letter", all=TRUE), df_list)
head(master_df, 10)
# letter x1 x2 x3
# 1 A11 50 12 5
# 2 A12 34 8 1
# 3 A13 3 31 NA
# 4 A14 42 7 NA
# 5 A17 27 44 41
# 6 A2 14 NA 46
# 7 A24 2 NA NA
# 8 A26 29 1 34
# 9 A30 23 4 38
# 10 A31 1 25 12
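If the vectors already sit in a list, as in the question's 500-vector case, the same chain merge can probably be written without ls() and get(). The following is only a minimal sketch under assumptions: vec_list and its toy values are hypothetical example data (borrowing the A1 to A4 rows from the question), not the sampled vectors above.

```r
# Hypothetical example data: a named list of named vectors
vec_list <- list(
  x1 = c(A1 = 3, A5 = 7),
  x2 = c(A2 = 4, A3 = 3, A4 = 22),
  x3 = c(A2 = 5, A5 = 1)
)

# Turn each vector into a two-column data frame:
# the element names plus one value column named after the vector
df_list <- lapply(names(vec_list), function(n) {
  setNames(
    data.frame(names(vec_list[[n]]), unname(vec_list[[n]]),
               stringsAsFactors = FALSE),
    c("letter", n)
  )
})

# Chain merge: full outer join on "letter" across all data frames
master_df <- Reduce(function(x, y) merge(x, y, by = "letter", all = TRUE),
                    df_list)
master_df
#   letter x1 x2 x3
# 1     A1  3 NA NA
# 2     A2 NA  4  5
# 3     A3 NA  3 NA
# 4     A4 NA 22 NA
# 5     A5  7 NA  1
```

Because the column names come from names(vec_list), the list should be named; an unnamed list would need names assigned first.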
Alternatively, if Reduce (being iterative) runs too slowly, consider building the same data frame list but having each data frame merge with an all_name_df, then cbind all the results together:
all_name_df <- data.frame(letter=sort(nam)) # SORTED SO ROWS LINE UP WITH merge() OUTPUT BELOW
df_list <- lapply(c("x1", "x2", "x3"), function(d) {
df <- setNames(transform(data.frame(get(d)), letter=names(get(d))), c(d, "letter"))
merge(all_name_df, df, all.x=TRUE)[-1] # merge() RETURNS ROWS SORTED BY letter; -1 REMOVES letter COLUMN
})
master_df <- cbind(all_name_df, do.call(cbind, df_list))
head(master_df, 10)
# ROWS FOLLOW THE SORT ORDER OF letter (A1, A10, A11, ...)
# letter x1 x2 x3
# 1 A1 NA NA NA
# 2 A10 NA 32 19
# 3 A11 50 12 5
# 4 A12 34 8 1
# 5 A13 3 31 NA
# 6 A14 42 7 NA
# 7 A15 NA NA NA
# 8 A16 NA 40 NA
# 9 A17 27 44 41
# 10 A18 NA NA NA
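When the vectors are stored in a list, merging may not be needed at all: indexing a named vector with a character vector of names returns NA for every name the vector lacks. A minimal sketch, assuming hypothetical toy data (vec_list) and names that are unique within each vector:

```r
# Hypothetical example data: a named list of named vectors
vec_list <- list(
  x1 = c(A1 = 3, A5 = 7),
  x2 = c(A2 = 4, A3 = 3, A4 = 22),
  x3 = c(A2 = 5, A5 = 1)
)

# Every name that occurs in at least one vector
all_names <- sort(unique(unlist(lapply(vec_list, names))))

# Look each vector up by name; names it lacks come back as NA.
# vapply() returns one column per vector, one row per name.
value_mat <- vapply(
  vec_list,
  function(v) unname(v[all_names]),
  numeric(length(all_names))
)

master_df <- data.frame(letter = all_names, value_mat,
                        stringsAsFactors = FALSE)
master_df
```

This avoids one merge per vector, which may matter with 500 or more vectors. If a vector contains duplicate names, name indexing keeps only the first match, so that case would need separate handling.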