How to subset a list containing several lists in R, based on the column name, and merge into a single list/dataframe?
I have several lists whose elements share the same names. I am trying to write a function that extracts one element by name from every list, merges the results into one data frame, and assigns new column names. For example:
list1 <- list(a = c(1:6),
              b = c(4:9),
              c = c(3:8))
list2 <- list(a = c(12:17),
              b = c(10:15),
              c = c(11:16))
list3 <- list(a = c(2:7),
              b = c(14:19),
              c = c(9:14))
all <- list(list1, list2, list3)
new_column_names <- c("Block1", "Block2", "Block3")
I would like to extract element "a" from every list and merge the results into a single data frame, with new_column_names as the column names. Any suggestions? Thanks!
You cannot turn them into a regular data.frame when the a vectors have different lengths, because a data.frame requires all columns to have equal length. (The output below comes from input in which list1$a has only five elements; with a = 1:6 as posted above, all three vectors have six.)
Instead, you can turn them into a long-format data.frame using:
stack(setNames(lapply(all, `[[`, "a"), new_column_names))
# values ind
# 1 1 Block1
# 2 2 Block1
# 3 3 Block1
# 4 4 Block1
# 5 5 Block1
# 6 12 Block2
# 7 13 Block2
# 8 14 Block2
# 9 15 Block2
# 10 16 Block2
# 11 17 Block2
# 12 2 Block3
# 13 3 Block3
# 14 4 Block3
# 15 5 Block3
# 16 6 Block3
# 17 7 Block3
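To confirm whether the extracted vectors actually differ in length before choosing between long and wide format, lengths() gives a quick check. A minimal sketch; the sample data and the names lst, a_lengths and all_equal are hypothetical, with the first a shortened to five elements so that it matches the output above:

```r
# Hypothetical sample data: the first "a" is one element shorter than the others
lst <- list(
  list(a = 1:5,   b = 4:9),
  list(a = 12:17, b = 10:15),
  list(a = 2:7,   b = 14:19)
)

# Length of the "a" element in each list
a_lengths <- lengths(lapply(lst, `[[`, "a"))
a_lengths

# TRUE only if every "a" has the same length (needed for a wide data.frame)
all_equal <- length(unique(a_lengths)) == 1
all_equal
```

If all_equal is FALSE, the long format from stack() works as is, while a wide data.frame needs padding first.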
You can try:
library(tidyverse)
all %>%
  flatten() %>%
  keep(names(.) == "a") %>%
  set_names(new_column_names) %>%
  map(~tibble(a = .x, n = seq_along(.x))) %>%
  bind_rows(.id = "ind")
# A tibble: 17 x 3
ind a n
<chr> <int> <int>
1 Block1 1 1
2 Block1 2 2
3 Block1 3 3
4 Block1 4 4
5 Block1 5 5
6 Block2 12 1
7 Block2 13 2
8 Block2 14 3
9 Block2 15 4
10 Block2 16 5
11 Block2 17 6
12 Block3 2 1
13 Block3 3 2
14 Block3 4 3
15 Block3 5 4
16 Block3 6 5
17 Block3 7 6
You can then use spread(), for instance, to get a wide data.frame:
.Last.value %>%
spread(ind, a)
# A tibble: 6 x 4
n Block1 Block2 Block3
<int> <int> <int> <int>
1 1 1 12 2
2 2 2 13 3
3 3 3 14 4
4 4 4 15 5
5 5 5 16 6
6 6 NA 17 7
If you want a data.frame, you will first have to make the a vectors all the same length, then cbind the results.
res <- lapply(all, `[[`, "a")
n <- max(sapply(res, length))
res <- lapply(res, function(x) if(length(x) < n) c(x, rep(NA, n - length(x))) else x)
res <- do.call(cbind, res)
res <- as.data.frame(res)
res <- setNames(res, new_column_names)
res
# Block1 Block2 Block3
#1 1 12 2
#2 2 13 3
#3 3 14 4
#4 4 15 5
#5 5 16 6
#6 NA 17 7
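The padding step can also be written more compactly, because indexing a vector past its end returns NA. A minimal sketch, assuming hypothetical input vectors of unequal length (a_list, padded and res2 are illustrative names, not part of the answer above):

```r
# Hypothetical input: "a" vectors of unequal length
a_list <- list(1:5, 12:17, 2:7)
new_column_names <- c("Block1", "Block2", "Block3")

# Longest vector determines the number of rows
n <- max(lengths(a_list))

# x[seq_len(n)] yields NA for positions past the end of x,
# so shorter vectors are padded with NA up to length n
padded <- lapply(a_list, `[`, seq_len(n))

res2 <- as.data.frame(setNames(padded, new_column_names))
res2
```

This avoids the explicit if/else and rep(NA, ...) at the cost of being a little less obvious to readers unfamiliar with the idiom.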
Here is a modified option using tidyverse:
library(tidyverse)
map(all, ~ as_tibble(.x) %>%
      select(a)) %>%
  set_names(new_column_names) %>%
  bind_rows(.id = 'ind')
# A tibble: 18 x 2
# ind a
# <chr> <int>
# 1 Block1 1
# 2 Block1 2
# 3 Block1 3
# 4 Block1 4
# 5 Block1 5
# 6 Block1 6
# 7 Block2 12
# 8 Block2 13
# 9 Block2 14
#10 Block2 15
#11 Block2 16
#12 Block2 17
#13 Block3 2
#14 Block3 3
#15 Block3 4
#16 Block3 5
#17 Block3 6
#18 Block3 7
Or using map2_df:
map2_df(all, new_column_names,
        ~ as_tibble(.x) %>%
          mutate(ind = .y) %>%
          select(ind, a))
In base R:
as.data.frame(setNames(lapply(all,`[[`,"a"),new_column_names))
# Block1 Block2 Block3
# 1 1 12 2
# 2 2 13 3
# 3 3 14 4
# 4 4 15 5
# 5 5 16 6
# 6 6 17 7
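Since the question asks for a function, the base R one-liner can be wrapped so that the element name is a parameter. A minimal sketch: extract_as_df is a hypothetical name, and it assumes the extracted vectors all have the same length (as.data.frame() raises an error otherwise).

```r
# Hypothetical helper: pull one named element out of every list and
# return the results as the columns of a data.frame.
extract_as_df <- function(lists, name, col_names) {
  vals <- lapply(lists, `[[`, name)
  # One column name is needed per extracted vector
  stopifnot(length(vals) == length(col_names))
  as.data.frame(setNames(vals, col_names))
}

# Sample data from the question
list1 <- list(a = 1:6,   b = 4:9,   c = 3:8)
list2 <- list(a = 12:17, b = 10:15, c = 11:16)
list3 <- list(a = 2:7,   b = 14:19, c = 9:14)

out_a <- extract_as_df(list(list1, list2, list3), "a",
                       c("Block1", "Block2", "Block3"))
out_c <- extract_as_df(list(list1, list2, list3), "c",
                       c("Block1", "Block2", "Block3"))
out_a
```

Passing "b" or "c" instead of "a" returns the corresponding element from each list under the same column names.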