How to subset a list containing several lists in R, based on the column name, and merge into a single list/dataframe?

I have several lists with the same column names. I am trying to write a function that subsets all the lists by column name, merges the results into one dataframe, and adds new column names. For example:

list1<- list(a= c(1:6),
      b= c(4:9),
      c= c(3:8))
list2<- list(a= c(12:17),
      b= c(10:15),
      c= c(11:16))
list3<- list(a= c(2:7),
      b= c(14:19),
      c= c(9:14))
all<- list (list1, list2, list3)
new_column_names<- c("Block1", "Block2", "Block3")

I would like to extract element "a" from every list and merge the results into a single dataframe with new_column_names as the column names. Any suggestions? Thanks!

You cannot turn them into a regular data.frame if the a vectors have different lengths (as in the output below, where Block1 has only five values), because a data.frame requires all columns to have equal length.

Instead, you can turn them into a long-format data.frame using:

stack(setNames(lapply(all, `[[`, "a"), new_column_names))
#    values    ind
# 1       1 Block1
# 2       2 Block1
# 3       3 Block1
# 4       4 Block1
# 5       5 Block1
# 6      12 Block2
# 7      13 Block2
# 8      14 Block2
# 9      15 Block2
# 10     16 Block2
# 11     17 Block2
# 12      2 Block3
# 13      3 Block3
# 14      4 Block3
# 15      5 Block3
# 16      6 Block3
# 17      7 Block3

You can try:

library(tidyverse)
all %>% 
  flatten() %>%
  keep(names(.) =="a") %>%
  set_names(new_column_names) %>% 
  map(~tibble(a=.x, n=seq_along(.x))) %>% 
  bind_rows(.id = "ind") 
# A tibble: 17 x 3
#    ind        a     n
#    <chr>  <int> <int>
#  1 Block1     1     1
#  2 Block1     2     2
#  3 Block1     3     3
#  4 Block1     4     4
#  5 Block1     5     5
#  6 Block2    12     1
#  7 Block2    13     2
#  8 Block2    14     3
#  9 Block2    15     4
# 10 Block2    16     5
# 11 Block2    17     6
# 12 Block3     2     1
# 13 Block3     3     2
# 14 Block3     4     3
# 15 Block3     5     4
# 16 Block3     6     5
# 17 Block3     7     6

Then you can, for instance, use spread() to get a wide data.frame:

.Last.value %>% 
  spread(ind, a)
# A tibble: 6 x 4
#       n Block1 Block2 Block3
#   <int>  <int>  <int>  <int>
# 1     1      1     12      2
# 2     2      2     13      3
# 3     3      3     14      4
# 4     4      4     15      5
# 5     5      5     16      6
# 6     6     NA     17      7
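
In current tidyr, spread() has been superseded by pivot_wider(). The following is a minimal sketch of the same reshaping step with pivot_wider(); the data frame `long` is rebuilt by hand here as an assumption, so that the snippet runs on its own rather than relying on .Last.value.

```r
library(tidyr)

# Hand-built stand-in for the long tibble shown above (an assumption,
# not the output of the pipeline itself)
long <- data.frame(
  ind = rep(c("Block1", "Block2", "Block3"), times = c(5, 6, 6)),
  a   = c(1:5, 12:17, 2:7),
  n   = c(1:5, 1:6, 1:6)
)

# One column per value of `ind`; missing combinations are filled with NA
wide <- pivot_wider(long, names_from = ind, values_from = a)
wide
```

The row index n is what lines the values up, so Block1 gets an NA in the row where n is 6.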

If you want a data.frame, you first have to make all the a vectors the same length. Then cbind the results.

res <- lapply(all, `[[`, "a")
n <- max(sapply(res, length))
res <- lapply(res, function(x) if(length(x) < n) c(x, rep(NA, n - length(x))) else x)
res <- do.call(cbind, res)
res <- as.data.frame(res)
res <- setNames(res, new_column_names)
res
#  Block1 Block2 Block3
#1      1     12      2
#2      2     13      3
#3      3     14      4
#4      4     15      5
#5      5     16      6
#6     NA     17      7
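
The padding step can also be written with the replacement function `length<-`, which extends a vector to a given length and fills the new positions with NA. This is only a sketch of that idiom; the input list `res` below is a hand-written stand-in for the extracted a vectors.

```r
# Stand-in input: three vectors of unequal length (an assumption)
res <- list(1:5, 12:17, 2:7)
new_column_names <- c("Block1", "Block2", "Block3")

# Length of the longest vector
n <- max(lengths(res))

# `length<-`(x, n) pads x with NA up to length n
padded <- lapply(res, `length<-`, n)

# A named list of equal-length vectors converts directly to a data.frame
df <- as.data.frame(setNames(padded, new_column_names))
df
```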

Here is a modified option using tidyverse:

library(tidyverse)
map(all, ~ as_tibble(.x) %>%
               select(a)) %>%
      set_names(new_column_names) %>% 
      bind_rows(.id = 'ind')
# A tibble: 18 x 2
#   ind        a
#   <chr>  <int>
# 1 Block1     1
# 2 Block1     2
# 3 Block1     3
# 4 Block1     4
# 5 Block1     5
# 6 Block1     6
# 7 Block2    12
# 8 Block2    13
# 9 Block2    14
#10 Block2    15
#11 Block2    16
#12 Block2    17
#13 Block3     2
#14 Block3     3
#15 Block3     4
#16 Block3     5
#17 Block3     6
#18 Block3     7

Or using map2:

map2_df(all, new_column_names,
                 ~ as_tibble(.x) %>% 
                        mutate(ind = .y) %>%
                        select(ind, a))

In base R:

as.data.frame(setNames(lapply(all,`[[`,"a"),new_column_names))
#   Block1 Block2 Block3
# 1      1     12      2
# 2      2     13      3
# 3      3     14      4
# 4      4     15      5
# 5      5     16      6
# 6      6     17      7
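
Since the question asks for a function, the one-liner above could be wrapped roughly as follows. This is a sketch only: the name extract_column and its arguments are hypothetical, and it assumes every extracted vector has the same length.

```r
# Hypothetical helper; name and arguments are illustrative only
extract_column <- function(lists, column, new_names) {
  stopifnot(length(lists) == length(new_names))

  # Pull the requested element out of every inner list
  cols <- lapply(lists, `[[`, column)

  # `[[` returns NULL for a missing name, so report that explicitly
  absent <- vapply(cols, is.null, logical(1))
  if (any(absent)) {
    stop("column '", column, "' not found in element(s): ",
         paste(which(absent), collapse = ", "))
  }

  as.data.frame(setNames(cols, new_names))
}

list1 <- list(a = 1:6,   b = 4:9,   c = 3:8)
list2 <- list(a = 12:17, b = 10:15, c = 11:16)
list3 <- list(a = 2:7,   b = 14:19, c = 9:14)
all_lists <- list(list1, list2, list3)

result <- extract_column(all_lists, "a", c("Block1", "Block2", "Block3"))
result
```

If the extracted vectors differ in length, as.data.frame() stops with an error about differing numbers of rows, so the vectors would need to be padded first.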
