
Convert list of character vectors to tidy data frame

I have a named list of character vectors that I would like to convert into a tidy data frame. The character vectors have unequal lengths.

dput(data)
list(`ko03008 Ribosome biogenesis in eukaryotes` = c("G5382", 
"G13330", "G4043", "G13255"), `ko03010 Ribosome` = c("G16823", 
"G4822", "G11737", "G114", "G18144", "G6031", "G24182", "G9882", 
"G14270", "G16903", "G2506", "G3550"), `ko03013 RNA transport` = c("G18058", 
"G20817", "G6913", "G18004", "G4129", "G5382", "G5264", "G17529", 
"G5114", "G21371", "G19351", "G15511", "G1049", "G14663"), `ko03015 mRNA surveillance pathway` = c("G20817", 
"G6913", "G18004", "G4129", "G5382", "G19351", "G15511", "G1463"
), `ko03018 RNA degradation` = c("G11453", "G7437", "G11483", 
"G12095"), `ko03020 RNA polymerase` = c("G13069", "G10917", "G6973", 
"G7432"))

I would like to create a data frame with two columns: one with the name of each character vector within the list (e.g. 'ko03008 Ribosome biogenesis in eukaryotes') and the other with the gene IDs (e.g. 'G5382').

I've used enframe to create a tibble that looks like this: [image not included]

but I would like to format it like this (an example of what the first vector in the list should look like):

[image not included]

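Use unnest_longer after enframe. enframe() turns the named list into a two-column tibble (name, value) in which value is a list-column, and unnest_longer() then expands each element of that list-column into its own row: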

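library(tidyverse)

data %>% 
  # named list -> tibble with columns `name` and `value` (a list-column)
  enframe() %>% 
  # one row per element of each vector in `value`
  unnest_longer(value)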

# A tibble: 46 x 2
   name                                      value 
   <chr>                                     <chr> 
 1 ko03008 Ribosome biogenesis in eukaryotes G5382 
 2 ko03008 Ribosome biogenesis in eukaryotes G13330
 3 ko03008 Ribosome biogenesis in eukaryotes G4043 
 4 ko03008 Ribosome biogenesis in eukaryotes G13255
 5 ko03010 Ribosome                          G16823
 6 ko03010 Ribosome                          G4822 
 7 ko03010 Ribosome                          G11737
 8 ko03010 Ribosome                          G114  
 9 ko03010 Ribosome                          G18144
10 ko03010 Ribosome                          G6031 
# ... with 36 more rows
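
If loading the tidyverse is not an option, the same long format can probably be built in base R. The following is only a sketch, not part of the original answer: it uses a three-element subset of the list from the question, and the object name tidy is an arbitrary choice. The idea is to repeat each list name once per element of its vector and to flatten the values in the same order.

```r
# Subset of the `data` list from the question (three of the six pathways)
data <- list(
  `ko03008 Ribosome biogenesis in eukaryotes` = c("G5382", "G13330", "G4043", "G13255"),
  `ko03015 mRNA surveillance pathway` = c("G20817", "G6913", "G18004", "G4129",
                                          "G5382", "G19351", "G15511", "G1463"),
  `ko03018 RNA degradation` = c("G11453", "G7437", "G11483", "G12095")
)

# lengths() gives the number of elements in each vector, so rep() repeats
# every name exactly as many times as its vector has gene IDs.
# unlist(use.names = FALSE) flattens the vectors in list order.
tidy <- data.frame(
  name  = rep(names(data), times = lengths(data)),
  value = unlist(data, use.names = FALSE),
  stringsAsFactors = FALSE
)

print(head(tidy))
```

Because rep() and unlist() both walk the list in order, each gene ID stays paired with the name of the vector it came from, regardless of the unequal lengths.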
