Convert list of character vectors to tidy data frame
I have a list of character vectors that I would like to convert into a tidy data frame. The character vectors are of unequal length.
dput(data)
list(`ko03008 Ribosome biogenesis in eukaryotes` = c("G5382",
"G13330", "G4043", "G13255"), `ko03010 Ribosome` = c("G16823",
"G4822", "G11737", "G114", "G18144", "G6031", "G24182", "G9882",
"G14270", "G16903", "G2506", "G3550"), `ko03013 RNA transport` = c("G18058",
"G20817", "G6913", "G18004", "G4129", "G5382", "G5264", "G17529",
"G5114", "G21371", "G19351", "G15511", "G1049", "G14663"), `ko03015 mRNA surveillance pathway` = c("G20817",
"G6913", "G18004", "G4129", "G5382", "G19351", "G15511", "G1463"
), `ko03018 RNA degradation` = c("G11453", "G7437", "G11483",
"G12095"), `ko03020 RNA polymerase` = c("G13069", "G10917", "G6973",
"G7432"))
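I would like to create a data frame with two columns: one holding the name of each character vector in the list (e.g. 'ko03008 Ribosome biogenesis in eukaryotes') and the other holding the gene IDs (e.g. 'G5382').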
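I have already used enframe to create a tibble, but it keeps each vector in a single row. I would like one row per gene ID instead (for example, four rows for the first vector in the list).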
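The intermediate shape that enframe produces can be reproduced on a small made-up list. This is only a sketch: the list `toy` and its names and IDs are placeholders, not taken from the data above.

```r
library(tibble)

# Toy named list with character vectors of unequal length (placeholder values)
toy <- list(pathway_a = c("G1", "G2"), pathway_b = c("G3", "G4", "G5"))

# enframe() returns one row per list element;
# the `value` column is a list-column holding each whole vector
nested <- enframe(toy)

nrow(nested)           # number of list elements, not number of IDs
class(nested$value)    # list-column rather than a character column
lengths(nested$value)  # how many IDs are stored in each row
```

Because each vector sits inside a single cell, a further step is needed to get one row per ID.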
Use unnest_longer after enframe:
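library(tidyverse)
data %>%
  enframe() %>%           # one row per list element, vectors kept in a list-column
  unnest_longer(value)    # expand the list-column to one row per gene ID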
# A tibble: 46 x 2
   name                                      value
   <chr>                                     <chr>
 1 ko03008 Ribosome biogenesis in eukaryotes G5382
 2 ko03008 Ribosome biogenesis in eukaryotes G13330
 3 ko03008 Ribosome biogenesis in eukaryotes G4043
 4 ko03008 Ribosome biogenesis in eukaryotes G13255
 5 ko03010 Ribosome                          G16823
 6 ko03010 Ribosome                          G4822
 7 ko03010 Ribosome                          G11737
 8 ko03010 Ribosome                          G114
 9 ko03010 Ribosome                          G18144
10 ko03010 Ribosome                          G6031
# ... with 36 more rows
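If loading the tidyverse is not an option, the same long format can probably be built in base R by repeating each list name once per element of its vector. The sketch below is an alternative, not the approach from the answer above, and it uses a hypothetical list `toy` in place of `data`.

```r
# Toy named list with character vectors of unequal length (placeholder values)
toy <- list(pathway_a = c("G1", "G2"), pathway_b = c("G3", "G4", "G5"))

long <- data.frame(
  # repeat each name as many times as its vector has elements
  name  = rep(names(toy), times = lengths(toy)),
  # flatten all vectors into one character vector, dropping the names
  value = unlist(toy, use.names = FALSE),
  stringsAsFactors = FALSE
)

long
```

Substituting `data` for `toy` should give the same 46 name/value pairs in the same order, as a plain data.frame rather than a tibble.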