
Convert a list of tab-delimited strings into a data frame

I have lists of tab-delimited strings like the ones below:

xx <- list("raw total sequences:\t67166250", "1st fragments:\t33583125")
yy <- list("raw total sequences:\t190999", "1st fragments:\t222")

I want "raw total sequences" and "1st fragments" as column names, the numeric values as column values, and xx and yy as row names. How can I do this efficiently?

You can combine the individual lists into one named list so that they are easier to work with. The return_df_from_list function matches each string with two capture groups, one before the colon (used as the column name) and one after the tab that follows it (used as the value), and returns a one-row data frame.
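To show what the two capture groups pick out, here is a minimal sketch in base R. It uses regexec() and regmatches() rather than the stringr::str_match call from the answer, and the names x, m and parts are only illustrative.

```r
x <- c("raw total sequences:\t67166250", "1st fragments:\t33583125")

# regexec() finds the match and its capture groups; regmatches() extracts
# them as one character vector per input string:
# element 1 = full match, element 2 = first group, element 3 = second group.
m <- regmatches(x, regexec("(.*):\t(.*)", x))

# Stack the per-string vectors into a matrix, one row per input string.
parts <- do.call(rbind, m)

parts[, 2]  # text before the colon (future column names)
parts[, 3]  # text after the tab (values, still character at this point)
```

The values come back as character strings, which is why a type conversion step is still needed afterwards.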

We apply the function to each list and combine the results into one data frame using map_df.

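library(dplyr)   # provides lst() and %>%
library(tibble)  # provides column_to_rownames()

# lst() names each element after the variable it came from ("xx", "yy")
list_data <- lst(xx, yy)

return_df_from_list <- function(x) {
  # Column 1 of the match matrix is the full match; columns 2 and 3 are the capture groups
  value <- stringr::str_match(x, '(.*):\t(.*)')
  # One-row data frame: values as the row, text before the colon as column names
  setNames(data.frame(t(value[, 3])), value[, 2])
}

# .id = "rowname" stores each list's name in a column, which then becomes the row names
result <- purrr::map_df(list_data, return_df_from_list, .id = "rowname") %>%
  column_to_rownames() %>%
  type.convert(as.is = TRUE)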

result

#   raw total sequences 1st fragments
#xx            67166250      33583125
#yy              190999           222
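If you would rather avoid the package dependencies, the same table can be sketched in base R. This is an alternative, not the method used above: parse_stats is a hypothetical helper name, and the sketch assumes every list contains the same fields in the same order, because rbind() takes the column names from the first row only.

```r
xx <- list("raw total sequences:\t67166250", "1st fragments:\t33583125")
yy <- list("raw total sequences:\t190999", "1st fragments:\t222")

# Hypothetical helper: turn one list of "name:\tvalue" strings
# into a named numeric vector.
parse_stats <- function(x) {
  # Split each string on the literal ":<tab>" separator
  parts <- strsplit(unlist(x), ":\t", fixed = TRUE)
  values <- as.numeric(vapply(parts, `[`, character(1), 2))
  setNames(values, vapply(parts, `[`, character(1), 1))
}

# The names of the outer list become the row names after rbind()
rows <- lapply(list(xx = xx, yy = yy), parse_stats)
result_base <- as.data.frame(do.call(rbind, rows))

result_base
```

Splitting on the fixed ":\t" separator avoids the regular expression entirely, at the cost of assuming that the separator appears exactly once per string.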


 