Convert a list of tab-separated strings to a dataframe

Question

I have lists of strings that contain tabs, like below:

xx <- list("raw total sequences:\t67166250", "1st fragments:\t33583125")
yy <- list("raw total sequences:\t190999", "1st fragments:\t222")

I want "raw total sequences" and "1st fragments" as column names, the numeric values as column values, and xx and yy as row names. How can I do this efficiently?

Answer 1

You can combine the individual lists into one named list so that they are easier to work with. The return_df_from_list function matches each string against two capture groups, one before the colon (used as the column name) and one after the tab (used as the value), and returns a one-row dataframe.
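To make the two capture groups concrete, here is a minimal base R sketch that applies the same pattern to a single string. It uses regmatches() and regexec() instead of stringr, and the variable names s and m are only illustrative.

s <- "raw total sequences:\t67166250"

# regexec() returns the position of the full match and of each capture group;
# regmatches() extracts the corresponding substrings
m <- regmatches(s, regexec("(.*):\t(.*)", s))[[1]]

m[2]  # first capture group: text before the colon
m[3]  # second capture group: text after the tab

Element 1 of m is the full match, which is why the answer's function reads columns 2 and 3 of the str_match() result.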

We apply the function to each list and combine the results into one dataframe using map_df.

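library(dplyr)

# lst() names each element after the variable it came from ("xx", "yy")
list_data <- lst(xx, yy)

return_df_from_list <- function(x) {
  # Column 2 holds the text before the colon, column 3 the text after the tab
  value <- stringr::str_match(unlist(x), '(.*):\t(.*)')
  setNames(data.frame(t(value[, 3])), value[, 2])
}

# column_to_rownames() comes from tibble, so it is called with its namespace
result <- purrr::map_df(list_data, return_df_from_list, .id = "rowname") %>%
  tibble::column_to_rownames() %>%
  type.convert(as.is = TRUE)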

result

#   raw total sequences 1st fragments
#xx            67166250      33583125
#yy              190999           222
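If you would rather avoid the extra packages, the same result can probably be reached in base R. The following is only a sketch: parse_stats is a hypothetical helper name, and it assumes every list contains the same fields in the same order and that every value is numeric.

xx <- list("raw total sequences:\t67166250", "1st fragments:\t33583125")
yy <- list("raw total sequences:\t190999", "1st fragments:\t222")

# Hypothetical helper: turn one list of "name:\tvalue" strings
# into a named numeric vector
parse_stats <- function(x) {
  parts <- strsplit(unlist(x), ":\t", fixed = TRUE)
  values <- as.numeric(vapply(parts, `[`, character(1), 2))
  names(values) <- vapply(parts, `[`, character(1), 1)
  values
}

# The names of the outer list become the row names after rbind()
rows <- lapply(list(xx = xx, yy = yy), parse_stats)
result_base <- as.data.frame(do.call(rbind, rows))
result_base

Because rbind() takes the column names from the first vector, lists with differing or reordered fields would need to be aligned by name first.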

Solution 1
0 2022-09-23 01:36:35