Convert a list of tab-delimited strings into a data frame
I have lists of strings containing tabs, like below:
xx <- list("raw total sequences:\t67166250", "1st fragments:\t33583125")
yy <- list("raw total sequences:\t190999", "1st fragments:\t222")
I want to have "raw total sequences" and "1st fragments" as column names, the numeric values as column values, and xx and yy as row names. How can I do it efficiently?
You may create one named list combining the individual lists that you have, so that it is easier to work with them.
The return_df_from_list function captures the data in two capture groups, one before the colon (used as the column name) and one after the tab (used as the value), and returns a one-row data frame.
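To see what the two capture groups yield, here is a minimal sketch that runs the same pattern on a single string from the question. The variable name m is only for illustration.

```r
library(stringr)

# One of the strings from the question
m <- str_match("raw total sequences:\t67166250", "(.*):\t(.*)")

m[, 1]  # full match
m[, 2]  # first capture group: text before the colon, used as the column name
m[, 3]  # second capture group: text after the tab, used as the value
```

str_match returns a character matrix with one row per input string, which is why the function indexes it by column.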
We apply the function to each list and combine the results into one data frame using map_df.
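library(dplyr)

# lst() names the elements after its inputs, so the names are "xx" and "yy"
list_data <- lst(xx, yy)

return_df_from_list <- function(x) {
  # Match matrix: column 2 is the text before the colon, column 3 the text after the tab
  value <- stringr::str_match(unlist(x), '(.*):\t(.*)')
  setNames(data.frame(t(value[, 3])), value[, 2])
}

result <- purrr::map_df(list_data, return_df_from_list, .id = "rowname") %>%
  tibble::column_to_rownames() %>%  # from tibble; not exported by dplyr
  type.convert(as.is = TRUE)        # turn the character values into numbers

result
#    raw total sequences 1st fragments
# xx            67166250      33583125
# yy              190999           222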
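If you would rather avoid the extra packages, the same idea can be sketched in base R. This is only an alternative under the assumption that every string contains exactly one ":\t" separator; parse_one and result_base are hypothetical names, not part of the code above.

```r
xx <- list("raw total sequences:\t67166250", "1st fragments:\t33583125")
yy <- list("raw total sequences:\t190999", "1st fragments:\t222")

# Hypothetical helper: split each string at ":\t" and return a named
# character vector (names = text before the colon, values = text after the tab)
parse_one <- function(x) {
  parts <- strsplit(unlist(x), ":\t", fixed = TRUE)
  values <- vapply(parts, `[`, character(1), 2)
  names(values) <- vapply(parts, `[`, character(1), 1)
  values
}

# One named vector per list; rbind stacks them into a character matrix
rows <- lapply(list(xx = xx, yy = yy), parse_one)
result_base <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

# Convert the character columns to numeric types
result_base <- type.convert(result_base, as.is = TRUE)
result_base
```

The list names become the row names and the text before each colon becomes the column names, so the layout should match the result above.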