How to transform a list (strsplit output) into a logical data frame (according to column names) in R

This is my first post, and I do not have programming experience.

Problem:

I have a list of 200 character vectors, each with 0 to 7 elements (the list is the output of the strsplit function):

> input
[[1]]
[1] "foo" "bar" "norf"

[[2]]
[1] "norf"

[[3]]
[1] NA

.....
[[200]]
[1] "hello" "norf"

I also have a character vector of all the strings that can occur in input:

possible_strings <- c("foo","bar","hello",...)

I want to convert input into a data frame (or a similar object that gets the job done) of the following format:

> res
        foo   bar   norf  hello
[1,  ]  TRUE  TRUE  TRUE  FALSE
[2,  ]  FALSE FALSE TRUE  FALSE
[3,  ]  FALSE FALSE FALSE FALSE
[...]
[200,]  FALSE FALSE TRUE  TRUE

I tried extensively to convert it. The furthest I got was a data frame with all possible strings as column names, but its rows held the character strings themselves, padded with NAs, rather than logical values (I used rbind.fill in the process).

Any help would be greatly appreciated.

Thanks!

In your original question, you say you'd like the result to be a data frame, but the result you show, res, is actually a matrix. Therefore, my first result below is a matrix, which I then convert to a data frame with as.data.frame().

This can be done fairly easily with sapply() and %in%. sapply() goes through the list one element at a time; for each element x, possStr %in% x returns a logical vector with one entry per string in possStr, TRUE where that string occurs in x. sapply() binds these vectors together as columns, so t() is needed to get one row per list element.

> input <- list(c("foo", "bar", "norf"), "norf", NA, c("hello", "norf"))
> possStr <- c("foo", "bar", "norf", "hello")

> d <- t(sapply(input, function(x) possStr %in% x ))
> colnames(d) <- possStr 
> d                                       ## in matrix form
#        foo   bar  norf hello
# [1,]  TRUE  TRUE  TRUE FALSE
# [2,] FALSE FALSE  TRUE FALSE
# [3,] FALSE FALSE FALSE FALSE
# [4,] FALSE FALSE  TRUE  TRUE

> as.data.frame(d)                        ## convert to data frame
#     foo   bar  norf hello
# 1  TRUE  TRUE  TRUE FALSE
# 2 FALSE FALSE  TRUE FALSE
# 3 FALSE FALSE FALSE FALSE
# 4 FALSE FALSE  TRUE  TRUE
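
As a variation on the same idea, here is a minimal sketch that uses vapply() instead of sapply(), so that R checks that every result is a logical vector of the expected length. It reuses input and possStr from above; the names m, res and unknown are only illustrative. The last line is an optional sanity check, assuming you want to know whether input contains any string that is missing from possStr.

```r
input   <- list(c("foo", "bar", "norf"), "norf", NA, c("hello", "norf"))
possStr <- c("foo", "bar", "norf", "hello")

# vapply() works like sapply(), but checks that each result is a
# logical vector of length(possStr)
m <- vapply(input, function(x) possStr %in% x, logical(length(possStr)))

# m has one column per list element, so transpose to get one row per element
d <- t(m)
colnames(d) <- possStr
res <- as.data.frame(d)

# strings that occur in input but are missing from possStr (NA is ignored)
unknown <- setdiff(unlist(input), c(possStr, NA))
```

If unknown is not empty, those strings would silently get no column in res, so it may be worth adding them to possStr first.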
