How to transform a list (strsplit output) into a logical data frame (according to column names) in R
This is my first post, and I do not have programming experience.
Problem:
I have a list of 200 character vectors, each with 0 to 7 elements (the list is the output of the strsplit function):
> input
[[1]]
[1] "foo" "bar" "norf"
[[2]]
[1] "norf"
[[3]]
[1] NA
.....
[[200]]
[1] "hello" "norf"
I also have a character vector of all the strings that can occur in input:
possible_strings <- c("foo","bar","hello",...)
I want to convert the list into a data frame (or a similar object that gets the job done) of the following format:
> res
foo bar norf hello
[1, ] TRUE TRUE TRUE FALSE
[2, ] FALSE FALSE TRUE FALSE
[3, ] FALSE FALSE FALSE FALSE
[...]
[200,] FALSE FALSE TRUE TRUE
I tried extensively to convert it. The furthest I got was a data frame with all possible strings as column names, but its rows held the character strings themselves, padded with NAs (I used rbind.fill in the process).
Any help would be greatly appreciated.
Thanks!
In your original question, you say you'd like the result to be a data frame, but the result you show, res, is actually a matrix. Therefore, my first result below is a matrix, and then I convert it to a data frame with as.data.frame().
This can be done fairly easily with sapply() and %in%. sapply() goes through the list one element at a time and, for each element x, evaluates possStr %in% x, which returns a logical vector with one entry per element of possStr (TRUE if that string occurs in x). sapply() binds these vectors together as columns, so the result is transposed with t() to get one row per list element.
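To see why the transpose is needed, it may help to look at a single list element first. The following is only a small sketch; the names x, one_row, and cols are made up for illustration and are not part of the original answer.

```r
possStr <- c("foo", "bar", "norf", "hello")

# One list element: which of the possible strings occur in it?
x <- c("hello", "norf")
one_row <- possStr %in% x
one_row
# [1] FALSE FALSE  TRUE  TRUE

# sapply() stacks one such vector per list element as COLUMNS,
# so two list elements give a 4 x 2 matrix ...
cols <- sapply(list(c("foo", "bar", "norf"), "norf"),
               function(x) possStr %in% x)
dim(cols)
# [1] 4 2

# ... and t() turns that into one row per list element.
dim(t(cols))
# [1] 2 4
```

An element that is just NA matches none of the possible strings, so its row comes out as all FALSE.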
> input <- list(c("foo", "bar", "norf"), "norf", NA, c("hello", "norf"))
> possStr <- c("foo", "bar", "norf", "hello")
> d <- t(sapply(input, function(x) possStr %in% x ))
> colnames(d) <- possStr
> d ## in matrix form
# foo bar norf hello
# [1,] TRUE TRUE TRUE FALSE
# [2,] FALSE FALSE TRUE FALSE
# [3,] FALSE FALSE FALSE FALSE
# [4,] FALSE FALSE TRUE TRUE
> as.data.frame(d) ## convert to data frame
# foo bar norf hello
# 1 TRUE TRUE TRUE FALSE
# 2 FALSE FALSE TRUE FALSE
# 3 FALSE FALSE FALSE FALSE
# 4 FALSE FALSE TRUE TRUE
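For completeness, here is a sketch of the whole pipeline starting from unsplit strings. The raw vector and the comma delimiter are assumptions made up for this example, since the question does not show the original strsplit call. It also swaps sapply() for vapply(), which is a variation on the approach above rather than a required change: vapply() checks that every result is a logical vector of the expected length, so the shape of the output does not depend on the input.

```r
# Hypothetical raw data; the real strings and delimiter are not shown in the question.
raw <- c("foo,bar,norf", "norf", NA, "hello,norf")
possStr <- c("foo", "bar", "norf", "hello")

# Split each string on commas; NA stays NA in the resulting list.
input <- strsplit(raw, ",", fixed = TRUE)

# One logical vector of length(possStr) per list element, returned as columns,
# then transposed so that each list element becomes a row.
d <- t(vapply(input, function(x) possStr %in% x, logical(length(possStr))))
colnames(d) <- possStr

# Convert the logical matrix to a data frame.
df <- as.data.frame(d)
```

With a logical data frame, counting how often each string occurs is then just colSums(df).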