Extracting character elements from a vector in R

Question

In character strings like a, b, ..., e below, how can I extract the two character elements, i.e. "bmi" and "ch"? (That is, desired_output_in_this_case = c("bmi", "ch").)

The example below is just a toy example; the character elements can be anything other than ch and bmi. I'm looking for a general solution.

I have tried unlist(stringr::str_extract_all(a, "bmi|ch")), but the pattern "bmi|ch" has to be written out by hand to give the desired output, so it is not a general solution.

a <- "bmi + ch | study"
b <- "bmi * ch | study"
c <- "bmi * ch - 1 | study"
d <- "bmi * ch + 0 | study"
e <- "bmi:ch + 0 | study"
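
To make the limitation concrete, here is a minimal sketch using a base-R stand-in for the stringr call (regmatches with gregexpr), so it runs without extra packages. The helper name extract_fixed and the second input f are made up for illustration and are not part of the original question.

```r
# Hypothetical helper: extract matches of a hard-coded pattern (base R only).
extract_fixed <- function(x, pattern = "bmi|ch") {
  unlist(regmatches(x, gregexpr(pattern, x)))
}

a <- "bmi + ch | study"
f <- "age * dose | study"  # made-up input with different variable names

extract_fixed(a)  # "bmi" "ch"
extract_fixed(f)  # character(0): the fixed pattern finds nothing here
```

The pattern only works for names known in advance, which is why a solution that reads the names from the string itself is needed.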

Answer 1

Assume the vector v defined in the Note at the end. Then we can lapply over it using the function shown below. If the number of variables is always the same, you could alternatively use sapply, which gives a matrix.

lapply(sub("\\|.*", "", v), function(x) all.vars(parse(text = x)))

giving:

[[1]]
[1] "bmi" "ch" 

[[2]]
[1] "bmi" "ch" 

[[3]]
[1] "bmi" "ch" 

[[4]]
[1] "bmi" "ch" 

[[5]]
[1] "bmi" "ch"
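
The one-liner above can be read as three steps. The sketch below walks through them for a single string; the intermediate names lhs, expr and vars are only for illustration.

```r
x <- "bmi * ch - 1 | study"

# Step 1: drop everything from the first "|" onward.
lhs <- sub("\\|.*", "", x)

# Step 2: parse the remaining text as an R expression (it is not evaluated).
expr <- parse(text = lhs)

# Step 3: collect the variable names; operators and numeric constants are skipped.
vars <- all.vars(expr)
vars  # "bmi" "ch"
```

Because the names come from R's own parser rather than a hand-written pattern, nothing about "bmi" or "ch" is hard-coded.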

Note

a <- "bmi + ch | study"
b <- "bmi * ch | study"
c <- "bmi * ch - 1 | study"
d <- "bmi * ch + 0 | study"
e <- "bmi:ch + 0 | study"
v <- c(a, b, c, d, e)
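
As a rough sketch of the sapply variant mentioned above, and of how the same idea behaves on other variable names, consider the following. The helper name extract_vars and the inputs in w are assumptions made for this example, not part of the answer.

```r
# Hypothetical helper wrapping the lapply approach from this answer.
extract_vars <- function(x) {
  lapply(sub("\\|.*", "", x), function(s) all.vars(parse(text = s)))
}

v <- c("bmi + ch | study", "bmi:ch + 0 | study")
w <- c("age * dose + sex | clinic", "log(income) | region")  # made-up inputs

# sapply simplifies to a matrix when every string yields the same number of names.
m <- sapply(sub("\\|.*", "", v), function(s) all.vars(parse(text = s)))
dim(m)  # 2 2: one column per input string

# Different counts per string: stay with the list form.
extract_vars(w)
```

Note that all.vars leaves out function names by default, so "log(income)" contributes only "income".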

Answer 2

This is a bit more complicated. I will just leave it here in case someone finds it interesting.

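vecs <- list(a, b, c, d, e)
# Keep the text up to the last lowercase letter that is directly followed by
# a non-word character (this drops e.g. " | study" and " - 1 | study")
split_me <- Map(function(x) gsub("([a-z].*[a-z])(\\W.*)", "\\1", x, perl = TRUE), vecs)
# Remove whitespace, then split on the operators +, * and :
lapply(split_me, function(x)
  unlist(strsplit(gsub("\\s", "", x), "[+*:]")))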

Result

[[1]]
[1] "bmi" "ch" 

[[2]]
[1] "bmi" "ch" 

[[3]]
[1] "bmi" "ch" 

[[4]]
[1] "bmi" "ch" 

[[5]]
[1] "bmi" "ch"

Data

a <- "bmi + ch | study"
b <- "bmi * ch | study"
c <- "bmi * ch - 1 | study"
d <- "bmi * ch + 0 | study"
e <- "bmi:ch + 0 | study"
vecs<-list(a,b, c,d,e)
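
For readers comparing the two answers, here is a small sketch that wraps the same two regex stages in one function and tries it on a made-up input whose names end in digits. The helper name split_terms and the input "x1 + x2 | study" are assumptions for illustration only.

```r
# Hypothetical helper: the same two regex stages as above, applied to one string.
split_terms <- function(x) {
  lhs <- gsub("([a-z].*[a-z])(\\W.*)", "\\1", x, perl = TRUE)
  unlist(strsplit(gsub("\\s", "", lhs), "[+*:]"))
}

split_terms("bmi * ch - 1 | study")  # "bmi" "ch"

# Made-up input: no lowercase letter is directly followed by a non-word
# character, so the first gsub leaves the string unchanged.
split_terms("x1 + x2 | study")       # "x1" "x2|study"

# The parse-based approach on the same input, for comparison.
all.vars(parse(text = sub("\\|.*", "", "x1 + x2 | study")))  # "x1" "x2"
```

This suggests the regex version depends on the shape of the variable names, so it is best treated as tailored to inputs like the toy example.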

Answer 1: score 2, accepted, 2022-01-01 20:01:52
Answer 2: score 0, 2022-01-01 20:05:32
