Subsetting a string based on multiple conditions

Question

I have a vector where each element is a string. I only want to keep the part of the string right before the '==', regardless of whether it is at the beginning of the string, after the & symbol, or after the | symbol. Here is my data:

data <- c("name=='John'", "name=='David'&age=='50'|job=='Doctor'&city=='Liverpool'", 
"job=='engineer'&name=='Andrew'", 
"city=='Manchester'", "age=='40'&city=='London'"
)

My ideal format would be something like this:

[1] "name"
[2] "name" "age" "job" "city"
[3] "job" "name"
[4] "city" 
[5] "age" "city"

The closest I have got is using genXtract from the qdap library, which puts the data in the format above, but I only know how to use it with one condition, i.e.

qdap::genXtract(data, "&", "==")

But I don't just want the part of the string between & and ==, but also the part between | and ==, or between the beginning of the string and ==.

Answer 1

This regex captures every run of letters and digits ([0-9a-zA-Z]) that is immediately followed by ==. The lookahead (?=(==)) checks for the == without including it in the match.

stringr::str_extract_all( data, "[0-9a-zA-Z]+(?=(==))")

[[1]]
[1] "name"
[[2]]
[1] "name" "age"  "job"  "city"
[[3]]
[1] "job"  "name"
[[4]]
[1] "city"
[[5]]
[1] "age"  "city"

If you want the output as a character vector with one string per element, use

L <- stringr::str_extract_all( data, "[0-9a-zA-Z]+(?=(==))" )
unlist( lapply( L, paste, collapse = " " ) )

results in

[1] "name"             
[2] "name age job city"
[3] "job name"         
[4] "city"             
[5] "age city"
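
One thing to watch for: the character class [0-9a-zA-Z] matches letters and digits only, so a key containing an underscore would be cut short. The sketch below uses a made-up input (first_name is hypothetical and not part of the question's data) and base R's regmatches/gregexpr with perl = TRUE, so that it runs without stringr. It compares that class with \w, which also matches the underscore.

```r
# Hypothetical input: a key that contains an underscore (not in the original data)
x <- "first_name=='John'&age=='50'"

# Letters and digits only, followed by == (lookahead, so == is not captured)
alnum <- regmatches(x, gregexpr("[0-9a-zA-Z]+(?=(==))", x, perl = TRUE))[[1]]

# \w matches letters, digits and the underscore
word <- regmatches(x, gregexpr("\\w+(?=(==))", x, perl = TRUE))[[1]]

print(alnum)  # only the part of the first key after the underscore survives
print(word)   # the full key, underscore included
```

If the keys in your data never contain underscores, both patterns should give the same result.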

Answer 2

In base R, this can be done with regmatches/gregexpr:

lst1 <- regmatches(data, gregexpr("\\w+(?=\\={2})", data, perl = TRUE))
sapply(lst1, paste, collapse = " ")
#[1] "name"
#[2] "name age job city"
#[3] "job name"
#[4] "city"
#[5] "age city"
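
A different technique from the lookahead patterns above, offered only as a rough sketch: split each string into its conditions at & or |, then drop everything from the == onward. It assumes the quoted values never contain &, | or ==, which holds for the sample data but may not hold in general. The names conditions and keys are illustrative.

```r
data <- c("name=='John'", "name=='David'&age=='50'|job=='Doctor'&city=='Liverpool'",
          "job=='engineer'&name=='Andrew'",
          "city=='Manchester'", "age=='40'&city=='London'")

# Split each string into its individual conditions at & or |
conditions <- strsplit(data, "[&|]")

# Within each condition, remove everything from the first == to the end
keys <- lapply(conditions, function(cond) sub("==.*$", "", cond))

print(keys)
```

This keeps the result as a list with one character vector per input string, which is the shape asked for in the question.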
