I am processing strings in R which are supposed to contain zero or one pair of parentheses. If there are nested parentheses I need to delete the inner pair. Here is an example where I need to delete the parentheses around big bent nachos but not the other/outer parentheses.
test <- c(
"Record ID",
"What is the best food? (choice=Nachos)",
"What is the best food? (choice=Tacos (big bent nachos))",
"What is the best food? (choice=Chips with stuff)",
"Complete?"
)
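As a quick check before changing anything, the elements that actually contain a nested pair can be flagged with base R. This is only a sketch: the name has_nested is made up for illustration, and the pattern assumes "nested" means a second opening parenthesis appears before the first one is closed.

```r
test <- c(
  "Record ID",
  "What is the best food? (choice=Nachos)",
  "What is the best food? (choice=Tacos (big bent nachos))",
  "What is the best food? (choice=Chips with stuff)",
  "Complete?"
)

# TRUE where a "(" is followed by another "(" with no ")" or "(" in between
has_nested <- grepl("\\([^()]*\\(", test)
print(has_nested)
# [1] FALSE FALSE  TRUE FALSE FALSE
```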
I know I can remove all the parentheses with str_remove_all() from the stringr package:
test |>
stringr::str_remove_all(stringr::fixed(")")) |>
stringr::str_remove_all(stringr::fixed("("))
but I don't have the regex skills to target only the inner parentheses. I found a SO post that comes close, but it removes the outer parentheses, and I can't untangle it to remove the inner pair instead.
Here you go.
test |>
stringr::str_replace_all("(\\().*\\(", "\\1") |> # remove inner open brackets
stringr::str_remove_all("\\)(?=.*\\))") # remove inner closed brackets
[1] "Record ID"
[2] "What is the best food? (choice=Nachos)"
[3] "What is the best food? (big bent nachos)"
[4] "What is the best food? (choice=Chips with stuff)"
[5] "Complete?"
Fixed my solution, so as to not lose text:
test |>
stringr::str_replace("\\((.*)\\(", "(\\1") |> # remove inner open brackets
stringr::str_remove_all("\\)(?=.*\\))") # remove inner closing parentheses
[1] "Record ID"
[2] "What is the best food? (choice=Nachos)"
[3] "What is the best food? (choice=Tacos big bent nachos)"
[4] "What is the best food? (choice=Chips with stuff)"
[5] "Complete?"
Interested in how this would be solved with multiple ( ... ) groups inside the outer parentheses, I came up with the following lookahead-based idea. It only checks for an outer closing parenthesis, though.
test <- gsub("\\(([^)(]*)\\)(?=[^)(]*(?:\\([^)(]*\\)[^)(]*)*\\))", "\\1", test, perl = TRUE)
See this R demo at tio.run or a pattern demo at regex101 (replace with \1, the capture of the first group).
The lookahead verifies at each ( ... ) that it is followed only by further ( ... ) groups or non-parenthesis characters up to a closing ).
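To make the lookahead's behaviour concrete, here is a small sketch that applies the same pattern to two made-up strings (x and y are illustrative inputs, not from the question). The second one shows the limitation mentioned above: only a closing parenthesis is checked, not a matching opening one.

```r
# Pattern copied from the gsub() call above
pattern <- "\\(([^)(]*)\\)(?=[^)(]*(?:\\([^)(]*\\)[^)(]*)*\\))"

# Several inner groups inside one outer pair, plus a separate top-level pair
x <- "a (b (c) d (e) f) g (h)"
flat_x <- gsub(pattern, "\\1", x, perl = TRUE)
print(flat_x)
# [1] "a (b c d e f) g (h)"

# Unbalanced input: "(a)" is unwrapped because a ")" follows it,
# even though there is no outer "(" before it
y <- "(a) b)"
flat_y <- gsub(pattern, "\\1", y, perl = TRUE)
print(flat_y)
# [1] "a b)"
```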
If there is arbitrary nesting, flattening the first level can be done with a recursive regex.
test <- gsub("(?:\\G(?!^)|\\()[^)(]*+\\K(\\(((?>[^)(]+|(?1))*)\\))", "\\2", test, perl = TRUE)
One more R demo at tio.run or a regex101 demo (replace with \2, the second group's capture).
regex-part | explained |
---|---|
(?:\G(?!^)\|\() | Matches an opening parenthesis, or uses \G to continue where the previous match ended, so that matches are chained to that opening parenthesis |
[^)(]*+\K | Consumes any number of non-parenthesis characters; \K then resets the start of the reported match |
(\(((?>[^)(]+\|(?1))*)\)) | Matches the nested parentheses (explanation at php.net). It contains two capture groups: the first is recursed into at (?1); the second captures the text inside the parentheses |
Here the matches are chained to the opening parenthesis. There is no check for an outer closing ). This \G based idea can also be used without recursion for just one level, but it is slightly less efficient.
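If the regex becomes hard to maintain, a plain depth counter is another option. The sketch below is not the regex approach above but a swapped-in technique: it walks each string character by character and drops every parenthesis that sits inside another pair, at any depth. The function name drop_nested_parens is hypothetical, and the handling of stray closing parentheses (they are kept) is an assumption.

```r
# Hypothetical helper: keep only depth-1 parentheses, drop all deeper ones
drop_nested_parens <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    chars <- strsplit(s, "", fixed = TRUE)[[1]]
    keep <- rep(TRUE, length(chars))
    depth <- 0L
    for (i in seq_along(chars)) {
      if (chars[i] == "(") {
        depth <- depth + 1L
        keep[i] <- depth == 1L          # keep only the outermost "("
      } else if (chars[i] == ")") {
        keep[i] <- depth <= 1L          # keep outermost (or stray) ")"
        depth <- max(depth - 1L, 0L)
      }
    }
    paste(chars[keep], collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

flat_one <- drop_nested_parens("What is the best food? (choice=Tacos (big bent nachos))")
flat_deep <- drop_nested_parens("a (b (c (d)) e) f")
print(flat_one)
# [1] "What is the best food? (choice=Tacos big bent nachos)"
print(flat_deep)
# [1] "a (b c d e) f"
```

Unlike the recursive regex, this flattens every nesting level rather than only the first, so it suits the case where exactly one pair should remain.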
Assuming there is at most one nested pair of parentheses, we could use a gsub() approach. Note that this removes the inner parentheses together with their contents:
output <- gsub("\\(\\s*(.*?)\\s*\\(.*?\\)(.*?)\\s*\\)", "(\\1\\2)", test)
output
[1] "Record ID"
[2] "What is the best food? (choice=Nachos)"
[3] "What is the best food? (choice=Tacos)"
[4] "What is the best food? (choice=Chips with stuff)"
[5] "Complete?"
Data:
test <- c(
"Record ID",
"What is the best food? (choice=Nachos)",
"What is the best food? (choice=Tacos (big bent nachos))",
"What is the best food? (choice=Chips with stuff)",
"Complete?"
)
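The gsub() call in this answer discards the text inside the inner parentheses. If that text should be kept, as in the question, one possible variation is to capture it as well. This is a sketch under the same assumption of at most one nested pair; the extra capture group, the single space in the replacement, and perl = TRUE are my additions, not part of the original answer.

```r
test <- c(
  "Record ID",
  "What is the best food? (choice=Nachos)",
  "What is the best food? (choice=Tacos (big bent nachos))",
  "What is the best food? (choice=Chips with stuff)",
  "Complete?"
)

# \\1 = text before the inner pair, \\2 = text inside it, \\3 = text after it
kept <- gsub("\\(\\s*(.*?)\\s*\\((.*?)\\)(.*?)\\s*\\)", "(\\1 \\2\\3)", test, perl = TRUE)
print(kept[3])
# [1] "What is the best food? (choice=Tacos big bent nachos)"
```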
Here is a solution using gsub from base R. It is broken down into 2 steps for readability and debugging.
test <- c(
"Record ID",
"What is the best food? (choice=Nachos)",
"What is the best food? (choice=Tacos (big bent nachos))",
"What is the best food? (choice=Chips with stuff)",
"Complete?"
)
test <- gsub("(\\(.*)\\(", "\\1", test)
# (\\(.*) - first group: a '(' followed by zero or more characters
# \\( - then another '(' (the inner opening parenthesis)
# "\\1" - replace the whole match with the first group, dropping the inner '('
test <- gsub("\\)(.*\\))", "\\1", test)
# similar to the first step: drop a ')' that is followed later by another ')'
test
[1] "Record ID"
[2] "What is the best food? (choice=Nachos)"
[3] "What is the best food? (choice=Tacos big bent nachos)"
[4] "What is the best food? (choice=Chips with stuff)"
[5] "Complete?"