How can I use gsub to remove specific characters before and after an arbitrary substring in a string?

I am attempting to use gsub to remove characters from the following string:

string <- "function(data, x = !!rlang::sym(\"Time1\"), y = !!rlang::sym(\"YVAR\")), values = c(\"a\", \"b\"))"

The new string should return:

cat(string)
function(data, x = Time1, y = YVAR, values = c("a", "b"))

That is, I'd like to remove !!rlang::sym(\" , keep Time1 , and remove the closing quote and parenthesis \") after Time1 (and do the same for YVAR ).

Time1 and YVAR (the x and y variable names) are arbitrary and could be anything. However, the leading !!rlang::sym(\" and the closing quote and parenthesis \") around the name that needs to be kept are constant and will not change.

I understand I can simply use

result <- gsub("!!rlang::sym(\"", "", string, fixed = TRUE)

then

 result <- gsub("\")", "", result, fixed = TRUE)

to get part of the way there. However, I'd like a more elegant regex solution that combines both gsub calls and, of course, does not remove the closing \") in values = c(\"a\", \"b\")).

If it's always the literal !!rlang::sym(" , then this works:

cat( gsub('!!rlang::sym\\("(\\S+)"\\)', "\\1", string), "\n" )
# function(data, x = Time1, y = YVAR), values = c("a", "b")) 

If it's any function call with a paren and quote, then it can be generalized a little. I'd think you'd want some specificity, since otherwise you'll be parsing out a lot more than you want. I'll assume that rlang is required:

gsub('\\S+rlang\\S+\\("(\\S+)"\\)', "\\1", string)

Note that there are two right-parens in your sample string, !!rlang::sym(\"YVAR\")) , which thwart the pattern a little: one stray ) is left behind after YVAR. If that's real, then either look for repeats with "\\)+ or... something else.
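
A minimal sketch of the "\\)+ idea, assuming the doubled paren after YVAR really is part of the input. It uses [^"]+ for the captured name instead of \\S+, and the variable names cleaned and swallowed are only illustrative:

```r
string <- "function(data, x = !!rlang::sym(\"Time1\"), y = !!rlang::sym(\"YVAR\")), values = c(\"a\", \"b\"))"

# "\\)+" consumes one or more closing parens after the closing quote,
# so the extra ")" after YVAR is removed together with the call.
cleaned <- gsub('!!rlang::sym\\("([^"]+)"\\)+', "\\1", string)
cat(cleaned)
# function(data, x = Time1, y = YVAR, values = c("a", "b"))

# Caveat: a legitimate ")" directly after the call is swallowed as well.
swallowed <- gsub('!!rlang::sym\\("([^"]+)"\\)+', "\\1", 'f(x = !!rlang::sym("a"))')
print(swallowed)
# [1] "f(x = a"
```

So this only gives the desired result when the extra paren is a consistent quirk of the input; if the call can be the last argument of an enclosing function, the single "\\) version is the safer choice.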

You could use a single pattern with a capture group which will match any character except " , and use group 1 in the replacement.

!!rlang::sym\("([^"]+)"\)

string <- "function(data, x = !!rlang::sym(\"Time1\"), y = !!rlang::sym(\"YVAR\")), values = c(\"a\", \"b\"))"
cat(gsub('!!rlang::sym\\("([^"]+)"\\)', "\\1", string))

Output

function(data, x = Time1, y = YVAR), values = c("a", "b"))
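
As a rough illustration of why excluding the quote with [^"]+ can be safer than a greedy \\S+, here is a sketch on a made-up input (nospace is a hypothetical string, not from the question) in which two calls are not separated by whitespace:

```r
nospace <- 'c(!!rlang::sym("a"),!!rlang::sym("b"))'

# Greedy \\S+ can run across both calls when no whitespace separates them,
# so everything between the first opening quote and the last closing quote
# ends up in the capture group.
greedy <- gsub('!!rlang::sym\\("(\\S+)"\\)', "\\1", nospace)
print(greedy)
# [1] "c(a\"),!!rlang::sym(\"b)"

# [^"]+ cannot cross a double quote, so each call is handled on its own.
bounded <- gsub('!!rlang::sym\\("([^"]+)"\\)', "\\1", nospace)
print(bounded)
# [1] "c(a,b)"
```

On the string from the question both patterns give the same result, because every call there is followed by whitespace or a comma and then a space.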

Use

result <- gsub("!!rlang::sym\\(\"([\\w\\W]*?)\"\\)", "\\1", string, perl=TRUE)

Explanation

--------------------------------------------------------------------------------
  !!rlang::sym             '!!rlang::sym'
--------------------------------------------------------------------------------
  \(                       '('
--------------------------------------------------------------------------------
  \"                       '"'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [\w\W]*?                 any character of: word characters (a-z,
                             A-Z, 0-9, _), non-word characters (all
                             but a-z, A-Z, 0-9, _) (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
  \)                       ')'

R proof:

string <- "function(data, x = !!rlang::sym(\"Time1\"), y = !!rlang::sym(\"YVAR\")), values = c(\"a\", \"b\"))"
result <- gsub("!!rlang::sym\\(\"([\\w\\W]*?)\"\\)", "\\1", string, perl=TRUE)
cat(result)

Results: function(data, x = Time1, y = YVAR), values = c("a", "b"))
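
If the doubled backslashes and escaped quotes are hard to read, the same pattern can probably be written as a raw string. This is a sketch that assumes R 4.0.0 or later (where the r"[...]" syntax is available); the square-bracket delimiter is used because the pattern itself contains the sequence )" which would end an r"(...)" literal early:

```r
# Requires R >= 4.0.0 for raw string literals.
string <- "function(data, x = !!rlang::sym(\"Time1\"), y = !!rlang::sym(\"YVAR\")), values = c(\"a\", \"b\"))"

# Inside r"[...]" backslashes and double quotes are taken literally.
pattern <- r"[!!rlang::sym\("([\w\W]*?)"\)]"

# The raw pattern is the same string as the escaped version.
same <- identical(pattern, "!!rlang::sym\\(\"([\\w\\W]*?)\"\\)")
print(same)
# [1] TRUE

result <- gsub(pattern, r"(\1)", string, perl = TRUE)
cat(result)
# function(data, x = Time1, y = YVAR), values = c("a", "b"))
```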
