Apologies if a similar question has already been answered. Here is what I am struggling to write a regular expression replacement for.
My input text contains this sample:
" var1 || literal "
I need a regular expression substitution that turns it into something like this:
" concat var1, literal "
Essentially, || should be replaced with a comma, and the first element should be prefixed with "concat". I may have multiple such occurrences in a given input, so I need to replace all of them.
Here is where I am stuck. I can build the regex pattern, but I am not sure how to substitute it, or with what.
re.sub(r'\s{1}[a-zA-Z0-9_]+\s*\|\|\s*[a-zA-Z0-9_]+\s*', '??????', input_string)
I am not sure if this can be done in a single Python statement.
My alternative is to loop through the string, find each occurrence, and replace it individually without using regular expressions.
Thanks in advance. Radha
You may handle this requirement using re.sub with a callback function:
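import re

sql = "select var1 || literal || var2 from yourTable"

def to_concat(expr):
    # expr is the matched text of one whole || chain; turn each || into a comma
    return "concat(" + re.sub(r'\s*\|\|', ',', expr) + ")"

# Match each chain of operands joined by ||, then rewrite it in the callback
sql_out = re.sub(r'\S+(?:\s+\|\|\s+\S+)+', lambda m: to_concat(m.group()), sql)
print(sql + "\n" + sql_out)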
This prints:
select var1 || literal || var2 from yourTable
select concat(var1, literal, var2) from yourTable
The idea here is to first match the entire expression involving the ANSI || concatenation operator. We then pass the matched text to a callback function, which replaces every || with a comma and wraps the result in a call to concat.
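As a rough sketch of the same callback idea packaged as a reusable function: the name rewrite_concat and the sample query below are hypothetical, and the pattern is the one used above. It assumes each operand is a whitespace-delimited token, so punctuation glued to an operand (for example a trailing comma) would be treated as part of that operand.

```python
import re

# Same pattern as above: one operand followed by one or more "|| operand" parts.
CONCAT_CHAIN = re.compile(r'\S+(?:\s+\|\|\s+\S+)+')

def rewrite_concat(sql):
    """Rewrite every whitespace-delimited || chain in sql as concat(...)."""
    def to_concat(match):
        # Split the matched chain on || and rejoin the operands with commas.
        operands = re.split(r'\s*\|\|\s*', match.group())
        return "concat(" + ", ".join(operands) + ")"
    return CONCAT_CHAIN.sub(to_concat, sql)

# Hypothetical query with two separate chains in one string.
print(rewrite_concat("select a || b as x, c || d || e as y from t"))
```

Because re.sub visits every non-overlapping match, each chain in the input is rewritten independently in a single call.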
With the Python re module, you can capture parts of a match by putting terms in parentheses in the pattern, then refer to them in the replacement using \1, \2, etc., in the order the groups appear.
re.sub(r'(?<=\s)([a-zA-Z0-9_]+)\s*\|\|\s*([a-zA-Z0-9_]+)', r'concat \1, \2', input_string)
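To see the backreference approach on the sample from the question, here is a minimal sketch. The names PAIR and concat_pairs are hypothetical, and the pattern uses a lookbehind, (?<=\s), so that the whitespace around the expression is checked but not consumed by the match.

```python
import re

# Two captured operands around ||; the lookbehind requires preceding
# whitespace without making it part of the match.
PAIR = re.compile(r'(?<=\s)([a-zA-Z0-9_]+)\s*\|\|\s*([a-zA-Z0-9_]+)')

def concat_pairs(text):
    """Replace each 'x || y' pair with 'concat x, y' using backreferences."""
    return PAIR.sub(r'concat \1, \2', text)

print(repr(concat_pairs(" var1 || literal ")))   # sample from the question
print(repr(concat_pairs(" a || b and c||d ")))   # two separate pairs
```

With exactly two groups, this form only rewrites pairs; a chain of three or more operands would need the callback approach or a second pass.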