Regular expression pattern to replace in Python

Apologies if a similar question has already been answered. Here is what I am struggling with: getting a regular expression replacement to work.

My input text contains a sample like this:

" var1 || literal "

I need a regular expression sub that turns it into something like this:

" concat var1, literal "

Essentially, || should be replaced with a comma, and the first element should be prefixed with "concat". A given input may contain multiple such occurrences, so I need to replace all of them.

Here is where I am stuck. I can build the regex pattern, but I am not sure what to substitute it with.

re.sub(r'\s{1}[a-zA-Z0-9_]+\s*\|\|\s*[a-zA-Z0-9_]+\s*', '??????', input_string)

I am not sure if this can be done in a single Python statement.

As an alternative, I could loop through the string, find each occurrence, and replace it individually without using regular expressions.

Thanks in advance. Radha

You can handle this requirement using re.sub with a callback function:

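import re

sql = "select var1 || literal || var2 from yourTable"

def to_concat(expr):
    # expr is the matched text, e.g. "var1 || literal || var2"
    return "concat(" + re.sub(r'\s*\|\|', ',', expr) + ")"

# Match a whole chain of operands joined by ||, then rewrite it in the callback
sql_out = re.sub(r'\S+(?:\s+\|\|\s+\S+)+', lambda x: to_concat(x.group()), sql)
print(sql + "\n" + sql_out)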

This prints:

select var1 || literal || var2 from yourTable
select concat(var1, literal, var2) from yourTable

The idea here is to first match the entire expression involving the ANSI || concatenation operator. We then pass that match to a callback function, which replaces every || with a comma and wraps the result in a call to concat.
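Since the question mentions several occurrences in one input, here is a minimal sketch of the same callback idea wrapped in a reusable function. The names CHAIN and rewrite_concat are made up for illustration, and the callback splits on || and joins with ", " rather than doing a nested re.sub; treat it as one possible arrangement, not the only way to write it.

```python
import re

# A chain of whitespace-free operands joined by " || " (same pattern as above)
CHAIN = re.compile(r'\S+(?:\s+\|\|\s+\S+)+')

def rewrite_concat(text):
    """Rewrite every `a || b || ...` chain in text as concat(a, b, ...)."""
    def to_concat(match):
        # Split the matched chain into its operands, dropping the || separators
        operands = re.split(r'\s*\|\|\s*', match.group())
        return "concat(" + ", ".join(operands) + ")"
    # re.sub calls to_concat once per non-overlapping match
    return CHAIN.sub(to_concat, text)

print(rewrite_concat("select a || b from t where c || d || e = 'x'"))
```

Because operands are taken to be runs of non-whitespace, punctuation attached to an operand (a trailing comma, for example) ends up inside the concat call, so a stricter operand pattern may be needed for real SQL.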

With the Python re module, you can do this in a single re.sub call by wrapping terms of the pattern in parentheses (capture groups) and then referring to them in the replacement string as \1, \2, etc., in the order the groups appear.

re.sub(r'\s([a-zA-Z0-9_]+)\s*\|\|\s*([a-zA-Z0-9_]+)', r' concat \1, \2', input_string)
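To make the backreference approach concrete, here is a small sketch that can be run as-is. PAIR and concat_pairs are hypothetical names, and the pattern deliberately leaves the surrounding whitespace out of the match so that it is preserved in the output; that is an assumption about what the asker wants, based on the sample strings in the question.

```python
import re

# Two capture groups: the operand before || and the operand after it
PAIR = re.compile(r'([a-zA-Z0-9_]+)\s*\|\|\s*([a-zA-Z0-9_]+)')

def concat_pairs(text):
    # \1 and \2 in the replacement refer to the first and second groups
    return PAIR.sub(r'concat \1, \2', text)

print(repr(concat_pairs(" var1 || literal ")))
print(repr(concat_pairs("a || b and c||d")))
```

This handles each pair independently, so it fits the two-operand case in the question. A chain such as "a || b || c" only has its first pair rewritten (giving "concat a, b || c"), which is where the callback approach is the better fit.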
