
Python Conditional Regex

My program is given an object with parameters, and I need to get the parameters' values.

The object my program is given will look like this:

Object = """{{objectName|
parameter1=random text|
parameter2=that may or may not|
parameter3=contain any letter (well, almost)|
parameter4=this is some [[problem|problematic text]], Houston, we have a problem!|
otherParameters=(order of parameters is random, but their name is fixed)}}"""

(Any of the parameters might or might not be present.)

I am trying to get the parameters' values.

For the first three parameters it's pretty easy; a simple regex will find the value:

if "parameter1" in Object:
    parameter1 = re.split(r"parameter1=(.*?)[\|\}]", Object)[1]

if "parameter2" in Object:
    parameter2 = re.split(r"parameter2=(.*?)[\|\}]", Object)[1]

and so on.

The problem is with parameter4: the above regex ( parameter4=(.*?)[\|\}] ) only returns this is some [[problem , since the lazy match stops at the first vertical bar.
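To make the failure concrete, here is a minimal sketch that runs the lazy pattern against the sample from the question. The variable name `text` is an assumption (the question calls it `Object`), and `re.search` is used instead of `re.split` purely for readability.

```python
import re

# Sample input copied from the question; `text` is an assumed variable name.
text = """{{objectName|
parameter1=random text|
parameter2=that may or may not|
parameter3=contain any letter (well, almost)|
parameter4=this is some [[problem|problematic text]], Houston, we have a problem!|
otherParameters=(order of parameters is random, but their name is fixed)}}"""

# Lazy match: ".*?" grows one character at a time and stops as soon as the
# next character is "|" or "}", which here is the bar inside "[[...]]".
lazy = re.search(r"parameter4=(.*?)[|}]", text)
print(lazy.group(1))  # this is some [[problem
```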

Now here is the thing: a vertical bar will only appear in the text as part of a "[[]]" group.

For example, parameter1=a[[b|c]]d might appear, but parameter1=a|bc| will never appear.

I need a regex that stops at a vertical bar unless the bar is inside double square brackets. So, for parameter4, I would get this is some [[problem|problematic text]], Houston, we have a problem!

It worked here when I removed the "?":

parameter4 = re.split(r"parameter4=(.*)[\|\}]", object_)[1]

I also renamed the variable to "object_", because "object" is a built-in type in Python.
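A short sketch of why dropping the "?" helps, and where it probably stops helping. Without the "?", ".*" is greedy, and since "." does not match a newline by default, it runs to the last "|" or "}" on the same line. The two sample strings below are made up for illustration; they are not from the question.

```python
import re

# Greedy version of the pattern, as suggested in this answer.
pattern = re.compile(r"parameter4=(.*)[|}]")

# Hypothetical inputs: the same parameters, one per line and all on one line.
multi_line = "{{obj|\nparameter4=a[[b|c]]d|\nparameter5=e}}"
one_line = "{{obj|parameter4=a[[b|c]]d|parameter5=e}}"

# One parameter per line: the greedy match ends at the line's final "|".
print(pattern.search(multi_line).group(1))  # a[[b|c]]d

# Everything on one line: the greedy match swallows the following parameter.
print(pattern.search(one_line).group(1))  # a[[b|c]]d|parameter5=e}
```

So this approach relies on each parameter sitting on its own line, as in the question's sample.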

Best.

Apparently, there is no perfect solution.

For readers who find this question in the future, the closest solution is, as pointed out by Wiktor Stribiżew in the comments, parameter4=([^[}|]*(?:\[\[.*?]][^[}|]*)*) .

This regex only works if the parameter text contains no single [ , } or | characters; it may contain [[...]] substrings.

If you want to understand this regex better, have a look here: https://regex101.com/r/bWVvKg/2
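As a rough sketch, the same value pattern can be reused for any parameter name. The helper `get_parameter` and the constant `VALUE` are hypothetical names introduced here for illustration, and the sketch assumes a parameter name never appears, followed by "=", inside another parameter's value.

```python
import re

# Value pattern quoted from the answer: runs of characters other than "[",
# "}" and "|", optionally interleaved with "[[...]]" groups.
VALUE = r"([^[}|]*(?:\[\[.*?]][^[}|]*)*)"


def get_parameter(name, text):
    """Return the value of `name` in `text`, or None if it is absent.

    Hypothetical helper, not part of the original answer.
    """
    match = re.search(re.escape(name) + "=" + VALUE, text)
    return match.group(1) if match else None


object_ = """{{objectName|
parameter1=random text|
parameter2=that may or may not|
parameter3=contain any letter (well, almost)|
parameter4=this is some [[problem|problematic text]], Houston, we have a problem!|
otherParameters=(order of parameters is random, but their name is fixed)}}"""

# "parameter9" is deliberately missing to show the None case.
for name in ("parameter1", "parameter4", "otherParameters", "parameter9"):
    print(name, "->", get_parameter(name, object_))
```

Returning None for a missing parameter replaces the separate `if "parameterN" in Object:` check from the question.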
