Get all occurrences from a string based on a find pattern in Python
Suppose I have strings like these:
exp = 'CASE WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\' WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\' ELSE \'NO\' END'
exp2 = 'CASE WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE") END '
I want to return every occurrence of WHEN ... THEN together with its text.
This is the expected output for exp:
['WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\'','WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\'']
This is the expected output for exp2:
['WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")']
This is what I tried:
res = re.findall(r'\s*(WHEN|When|when)+\s*(.*)\s*(THEN|Then|then)+\s*', exp)
But in my case the resulting list shows this output:
['(WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\' WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN)']
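The over-match is most likely caused by the greedy `.*`, which runs to the end of the string and then backtracks only as far as the last THEN. The sketch below illustrates this on a shortened stand-in string (`sample` is an assumption made for brevity, not the original expression) and compares it with the lazy `.*?`:

```python
import re

# Shortened stand-in for the question's CASE expression (assumed for brevity).
sample = "CASE WHEN a=1 THEN 'x' WHEN a=2 THEN 'y' ELSE 'z' END"

# Greedy .* consumes to the end of the string, then backtracks to the LAST "THEN".
greedy = re.findall(r'WHEN\s+(.*)\s+THEN', sample, flags=re.I)

# Lazy .*? stops at the FIRST "THEN" that follows each "WHEN".
lazy = re.findall(r'WHEN\s+(.*?)\s+THEN', sample, flags=re.I)

print(greedy)  # ["a=1 THEN 'x' WHEN a=2"]
print(lazy)    # ['a=1', 'a=2']
```

The lazy version fixes the condition part, but it still does not capture the text after THEN, which needs a stopping rule of its own.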
Try:
WHEN (?:(?! +(?:WHEN|ELSE)).)* # with flags=re.I
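WHEN
- matches 'WHEN' followed by a single space.
(?:(?! +(?:WHEN|ELSE)).)*
- uses a negative lookahead: as long as the current position is not followed by one or more spaces and then 'WHEN' or 'ELSE', match one more character. The trailing * repeats this zero or more times.
import re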
cases = [
'CASE WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\' WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\' ELSE \'NO\' END',
'CASE WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE") END '
]
for case in cases:
    res = re.findall(r'WHEN (?:(?! +(?:WHEN|ELSE)).)*', case, flags=re.I)
    print(res)
Prints:
['WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\'', 'WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\'']
['WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")']
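For reuse, the same pattern could be compiled once and wrapped in a small function. This is only a sketch: the names `WHEN_CLAUSE` and `extract_when_clauses` and the shortened `sample` string are hypothetical, chosen for illustration.

```python
import re

# Same pattern as above, compiled once; re.I makes WHEN/ELSE case-insensitive.
WHEN_CLAUSE = re.compile(r'WHEN (?:(?! +(?:WHEN|ELSE)).)*', flags=re.I)

def extract_when_clauses(expression):
    """Return every 'WHEN ... THEN ...' clause found in expression."""
    return WHEN_CLAUSE.findall(expression)

# Mixed-case keywords, to show the effect of re.I (sample string is assumed).
sample = "case when a=1 then 'x' When a=2 Then 'y' else 'z' end"
clauses = extract_when_clauses(sample)
print(clauses)  # ["when a=1 then 'x'", "When a=2 Then 'y'"]
```

Note that the pattern stops only at a following WHEN or ELSE, so for a CASE expression without an ELSE branch the last clause would also include the trailing END.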
Update
If you want to capture the WHEN and THEN parts as separate groups (without leading and trailing spaces), use the following regex:
WHEN +(.*?) +THEN +((?:(?! +(?:WHEN|ELSE)).)*)
import re
cases = [
'CASE WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\' WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\' ELSE \'NO\' END',
'CASE WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE") END '
]
for case in cases:
    results = re.findall(r'WHEN +(.*?) +THEN +((?:(?! +(?:WHEN|ELSE)).)*)', case, flags=re.I)
    for result in results:
        print(result[0], result[1])
Prints:
"Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"='CPU' 'YES'
"Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"='RAM' 'YES'
("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")
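If indexing the tuples by position feels fragile, one possible variation is to give the two groups names and iterate with `re.finditer`. The group names `condition` and `result`, the name `PAIR`, and the shortened `sample` string are assumptions made for this sketch; the pattern is otherwise the one shown above.

```python
import re

# Same regex as above, with the two capture groups given (hypothetical) names.
PAIR = re.compile(
    r'WHEN +(?P<condition>.*?) +THEN +(?P<result>(?:(?! +(?:WHEN|ELSE)).)*)',
    flags=re.I,
)

# Shortened stand-in for the CASE expressions (assumed for brevity).
sample = "CASE WHEN a=1 THEN 'x' WHEN a=2 THEN 'y' ELSE 'z' END"

# Each match object exposes the groups by name instead of by index.
pairs = [(m.group('condition'), m.group('result')) for m in PAIR.finditer(sample)]
print(pairs)  # [('a=1', "'x'"), ('a=2', "'y'")]
```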