Get all occurrences from a string based on a find pattern in Python
Suppose I have strings like these:
exp = 'CASE WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\' WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\' ELSE \'NO\' END'
exp2 = 'CASE WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE") END '
I want to return every WHEN ... THEN occurrence together with its text.
This is the expected output for exp:
['WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\'','WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\'']
This is the expected output for exp2:
['WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")']
This is what I tried:
res = re.findall(r'\s*(WHEN|When|when)+\s*(.*)\s*(THEN|Then|then)+\s*', exp)
But in my case the result list shows this output:
['(WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\' WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN)']
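The output above runs across both clauses because `(.*)` is greedy: it consumes as much as it can and only backs off to the last THEN in the string. A minimal sketch of that behavior, using a shortened, made-up CASE expression (`text` is an illustrative stand-in, not one of the strings above):

```python
import re

# Shortened, hypothetical CASE expression used only for illustration.
text = "CASE WHEN a=1 THEN 'x' WHEN a=2 THEN 'y' ELSE 'z' END"

# Greedy: (.*) runs to the end of the string, then backtracks to the LAST ' THEN'.
greedy = re.findall(r'WHEN\s+(.*)\s+THEN', text, flags=re.I)

# Lazy: (.*?) stops at the FIRST ' THEN' after each WHEN.
lazy = re.findall(r'WHEN\s+(.*?)\s+THEN', text, flags=re.I)

print(greedy)  # ["a=1 THEN 'x' WHEN a=2"]
print(lazy)    # ['a=1', 'a=2']
```

The lazy version fixes the condition part, but the text after THEN still needs a stopping rule, which is what a negative lookahead provides.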
Try:
WHEN (?:(?! +(?:WHEN|ELSE)).)* # with flags=re.I
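WHEN - matches 'WHEN' followed by a space.
(?:(?! +(?:WHEN|ELSE)).)* - uses a negative lookahead to assert that the current position is not followed by one or more spaces and then 'WHEN' or 'ELSE'; as long as that holds, one more character is matched. This repeats zero or more times.
import re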
cases = [
'CASE WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\' WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\' ELSE \'NO\' END',
'CASE WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE") END '
]
for case in cases:
    res = re.findall(r'WHEN (?:(?! +(?:WHEN|ELSE)).)*', case, flags=re.I)
    print(res)
Prints:
['WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\'', 'WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\'']
['WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")']
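For readers who want the pieces of that pattern annotated in place, here is a sketch of the same regex written with `re.VERBOSE`. The name `WHEN_CLAUSE` and the shortened `sample` string are illustrative assumptions, not part of the original answer. In verbose mode unescaped whitespace is ignored, so literal spaces are written as `[ ]`.

```python
import re

# Same pattern as above, split across lines so each part can carry a comment.
WHEN_CLAUSE = re.compile(
    r"""
    WHEN[ ]                      # the keyword followed by one space
    (?:                          # repeat, one character at a time:
        (?![ ]+(?:WHEN|ELSE))    #   stop if spaces plus WHEN or ELSE begin here
        .                        #   otherwise consume one character
    )*
    """,
    flags=re.IGNORECASE | re.VERBOSE,
)

# Shortened, hypothetical CASE expression used only for illustration.
sample = "CASE WHEN a=1 THEN 'x' WHEN a=2 THEN 'y' ELSE 'z' END"
clauses = WHEN_CLAUSE.findall(sample)
print(clauses)  # ["WHEN a=1 THEN 'x'", "WHEN a=2 THEN 'y'"]
```

Note that this approach treats the text purely as characters: a WHEN or ELSE inside a quoted literal, or a nested CASE expression, would likely be split in the wrong place.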
Update
If you want to capture the WHEN and THEN parts as separate groups (with leading and trailing spaces removed), use the following regex:
WHEN +(.*?) +THEN +((?:(?! +(?:WHEN|ELSE)).)*)
import re
cases = [
'CASE WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\' THEN \'YES\' WHEN "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\' THEN \'YES\' ELSE \'NO\' END',
'CASE WHEN ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 THEN ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE") END '
]
for case in cases:
    results = re.findall(r'WHEN +(.*?) +THEN +((?:(?! +(?:WHEN|ELSE)).)*)', case, flags=re.I)
    for result in results:
        print(result[0], result[1])
Prints:
"Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"='CPU' 'YES'
"Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"='RAM' 'YES'
("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")
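If indexing tuples by position becomes hard to read, one option is to give the two groups names and collect the matches as dictionaries. This is only a sketch: the group names `condition` and `result`, the helper `when_then_pairs`, and the shortened `sample` string are all made up for illustration.

```python
import re

# The grouped regex from the update, with named groups instead of numbered ones.
PAIR = re.compile(
    r'WHEN +(?P<condition>.*?) +THEN +(?P<result>(?:(?! +(?:WHEN|ELSE)).)*)',
    flags=re.IGNORECASE,
)

def when_then_pairs(expression):
    """Return one {'condition': ..., 'result': ...} dict per WHEN clause."""
    return [match.groupdict() for match in PAIR.finditer(expression)]

# Shortened, hypothetical CASE expression used only for illustration.
sample = "CASE WHEN a=1 THEN 'x' WHEN a=2 THEN 'y' ELSE 'z' END"
pairs = when_then_pairs(sample)
for pair in pairs:
    print(pair['condition'], '->', pair['result'])
```

With the sample string this prints `a=1 -> 'x'` and `a=2 -> 'y'`; the ELSE branch is deliberately left out, as in the original regex.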