Get all occurrences from a string based on a find pattern in Python

Suppose I have strings like these:

exp = 'CASE WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'  WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\' ELSE  \'NO\' END' 
exp2 = 'CASE WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE")   END '

I want to return every occurrence of WHEN ... THEN, together with its text.

This is the expected output for the first string (exp):

['WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'','WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\'']

This is the expected output for exp2:

['WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")']

This is what I tried:

res = re.findall(r'\s*(WHEN|When|when)+\s*(.*)\s*(THEN|Then|then)+\s*', exp)

But in my case the resulting list shows this output:

['(WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'  WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN)']
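
One likely reason the two branches come back merged is that `(.*)` is greedy: it runs to the end of the string and backtracks only as far as the last THEN. The sketch below shows the difference between a greedy and a lazy quantifier on a shortened, hypothetical input (the `sample` string and the simplified patterns are assumptions for illustration, not the original expression).

```python
import re

# Hypothetical, shortened stand-in for the CASE expression above.
sample = "CASE WHEN a=1 THEN 'x'  WHEN a=2 THEN 'y' ELSE 'z' END"

# Greedy: (.*) grabs as much as possible, so it spans both WHEN clauses
# and stops only before the last THEN.
greedy = re.findall(r'WHEN\s+(.*)\s+THEN', sample)

# Lazy: (.*?) stops at the first THEN after each WHEN.
lazy = re.findall(r'WHEN\s+(.*?)\s+THEN', sample)

print(greedy)  # ["a=1 THEN 'x'  WHEN a=2"]
print(lazy)    # ['a=1', 'a=2']
```

A lazy quantifier alone still does not capture the text after THEN, which is why a lookahead-based pattern is needed.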

Try:

WHEN (?:(?! +(?:WHEN|ELSE)).)* # with flags=re.I
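  1. WHEN - matches 'WHEN'.
  2. (?:(?! +(?:WHEN|ELSE)).)* - a negative lookahead asserts that the current position is not followed by one or more spaces and then 'WHEN' or 'ELSE'; as long as that holds, one more character is matched. The trailing * repeats this zero or more times.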
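
The two parts described above can be laid out piece by piece with re.VERBOSE. This is only a sketch of the same pattern: the name `PATTERN` and the shortened `sample` string are assumptions for illustration. In verbose mode a literal space has to be written as `[ ]`, because plain whitespace in the pattern is ignored.

```python
import re

# Same pattern as above, spread out so each piece carries a comment.
PATTERN = re.compile(
    r"""
    WHEN[ ]                      # the keyword followed by one space
    (?:                          # repeat, one character at a time:
        (?![ ]+(?:WHEN|ELSE))    #   stop if spaces then WHEN/ELSE begin here
        .                        #   otherwise consume one character
    )*
    """,
    flags=re.I | re.X,
)

# Hypothetical, shortened stand-in for the CASE expressions in the question.
sample = "CASE WHEN a=1 THEN 'x'  WHEN a=2 THEN 'y' ELSE 'z' END"

matches = PATTERN.findall(sample)
print(matches)  # ["WHEN a=1 THEN 'x'", "WHEN a=2 THEN 'y'"]
```

Because the lookahead rejects the spaces that precede the next WHEN or ELSE, each match ends without trailing whitespace.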

See the regex demo.

import re

cases = [
    'CASE WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'  WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\' ELSE  \'NO\' END',
    'CASE WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE")   END '
]

for case in cases:
    res = re.findall(r'WHEN (?:(?! +(?:WHEN|ELSE)).)*', case, flags=re.I)
    print(res)

Prints:

['WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'', 'WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\'']
['WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")']

Update

If you want to capture the WHEN and THEN parts as separate groups (with leading and trailing spaces removed), use the following regex:

WHEN +(.*?) +THEN +((?:(?! +(?:WHEN|ELSE)).)*)

import re

cases = [
    'CASE WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'  WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\' ELSE  \'NO\' END',
    'CASE WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE")   END '
]

for case in cases:
    results = re.findall(r'WHEN +(.*?) +THEN +((?:(?! +(?:WHEN|ELSE)).)*)', case, flags=re.I)
    for result in results:
        print(result[0], result[1])

Prints:

"Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"='CPU' 'YES'
"Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"='RAM' 'YES'
("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")
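
If the pairs are needed as data rather than printed, the grouped regex can be wrapped in a small helper. The following is a minimal sketch: the function name `split_when_then`, the group names `condition` and `result`, and the shortened `sample` string are all hypothetical and not part of the original answer.

```python
import re

# The grouped regex from above, with named groups added for readability.
WHEN_THEN = re.compile(
    r'WHEN +(?P<condition>.*?) +THEN +(?P<result>(?:(?! +(?:WHEN|ELSE)).)*)',
    flags=re.I,
)

def split_when_then(expression):
    """Return one (condition, result) tuple per WHEN ... THEN branch."""
    return [
        (m.group('condition'), m.group('result'))
        for m in WHEN_THEN.finditer(expression)
    ]

# Hypothetical, shortened stand-in for the CASE expressions above.
sample = "CASE WHEN a=1 THEN 'x'  WHEN a=2 THEN 'y' ELSE 'z' END"

pairs = split_when_then(sample)
for condition, result in pairs:
    print(condition, '->', result)
```

Compiling the pattern once keeps the helper cheap to call in a loop over many expressions. Note that this is plain text matching, so a WHEN or ELSE that appears inside a quoted SQL literal would also end a branch.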

