Get all occurrences from a string based on a find pattern in Python

Suppose I have strings like these:

exp = 'CASE WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'  WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\' ELSE  \'NO\' END' 
exp2 = 'CASE WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE")   END '

I want to return every WHEN ... THEN occurrence, together with the text that belongs to it.

This is the expected output for exp:

['WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'','WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\'']

This is the expected output for exp2:

['WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")']

This is what I have tried:

res = re.findall(r'\s*(WHEN|When|when)+\s*(.*)\s*(THEN|Then|then)+\s*', exp)

But in my case the resulting list looks like this:

['(WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'  WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN)']

Try:

WHEN (?:(?! +(?:WHEN|ELSE)).)* # with flags=re.I
  1. WHEN - Matches 'WHEN ' (the keyword followed by a single space).
  2. (?:(?! +(?:WHEN|ELSE)).)* - Uses a negative lookahead: as long as the current position is not followed by one or more spaces and then 'WHEN' or 'ELSE', match one more character. This repeats zero or more times, so the match stops just before the spaces that precede the next 'WHEN' or 'ELSE'.
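The two steps above can be spelled out as a commented pattern. This is only a sketch: it uses re.X (verbose mode) so each piece can carry a comment, writes literal spaces as [ ] because verbose mode ignores bare whitespace, and runs on a shortened, hypothetical input rather than the strings from the question. The name WHEN_CLAUSE is made up for illustration.

```python
import re

# Verbose-mode spelling of the same pattern; the comments mirror the two steps above.
WHEN_CLAUSE = re.compile(
    r"""
    WHEN[ ]                     # the literal keyword followed by one space
    (?:                         # repeat, one character at a time:
        (?![ ]+(?:WHEN|ELSE))   #   stop if spaces and then WHEN or ELSE start here
        .                       #   otherwise consume one character
    )*
    """,
    flags=re.I | re.X,
)

# Hypothetical, shortened input used only for illustration.
sample = "CASE WHEN a=1 THEN 'x'  WHEN a=2 THEN 'y' ELSE 'z' END"
clauses = WHEN_CLAUSE.findall(sample)
print(clauses)  # ["WHEN a=1 THEN 'x'", "WHEN a=2 THEN 'y'"]
```

Because the lookahead is checked before every character, each match ends right before the spaces that lead into the next WHEN or ELSE, so no trailing whitespace is captured.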

See Regex Demo

import re

cases = [
    'CASE WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'  WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\' ELSE  \'NO\' END',
    'CASE WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE")   END '
]

for case in cases:
    res = re.findall(r'WHEN (?:(?! +(?:WHEN|ELSE)).)*', case, flags=re.I)
    print(res)

Prints:

['WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'', 'WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\'']
['WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")']

Update

If you want to capture the WHEN and THEN parts as separate groups (stripped of leading and trailing spaces), then use the following regex:

WHEN +(.*?) +THEN +((?:(?! +(?:WHEN|ELSE)).)*)

import re

cases = [
    'CASE WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'CPU\'  THEN   \'YES\'  WHEN  "Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"=\'RAM\'  THEN   \'YES\' ELSE  \'NO\' END',
    'CASE WHEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0  THEN  ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE") ELSE ("Expressions"."ORDER_ITEMS"."QUANTITY"+ "Expressions"."ORDER_ITEMS"."UNIT_PRICE")   END '
]

for case in cases:
    results = re.findall(r'WHEN +(.*?) +THEN +((?:(?! +(?:WHEN|ELSE)).)*)', case, flags=re.I)
    for result in results:
        print(result[0], result[1])

Prints:

"Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"='CPU' 'YES'
"Expressions"."PRODUCT_CATEGORIES"."CATEGORY_NAME"='RAM' 'YES'
("Expressions"."ORDER_ITEMS"."QUANTITY"*"Expressions"."ORDER_ITEMS"."UNIT_PRICE")>0 ("Expressions"."ORDER_ITEMS"."QUANTITY"* "Expressions"."ORDER_ITEMS"."UNIT_PRICE")
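If the pairs are needed as data rather than printed, the same regex can be wrapped in a small helper. The following is a minimal sketch, not part of the original answer: the named groups condition and result, the constant BRANCH, and the function split_case_branches are all hypothetical names introduced here, and the input is a shortened made-up expression.

```python
import re

# Same regex as in the update, with named groups added for readability
# (the group names are an assumption, not part of the original answer).
BRANCH = re.compile(
    r"WHEN +(?P<condition>.*?) +THEN +(?P<result>(?:(?! +(?:WHEN|ELSE)).)*)",
    flags=re.I,
)


def split_case_branches(expression):
    """Return a list of (condition, result) tuples, one per WHEN ... THEN branch."""
    return [
        (m.group("condition"), m.group("result"))
        for m in BRANCH.finditer(expression)
    ]


# Hypothetical, shortened input used only for illustration.
sample = "CASE WHEN a=1 THEN 'x'  WHEN a=2 THEN 'y' ELSE 'z' END"
branches = split_case_branches(sample)
print(branches)  # [('a=1', "'x'"), ('a=2', "'y'")]
```

Note that the pattern only stops at WHEN or ELSE. If the last branch is followed directly by END with no ELSE, the trailing END ends up inside the captured result, so such input would need END added to the lookahead alternatives.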
