Extract the strings before and after a regex match in Python

Question

I want to use a regular expression in Python to extract the strings before and after a relational operator (>, <, >=, <=, !=, ==).

Input:

Find me products where sales >= 200000 and profit > 20% by country

Output:

[[sales,>=,200000],[profit,>,20%]]

I am able to get the string before the operator, along with the operator itself, using:

\w+(?=\s+([<>]=?|[!=]=))

How can I get these strings together in the same list? Any help is greatly appreciated.

Answer 1

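Although pyOliv's answer already gives the desired output, your use of a positive lookahead made me wonder whether a positive lookbehind is also worth exploring. It can make recognizing the pattern that follows a relational operator more flexible, for example if you do not know in advance how many relational operators to expect. The matching pattern would be: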

(?<=\s[<>!]=\s)[0-9,%]+|(?<=\s[<>=]\s)[0-9,%]+

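The drawback of a lookbehind is that, in Python's re module, it has to know the length of the pattern it matches in advance, so "+", "*", or a "|" between alternatives of different lengths will not work inside it. This leads to the slightly more cumbersome version above, with one lookbehind for operators of length 2 and one for operators of length 1.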
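To illustrate, here is a minimal sketch that applies this pattern to the sample sentence from the question and then shows the fixed-width restriction. The variable names are chosen only for this example.

```python
import re

text = "Find me products where sales >= 200000 and profit > 20% by country"

# One lookbehind per operator length, because Python's re module only
# accepts lookbehinds that match a fixed number of characters.
value_after_operator = re.compile(
    r"(?<=\s[<>!]=\s)[0-9,%]+|(?<=\s[<>=]\s)[0-9,%]+"
)

values = value_after_operator.findall(text)
print(values)  # ['200000', '20%']

# A lookbehind whose length can vary is rejected when the pattern is compiled.
variable_width_rejected = False
try:
    re.compile(r"(?<=\s[<>!=]{1,2}\s)[0-9,%]+")
except re.error as exc:
    variable_width_rejected = True
    print("re.error:", exc)
```

This only returns the values after the operators; the names before them would still come from the lookahead pattern in the question.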

Answer 2

You need to provide more details about the strings you are looking at. Based on your example:

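import re

txt = 'sales >= 200,000 and profit > 20%'
# Six capture groups: name, operator, value for each of the two comparisons
match = re.match(r"(.*) ([<>=!]{1,2}) (.*) .* (.*) ([<>=!]{1,2}) (.*)", txt)
for i in range(1, 7):  # groups are numbered 1 through 6
    print(match.group(i))

Output:

sales
>=
200,000
profit
>
20%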

Edit: for the more general case, here is a function that returns the output in the form you need:

import re

def split_txt(txt):
    # Find every "word operator word" chunk in the text
    lst = re.findall(r"\w+ [<>=!]{1,2} \w+", txt)
    out = []
    for sub_list in lst:
        # Split each chunk into its name, operator, and value
        match = re.match(r"(\w+) ([<>=!]{1,2}) (\w+)", sub_list)
        out.append([match.group(1), match.group(2), match.group(3)])
    return out


txt = 'bbl sales >= 200,000 and profit > 20% another text id != 25'
a = split_txt(txt)
print(a)

out: [['sales', '>=', '200'], ['profit', '>', '20'], ['id', '!=', '25']]
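Note that `\w+` stops at the comma and at the percent sign, which is why the values come back as '200' and '20' rather than '200,000' and '20%'. A possible variant, sketched here under the assumption that values consist only of word characters, commas, periods, and percent signs, uses three capture groups in a single `re.findall` call. The name `extract_comparisons` and the value character class are illustrative choices, not part of the original answer.

```python
import re

# Name, operator, value. The operator alternation is the one from the
# question; the value class [\w,.%]+ is an assumption about what values look like.
COMPARISON = re.compile(r"(\w+)\s*([<>]=?|[!=]=)\s*([\w,.%]+)")

def extract_comparisons(text):
    """Return [name, operator, value] for each comparison found in text."""
    return [list(groups) for groups in COMPARISON.findall(text)]

txt = "bbl sales >= 200,000 and profit > 20% another text id != 25"
result = extract_comparisons(txt)
print(result)
# [['sales', '>=', '200,000'], ['profit', '>', '20%'], ['id', '!=', '25']]
```

Because the value class accepts commas and periods, a value followed directly by punctuation (for example "200000," at the end of a clause) would keep that trailing character, so the class may need tightening for other inputs.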

Solution 1: 1, 2020-05-20 08:59:55

Solution 2: 0, accepted, 2020-05-20 08:25:49
