I want to extract the strings before and after a relational operator (>, <, >=, <=, !=, =) using a regex in Python.
input:
Find me products where sales >= 200000 and profit > 20% by country
output:
[[sales,>=,200000],[profit,>,20%]]
I am able to get the string before the operator and the operator using
\w+(?=\s+([<>]=?|[!=]=))
How do I get the string after the operator as well, in the same list? Any help is much appreciated.
While pyOliv's answer already gives the wanted output, your use of a positive lookahead made me wonder whether a positive lookbehind might also be worth looking into. It could make identifying the pattern after the relational operator more flexible, e.g. if you do not know how many occurrences of relational operators to expect. The matching pattern would be:
(?<=\s[<>!]=\s)[0-9,%]+|(?<=\s[<>=]\s)[0-9,%]+
A lookbehind in Python's re module has the disadvantage that it must match a string of fixed width, so quantifiers such as "+" or "*", or a "|" between alternatives of different lengths, will not work inside it. This leads to the slightly more cumbersome version above, where one lookbehind matches the two-character operators and another matches the one-character operators.
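As a minimal sketch of how the lookbehind pattern behaves, the snippet below runs it with re.findall on the sentence from the question and then checks that a variable-width lookbehind is rejected at compile time. The names AFTER_OPERATOR, values and variable_width_ok are only illustrative and not part of the original answer.

```python
import re

# Pattern from the answer above: one fixed-width lookbehind per operator length.
AFTER_OPERATOR = re.compile(r"(?<=\s[<>!]=\s)[0-9,%]+|(?<=\s[<>=]\s)[0-9,%]+")

text = "Find me products where sales >= 200000 and profit > 20% by country"

# findall returns every value that directly follows " <operator> ".
values = AFTER_OPERATOR.findall(text)
print(values)

# A lookbehind whose width can vary ({1,2}) fails when the pattern is compiled.
try:
    re.compile(r"(?<=\s[<>!=]{1,2}\s)[0-9,%]+")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```

This only returns the values after the operators; the names before them would still have to come from the lookahead pattern in the question.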
You need to give more details about the strings you are looking through. Based on your example:
import re
txt = 'sales >= 200,000 and profit > 20%'
match = re.match(r"(.*) ([<>=!]{1,2}) (.*) .* (.*) ([<>=!]{1,2}) (.*)", txt)
# The pattern has six capture groups, so iterate over groups 1 to 6.
for i in range(1, 7):
    print(match.group(i))
output:
sales
>=
200,000
profit
>
20%
EDIT: For a more general case, this function gives output in the format you need. Note that \w+ stops at "," and "%", so those characters are dropped from the values:
import re
def split_txt(txt):
    # Find every "name operator value" chunk in the text.
    lst = re.findall(r"\w+ [<>=!]{1,2} \w+", txt)
    out = []
    for sub_list in lst:
        # Split each chunk into its three parts.
        match = re.match(r"(\w+) ([<>=!]{1,2}) (\w+)", sub_list)
        out.append([match.group(1), match.group(2), match.group(3)])
    return out
txt = 'bbl sales >= 200,000 and profit > 20% another text id != 25'
a = split_txt(txt)
print(a)
out: [['sales', '>=', '200'], ['profit', '>', '20'], ['id', '!=', '25']]
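If the thousands separator and the percent sign should be kept in the values, one possible variation is a single findall with three capture groups. This is only a sketch: the helper name split_conditions, the TRIPLE pattern and the choice of characters allowed in a value ([\w.,%]) are assumptions, not part of the answer above, and a value followed directly by punctuation such as "200," would keep that punctuation.

```python
import re

# Hypothetical pattern: name, operator, value. Two-character operators are
# listed first so that ">=" is not split into ">" followed by "=".
TRIPLE = re.compile(r"(\w+)\s*(>=|<=|!=|==|>|<|=)\s*([\w.,%]+)")

def split_conditions(txt):
    # findall yields one (name, operator, value) tuple per match.
    return [list(groups) for groups in TRIPLE.findall(txt)]

txt = 'bbl sales >= 200,000 and profit > 20% another text id != 25'
print(split_conditions(txt))
```

With the sentence from the question, the same call should give [['sales', '>=', '200000'], ['profit', '>', '20%']].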