
Python/RegEx/findall: How do I extract a pattern from behind the match?

I looked into the lookbehind pattern (?<=...), but it doesn't seem to capture what it matches.
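A small sketch of why a lookbehind alone does not help here (the sample line is taken from the input below; the variable names are only illustrative). A lookbehind is a zero-width assertion, so the text it checks is not returned, and Python's re module additionally requires it to have a fixed width:

```python
import re

line = "aaaaaaGET(abc)aaaaaa"

# The lookbehind only asserts that "GET(" precedes the match;
# findall returns just the text matched after it.
after_get = re.findall(r'(?<=GET\()\w+', line)
print(after_get)

# A variable-length gap between GET(...) and MATCH cannot be placed
# inside a lookbehind: the re module rejects the pattern at compile time.
try:
    re.compile(r'(?<=GET\(\w+\)a*)MATCH')
    fixed_width_error = None
except re.error as exc:
    fixed_width_error = str(exc)
print(fixed_width_error)
```

This is why capturing the GET value with ordinary groups placed before MATCH in the pattern is usually the simpler route.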

Input:

aaaaaaGET(abc)aaaaaa
aaaaaaaaaaaaa
aaaaaMATCH(00)aaaaaaa
aaaaaaaaaaaaa
aaaaGEX(xyz)aaaaaa
aaaaaaGET(notneed)aaaaaa
aaaaaaGEX(no)aaaaaa
aaaaaaGET(nope)aaaaaa
aaaaaaGET(AbC)aaaaaa
aaaaaaaaaaaaa
aaaaaaaaaaaaa
aaaaaMATCH(01)aaaaaaa
aaaaaaaaaaaaa
aaaaGEX(XYz)aaaaaa

Output:

[(abc, 00, xyz), (AbC, 01, XYz)]

I want to use re.findall to find all the MATCH parts, together with the values in the GET above each match and the GEX below it, but I can't figure out how to capture anything that comes before the match.

If all the related values came after MATCH, I'd use something like:

re.findall(r'MATCH\((\d*)\).*?GEX\(([A-Za-z]*)\)', text, re.DOTALL)

But I'm not sure how to go back and get the GET value.
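For reference, a runnable version of the forward-only attempt, using the sample input above (the name text is assumed to hold that input). It returns the MATCH and GEX values but has no group for GET:

```python
import re

text = """aaaaaaGET(abc)aaaaaa
aaaaaaaaaaaaa
aaaaaMATCH(00)aaaaaaa
aaaaaaaaaaaaa
aaaaGEX(xyz)aaaaaa
aaaaaaGET(notneed)aaaaaa
aaaaaaGEX(no)aaaaaa
aaaaaaGET(nope)aaaaaa
aaaaaaGET(AbC)aaaaaa
aaaaaaaaaaaaa
aaaaaaaaaaaaa
aaaaaMATCH(01)aaaaaaa
aaaaaaaaaaaaa
aaaaGEX(XYz)aaaaaa"""

# re.DOTALL lets "." cross newlines, so ".*?" can reach the GEX line.
forward = re.findall(r'MATCH\((\d*)\).*?GEX\(([A-Za-z]*)\)', text, re.DOTALL)
print(forward)  # [('00', 'xyz'), ('01', 'XYz')]
```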

I think you want something like this:

>>> import re
>>> s = """aaaaaaGET(abc)aaaaaa
... aaaaaaaaaaaaa
... aaaaaMATCH(00)aaaaaaa
... aaaaaaaaaaaaa
... aaaaGEX(xyz)aaaaaa
... aaaaaaGET(notneed)aaaaaa
... aaaaaaGEX(no)aaaaaa
... aaaaaaGET(nope)aaaaaa
... aaaaaaGET(AbC)aaaaaa
... aaaaaaaaaaaaa
... aaaaaaaaaaaaa
... aaaaaMATCH(01)aaaaaaa
... aaaaaaaaaaaaa
... aaaaGEX(XYz)aaaaaa"""
>>> m = re.findall(r'GET.*?\(([^)]*)\)(?:(?!GET|GEX).)*?\(([^)]*)\)(?:(?!GET|GEX).)*?GEX\(([^)]*)\)', s, re.DOTALL)
>>> m
[('abc', '00', 'xyz'), ('AbC', '01', 'XYz')]

(?:(?!GET|GEX).)* uses a negative lookahead to check, at each position, that the next three characters are not GET or GEX; only if neither is present does it consume the next character. This keeps the stretch between two captures from running past another GET or GEX.
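To see what the lookahead buys, here is a sketch comparing a plain lazy .*? with the guarded version on the same input. The patterns below are a variant of the one above that names MATCH explicitly; that spelling is an assumption made for readability, not the answer's exact pattern.

```python
import re

s = """aaaaaaGET(abc)aaaaaa
aaaaaaaaaaaaa
aaaaaMATCH(00)aaaaaaa
aaaaaaaaaaaaa
aaaaGEX(xyz)aaaaaa
aaaaaaGET(notneed)aaaaaa
aaaaaaGEX(no)aaaaaa
aaaaaaGET(nope)aaaaaa
aaaaaaGET(AbC)aaaaaa
aaaaaaaaaaaaa
aaaaaaaaaaaaa
aaaaaMATCH(01)aaaaaaa
aaaaaaaaaaaaa
aaaaGEX(XYz)aaaaaa"""

# Plain lazy gaps: ".*?" is free to skip over other GET/GEX entries,
# so the second result pairs MATCH(01) with the wrong GET.
naive = re.findall(
    r'GET\(([^)]*)\).*?MATCH\(([^)]*)\).*?GEX\(([^)]*)\)', s, re.DOTALL)
print(naive)     # [('abc', '00', 'xyz'), ('notneed', '01', 'XYz')]

# Guarded gaps: each character is consumed only if GET or GEX does not
# start at that position, so only the nearest GET before MATCH can match.
gap = r'(?:(?!GET|GEX).)*?'
tempered = re.findall(
    r'GET\(([^)]*)\)' + gap + r'MATCH\(([^)]*)\)' + gap + r'GEX\(([^)]*)\)',
    s, re.DOTALL)
print(tempered)  # [('abc', '00', 'xyz'), ('AbC', '01', 'XYz')]
```

The attempts starting at GET(notneed) and GET(nope) fail under the guarded pattern because the gap would have to cross GEX(no) or GET(AbC) before reaching MATCH.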

