python regex match line containing numbers after string with digit at end
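I am using a regex to capture text from a file, but the string contains an erroneous trailing digit. I have managed to leave that digit out of the capture, but when I then try to capture the next line, the pattern returns only the string and not the next line. When there is no erroneous trailing digit, I am able to capture it.
I have tried many combinations of regex without success.
Text: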
sentences
company_name: company, ltd6
numbers 99 and letters 99 (I want to match anything here and nothing after)
numbers 99 and letters 99 (I don't want to match anything here or after)
Code that captures the string successfully, but with the digit:
company_name = re.findall(r"company_name:\s(.*)\D.+", text)
Code that captures the string successfully without the digit:
company_name = re.findall(r"company_name:\s(.*)(?=.\D.+)", text)
Attempt to capture the following line:
next_line = re.findall(r"company_name:\s(.*)(?=.\D.+).*", text)
I expect this to capture the next line, but it does not.
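A minimal reproduction of the behaviour described above may help pin down the cause. It assumes the sample text has no blank line between the company_name line and the line after it, and the names `captured` and `whole_match` are illustrative only. Two things appear to be at play: re.findall returns only the contents of the capture group when the pattern has exactly one group, and `.` does not match a newline by default, so the trailing `.*` cannot reach the next line.

```python
import re

# Sample text from the question (assumed to have no blank lines).
text = (
    "sentences\n"
    "company_name: company, ltd6\n"
    "numbers 99 and letters 99 (I want to match anything here and nothing after)\n"
    "numbers 99 and letters 99 (I don't want to match anything here or after)\n"
)

pattern = r"company_name:\s(.*)(?=.\D.+).*"

# With exactly one capture group, findall returns only that group's text.
captured = re.findall(pattern, text)
print(captured)            # ['company, ltd']

# The full match still ends on the company_name line: `.` stops at "\n".
whole_match = re.search(pattern, text).group(0)
print(repr(whole_match))   # 'company_name: company, ltd6'
```

The lookahead is able to peek into the next line only because `\D` matches the newline character; nothing after the digit on the company_name line is actually consumed.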
This will get only the next line and ignore the lines after it:
next_line = re.sub(r".*company_name:[^\n]+\n*([^\n]+).*", r'\1', text, flags=re.S)
That is: numbers 99 and letters 99 (I want to match anything here and nothing after)
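As a quick check, the re.sub approach can be run against the sample text from the question. This is only a sketch: the `text` value is reconstructed from the question and the variable names are illustrative. With re.S the leading and trailing `.*` swallow everything around the captured line, so the whole input is replaced by group 1.

```python
import re

# Sample text reconstructed from the question.
text = (
    "sentences\n"
    "company_name: company, ltd6\n"
    "numbers 99 and letters 99 (I want to match anything here and nothing after)\n"
    "numbers 99 and letters 99 (I don't want to match anything here or after)\n"
)

pattern = r".*company_name:[^\n]+\n*([^\n]+).*"

# re.S lets `.` match newlines, so the match covers the entire text
# and the substitution leaves only the captured next line.
next_line = re.sub(pattern, r"\1", text, flags=re.S)
print(next_line)

# Caveat: if nothing matches, re.sub returns the input unchanged.
unchanged = re.sub(pattern, r"\1", "no company line here", flags=re.S)
print(unchanged)
```

Because an input without a match comes back unmodified, it may be safer to test for a match with re.search first when the input is not guaranteed to contain a company_name line.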
Based on your original expression, my guess is that an expression such as
.*company_name:\s*(.*\D)\s*(\w.*)
might work. It has two groups, (.*\D) and (\w.*), which capture the desired output.
Or maybe this one:
.*company_name:\s*(.*)\s*(\w.*)
import re

regex = r".*company_name:\s*(.*\D)\s*(\w.*)"
test_str = ("sentences\n"
            "company_name: company, ltd6\n\n"
            "numbers 99 and letters 99 (I want to match anything here)")

# Report every match, then each of its capture groups.
matches = re.finditer(regex, test_str, re.MULTILINE)
for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))
    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1
        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))
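One thing to watch with the `(.*\D)` pattern: on the sample string, group 1 appears to keep the trailing digit, because `\D` can match the newline itself. If the goal is the name without the digit plus the following line, a variation along these lines might help. It is a sketch, not part of the original answers: the helper name `name_and_next_line` and the pattern are assumptions, and it assumes the unwanted characters are only digits at the very end of the company_name line.

```python
import re

# Hypothetical pattern: lazy name, optional trailing digits, one or more
# newlines, then the rest of the following line.
PATTERN = re.compile(r"company_name:\s*(.*?)\d*\n+(.+)")

def name_and_next_line(text):
    """Return (name without trailing digits, next line) or None if absent."""
    match = PATTERN.search(text)
    return match.groups() if match else None

test_str = ("sentences\n"
            "company_name: company, ltd6\n\n"
            "numbers 99 and letters 99 (I want to match anything here)")

result = name_and_next_line(test_str)
print(result)
# ('company, ltd', 'numbers 99 and letters 99 (I want to match anything here)')
```

The lazy `(.*?)` is what allows `\d*` to claim the trailing digits; with a greedy `(.*)` the digits would stay inside the first group.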