
How can I use a regex to match the substring that follows given words (not included) up to the end of the string, while staying non-greedy?

I want to find the substring that starts right after one of the words (\d月|\d日) (the words themselves are not included in the result) and runs to the end of the string, while keeping that substring as short as possible (non-greedy). For example:

str1 = "秋天9月9日长江工程完成"
res1 = re.search(r'(\d月|\d日).*', str1).group() #return 9月9日长江工程完成

I want the result to be 长江工程完成. As another example:

str2 = "秋天9月9日9日长江工程完成"

This should give the same result as the previous one.

So I tried the following patterns, but each returns an unexpected result. Any suggestions?

res1 = re.search(r'(?:(?!\d月|\d日))(?:\d月|\d日)', str1).group() #return 9月
res1 = re.search(r'(?:\d月|\d日)((?:(?!\d月|\d日).)*?)', str1).group()  #return 9月

If you want to capture the rest of the string, wrap .* in a capturing group.

To match one or more repetitions of the same pattern, use the + quantifier.

import re

content = "9月9日9月长江工程完成"
match = re.match(r'(?:\d月|\d日)+(.*)', content)
print(match[1])

Output:

长江工程完成
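The sample strings in the question begin with 秋天 rather than with the date words, so the snippet above would need re.search instead of re.match to find them. A minimal sketch of that variation follows; the helper name text_after_date_words is made up for illustration and is not part of the original answer. It also shows why .group() in the question returned the date words: .group() is the whole match, while .group(1) is only the captured remainder.

```python
import re

# Same pattern as in the answer: a run of date words, then capture the rest.
PATTERN = re.compile(r'(?:\d月|\d日)+(.*)')

def text_after_date_words(text):
    """Hypothetical helper: text after the first run of date words, or None."""
    # search() scans forward, so the date words need not sit at index 0.
    m = PATTERN.search(text)
    return m.group(1) if m else None

str1 = "秋天9月9日长江工程完成"
str2 = "秋天9月9日9日长江工程完成"

print(PATTERN.search(str1).group())   # whole match: 9月9日长江工程完成
print(text_after_date_words(str1))    # captured rest: 长江工程完成
print(text_after_date_words(str2))    # 长江工程完成
```

If the string contains no date word at all, the helper returns None instead of raising AttributeError on .group().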

(?:(?!\d月|\d日))(?:\d月|\d日)

This pattern matches only the leading words, because the rest of the string is not captured in a group. (Also, it only allows for exactly two occurrences.)


(?:\d月|\d日)((?:(?!\d月|\d日).)*?)

This pattern only matches strings that look like this: 9月4日a6日b0月x, which is probably not what you need.
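One way to read "shortest" in the question is "the text after the last date word". Under that assumption, the second attempt can be kept almost as is. A lazy quantifier with nothing after it is satisfied by zero characters, which is why only 9月 came back. Anchoring the group with $ forces it to run to the end of the string, and the negative lookahead keeps any further date word out of it. The sketch below illustrates this; the function name tail_after_last_date_word is hypothetical.

```python
import re

DATE_WORD = r'(?:\d月|\d日)'

# Unanchored and lazy: the group stops as early as possible, i.e. at zero characters.
lazy = re.search(DATE_WORD + r'((?:(?!\d月|\d日).)*?)', "秋天9月9日长江工程完成")
print(repr(lazy.group(1)))  # ''

# Anchored with $: the group must reach the end of the string, and the
# lookahead forbids another date word inside it.
TAIL = re.compile(DATE_WORD + r'((?:(?!\d月|\d日).)*)$')

def tail_after_last_date_word(text):
    """Hypothetical helper: text after the last date word, or None."""
    m = TAIL.search(text)
    return m.group(1) if m else None

print(tail_after_last_date_word("秋天9月9日长江工程完成"))     # 长江工程完成
print(tail_after_last_date_word("秋天9月9日9日长江工程完成"))  # 长江工程完成
print(tail_after_last_date_word("9月开工9日长江工程完成"))     # 长江工程完成
```

The last call is where this reading differs from (?:\d月|\d日)+(.*), which would return 开工9日长江工程完成 for the same input.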


P.S. Make sure you pick the right function from the re module: match, search, or fullmatch (see "What is the difference between re.search and re.match?"). You said that the whole string needs to start with the given words, so use match or fullmatch.
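As a rough illustration of how the three functions differ on this pattern, the sketch below runs each of them on two inputs: one of the question's sample strings (which begins with 秋天) and a variant that begins with the date words. The variable names are chosen here for the example only.

```python
import re

pattern = re.compile(r'(?:\d月|\d日)+(.*)')

prefixed = "秋天9月9日长江工程完成"  # date words are not at the start
bare = "9月9日长江工程完成"          # date words are at the start

# match() only tries at index 0; fullmatch() must also consume the whole string.
print(pattern.match(prefixed))            # None
print(pattern.fullmatch(prefixed))        # None
# search() scans forward until the pattern fits somewhere.
print(pattern.search(prefixed).group(1))  # 长江工程完成

print(pattern.match(bare).group(1))       # 长江工程完成
print(pattern.fullmatch(bare).group(1))   # 长江工程完成
```

Because (.*) runs to the end of the line, match and fullmatch agree on the second input; they would differ only for a pattern that can stop before the end of the string.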
