Regex - extract substring with specific pattern
I have a large string as shown below:
99/34 12/34 This text is 22.67 22/23 33/34 Second text is like is 22.67 55/66 45/54 Third text is like is 32.27
and so on. I am trying to write a regular expression that extracts, from the large string, every substring of the form "two digits, slash, two digits, one whitespace, two digits, slash, two digits, any character repeated any number of times, one literal . and two digits".
The regex I tried is \\d{2}/\\d{2}\\s{1}.*\\.\\d{2}. But this returns a single string: "99/34 12/34 This text is 22.67 22/23 33/34 Second text is like is 22.67 55/66 45/54 Third text is like is 32.27". I would like to get this extracted as
99/34 12/34 This text is 22.67
22/23 33/34 Second text is like is 22.67
55/66 45/54 Third text is like is 32.27
How would I do this? I am using C# (.NET 4.5).
The problem lies in the greedy .* which will try to match as many characters as possible while still giving a match.
You can simply modify your regex thus:
\d{2}/\d{2}\s.*?\d{2}\.\d{2}
The ? after the * makes it non-greedy (lazy), so it consumes (eats) only as few characters as possible in order to find a match.
Note that I also changed \\s{1} to \\s. It already matches a single character, so qualifying it as exactly one does nothing but obfuscate the pattern.
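To see the difference between the greedy and lazy versions in C#, here is a minimal sketch using Regex.Matches on the sample string from the question. The class name Program and the variable names are only illustrative, and it assumes the input is on a single line (by default . does not match a newline). The patterns are written as verbatim strings (@"..."), so the backslashes do not need to be doubled.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Sample input taken from the question.
        string input = "99/34 12/34 This text is 22.67 22/23 33/34 Second text is like is 22.67 55/66 45/54 Third text is like is 32.27";

        // Greedy: .* runs to the last ".dd" in the string, so there is only one match.
        string greedy = @"\d{2}/\d{2}\s.*\.\d{2}";

        // Lazy: .*? stops at the first "dd.dd" it can reach.
        string lazy = @"\d{2}/\d{2}\s.*?\d{2}\.\d{2}";

        Console.WriteLine(Regex.Matches(input, greedy).Count); // prints 1

        // Prints the three substrings, one per line:
        // 99/34 12/34 This text is 22.67
        // 22/23 33/34 Second text is like is 22.67
        // 55/66 45/54 Third text is like is 32.27
        foreach (Match m in Regex.Matches(input, lazy))
        {
            Console.WriteLine(m.Value);
        }
    }
}
```

If the real input spans several lines, RegexOptions.Singleline may be needed so that . also matches newlines; whether that applies depends on the actual data.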