
Match only the nth occurrence using a regular expression

I have a string with 3 dates in it like this:

XXXXX_20160207_20180208_XXXXXXX_20190408T160742_xxxxx

I want to select the 2nd date in the string, the 20180208 one.

Is there a way to do this purely in the regex, without having to resort to pulling out the 2nd match in code? I'm using C# if that matters.

Thanks for any help.

You could use

^(?:[^_]+_){2}(\d+)

And take the first group; see a demo on regex101.com.


Broken down, this says: anchor at the start of the string (^), skip two underscore-terminated fields ((?:[^_]+_){2}), then capture the run of digits that follows in group 1 ((\d+)).

C# demo:

using System;
using System.Text.RegularExpressions;

var pattern = @"^(?:[^_]+_){2}(\d+)";
var text = "XXXXX_20160207_20180208_XXXXXXX_20190408T160742_xxxxx";
// Regex.Match never returns null; Groups[1].Value is empty if nothing matched.
var result = Regex.Match(text, pattern).Groups[1].Value;
Console.WriteLine(result); // => 20180208
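If a different field is needed, the {2} quantifier in the pattern above can be built from a variable. The following is only a sketch: the helper name TryGetDigitsAfterFields and its parameters are made up for illustration, and it is written with top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: skips `fieldsToSkip` underscore-terminated fields and
// captures the digits that start the next field.
static bool TryGetDigitsAfterFields(string input, int fieldsToSkip, out string digits)
{
    if (fieldsToSkip < 0) throw new ArgumentOutOfRangeException(nameof(fieldsToSkip));

    // Same pattern as above, with the repeat count filled in at run time.
    string pattern = @"^(?:[^_]+_){" + fieldsToSkip + @"}(\d+)";
    Match match = Regex.Match(input, pattern);

    digits = match.Success ? match.Groups[1].Value : string.Empty;
    return match.Success;
}

string sample = "XXXXX_20160207_20180208_XXXXXXX_20190408T160742_xxxxx";
if (TryGetDigitsAfterFields(sample, 2, out string second))
{
    Console.WriteLine(second); // 20180208
}
```

Note that this counts fields, not dates: the call fails when the field at that position does not start with a digit.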

Try this one

MatchCollection matches = Regex.Matches(sInputLine, @"\d{8}");

string sSecond = matches[1].ToString(); // index 1 is the second match
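Two things can go wrong with the snippet above: matches[1] throws when the input holds fewer than two matches, and \d{8} also matches the first 8 digits of a longer digit run. A possible guard is sketched below. The helper name TryGetNthDate is an assumption for illustration, and the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the n-th (1-based) run of exactly 8 digits.
static bool TryGetNthDate(string input, int n, out string date)
{
    // (?<!\d) and (?!\d) reject 8-digit slices taken from longer digit runs.
    MatchCollection matches = Regex.Matches(input, @"(?<!\d)\d{8}(?!\d)");

    bool found = n >= 1 && n <= matches.Count;
    date = found ? matches[n - 1].Value : string.Empty;
    return found;
}

string sample = "XXXXX_20160207_20180208_XXXXXXX_20190408T160742_xxxxx";
if (TryGetNthDate(sample, 2, out string second))
{
    Console.WriteLine(second); // 20180208
}
```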

You could use the regular expression

^(?:.*?\d{8}_){1}.*?(\d{8})

to save the 2nd date to capture group 1.


Naturally, for n > 2, replace {1} with {n-1} to obtain the nth date. To obtain the 1st date use

^(?:.*?\d{8}_){0}.*?(\d{8})
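The {n-1} substitution can be checked quickly in code. This is a minimal sketch, assuming a made-up helper BuildNthDatePattern and top-level statements (C# 9 or later); it only fills the repeat count into the pattern shown above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: builds the pattern that captures the n-th date (1-based).
static string BuildNthDatePattern(int n)
{
    if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
    // Skip n-1 dates, each followed by '_', then capture the next 8 digits.
    return @"^(?:.*?\d{8}_){" + (n - 1) + @"}.*?(\d{8})";
}

string sample = "XXXXX_20160207_20180208_XXXXXXX_20190408T160742_xxxxx";
for (int n = 1; n <= 3; n++)
{
    Match match = Regex.Match(sample, BuildNthDatePattern(n));
    string value = match.Success ? match.Groups[1].Value : "(no match)";
    Console.WriteLine(n + ": " + value);
}
// Prints:
// 1: 20160207
// 2: 20180208
// 3: 20190408
```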


C#'s regex engine performs the following operations.

^        # match the beginning of the string (of a line with RegexOptions.Multiline)
(?:      # begin a non-capture group
  .*?    # match 0+ chars lazily
  \d{8}  # match 8 digits
  _      # match '_'
)        # end non-capture group
{n}      # execute non-capture group n (n >= 0) times
.*?      # match 0+ chars lazily     
(\d{8})  # match 8 digits in capture group 1

The important thing to note is that the first instance of .*?, followed by \d{8}, because it is lazy, will gobble up only as many characters as it needs until the next 8 characters are digits. With an underscore on each side of \d{8}, those 8 digits are also guaranteed not to be preceded or followed by a digit. For example, in the string

_1234abcd_efghi_123456789_12345678_ABC

capture group 1 in (.*?)_\d{8}_ will contain "_1234abcd_efghi_123456789".
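The effect of the surrounding underscores can be seen by running both variants against that string. The snippet below is an illustrative sketch (variable names are arbitrary, top-level statements assume C# 9 or later); the second pattern, without the leading underscore, is added here only for contrast.

```csharp
using System;
using System.Text.RegularExpressions;

string sample = "_1234abcd_efghi_123456789_12345678_ABC";

// Underscore on both sides: the 8 digits must make up a whole field,
// so the 9-digit run is skipped.
string anchored = Regex.Match(sample, @"(.*?)_\d{8}_").Groups[1].Value;

// No leading underscore: the last 8 digits of the 9-digit run qualify,
// because they are followed by '_'.
string unanchored = Regex.Match(sample, @"(.*?)\d{8}_").Groups[1].Value;

Console.WriteLine(anchored);   // _1234abcd_efghi_123456789
Console.WriteLine(unanchored); // _1234abcd_efghi_1
```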

You can use System.Text.RegularExpressions.Regex

See the following example

Regex regex = new Regex(@"^(?:[^_]+_){2}(\d+)"); // Expression from Jan's answer; this just shows how to use C# to achieve your goal
Match match = regex.Match("XXXXX_20160207_20180208_XXXXXXX_20190408T160742_xxxxx");
if (match.Success) // Success is the direct way to tell whether the pattern matched
{
    Console.WriteLine(match.Groups[1].Value);
}


 