Regex to match the second word of a sentence and trim leading white space

I'm trying to write a regex that matches the second word of a sentence.

What I have so far is:

\s+[^\s]+

Which matches " quick" (including the leading white space) in

The quick brown fox jumps over the lazy dog

Unfortunately, I can't come up with a solution that removes the leading white space.

For an example, see http://regex101.com/r/nB9yD9

So, is there an easy way to match just "quick" without the white space? The weapon of choice here is C#, if it makes any difference.

And it HAS to be regex; I know String.Split would be much nicer in this specific situation.

On a side note, is it possible to match the n-th word of a sentence in regex? As far as I know, regex can't group into an unknown number of groups. Is that correct?

EDIT: I had a typo in the example. The underscore I put there was meant to highlight the white space.

The regular expression you are using is correct. To work around your problem, you could use a capture group, something like this:

        string str = "The quick brown fox jumps over the lazy dog";
        Regex r = new Regex(@"\s+([^\s]+)");
        Match m = r.Match(str);
        System.Console.WriteLine(m.Groups[1]);

This will yield quick, without the leading white space.

Alternatively, you could call Trim() (or TrimStart()) on the matched value.
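
If the match itself must not contain the white space, a lookbehind is another option. The sketch below is not from the original answer; the class and method names are made up for illustration, and it assumes the sentence does not start with white space (otherwise the first word would be returned instead).

```csharp
using System;
using System.Text.RegularExpressions;

public static class SecondWordDemo
{
    // Hypothetical helper: returns the first word that is preceded by white
    // space, which is the second word when the sentence has no leading space.
    public static string SecondWord(string sentence)
    {
        // (?<=\s) asserts that the previous character is white space without
        // making it part of the match; \S+ is equivalent to [^\s]+.
        Match m = Regex.Match(sentence, @"(?<=\s)\S+");
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        // Prints "quick"
        Console.WriteLine(SecondWord("The quick brown fox jumps over the lazy dog"));
    }
}
```

Because the lookbehind is zero-width, m.Value is already the bare word and no capture group or trimming is needed.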

Also, as per your side note, you can match the nth word of a given sentence by combining C# and regex. Something like this should do what you need:

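        string str = "The quick brown fox jumps over the lazy dog";
        // Group 2 captures a word preceded by white space or the start of the string.
        Regex r = new Regex(@"(^|\s)+([^\s]+)");
        MatchCollection mc = r.Matches(str);
        // Print every word; the n-th word (1-based) is mc[n - 1].Groups[2].
        for (int i = 0; i < mc.Count; i++)
        {
            System.Console.WriteLine(mc[i].Groups[2]);
        }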

Yields:

The
quick
brown
fox
jumps
over
the
lazy
dog

I had to amend the regex to take the first word into consideration as well. This allows the regex to pick words that are preceded either by white space or by the beginning of the string.
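
On the side note about the n-th word and an unknown number of groups, here is a rough sketch of two further options, assuming words are delimited by white space. NthWord and AllWords are hypothetical helper names, not part of the answer above. The first builds a counted quantifier into the pattern; the second relies on the .NET behaviour that a repeated group keeps every capture in Group.Captures.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NthWordDemo
{
    // Hypothetical helper: returns the n-th (1-based) white-space-delimited
    // word, or an empty string if the sentence has fewer than n words.
    public static string NthWord(string sentence, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        // Skip n - 1 words together with their separators, then capture the next word.
        string pattern = @"^\s*(?:\S+\s+){" + (n - 1) + @"}(\S+)";
        Match m = Regex.Match(sentence, pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    // Hypothetical helper: a single repeated group; .NET records each
    // repetition of group 1 in its Captures collection.
    public static string[] AllWords(string sentence)
    {
        Match m = Regex.Match(sentence, @"(?:(\S+)\s*)+");
        if (!m.Success) return new string[0];
        CaptureCollection caps = m.Groups[1].Captures;
        string[] words = new string[caps.Count];
        for (int i = 0; i < caps.Count; i++)
        {
            words[i] = caps[i].Value;
        }
        return words;
    }

    public static void Main()
    {
        string str = "The quick brown fox jumps over the lazy dog";
        Console.WriteLine(NthWord(str, 4));                  // prints "fox"
        Console.WriteLine(string.Join(",", AllWords(str)));  // prints all nine words
    }
}
```

Group.Captures is specific to .NET; most other regex engines keep only the last repetition of a group, which is why looping over Matches is the more portable approach.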

As per your comment, please take a look at this link.

string str = "The quick brown fox jumps over the lazy dog";
Regex r = new Regex(@"\w+");  // Find words

MessageBox.Show(r.Matches(str)[1].Value); // Get all words and show the one at index 1 (the second word)
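
One thing to keep in mind with \w+ is that it treats apostrophes and other punctuation as word boundaries, whereas \S+ splits only on white space. The following sketch (WordAt is a made-up helper, and the sample sentence is just an illustration) shows the difference and also guards against an out-of-range index:

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordPatternDemo
{
    // Hypothetical helper: returns the match at the given 0-based index,
    // or an empty string when there are not enough matches.
    public static string WordAt(string sentence, string pattern, int index)
    {
        MatchCollection mc = Regex.Matches(sentence, pattern);
        return index >= 0 && index < mc.Count ? mc[index].Value : string.Empty;
    }

    public static void Main()
    {
        string s = "I can't stop";
        Console.WriteLine(WordAt(s, @"\w+", 1)); // prints "can"
        Console.WriteLine(WordAt(s, @"\S+", 1)); // prints "can't"
    }
}
```

Which pattern is preferable depends on whether punctuation attached to a word should count as part of it.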
