
Extract data from a string using Regex.Matches

I have a string that always takes the same general form. I wish to extract information from it and place it in an array.

Given the following input:
John Doe +22\r\nPong

I want the following output:
John Doe
+22
Pong

I'm using the following bit of code to extract the details I want:

using System.Linq;
using System.Text.RegularExpressions;

public static string[] DetailExtractor(string input)
{
    return Regex.Matches(input, @"(.*(?=\s\+))|(\+\d{1,2}(?=\\r\\n))|((?<=\\r\\n).*)")
        .OfType<Match>()
        .Select(m => m.Value)
        .ToArray();
}

But it gives me the following output:
Player Name
""

However, using the same regular expression in this online regex tester matches all the elements I want.

Why does it work for one and not the other? Does Regex.Matches not work the way I think it does?

You can try one of these:

[a-z]+ [a-z]+ \+[0-9]{1,}\\r\\n[a-z]+

or:

[a-z\s\\]+\+[0-9]{1,}[a-z\s\\]+

or:

[\w\s]+\+\d{1,}\\r\\n[\w]+
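
A quick way to check these patterns is to run them against the sample text. The sketch below makes two assumptions that are not stated above: the input contains the literal characters \r\n (backslashes included, as in an online tester), and RegexOptions.IgnoreCase is passed, since the first two patterns list only lowercase letters and "John Doe" starts with capitals. The names PatternCheck and MatchesWholeInput are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // Verbatim literal: the backslashes stay in the string as ordinary characters.
    public const string Sample = @"John Doe +22\r\nPong";

    // The three suggested patterns, copied as written. In a regex, \\r\\n
    // means: literal backslash, 'r', literal backslash, 'n'.
    public static readonly string[] Patterns =
    {
        @"[a-z]+ [a-z]+ \+[0-9]{1,}\\r\\n[a-z]+",
        @"[a-z\s\\]+\+[0-9]{1,}[a-z\s\\]+",
        @"[\w\s]+\+\d{1,}\\r\\n[\w]+"
    };

    // True when the first match covers the entire input (case-insensitive).
    public static bool MatchesWholeInput(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return m.Success && m.Value == input;
    }
}
```

If the input holds a real carriage return and line feed instead, the \\r\\n part of the first and third patterns would need to be written as \r\n. Note also that each pattern yields one match for the whole string, so splitting it into name, number and type would still need capturing groups.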

Just taking a guess here, but I'm betting that you are calling it like this:

var details = DetailExtractor("John Doe +22\r\nPong");

In a regular C# string literal, \r\n is converted to a carriage return and a line feed character, so the string never contains the literal backslashes that your regex looks for. Instead, you can use a verbatim string (prefixed with @) or escape the backslashes:

var details = DetailExtractor(@"John Doe +22\r\nPong");

or

var details = DetailExtractor("John Doe +22\\r\\nPong");
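
To see the difference between the literal forms, here is a small sketch that runs the pattern from the question against both kinds of string. LiteralDemo and NonEmptyMatches are illustrative names, and dropping empty matches is my own addition: the first alternative, .*(?=\s\+), can also succeed with zero characters, which appears to be where the "" in the question's output comes from.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LiteralDemo
{
    // Regular literal: \r\n becomes CR + LF, so the string has 18 characters.
    public static readonly string Regular = "John Doe +22\r\nPong";

    // Verbatim literal: the backslashes are kept, so the string has 20 characters.
    public static readonly string Verbatim = @"John Doe +22\r\nPong";

    // Pattern from the question, unchanged. \\r\\n matches a literal
    // backslash, 'r', backslash, 'n', not a carriage return and line feed.
    public const string QuestionPattern =
        @"(.*(?=\s\+))|(\+\d{1,2}(?=\\r\\n))|((?<=\\r\\n).*)";

    // Runs the question's pattern and drops zero-length matches.
    public static string[] NonEmptyMatches(string input)
    {
        return Regex.Matches(input, QuestionPattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .Where(v => v.Length > 0)
            .ToArray();
    }
}
```

With these definitions, NonEmptyMatches(LiteralDemo.Verbatim) should give the three fields, while NonEmptyMatches(LiteralDemo.Regular) should give only the name, because the second and third alternatives never find literal backslashes.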

As everyone else has pointed out, there are simpler regexes available to do the same type of matching, depending on your needs.

The regex below is slightly simpler, but building the returned string array takes slightly more code. Note that the \r\n in this pattern matches a real carriage return and line feed, not literal backslashes.

public static string[] DetailExtractor1(string input)
{
    var match = Regex.Match(input, @"^(?<name>\w+\s+\w+)\s+(?<num>\+\d+)\r\n(?<type>\w+)");

    if (match.Success)
    {
        return new string[] {
            match.Groups["name"].Value,
            match.Groups["num"].Value,
            match.Groups["type"].Value
        };
    }

    return null;
}
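
As a rough usage sketch, the method can be dropped into a static class and called with both kinds of input. The wrapper class name ExtractorDemo is an assumption for illustration; the method body is the one shown above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExtractorDemo
{
    // Same method as above: named groups capture the three fields.
    public static string[] DetailExtractor1(string input)
    {
        var match = Regex.Match(input, @"^(?<name>\w+\s+\w+)\s+(?<num>\+\d+)\r\n(?<type>\w+)");

        if (match.Success)
        {
            return new string[] {
                match.Groups["name"].Value,
                match.Groups["num"].Value,
                match.Groups["type"].Value
            };
        }

        // No match: callers have to check for null before indexing.
        return null;
    }
}
```

Calling ExtractorDemo.DetailExtractor1("John Doe +22\r\nPong") should return the three fields in order. Passing the verbatim form @"John Doe +22\r\nPong" should return null, since that string contains backslashes rather than a carriage return and line feed.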
