Regular Expression working in regex tester, but not in C#

I've searched around a bit, but it's hard to tell if this exact question has been answered before. I know you'll let me know if this is a duplicate.

I have a regular expression that matches a backslash followed by one or more digits. For example: \12345 would match, but \1234f or 12345 would not match.

The regex I'm using is ^\\(\d+)$

When I test the expression using various testers it works. For example, see: http://regex101.com/r/cY2bI1/1

However, when I implement it in the following C# code, I fail to get a match.

The implementation:

public string ParseRawUrlAsAssetNumber(string rawUrl) {
    var result = string.Empty;
    const string expression = @"^\\([0-9]+)$";
    var rx = new Regex(expression);
    var matches = rx.Matches(rawUrl);
    if (matches.Count > 0)
    {
        result = matches[0].Value;
    }
    return result;
}

The failing test (NUnit):

[Test]
public void ParseRawUrlAsAssetNumber_Normally_ParsesTheUrl() {
    var f = new Forwarder();
    var validRawUrl = @"\12345";
    var actualResult = f.ParseRawUrlAsAssetNumber(validRawUrl);
    var expectedResult = "12345";
    Assert.AreEqual(expectedResult, actualResult);
}

The test's output:

Expected string length 5 but was 6. Strings differ at index 0.
Expected: "12345"
But was:  "\\12345"
-----------^

Any ideas?

Resolution:

Thanks everyone for the input. In the end I took the following route based on your recommendations, and it is passing tests now.

public string ParseRawUrlAsAssetNumber(string rawUrl)
{
    var result = string.Empty;
    const string expression = @"^\\([0-9]+)$";
    var rx = new Regex(expression);
    var matches = rx.Matches(rawUrl);
    if (matches.Count > 0)
    {
        result = matches[0].Groups[1].Value;
    }
    return result;
}

The problem is this line (from the earlier version of your code):

var rx = new Regex(Regex.Escape(expression));

By escaping your expression you're turning all your special regex characters into literals. Calling Regex.Escape(@"^\\(\d+)$") will return "\^\\\\\(\\d\+\)\$", which will only match the literal string "^\\(\d+)$".

Try just this:

var rx = new Regex(expression);

See MSDN: Regex.Escape for a full explanation and examples of how this method is intended to be used.
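
To see the effect concretely, here is a minimal sketch that compares the escaped and unescaped pattern against the input from the test. It is written as a top-level program, which assumes C# 9 or later; the variable names are only illustrative and not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern and input taken from the question.
const string expression = @"^\\(\d+)$";
const string input = @"\12345";

// Regex.Escape backslash-escapes every metacharacter, so the result
// describes the pattern's own text rather than what the pattern matches.
string escaped = Regex.Escape(expression);
Console.WriteLine(escaped); // \^\\\\\(\\d\+\)\$

// The escaped pattern no longer matches the intended input...
bool escapedMatchesInput = Regex.IsMatch(input, escaped);            // false
// ...but it does match the literal text of the original pattern.
bool escapedMatchesPatternText = Regex.IsMatch(expression, escaped); // true
// The unescaped pattern matches the input as intended.
bool plainMatchesInput = Regex.IsMatch(input, expression);           // true

Console.WriteLine($"{escapedMatchesInput} {escapedMatchesPatternText} {plainMatchesInput}");
```

Regex.Escape is meant for the opposite situation: embedding arbitrary text inside a larger pattern so that the text is matched literally.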


Given your updated question, it seems like you also have an issue here:

result = matches[0].Value;

This will return the entire matched substring, not just the first capture group. For that you'll have to use:

result = matches[0].Groups[1].Value;
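
As a quick illustration of the difference, the sketch below runs the pattern from the question against the test input and reads both values. It assumes a C# 9+ top-level program, and the variable names are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern and input taken from the question.
Match m = Regex.Match(@"\12345", @"^\\(\d+)$");

string whole = m.Value;            // entire match, including the leading backslash
string group0 = m.Groups[0].Value; // group 0 is always the entire match
string digits = m.Groups[1].Value; // first capture group: the digits only

Console.WriteLine($"{whole} | {group0} | {digits}"); // \12345 | \12345 | 12345
```

The whole match is 6 characters long, which lines up with the "Expected string length 5 but was 6" message in the test output.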

Don't escape the pattern. Also, simply use Regex.Match, since you expect a single match here. Use Match.Success to check whether the input matched your pattern, and return the group value, because the digits are in the first capture group of the match:

public string ParseRawUrlAsAssetNumber(string rawUrl)
{            
    const string pattern = @"^\\(\d+)$";

    var match = Regex.Match(rawUrl, pattern);
    if (!match.Success)
        return String.Empty;

    return match.Groups[1].Value;
}
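
For a quick check, the sketch below calls a copy of that method on the three example inputs from the question. It is written as a C# 9+ top-level program (an assumption on my part), with the method reproduced as a local function so the snippet stands alone.

```csharp
using System;
using System.Text.RegularExpressions;

// Local-function copy of the method above, so the sketch is self-contained.
static string ParseRawUrlAsAssetNumber(string rawUrl)
{
    Match match = Regex.Match(rawUrl, @"^\\(\d+)$");
    return match.Success ? match.Groups[1].Value : string.Empty;
}

// The three example inputs from the question.
string[] inputs = { @"\12345", @"\1234f", "12345" };
foreach (string input in inputs)
{
    // Prints the input and the parsed asset number (empty when there is no match).
    Console.WriteLine($"'{input}' -> '{ParseRawUrlAsAssetNumber(input)}'");
}
```

Only the first input should produce a non-empty result; the other two fail the pattern and fall through to the empty string.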

What if you try getting the group result instead?

match.Groups[1].Value

I'll test it when I get to a real computer, but it seems like it should work.
