Simple regex question C#

I need to match the string that is shown in a window, as in the examples below:

8% of setup_av_free.exe from software-files-l.cnet.com Completed

98% of test.zip from 65.55.72.119 Completed

[numeric]% of [filename] from [hostname | IP address] Completed

I have written the regex pattern halfway:

if (Regex.IsMatch(text, @"[\d]+%[\s]of[\s](.+?)(\.[^.]*)[\s]from[\s]"))
    MessageBox.Show(text);

and I now need to integrate the following regexes into my code above:

ValidIpAddressRegex = "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$";  

ValidHostnameRegex = "^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$"; 

The two regexes were taken from this link. Both work well when I use Regex.IsMatch to match "123.123.123.123" and "software-files-l.cnet.com". However, I cannot get them to work when I integrate them into my existing regex code. I tried several variants without success. Can someone guide me in integrating the two regexes into my existing code? Thanks in advance.

You can certainly combine all these regular expressions into one, but I'd recommend against it. Consider this method: first it checks whether your input text has the correct overall form, then it checks whether the "from" part is an IP address or a hostname.

bool CheckString(string text) {
    const string ValidIpAddressRegex = @"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$";  

    const string ValidHostnameRegex = @"^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$"; 

    var match = Regex.Match(text, @"[\d]+%[\s]of[\s](.+?)(\.[^.]*)[\s]from[\s](\S+)");
    if(!match.Success)
        return false;        

    string address = match.Groups[3].Value;

    return Regex.IsMatch(address, ValidIpAddressRegex) ||
           Regex.IsMatch(address, ValidHostnameRegex); 
}

It does what you want and is much more readable than a single monster-sized regular expression. Unless you are going to call this method millions of times in a loop, there is no reason to worry about it being less performant than a single regex.

Also, in case you aren't aware, the brackets around \d and \s aren't necessary.
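To illustrate the point about brackets, here is a minimal sketch. The class name BracketDemo and its method names are made up for this example and are not part of the question's code. It shows that a character class holding only \d or \s behaves like the bare shorthand, and that putting anything else inside the brackets changes the meaning.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class BracketDemo
{
    // A character class that contains only \d (or only \s) is equivalent
    // to the bare shorthand, so these two patterns accept the same inputs.
    public static bool WithBrackets(string s) =>
        Regex.IsMatch(s, @"^[\d]+%[\s]of$");

    public static bool WithoutBrackets(string s) =>
        Regex.IsMatch(s, @"^\d+%\sof$");

    // Inside brackets a quantifier loses its meaning: [\s+] matches exactly
    // ONE character that is either whitespace or a literal '+'.
    public static bool ClassWithPlus(string s) =>
        Regex.IsMatch(s, @"^a[\s+]b$");
}
```

With these definitions, BracketDemo.WithBrackets("8% of") and BracketDemo.WithoutBrackets("8% of") agree, while BracketDemo.ClassWithPlus accepts "a b" and "a+b" but rejects "a  b" (two spaces).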

The problem is that those two regexes start with ^ and end with $, so they cannot match inside your string.

^ means match the start of the string (or of the line if the m modifier is activated).
$ means match the end of the string (or of the line if the m modifier is activated).

When you test them on their own this holds, because the address is the whole string. In your real text the address is in the middle of the string, so it is not matched.

Try removing the ^ at the very beginning and the $ at the very end.
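A small sketch of the effect, reusing the IP address pattern from the question with and without the anchors. The class name AnchorDemo and its members are hypothetical, chosen only for this illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class AnchorDemo
{
    // The IP address pattern from the question, with ^ and $ stripped.
    const string IpCore =
        @"(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}" +
        @"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])";

    // Anchored: the address must be the entire input.
    public static bool MatchesAnchored(string s) =>
        Regex.IsMatch(s, "^" + IpCore + "$");

    // Unanchored: the address may appear anywhere in the input.
    public static bool MatchesUnanchored(string s) =>
        Regex.IsMatch(s, IpCore);
}
```

Here AnchorDemo.MatchesAnchored("65.55.72.119") succeeds, but the anchored version fails on the full line "98% of test.zip from 65.55.72.119 Completed", whereas MatchesUnanchored finds the address inside it.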

Here you go.

^\d+%\s+of\s+(.+?)(\.[^.]*)\s+from\s+((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])|((([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])))\s+Completed

Remove the ^ and $ characters from the ValidIpAddressRegex and ValidHostnameRegex samples above, then add them to your pattern separated by the alternation character (|), with the whole alternation enclosed in parentheses.
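If the single long pattern is hard to read, one option is to assemble it from named pieces. The following is only a sketch of that idea: the class name DownloadLine, the constants Octet and Label, and the named group host are all assumptions introduced here, and the sub-patterns are the ones from the question with their anchors removed.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class DownloadLine
{
    // One IPv4 octet (0-255) and one hostname label, taken from the question.
    const string Octet = @"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])";
    const string Label = @"([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])";

    // Unanchored address patterns built from the pieces above.
    const string IpAddress = "(" + Octet + @"\.){3}" + Octet;
    const string Hostname = "(" + Label + @"\.)*" + Label;

    // The two alternatives are joined with | inside one parenthesized group.
    const string Pattern =
        @"^\d+%\sof\s(.+?)(\.[^.]*)\sfrom\s(?<host>" +
        IpAddress + "|" + Hostname +
        @")\sCompleted$";

    public static bool IsMatch(string text) => Regex.IsMatch(text, Pattern);

    // Returns the captured host, or an empty string when the line does not match.
    public static string Host(string text)
    {
        Match m = Regex.Match(text, Pattern);
        return m.Success ? m.Groups["host"].Value : string.Empty;
    }
}
```

For the two sample lines from the question, DownloadLine.Host returns "software-files-l.cnet.com" and "65.55.72.119" respectively; a line such as "98% of test.zip from 999.1.1.1 Completed" is rejected because 999 is neither a valid octet nor a valid hostname label.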

You could use this; it should work for all cases. I might have accidentally deleted a character while formatting, so let me know if it doesn't work.

string captureString = "8% of setup_av_free.exe from software-files-l.cnet.com Completed";
// Named groups capture the percentage, the file name, and the host
// (either a dotted numeric address or a URL-like string).
Regex reg = new Regex(@"(?<perc>\d+)% of (?<file>\w+\.\w+) from (?<host>" +
    @"(\d+\.\d+\.\d+\.\d+)|(((https?|ftp|gopher|telnet|file|notes|ms-help):" +
    @"((//)|(\\\\))+)?[\w\d:#@%/;$()~_?\+-=\\\.&]*)) Completed");
Match m = reg.Match(captureString);
// If the match failed, m.Success is false and the groups below are empty strings.
string perc = m.Groups["perc"].Value;
string file = m.Groups["file"].Value;
string host = m.Groups["host"].Value;
