
C# Regex returning multiple lines of text

I have the following function:

public static string ReturnEmailAddresses(string input)
    {

        string regex1 = @"\[url=";
        string regex2 = @"mailto:([^\?]*)";
        string regex3 = @".*?";
        string regex4 = @"\[\/url\]";

        Regex r = new Regex(regex1 + regex2 + regex3 + regex4, RegexOptions.IgnoreCase | RegexOptions.Multiline);
        MatchCollection m = r.Matches(input);
        if (m.Count > 0)
        {
            StringBuilder sb = new StringBuilder();
            int i = 0;
            foreach (var match in m)
            {
                if (i > 0)
                    sb.Append(Environment.NewLine);
                string shtml = match.ToString();
                var innerString = shtml.Substring(shtml.IndexOf("]") + 1, shtml.IndexOf("[/url]") - shtml.IndexOf("]") - 1);
                sb.Append(innerString); //just titles                    
                i++;
            }

            return sb.ToString();
        }

        return string.Empty;
    }

As you can see, I define a URL in the "markdown" format:

[url = http://sample.com]sample.com[/url]

In the same way, emails are written in that format too:

[url=mailto:service@paypal.com.au]service@paypal.com.au[/url]

However, when I pass in a multiline string with multiple email addresses, it returns only the first email. I would like it to return multiple matches, but I cannot seem to get that working.

For example:

[url=mailto:service@paypal.com.au]service@paypal.com.au[/url] /r/n a whole bunch of text here /r/n more stuff here [url=mailto:anotheremail@paypal.com.au]anotheremail@paypal.com.au[/url]

This will only return the first email above.

The mailto:([^\?]*) part of your pattern is matching everything in your input string: [^\?]* accepts any character except ?, including ] and line breaks, so the whole input ends up as a single match running from the first [url= to the last [/url]. You need to add the closing bracket ] to the excluded characters so that this portion cannot overflow past the "mailto" section into the text within the "url" tags:

\[url=mailto:([^\?\]]*).*?\[\/url\]

See this link for an example: https://regex101.com/r/zcgeW8/1
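As a rough illustration, the corrected pattern could be dropped into the original function along these lines. This is only a sketch: the class name EmailExtractor is made up for the example, and reading the address from capture group 1 (rather than slicing the text between ] and [/url] as the question does) is an assumption about what is wanted. RegexOptions.Multiline is left out because it only changes how ^ and $ behave, and the pattern uses neither.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this example.
public static class EmailExtractor
{
    // Corrected pattern: ']' is in the excluded set, so group 1 stops
    // at the end of the opening [url=mailto:...] tag.
    private static readonly Regex MailtoTag = new Regex(
        @"\[url=mailto:([^\?\]]*).*?\[\/url\]",
        RegexOptions.IgnoreCase);

    public static string ReturnEmailAddresses(string input)
    {
        // One Match per [url=mailto:...]...[/url] tag; group 1 holds the address.
        var addresses = MailtoTag.Matches(input)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value);

        // Joining an empty sequence yields string.Empty, like the original.
        return string.Join(Environment.NewLine, addresses);
    }
}
```

Calling EmailExtractor.ReturnEmailAddresses on the sample input from the question should then give both addresses, one per line, instead of only the first.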

You can extract the desired result with the help of a positive lookahead and a positive lookbehind. See http://www.rexegg.com/regex-lookarounds.html

Try regex: (?<=\[url=mailto:).*?(?=\])

The above regex will capture both email addresses from the sample string:

[url=mailto:service@paypal.com.au]service@paypal.com.au[/url] /r/n a whole bunch of text here /r/n more stuff here [url=mailto:anotheremail@paypal.com.au]anotheremail@paypal.com.au[/url]

Result:

service@paypal.com.au
anotheremail@paypal.com.au
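In C#, the lookaround version might be used roughly as follows. The class and method names (MailtoLookaround, FindAddresses) are invented for this sketch. Because the lookbehind and lookahead consume no characters, each match value is the address itself and no capture group is needed.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this example.
public static class MailtoLookaround
{
    public static List<string> FindAddresses(string input)
    {
        var found = new List<string>();

        // Match text that is preceded by "[url=mailto:" and followed by "]".
        foreach (Match m in Regex.Matches(input, @"(?<=\[url=mailto:).*?(?=\])"))
        {
            found.Add(m.Value);
        }

        return found;
    }
}
```

Note that this pattern takes everything up to the closing ], so a link such as mailto:someone@example.com?subject=Hi would come back with the ?subject=Hi part still attached, and the match is case-sensitive unless RegexOptions.IgnoreCase is passed.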
