RegEx replace query to pick out wiki syntax

I have a string of HTML from which I need to pick out every occurrence of the "[Title| http://www.test.com] " pattern, e.g.

"dafasdfasdf, adfasd. [Test| http://www.test.com/] adf ddasfasdf [SDAF| http://www.madee.com/] assg ad"

I need to replace "[Title| http://www.test.com] " with "<a href='http://www.test.com/'>Title</a>".

What is the best way to approach this?

I was getting close with:

string test = "dafasdfasdf adfasd [Test|http://www.test.com/] adf ddasfasdf [SDAF|http://www.madee.com/] assg ad ";
string p18 = @"(\[.*?|.*?\])";
MatchCollection mc18 = Regex.Matches(test, p18, RegexOptions.Singleline | RegexOptions.IgnoreCase);
foreach (Match m in mc18)
{
    string value = m.Groups[1].Value;
    string fulltag = value.Substring(value.IndexOf("["), value.Length - value.IndexOf("["));
    Console.WriteLine("text=" + fulltag);
}

There must be a cleaner way of getting the two values out, i.e. the "Title" part and the URL itself.

Any suggestions?

Replace the pattern:

\[([^|]+)\|[^]]*]

with:

$1

A short explanation:

\[         # match the character '['
(          # start capture group 1
  [^|]+    #   match any character except '|' and repeat it one or more times
)          # end capture group 1
\|         # match the character '|'
[^]]*      # match any character except ']' and repeat it zero or more times
]          # match the character ']'

A C# demo would look like:

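using System.Text.RegularExpressions;
string test = "dafasdfasdf adfasd [Test|http://www.test.com/] adf ddasfasdf [SDAF|http://www.madee.com/] assg ad ";
string adjusted = Regex.Replace(test, @"\[([^|]+)\|[^]]*]", "$1");
// adjusted == "dafasdfasdf adfasd Test adf ddasfasdf SDAF assg ad " (each link is reduced to its title)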
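
The replacement above keeps only the title and discards the URL. If both parts are needed, for example to build an HTML link, a second capture group around the URL makes it available as $2. What follows is a minimal sketch and not a definitive implementation: the class name WikiLinks and its two methods are hypothetical, and it assumes the desired output is an anchor tag, that URLs contain no whitespace, and that optional spaces around the URL should be dropped.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are illustrative only.
public static class WikiLinks
{
    // Group 1: the title (any characters except '|' and ']').
    // Group 2: the URL (any characters except ']' and whitespace).
    // Whitespace between '|' and the URL, or before ']', is skipped.
    private static readonly Regex LinkPattern =
        new Regex(@"\[([^|\]]+)\|\s*([^\]\s]+)\s*\]");

    // Rewrites every [Title|url] as an anchor tag (assumed target format).
    public static string ToAnchors(string input)
    {
        return LinkPattern.Replace(input, "<a href='$2'>$1</a>");
    }

    // Returns the (title, url) pairs without modifying the input.
    public static List<KeyValuePair<string, string>> ExtractLinks(string input)
    {
        var links = new List<KeyValuePair<string, string>>();
        foreach (Match m in LinkPattern.Matches(input))
        {
            links.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        }
        return links;
    }
}
```

With this sketch, WikiLinks.ToAnchors("a [Test| http://www.test.com/] b") returns "a <a href='http://www.test.com/'>Test</a> b". The title and URL are inserted as-is, so if they can come from untrusted input they should be HTML-encoded before being written into a page.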
