Regex within a regex?

Truth is, I'm having a hard time writing a regex string to parse something in the form of

[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]

This string would be parsed so that I can dynamically build tabs as demonstrated here. Initially I tried a regex pattern like \\[\\[\\[tab name=(?'name'.*?) content=(?'content'.*?)\\]\\]\\]

But I realized I couldn't get each tab as a whole and build upon a query without doing a Regex.Replace. Is it possible to capture the entire tab leading up to the pipe symbol as a group and then parse that group down into its key/value pairs?

This is the current regex string I'm working with: \\[\\[\\[(?'tab'tab name=(?'name'.*?) content=(?'content'.*?))\\]\\]\\]

And here is my code for performing the regex. Any guidance would be appreciated.

    public override string BeforeParse(string markupText)
    {
        if (CompiledRegex.IsMatch(markupText))
        {
            // Replaces the [[[code lang=sql|xxx]]]
            // with the HTML tags (surrounded with {{{roadkillinternal}}.
            // As the code is HTML encoded, it doesn't get butchered by the HTML cleaner.
            MatchCollection matches = CompiledRegex.Matches(markupText);
            foreach (Match match in matches)
            {
                string tabname = match.Groups["name"].Value;
                string tabcontent = HttpUtility.HtmlEncode(match.Groups["content"].Value);
                markupText = markupText.Replace(match.Groups["content"].Value, tabcontent);

                markupText = Regex.Replace(markupText, RegexString, ReplacementPattern, CompiledRegex.Options);
            }
        }

        return markupText;
    }

Is this what you want?

string input = "[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
Regex r = new Regex(@"tab name=([a-z0-9]+) content=([a-z0-9]+)(\||])");

foreach (Match m in r.Matches(input))
{
    Console.WriteLine("{0} : {1}", m.Groups[1].Value, m.Groups[2].Value);
}

http://regexr.com/3boot

Maybe string.Split would be a better fit in this case? For example, something like this:

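string str = "[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
// Strip the surrounding [[[ and ]]] first, then split into individual tabs.
foreach (var entry in str.Trim('[', ']').Split('|'))
{
    // entry looks like "tab name=dog content=cat"
    var eqBlocks = entry.Split('=');
    // eqBlocks[1] is "dog content"; remove the " content" key to leave the name.
    // (TrimEnd only accepts characters, not a string, so Replace is used instead.)
    var tabName = eqBlocks[1].Replace(" content", "");
    var content = eqBlocks[2];
}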

Ugly code, but should work.

Try this:

It starts at a word boundary and then matches only the allowed characters (word characters, spaces, and =).

/\b[\w =]*/g

https://regex101.com/r/cI7jS7/1
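
For use from C#, the same idea might look like the sketch below. It is only an illustration under a couple of assumptions: the class and method names (TabChunks, Extract) are made up for this example, and the quantifier is changed from * to + so that the zero-length matches .NET would otherwise return at word boundaries are not included.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TabChunks
{
    // Returns each run of word characters, spaces and '=' that starts at a word boundary.
    // '+' is used instead of '*' so that empty matches are not returned.
    public static List<string> Extract(string input)
    {
        return Regex.Matches(input, @"\b[\w =]+")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        string input = "[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
        foreach (string chunk in Extract(input))
        {
            // Each chunk is one whole tab, e.g. "tab name=dog content=cat".
            Console.WriteLine(chunk);
        }
    }
}
```

Each chunk still contains both key/value pairs, so it would need a second pass (another regex or a split) to separate the name from the content.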

Just distill the regex pattern down to the individual tab patterns, such as name=??? content=???, and match only that. The pattern will then produce one Match per tab (two in your example), from which the data can be extracted.

string text = @"[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
string pattern = @"name=(?<Name>[^\s]+)\scontent=(?<Content>[^\s|\]]+)";

var result = Regex.Matches(text, pattern)
                  .OfType<Match>()
                  .Select(mt => new
                  {
                       Name    = mt.Groups["Name"].Value,
                       Content = mt.Groups["Content"].Value,
                  });

The result is an enumerable sequence of anonymous-type objects, one per tab, each with a Name and a Content property, which can be bound directly to the control.

Note that in the set notation [^\s|\]] the pipe | is treated as a literal within the set, not as the alternation (or) operator. The bracket ] does have to be escaped, though, to be treated as a literal. In words, the set reads: "any character that is not ( ^ ) whitespace, a pipe, or a closing bracket".
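
If the whole tab is also wanted as a single group, as the question asks, one possible variation is to wrap the same pattern in an outer named group. The sketch below assumes a group name of Tab and a helper class called TabParser, both invented for this example; the inner Name and Content groups are the ones from the pattern above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TabParser
{
    // The outer group "Tab" (assumed name) captures the whole "tab name=... content=..." chunk;
    // the inner groups "Name" and "Content" capture the two values inside it.
    const string Pattern =
        @"(?<Tab>tab\sname=(?<Name>[^\s]+)\scontent=(?<Content>[^\s|\]]+))";

    // Returns one array per tab: { whole tab, name, content }.
    public static List<string[]> Parse(string text)
    {
        var tabs = new List<string[]>();
        foreach (Match m in Regex.Matches(text, Pattern))
        {
            tabs.Add(new[]
            {
                m.Groups["Tab"].Value,
                m.Groups["Name"].Value,
                m.Groups["Content"].Value
            });
        }
        return tabs;
    }

    public static void Main()
    {
        string text = "[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
        foreach (string[] tab in Parse(text))
        {
            Console.WriteLine("{0} -> name={1}, content={2}", tab[0], tab[1], tab[2]);
        }
    }
}
```

Because every match carries the outer group and both inner groups, no second regex pass or Regex.Replace is needed to get from the whole tab to its key/value pairs.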
