
Split hierarchical string between two special characters in C#

I was working with a string like

$upper('anothervalue')$

I wrote a parser that parses this statement well with the code below.

Now another system is sending more complex strings in which function calls can be nested and chained any number of times, for example:

$upper($trim('somevalue')$)$ - $upper('anothervalue')$

How can I walk through the nested calls and evaluate the functions in order, starting with the innermost match? For example:

$trim('somevalue')$ -- evaluate this first and store the result in, say, a variable x

$upper(x)$ -- then evaluate upper with the result of the previous step

$upper('anothervalue')$

private static object EvaluateFunctionsInJson(string jsonValue)
{
    object origJsonValue = jsonValue;
    var fnMatches = Regex.Matches(jsonValue.ToString(), @"\$(.+?)\$");
    var fnCount = fnMatches.Count;
    foreach (var fnMatch in fnMatches)
    {
        // call another method to evaluate the function
        object replaceValue = EvaluateFunction(fnMatch.ToString());

        if (fnCount > 1)
        {
            origJsonValue = origJsonValue.ToString().Replace(fnMatch.ToString(), replaceValue.ToString());
        }
        else
        {
            origJsonValue = replaceValue;
        }
    }
    return origJsonValue;
}



private static object EvaluateFunction(string jsonValue)
{
    var functionWithoutDollarSign = Regex.Replace(jsonValue.ToString(), @"[$$]+", "");
    string functionName = Regex.Match(functionWithoutDollarSign, @"\b[A-Za-z]+\b", RegexOptions.Singleline).Value; //get the first word
    var functionParam = Regex.Match(functionWithoutDollarSign, @"\(([^)]*)\)").Value; // get the text between the parentheses
    var functionParamWithoutParanthesis = Regex.Replace(functionParam.ToString(), @"[\(\)]+", "");

    var funcParams = functionParamWithoutParanthesis.Split(',');
    var value = funcParams[0];
    switch (functionName.ToLower().Trim())
    {
        case "upper":
            return value.ToUpper();
        case "lower":
            return value.ToLower();
        case "number":
            return Convert.ToInt64(value);
        case "boolean":
            return Convert.ToBoolean(value);
        default:
            return value;
    }
}

You can repeatedly replace the matches of a regex until the text no longer changes:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        var text = "$upper($trim('somevalue   ')$)$ - $upper('anothervalue')$";
        var pattern = @"(?s)\$(?<functionName>\w+)\('?(?<value>[^'\\]*(?:\\.[^'\\]*)*)'?\)\$";
        var prev = string.Empty;
        do {
            prev = text;
            text = Regex.Replace(text, pattern, x => {
                switch (x.Groups["functionName"].Value.ToLower().Trim())
                {
                    case "trim":
                        return x.Groups["value"].Value.Trim();
                    case "upper":
                        return x.Groups["value"].Value.ToUpper();
                    case "lower":
                        return x.Groups["value"].Value.ToLower();
                    case "number":
                        return Convert.ToInt64(x.Groups["value"].Value).ToString();
                    case "boolean":
                        return Convert.ToBoolean(x.Groups["value"].Value).ToString();
                    default:
                        return x.Groups["value"].Value;
                }
            });
        } while (prev != text);
        Console.WriteLine(text);
    }
}

See the online C# demo.

The regex is

(?s)\$(?<functionName>\w+)\('?(?<value>[^'\\]*(?:\\.[^'\\]*)*)'?\)\$

See the online regex demo. Details:

  • (?s) - a RegexOptions.Singleline inline regex option that makes . match any char, including a newline
  • \$ - a dollar symbol
  • (?<functionName>\w+) - Group "functionName": one or more word chars
  • \( - a ( char
  • '? - an optional ' char
  • (?<value>[^'\\]*(?:\\.[^'\\]*)*) - Group "value": zero or more chars other than ' and \, then zero or more occurrences of an escaped char followed by zero or more chars other than ' and \
  • '?\) - an optional ' char followed by a ) char
  • \$ - a dollar symbol
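
To see why this pattern resolves nested calls from the inside out, it can help to list what a single pass matches. The sketch below is only an illustration: the class name FirstPassDemo and the method FirstPassMatches are hypothetical helpers, not part of the answer's code. It relies on the fact that the "value" group cannot cross a ' character, so an outer call that still contains a quoted inner call does not match yet.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class FirstPassDemo
{
    // Same pattern as in the answer.
    private const string Pattern =
        @"(?s)\$(?<functionName>\w+)\('?(?<value>[^'\\]*(?:\\.[^'\\]*)*)'?\)\$";

    // Returns "functionName:value" for every match found in one pass over the text.
    public static string[] FirstPassMatches(string text)
    {
        return Regex.Matches(text, Pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups["functionName"].Value + ":" + m.Groups["value"].Value)
                    .ToArray();
    }
}
```

For the input $upper($trim('somevalue')$)$ - $upper('anothervalue')$ this yields two entries, trim:somevalue and upper:anothervalue. The outer $upper(...)$ only becomes matchable once the inner call has been replaced by its result.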

The do {...} while (prev != text) loop repeats the search and replace on the text variable until a pass leaves it unchanged, that is, until no match is found.
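
The effect of the loop is easier to follow when the intermediate text is recorded after each pass. The following is a minimal sketch under stated assumptions: the class PassTrace and its methods Apply and Trace are hypothetical names, and only the trim, upper and lower cases from the answer are kept for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class PassTrace
{
    // Same pattern as in the answer.
    private const string Pattern =
        @"(?s)\$(?<functionName>\w+)\('?(?<value>[^'\\]*(?:\\.[^'\\]*)*)'?\)\$";

    // Evaluates one matched call; a reduced version of the answer's switch.
    private static string Apply(Match m)
    {
        var value = m.Groups["value"].Value;
        switch (m.Groups["functionName"].Value.ToLower())
        {
            case "trim":  return value.Trim();
            case "upper": return value.ToUpper();
            case "lower": return value.ToLower();
            default:      return value;
        }
    }

    // Returns the text as it looks after each pass that changed something.
    public static List<string> Trace(string text)
    {
        var steps = new List<string>();
        while (true)
        {
            var next = Regex.Replace(text, Pattern, Apply);
            if (next == text) break;   // fixed point reached: nothing left to replace
            steps.Add(next);
            text = next;
        }
        return steps;
    }
}
```

Calling PassTrace.Trace("$upper($trim('somevalue   ')$)$ - $upper('anothervalue')$") returns two steps: first $upper(somevalue)$ - ANOTHERVALUE, then SOMEVALUE - ANOTHERVALUE. Note that results are substituted back without quotes, so inputs that end up with two unquoted calls side by side (for example two nested calls in one string) may need a stricter "value" pattern; that case is not covered by this sketch.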
