
Regex.Split White Space

string pattern = @"(if)|(\()|(\))|(\,)";
string str = "IF(SUM(IRS5555.IRs001)==IRS5555.IRS001,10,20)";
string[] substrings = Regex.Split(str,pattern,RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase) ;
foreach (string match in substrings)
{
    Console.WriteLine("Token is:{0}", match);
}

The output is:

Token is:
Token is:IF
Token is:
Token is:(
Token is:SUM
Token is:(
Token is:IRS5555.IRs001
Token is:)
Token is:==IRS5555.IRS001
Token is:,
Token is:10
Token is:,
Token is:20
Token is:)
Token is:

As you can see, the first, third and last tokens are empty strings. I don't understand why I get this result, since there is no empty string in the input.

I don't want these empty entries in the result.

Try this, which filters out the empty entries (Where requires using System.Linq):

        string pattern = @"(if)|(\()|(\))|(\,)";
        string str = "IF(SUM(IRS5555.IRs001)==IRS5555.IRS001,10,20)";
        var substrings = Regex.Split(str, pattern, RegexOptions.IgnoreCase).Where(n => !string.IsNullOrEmpty(n));
        foreach (string match in substrings)
        {
            Console.WriteLine("Token is:{0}", match);
        }


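This happens because "IF" and "(" are both separators. Nothing precedes "IF" and nothing lies between "IF" and "(", so Regex.Split returns an empty entry for each of those gaps. The last entry is empty for the same reason: nothing follows the final ")". Removing "IF" from the pattern gets rid of the first two empty entries; the trailing one remains as long as ")" is a separator.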

string pattern = @"(\()|(\))|(\,)"; 
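The effect is easier to see on a smaller input. The following is a minimal sketch, not part of the original answer; the class name SplitDemo and the helper Show are made up for illustration. It joins the pieces with '|' so that empty entries become visible, first with a plain separator and then with a capturing group, which makes Regex.Split return the separators as well.

```csharp
using System;
using System.Text.RegularExpressions;

static class SplitDemo
{
    // Hypothetical helper: joins the split result with '|' so empty entries show up.
    public static string Show(string input, string pattern) =>
        string.Join("|", Regex.Split(input, pattern));

    static void Main()
    {
        // Two adjacent commas and a trailing comma each leave an empty entry.
        Console.WriteLine(Show("a,,b,", ","));    // a||b|

        // With a capturing group the separators are returned too,
        // but the empty entries between and after them are still there.
        Console.WriteLine(Show("a,,b,", "(,)"));  // a|,||,|b|,|
    }
}
```

Filtering with Where, or matching tokens instead of splitting, avoids these entries.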

UPDATE

You could search for the tokens instead of splitting the string:

var matches = Regex.Matches(str, @"\w+|[().,]|==");

This returns exactly the tokens of your text:

string[] array = matches.Cast<Match>().Select(m => m.Value).ToArray();
    [0]: "IF"
    [1]: "("
    [2]: "SUM"
    [3]: "("
    [4]: "IRS5555"
    [5]: "."
    [6]: "IRs001"
    [7]: ")"
    [8]: "=="
    [9]: "IRS5555"
    [10]: "."
    [11]: "IRS001"
    [12]: ","
    [13]: "10"
    [14]: ","
    [15]: "20"
    [16]: ")"
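Put together as a runnable program, the matching approach might look like the sketch below. The names Tokenizer and Tokenize are assumptions made for this example; the pattern is the one shown above. One caveat to keep in mind: any character the pattern does not cover (for example "+" or a single "=") is skipped silently rather than reported, so the pattern has to list every token you expect.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class Tokenizer
{
    // Words/numbers, single punctuation characters, or the "==" operator.
    private static readonly Regex TokenPattern = new Regex(@"\w+|[().,]|==");

    // Hypothetical helper: returns every match of the token pattern, in order.
    public static string[] Tokenize(string input) =>
        TokenPattern.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        string str = "IF(SUM(IRS5555.IRs001)==IRS5555.IRS001,10,20)";
        foreach (string token in Tokenize(str))
        {
            Console.WriteLine("Token is:{0}", token);
        }
    }
}
```

Because only matches are returned, no empty strings can appear in the result.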

UPDATE

Another Regex pattern you can try together with Regex.Split is:

@"\b"

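It splits the text at word boundaries. Note that the first entry is empty, because there is a word boundary at the very start of the string, and that ")==" stays together as one entry, because there is no word boundary between non-word characters: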

    [0]: ""
    [1]: "IF"
    [2]: "("
    [3]: "SUM"
    [4]: "("
    [5]: "IRS5555"
    [6]: "."
    [7]: "IRs001"
    [8]: ")=="
    [9]: "IRS5555"
    [10]: "."
    [11]: "IRS001"
    [12]: ","
    [13]: "10"
    [14]: ","
    [15]: "20"
    [16]: ")"
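If you go with the word-boundary split, the empty entries at the edges can be filtered out the same way as in the earlier answer. This is only a sketch; BoundarySplit and SplitAtWordBoundaries are hypothetical names, and the shorter input is chosen to keep the output readable.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class BoundarySplit
{
    // Hypothetical helper: splits at word boundaries and drops empty entries.
    public static string[] SplitAtWordBoundaries(string input) =>
        Regex.Split(input, @"\b").Where(s => s.Length > 0).ToArray();

    static void Main()
    {
        // Runs of non-word characters such as ")==" are not separated.
        Console.WriteLine(string.Join(" ", SplitAtWordBoundaries("SUM(A.B)==C")));
        // SUM ( A . B )== C
    }
}
```

If ")" and "==" need to be separate tokens, the Regex.Matches approach is the better fit.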


 