
Split string after specific character or after max length

I want to split a string in the following way:

string s = "012345678x0123x01234567890123456789";
s.SplitString("x",10);

It should be split into:

012345678
x0123
x012345678
9012345678
9

That is, the input string should be split at the character "x" (which starts the next piece) or after 10 characters, whichever comes first.

Here is what I've tried so far:

public static IEnumerable<string> SplitString(this string sInput, string search, int maxlength)
{
    int index = Math.Min(sInput.IndexOf(search), maxlength);
    int start = 0;
    while (index != -1)
    {
        yield return sInput.Substring(start, index-start);
        start = index;
        index = Math.Min(sInput.IndexOf(search,start), maxlength);
    }
}

I would go with this regular expression:

([^x]{1,10})|(x[^x]{1,9})

which means:

Match between 1 and 10 characters that are not x, OR match x followed by between 1 and 9 characters that are not x.

Here is a working example:

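// Needs these namespaces for Cast/Select and Regex:
using System.Linq;
using System.Text.RegularExpressions;

string regex = "([^x]{1,10})|(x[^x]{1,9})";
string input = "012345678x0123x01234567890123456789";
// Each match is one piece of the split string.
var results = Regex.Matches(input, regex)
                    .Cast<Match>()
                    .Select(m => m.Value);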

which produces the values you listed.
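
The pattern above hard-codes the separator x and the length 10. As a rough sketch, assuming a single-character separator, the same idea could be wrapped in a helper that builds the pattern from its arguments. The names RegexSplitSketch and SplitWithRegex are made up for this illustration and are not part of the answer. One deliberate difference: the second alternative uses {0,n-1} instead of {1,9}, because with {1,9} an x that is immediately followed by another x, or that ends the input, is not matched and silently drops out of the results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitSketch
{
    // Hypothetical helper: builds the answer's pattern for any separator and length.
    public static IEnumerable<string> SplitWithRegex(string input, char separator, int maxLength)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (maxLength < 1) throw new ArgumentOutOfRangeException(nameof(maxLength));

        // Write the separator as \uXXXX so it is safe both inside and outside
        // a character class, whatever character it happens to be.
        string sep = "\\u" + ((int)separator).ToString("x4");

        // Either 1..maxLength non-separator characters,
        // or the separator followed by 0..maxLength-1 non-separator characters.
        string pattern = "[^" + sep + "]{1," + maxLength + "}"
                       + "|" + sep + "[^" + sep + "]{0," + (maxLength - 1) + "}";

        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value);
    }
}
```

Calling RegexSplitSketch.SplitWithRegex(input, 'x', 10) on the string from the question should give the same five pieces as the hard-coded pattern.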

Personally I don't like RegEx. It creates code that is hard to debug, and it is very hard to work out what it is meant to be doing when you first look at it. So, as a lengthier alternative, I would go with something like this:

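    // Needs System.Collections.Generic and must live in a static class (extension method).
    public static IEnumerable<string> SplitString(this string sInput, char search, int maxlength)
    {
        var result = new List<string>();
        var count = 0;      // index of the character currently being examined
        var lastSplit = 0;  // index where the current piece starts

        foreach (char c in sInput)
        {
            // Close the current piece when the separator is reached or the piece is full.
            // The count > lastSplit check avoids an empty piece when the input starts with the separator.
            if (count > lastSplit && (c == search || count - lastSplit == maxlength))
            {
                result.Add(sInput.Substring(lastSplit, count - lastSplit));
                lastSplit = count;
            }

            count++;
        }

        // Add whatever is left after the last split point.
        if (count > lastSplit)
        {
            result.Add(sInput.Substring(lastSplit, count - lastSplit));
        }

        return result;
    }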

Note that I changed the first parameter to a char (from a string). This code can probably be optimised some more, but it is nice and readable, which for me is more important.
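
If the string parameter and the lazy yield return style from the question are wanted, a minimal sketch along the following lines might work. It is only an illustration: LazySplitSketch and SplitLazily are hypothetical names, and it assumes an ordinal (culture-insensitive) search is acceptable. It jumps between split points with IndexOf instead of walking character by character.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySplitSketch
{
    // Hypothetical helper: yields pieces one at a time instead of building a list.
    public static IEnumerable<string> SplitLazily(string input, string search, int maxLength)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (string.IsNullOrEmpty(search)) throw new ArgumentException("search must not be empty", nameof(search));
        if (maxLength < 1) throw new ArgumentOutOfRangeException(nameof(maxLength));

        int start = 0;
        while (start < input.Length)
        {
            // Look for the next separator strictly after the start of this piece,
            // so a piece that begins with the separator keeps it.
            int next = input.IndexOf(search, start + 1, StringComparison.Ordinal);

            // By default the piece ends after maxLength characters or at the end of the input.
            int end = Math.Min(start + maxLength, input.Length);

            // Cut earlier if a separator comes first.
            if (next != -1 && next < end)
            {
                end = next;
            }

            yield return input.Substring(start, end - start);
            start = end;
        }
    }
}
```

Because this is an iterator, the argument checks only run once enumeration starts. With a separator longer than one character, the length limit can still cut through the middle of a separator; this sketch does not try to prevent that.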
