How to split a string on the nth occurrence?

What I want to do is split on the nth occurrence of a string (in this case "\t"). This is the code I'm currently using; it splits on every occurrence of "\t".

string[] items = input.Split(new char[] {'\t'}, StringSplitOptions.RemoveEmptyEntries);

If input = "one\ttwo\tthree\tfour", my code returns this array:

  • one
  • two
  • three
  • four

But let's say I want to split on every "\t" after the second "\t". It should then return:

  • one two
  • three
  • four

There is nothing built in.

You can use the existing Split, then use Take and Skip with string.Join to rebuild the parts that you originally had.

// Requires: using System.Linq;
string[] items = input.Split(new char[] {'\t'}, 
                             StringSplitOptions.RemoveEmptyEntries);
// Re-join the parts before and after the nth separator
string firstPart = string.Join("\t", items.Take(nthOccurrence));
string secondPart = string.Join("\t", items.Skip(nthOccurrence));

string[] everythingSplitAfterNthOccurrence = items.Skip(nthOccurrence).ToArray();
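
Applied to the question's input, the Take/Skip pieces can be combined into the single array the question asks for. This is only a minimal sketch: the class name TakeSkipDemo and the method name SplitAfter are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;

public static class TakeSkipDemo
{
    // Hypothetical helper: the first n parts are re-joined into one entry,
    // and every remaining part stays as its own entry.
    public static string[] SplitAfter(string input, int nthOccurrence)
    {
        string[] items = input.Split(new[] { '\t' },
                                     StringSplitOptions.RemoveEmptyEntries);
        string firstPart = string.Join("\t", items.Take(nthOccurrence));
        return new[] { firstPart }.Concat(items.Skip(nthOccurrence)).ToArray();
    }
}
```

With input "one\ttwo\tthree\tfour" and nthOccurrence = 2, this yields three entries: "one\ttwo", "three" and "four". Because RemoveEmptyEntries is used, consecutive tabs inside the first part collapse into one when the part is re-joined.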

An alternative is to iterate over the characters in the string, find the index of the nth occurrence, and take the substrings before and after it (or find the next index after the nth, take the substring up to that, and so on).
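
The index-based alternative could look roughly like the following. It is a sketch under assumptions, not the answerer's code: the names NthSplit and SplitAfterNth are hypothetical, the separator is a single char, and empty entries are kept (unlike the RemoveEmptyEntries version above).

```csharp
using System;
using System.Collections.Generic;

public static class NthSplit
{
    // Hypothetical helper: everything before the nth separator stays as one
    // piece; the remainder is split on every separator.
    public static string[] SplitAfterNth(string input, char separator, int n)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));

        // Walk forward with IndexOf until the nth separator is found.
        int index = -1;
        for (int found = 0; found < n; found++)
        {
            index = input.IndexOf(separator, index + 1);
            if (index < 0)
                return new[] { input }; // fewer than n separators: no split
        }

        var parts = new List<string> { input.Substring(0, index) };
        parts.AddRange(input.Substring(index + 1).Split(separator));
        return parts.ToArray();
    }
}
```

Calling NthSplit.SplitAfterNth("one\ttwo\tthree\tfour", '\t', 2) gives "one\ttwo", "three" and "four". Compared with split-and-rejoin, this leaves the text before the nth separator untouched, which matters when that text contains consecutive separators.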

[EDIT] After re-reading the edited OP, I realise this doesn't do what is now asked. This will split on every nth target; the OP wants to split on every target AFTER the nth one.

I'll leave this here for posterity anyway.


If you were using the MoreLinq extensions, you could take advantage of its Batch method.

Your code would then look like this:

string text = "1\t2\t3\t4\t5\t6\t7\t8\t9\t10\t11\t12\t13\t14\t15\t16\t17";

var splits = text.Split('\t').Batch(5);

foreach (var split in splits)
    Console.WriteLine(string.Join("", split));

I'd probably just use Oded's implementation, but I thought I'd post this as an alternative approach.

The implementation of Batch() looks like this:

public static class EnumerableExt
{
    public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(this IEnumerable<TSource> source, int size)
    {
        TSource[] bucket = null;
        var count = 0;

        foreach (var item in source)
        {
            if (bucket == null)
                bucket = new TSource[size];

            bucket[count++] = item;

            if (count != size)
                continue;

            yield return bucket;

            bucket = null;
            count = 0;
        }

        if (bucket != null && count > 0)
            yield return bucket.Take(count);
    }
}

It is likely that you will have to split and re-combine. Something like:

int tabIndexToRemove = 3;
string str = "My\tstring\twith\tloads\tof\ttabs";
string[] strArr = str.Split('\t');
int numOfTabs = strArr.Length - 1;
if (tabIndexToRemove > numOfTabs)
    throw new IndexOutOfRangeException();
str = String.Empty;
for (int i = 0; i < strArr.Length; i++)
    // Omit the tab after the nth part (the one being removed) and after
    // the last part, so no trailing tab is appended
    str += (i == tabIndexToRemove - 1 || i == strArr.Length - 1) ?
        strArr[i] : String.Format("{0}\t", strArr[i]);

Result:

My string withloads of tabs

I hope this helps.

// Return the substring of str up to, but not including,
// the nth occurrence of substr (JavaScript)
function getNth(str, substr, n) {
  var idx;
  var i = 0;
  var newstr = '';
  do {
    idx = str.indexOf(substr);
    if (idx === -1) {
      // Fewer than n occurrences: keep the remainder and stop
      return newstr + str;
    }
    newstr += str.substring(0, idx);
    str = str.substring(idx + substr.length);
  } while (++i < n && (newstr += substr))
  return newstr;
}
