
C# split string but keep separators

There are already similar questions, but all of them use regexes. The code I'm using (which strips the separators):

string[] sentences = s.Split(new string[] { ". ", "? ", "! ", "... " }, StringSplitOptions.None);

I would like to split a block of text on sentence breaks and keep the sentence terminators. I'd like to avoid regexes for performance reasons. Is it possible?

I don't believe there is an existing function that does this. However, you can use the following extension method.

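using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class StringExtensions {
  // Splits on single-character separators, keeping each separator at the end of its piece.
  public static IEnumerable<string> SplitAndKeepSeparators(this string source, char[] separators) {
    var builder = new StringBuilder();
    foreach (var cur in source) {
      builder.Append(cur);
      if (separators.Contains(cur)) {
        yield return builder.ToString();
        builder.Length = 0;  // reset for the next piece
      }
    }
    // Emit any trailing text that has no terminator.
    if (builder.Length > 0) {
      yield return builder.ToString();
    }
  }
}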
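
The separators in the question are multi-character strings (". ", "... "), which a character-by-character check cannot match. Here is a minimal sketch of a variant that handles them without regexes. The names SentenceSplitter and SplitKeepingStringSeparators are hypothetical, and the ordinal comparison and "prefer the longest separator" rule are assumptions on my part, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;

public static class SentenceSplitter {
  // Hypothetical helper: splits on multi-character separators and keeps
  // each separator attached to the end of the piece it terminates.
  public static IEnumerable<string> SplitKeepingStringSeparators(this string source, string[] separators) {
    int start = 0;  // start of the current piece
    int pos = 0;    // current scan position
    while (pos < source.Length) {
      // Assumption: when several separators match here, prefer the longest.
      string match = null;
      foreach (var sep in separators) {
        if (sep.Length > 0
            && string.CompareOrdinal(source, pos, sep, 0, sep.Length) == 0
            && (match == null || sep.Length > match.Length)) {
          match = sep;
        }
      }
      if (match == null) { pos++; continue; }
      pos += match.Length;
      yield return source.Substring(start, pos - start);
      start = pos;
    }
    // Emit any trailing text that has no terminator.
    if (start < source.Length) yield return source.Substring(start);
  }
}
```

Called as "Hi. How are you? Fine... Ok!".SplitKeepingStringSeparators(new[] { ". ", "? ", "! ", "... " }), this should yield "Hi. ", "How are you? ", "Fine... " and "Ok!". It checks every separator at every position, so for a large separator list something smarter may be worth it; for a handful of sentence terminators it is probably fine.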
