C# split string by multiple characters

I want to split a string like this:

"---hello--- hello ------- hello --- bye"

into an array like this:

"hello" ; "hello ------- hello" ; "bye"

I tried it with this command:

test.Split(new string[] {"---"}, StringSplitOptions.RemoveEmptyEntries);

But that doesn't work: it also splits inside the "-------" run, so the middle element is broken up and I get a piece like "- hello".

Edit:

I can't modify the text. It is an input, and I don't know what it looks like before I have to process it.

Another example would be:

--- example ---

--------- example text --------

--- example 2 ---

It should only split on the runs of exactly 3 hyphens, not on the ones with more.

You can use a Regex split. The regex uses a negative lookbehind (?<!-) and a negative lookahead (?!-) so that it only matches exactly three hyphens. See also "Get exact match of the word using Regex in C#".

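using System;
using System.Text.RegularExpressions;

string sentence = "---hello--- hello ------- hello --- bye";
// Split only on exactly three hyphens (no hyphen directly before or after).
var result = Regex.Split(sentence, @"(?<!-)---(?!-)");
foreach (string value in result) {
    Console.WriteLine(value.Trim());
}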

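As a minimal sketch, the same pattern can be wrapped in a small helper that also trims each piece and drops the empty ones, roughly what StringSplitOptions.RemoveEmptyEntries would have done. The names HyphenSplitter and SplitOnTripleHyphen are made up for this illustration and are not part of any library.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HyphenSplitter
{
    // Exactly three hyphens: no hyphen directly before or after the match.
    private static readonly Regex TripleHyphen = new Regex(@"(?<!-)---(?!-)");

    // Hypothetical helper: split on "---" only, trim whitespace, drop empty pieces.
    public static string[] SplitOnTripleHyphen(string input)
    {
        return TripleHyphen
            .Split(input)
            .Select(part => part.Trim())
            .Where(part => part.Length > 0)
            .ToArray();
    }
}
```

With the input from the question, HyphenSplitter.SplitOnTripleHyphen("---hello--- hello ------- hello --- bye") should give the three elements "hello", "hello ------- hello" and "bye". Because Trim() also removes line breaks, the multi-line example from the edit should come back as "example", "--------- example text --------" and "example 2".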

Solution to find your tokens with regex:

(?<!-)---(?!-)

Console.WriteLine(String.Join(",", System.Text.RegularExpressions.Regex.Split("---hello--- hello ------- hello --- bye", @"(?<!-)---(?!-)")));

I suggest trying Regex.Split instead of string.Split:

  string source = "---hello1--- hello2 ------- hello3 --- bye";

  var result = Regex
    .Split(source, @"(?<=[^-]|^)-{3}(?=[^-]|$)") // splitter is three "-" only
    .Where(item => !string.IsNullOrEmpty(item))  // Removing Empty Entries
    .ToArray();

  Console.Write(string.Join(";", result));

Outcome:

  hello1; hello2 ------- hello3 ; bye

1. Replace the longer run ------- with something that never occurs in your text, like @@@: test.Replace("-------", "@@@")
2. Split your string
3. Replace @@@ with -------
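
The three steps above could be sketched roughly as follows. This is only an illustration under two strong assumptions: the longer run is exactly seven hyphens, as in the question's first example, and "@@@" never appears in the input. The names PlaceholderSplit and Split are hypothetical.

```csharp
using System;
using System.Linq;

public static class PlaceholderSplit
{
    // Assumptions for illustration only: the long run is exactly seven hyphens
    // and the placeholder never occurs in the input.
    private const string LongRun = "-------";
    private const string Placeholder = "@@@";

    public static string[] Split(string input)
    {
        // 1. Hide the long run behind the placeholder.
        string masked = input.Replace(LongRun, Placeholder);

        // 2. Split on the three-hyphen delimiter.
        string[] parts = masked.Split(new[] { "---" }, StringSplitOptions.RemoveEmptyEntries);

        // 3. Restore the long run in every piece.
        return parts.Select(p => p.Replace(Placeholder, LongRun)).ToArray();
    }
}
```

For "---hello--- hello ------- hello --- bye" this should return "hello", " hello ------- hello " and " bye" (untrimmed). Runs of any other length are not protected by this sketch, so it only fits input whose long runs are known in advance.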

I would suggest using a neutral delimiter like "/split" or something similar. Then you can use test.Split(...) without worrying that it splits something you want to keep. Your code would then look something like this:

string test = "hello/split hello ------- hello /split bye";
string[] parts = test.Split(new[] { "/split" }, StringSplitOptions.RemoveEmptyEntries);

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original address.