String split using C#

I have the following string:

string text = "1. This is first sentence. 2. This is the second sentence. 3. This is the third sentence. 4. This is the fourth sentence.";

I want to split it according to 1. 2. 3. and so on:

result[0] == "This is first sentence."
result[1] == "This is the second sentence."
result[2] == "This is the third sentence."
result[3] == "This is the fourth sentence."

Is there any way I can do this in C#?

Assuming this pattern cannot occur inside your sentences: X. (an integer, followed by a dot, followed by a space), this should work:

// Needs using System.Text.RegularExpressions; text starts with "1. ", so result[0] is "".
String[] result = Regex.Split(text, @"[0-9]+\. ");
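Because the input begins with "1. ", the plain Regex.Split call yields an empty first entry, and each sentence keeps the space that preceded the next number. A minimal sketch of one way to clean that up, assuming the same pattern; the class name NumberedSplitter and the method SplitNumbered are hypothetical names used only for illustration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberedSplitter
{
    // Hypothetical helper: splits on "<digits>. " markers, trims each piece,
    // and drops empty pieces (such as the one before the leading "1. ").
    public static string[] SplitNumbered(string text)
    {
        return Regex.Split(text, @"[0-9]+\. ")
                    .Select(s => s.Trim())
                    .Where(s => s.Length > 0)
                    .ToArray();
    }

    public static void Main()
    {
        string text = "1. This is first sentence. 2. This is the second sentence. "
                    + "3. This is the third sentence. 4. This is the fourth sentence.";

        // Print one cleaned sentence per line.
        foreach (string sentence in SplitNumbered(text))
        {
            Console.WriteLine(sentence);
        }
    }
}
```

This still splits on any "digits, dot, space" sequence, so a sentence such as "I counted to 5. Then I stopped." would be cut in two.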

Is it possible that there will be numbers in the sentences too?

As I do not know your formatting, and you already said you cannot split on EOL/new line, I would try something like...

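List<string> lines = new List<string>();
string buffer = "";
int count = 1;

foreach (char c in text)
{
   // Start a new entry when the next expected number appears.
   // Comparing a single char only works while count is below 10.
   if (c.ToString() == count.ToString())
   {
      if (!string.IsNullOrEmpty(buffer))
      {
         lines.Add(buffer);
         buffer = "";
      }
      count++;
   }
   buffer += c;
}

// Add the last sentence, which is still sitting in the buffer.
if (!string.IsNullOrEmpty(buffer))
{
   lines.Add(buffer);
}

// lines now contains the split data; each entry still starts with its number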

You can then access each sentence like this...

string s1 = lines[0];
string s2 = lines[1];
string s3 = lines[2];

Important: make sure you check the number of entries in lines before accessing a sentence, like...

string s1 = lines.Count > 0 ? lines[0] : "";

This makes a big assumption: a given sentence must not contain the number ID of the next line (i.e. sentence 2 must not contain the number 3).
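The same idea can be sketched with whole markers instead of single characters, which also works once the numbering reaches 10. This is only an illustration under the assumption that every marker has the form "N. " and that the markers appear in ascending order; SequentialSplitter and SplitBySequence are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class SequentialSplitter
{
    // Hypothetical helper: looks for the markers "1. ", "2. ", "3. ", ... in
    // order and returns the trimmed text between consecutive markers.
    public static List<string> SplitBySequence(string text)
    {
        var sentences = new List<string>();
        int number = 1;
        string marker = number + ". ";
        int start = text.IndexOf(marker, StringComparison.Ordinal);

        while (start >= 0)
        {
            // The sentence begins right after the current marker.
            int contentStart = start + marker.Length;

            // Search for the next expected marker after that point.
            number++;
            marker = number + ". ";
            int next = text.IndexOf(marker, contentStart, StringComparison.Ordinal);

            // No further marker: the sentence runs to the end of the text.
            int contentEnd = next >= 0 ? next : text.Length;
            string sentence = text.Substring(contentStart, contentEnd - contentStart).Trim();
            if (sentence.Length > 0)
            {
                sentences.Add(sentence);
            }
            start = next;
        }
        return sentences;
    }

    public static void Main()
    {
        // The "5. " inside the first sentence is not the next expected marker,
        // so it does not cause a split.
        foreach (string s in SplitBySequence("1. I have 5. apples 2. Done."))
        {
            Console.WriteLine(s);
        }
    }
}
```

The limitation described above remains: if sentence 2 itself contains "3. ", the split happens at that earlier position.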

If this does not help, then provide your input in its original format (do not add line breaks if there are none).

EDIT: Fixed my code (wrong variable, sorry).

int index = 1; 
String[] result = Regex.Split(text, @"[0-9]+\. ").Where(i => !string.IsNullOrEmpty(i)).Select(i => (index++).ToString() + ". " + i).ToArray();

result will contain your sentences, including the "line number".

You could split on the '.' char and drop anything shorter than 2 chars from the resulting array.

Of course, this relies on the fact that you would have no data points of 1 character other than the numeric indicator; if that were the case, you could also check whether the entry is a numeric value.

This answer would also drop the period from your sentences, so you'd have to add that back in. There is a lot of manipulation, but this saves you from having to read each char and decide on it independently.
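A rough sketch of this approach, assuming every sentence ends with exactly one period and contains no other periods; DotSplitter and SplitOnDots are hypothetical names. It applies both the length check and the numeric check mentioned above:

```csharp
using System;
using System.Linq;

public static class DotSplitter
{
    // Hypothetical helper: splits on '.', discards pieces that are too short
    // or purely numeric (the "1", "2", ... indicators), and re-appends the
    // period that Split removed.
    public static string[] SplitOnDots(string text)
    {
        return text.Split('.')
                   .Select(part => part.Trim())
                   .Where(part => part.Length >= 2 && !int.TryParse(part, out _))
                   .Select(part => part + ".")
                   .ToArray();
    }

    public static void Main()
    {
        string text = "1. This is first sentence. 2. This is the second sentence.";
        foreach (string sentence in SplitOnDots(text))
        {
            Console.WriteLine(sentence);
        }
    }
}
```

The numeric check is what keeps indicators such as "10" out of the result, since they pass the length test on their own.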

This is the easiest way:

    // Requires: using System; using System.Linq;
    var str = "1. This is first sentence." +
              "2. This is the second sentence." +
              "3. This is the third sentence." +
              "n. This is the nth sentence"; // "n" stands for the last number
    // Set your max number, e.g. 10000
    var num = Enumerable.Range(1, 10000).Select(x => x.ToString() + ".").ToArray();
    var res = str.Split(num, StringSplitOptions.RemoveEmptyEntries);

Hope this helps ;)
