How to split a text value into an array of words in C#?

Is it possible to save the value of txtSearche in an array, split into separate words?

txtSearche = "put returns between paragraphs";

Something like this:

 StringBuilder sb = new StringBuilder(txtSearche);

array1 = sb[1]   = put
array2 = sb[2]   = returns
array3 = sb[3]
array4 = sb[4]
array5 = sb[5]

How do I do this correctly?

Yes, try this:

string[] words = txtSearche.Split(' ');

This will give you:

words[0]   = put
words[1]   = returns
words[2]   = between
words[3]   = paragraphs

EDIT: Also, as Adkins mentions below, the words array will be created at whatever size the provided string requires. If you want a collection with a dynamic size, I would drop the array into a list using List<string> wordList = words.ToList(); (ToList() needs using System.Linq;).
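As a minimal sketch of the array-to-list step described above (the class and method names SplitToListDemo and ToWordList are made up for illustration, not part of the original answer):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitToListDemo
{
    // Splits on single spaces and copies the result into a resizable list.
    public static List<string> ToWordList(string input)
    {
        string[] words = input.Split(' ');
        return words.ToList();   // ToList() is an extension method from System.Linq
    }

    public static void Main()
    {
        List<string> wordList = ToWordList("put returns between paragraphs");
        wordList.Add("here");    // a list can grow; the fixed-size array could not
        Console.WriteLine(string.Join(", ", wordList));
        // prints: put, returns, between, paragraphs, here
    }
}
```

The array returned by Split has a fixed length, so converting it is only worthwhile if you intend to add or remove words afterwards.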

EDIT: Nakul, to split on one or more spaces, pass the separators as parameters to the Split() method like below. List the longest separator first, because Split tries the separators in array order at each position. This covers runs of up to three spaces:

txtSearche.Split(new string[] { "   ", "  ", " " }, StringSplitOptions.None);

Or you can simply split on a single space and ignore the empty entries caused by consecutive spaces by using the StringSplitOptions.RemoveEmptyEntries enum value, like so:

txtSearche.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
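The same option also works with a char[] of separators, which makes it easy to treat tabs and line breaks as delimiters too. The following is only a sketch; the names WhitespaceSplitDemo, Separators and SplitWords are hypothetical, and the choice of separator characters is an assumption about what counts as a word boundary:

```csharp
using System;

public static class WhitespaceSplitDemo
{
    // Assumed set of word boundaries: space, tab, carriage return, line feed.
    private static readonly char[] Separators = { ' ', '\t', '\r', '\n' };

    public static string[] SplitWords(string input)
    {
        // RemoveEmptyEntries drops the empty strings produced by
        // consecutive, leading, or trailing separators.
        return input.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string[] words = SplitWords("put   returns\tbetween\r\nparagraphs ");
        Console.WriteLine(words.Length);            // prints: 4
        Console.WriteLine(string.Join("|", words)); // prints: put|returns|between|paragraphs
    }
}
```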

You could use String.Split.

The example below splits the string into an array with each word as an item:

string[] words = txtSearche.Split(' ');

You can find more details here.

None of the above work with multiple spaces or new lines!

Here is what works with them:

 // requires: using System.Text.RegularExpressions;
 string text = "hi!\r\nI am     a wonderful56 text... \r\nyeah...";
 string[] words = Regex.Split(text, @"\s+", RegexOptions.Singleline);
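One detail to keep in mind with Regex.Split: when the input starts or ends with whitespace, the returned array contains an empty string at that end. A small sketch of one way around this is to trim first (RegexSplitDemo and SplitOnWhitespace are illustrative names, not from the original answer):

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSplitDemo
{
    public static string[] SplitOnWhitespace(string input)
    {
        // Trim first so a leading or trailing match does not add an empty entry.
        string trimmed = input.Trim();
        if (trimmed.Length == 0)
        {
            return new string[0];   // nothing but whitespace: no words
        }
        return Regex.Split(trimmed, @"\s+");
    }

    public static void Main()
    {
        string text = "  hi!\r\nI am     a wonderful56 text... \r\nyeah...  ";
        foreach (string word in SplitOnWhitespace(text))
        {
            Console.WriteLine(word);   // punctuation stays attached, e.g. "hi!" and "text..."
        }
    }
}
```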

If you need to remove the ellipses, more processing is required, and I can give you that as well.

UPDATE

In fact, this is better:

 string text = "hi!\r\nI am     a wonderful56 text... \r\nyeah...";
 // \w already matches letters, digits, and the underscore
 MatchCollection matches = Regex.Matches(text, @"\w+");
 foreach (Match match in matches)
 {
     // every item in a MatchCollection is a successful match
     Console.WriteLine(match.Value);
 }

Outputs (one word per line):

hi
I
am
a
wonderful56
text
yeah
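Since the question asks for an array rather than console output, the matches can be collected into a string[]. This is a sketch under the assumption that a "word" is any run of \w characters; RegexMatchesDemo and ExtractWords are hypothetical names:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexMatchesDemo
{
    public static string[] ExtractWords(string input)
    {
        // Cast<Match>() is used because MatchCollection is a non-generic
        // collection on older frameworks.
        return Regex.Matches(input, @"\w+")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string text = "hi!\r\nI am     a wonderful56 text... \r\nyeah...";
        string[] words = ExtractWords(text);
        Console.WriteLine(words.Length);            // prints: 7
        Console.WriteLine(string.Join(" ", words)); // prints: hi I am a wonderful56 text yeah
    }
}
```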

StringBuilder sb = new StringBuilder(txtSearche);

var result = sb.ToString().Split(' ');

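If you want a more complete solution and are not too worried about performance, you can use this one-liner, which handles punctuation and the like and gives you an array of words.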

string[] words = Regex.Replace(Regex.Replace(text, "[^a-zA-Z0-9 ]", " "), @"\s+", " ").Trim().Split(' ');

private void button1_Click(object sender, EventArgs e)
{
    string s = textBox1.Text;            
    string[] words = s.Split(' ');           
    textBox2.Text = words[0];
    textBox3.Text = words[1];
}
