How to count word occurrences in an array of strings using C#?

I'm new to programming, and I am trying to write a program that takes in an array of strings (each element of the array being a word) and then counts the occurrences of each word in the array. This is what I have so far:

        string[] words = 
        {
            "which", 
            "wristwatches", 
            "are", 
            "swiss", 
            "wristwatches"
        };

        Array.Sort (words);
        for (int i = 0; i < words.Length; i++) 
        {
            int count = 1;
            for(int j = 1; j < words.Length; j++)
            {
                if (words [i] == words [j])
                {
                    count++;
                }
            }
            Console.WriteLine ("{0}   {1}", words[i], count);
        } 

Ideally, I would like the output to be something like:

        are 1
        swiss 1
        which 1
        wristwatches 2

The problems with your code are (1) double-counting and (2) skipping the initial element in the nested loop.

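You double-count because the inner loop does not skip the case i == j, even though count already starts at 1 for words[i] itself; you skip the initial element because you set int j = 1. With the sorted sample array, the original loop therefore reports swiss 2 and wristwatches 3.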

The shortest solution is to use LINQ, like this:

var counts = words
    .GroupBy(w => w)
    .Select(g => new {Word = g.Key, Count = g.Count()})
    .ToList();

Now you can print the results like this:

foreach (var p in counts) {
    Console.WriteLine("Word '{0}' found {1} times", p.Word, p.Count);
}
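
In LINQ to Objects, GroupBy yields groups in the order in which their keys first appear, so the listing above follows the input order rather than the alphabetical order shown in the question unless the array is sorted first. One possible way to get the alphabetical listing is to add an OrderBy. This is only a minimal sketch: the class name WordCountLinq and the choice of StringComparer.Ordinal are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Linq;

class WordCountLinq
{
    static void Main()
    {
        string[] words = { "which", "wristwatches", "are", "swiss", "wristwatches" };

        // Group equal words, count each group, then sort the groups by word.
        var counts = words
            .GroupBy(w => w)
            .Select(g => new { Word = g.Key, Count = g.Count() })
            .OrderBy(p => p.Word, StringComparer.Ordinal) // assumed: ordinal ordering
            .ToList();

        foreach (var p in counts)
        {
            Console.WriteLine("{0} {1}", p.Word, p.Count);
        }
    }
}
```

For the sample array this should print are 1, swiss 1, which 1 and wristwatches 2, one pair per line.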

There are certainly more efficient ways of handling this (take a look at dasblinkenlight's answer for an extremely good one), but assuming you'd like to keep roughly the same code, you should change your second for loop to something along these lines:

for(int j = i+1; j < words.Length; j++)
{
    if (words [i] == words [j])
    {
        count++;
    }
    else break;
}

Here are the two changes I made:

1) You should initialize j to i+1. You want to check whether any of the remaining strings are equal to words[i], and the remaining strings start at i+1, not at 1 (unless i = 0).

2) For the sake of efficiency, you'll want to break out of the second loop as soon as the two strings aren't equal. Since you sorted the array alphabetically, if the word you're currently looking at isn't equal, none of the ones after it will be either.
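
Note that if only the inner loop is changed, the outer loop still visits every element, so a repeated word is printed once per occurrence with a shrinking count (wristwatches 2, then wristwatches 1). One way to avoid this, sketched below as an assumption rather than as part of the answer, is to advance i past the whole run of equal words. The class name WordCountSorted and the use of StringComparer.Ordinal for sorting are illustrative choices.

```csharp
using System;

class WordCountSorted
{
    static void Main()
    {
        string[] words = { "which", "wristwatches", "are", "swiss", "wristwatches" };

        // Sorting places equal words next to each other.
        Array.Sort(words, StringComparer.Ordinal);

        int i = 0;
        while (i < words.Length)
        {
            int count = 1;

            // Count the run of words equal to words[i]; stop at the first mismatch.
            for (int j = i + 1; j < words.Length; j++)
            {
                if (words[i] == words[j])
                {
                    count++;
                }
                else
                {
                    break;
                }
            }

            Console.WriteLine("{0} {1}", words[i], count);

            // Skip the whole run so each distinct word is printed once.
            i += count;
        }
    }
}
```

For the sample array this should print are 1, swiss 1, which 1 and wristwatches 2, one pair per line.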

To make the comparison explicit, you can use String.Compare():

    for (int i = 0; i < words.Length; i++)
    {
        int count = 1;
        bool seenEarlier = false;  // set when the same word occurs at a lower index
        for (int j = 0; j < words.Length; j++)
        {
            if (i != j)  // do not compare a string with itself
            {
                if (string.Compare(words[i], words[j]) == 0)  // 0 means the strings are equal
                {
                    count++;
                    if (j < i)
                    {
                        seenEarlier = true;
                    }
                }
            }
        }
        if (!seenEarlier)  // print each word only at its first occurrence
        {
            Console.WriteLine("{0}   {1}", words[i], count);
        }
    }

This will not print the same word twice.

var occrs = words.GroupBy(x => x.ToLower())
               .ToDictionary(g => g.Key, g => g.Count());
foreach(var pair in occrs)
    Console.WriteLine(pair.Key + " " +pair.Value);

Make use of the dictionary data structure. Here the dictionary stores each word as a key and its count as the value. Insert all the words into the dictionary: if the inserted word is new, set its value to 1; otherwise increment its value by 1.

        Dictionary<string, int> wordCount = new Dictionary<string, int>();

        // Add the word with a count of 1 if it is new;
        // otherwise increment its existing count

        for (int i = 0; i < words.Length; i++)
        {
            if (wordCount.ContainsKey(words[i]))
            {
                wordCount[words[i]] += 1;
            }
            else
            {
                wordCount.Add(words[i], 1);
            }
        }

        // display each word and its corresponding count

        foreach (var item in wordCount)
        {
            Console.WriteLine ("{0}   {1}", item.Key, item.Value);
        }
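
A Dictionary<string, int> does not guarantee any particular enumeration order, so the output above is not necessarily alphabetical. A possible variant, sketched here under the assumption that alphabetical output is wanted, uses SortedDictionary and TryGetValue. The class name WordCountDictionary and the ordinal comparer are illustrative choices, and the `out int current` syntax assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

class WordCountDictionary
{
    static void Main()
    {
        string[] words = { "which", "wristwatches", "are", "swiss", "wristwatches" };

        // SortedDictionary enumerates its keys in sorted order (ordinal here).
        var wordCount = new SortedDictionary<string, int>(StringComparer.Ordinal);

        foreach (string word in words)
        {
            // TryGetValue leaves current at 0 when the word is not present yet.
            wordCount.TryGetValue(word, out int current);
            wordCount[word] = current + 1;
        }

        foreach (var item in wordCount)
        {
            Console.WriteLine("{0} {1}", item.Key, item.Value);
        }
    }
}
```

For the sample array this should print are 1, swiss 1, which 1 and wristwatches 2, one pair per line.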
