Reduce memory usage in the below anagram program - C#

I am trying to write an anagram checker that takes a paragraph as input. The output is correct, but the memory usage exceeds the specified limit.

This is the code I tried:

using System;

class Program
{
    static void Main(string[] args)
    {

        string[] arr = (Punct(Console.ReadLine()).ToLower()).Split(' ');
        string a = string.Empty;
        System.Collections.Generic.Dictionary<string, string> dn = new System.Collections.Generic.Dictionary<string, string>(); // *2
        foreach (string s in arr)
        {
            string st = sort(s);
            if (dn.ContainsKey(st))
            {
                if (dn[st] != s)
                {
                    if (a.Contains(dn[st]))
                        a = a.Replace(dn[st], dn[st] + " " + s); // *1
                    else
                        a = a + dn[st] + " " + s + "\n";
                    dn[st] = s;
                }
            }
            else
                dn.Add(st, s);
        }
        Console.Write(a);
    }

    public static string sort(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Sort(chars);
        return new string(chars);
    }

    public static string Punct(string s)
    {
        System.Text.StringBuilder sb = new System.Text.StringBuilder();
        foreach (char c in s)
        {
            if (!char.IsPunctuation(c))
                sb.Append(c);
        }
        return sb.ToString();
    }
}

On checking with a profiler, the string operations take a lot of memory, and so does the dictionary. So my question is: how can I optimize the above code to use the least memory, and are any of the statements or declarations I am using unnecessary?

Input:

Parts of the world have sunlight for close to 24 hours during summer. Dan had a strap on his head to identify himself as the leader and He wondered what kind of traps lay ahead of him.

Output:

parts strap traps
dan and

A few points I noticed:

  • Instead of loading the whole file into memory, read it word by word. It might make things more complex, but it will reduce memory usage for big files. Not that it matters for the example text you provided.
  • Instead of accumulating the result output (in a), just save the words in the dictionary itself, probably in a list, and output them after you have run through the whole file.
  • Try using a radix tree instead of a dictionary.
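
As a rough illustration of the second point, here is a minimal sketch that keeps each anagram group as a list inside the dictionary and builds the output once at the end with a StringBuilder. The names AnagramGroups, SortLetters, Group and keyOrder are made up for this example and are not from the question. It also skips repeated words, which is a simplification rather than an exact copy of the original behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class AnagramGroups
{
    // Sorted letters act as the key shared by all anagrams of a word.
    static string SortLetters(string word)
    {
        char[] chars = word.ToCharArray();
        Array.Sort(chars);
        return new string(chars);
    }

    // Hypothetical helper: returns one line per anagram group with 2+ distinct words.
    public static string Group(IEnumerable<string> words)
    {
        var groups = new Dictionary<string, List<string>>();
        var keyOrder = new List<string>(); // remembers the first-seen order of keys

        foreach (string word in words)
        {
            if (word.Length == 0) continue; // Split(' ') can yield empty entries

            string key = SortLetters(word);
            if (!groups.TryGetValue(key, out var members))
            {
                members = new List<string>();
                groups.Add(key, members);
                keyOrder.Add(key);
            }
            // Linear scan is fine here because anagram groups are small.
            if (!members.Contains(word)) members.Add(word);
        }

        // Build the output once instead of repeatedly replacing inside a string.
        var output = new StringBuilder();
        foreach (string key in keyOrder)
        {
            List<string> members = groups[key];
            if (members.Count > 1)
                output.Append(string.Join(" ", members)).Append('\n');
        }
        return output.ToString();
    }
}
```

With the question's helpers, this could be called as AnagramGroups.Group(Punct(Console.ReadLine()).ToLower().Split(' ')). The a.Contains / a.Replace calls go away, and those are what create a new copy of the whole output string on every hit.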

I think my 2nd point is the most important one in your case. The 1st and 3rd points would matter if you had a much bigger file with many more distinct words but only a few anagram "hits".
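
For the first point, reading word by word could look roughly like the sketch below. WordReader and ReadWords are hypothetical names, not part of the question's code. Unlike the original, this splits on any whitespace rather than only on ' ', and it lowercases with char.ToLowerInvariant, so treat it as one possible approach rather than a drop-in replacement.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class WordReader
{
    // Lazily yields lower-cased words with punctuation stripped,
    // holding only the current word in memory instead of the whole input.
    public static IEnumerable<string> ReadWords(TextReader reader)
    {
        var word = new StringBuilder();
        int next;
        while ((next = reader.Read()) != -1)
        {
            char c = (char)next;
            if (char.IsWhiteSpace(c))
            {
                // Whitespace ends the current word, if there is one.
                if (word.Length > 0)
                {
                    yield return word.ToString();
                    word.Clear();
                }
            }
            else if (!char.IsPunctuation(c))
            {
                word.Append(char.ToLowerInvariant(c));
            }
        }
        // Emit the last word when the input does not end in whitespace.
        if (word.Length > 0)
            yield return word.ToString();
    }
}
```

For console input this would be used as foreach (string w in WordReader.ReadWords(Console.In)) { ... }, and for a file the same method accepts a StreamReader.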

Try to minimize string operations, especially concatenation without a StringBuilder. In addition, you can just use LINQ:

// Requires "using System.Linq;" and reuses sort() and Punct() from the question.
string input = Console.ReadLine();
string[] words = (Punct(input).ToLower()).Split(' ');

var anagramStrings = words
    .Distinct()
    .GroupBy(sort)
    .Where(anagrams => anagrams.Count() > 1)
    .Select(anagrams => String.Join(" ", anagrams));

string output = String.Join("\n", anagramStrings);

Console.Write(output);
