Convert a double list to a grouped string

The program takes a list of doubles as input, and the output needs to be a string containing the list values grouped by their value. The list values are grouped if they are equal. For example: input 9, 77, 5, 5, 31 => output 9 77 2*5 31

I wrote an algorithm for this in C# (I think it would be almost the same in Java), but I am not sure whether its speed or code quality can be improved, or whether it has bugs that I could not see. The algorithm, together with some more input and output examples, is below.

        List<double> input = new List<double> { 11, 32, 32, 43 }; // output 11 2*32 43
        //List<double> input = new List<double> { 11, 11, 43, 43 }; // output 2*11 2*43
        //List<double> input = new List<double> { 10, 11, 12, 13, 14, 15, 16 }; // output 10 11 12 13 14 15 16
        //List<double> input = new List<double> { 11, 11, 11, 11, 11 }; // output 5*11
        //List<double> input = new List<double> { 11, 11, 32, 22, 22, 22, 4, 10, 10 }; // output 2*11 32 3*22 4 2*10

        string listAsString = string.Empty;
        double nextElem = double.MinValue;
        for (int i = 0; i < input.Count; i++)
        {
            double currentElem = input[i];

            if (i + 1 < input.Count)
            {
                nextElem = input[i + 1];
            }

            int equalCount = 0;
            while (currentElem.Equals(nextElem) && i < input.Count)
            {
                equalCount++;
                i++;
                currentElem = nextElem;

                if (i < input.Count)
                {
                    nextElem = input[i];
                }
            }

            if (equalCount < 2)
            {
                listAsString += currentElem + " ";
            }
            else
            {
                listAsString += equalCount + "*" + currentElem + " ";
                i--;
            }
        }

        Console.WriteLine(listAsString);

Please let me know if you notice any bugs or see improvements that could be made.

Also, if you know another implementation of this requirement, please add it so that the algorithms can be compared on results, speed, and code quality, and the best way to handle this can be found.

Since the requirement is to group only consecutive equal values, the Dictionary and LINQ GroupBy approaches mentioned in another answer do not apply, because they produce an incorrect result for an input sequence like 1, 2, 1. There is also no standard LINQ method for this kind of grouping (except possibly the Aggregate method, but that is no more than an inefficient equivalent of a for / foreach loop).
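To make the difference concrete, here is a small sketch that applies a GroupBy-based formatting to 1, 2, 1. The class and method names are made up for illustration. GroupBy merges both occurrences of 1 into a single group, whereas grouping only consecutive equal values would leave the sequence as 1 2 1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupByDemo
{
    // Emits each distinct value once, prefixed with its total count
    // when it occurs more than once anywhere in the input.
    public static string FormatWithGroupBy(IEnumerable<double> input) =>
        string.Join(" ", input.GroupBy(v => v)
                              .Select(g => g.Count() > 1 ? $"{g.Count()}*{g.Key}" : $"{g.Key}"));

    static void Main()
    {
        var input = new List<double> { 1, 2, 1 };

        // The two 1s are not adjacent, yet GroupBy still counts them together.
        Console.WriteLine(FormatWithGroupBy(input)); // prints "2*1 2"
    }
}
```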

In short, your algorithm is the best for this task, but the implementation is not.

The main bottleneck is the string concatenation, as mentioned by Peroxy, which (as the other answer also notes) is easily fixed by using the StringBuilder class. Once you do that, the performance will be just fine.

The other issue I see in the implementation is the use of special values (double.MinValue), duplicated corner-case checks, decrementing the for loop variable inside the body, and so on. So although it probably works and I don't directly see a bug, it is hard to follow the algorithm's logic and to spot a potential bug just by reading the implementation. The algorithm itself is quite simple; I would implement it this way:

static string ListAsString(List<double> input)
{
    var sb = new StringBuilder();
    for (int i = 0; i < input.Count; )
    {
        var value = input[i];
        int count = 1;
        while (++i < input.Count && input[i] == value)
            count++;
        if (sb.Length > 0) sb.Append(' ');
        if (count > 1) sb.Append(count).Append('*');
        sb.Append(value);
    }
    return sb.ToString();
}

which IMO is much easier to follow. Note that there is no duplicated code, there are no special values, and the loop variable i is advanced in only one place inside the outer loop body. Again, this has nothing to do with performance (which comes from using StringBuilder); it is simply about readability, eliminating redundancy, and being less error-prone.
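As a quick sanity check, the method can be run against the sample inputs from the question. The sketch below repeats the method so that it compiles on its own; the `Demo` class and the `Main` harness are additions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class Demo
{
    // Same logic as the method above: count a run of equal adjacent values,
    // then emit either "value" or "count*value", separated by single spaces.
    public static string ListAsString(List<double> input)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < input.Count; )
        {
            var value = input[i];
            int count = 1;
            while (++i < input.Count && input[i] == value)
                count++;
            if (sb.Length > 0) sb.Append(' ');
            if (count > 1) sb.Append(count).Append('*');
            sb.Append(value);
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(ListAsString(new List<double> { 11, 32, 32, 43 }));                     // 11 2*32 43
        Console.WriteLine(ListAsString(new List<double> { 11, 11, 43, 43 }));                     // 2*11 2*43
        Console.WriteLine(ListAsString(new List<double> { 11, 11, 11, 11, 11 }));                 // 5*11
        Console.WriteLine(ListAsString(new List<double> { 11, 11, 32, 22, 22, 22, 4, 10, 10 })); // 2*11 32 3*22 4 2*10
        Console.WriteLine(ListAsString(new List<double> { 1, 2, 1 }));                            // 1 2 1
    }
}
```

Unlike the original loop, this version produces no trailing space, and an empty list yields an empty string.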

Personally, I see great potential in using a Dictionary here. Here is a quick solution I wrote with a dictionary implementation:

var input = new List<double> { 9, 77, 5, 5, 31 };
var dict = new Dictionary<double, int>();
var listAsString = new StringBuilder();

foreach (var item in input)
{
    if (dict.ContainsKey(item))
        dict[item]++;
    else
        dict[item] = 1;
}

foreach (var item in dict)
{
    listAsString.Append(item.Value > 1 ? $"{item.Value}*{item.Key} " : $"{item.Key} ");
}

Console.WriteLine(listAsString);

If you ever want an inefficient LINQ one-liner:

string result = string.Join(" ", input.GroupBy(i => i)
                                      .Select(x => x.Count() > 1
                                          ? $"{x.Count()}*{x.Key}"   // no trailing space: Join already adds the separator
                                          : $"{x.Key}"));

However, I believe your method is written nicely, albeit a bit less readable than the dictionary one. The main flaw of your solution is that you build the final string with string concatenation; you should definitely be using a StringBuilder. I introduced StringBuilder into your method and compared the three methods:

Dictionary    | Your method | GroupBy method    List size
---------------------------------------------------------
 2 ms         |    0 ms     |    5 ms           n=3
 0 ms         |    0 ms     |    0 ms           n=6
 0 ms         |    0 ms     |    0 ms           n=12
 0 ms         |    0 ms     |    0 ms           n=24
 0 ms         |    0 ms     |    0 ms           n=48
 0 ms         |    0 ms     |    0 ms           n=96
 0 ms         |    0 ms     |    0 ms           n=192
 0 ms         |    0 ms     |    0 ms           n=384
 0 ms         |    0 ms     |    0 ms           n=768
 0 ms         |    0 ms     |    0 ms           n=1536
 1 ms         |    0 ms     |    1 ms           n=3072
 3 ms         |    2 ms     |    3 ms           n=6144
 5 ms         |    4 ms     |    6 ms           n=12288
 8 ms         |    7 ms     |    14 ms          n=24576
 14 ms        |    13 ms    |    25 ms          n=49152
 31 ms        |    32 ms    |    66 ms          n=98304
 80 ms        |    59 ms    |    146 ms         n=196608
 149 ms       |    123 ms   |    294 ms         n=393216
 246 ms       |    218 ms   |    504 ms         n=786432
 483 ms       |    428 ms   |    1040 ms        n=1572864
 999 ms       |    873 ms   |    2070 ms        n=3145728
 1995 ms      |    1784 ms  |    3950 ms        n=6291456
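The answer does not show how these timings were taken. One way such a comparison could be measured is with System.Diagnostics.Stopwatch. In the sketch below, the `Measure` helper, the random input, the seed, and the upper bound on n are all assumptions made for illustration, not the original benchmark; only the GroupBy variant is timed, and the other methods could be passed to `Measure` in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

static class TimingSketch
{
    // Runs the formatter once and returns the elapsed wall-clock time in milliseconds.
    public static long Measure(Func<List<double>, string> formatter, List<double> input)
    {
        var sw = Stopwatch.StartNew();
        formatter(input);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // GroupBy-based formatter, used here as the method under test.
    public static string WithGroupBy(List<double> input) =>
        string.Join(" ", input.GroupBy(v => v)
                              .Select(g => g.Count() > 1 ? $"{g.Count()}*{g.Key}" : $"{g.Key}"));

    static void Main()
    {
        var random = new Random(42);   // fixed seed (an assumption) so runs are repeatable
        const int maxN = 98304;        // arbitrary cap to keep the sketch quick

        // Double the list size each round, starting at 3 as in the table.
        for (int n = 3; n <= maxN; n *= 2)
        {
            var input = Enumerable.Range(0, n)
                                  .Select(_ => (double)random.Next(0, 10))
                                  .ToList();
            Console.WriteLine($"n={n}: GroupBy {Measure(WithGroupBy, input)} ms");
        }
    }
}
```

A single run per size is noisy (note the 2 ms and 5 ms readings at n=3 in the table, which most likely reflect first-call overhead), so repeating each measurement and discarding the first run would give steadier numbers.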

Your solution is the fastest in almost every run. If you want to go for speed, keep your solution but change it to use StringBuilder: use listAsString.Append(currentElem + " ") instead of listAsString += currentElem + " ".

GroupBy could be used if you only operate on collections with n < 1000. Use the Dictionary solution if you prefer readability over speed.
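For reference, here is a sketch of what the question's loop might look like with that change applied. The control flow is copied unchanged from the question; the wrapper class, the method name, and the chained Append calls (which avoid building the intermediate `currentElem + " "` string) are my own choices, not code from either post.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class OriginalWithStringBuilder
{
    public static string Format(List<double> input)
    {
        var listAsString = new StringBuilder();   // replaces the string accumulator
        double nextElem = double.MinValue;
        for (int i = 0; i < input.Count; i++)
        {
            double currentElem = input[i];
            if (i + 1 < input.Count)
            {
                nextElem = input[i + 1];
            }

            int equalCount = 0;
            while (currentElem.Equals(nextElem) && i < input.Count)
            {
                equalCount++;
                i++;
                currentElem = nextElem;
                if (i < input.Count)
                {
                    nextElem = input[i];
                }
            }

            if (equalCount < 2)
            {
                listAsString.Append(currentElem).Append(' ');
            }
            else
            {
                listAsString.Append(equalCount).Append('*').Append(currentElem).Append(' ');
                i--;
            }
        }
        return listAsString.ToString();
    }

    static void Main()
    {
        // Like the original, the result ends with a trailing space.
        Console.WriteLine(Format(new List<double> { 11, 32, 32, 43 })); // prints "11 2*32 43 "
    }
}
```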
