
Convert a double list to a grouped string

The program takes a list of doubles as input and must output a string containing the list values grouped by value: values that are equal are grouped together. For example: input 9,77,5,5,31 => output 9 77 2*5 31

I wrote an algorithm for this in C# (I think it would be almost the same in Java), but I am not sure whether its speed or code quality can be improved, or whether it has bugs that I could not see. The algorithm is below, along with some more input and output examples.

        List<double> input = new List<double> { 11, 32, 32, 43 }; // output 11 2*32 43
        //List<double> input = new List<double> { 11, 11, 43, 43 }; // output 2*11 2*43
        //List<double> input = new List<double> { 10, 11, 12, 13, 14, 15, 16 }; // output 10 11 12 13 14 15 16
        //List<double> input = new List<double> { 11, 11, 11, 11, 11 }; // output 5*11
        //List<double> input = new List<double> { 11, 11, 32, 22, 22, 22, 4, 10, 10 }; // output 2*11 32 3*22 4 2*10

        string listAsString = string.Empty;
        double nextElem = double.MinValue;
        for (int i = 0; i < input.Count; i++)
        {
            double currentElem = input[i];

            if (i + 1 < input.Count)
            {
                nextElem = input[i + 1];
            }

            int equalCount = 0;
            while (currentElem.Equals(nextElem) && i < input.Count)
            {
                equalCount++;
                i++;
                currentElem = nextElem;

                if (i < input.Count)
                {
                    nextElem = input[i];
                }
            }

            if (equalCount < 2)
            {
                listAsString += currentElem + " ";
            }
            else
            {
                listAsString += equalCount + "*" + currentElem + " ";
                i--;
            }
        }

        Console.WriteLine(listAsString);

Please let me know if you notice any bugs or see any possible improvements.

Also, if you know another implementation of this requirement, please add it so that the algorithms can be compared on results, speed and code quality, and the best way to handle this can be found.

Since the requirement is to group only consecutive equal values, the Dictionary and LINQ GroupBy approaches mentioned in another answer do not apply, because they produce an incorrect result for an input sequence like 1,2,1. There is also no standard LINQ method for this kind of grouping (except possibly the Aggregate method, but that is nothing more than an inefficient equivalent of a for / foreach loop).
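To make the 1,2,1 case concrete, here is a minimal sketch that runs the same input through a GroupBy-based grouping and through a run-based grouping. The class and method names (GroupByCounterexample, GroupAll, GroupRuns) are made up for this illustration and do not come from either answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupByCounterexample
{
    // Groups ALL equal values, wherever they occur in the list (the GroupBy approach).
    public static string GroupAll(IEnumerable<double> input) =>
        string.Join(" ", input.GroupBy(v => v)
                              .Select(g => g.Count() > 1 ? $"{g.Count()}*{g.Key}" : $"{g.Key}"));

    // Groups only runs of adjacent equal values.
    public static string GroupRuns(IReadOnlyList<double> input)
    {
        var parts = new List<string>();
        for (int i = 0; i < input.Count; )
        {
            double value = input[i];
            int count = 1;
            // Advance i past every neighbour equal to the current value.
            while (++i < input.Count && input[i] == value)
                count++;
            parts.Add(count > 1 ? $"{count}*{value}" : $"{value}");
        }
        return string.Join(" ", parts);
    }
}
```

For the list { 1, 2, 1 }, GroupAll returns "2*1 2" because the two 1s are merged even though a 2 sits between them, while GroupRuns returns "1 2 1".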

In short, your algorithm is the best one for this task, but the implementation is not.

The main bottleneck is the string concatenation, as mentioned by Peroxy, which (as also mentioned in the other answer) is easily fixed by using the StringBuilder class. Once you do that, the performance will be just fine.

The other issues I see in the implementation are the use of a special value (double.MinValue), duplicated corner-case checks, decrementing the for loop variable inside the body, and so on. So although it probably works and I don't see an obvious bug, it is hard to follow the algorithm's logic and spot a potential bug just by reading the implementation. The algorithm itself is quite simple; I would implement it this way:

static string ListAsString(List<double> input)
{
    var sb = new StringBuilder();
    for (int i = 0; i < input.Count; )
    {
        var value = input[i];
        int count = 1;
        while (++i < input.Count && input[i] == value)
            count++;
        if (sb.Length > 0) sb.Append(' ');
        if (count > 1) sb.Append(count).Append('*');
        sb.Append(value);
    }
    return sb.ToString();
}

which IMO is considerably easier to follow. Note that there is no duplicate code, there are no special values, and the loop variable i is advanced in only one place inside the outer loop body. Again, this has nothing to do with performance (which comes from using StringBuilder); it is simply about readability, eliminating redundancy and making the code less error-prone.
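As a quick sanity check, the sample inputs from the question can be run through this method. The sketch below repeats the method so that it compiles on its own. The class name RunGroupingCheck, the Samples table and the AllSamplesPass helper are hypothetical names added for illustration; the expected strings are the ones given in the question's comments.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class RunGroupingCheck
{
    // Copy of the method above, repeated so this sketch is self-contained.
    public static string ListAsString(List<double> input)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < input.Count; )
        {
            var value = input[i];
            int count = 1;
            while (++i < input.Count && input[i] == value)
                count++;
            if (sb.Length > 0) sb.Append(' ');
            if (count > 1) sb.Append(count).Append('*');
            sb.Append(value);
        }
        return sb.ToString();
    }

    // Sample inputs and expected outputs taken from the question.
    public static readonly (double[] Input, string Expected)[] Samples =
    {
        (new double[] { 11, 32, 32, 43 }, "11 2*32 43"),
        (new double[] { 11, 11, 43, 43 }, "2*11 2*43"),
        (new double[] { 10, 11, 12, 13, 14, 15, 16 }, "10 11 12 13 14 15 16"),
        (new double[] { 11, 11, 11, 11, 11 }, "5*11"),
        (new double[] { 11, 11, 32, 22, 22, 22, 4, 10, 10 }, "2*11 32 3*22 4 2*10"),
    };

    // Returns true when every sample produces its expected string.
    public static bool AllSamplesPass()
    {
        foreach (var (input, expected) in Samples)
            if (ListAsString(new List<double>(input)) != expected)
                return false;
        return true;
    }
}
```

Unlike the code in the question, this version emits no trailing space, and an empty list yields an empty string.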

Personally, I see great potential in using a Dictionary here. Here is a quick solution I made with a dictionary:

var input = new List<double> { 9, 77, 5, 5, 31 };
var dict = new Dictionary<double, int>();
var listAsString = new StringBuilder();

foreach (var item in input)
{
    if (dict.ContainsKey(item))
        dict[item]++;
    else
        dict[item] = 1;
}

foreach (var item in dict)
{
    listAsString.Append(item.Value > 1 ? $"{item.Value}*{item.Key} " : $"{item.Key} ");
}

Console.WriteLine(listAsString);

If you ever want a less efficient LINQ one-liner solution:

// string.Join already inserts the separator, so the items carry no trailing space.
string result = string.Join(" ", input.GroupBy(i => i)
                                       .Select(x =>
                                       x.Count() > 1 ?
                                       $"{x.Count()}*{x.Key}" :
                                       $"{x.Key}"));

However, I believe your method is written nicely, albeit a bit less readable than the dictionary one. The main flaw in your solution is that you build the final result with string concatenation; you should definitely be using a StringBuilder. I introduced a StringBuilder into your method and compared the three methods:

Dictionary    | Your method | GroupBy method    Input size
----------------------------------------------------------
 2 ms         |    0 ms     |    5 ms           n=3
 0 ms         |    0 ms     |    0 ms           n=6
 0 ms         |    0 ms     |    0 ms           n=12
 0 ms         |    0 ms     |    0 ms           n=24
 0 ms         |    0 ms     |    0 ms           n=48
 0 ms         |    0 ms     |    0 ms           n=96
 0 ms         |    0 ms     |    0 ms           n=192
 0 ms         |    0 ms     |    0 ms           n=384
 0 ms         |    0 ms     |    0 ms           n=768
 0 ms         |    0 ms     |    0 ms           n=1536
 1 ms         |    0 ms     |    1 ms           n=3072
 3 ms         |    2 ms     |    3 ms           n=6144
 5 ms         |    4 ms     |    6 ms           n=12288
 8 ms         |    7 ms     |    14 ms          n=24576
 14 ms        |    13 ms    |    25 ms          n=49152
 31 ms        |    32 ms    |    66 ms          n=98304
 80 ms        |    59 ms    |    146 ms         n=196608
 149 ms       |    123 ms   |    294 ms         n=393216
 246 ms       |    218 ms   |    504 ms         n=786432
 483 ms       |    428 ms   |    1040 ms        n=1572864
 999 ms       |    873 ms   |    2070 ms        n=3145728
 1995 ms      |    1784 ms  |    3950 ms        n=6291456
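The measurement code is not shown here, so the following is only a sketch of how timings of this kind could be collected with Stopwatch. Everything in it is an assumption made for illustration: the TimingSketch class, the random input drawn from four distinct values, the fixed seed and the single warm-up call. It is not the harness that produced the table above, so its numbers will differ.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class TimingSketch
{
    // Builds n values drawn from a small range so that runs of equal neighbours occur.
    // The range and the seed are arbitrary choices for this sketch.
    public static List<double> MakeInput(int n, int seed = 42)
    {
        var rng = new Random(seed);
        var list = new List<double>(n);
        for (int i = 0; i < n; i++)
            list.Add(rng.Next(0, 4));
        return list;
    }

    // Times one call of 'method' on 'input' and returns the elapsed milliseconds.
    public static long TimeMs(Func<List<double>, string> method, List<double> input)
    {
        method(input);                  // warm-up call so JIT compilation is not measured
        var sw = Stopwatch.StartNew();
        method(input);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Prints one line per input size, doubling from 3 up to maxN.
    public static void Run(Func<List<double>, string> method, int maxN)
    {
        for (int n = 3; n > 0 && n <= maxN; n *= 2)
            Console.WriteLine($"{TimeMs(method, MakeInput(n))} ms   n={n}");
    }
}
```

Calling TimingSketch.Run with each of the three methods and maxN = 6291456 walks through the same sizes as the table. A single timed call is noisy, so repeated runs or a dedicated benchmarking library would give more trustworthy numbers.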

Your solution is the fastest or tied for fastest in every row except n=98304, where the Dictionary version measured 1 ms less. If you want to go for speed, keep your solution but change it to use StringBuilder: use listAsString.Append(currentElem + " ") instead of listAsString += currentElem + " ".
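For reference, here is a sketch of what that change could look like when applied to the loop from the question. The control flow is kept exactly as posted and only the accumulator changes. Two things are my own choices rather than part of the advice above: the wrapper class name QuestionLoopWithStringBuilder, and chaining Append calls instead of Append(currentElem + " ") so that no temporary string is built.

```csharp
using System.Collections.Generic;
using System.Text;

static class QuestionLoopWithStringBuilder
{
    public static string ListAsString(List<double> input)
    {
        var listAsString = new StringBuilder();   // replaces the string accumulator
        double nextElem = double.MinValue;
        for (int i = 0; i < input.Count; i++)
        {
            double currentElem = input[i];

            if (i + 1 < input.Count)
            {
                nextElem = input[i + 1];
            }

            int equalCount = 0;
            while (currentElem.Equals(nextElem) && i < input.Count)
            {
                equalCount++;
                i++;
                currentElem = nextElem;

                if (i < input.Count)
                {
                    nextElem = input[i];
                }
            }

            if (equalCount < 2)
            {
                listAsString.Append(currentElem).Append(' ');
            }
            else
            {
                listAsString.Append(equalCount).Append('*').Append(currentElem).Append(' ');
                i--;   // step back so the for loop's i++ lands on the next run
            }
        }
        return listAsString.ToString();
    }
}
```

As in the original, the result ends with a trailing space, so { 11, 32, 32, 43 } gives "11 2*32 43 ". Call TrimEnd() on the result if that matters.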

GroupBy could be used if you will only work with collections of n < 1000 elements; use the Dictionary solution if you prefer readability over speed.


 