简体   繁体   中英

Iterating a dictionary in C#

var dict = new Dictionary<int, string>();
for (int i = 0; i < 200000; i++)
    dict[i] = "test " + i;

I iterated this dictionary using the code below:

foreach (var pair in dict)
    Console.WriteLine(pair.Value);

Then, I iterated it using this:

foreach (var key in dict.Keys)
    Console.WriteLine(dict[key]);

And the second iteration took ~3 seconds less. I can get both keys and values via both methods. What I wonder is whether the second approach has a drawback. Since the most rated question that I can find about this doesn't include this way of iterating a dictionary, I wanted to know why no one uses it and how does it work faster.

Your time tests have some fundamental flaws:

  • Console.Writeline is an I/O operation, which takes orders of magnitude more time than memory accesses and CPU calculations. Any difference in iteration times is probably being dwarfed by the cost of this operation. It's like measuring the weights of pennies in a cast-iron stove.
  • You don't mention how long the overall operation took, so saying that one took 3 seconds less than another is meaningless. If it took 300 seconds to run the first, and 303 seconds to run the second, then you are micro-optimizing.
  • You don't mention how you measured running time. Did running time include the time getting the program assembly loaded and bootstrapped?
  • You don't mention repeatability: Did you run these operations several times? Several hundred times? In different orders?

Here are my tests. Note how I try my best to ensure that the method of iteration is the only thing that changes, and I include a control to see how much of the time is taken up purely because of a for loop and assignment:

void Main()
{
    // Insert code here to set up your test: anything that you don't want to include as
    // part of the timed tests.
    var dict = new Dictionary<int, string>();
    for (int i = 0; i < 2000; i++)
        dict[i] = "test " + i;
    string s = null;
    var actions = new[]
    {
        new TimedAction("control", () => 
        {
    for (int i = 0; i < 2000; i++)
            s = "hi";
        }),
        new TimedAction("first", () => 
        {
            foreach (var pair in dict)
            s = pair.Value;
        }),
        new TimedAction("second", () => 
        {
            foreach (var key in dict.Keys)
            s = dict[key];
        })
    };
    TimeActions(100, // change this number as desired.
        actions);
}


#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
    Stopwatch s = new Stopwatch();
    foreach(var action in actions)
    {
        var milliseconds = s.Time(action.Action, iterations);
        Console.WriteLine("{0}: {1}ms ", action.Message, milliseconds);
    }

}

public class TimedAction
{
    public TimedAction(string message, Action action)
    {
        Message = message;
        Action = action;
    }
    public string Message {get;private set;}
    public Action Action {get;private set;}
}

public static class StopwatchExtensions
{
    public static double Time(this Stopwatch sw, Action action, int iterations)
    {
        sw.Restart(); 
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds;
    }
}
#endregion

Result

control: 1.2173ms
first: 9.0233ms
second: 18.1301ms

So in these tests, using the indexer takes roughly twice as long as iterating key-value pairs, which is what I would expect*. This stays roughly proportionate if I increase the number of entries and the number of repetitions by an order of magnitude, and I get the same results if I run the two tests in reverse order.

* Why would I expect this result? The Dictionary class probably represents its entries as KeyValuePairs internally, so all it really has to do when you iterate it directly is walk through its data structure once, handing the caller each entry as it comes to it. If you iterate just the Keys , it still has to find each KeyValuePair, and give you the value of the Key property from it, so that step alone is going to cost roughly the same amount as iterating across it in the first place. Then you have to call the indexer, which has to calculate a hash for provided key, jump to the correct hashtable bucket, and do an equality check on the keys of any KeyValuePairs it finds there. These operations aren't terribly expensive, but once you do them N times, it's roughly as expensive as if you'd iterated over the internal hashtable structure again.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM