
Fraction of Ping Tasks never complete

I am trying to ping every link-local IP address on a particular interface (169.254.xxx.yyy; around 65,000 of them). I expect only one or two successes. I want to test the addresses that respond as early as possible (i.e. without waiting for all the other pings to time out); if a responding address turns out to be the device I want, I can cancel all the other pings.

My application is C#, WinForms, async/await. I create a List<Task<T>>, each element of which uses a separate Ping object to probe a particular address. I then use Task.WhenAny to retrieve the results, progressively removing each completed Task from the list. (I've also tried the other, more efficient methods of handling the results in completion order, with the same outcome.)

What I have found is that, of the 65,000 or so Tasks, most complete and deliver the appropriate result. However, a few thousand of them (the precise number varies between runs) remain in the WaitingForActivation state and never run. Only if I reduce the number of Tasks to below about 1300 do all of them complete correctly; otherwise, a fraction of them (around 5-10%) remain WaitingForActivation. The tasks that never run appear to be randomly distributed throughout the list.

I have tried moving the code to a Console application, with the same result. If in each Task I replace the use of SendPingAsync by a call to Task.Delay() with the same timeout, all Tasks complete as expected.

My Console test application:

using System;
using System.Collections.Generic;
using System.Net.NetworkInformation;
using System.Net.WebSockets;
using System.Threading.Tasks;

namespace AsyncPing
{
    class Program
    {
        static ClientWebSocket _commandWebSocket;

        static async Task Main(string[] args)
        {
            var topLevelTasks = new List<Task<ClientWebSocket>>();
            // Scanning task
            topLevelTasks.Add(Task.Run(async () =>
                await TryToConnectLinkLocal()));
            // Monitoring task (just for debugging)
            topLevelTasks.Add(Task.Run(async () => await MonitorLLTasks()));

            await Task.WhenAll(topLevelTasks);
        }

        // Monitoring Task; periodically reports on the state of the tasks collection
        private static async Task<ClientWebSocket> MonitorLLTasks()
        {
            for (int i = 0; i < 1000; ++i)
            {
                await Task.Delay(1000);

                int waitingForActivation = 0, waitingToRun = 0, running = 0, completed = 0;
                int index = 0;
                while (index < tasks.Count)
                {
                    try
                    {
                        switch (tasks[index].Status)
                        {
                            case TaskStatus.WaitingForActivation:
                                waitingForActivation++;
                                break;
                            case TaskStatus.WaitingToRun:
                                waitingToRun++;
                                break;
                            case TaskStatus.Running:
                                running++;
                                break;
                            case TaskStatus.RanToCompletion:
                                completed++;
                                break;
                        }

                        ++index;
                    }
                    catch
                    {
                        // Very occasionally, tasks[index] has been removed by the time we access it
                    }
                }
                Console.WriteLine($"There are {index} tasks: {waitingForActivation} waitingForActivation; {waitingToRun} waitingToRun; {running} running; {completed} completed.  {handled} results have been handled.");

                if (tasks.Count == 0)
                    break;
            }
            return null;
        }

        const string LinkLocalIPPrefix = "169.254.";

        static List<Task<String>> tasks = new List<Task<String>>();
        static int handled = 0;

        private static async Task<ClientWebSocket> TryToConnectLinkLocal()
        {
            // Link-local addresses all start with this prefix
            string baseIP = LinkLocalIPPrefix;

            tasks.Clear();
            handled = 0;

            Console.WriteLine("Scanning Link-local addresses...");
            // Scan all Link-local addresses
            // We build a task for each ip address.  
            // Note that there are 254 x 254 = 64516 addresses to ping, and the tasks start running
            // as soon as we start creating them.
            for (int i = 1; i < 255; i++)
            {
                string ip_i = baseIP + i.ToString() + ".";
                for (int j = 1; j < 255; j++)
                {
                    string ip = ip_i + j.ToString();

                    var task = Task.Run(() => TryToConnectLinkLocal(ip));
                    tasks.Add(task);
                }
            }

            while (tasks.Count > 0)
            {
                var t = await Task.WhenAny(tasks);
                tasks.Remove(t);
                String result = await t;
                handled++;
            }

            return null;
        }
    
        private const int _pingTimeout = 10; // 10ms ought to be enough!

        // Ping the specified address
        static async Task<String> TryToConnectLinkLocal(string ip)
        {
            using (Ping ping = new Ping())
            {
                // This dummy code uses a fixed IP address to avoid possibility of a successful ping
                var reply = await ping.SendPingAsync("169.254.123.234", _pingTimeout);

                if (reply.Status == IPStatus.Success)
                {
                    Console.WriteLine("Response at LL address " + ip);
                    return ip;
                }
            }

            // Alternative: just wait for the duration of the timeout
            //await Task.Delay(_pingTimeout);

            return null;
        }

    }
}

A typical output would be something like (similar lines edited for brevity):

Scanning Link-local addresses...
There are 14802 tasks: 12942 waitingForActivation; 0 waitingToRun; 0 running; 1860 completed.  0 results have been handled.
There are 24623 tasks: 20005 waitingForActivation; 0 waitingToRun; 0 running; 4618 completed.  0 results have been handled.
There are 27287 tasks: 21170 waitingForActivation; 0 waitingToRun; 0 running; 6117 completed.  0 results have been handled.
There are 41714 tasks: 32471 waitingForActivation; 0 waitingToRun; 0 running; 9243 completed.  0 results have been handled.
There are 51263 tasks: 38816 waitingForActivation; 0 waitingToRun; 0 running; 12447 completed.  0 results have been handled.
There are 63891 tasks: 48403 waitingForActivation; 0 waitingToRun; 0 running; 15488 completed.  0 results have been handled.
There are 64498 tasks: 46496 waitingForActivation; 0 waitingToRun; 0 running; 18002 completed.  18 results have been handled.

<All tasks have been created. Many have been run.  More and more results are handled and the corresponding tasks removed>

There are 6626 tasks: 5542 waitingForActivation; 0 waitingToRun; 0 running; 1084 completed.  57890 results have been handled.
There are 5542 tasks: 5542 waitingForActivation; 0 waitingToRun; 0 running; 0 completed.  58974 results have been handled.
There are 5542 tasks: 5542 waitingForActivation; 0 waitingToRun; 0 running; 0 completed.  58974 results have been handled.
There are 5542 tasks: 5542 waitingForActivation; 0 waitingToRun; 0 running; 0 completed.  58974 results have been handled.
There are 5542 tasks: 5542 waitingForActivation; 0 waitingToRun; 0 running; 0 completed.  58974 results have been handled.
There are 5542 tasks: 5542 waitingForActivation; 0 waitingToRun; 0 running; 0 completed.  58974 results have been handled.
There are 5542 tasks: 5542 waitingForActivation; 0 waitingToRun; 0 running; 0 completed.  58974 results have been handled.

<5542 results remain in the list because they are stuck WaitingForActivation.  Only 58974 results (of 64516) have been handled.  This state continues indefinitely>

I would be happy to receive an explanation of this behaviour, suggestions for how to fix it, and/or suggestions about how to probe the network in a more efficient manner.

I've renamed this question because I understand from Stephen Cleary's blog that these tasks may be Promise tasks, which start in the WaitingForActivation state. What is really important here, however, is that they never complete.

Since the original post, I've tried the following:

  • Used continuation tasks as per Stephen Toub's article to handle results in order of completion;
  • Used ConcurrentExclusiveSchedulerPair to try to throttle the execution of the Tasks;
  • Used ConfigureAwait(false) to try to vary the SynchronizationContext used;
  • Checked that no exceptions are being thrown, as far as I can tell.

None of these seemed to have any significant effect.
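
For reference, the first item in the list above (handling results in order of completion) can be sketched roughly as follows. This is a minimal illustration of the pattern, not the exact code used in my tests: the names TaskOrdering and InCompletionOrder are hypothetical, and this variant forwards each result directly, which may differ from the version in the article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class TaskOrdering
{
    // Hypothetical helper: returns proxy tasks that complete in the order
    // in which the input tasks finish, so each result can be awaited once
    // without repeatedly calling Task.WhenAny on a shrinking list.
    public static Task<T>[] InCompletionOrder<T>(IEnumerable<Task<T>> tasks)
    {
        Task<T>[] inputs = tasks.ToArray();
        var sources = new TaskCompletionSource<T>[inputs.Length];
        for (int i = 0; i < sources.Length; i++)
            sources[i] = new TaskCompletionSource<T>();

        int next = -1; // index of the most recently claimed proxy slot
        foreach (Task<T> input in inputs)
        {
            input.ContinueWith(completed =>
            {
                // Each finishing task claims the next free slot atomically.
                var source = sources[Interlocked.Increment(ref next)];
                if (completed.IsFaulted)
                    source.TrySetException(completed.Exception.InnerExceptions);
                else if (completed.IsCanceled)
                    source.TrySetCanceled();
                else
                    source.TrySetResult(completed.Result);
            },
            CancellationToken.None,
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
        }
        return sources.Select(s => s.Task).ToArray();
    }
}
```

With this helper, the result-handling loop becomes a plain foreach over InCompletionOrder(tasks) that awaits each proxy in turn. It only changes how results are consumed; it does not by itself limit how many pings are in flight.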

I've also tried chunking the scan into a number of sub-scans. Each sub-scan generates a number of Ping tasks that are executed asynchronously (as per the code above), and the sub-scans themselves run one after the other. On my machine, provided I keep the number of Ping tasks per sub-scan below about 1100, they all execute correctly. Above that, a fraction of them never complete. This approach does not seem to be much slower (presumably because the network interface gets flooded beyond a certain number of simultaneous pings), so it offers a practical workaround for my problem. However, the question of why some of the tasks fail to complete when there are more than about 1100 of them remains. And still: if I replace the Ping by a call to await Task.Delay(...), all tasks complete.
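
The chunked approach described above might look something like the following sketch. The class ChunkedScan, its methods, and the probe delegate are hypothetical names introduced for illustration; the chunk size is whatever the machine tolerates (around 1100 in my case), and the 10 ms timeout is the one from the test program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.NetworkInformation;
using System.Threading.Tasks;

static class ChunkedScan
{
    // Probes addresses in batches of at most chunkSize, awaiting each batch
    // before starting the next. Returns the first address in a batch for
    // which probe reports success, or null if no address responds.
    public static async Task<string> FindFirstAsync(
        IEnumerable<string> addresses, int chunkSize, Func<string, Task<bool>> probe)
    {
        var batch = new List<string>(chunkSize);
        foreach (string address in addresses)
        {
            batch.Add(address);
            if (batch.Count < chunkSize) continue;

            string hit = await ProbeBatchAsync(batch, probe);
            if (hit != null) return hit;
            batch.Clear();
        }
        // Handle the final, possibly partial, batch.
        return batch.Count > 0 ? await ProbeBatchAsync(batch, probe) : null;
    }

    static async Task<string> ProbeBatchAsync(
        List<string> batch, Func<string, Task<bool>> probe)
    {
        // Start every probe in the batch, then wait for all of them.
        bool[] results = await Task.WhenAll(batch.Select(probe));
        int index = Array.IndexOf(results, true);
        return index >= 0 ? batch[index] : null;
    }

    // A probe built on SendPingAsync, with the timeout from the test program.
    public static async Task<bool> PingAsync(string ip)
    {
        using (var ping = new Ping())
        {
            try
            {
                PingReply reply = await ping.SendPingAsync(ip, 10);
                return reply.Status == IPStatus.Success;
            }
            catch (PingException)
            {
                return false; // treat a failed send as "no response"
            }
        }
    }
}
```

A scan would then be a call such as await ChunkedScan.FindFirstAsync(allAddresses, 1000, ChunkedScan.PingAsync), where allAddresses is the list of link-local addresses. Because each batch is awaited in full, a responding address is only reported once its whole batch has finished, which costs at most one ping timeout.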

My suggestion is to use an ActionBlock<T> from the TPL Dataflow library. This component will take care of the timely cancellation of the procedure when the first successful IPAddress is found, while also enforcing a maximum concurrency policy.

You start by instantiating an ActionBlock<IPAddress>, providing the action that will run for each IPAddress, as well as the execution options. Then you feed the block with addresses, using the Post method. Then you signal that no more addresses will be posted by invoking the Complete method. Finally you await the Completion property of the component. Example:

using System;
using System.Linq;
using System.Net;
using System.Net.NetworkInformation;
using System.Threading;
using System.Threading.Tasks.Dataflow; // May require the System.Threading.Tasks.Dataflow NuGet package

const int pingTimeout = 10; // Milliseconds
using var cts = new CancellationTokenSource();
IPAddress result = null;

var block = new ActionBlock<IPAddress>(async address =>
{
    try
    {
        // Dispose each Ping so that its resources are released promptly
        using var ping = new Ping();
        var reply = await ping.SendPingAsync(address, pingTimeout);
        if (reply.Status == IPStatus.Success)
        {
            // Keep only the first successful address, then cancel the rest
            Interlocked.CompareExchange(ref result, address, null);
            cts.Cancel();
        }
    }
    catch (PingException) { } // Ignore
}, new ExecutionDataflowBlockOptions()
{
    MaxDegreeOfParallelism = 100, // Select a reasonable value
    CancellationToken = cts.Token
});

// The third and fourth octets both run from 1 to 254, as in the question
byte b1 = 169, b2 = 254;
var addresses = Enumerable.Range(1, 254)
    .SelectMany(_ => Enumerable.Range(1, 254),
        (b3, b4) => new IPAddress(
            new byte[] { b1, b2, (byte)b3, (byte)b4 }));

foreach (var address in addresses) block.Post(address);
block.Complete();

try { await block.Completion; }
catch (OperationCanceledException) { } // Ignore

Console.WriteLine($"Result: {result?.ToString() ?? "(not found)"}");
