简体   繁体   中英

Call async method in new task c#

Problem i'm trying to solve:

For each directory, it exists some files, and I want to upload this to Azure.

So I want to do this: Task1 - uploading files in directory 1 to azure Task 2 - uploading files in directory 2 to azure

I want to do this concurrently.

I have the following code:

private async Task ProcessMatFiles(string directory, List<FileInfo> matFiles)
{
    foreach (var file in matFiles)
    {
        if (!string.IsNullOrEmpty(file.Name) && !string.IsNullOrEmpty(directory) && !string.IsNullOrEmpty(file.FullName))
        {
            var cloudBlockBlob = this._cloudBlobContainer.GetBlockBlobReference("textures/" + directory + "/" + file.Name);

            if (!await cloudBlockBlob.ExistsAsync())
                await cloudBlockBlob.UploadFromFileAsync(file.FullName);
        }
    }
List<Task> tasks = new List<Task>();
foreach (var directory in matFileDirectories)
{
    // Get all the files in the directory
    var matFiles = new DirectoryInfo(directory).EnumerateFiles().ToList();

    // Get the directory name of the files
    var matDirectory = Path.GetFileName(Path.GetDirectoryName(matFiles.FirstOrDefault().FullName));

    if (matFiles.Count > 0 && !string.IsNullOrEmpty(matDirectory))
    {
        var task = new Task(() =>this.ProcessMatFiles(matDirectory, matFiles));
        tasks.Add(task);
        task.Start();
    }
}

Task.WaitAll(tasks.ToArray());

With this code, i get the following warning:

Because this call is not awaited, execution of the current method continues before the call is completed. Consider applying the 'await' operator to the result of the call.

What does that mean? How does this affect my code?

I can remove the warning by doing like this:

var task = new Task(async () => await this.ProcessMatFiles());

Is this the correct way to this?

The real problem seems to be how to process multiple files in parallel. ProcessMatFiles returns a Task already and I'd assume it doesn't run anything heavy on the caller's thread. That task can be stored in the tasks list. That list can be awaited without blocking with

await Task.WhenAll(tasks);

A better solution would be to convert the entire loop into a LINQ query that returns the Tasks and await it.

var tasks = from var directory in matFileDirectories
            let dir=new DirectoryInfo(directory)
            let files=dir.GetFiles()
            select ProcessMatFiles(dir.Name, files));

await Task.WhenAll(tasks);

The problem with this is that enumerating the files in a folder is expensive itself, and GetFiles() , or using EnumerateFiles().ToList() has to wait for the enumeration to finish. It would be better if ProcessMatFiles received the DirectoryInfo object and enumerated the files in a separate thread .

Another improvement would be to process the files one-by-one:

var tasks = from var directory in matFileDirectories
            let dir=new DirectoryInfo(directory)
            from file in dir.EnumerateFiles()
            select ProcessMatFile(dir.Name, file));

It's possible to improve this further if one knows what ProcessMatFiles does, eg use Dataflow blocks or Channels for throttling and using a specific number of tasks, breaking the process into multiple concurrent steps etc.

Update

Since this is a file upload operation, each file is a separate asynchronous operation. Most of the checks can be removed when working with DirectoryInfo and FileInfo objects.

The upload method should be just:

async Task Upload(FileInfo file)
{
    var folder=file.Directory.Name;
    var blob = _cloudBlobContainer.GetBlockBlobReference(${"textures/{folder}/{file.Name}";
    if (!await blob.ExistsAsync())
    {
        await blob.UploadFromFileAsync(file.FullName);
    }
}

The task-producing query can be simplified to:

var tasks = from var directory in matFileDirectories
            let dir=new DirectoryInfo(directory)
            from file in dir.EnumerateFiles()
            select UploadFile(file);

await Task.WhenAll(tasks);

This will try to fire off all upload operations as fast as the files can be iterated. This could flood the network. One solution is to use an ActionBlock that will only use eg 8 tasks at a time to upload files. A limit is placed on the input buffer too, to avoid filling it with eg 1000 FileInfo items:

var options=new ExecutionDataflowBlockOptions
      {
         MaxDegreeOfParallelism = 8,  //Only 8 concurrent operations
         BoundedCapacity=64           //Block posters if the input buffer has too many items
      } ;
var block=new ActionBlock<FileInfo>(async file=>UploadFile(file),options);

var files = from var directory in matFileDirectories
            let dir=new DirectoryInfo(directory)
            from file in dir.EnumerateFiles()
            select file;

foreach(var file in files)
{
    //Wait here if the input buffer is full
    await block.SendAsync(file);
}

block.Complete();

//Wait for all uploads to finish
await block.Completion;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM