Using yield to defer an Azure Blob Storage call - C#

I currently have a method where I pull a list of blob file names from Azure. The method is as follows:

internal async Task<IEnumerable<BlobItem>> GetFiles(CloudBlobContainer container, string directoryName, bool recursive)
{
    var results = new List<BlobItem>();
    BlobContinuationToken continuationToken = null;

    do
    {
        var response = await container.GetDirectoryReference(directoryName).ListBlobsSegmentedAsync(false, BlobListingDetails.None, 100, continuationToken, null, null);

        continuationToken = response.ContinuationToken;
        foreach (var item in response.Results)
        {
            if (item.GetType() != typeof(CloudBlobDirectory))
                results.Add(new BlobItem(item));
            else if (recursive)
                results.AddRange(await GetFiles(container, ((CloudBlobDirectory)item).Prefix, recursive));
        }
    }
    while (continuationToken != null);

    return results;
}

What I don't like about my code above is that I run through all the files and add them to the results until the continuation token is null. So basically: go get everything, then stop and return.

I don't think this is overly efficient - I was thinking I could maybe yield the results so it only goes and gets the next "batch" of results when I'm ready for it (from the calling code).

I'm not that familiar with using yield and have come up with this, but I think it might not be deferring the call to ListBlobsSegmentedAsync. Here's my code:

internal IEnumerable<BlobItem> GetFiles(CloudBlobContainer container, string directoryName, bool recursive)
{
    BlobContinuationToken continuationToken = null;

    do
    {
        var response = container.GetDirectoryReference(directoryName).ListBlobsSegmentedAsync(false, BlobListingDetails.None, 100, continuationToken, null, null).GetAwaiter().GetResult();

        continuationToken = response.ContinuationToken;
        foreach (var item in response.Results)
        {
            if (item.GetType() != typeof(CloudBlobDirectory))
                yield return new BlobItem(item);
            else if (recursive)
            {
                var internalResponse = GetFiles(container, ((CloudBlobDirectory)item).Prefix, recursive);
                foreach (var intItem in internalResponse)
                {
                    yield return intItem;
                }
            }
        }
    }
    while (continuationToken != null);
}

Could someone advise me whether I'm using the yield statement correctly? As mentioned, I've never used it in anger before and want to get it right :-) My aim is to defer the service call and make the code more efficient for the caller.

Thanks in advance for any pointers!

NOTE: I'm using these APIs for blob storage:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
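
One way to check whether an iterator shaped like the second GetFiles above defers its service calls is to run the same loop against a stand-in that counts fetches. The sketch below is only an illustration: FetchPage, its integer token and the file names are hypothetical replacements for ListBlobsSegmentedAsync and BlobContinuationToken, not part of the storage SDK.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredPagingDemo
{
    // Counts how many times the stand-in "service" has been called.
    public static int FetchCount;

    // Hypothetical stand-in for ListBlobsSegmentedAsync: returns one page of
    // two names plus the token for the next page (null after the third page).
    private static (string[] Items, int? NextToken) FetchPage(int? token)
    {
        FetchCount++;
        int page = token ?? 0;
        var items = new[] { $"file{page * 2}", $"file{page * 2 + 1}" };
        return (items, page < 2 ? page + 1 : (int?)null);
    }

    // Same shape as the iterator in the question: fetch a page,
    // yield its items, then fetch the next page only if asked for more.
    public static IEnumerable<string> GetFiles()
    {
        int? token = null;
        do
        {
            var response = FetchPage(token);
            token = response.NextToken;
            foreach (var item in response.Items)
                yield return item;
        }
        while (token != null);
    }

    public static void Main()
    {
        var files = GetFiles();
        Console.WriteLine(FetchCount); // 0: calling GetFiles runs none of its body

        int taken = 0;
        foreach (var file in files)
        {
            if (++taken == 2) break;   // stop after the two items of the first page
        }
        Console.WriteLine(FetchCount); // 1: only the first page was fetched
    }
}
```

With an iterator of this shape, each page is requested only when the caller enumerates past the end of the previous one, so the fetches are deferred. Each fetch in the question's version still blocks the calling thread, though, because of GetAwaiter().GetResult().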

Edit 2018-11-15: Starting with C# 8, you will be able to use IAsyncEnumerable:

async IAsyncEnumerable<int> GetBigResultsAsync()
{
    await foreach (var result in GetResultsAsync())
    {
        if (result > 20) yield return result; 
    }
}

IAsyncEnumerable<string> GetAsyncAnswers()
{
    return AsyncEnum.Enumerate<string>(async consumer =>
    {
        foreach (var question in GetQuestions())
        {
            string theAnswer = await answeringService.GetAnswer(question);
            await consumer.YieldAsync(theAnswer);
        }
    });
}

https://blogs.msdn.microsoft.com/dotnet/2018/11/12/building-c-8-0/

https://archive.codeplex.com/?p=asyncenum
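
Applied to the paging loop from the question, a C# 8 async iterator might look roughly like the following sketch. It assumes C# 8 with .NET Core 3.0 or later. FetchPageAsync and its integer token are hypothetical stand-ins for ListBlobsSegmentedAsync and BlobContinuationToken, so the example runs without a storage account.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncPagingSketch
{
    // Counts how many pages have been requested from the stand-in service.
    public static int FetchCount;

    // Hypothetical stand-in for ListBlobsSegmentedAsync: three pages of two names each.
    private static async Task<(string[] Items, int? NextToken)> FetchPageAsync(int? token)
    {
        FetchCount++;
        await Task.Delay(10); // simulated network latency
        int page = token ?? 0;
        var items = new[] { $"file{page * 2}", $"file{page * 2 + 1}" };
        return (items, page < 2 ? page + 1 : (int?)null);
    }

    // Awaits each page without blocking a thread and yields its items one by one.
    public static async IAsyncEnumerable<string> GetFilesAsync()
    {
        int? token = null;
        do
        {
            var response = await FetchPageAsync(token);
            token = response.NextToken;
            foreach (var item in response.Items)
                yield return item;
        }
        while (token != null);
    }

    public static async Task Main()
    {
        await foreach (var file in GetFilesAsync())
        {
            Console.WriteLine(file);
            if (file == "file2") break; // first item of the second page
        }
        Console.WriteLine(FetchCount);  // 2: the third page is never requested
    }
}
```

The caller still decides how far to enumerate, as with yield in a synchronous iterator, but each page fetch is awaited rather than blocked on.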

Original Answer

In this case you need to use IObservable<BlobItem> instead of IEnumerable<BlobItem>, at least internally, depending on how you actually call GetFiles.

This question has a great explanation that is worth reading:

How to yield return item when doing Task.WhenAny

Or the corresponding blog post for the accepted answer.

Side note: You might want to use the parameter useFlatBlobListing=true for ListBlobsSegmentedAsync instead of manually doing the recursive code.

Some quick code showing how this might look (untested):

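using System.Reactive.Linq; // Observable.Create and ToEnumerable come from the System.Reactive package

public IEnumerable<BlobItem> GetFilesAsEnumerable(CloudBlobContainer container, string directoryName, bool recursive)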
{
    return GetFiles(container, directoryName, recursive).ToEnumerable();
}

public IObservable<BlobItem> GetFiles(CloudBlobContainer container, string directoryName, bool recursive)
{
    return Observable.Create<BlobItem>(async obs =>
        {
            BlobContinuationToken continuationToken = null;

            do
            {
                var response = await container.GetDirectoryReference(directoryName).ListBlobsSegmentedAsync(/*useFlatBlobListing*/ recursive, BlobListingDetails.None, 100, continuationToken, null, null);

                continuationToken = response.ContinuationToken;
                foreach (var item in response.Results)
                {
                    // Only required if recursive == false
                    if (item.GetType() != typeof(CloudBlobDirectory))
                        obs.OnNext(new BlobItem(item));
                }
            }
            while (continuationToken != null);
        });
}
