I'm trying to grab one page of up to 5000 blobs, with no prefix. The container in question has roughly 26,000 blobs in it. I consistently get no results on my first page, but I noticed the BlobContinuationToken that's returned isn't null, so I can page again and get results on the second page. Why aren't there any results on the first page, when there are on the second?
I'd like to be able to do this, and grab only one page:
var response = await container.ListBlobsSegmentedAsync(null).ConfigureAwait(false);
But this returns no results, so instead, I have to call it again, passing in the continuationToken, at which point I do get results.
I tried passing true for useFlatBlobListing and it didn't change anything, but I don't really understand the option (as far as I'm aware, my container's contents are flat).
I've used ListBlobsSegmentedAsync before and never noticed this problem (but the containers were larger).
I also tried passing new BlobContinuationToken() instead of null. I'm not sure if one is preferable.
On a larger container, after a while it started taking more than two page fetches to get results. Each page fetch (including the empty ones) took right around 5 seconds, until it finally returned results. At its peak I saw it take up to 12 page fetches, over 60 seconds in total, to return results on a container that had over 300,000 blobs. This was shortly after doing massive deletes on the container.
It's not at all unexpected that you can occasionally get empty pages, or pages with fewer than the max results, along with a continuation token. Why is this a problem if the continuation token returned takes you to your next page? If you don't want to deal with continuation tokens, ListBlobs (not the segmented version) will give you an iterator that lazily gets more blobs and follows the continuation tokens for you.
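If you do want to stay with the segmented call, here is a minimal sketch of following the continuation token until a page actually contains something. The class and method names (BlobPaging, GetFirstNonEmptyPageAsync) are made up for illustration, and the namespace assumes the older WindowsAzure.Storage package; newer versions of the same client use Microsoft.Azure.Storage.Blob instead.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob; // assumption: legacy WindowsAzure.Storage package

public static class BlobPaging // hypothetical helper, not part of the SDK
{
    // Keeps requesting segments until one contains results or the listing ends.
    public static async Task<List<IListBlobItem>> GetFirstNonEmptyPageAsync(CloudBlobContainer container)
    {
        BlobContinuationToken token = null; // null starts the listing from the beginning

        do
        {
            BlobResultSegment segment =
                await container.ListBlobsSegmentedAsync(token).ConfigureAwait(false);

            List<IListBlobItem> page = segment.Results.ToList();
            if (page.Count > 0)
            {
                return page; // first page that actually has blobs in it
            }

            // Empty page: a non-null token means the service has more to scan.
            token = segment.ContinuationToken;
        } while (token != null);

        // A null token with no results means the listing is exhausted.
        return new List<IListBlobItem>();
    }
}
```

Calling var page = await BlobPaging.GetFirstNonEmptyPageAsync(container); would then replace the single ListBlobsSegmentedAsync(null) call from the question. It doesn't make the empty fetches any faster, it just hides them from the caller.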
As for the root cause, there are a lot of reasons this could happen. My guess is that it's the frequent deletes in your case, but that's only a guess. Returning fewer than the max results together with a continuation token happens for multiple reasons, but a couple I suspect here are:
1. We hit the server-side timeout, so we return what we have thus far.
2. We hit the edge of a partition, which happens more frequently when the blob list is large and may span several machines.
If you're frequently deleting blobs and have a lot of them, it may take some time to actually garbage collect them, so we end up spending all our time scanning through stuff we don't return.