
ExecuteAsync() of Azure Table Storage failing to insert all the records

I am trying to insert 10,000 records into Azure Table Storage. I am using ExecuteAsync() to do it, but somehow only around 7,500 records are inserted and the rest are lost. I am purposely not using the await keyword because I don't want to wait for the result, just store the records in the table. Below is my code snippet.

private static async void ConfigureAzureStorageTable()
    {
        CloudStorageAccount storageAccount =
            CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
        TableResult result = new TableResult();
        CloudTable table = tableClient.GetTableReference("test");
        table.CreateIfNotExists();

        for (int i = 0; i < 10000; i++)
        {
            var verifyVariableEntityObject = new VerifyVariableEntity()
            {
                ConsumerId = String.Format("{0}", i),
                Score = String.Format("{0}", i * 2 + 2),
                PartitionKey = String.Format("{0}", i),
                RowKey = String.Format("{0}", i * 2 + 2)
            };
            TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
            try
            {
                table.ExecuteAsync(insertOperation);
            }
            catch (Exception e)
            {

                Console.WriteLine(e.Message);
            }
        }
    }

Is anything incorrect with the usage of the method?

You still want to await table.ExecuteAsync(). That will mean ConfigureAzureStorageTable() returns control to the caller at that point, and the caller can continue executing.

The way you have it in the question, ConfigureAzureStorageTable() is going to continue past the call to table.ExecuteAsync() and exit, and things like table will go out of scope, while the table.ExecuteAsync() task is still not complete.

There are plenty of caveats about using async void on SO and elsewhere that you will also need to consider. You could just as easily make the method async Task but not await it in the caller yet, keeping the returned Task around for clean termination, etc.

Edit: one addition - you almost certainly want to use ConfigureAwait(false) on your await there, as you don't appear to need to preserve any context. This blog post has some guidelines on that and on async in general.
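
A minimal sketch of that shape, assuming the same VerifyVariableEntity type and "StorageConnectionString" setting as in the question (the Async method name is just illustrative, not from the original post):

private static async Task ConfigureAzureStorageTableAsync()
{
    CloudStorageAccount storageAccount =
        CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    CloudTable table = tableClient.GetTableReference("test");
    table.CreateIfNotExists();

    for (int i = 0; i < 10000; i++)
    {
        var entity = new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        };
        // Await each insert; ConfigureAwait(false) because no context needs to be preserved.
        await table.ExecuteAsync(TableOperation.Insert(entity)).ConfigureAwait(false);
    }
}

// The caller can start the work without awaiting it immediately and keep the
// returned Task around for clean termination:
// Task storageTask = ConfigureAzureStorageTableAsync();
// ...do other work...
// await storageTask;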

According to your requirement, I have tested your scenario on my side using CloudTable.ExecuteAsync and CloudTable.ExecuteBatchAsync successfully. Here is my code snippet using CloudTable.ExecuteBatchAsync to insert records into Azure Table Storage; you could refer to it.

Program.cs

class Program
{
    static void Main(string[] args)
    {
        CloudStorageAccount storageAccount =
            CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
        TableResult result = new TableResult();
        CloudTable table = tableClient.GetTableReference("test");
        table.CreateIfNotExists();

        //Generate records to be inserted into Azure Table Storage
        var entities = Enumerable.Range(1, 10000).Select(i => new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        });

        //Group records by PartitionKey and prepare for executing batch operations
        var batches = TableBatchHelper<VerifyVariableEntity>.GetBatches(entities);

        //Execute batch operations in parallel
        Parallel.ForEach(batches, new ParallelOptions()
        {
            MaxDegreeOfParallelism = 5
        }, (batchOperation) =>
        {
            try
            {
                table.ExecuteBatch(batchOperation);
                Console.WriteLine("Writing {0} records", batchOperation.Count);
            }
            catch (Exception ex)
            {
                Console.WriteLine("ExecuteBatch throw a exception:" + ex.Message);
            }
        });
        Console.WriteLine("Done!");
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}

TableBatchHelper.cs

public class TableBatchHelper<T> where T : ITableEntity
{
    const int batchMaxSize = 100;

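    // Splits the entities into groups by PartitionKey, then chunks each group into
    // TableBatchOperations of at most 100 InsertOrReplace operations per batch.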
    public static IEnumerable<TableBatchOperation> GetBatches(IEnumerable<T> items)
    {
        var list = new List<TableBatchOperation>();
        var partitionGroups = items.GroupBy(arg => arg.PartitionKey).ToArray();
        foreach (var group in partitionGroups)
        {
            T[] groupList = group.ToArray();
            int offSet = batchMaxSize;
            T[] entities = groupList.Take(offSet).ToArray();
            while (entities.Any())
            {
                var tableBatchOperation = new TableBatchOperation();
                foreach (var entity in entities)
                {
                    tableBatchOperation.Add(TableOperation.InsertOrReplace(entity));
                }
                list.Add(tableBatchOperation);
                entities = groupList.Skip(offSet).Take(batchMaxSize).ToArray();
                offSet += batchMaxSize;
            }
        }
        return list;
    }
}

Note: As mentioned in the official document about inserting a batch of entities:

A single batch operation can include up to 100 entities.

All entities in a single batch operation must have the same partition key.

In summary, please check whether it works on your side. Also, you could capture the detailed exception within your console application and capture the HTTP traffic via Fiddler to catch the failing HTTP requests when you insert records into Azure Table Storage.

How about using a TableBatchOperation to run batches of N inserts at once?

private const int BatchSize = 100;

private static async void ConfigureAzureStorageTable()
{
    CloudStorageAccount storageAccount =
        CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    TableResult result = new TableResult();
    CloudTable table = tableClient.GetTableReference("test");
    table.CreateIfNotExists();

    var batchOperation = new TableBatchOperation();

    for (int i = 0; i < 10000; i++)
    {
        var verifyVariableEntityObject = new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        };
        TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
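        // Note: all operations in a single TableBatchOperation must target the same
        // PartitionKey, so the entities above would need a shared PartitionKey for
        // these batches to succeed.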
        batchOperation.Add(insertOperation);

        if (batchOperation.Count >= BatchSize)
        {
            try
            {
                await table.ExecuteBatchAsync(batchOperation);
                batchOperation = new TableBatchOperation();
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
        }
    }

    if(batchOperation.Count > 0)
    {
        try
        {
            await table.ExecuteBatchAsync(batchOperation);
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}

You can adjust BatchSize to what you need. Small disclaimer: I didn't try to run this, though it should work.

But I can't help but wonder why your function is async void? That should be reserved for event handlers and similar cases where you cannot decide the interface. In most cases you want to return a Task, because now the caller cannot catch exceptions that occur in this function.

async void is not a good practice unless it is an event handler.

https://msdn.microsoft.com/en-us/magazine/jj991977.aspx
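
A small sketch of why that matters, using hypothetical DoWorkAsync/DoWorkFireAndForget helpers (not from the original code): when the method returns a Task, the caller can await it and catch exceptions; with async void, the exception escapes to the synchronization context and the caller never sees it.

// Hypothetical helpers for illustration only.
static async Task DoWorkAsync()
{
    await Task.Delay(10);
    throw new InvalidOperationException("boom");
}

static async void DoWorkFireAndForget()
{
    await Task.Delay(10);
    throw new InvalidOperationException("boom"); // the caller cannot catch this
}

static async Task CallerAsync()
{
    try
    {
        await DoWorkAsync(); // exception is observed and caught here
    }
    catch (InvalidOperationException ex)
    {
        Console.WriteLine(ex.Message);
    }

    // With async void there is no Task to await, so a try/catch around the call
    // does not see the exception; it is rethrown on the SynchronizationContext.
    DoWorkFireAndForget();
}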

If you plan to insert many records into Azure Table Storage, batch insert is your best bet.

https://msdn.microsoft.com/en-us/library/azure/microsoft.windowsazure.storage.table.tablebatchoperation.aspx

Keep in mind that it has a limit of 100 table operations per batch.
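
A minimal sketch of staying under that limit (inside an async method), assuming a CloudTable named table and an entities collection whose items all share the same PartitionKey, unlike the per-record keys in the question:

// Sketch only: assumes every entity in 'entities' has the same PartitionKey.
const int maxBatchSize = 100;
var entityList = entities.ToList();

for (int offset = 0; offset < entityList.Count; offset += maxBatchSize)
{
    var batch = new TableBatchOperation();
    foreach (var entity in entityList.Skip(offset).Take(maxBatchSize))
    {
        batch.Insert(entity);
    }
    await table.ExecuteBatchAsync(batch);
}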

I ran into the same problem and solved it by forcing ExecuteAsync to wait for the result before moving on:

table.ExecuteAsync(insertOperation).GetAwaiter().GetResult()
