
Azure Data Explorer High Ingestion Latency with Streaming

Original 2021-06-15 08:09:24 5 1 azure/ latency/ azure-data-explorer/ data-ingestion

We are using streaming ingestion from Event Hubs to Azure Data Explorer. The documentation states the following:

The streaming ingestion operation completes in under 10 seconds, and your data is immediately available for query after completion.

I am also aware of the limitations, such as:

Streaming ingestion performance and capacity scales with increased VM and cluster sizes. The number of concurrent ingestion requests is limited to six per core. For example, for 16 core SKUs, such as D14 and L16, the maximal supported load is 96 concurrent ingestion requests. For two core SKUs, such as D11, the maximal supported load is 12 concurrent ingestion requests.
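The arithmetic behind the quoted limit can be sketched as follows. This is only an illustration of the "six per core" rule from the quote above; the function name and the SKU-to-core mapping are written out here for the example and are not part of any Azure API.

```python
# Limit quoted from the documentation: six concurrent requests per core.
REQUESTS_PER_CORE = 6


def max_concurrent_ingestion_requests(cores: int) -> int:
    """Return the documented ceiling on concurrent streaming ingestion requests."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return REQUESTS_PER_CORE * cores


# Core counts as given in the quoted documentation text.
SKU_CORES = {"D11": 2, "D14": 16, "L16": 16}

for sku, cores in SKU_CORES.items():
    print(f"{sku}: {max_concurrent_ingestion_requests(cores)} concurrent requests")
```

For the D11 dev SKU this gives 12 concurrent requests, which a load of roughly 5000 events per day should not come close to exhausting.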

But we are currently experiencing an ingestion latency of 5 minutes (as shown in the Azure metrics), and we see that data is actually available for querying only 10 minutes after ingestion.

Our dev environment uses the cheapest SKU, Dev(No SLA)_Standard_D11_v2. However, we only ingest ~5000 events per day in this environment (per the "Events Received" metric), so this latency is very high and unusable for our streaming scenario, where the data must be available for queries in under 1 minute.

Is this the latency we have to expect from the dev environment, or are there any tweaks we can apply to achieve lower latency in those environments as well?

How will latency behave in a production environment such as Standard_D12_v2? Do we have to expect those high numbers there as well, or is there a fundamental difference in behavior between Dev/Test and production environments in this respect?

1 Answer

Did you follow the two steps needed to enable streaming ingestion for the specific table, i.e. enabling streaming ingestion on the cluster and on the table?
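For the table-level step, a minimal sketch of the management commands involved might look like this. The table name `MyTable` and the helper function are hypothetical; the sketch only builds the command strings and does not connect to a cluster. Running them would require a Kusto client or the query editor, and the cluster-level setting is changed in the cluster's configuration rather than through these commands.

```python
import re

# Conservative check for simple table names; an assumption made for this
# sketch, not the full Kusto identifier grammar.
_SIMPLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def streaming_policy_commands(table: str) -> dict[str, str]:
    """Build the management commands to enable and inspect the table's
    streaming ingestion policy (hypothetical helper, strings only)."""
    if not _SIMPLE_NAME.match(table):
        raise ValueError(f"unsupported table name: {table!r}")
    return {
        "enable": f".alter table {table} policy streamingingestion enable",
        "show": f".show table {table} policy streamingingestion",
    }


if __name__ == "__main__":
    # "MyTable" is a placeholder; substitute the table the Event Hubs
    # data connection writes to.
    for name, command in streaming_policy_commands("MyTable").items():
        print(f"{name}: {command}")
```

If the `show` command reports no enabled policy, one thing worth checking (an assumption, not a diagnosis) is whether the data is arriving through queued ingestion instead, whose default batching window of 5 minutes would match the latency reported in the question.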

In general, this is not expected. The Dev/Test cluster should exhibit the same behavior as the production cluster, within the expected limitations on the size and scale of operations. If you test it with a few events and still see the same latency, it means that something is wrong.

If you did follow these steps and it still does not work, please open a support ticket.