
How to get the source data when ingestion fails in Kusto (ADX)

  1. I have a base table in an ADX (Kusto) database.

.create table base (info:dynamic)

  2. I have written a function that parses the dynamic column of the base table, extracts a few columns, and stores them in another table whenever the base table receives data (from EventHub). The function and its update policy are below:

.create function extractBase() {
    base
    | evaluate bag_unpack(info)
    | project tostring(column1), toreal(column2), toint(column3), todynamic(column4)
}

.alter table target_table policy update @'[{"IsEnabled": true, "Source": "base", "Query": "extractBase()", "IsTransactional": false, "PropagateIngestionProperties": true}]'

Suppose the data in the base table does not contain an expected column; an ingestion error then occurs. How do I get the source row that caused the failure? Running .show ingestion failures displays the failure message, and there is a column called IngestionSourcePath. When I browse to that URL, I get a Resource Not Found exception.

If an ingestion failure happens, I need to store the particular row of the base table in an IngestionFailure table for further investigation.

Ordinarily, source data cannot "not have" a column defined by the table's schema. If no value was ingested for some column in some row, a null value will be present there and the update policy will not fail.

Here, however, the update policy will break if a row of the original table does not contain all the expected columns. Currently the source data for such errors is not emitted as part of the failure message.
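One possible way to sidestep this kind of failure, offered as a sketch rather than as the behavior of the function in the question, is to read each property directly from the dynamic column instead of relying on bag_unpack. Accessing a property that is absent from a dynamic bag yields a null (or an empty string after tostring), so the projected schema no longer depends on the incoming data. The sample rows below are invented for illustration; the column names are the ones used in the question.

```kusto
// Hypothetical sample rows standing in for the base table:
// the second row lacks column2, column3 and column4.
datatable(info: dynamic) [
    dynamic({"column1": "a", "column2": 1.5, "column3": 7, "column4": {"k": "v"}}),
    dynamic({"column1": "b"})
]
// Read each property by name; a missing property produces a null/empty value
// instead of an unresolved column.
| project
    column1 = tostring(info.column1),
    column2 = toreal(info.column2),
    column3 = toint(info.column3),
    column4 = info.column4
```

If this approach suits the data, the same project clause could replace the bag_unpack step in the body of extractBase(), with base as the source instead of the datatable literal.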

In general, the source URI is only useful when you are ingesting data from blobs. In other cases, the URI shown in the failed ingestion info points to an internal blob that was created on the fly and that no one has access to.
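The failure records discussed here can be narrowed to those raised by update policies. The following is a minimal sketch; the one-day time window is an arbitrary assumption, and the projected columns are a subset of what .show ingestion failures returns.

```kusto
.show ingestion failures
// Assumed time window; adjust to when the failure occurred.
| where FailedOn > ago(1d)
// Keep only failures that were raised while running an update policy.
| where OriginatesFromUpdatePolicy == true
| project FailedOn, Table, FailureKind, ErrorCode, Details, IngestionSourcePath
| order by FailedOn desc
```

The Details column carries the failure message, while IngestionSourcePath is the URI that, for non-blob sources such as EventHub, cannot be browsed.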

However, there is a command that is missing from the documentation (we will make sure to update it) that lets you duplicate the source data of the next failed ingestion into a specific table, dumping it to a storage container that you provide.

The syntax is: .dup-next-failed-ingest into TableName to h@'Path to Azure blob container'

Here the path to the Azure Blob container must include a writeable SAS. The permission required to run this command is DB admin.
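As an illustration of the syntax above, a filled-in command might look like the following. This is only a sketch: the parts in angle brackets are placeholders to be replaced with your own values, and the choice of target_table is an assumption; use whichever table name the failure is reported against in .show ingestion failures.

```kusto
// Placeholders in angle brackets are hypothetical; supply your own storage account,
// container, and a SAS token that grants write access.
// target_table is assumed here; substitute the table the failure is reported against.
.dup-next-failed-ingest into target_table to h@'https://<storage-account>.blob.core.windows.net/<container>?<writeable-SAS-token>'
```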
