
Monitor Azure Data Lake Store

I store data as XML files in Data Lake Store, with one folder per source system.

At the end of every day, I would like to run some kind of log analytics to find out how many new XML files have been stored in Data Lake Store under each folder. I have enabled Diagnostic Logs and also added the OMS Log Analytics Suite.

What is the best way to produce this report?

It is possible to build an aggregate report (and even create an alert/notification). Using Log Analytics, you can create a query that finds every instance of a file being written to your Azure Data Lake Store, based on either a common root path or a file naming convention:

AzureDiagnostics
| where ( ResourceProvider == "MICROSOFT.DATALAKESTORE" )
| where ( OperationName == "create" )
| where ( Path_s contains "/webhdfs/v1/##YOUR PATH##")

Alternatively, the last line could be:

| where ( Path_s contains ".xml")

...or a combination of both.
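
As a rough sketch of the combined form, the two filters can simply be chained. The `##YOUR PATH##` placeholder is kept from the query above and must be replaced with your own folder. Using `endswith` instead of `contains` for the extension is a suggestion rather than part of the original answer; it avoids matching paths that merely contain ".xml" somewhere in the middle.

```kusto
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATALAKESTORE"
| where OperationName == "create"
// Restrict to one root folder (placeholder, replace with your own path)
| where Path_s contains "/webhdfs/v1/##YOUR PATH##"
// Keep only paths that end in .xml
| where Path_s endswith ".xml"
```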

You can then use this query to create an alert that reports, at a given interval (e.g. every 24 hours), the number of files that were created.
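
For an end-of-day total, one possible shape is to limit the query to the last 24 hours and count the matching rows. This is only a sketch: the column name `NewFiles` is made up for illustration, and the 24-hour window assumes the query runs once a day.

```kusto
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATALAKESTORE"
| where OperationName == "create"
| where Path_s contains ".xml"
// Only look at events logged during the last 24 hours
| where TimeGenerated > ago(24h)
// NewFiles is an illustrative column name
| summarize NewFiles = count()
```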

Depending on what you need, you can shape the query in one of these ways:

  • If you use a common file naming convention, match on paths that contain that naming pattern.
  • If you use a common path, match on paths that contain the common path.
  • If you want to be notified of all instances (not just specific ones), use an aggregating query and an alert that fires when a threshold is reached or exceeded (i.e. 1 or more events):

     AzureDiagnostics
     | where ( ResourceProvider == "MICROSOFT.DATALAKESTORE" )
     | where ( OperationName == "create" )
     | where ( Path_s contains ".xml")
     | summarize AggregatedValue = count() by bin(TimeGenerated, 24h), OperationName
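
To get the per-folder breakdown asked about in the question, the same query could be grouped by a folder name extracted from the path. The sketch below rests on an assumption that is not confirmed by the original answer: that `Path_s` has the form `/webhdfs/v1/<folder>/<file>.xml`, so that splitting on "/" leaves the folder name at index 3. Adjust the index if your folders sit deeper in the hierarchy. `Folder` and `NewXmlFiles` are illustrative names.

```kusto
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATALAKESTORE"
| where OperationName == "create"
| where Path_s endswith ".xml"
// Assumption: Path_s looks like /webhdfs/v1/<folder>/<file>.xml,
// so the split yields ["", "webhdfs", "v1", "<folder>", "<file>.xml"]
| extend Folder = tostring(split(Path_s, "/")[3])
// One row per folder per day with the number of create events
| summarize NewXmlFiles = count() by Folder, bin(TimeGenerated, 1d)
| order by TimeGenerated desc, NewXmlFiles desc
```

Grouping by `bin(TimeGenerated, 1d)` keeps one row per folder per day, so the same query can feed either a daily report or a longer trend view.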

With the query in place, you can create the alert by following the steps in this blog post: https://azure.microsoft.com/en-gb/blog/control-azure-data-lake-costs-using-log-analytics-to-create-service-alerts/

Let us know if you have more questions or need additional details.

