
Metadata in Azure Data Lake

I have written an Azure Function in C# that recursively goes through the data lake and generates a file with the metadata (file name, path, size, modified date, etc.) of all files and folders in the data lake.

This takes quite a while since we have a lot of files and folders. So I was wondering whether there is a metadata store that we could pull this data from directly? I am thinking of something like the sys tables in SQL Server.

Thanks in advance!

Some file information features will soon be released that give you some of the file system metadata properties. But you would still need to enumerate your folder hierarchies yourself.

For example:

@data = 
  EXTRACT 
    vehicle_id int
  , entry_id long
  , event_date DateTime
  , latitude float
  , longitude float
  , speed int
  , direction string
  , trip_id int?
  , uri = FILE.URI()
  , modified_date = FILE.MODIFIED()
  , created_date = FILE.CREATED()
  , file_sz = FILE.LENGTH()
FROM "/Samples/Data/AmbulanceData/vehicle{*}"
USING Extractors.Csv();

OUTPUT @data
TO "/output/releasenotes/winter2018/fileprops.csv"
USING Outputters.Csv(outputHeader : true);
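
The script above only covers files that match the given path pattern, so the folder hierarchy still has to be walked by hand. As a rough, language-agnostic sketch of that walk (written in Python rather than C# so it can be run standalone), something like the following might work. Note that `Entry`, `walk`, `write_metadata_csv`, `FAKE_STORE` and `fake_list_children` are hypothetical names made up for illustration; `list_children` stands in for whatever directory-listing call your Data Lake SDK provides and is not a real API.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Entry:
    """One file or folder as reported by a (hypothetical) listing call."""
    path: str
    is_directory: bool
    size: int
    modified: datetime


def walk(root: str, list_children: Callable[[str], Iterable[Entry]]) -> Iterator[Entry]:
    """Yield every entry below root, descending into each folder.

    An explicit stack replaces recursion, so very deep hierarchies
    cannot hit the interpreter's recursion limit.
    """
    pending = [root]
    while pending:
        folder = pending.pop()
        for entry in list_children(folder):
            yield entry
            if entry.is_directory:
                pending.append(entry.path)


def write_metadata_csv(entries: Iterable[Entry], out) -> int:
    """Write one CSV row per entry and return the number of rows written."""
    writer = csv.writer(out)
    writer.writerow(["path", "is_directory", "size", "modified"])
    count = 0
    for e in entries:
        writer.writerow([e.path, e.is_directory, e.size, e.modified.isoformat()])
        count += 1
    return count


# In-memory stand-in for the store (assumption: made-up sample data),
# used only so the sketch runs without any Azure dependency.
FAKE_STORE = {
    "/": [Entry("/Samples", True, 0, datetime(2018, 1, 1))],
    "/Samples": [
        Entry("/Samples/Data", True, 0, datetime(2018, 1, 2)),
        Entry("/Samples/readme.txt", False, 12, datetime(2018, 1, 3)),
    ],
    "/Samples/Data": [
        Entry("/Samples/Data/vehicle1.csv", False, 2048, datetime(2018, 1, 4)),
    ],
}


def fake_list_children(path: str) -> list:
    """Return the direct children of path from the in-memory stand-in."""
    return FAKE_STORE.get(path, [])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_metadata_csv(walk("/", fake_list_children), buffer)
    print(buffer.getvalue())
```

Because `walk` is a generator, rows can be written as they are discovered instead of holding the whole listing in memory. The run time is still dominated by one listing call per folder, which is why a catalog view would help.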

I suggest that you file a request for a file system metadata catalog view (e.g., usql.files and usql.filesystem) at http://aka.ms/adlfeedback to augment our metadata catalog views.
