Read Meta Data of files inside Azure Data Lake Store

I need to read the metadata of files stored in Azure Data Lake Store.

The files may be JPEG, Excel, or TIFF.

Any suggestions would be appreciated. I am using Microsoft Azure Data Lake Store with U-SQL.

At the moment this is not supported. According to the feedback site, it appears to be on the backlog.

You might be able to write a custom extractor, as suggested in the link:

Where the metadata is embedded in the file content, like EXIF in JPEG, extract some of the properties from the content using a custom extractor.

According to this blog post, this has already been done for image property extraction; see the repo. It can serve as a guide for implementing this in your own scenarios. Here is an example query:

// Extract image properties with the custom extractor from the sample repo.
// name and format are virtual columns bound from the file path pattern.
@image_features =
    EXTRACT copyright string,
            equipment_make string,
            equipment_model string,
            description string,
            thumbnail byte[],
            name string,
            format string
    FROM @"/Samples/Data/Images/{name}.{format}"
    USING new Images.ImageFeatureExtractor(scaleWidth: 500, scaleHeight: 300);

// Keep only JPEG files, matching the common spellings of the extension.
@image_features = SELECT * FROM @image_features
                  WHERE format IN("JPEG", "jpeg", "jpg", "JPG");

OUTPUT @image_features
TO @"/output/images/image_features.csv"
USING Outputters.Csv();

Alternatively, have another process extract those properties and write them to a metadata file in Azure Data Lake, so that you can join against that file.
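As a rough illustration of that second option, here is a minimal Python sketch of such an external process. It is an assumption-laden example, not anything from the linked blog post or repo: it assumes the files are reachable on a local path (for example, after being downloaded from the lake), the function names and the CSV column names are hypothetical, and it only records the file name, size, and a format guessed from the leading bytes. Real property extraction (EXIF tags, Excel document properties) would need a format-specific library on top of this.

```python
import csv
import os

# Leading "magic" bytes for the formats named in the question.
# Note: the ZIP signature matches any ZIP container, not only .xlsx files.
MAGIC = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),                           # little-endian TIFF
    (b"MM\x00*", "TIFF"),                           # big-endian TIFF
    (b"PK\x03\x04", "XLSX"),                        # ZIP container (e.g. .xlsx)
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "XLS"),   # OLE2 compound file (e.g. .xls)
]


def sniff_format(header: bytes) -> str:
    """Guess the file format from its first bytes; 'UNKNOWN' if no match."""
    for magic, name in MAGIC:
        if header.startswith(magic):
            return name
    return "UNKNOWN"


def describe_file(path: str) -> dict:
    """Collect basic metadata for one file (hypothetical column names)."""
    with open(path, "rb") as f:
        header = f.read(8)  # the longest signature above is 8 bytes
    return {
        "name": os.path.basename(path),
        "format": sniff_format(header),
        "size_bytes": os.path.getsize(path),
    }


def write_metadata_csv(input_dir: str, output_csv: str) -> int:
    """Write one CSV row per regular file in input_dir; return the row count."""
    paths = [os.path.join(input_dir, n) for n in sorted(os.listdir(input_dir))]
    rows = [describe_file(p) for p in paths if os.path.isfile(p)]
    with open(output_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "format", "size_bytes"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling write_metadata_csv with an input directory and an output path outside that directory produces a CSV with a header row. Once uploaded to the lake, a file like this could be read in U-SQL with the built-in Extractors.Csv() (skipping the header row) and joined to other rowsets on the name column.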

