
How to process telemetry JSON messages in Azure Data Lake Gen2?

I have simulated devices that send messages to IoT Hub blob storage, and from there I copy the data (encoded in JSON format) to Azure Data Lake Gen2 with a pipeline built in Azure Data Factory.

How do I convert these JSON output files to CSV files so the Data Lake engine can process them? Can't I process the incoming JSON telemetry directly in Azure Data Lake?

There are three official built-in extractors that allow you to analyze data contained in CSV, TSV, or text files.
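As a sketch of how one of the built-in extractors is invoked (the file path and column names below are hypothetical, chosen only for illustration):

```
// Read a hypothetical CSV telemetry file with the built-in CSV extractor.
@rows =
    EXTRACT deviceId    string,
            temperature double
    FROM @"sampledata/csv/telemetry.csv"
    USING Extractors.Csv(skipFirstNRows: 1);  // skip the header row
```

`Extractors.Tsv()` and `Extractors.Text()` follow the same pattern, differing only in the delimiter handling.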

But Microsoft also released additional sample extractors in their Azure GitHub repo that deal with XML, JSON, and Avro files. I have used the JSON extractor in production, and it is really stable and useful.

The JSON extractor treats the entire input file as a single JSON document. (If you have one JSON document per line, see the next section.) The columns you specify in the EXTRACT statement are pulled from that document; in this case, I'm extracting the _id and Revision properties. Note that either of these may itself be a nested object, in which case you can use the JSON UDFs for subsequent processing.

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

// Define the schema of the file; every extracted column must be mapped.
@myRecords =
    EXTRACT _id string,
            Revision string
    FROM @"sampledata/json/{*}.json"
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();
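Building on the extraction above, here is a sketch of how the JSON UDFs and a built-in outputter could flatten a nested property and write the result out as CSV. The "date" key inside Revision is an assumption made purely for illustration:

```
// JsonTuple parses a JSON string into a SqlMap<string, string>,
// so nested properties can be pulled out by key.
@parsed =
    SELECT _id,
           Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(Revision)["date"] AS RevisionDate  // hypothetical key
    FROM @myRecords;

// Write the flattened rows out as CSV with the built-in outputter.
OUTPUT @parsed
TO @"sampledata/output/records.csv"
USING Outputters.Csv(outputHeader: true);
```

This also answers the original question: rather than converting JSON to CSV before ingestion, you can extract the JSON directly and, if needed, emit CSV as the output of the same U-SQL script.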

