
Possibilities for structuring ingested JSON data using NiFi

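Using NiFi, is it possible to load a JSON file into a structured table?

I call the following weather forecast data (from 6000 weather stations) and currently load it into HDFS. It all appears on a single line: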

{"SiteRep":{"Wx":{"Param":[{"name":"F","units":"C","$":"Feels Like Temperature"},{"name":"G","units":"mph","$":"Wind Gust"},{"name":"H","units":"%","$":"Screen Relative Humidity"},{"name":"T","units":"C","$":"Temperature"},{"name":"V","units":"","$":"Visibility"},{"name":"D","units":"compass","$":"Wind Direction"},{"name":"S","units":"mph","$":"Wind Speed"},{"name":"U","units":"","$":"Max UV Index"},{"name":"W","units":"","$":"Weather Type"},{"name":"Pp","units":"%","$":"Precipitation Probability"}]},"DV":{"dataDate":"2017-01-12T22:00:00Z","type":"Forecast","Location":[{"i":"14","lat":"54.9375","lon":"-2.8092","name":"CARLISLE AIRPORT","country":"ENGLAND","continent":"EUROPE","elevation":"50.0","Period":{"type":"Day","value":"2017-01-13Z","Rep":{"D":"WNW","F":"-3","G":"25","H":"67","Pp":"0","S":"13","T":"2","V":"EX","W":"1","U":"1","$":"720"}}},{"i":"22","lat":"53.5797","lon":"-0.3472","name":"HUMBERSIDE AIRPORT","country":"ENGLAND","continent":"EUROPE","elevation":"24.0","Period":{"type":"Day","value":"2017-01-13Z","Rep":{"D":"NW","F":"-2","G":"43","H":"63","Pp":"3","S":"25","T":"4","V":"EX","W":"3","U":"1","$":"720"}}}, .....

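Ideally, I would like the data structured as a 6000-row table, one row per station.

I tried writing a schema to pass the above to Pig, but without success, probably because I am not familiar enough with JSON to translate it correctly.

While looking for an easy way to add some structure to the data, I found that NiFi has a PutHBaseJson processor.

Can anyone advise whether the PutHBaseJson processor would work with the data structure above? If so, can anyone point me to a decent tutorial that gives me a starting point for the configuration?

Any guidance is much appreciated.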

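You will likely want to use a SplitJson processor to split the 6000-record JSON structure into 6000 individual flowfiles. If you need to "inject" the parameter definitions from the top-level response, you can use a ReplaceText or JoltTransformJSON operation to manipulate the individual JSON records. There is a good article by Yolanda Davis describing how to perform Jolt transforms (JSON -> JSON) in NiFi.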
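To make the split step concrete, here is a minimal Python sketch of what SplitJson would do to this payload. It assumes the station records live in the `SiteRep.DV.Location` array, as in the sample above, so a plausible JsonPath Expression for the processor would be `$.SiteRep.DV.Location`. The `SAMPLE` payload is a trimmed copy of the question's data and `split_locations` is a hypothetical helper name, not part of NiFi.

```python
import json

# Trimmed copy of the response from the question: two stations, a few fields each.
SAMPLE = json.dumps({
    "SiteRep": {
        "Wx": {"Param": [{"name": "T", "units": "C", "$": "Temperature"}]},
        "DV": {
            "dataDate": "2017-01-12T22:00:00Z",
            "type": "Forecast",
            "Location": [
                {"i": "14", "name": "CARLISLE AIRPORT",
                 "Period": {"type": "Day", "value": "2017-01-13Z",
                            "Rep": {"T": "2", "S": "13", "$": "720"}}},
                {"i": "22", "name": "HUMBERSIDE AIRPORT",
                 "Period": {"type": "Day", "value": "2017-01-13Z",
                            "Rep": {"T": "4", "S": "25", "$": "720"}}},
            ],
        },
    }
})


def split_locations(payload):
    """Return one JSON string per element of SiteRep.DV.Location.

    Hypothetical helper: it mirrors SplitJson emitting one flowfile
    per array element.
    """
    doc = json.loads(payload)
    return [json.dumps(loc) for loc in doc["SiteRep"]["DV"]["Location"]]


records = split_locations(SAMPLE)
print(len(records))  # one record per station in SAMPLE
```

Each string in `records` corresponds to the content of one flowfile after the split. With the full response that would be roughly 6000 flowfiles, one per station.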

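Once you have individual flowfiles that each contain a single JSON record, putting them into HBase is very easy. Bryan Bende wrote an article describing the necessary configuration for the PutHBaseJson processor.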
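PutHBaseJson turns each top-level JSON field into a column, so nested objects such as `Period` and `Rep` generally need flattening first, which is the job a Jolt transform could do inside NiFi. The Python sketch below only illustrates the target shape. It assumes `Period` and `Rep` are single objects, as in the sample, and the `period_` and `rep_` column prefixes are hypothetical naming choices. The station id `i` would be a natural candidate for the processor's Row Identifier Field Name.

```python
import json

# One station record, trimmed from the sample in the question.
RECORD = json.dumps({
    "i": "14", "lat": "54.9375", "lon": "-2.8092", "name": "CARLISLE AIRPORT",
    "Period": {"type": "Day", "value": "2017-01-13Z",
               "Rep": {"D": "WNW", "T": "2", "S": "13", "$": "720"}},
})


def flatten_location(record_json):
    """Flatten one station record into a single-level JSON object."""
    loc = json.loads(record_json)
    period = loc.pop("Period", {})   # assumed to be a single object
    rep = period.get("Rep", {})      # assumed to be a single object
    row = dict(loc)                  # station fields: i, lat, lon, name, ...
    row["period_type"] = period.get("type")
    row["period_value"] = period.get("value")
    for key, value in rep.items():
        row["rep_" + key] = value    # hypothetical prefix to avoid name clashes
    return json.dumps(row)


flat = json.loads(flatten_location(RECORD))
print(sorted(flat))  # column names the HBase row would receive
```

After this step every value is a scalar, so each field maps directly to one column qualifier in the chosen column family.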

