NiFi - How to put data into Hive database?
I am building a NiFi flow to get JSON elements from a Kafka topic and write them into a Hive table.
However, there is little to no documentation about the processors and how to use them.
What I plan to do is the following:
kafka consume --> ReplaceText --> PutHiveQL
Consuming the Kafka topic works fine; I receive a JSON string.
I would like to extract the JSON data (with ReplaceText) and put it into the Hive table (PutHiveQL).
However, I have absolutely no idea how to do this. The documentation is not helping, and there is no precise example of processor usage (or I could not find one).
Basically, you want to transform your record from Kafka into an HQL request, then send that request to the PutHiveQL processor.

I am not sure the Kafka record -> HQL transformation can be done with ReplaceText (it seems a little bit hard/tricky). In general I use a custom Groovy script processor to do this.
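To make the record-to-HQL idea concrete, here is a minimal Python sketch of the kind of transformation such a script processor would perform. The field names (`uuid`, `timestamp`) and the target table name (`events`) are illustrative assumptions, not anything from the original flow:

```python
import json

def json_to_hql(record_json, table="events"):
    """Turn a Kafka JSON record into a HiveQL INSERT statement.

    The field names and target table are hypothetical; adapt them
    to your schema. Values are quoted naively here for clarity --
    a real flow should prefer PutHiveQL's parameterized statements
    (the hiveql.args.N.value attributes) over string interpolation.
    """
    record = json.loads(record_json)
    uuid = record["uuid"]
    ts = record["timestamp"]
    return f"INSERT INTO {table} (uuid, ts) VALUES ('{uuid}', '{ts}')"

print(json_to_hql('{"uuid": "abc-123", "timestamp": "2020-01-01 00:00:00"}'))
```

Inside NiFi you would put equivalent logic in an ExecuteScript processor, writing the generated statement back as the flowfile content before routing it to PutHiveQL.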
Edit
Global overview:
EvaluateJsonPath

This extracts the `timestamp` and `uuid` properties of my JSON flowfile and puts them as attributes of the flowfile.
ReplaceText

This sets the flowfile content to an empty string and replaces it with the `Replacement Value` property, in which I build the query.
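A plausible configuration for those two processors might look like the following. The JSONPath expressions, attribute names, and table/column names are illustrative assumptions based on the fields mentioned above:

```
EvaluateJsonPath
  Destination:        flowfile-attribute
  timestamp:          $.timestamp        (dynamic property)
  uuid:               $.uuid             (dynamic property)

ReplaceText
  Replacement Strategy: Always Replace
  Replacement Value:    INSERT INTO events (uuid, ts)
                        VALUES ('${uuid}', '${timestamp}')
```

The `${uuid}` and `${timestamp}` references use NiFi Expression Language to read the attributes that EvaluateJsonPath extracted, so the flowfile content leaving ReplaceText is a ready-to-run HiveQL statement for PutHiveQL.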
You can directly inject the streaming data using the PutHiveStreaming processor: create an ORC table with a structure matching the flow, pass the flow to the PutHive3Streaming processor, and it works.
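For reference, a table suitable for Hive streaming ingestion could be declared roughly like this. The table and column names are hypothetical; the essential parts are the ORC storage format and the transactional property, since Hive streaming writes only to ACID tables:

```sql
-- Illustrative schema; Hive streaming requires a transactional
-- table stored as ORC (ACID support enabled in the metastore).
CREATE TABLE events (
  uuid STRING,
  ts   TIMESTAMP
)
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');
```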