I would like to store the results of a SQL query as monthly CSV files on an SFTP server.
For instance, my query is:
select fooId, bar from FooBar
where query_date>=20180101 and query_date<20180201 --(for the month of January 2018)
I would like to store the result as 20180101_FooBar.csv on my SFTP server. The files for other months follow the same process with a different query_date interval.
One important consideration: I have to store *fooId* as an MD5 hash string.
How can I automate this flow in NiFi?
Roughly, the flow that I foresee is:
*ExecuteSQL* (but I am not sure how to parameterize the counter for query_date)
-> *ConvertAvroToJson*
-> *EvaluateJsonPath* (to extract the fooId)
-> *HashContent*
-> *MergeContent*
-> *PutSFTP*
Please advise on how I may take this forward.
For this case I could think of three approaches.
Approach 1: Execute the SQL query with an MD5 function to get the hash value of fooId:
Flow:
GenerateFlowFile //add startdate,enddate attributes
startdate -> ${now():format("yyyyMM"):minus(1):append("01")} enddate -> ${now():format("yyyyMM"):append("01")}
ExecuteSQL //select md5(fooId) fooId, bar from FooBar where
query_date>=${startdate} and query_date<${enddate}
Change the above query as per your source to get md5 hash value for column
ConvertRecord //convert Avro format to Json format
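As a rough cross-check of the startdate and enddate attributes, here is a minimal Python sketch of the same monthly window, computed outside NiFi. The function name `month_window` is hypothetical and not part of NiFi. One thing it makes explicit is the year rollover: subtracting 1 from a yyyyMM number does not roll back to December (201801 minus 1 is 201800), so the January run may need separate handling.

```python
from datetime import date


def month_window(today):
    """Return (startdate, enddate, filename) for the month before `today`.

    Hypothetical helper for illustration only; it mirrors the
    startdate/enddate attributes used in the flow above.
    """
    first_of_this_month = today.replace(day=1)
    if first_of_this_month.month == 1:
        # January: the previous month is December of the previous year.
        first_of_prev_month = first_of_this_month.replace(
            year=first_of_this_month.year - 1, month=12
        )
    else:
        first_of_prev_month = first_of_this_month.replace(
            month=first_of_this_month.month - 1
        )
    startdate = first_of_prev_month.strftime("%Y%m%d")
    enddate = first_of_this_month.strftime("%Y%m%d")
    # File name follows the pattern from the question: <startdate>_FooBar.csv
    return startdate, enddate, f"{startdate}_FooBar.csv"


if __name__ == "__main__":
    print(month_window(date(2018, 2, 15)))
```

Running it for a date in February 2018 gives the window and file name from the question; a date in January 2018 yields 20171201 as the start date.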
Approach 2 : Create MD5 hash value in NiFi
Flow:
GenerateFlowFile //add startdate,enddate attributes
startdate -> ${now():format("yyyyMM"):minus(1):append("01")} enddate -> ${now():format("yyyyMM"):append("01")}
ExecuteSQL //select fooId, bar from FooBar
where query_date>=${startdate} and query_date<${enddate}
change the above query as per your source to get md5 hash value for column
ConvertRecord //convert Avro format to Json format
Another way is to write a script that parses the JSON array messages, creates the MD5 hash value for the fooId key, and writes the JSON message with the new MD5 hash value.
I have uploaded templates for both Approach 1 and Approach 2. Save them and upload them to your NiFi instance for reference, then use the approach that best fits your case.