
MergeContent processor is not giving expected result

I have JSON data in multiple small files (sometimes only one line in a file, or an empty file).

I want to merge all the small files into a single large file.

I am getting a large file in an unexpected format.

For example:

File 1:

{"code"="1", "color"="green"}
{"code"="2", "color"="blue"}
{"code"="3", "color"="orange"}

File 2:

{"code"="4", "color"="yellow"}
{"code"="5", "color"="red"}

I am getting the below output after using MergeContent:

{"code"="1", "color"="green"}
{"code"="2", "color"="blue"}
{"code"="3", "color"="orange"}{"code"="4", "color"="yellow"}
{"code"="5", "color"="red"}

Expected output:

{"code"="1", "color"="green"}
{"code"="2", "color"="blue"}
{"code"="3", "color"="orange"}
{"code"="4", "color"="yellow"}
{"code"="5", "color"="red"}

Any help is appreciated. Thank you.

This is likely because file 1 does not have a new-line character after the last line. The bin-packing merge literally writes the raw bytes of each flow file one after another, with no concept of what is in the bytes, so the bytes of the second file go right after the bytes of the first file.
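The effect can be reproduced outside NiFi. The following is only a minimal Python sketch of plain byte concatenation, not NiFi code; the names `file1`, `file2`, and `concat_raw` are made up for illustration, and the contents are copied from the question on the assumption that file 1 has no trailing new-line.

```python
# Simulated flow file contents taken from the question.
# Assumption: file 1 does NOT end with a new-line character.
file1 = (
    b'{"code"="1", "color"="green"}\n'
    b'{"code"="2", "color"="blue"}\n'
    b'{"code"="3", "color"="orange"}'
)
file2 = (
    b'{"code"="4", "color"="yellow"}\n'
    b'{"code"="5", "color"="red"}'
)


def concat_raw(flow_files):
    """Hypothetical helper: join raw bytes back to back, nothing in between."""
    return b"".join(flow_files)


merged = concat_raw([file1, file2])
# The last record of file 1 and the first record of file 2 share a line.
print(merged.decode())
```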

There are properties for Header, Demarcator, and Footer, which get inserted around the bytes accordingly. So setting "Delimiter Strategy" to "Text" and entering Shift+Enter in the "Demarcator" value will tell it to insert a new-line between each batch of bytes.
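As a rough illustration of what a demarcator does, here is a small Python sketch. It only mimics the idea of header, demarcator, and footer bytes placed around the merged content; `merge_with_demarcator` is a hypothetical helper, not part of NiFi, and the inputs again assume that neither file ends with a new-line.

```python
# Simulated flow file contents from the question (no trailing new-lines assumed).
file1 = (
    b'{"code"="1", "color"="green"}\n'
    b'{"code"="2", "color"="blue"}\n'
    b'{"code"="3", "color"="orange"}'
)
file2 = (
    b'{"code"="4", "color"="yellow"}\n'
    b'{"code"="5", "color"="red"}'
)


def merge_with_demarcator(flow_files, header=b"", demarcator=b"", footer=b""):
    """Hypothetical helper: header, then contents separated by demarcator, then footer."""
    return header + demarcator.join(flow_files) + footer


# A single new-line as the demarcator, similar in spirit to typing
# Shift+Enter into the Demarcator value.
merged = merge_with_demarcator([file1, file2], demarcator=b"\n")
print(merged.decode())
```

With these inputs every record ends up on its own line, which matches the expected output in the question.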

Keep in mind that if some files do end in new-lines, you will sometimes get two new-lines in a row with this approach. You could probably filter that out after the fact using RouteText, or try to clean it up beforehand using ReplaceText.
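The double new-line case, and the two clean-up ideas, can be sketched in Python as well. This is an analogy only: stripping trailing new-lines before the merge stands in for a ReplaceText step, and dropping blank lines afterwards stands in for a RouteText step. None of the names below come from NiFi, and the shortened records are placeholders.

```python
# Assumption for illustration: file 1 ends with a new-line, file 2 does not.
file1 = b'{"code"="1"}\n{"code"="2"}\n'
file2 = b'{"code"="3"}'

# Naive merge with a new-line demarcator: file 1's own trailing new-line
# plus the demarcator produce an empty line in the output.
naive = b"\n".join([file1, file2])

# Option A (clean up beforehand): strip trailing line breaks from each file.
cleaned = [f.rstrip(b"\r\n") for f in (file1, file2)]
merged_a = b"\n".join(cleaned)

# Option B (filter afterwards): drop blank lines from the merged result.
merged_b = b"\n".join(line for line in naive.splitlines() if line.strip())

print(merged_a.decode())
```

Both options give the same result here; which one fits better depends on where in the flow the extra step is cheaper to add.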
