Latest files and dynamic naming of files

I'm working on a Talend job that makes an HTTP GET request to get multiple PDF documents. The request returns a JSON file consisting of documentDate and documentLink. I want to get the latest documentLink(s) based on documentDate and upload the document(s) with a tFileFetch component, with the filename being "Document_1" if only one document exists for the latest date. If two documents share the latest date, the tFileFetch component should upload both, one named "Document_1" and the other "Document_2". I'm unsure how to loop over the JSON file to get the latest date and name the document(s) correctly.

What I have done so far:

 tHTTPRequest_1 --> tExtractJSONFields_1 --> tXMLMap_1 --> tFileFetch_1

This works for uploading one file, but there is no check for the latest documentDate and no naming of the files in the tFileFetch_1 component.

The returned JSON looks like this:

{
"documents": [
    {
        "documentDate": 200119,
        "documentLink": "someLink1"
    },
    {
        "documentDate": 200119,
        "documentLink": "someLink2"
    },
    {
        "documentDate": 150119,
        "documentLink": "someLink3"
    }
   ]
}

Do you have any idea how to solve this problem?

I believe you are looking for something similar to the following:

The first part of the Job consists of:

tFileInputJSON (or in your case tHTTPRequest_1) -> tSetGlobalVar -> tExtractJSONFields -> tJavaRow

tHTTPRequest_1 will grab the JSON response (sorted or not).

tSetGlobalVar will save the JSON in a global variable to be used in the second part.

tExtractJSONFields will extract the documentDate from every JSON array entry.

tJavaRow will contain the Java logic that compares the dates of the different documents and stores the maximum in a global variable:

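// Keeps the most recent date seen so far in globalMap under "MaxDate".
// Assumption: DOCDATE is a String in ddMMyy form (e.g. "200119"); adjust the pattern otherwise.
java.time.format.DateTimeFormatter fmt = java.time.format.DateTimeFormatter.ofPattern("ddMMyy");
String maxDate = (String) globalMap.get("MaxDate");

if (maxDate != null && !maxDate.trim().isEmpty()) {
    java.time.LocalDate current = java.time.LocalDate.parse(row4.DOCDATE.trim(), fmt);
    java.time.LocalDate max = java.time.LocalDate.parse(maxDate.trim(), fmt);
    if (current.isAfter(max)) {
        globalMap.put("MaxDate", row4.DOCDATE.trim());
    }
} else {
    globalMap.put("MaxDate", row4.DOCDATE);
}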

Once this is completed, the global variables will hold the original JSON response and the maximum (most recent) date.
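To make the comparison step concrete outside of Talend, here is a minimal standalone sketch of the same "keep the maximum date" idea. It assumes the dates are six-digit ddMMyy strings (which is only a guess based on the sample values 200119 and 150119); the class and method names are made up for illustration. Parsing into a real date matters because a plain numeric comparison would rank 200119 (20 Jan 2019) above 010219 (1 Feb 2019).

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;

class MaxDateSketch {
    // Assumption: dates arrive as six-digit ddMMyy strings such as "200119".
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("ddMMyy");

    // Returns the raw string of the most recent date, or null for an empty list.
    static String latest(List<String> rawDates) {
        String maxRaw = null;
        LocalDate max = null;
        for (String raw : rawDates) {
            LocalDate d = LocalDate.parse(raw.trim(), FMT);
            if (max == null || d.isAfter(max)) {
                max = d;
                maxRaw = raw.trim();
            }
        }
        return maxRaw;
    }

    public static void main(String[] args) {
        // Dates taken from the sample JSON in the question.
        System.out.println(latest(Arrays.asList("200119", "200119", "150119")));
    }
}
```

If your documentDate turns out to be yyMMdd instead, only the pattern string should need to change.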

The second part, which runs once the first part (subjob) has completed, consists of:

tJava -> tExtractJSONFields -> tMap -> tFileFetch

tJava will simply grab the JSON message and MaxDate from the global variables set in part one.

tExtractJSONFields will extract the documentLink and documentDate for each array entry and pass them, along with the max date, to the tMap component.

tMap will compare the documentDate with the max date: if they are equal, it passes the documentLink to tFileFetch; otherwise it ignores the entry.

This way only the documentLinks that have the latest date are sent on.
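The steps above do not cover the "Document_1" / "Document_2" naming from the question, so here is a rough standalone sketch of the filter plus a running counter. The Doc class and the nameLatest method are hypothetical helpers for illustration, not Talend API; the max date is passed in as the raw string found in part one.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LatestDocumentNamer {
    // Hypothetical holder for one entry of the "documents" array.
    static class Doc {
        final String documentDate;
        final String documentLink;

        Doc(String documentDate, String documentLink) {
            this.documentDate = documentDate;
            this.documentLink = documentLink;
        }
    }

    // Keeps only the entries whose date equals maxDate and numbers them
    // Document_1, Document_2, ... in their original order.
    static Map<String, String> nameLatest(List<Doc> docs, String maxDate) {
        Map<String, String> fileNameToLink = new LinkedHashMap<>();
        int counter = 0;
        for (Doc doc : docs) {
            if (doc.documentDate.equals(maxDate)) {
                counter++;
                fileNameToLink.put("Document_" + counter, doc.documentLink);
            }
        }
        return fileNameToLink;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("200119", "someLink1"),
                new Doc("200119", "someLink2"),
                new Doc("150119", "someLink3"));
        // Prints each generated file name next to the link it belongs to.
        nameLatest(docs, "200119").forEach((name, link) ->
                System.out.println(name + " -> " + link));
    }
}
```

Inside the Job itself, I believe the same counter could come from Talend's Numeric.sequence routine in a tMap output expression, something like "Document_" + Numeric.sequence("docSeq", 1, 1), with the result used as the destination filename of tFileFetch. Treat that as a suggestion to try rather than a tested setup.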

Hope this helps and is clear enough.
