I have loaded a local file into a Talend job and need to apply the conditions below to its data.
My CSV file looks like this:
NO,DATE,MARK
123,2015-03-01,200
123,2015-03-01,-200
123,2015-03-01,200
123,2015-03-01,200
125,2016-01-01,80
Both "200" and "-200" appear in the data above. When there is a -200, I need to remove the corresponding +200 value. After that, if several rows have the same NO,DATE,MARK, I need to remove the duplicates. For example, the two rows "123,2015-03-01,200" and "123,2015-03-01,200" become a single "123,2015-03-01,200".
At that point my result should look like this:
NO,DATE,MARK
123,2015-03-01,200
125,2016-01-01,80
After that I need to sum the marks, 200 + 80, to get 125,2016-01-01,280. How can I do this in a Talend job?
Step by step, we can start by removing this:
123,2015-03-01,200
123,2015-03-01,-200
We can do this by grouping by NO and DATE and summing MARK with the Talend component tAggregateRow. For this pair we then get:
123,2015-03-01,0
Now we can use the component tFilterRow to remove all rows having MARK == 0, and the component tUniqRow to remove duplicated rows.
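As a cross-check outside Talend, here is a minimal plain-Java sketch of the cancel-then-deduplicate logic the question asks for. It is not a Talend component configuration. The class and method names (MarkCleanup, cancelPairs, dedupe) are hypothetical, and it assumes every row is a well-formed "NO,DATE,MARK" string. It cancels each negative row against exactly one matching positive row instead of summing the whole group, because summing every row for NO 123 in the sample gives 400, not 0. Something similar could probably be adapted for a tJavaFlex component, but that is untested here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class MarkCleanup {

    // Removes each negative row together with one matching positive row
    // (same NO and DATE, opposite MARK). Rows are "NO,DATE,MARK" strings.
    // Assumption: a negative row with no positive partner is left in place.
    public static List<String> cancelPairs(List<String> rows) {
        List<String> kept = new ArrayList<>(rows);
        for (String row : rows) {
            String[] f = row.split(",");
            int mark = Integer.parseInt(f[2].trim());
            if (mark < 0) {
                String opposite = f[0] + "," + f[1] + "," + (-mark);
                // Cancel only if a matching positive row is still available.
                if (kept.remove(opposite)) {
                    kept.remove(row);
                }
            }
        }
        return kept;
    }

    // Drops exact duplicate rows while keeping first-seen order.
    public static List<String> dedupe(List<String> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "123,2015-03-01,200",
            "123,2015-03-01,-200",
            "123,2015-03-01,200",
            "123,2015-03-01,200",
            "125,2016-01-01,80");
        for (String row : dedupe(cancelPairs(rows))) {
            System.out.println(row);
        }
    }
}
```

On the sample data this prints 123,2015-03-01,200 and 125,2016-01-01,80, which matches the intermediate result shown in the question.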
The last step is to get the sum of MARK using tAggregateRow and store it in a context variable, then get the row with the greatest NO and the latest DATE by sorting with the component tSortRow and keeping only that row with tSampleRow. We can then assign the sum of MARK to that row.
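The final step can be sketched in plain Java as well, again only as an illustration of the logic and not of the Talend components themselves. The names MarkSummary and summarize are hypothetical. The sketch assumes the rows have already been cleaned, that each one is a "NO,DATE,MARK" string, and that DATE is in ISO yyyy-MM-dd form so that text order matches date order.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MarkSummary {

    // Returns "NO,DATE,total": the row with the greatest NO (ties broken by
    // the latest DATE), with MARK replaced by the sum of all MARK values.
    public static String summarize(List<String> rows) {
        Comparator<String[]> byNoThenDate = Comparator
            .comparingInt((String[] f) -> Integer.parseInt(f[0].trim()))
            .thenComparing(f -> f[1].trim()); // ISO dates sort correctly as text
        int total = 0;
        String[] best = null;
        for (String row : rows) {
            String[] f = row.split(",");
            total += Integer.parseInt(f[2].trim());
            if (best == null || byNoThenDate.compare(f, best) > 0) {
                best = f;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no rows to summarize");
        }
        return best[0].trim() + "," + best[1].trim() + "," + total;
    }

    public static void main(String[] args) {
        List<String> cleaned = Arrays.asList(
            "123,2015-03-01,200",
            "125,2016-01-01,80");
        System.out.println(summarize(cleaned)); // prints 125,2016-01-01,280
    }
}
```

The sum and the "latest row" are computed in a single pass here. In the Talend job they would be separate flows, with the sum carried over through the context variable.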