Rank/Rownumber function in SSIS Dataflow

In my data flow, after some lookups I get duplicate customer records (they are not exact duplicates; only the customer ID is the same). Based on some attributes of the customer, such as city and location, I need to choose one record among them.

How can I achieve this in an SSIS data flow?

Here is the sample data:

;WITH cust (CustomerID, Cutomer_Name, Score) AS
(
    SELECT 1 AS CustomerID, 'abd' AS Cutomer_Name, 100 AS Score
    UNION
    SELECT 1, 'abd', NULL
    UNION
    SELECT 1, 'abd', 20
)
SELECT * FROM cust

From here I need to choose the record with the lowest score and send only that row to the final table.

It's easy to achieve with the ROW_NUMBER() function in SQL, but this case occurs in the middle of an SSIS data flow.

Set the source's data access mode to "SQL command" so the ranking can be done in the source query.
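A minimal sketch of such a source query, reusing the sample data from the question. ROW_NUMBER() numbers the rows within each CustomerID by ascending Score, and only the first row is kept. Sorting NULL scores last is an assumption about what "lowest score" should mean here (SQL Server sorts NULLs first in ascending order by default), and the names ranked and rn are purely illustrative.

```sql
;WITH cust (CustomerID, Cutomer_Name, Score) AS
(
    SELECT 1, 'abd', 100
    UNION
    SELECT 1, 'abd', NULL
    UNION
    SELECT 1, 'abd', 20
),
ranked AS
(
    SELECT CustomerID, Cutomer_Name, Score,
           ROW_NUMBER() OVER (
               PARTITION BY CustomerID
               -- assumption: rows with a NULL score should lose to any real score
               ORDER BY CASE WHEN Score IS NULL THEN 1 ELSE 0 END, Score
           ) AS rn
    FROM cust
)
SELECT CustomerID, Cutomer_Name, Score
FROM ranked
WHERE rn = 1;  -- keep one row per CustomerID
```

With the sample data this keeps the row with Score 20. If a NULL score should count as the lowest instead, drop the CASE expression from the ORDER BY.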


Use a Multicast to split the flow into two outputs, say Output1 and Output2. Connect Output1 to an Aggregate transformation that groups by CustomerID and takes the Minimum of Score. Then join the Aggregate output back to Output2 with a Merge Join, mapping Output2.CustomerID = Aggregate.CustomerID and Output2.Score = Aggregate.Score (the minimum). Note that Merge Join requires both inputs to be sorted on the join keys. This does the trick, but if a CustomerID has several rows with the same minimum score, you may need a Sort after this step, with "Remove rows with duplicate sort values" enabled, to drop the duplicates. Hope this helps.
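To make the join keys concrete, here is roughly what that pipeline computes, written as a set-based query over the question's sample data. This is only an illustration of the logic, not something the data flow itself runs: min_score plays the role of the Aggregate output, the INNER JOIN plays the role of the Merge Join, and DISTINCT is a loose stand-in for the final duplicate-removing Sort. The names min_score and MinScore are made up for the example.

```sql
;WITH cust (CustomerID, Cutomer_Name, Score) AS
(
    SELECT 1, 'abd', 100
    UNION
    SELECT 1, 'abd', NULL
    UNION
    SELECT 1, 'abd', 20
),
min_score AS
(
    -- Aggregate: group by CustomerID, minimum of Score (MIN ignores NULLs)
    SELECT CustomerID, MIN(Score) AS MinScore
    FROM cust
    GROUP BY CustomerID
)
-- Merge Join: match on both the customer and its minimum score
SELECT DISTINCT c.CustomerID, c.Cutomer_Name, c.Score
FROM cust AS c
INNER JOIN min_score AS m
    ON  c.CustomerID = m.CustomerID
    AND c.Score      = m.MinScore;
```

Joining on CustomerID alone would return every row for the customer again, which is why the score has to be part of the join condition. The row with the NULL score drops out because NULL never satisfies the equality.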

This is the solution that helped me solve my issue:

http://paultebraak.wordpress.com/2013/02/25/rank-partitioning-in-etl-using-ssis/
