
Rank/Rownumber function in SSIS Dataflow

In my data flow, after some lookups I get duplicate customer records, depending on attributes of the customer such as city and location. They are not exact duplicates; only the customer ID is the same. I need to choose one record among them.

How can I achieve this in an SSIS data flow?

Here is the sample data:

;WITH cust (CustomerID, Cutomer_Name, Score) AS
(
    SELECT 1 AS CustomerID, 'abd' AS Cutomer_Name, 100 AS Score
    UNION
    SELECT 1, 'abd', NULL
    UNION
    SELECT 1, 'abd', 20
)
SELECT * FROM cust;

From here I need to choose the record with the lowest score and send only that row to the final table.

It's easy to achieve with the ROW_NUMBER function in SQL, but this case occurs during the data flow in SSIS.
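For reference, the SQL approach mentioned above might look roughly like the sketch below, run against the sample data. It is only an illustration: the `ranked` CTE and the `rn` column are names made up here, and it assumes a NULL score should lose to any real score, which the question does not state.

```sql
-- Sample data from the question (column names kept as posted).
WITH cust (CustomerID, Cutomer_Name, Score) AS
(
    SELECT 1, 'abd', 100
    UNION
    SELECT 1, 'abd', NULL
    UNION
    SELECT 1, 'abd', 20
),
ranked AS
(
    SELECT CustomerID, Cutomer_Name, Score,
           ROW_NUMBER() OVER (
               PARTITION BY CustomerID
               -- Assumption: NULL scores sort last, so a real score wins.
               ORDER BY CASE WHEN Score IS NULL THEN 1 ELSE 0 END, Score
           ) AS rn
    FROM cust
)
-- Keep only the first-ranked row per customer.
SELECT CustomerID, Cutomer_Name, Score
FROM ranked
WHERE rn = 1;
```

Without the CASE expression, SQL Server sorts NULL first in ascending order, so the NULL row would be picked instead of the row with score 20.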

Set the source's data access mode to SQL command.


Use a Multicast to split the flow into two outputs, say Output1 and Output2. Connect Output1 to an Aggregate transformation, group by CustomerId, and take the Minimum of Score. Then join the output of the Aggregate transform back to Output2 with a Merge Join, mapping Output2.CustomerId = Aggregate Transform.CustomerId and Output2.Score = Aggregate Transform.Score. Note that Merge Join requires both inputs to be sorted on the join keys. This would do the trick, but if several rows for the same CustomerId share the minimum score, you might need a Sort after this step to remove duplicates. Hope this helps.
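To make the logic of that pipeline easier to follow, here is a rough SQL equivalent of the Multicast, Aggregate and Merge Join steps, run against the question's sample data. It is a sketch for checking the expected result, not a replacement for the data flow; `min_score` and `MinScore` are illustrative names, and the join on CustomerId plus Score is the mapping assumed to be what the answer intends.

```sql
-- Sample data from the question (column names kept as posted).
WITH cust (CustomerID, Cutomer_Name, Score) AS
(
    SELECT 1, 'abd', 100
    UNION
    SELECT 1, 'abd', NULL
    UNION
    SELECT 1, 'abd', 20
),
-- Aggregate step: minimum score per customer (MIN ignores NULLs).
min_score AS
(
    SELECT CustomerID, MIN(Score) AS MinScore
    FROM cust
    GROUP BY CustomerID
)
-- Merge Join step: keep only the rows that carry the minimum score.
-- DISTINCT stands in for the final duplicate-removing Sort, but it
-- only collapses rows that are identical in every selected column.
SELECT DISTINCT c.CustomerID, c.Cutomer_Name, c.Score
FROM cust AS c
INNER JOIN min_score AS m
    ON c.CustomerID = m.CustomerID
   AND c.Score = m.MinScore;
```

For the sample data this should leave the single row with score 20, since the NULL score is skipped by MIN and never matches in the join.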
