
Left join with right table top 1 values

I am joining an account master table with approximately 4 million rows to a transaction table. When I left join on the account number from the transaction table = the account number from the account master table, I uncover an anomaly in our data: the account master can have 3 different entries for the same account number. These relate to characteristics of the account. While the address information may be the same, in some cases the spelling of the city differs. When I join the two tables I only want the first instance of the account number in the account master. I have seen some posts on using row_number() but I am lost on how to use it properly here. This is what I am using, but I get three records for each of the account numbers.

     select am.[Customer_Name], am.[svc_city], sr.measure
from [dbo].[PP_SUMMARY_RESIDENTIAL] sr
left join [CIS].[dbo].[Account_Master] am on
(case when (left(sr.fred_account_number,2) = '00') then (right(sr.fred_account_number,len(sr.fred_account_number - 2)))
     when (left(sr.fred_account_number,1) = '0') then (right(sr.fred_account_number,len(sr.fred_account_number - 1)))
     else sr.fred_account_number
     end)
 = (select am.accountnumber, row_number() over (order by am.accountnumber) as row) where row = 1
 and sr.fred_account_number = '123456789' 

First of all, if there are several records for the same account, then the DB schema and/or the applications that use it are in need of refurbishment.

Anyway, to select only one of several "analogous" records you can do something along these lines (simplified from your query):

with
acc_with_ord as ( 
    select
        col1, col2,..., 
        row_number() over (partition by <uniquely identifying columns> order by <some columns>) as ord
    from
        AccountMaster
),
unq_acc as (
    select * from acc_with_ord where ord = 1
)
select <something>
from
    pp_summary_residential
    left join unq_acc on
        <join conditions>

The first part assigns surrogate order numbers to the records describing the same account (since we partition by the fields that uniquely identify the account), the second selects only one record per account, and the third is the final select that uses the unique account records in the join.
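To make the three steps concrete, here is a minimal, self-contained sketch that runs the same pattern against an in-memory SQLite database from Python. SQLite merely stands in for SQL Server here, and it needs version 3.25 or later for window functions. The sample rows, the lower-case table names, and the choice of svc_city as the ordering column are assumptions made for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account_master (accountnumber text, customer_name text, svc_city text);
    create table pp_summary_residential (fred_account_number text, measure text);

    -- Three master rows for one account, with inconsistent city spelling (made-up data).
    insert into account_master values
        ('123456789', 'J. Doe', 'Springfield'),
        ('123456789', 'J. Doe', 'Springfeild'),
        ('123456789', 'J. Doe', 'SPRINGFIELD'),
        ('987654321', 'A. Roe', 'Shelbyville');
    insert into pp_summary_residential values
        ('123456789', 'M1'),
        ('987654321', 'M2'),
        ('555555555', 'M3');  -- no matching master row
""")

DEDUP_JOIN = """
with acc_with_ord as (
    -- Number the rows within each account; the order by decides which row gets 1.
    select accountnumber, customer_name, svc_city,
           row_number() over (partition by accountnumber order by svc_city) as ord
    from account_master
),
unq_acc as (
    -- Keep exactly one row per account.
    select * from acc_with_ord where ord = 1
)
select sr.fred_account_number, am.customer_name, am.svc_city, sr.measure
from pp_summary_residential sr
left join unq_acc am on am.accountnumber = sr.fred_account_number
order by sr.fred_account_number
"""

rows = conn.execute(DEDUP_JOIN).fetchall()
for row in rows:
    print(row)
```

Each transaction row comes back exactly once, and the transaction without a master record is kept with NULLs, as expected from a left join. Which spelling of the city survives depends entirely on the order by inside row_number().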

I would suggest using outer apply:

select am.[Customer_Name], am.[svc_city], sr.measure
from [dbo].[PP_SUMMARY_RESIDENTIAL] sr outer apply
     (select top 1 am.*
      from [CIS].[dbo].[Account_Master] am 
      -- strip up to two leading zeros, then match on the account number
      where (case when (left(sr.fred_account_number, 2) = '00') then (right(sr.fred_account_number, len(sr.fred_account_number) - 2))
                  when (left(sr.fred_account_number, 1) = '0') then (right(sr.fred_account_number, len(sr.fred_account_number) - 1))
                  else sr.fred_account_number
             end) = am.accountnumber
      order by am.accountnumber
     ) am;

This will select one row from am; which one depends on the order by.
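One thing to watch: every candidate row for a given account shares the same account number, so ordering by the account number alone leaves them tied and the row you get is arbitrary. The sketch below illustrates how the order by decides the pick, using SQLite from Python as a stand-in for SQL Server. SQLite has no outer apply, so a correlated subquery with order by ... limit 1 plays the role of top 1 here. The updated_at column is hypothetical; substitute whatever column in your Account_Master distinguishes the "first" row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account_master (
        accountnumber text,
        svc_city text,
        updated_at text  -- hypothetical column, used only as a tie-breaker
    );
    create table pp_summary_residential (fred_account_number text, measure text);

    -- Two master rows for the same account (made-up data).
    insert into account_master values
        ('123456789', 'Springfeild', '2019-01-01'),
        ('123456789', 'Springfield', '2021-06-30');
    insert into pp_summary_residential values ('123456789', 'M1');
""")

# Correlated "top 1" lookup; SQLite spells it ORDER BY ... LIMIT 1.
PICK_ONE = """
select sr.fred_account_number,
       (select am.svc_city
        from account_master am
        where am.accountnumber = sr.fred_account_number
        order by am.updated_at {direction}
        limit 1) as svc_city
from pp_summary_residential sr
"""

oldest = conn.execute(PICK_ONE.format(direction="asc")).fetchall()
newest = conn.execute(PICK_ONE.format(direction="desc")).fetchall()
print(oldest)
print(newest)
```

Flipping the sort direction flips which spelling is returned, so choose the ordering column deliberately rather than relying on whatever the engine happens to return first.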
