
Inner joining multiple tables but want distinct data based on one column

The details are generic because of my job, but here's a rundown.

We currently store customers, customer addresses, and customer emails in separate tables. I'm trying to run a report that inner joins those tables, but I only want distinct results based on the customers table. The issue I'm running into is that it still returns multiple rows per customer, because when someone updates an email, a new record is inserted instead of the old one being changed.

I tried moving DISTINCT around and adding a GROUP BY clause, but neither returns the right results. There is a "last modified" column, so maybe I can restrict the results to the most recently modified row?

For example, Charles Smith has 3 rows, John Smith has 4 rows, and so on. Can I write the statement so that it only returns the last modified row for each of them?

SELECT
c.customer_id,
c.first_name,
c.last_name,
ce.email_address,
ca.addr_street,
ca.addr_city,
ca.addr_zip
FROM Customers c
INNER JOIN Cust_Address ca ON c.cust_id = ca.addr_cust_id
INNER JOIN Cust_Email ce ON c.cust_id = ce.email_cust_id

I only want it to return one record for each customer, no matter how many addresses or emails they have in the system.
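The behaviour described in the question can be reproduced with a small sketch. It uses Python's built-in sqlite3 module only as a convenient way to run the SQL; the sample rows, the last_modified column name, and the use of cust_id as the selected key are assumptions made for illustration. Because each email row carries a different email_address, DISTINCT has nothing to collapse:

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE Cust_Address (addr_cust_id INTEGER, addr_street TEXT, addr_city TEXT,
                           addr_zip TEXT, last_modified TEXT);
CREATE TABLE Cust_Email (email_cust_id INTEGER, email_address TEXT, last_modified TEXT);

INSERT INTO Customers VALUES (1, 'Charles', 'Smith');
INSERT INTO Cust_Address VALUES (1, '1 Main St', 'Springfield', '11111', '2020-01-01');
-- Each email update inserted a new row instead of changing the old one.
INSERT INTO Cust_Email VALUES
  (1, 'old@example.com',    '2020-01-01'),
  (1, 'newer@example.com',  '2020-06-01'),
  (1, 'newest@example.com', '2021-01-01');
""")

rows = conn.execute("""
SELECT DISTINCT c.cust_id, c.first_name, c.last_name,
       ce.email_address, ca.addr_street
FROM Customers c
INNER JOIN Cust_Address ca ON c.cust_id = ca.addr_cust_id
INNER JOIN Cust_Email ce ON c.cust_id = ce.email_cust_id
""").fetchall()

# One customer, but one output row per stored email.
print(len(rows))
```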

Use row_number() and subqueries:

SELECT c.customer_id, c.first_name, c.last_name,
       ce.email_address, ca.addr_street, ca.addr_city, ca.addr_zip
FROM Customers C INNER JOIN
     (SELECT ca.*,
             ROW_NUMBER() OVER (PARTITION BY addr_cust_id ORDER BY lastmodified DESC) as seqnum
      FROM Cust_Address ca
     ) ca
     ON c.cust_id = ca.addr_cust_id INNER JOIN
     (SELECT ce.*,
             ROW_NUMBER() OVER (PARTITION BY email_cust_id ORDER BY lastmodified DESC) as seqnum
      FROM Cust_Email ce
     ) ce
     ON c.cust_id = ce.email_cust_id
WHERE ca.seqnum = 1 AND ce.seqnum = 1;
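As a rough check of this approach, the query can be run against sample data. The sketch below uses Python's sqlite3 module (window functions need SQLite 3.25 or later); the rows, the last_modified column name, and selecting cust_id as the key are assumptions, not details from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE Cust_Address (addr_cust_id INTEGER, addr_street TEXT, addr_city TEXT,
                           addr_zip TEXT, last_modified TEXT);
CREATE TABLE Cust_Email (email_cust_id INTEGER, email_address TEXT, last_modified TEXT);

INSERT INTO Customers VALUES (1, 'Charles', 'Smith'), (2, 'John', 'Smith');
INSERT INTO Cust_Address VALUES
  (1, '1 Main St', 'Springfield', '11111', '2020-01-01'),
  (1, '9 Oak Ave', 'Shelbyville', '22222', '2021-03-01'),
  (2, '5 Elm St',  'Springfield', '11111', '2020-05-01');
INSERT INTO Cust_Email VALUES
  (1, 'charles.old@example.com', '2020-01-01'),
  (1, 'charles.new@example.com', '2021-02-01'),
  (2, 'john@example.com',        '2019-07-01'),
  (2, 'john.smith@example.com',  '2022-01-01');
""")

# seqnum = 1 marks the most recently modified row per customer in each table.
rows = conn.execute("""
SELECT c.cust_id, c.first_name, c.last_name,
       ce.email_address, ca.addr_street, ca.addr_city, ca.addr_zip
FROM Customers c INNER JOIN
     (SELECT ca.*,
             ROW_NUMBER() OVER (PARTITION BY addr_cust_id ORDER BY last_modified DESC) AS seqnum
      FROM Cust_Address ca
     ) ca
     ON c.cust_id = ca.addr_cust_id INNER JOIN
     (SELECT ce.*,
             ROW_NUMBER() OVER (PARTITION BY email_cust_id ORDER BY last_modified DESC) AS seqnum
      FROM Cust_Email ce
     ) ce
     ON c.cust_id = ce.email_cust_id
WHERE ca.seqnum = 1 AND ce.seqnum = 1
ORDER BY c.cust_id
""").fetchall()

for row in rows:
    print(row)
```

Each customer comes back once, paired with the latest address and the latest email. Note that a customer with no address or no email row at all would be dropped by the inner joins.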

If, as stated in the question, the duplicates come from cust_email, and since you are showing a single column from that table in the result set, one solution is to remove that table from the join and use an inline (correlated) subquery in the select clause, as follows:

select
    c.customer_id,
    c.first_name,
    c.last_name,
    (
        select email_address
        from cust_email ce
        where c.cust_id = ce.email_cust_id
        order by ce.last_modified desc
        limit 1
    ) email_address,
    ca.addr_street,
    ca.addr_city,
    ca.addr_zip
from 
    customers c
    inner join cust_address ca on c.cust_id=ca.addr_cust_id

This solution uses a limit clause (supported notably in MySQL and Postgres); the syntax may vary according to your RDBMS (SQL Server and Oracle, for instance, use different syntax).
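A minimal runnable sketch of the correlated-subquery approach, using Python's sqlite3 module (SQLite also accepts limit). The sample rows and the choice of cust_id as the selected key are assumptions for illustration. Only the email is deduplicated here, so the example gives each customer a single address:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE Cust_Address (addr_cust_id INTEGER, addr_street TEXT);
CREATE TABLE Cust_Email (email_cust_id INTEGER, email_address TEXT, last_modified TEXT);

INSERT INTO Customers VALUES (1, 'Charles', 'Smith'), (2, 'John', 'Smith');
INSERT INTO Cust_Address VALUES (1, '1 Main St'), (2, '5 Elm St');
INSERT INTO Cust_Email VALUES
  (1, 'charles.old@example.com', '2020-01-01'),
  (1, 'charles.new@example.com', '2021-02-01'),
  (2, 'john@example.com',        '2019-07-01');
""")

# The subquery runs once per customer row and picks the newest email.
rows = conn.execute("""
SELECT c.cust_id, c.first_name, c.last_name,
       (SELECT ce.email_address
        FROM Cust_Email ce
        WHERE ce.email_cust_id = c.cust_id
        ORDER BY ce.last_modified DESC
        LIMIT 1) AS email_address,
       ca.addr_street
FROM Customers c
INNER JOIN Cust_Address ca ON c.cust_id = ca.addr_cust_id
ORDER BY c.cust_id
""").fetchall()

for row in rows:
    print(row)
```

On SQL Server the same subquery is usually written with TOP (1), and on Oracle 12c or later with FETCH FIRST 1 ROW ONLY. If a customer can also have several address rows, those would still multiply the output, and the address table would need the same treatment.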

With an index on cust_email(email_cust_id), this should be an efficient solution that avoids the need for window functions.
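One way to check whether such an index is picked up is to look at the query plan. The sketch below does this in SQLite through Python's sqlite3 module; the index name is hypothetical, and adding last_modified as a second index column is an assumption that goes slightly beyond the index described above. Other databases have their own EXPLAIN output and may choose differently:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Cust_Email (email_cust_id INTEGER, email_address TEXT, last_modified TEXT);
-- Hypothetical index: lookup column first, then the sort column.
CREATE INDEX idx_cust_email_lookup ON Cust_Email (email_cust_id, last_modified);
""")

# Ask SQLite how it would run the per-customer "latest email" lookup.
plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT email_address
FROM Cust_Email ce
WHERE ce.email_cust_id = 1
ORDER BY ce.last_modified DESC
LIMIT 1
""").fetchall()

# The last column of each plan row is the human-readable detail text.
details = [row[-1] for row in plan]
uses_index = any("idx_cust_email_lookup" in d for d in details)
print(details)
print(uses_index)
```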
