Inner Joining multiple tables but want distinct data based off one column

The details are generic because this is for my job, but here's a rundown.

We currently store customers, customer addresses, and customer emails in separate tables. I'm trying to run a report that inner joins those tables, but I only want distinct results based on the customers table. The issue is that it still returns multiple rows per customer, because when someone updates an email, a new record is inserted instead of the old one being changed.

I tried moving DISTINCT around and adding a GROUP BY clause, but neither returns the right results. There is a "last modified" column, so maybe I can restrict the results to the most recently modified record?

For example, Charles Smith has 3 rows, John Smith has 4 rows, and so on. Can I filter on the modified column so that only the last modified row is returned for each of them?

select
c.customer_id,
c.first_name,
c.last_name,
ce.email_address,
ca.addr_street,
ca.addr_city,
ca.addr_zip
FROM Customers c
INNER JOIN Cust_Address ca ON c.cust_id = ca.addr_cust_id
INNER JOIN Cust_Email ce ON c.cust_id = ce.email_cust_id

I only want it to return one record for each customer, no matter how many addresses or emails they have in the system.
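To see why DISTINCT does not help here, the following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The sample rows, the reduced set of columns, and the last_modified column name are assumptions made for illustration; only the table and join column names come from the question.

```python
import sqlite3

# In-memory database with a cut-down, assumed version of the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE Cust_Address (addr_cust_id INTEGER, addr_street TEXT, last_modified TEXT);
CREATE TABLE Cust_Email (email_cust_id INTEGER, email_address TEXT, last_modified TEXT);

INSERT INTO Customers VALUES (1, 'Charles', 'Smith');
INSERT INTO Cust_Address VALUES
    (1, '1 Old Rd', '2020-01-01'),
    (1, '2 New Rd', '2021-06-01');
INSERT INTO Cust_Email VALUES
    (1, 'c1@example.com', '2019-01-01'),
    (1, 'c2@example.com', '2020-01-01'),
    (1, 'c3@example.com', '2021-01-01');
""")

join_sql = """
SELECT {distinct} c.cust_id, ce.email_address, ca.addr_street
FROM Customers c
INNER JOIN Cust_Address ca ON c.cust_id = ca.addr_cust_id
INNER JOIN Cust_Email ce ON c.cust_id = ce.email_cust_id
"""

# Every address row pairs with every email row for the same customer.
plain_rows = conn.execute(join_sql.format(distinct="")).fetchall()
# DISTINCT compares whole rows, and each address/email combination is unique.
distinct_rows = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()

print(len(plain_rows), len(distinct_rows))
```

With 2 addresses and 3 emails, one customer produces 2 x 3 = 6 rows, and DISTINCT removes none of them because every combination differs in at least one column.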

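Use ROW_NUMBER() in subqueries to number each customer's rows from most to least recently modified, then keep only the first row of each: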

SELECT c.customer_id, c.first_name, c.last_name,
       ce.email_address, ca.addr_street, ca.addr_city, ca.addr_zip
FROM Customers C INNER JOIN
     (SELECT ca.*,
             ROW_NUMBER() OVER (PARTITION BY addr_cust_id ORDER BY lastmodified DESC) as seqnum
      FROM Cust_Address ca
     ) ca
     ON c.cust_id = ca.addr_cust_id INNER JOIN
     (SELECT ce.*,
             ROW_NUMBER() OVER (PARTITION BY email_cust_id ORDER BY lastmodified DESC) as seqnum
      FROM Cust_Email ce
     ) ce
     ON c.cust_id = ce.email_cust_id
WHERE ca.seqnum = 1 AND ce.seqnum = 1;
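As a rough check of this approach, here is a small sketch that runs the same pattern against an in-memory SQLite database from Python. The sample data, the reduced column list, and the last_modified column name are assumptions, and window functions require SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE Cust_Address (addr_cust_id INTEGER, addr_street TEXT, last_modified TEXT);
CREATE TABLE Cust_Email (email_cust_id INTEGER, email_address TEXT, last_modified TEXT);

INSERT INTO Customers VALUES (1, 'Charles', 'Smith'), (2, 'John', 'Smith');
INSERT INTO Cust_Address VALUES
    (1, '1 Old Rd', '2020-01-01'),
    (1, '2 New Rd', '2021-06-01'),
    (2, '9 Elm St', '2020-05-05');
INSERT INTO Cust_Email VALUES
    (1, 'c1@example.com', '2019-01-01'),
    (1, 'c3@example.com', '2021-01-01'),
    (2, 'j1@example.com', '2018-01-01'),
    (2, 'j2@example.com', '2022-01-01');
""")

# seqnum = 1 marks the most recently modified row per customer in each table.
latest = conn.execute("""
SELECT c.cust_id, ce.email_address, ca.addr_street
FROM Customers c
INNER JOIN (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY addr_cust_id ORDER BY last_modified DESC) AS seqnum
    FROM Cust_Address a
) ca ON c.cust_id = ca.addr_cust_id
INNER JOIN (
    SELECT e.*,
           ROW_NUMBER() OVER (PARTITION BY email_cust_id ORDER BY last_modified DESC) AS seqnum
    FROM Cust_Email e
) ce ON c.cust_id = ce.email_cust_id
WHERE ca.seqnum = 1 AND ce.seqnum = 1
ORDER BY c.cust_id
""").fetchall()

for row in latest:
    print(row)
```

Each customer appears once, paired with their newest email and newest address. The dates are stored as ISO-8601 text here so that string ordering matches chronological ordering; a real date or timestamp column would sort correctly on its own.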

If, as stated in the question, the duplicates come from cust_email, and since you are showing a single column from that table in the result set, one solution is to remove it from the join and use a correlated subquery in the select clause, as follows:

select
    c.customer_id,
    c.first_name,
    c.last_name,
    (
        select email_address 
        from cust_email ce 
        where c.cust_id = ce.email_cust_id 
        order by ce.last_modified desc 
        limit 1
    ) email_address,
    ca.addr_street,
    ca.addr_city,
    ca.addr_zip
from 
    customers c
    inner join cust_address ca on c.cust_id = ca.addr_cust_id

This solution uses a LIMIT clause (supported notably in MySQL and Postgres); the syntax varies by RDBMS. SQL Server uses SELECT TOP 1 instead, and Oracle 12c and later use FETCH FIRST 1 ROW ONLY.

With an index on cust_email(email_cust_id), this should be an efficient solution that avoids the need for window functions.
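The following sketch tries the correlated-subquery version, together with an index, on an in-memory SQLite database from Python. The sample data, the index name idx_email_cust, and the choice to include last_modified in the index are assumptions made for illustration. Each customer has a single address here, matching the premise that the duplicates come only from cust_email.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE cust_address (addr_cust_id INTEGER, addr_street TEXT);
CREATE TABLE cust_email (email_cust_id INTEGER, email_address TEXT, last_modified TEXT);

-- Hypothetical index: lets the subquery find one customer's emails quickly.
CREATE INDEX idx_email_cust ON cust_email (email_cust_id, last_modified);

INSERT INTO customers VALUES (1, 'Charles', 'Smith'), (2, 'John', 'Smith');
INSERT INTO cust_address VALUES (1, '2 New Rd'), (2, '9 Elm St');
INSERT INTO cust_email VALUES
    (1, 'c1@example.com', '2019-01-01'),
    (1, 'c2@example.com', '2020-01-01'),
    (1, 'c3@example.com', '2021-01-01'),
    (2, 'j1@example.com', '2018-01-01'),
    (2, 'j2@example.com', '2022-01-01');
""")

# The subquery runs once per customer row and returns only the newest email.
rows = conn.execute("""
SELECT c.cust_id,
       c.first_name,
       (SELECT ce.email_address
        FROM cust_email ce
        WHERE ce.email_cust_id = c.cust_id
        ORDER BY ce.last_modified DESC
        LIMIT 1) AS email_address,
       ca.addr_street
FROM customers c
INNER JOIN cust_address ca ON c.cust_id = ca.addr_cust_id
ORDER BY c.cust_id
""").fetchall()

for row in rows:
    print(row)
```

Note that if a customer also had several address rows, this query would still return one row per address, so the address table would need the same treatment.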
