The details are generic because this is for work, but here's a rundown.
We currently store customers, customer addresses, and customer emails in separate tables. I'm trying to run a report that inner joins those tables, but I only want distinct results based on the customers table. The issue I'm running into is that it still returns multiple rows per customer, because when someone updates an email, a new record is inserted.
I tried moving DISTINCT around and adding a GROUP BY clause, but neither returns the right results. There is a "last modified" column, so maybe I can restrict the results to the most recently modified row?
For example, Charles Smith has 3 rows, John Smith has 4 rows, etc. Can I filter on the modified column so that it only returns the last-modified row for each of them?
SELECT
    c.customer_id,
    c.first_name,
    c.last_name,
    ce.email_address,
    ca.addr_street,
    ca.addr_city,
    ca.addr_zip
FROM Customers c
INNER JOIN Cust_Address ca ON c.cust_id = ca.addr_cust_id
INNER JOIN Cust_Email ce ON c.cust_id = ce.email_cust_id
I only want it to return one record for each customer, no matter how many addresses or emails they have in the system.
Use row_number() and subqueries:
SELECT c.customer_id, c.first_name, c.last_name,
       ce.email_address, ca.addr_street, ca.addr_city, ca.addr_zip
FROM Customers c INNER JOIN
     -- number each customer's addresses, newest first
     (SELECT ca.*,
             ROW_NUMBER() OVER (PARTITION BY addr_cust_id ORDER BY lastmodified DESC) as seqnum
      FROM Cust_Address ca
     ) ca
     ON c.cust_id = ca.addr_cust_id INNER JOIN
     -- number each customer's emails, newest first
     (SELECT ce.*,
             ROW_NUMBER() OVER (PARTITION BY email_cust_id ORDER BY lastmodified DESC) as seqnum
      FROM Cust_Email ce
     ) ce
     ON c.cust_id = ce.email_cust_id
WHERE ca.seqnum = 1 AND ce.seqnum = 1;
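To check that the row_number() approach really collapses each customer to one row, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The sample rows, the `lastmodified` values, and the use of `cust_id` as the customers key are all assumptions made for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical sample data. Table and column names follow the query above;
# cust_id as the Customers key and the row values are assumptions.
SCHEMA_AND_DATA = """
CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE Cust_Address (addr_cust_id INTEGER, addr_street TEXT, addr_city TEXT,
                           addr_zip TEXT, lastmodified TEXT);
CREATE TABLE Cust_Email (email_cust_id INTEGER, email_address TEXT, lastmodified TEXT);

INSERT INTO Customers VALUES (1, 'Charles', 'Smith'), (2, 'John', 'Smith');
INSERT INTO Cust_Address VALUES
    (1, '1 Old Rd', 'Austin', '73301', '2020-01-01'),
    (1, '2 New Rd', 'Austin', '73301', '2021-06-01'),
    (2, '9 Elm St', 'Dallas', '75001', '2019-03-01');
INSERT INTO Cust_Email VALUES
    (1, 'charles@old.example', '2020-01-01'),
    (1, 'charles@new.example', '2022-02-02'),
    (1, 'charles@mid.example', '2021-01-01'),
    (2, 'john@a.example', '2019-03-01'),
    (2, 'john@b.example', '2020-03-01');
"""

# Same structure as the answer: rank rows per customer, keep rank 1 only.
LATEST_ROWS_QUERY = """
SELECT c.cust_id, c.first_name, c.last_name,
       ce.email_address, ca.addr_street, ca.addr_city, ca.addr_zip
FROM Customers c
INNER JOIN (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY addr_cust_id ORDER BY lastmodified DESC) AS seqnum
    FROM Cust_Address a
) ca ON c.cust_id = ca.addr_cust_id
INNER JOIN (
    SELECT e.*,
           ROW_NUMBER() OVER (PARTITION BY email_cust_id ORDER BY lastmodified DESC) AS seqnum
    FROM Cust_Email e
) ce ON c.cust_id = ce.email_cust_id
WHERE ca.seqnum = 1 AND ce.seqnum = 1
ORDER BY c.cust_id
"""


def latest_rows_per_customer():
    """Build the sample database and return one row per customer."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA_AND_DATA)
        return conn.execute(LATEST_ROWS_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_rows_per_customer():
        print(row)
```

With this sample data, a plain inner join would return 6 rows for customer 1 (2 addresses times 3 emails); filtering on seqnum = 1 in both subqueries keeps only the newest address and newest email.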
If, as stated in the question, the duplicates come from cust_email, and since you are showing a single column from that table in the result set, one solution is to remove it from the join and use an inline (correlated) subquery in the select clause, as follows:
select
    c.customer_id,
    c.first_name,
    c.last_name,
    (
        -- most recently modified email for this customer
        select email_address
        from cust_email ce
        where c.cust_id = ce.email_cust_id
        order by ce.last_modified desc
        limit 1
    ) email_address,
    ca.addr_street,
    ca.addr_city,
    ca.addr_zip
from
    customers c
    inner join cust_address ca on c.cust_id = ca.addr_cust_id
This solution uses a limit clause (supported notably in MySQL and Postgres); the syntax varies according to your RDBMS. For example, SQL Server uses TOP (1), and Oracle 12c and later uses FETCH FIRST 1 ROW ONLY.
With an index on cust_email(email_cust_id), ideally one that also includes last_modified, this should be an efficient solution that avoids the need for window functions.
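As a rough way to try the inline-subquery approach, the sketch below runs it against an in-memory SQLite database from Python (SQLite accepts the same limit syntax). The sample rows, the index name `idx_cust_email_latest`, and the use of `cust_id` as the customers key are assumptions made for illustration only.

```python
import sqlite3


def latest_email_per_customer():
    """Return one row per customer, with only the newest email attached."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema and data; column names follow the query above.
        conn.executescript("""
            CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
            CREATE TABLE cust_address (addr_cust_id INTEGER, addr_street TEXT,
                                       addr_city TEXT, addr_zip TEXT);
            CREATE TABLE cust_email (email_cust_id INTEGER, email_address TEXT,
                                     last_modified TEXT);
            -- Composite index to support the per-customer newest-email lookup.
            CREATE INDEX idx_cust_email_latest ON cust_email (email_cust_id, last_modified);

            INSERT INTO customers VALUES (1, 'Charles', 'Smith'), (2, 'John', 'Smith');
            INSERT INTO cust_address VALUES
                (1, '2 New Rd', 'Austin', '73301'),
                (2, '9 Elm St', 'Dallas', '75001');
            INSERT INTO cust_email VALUES
                (1, 'charles@old.example', '2020-01-01'),
                (1, 'charles@new.example', '2022-02-02'),
                (2, 'john@a.example', '2019-03-01'),
                (2, 'john@b.example', '2020-03-01');
        """)
        # The email table is not joined; the correlated subquery picks
        # the most recently modified address for each customer row.
        return conn.execute("""
            SELECT c.cust_id, c.first_name, c.last_name,
                   (SELECT ce.email_address
                    FROM cust_email ce
                    WHERE ce.email_cust_id = c.cust_id
                    ORDER BY ce.last_modified DESC
                    LIMIT 1) AS email_address,
                   ca.addr_street, ca.addr_city, ca.addr_zip
            FROM customers c
            INNER JOIN cust_address ca ON c.cust_id = ca.addr_cust_id
            ORDER BY c.cust_id
        """).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_email_per_customer():
        print(row)
```

Note that this only removes duplicates caused by cust_email. If a customer can also have several rows in cust_address, those would still multiply the result, and the address columns would need similar treatment.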