I have a table (TestFI) with, for instance, the following data:
FIID Email
---------
null a@a.com
1 a@a.com
null b@b.com
2 b@b.com
3 c@c.com
4 c@c.com
5 c@c.com
null d@d.com
null d@d.com
and I need the emails that appear exactly twice AND have one row where FIID is null and one row where it is not. For the data above, only a@a.com and b@b.com fit the bill.
I was able to construct a multilevel query like so:
Select FIID, Email
from TestFI
where Email in
(
    Select Email
    from
    (
        Select Email
        from TestFI
        where Email in
        (
            select Email
            from TestFI
            where FIID is null or FIID is not null
            group by Email
            having count(Email) = 2
        )
        and FIID is null
    ) as Temp1
    group by Email
    having count(Email) = 1
)
However, it took nearly 10 minutes to go through 10 million records. Is there a better way to do this? I know I must be doing some dumb things here.
Thanks
I would try this query:
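SELECT Email, MAX(FIID)
FROM TestFI
GROUP BY Email
HAVING COUNT(*)=2 AND COUNT(FIID)=1
COUNT(*) counts every row in the group, while COUNT(FIID) counts only the rows where FIID is not null, so the HAVING clause keeps emails with exactly two rows of which exactly one has a non-null FIID. The query returns the Email column and the non-null value of FIID. The other value of FIID is null.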
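As a quick check of the aggregate approach, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database and runs the query. SQLite, the helper name find_pairs and the added ORDER BY are assumptions for illustration only; the original poster's database is not stated.

```python
import sqlite3

# Sample rows from the question; None is stored as SQL NULL.
ROWS = [
    (None, "a@a.com"), (1, "a@a.com"),
    (None, "b@b.com"), (2, "b@b.com"),
    (3, "c@c.com"), (4, "c@c.com"), (5, "c@c.com"),
    (None, "d@d.com"), (None, "d@d.com"),
]


def find_pairs(rows):
    """Return (Email, non-null FIID) for emails with exactly one NULL and one non-NULL row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TestFI (FIID INTEGER, Email TEXT)")
    conn.executemany("INSERT INTO TestFI (FIID, Email) VALUES (?, ?)", rows)
    # COUNT(*) counts all rows per email; COUNT(FIID) ignores NULLs.
    result = conn.execute(
        """
        SELECT Email, MAX(FIID)
        FROM TestFI
        GROUP BY Email
        HAVING COUNT(*) = 2 AND COUNT(FIID) = 1
        ORDER BY Email
        """
    ).fetchall()
    conn.close()
    return result


print(find_pairs(ROWS))  # [('a@a.com', 1), ('b@b.com', 2)]
```

c@c.com is dropped because it has three rows, and d@d.com because COUNT(FIID) is 0 for it.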
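With an index on (email, fiid), I would be tempted to try:
select tnull.*, tnotnull.*
from testfi tnull join
     testfi tnotnull
     on tnull.email = tnotnull.email left outer join
     testfi tnothing
     on tnull.email = tnothing.email and
        tnothing.fiid is not null and
        tnothing.fiid <> tnotnull.fiid
where tnothing.email is null and
      tnull.fiid is null and
      tnotnull.fiid is not null;
The left outer join to tnothing looks for a second non-null row for the same email, and the where clause keeps only the emails that have none. Without a unique row identifier, the query cannot detect an email that has two NULL rows plus one non-null row. Performance definitely depends on the database. This will keep all the accesses within the index. In some databases, an aggregation might be faster. Performance also depends on the selectivity of the queries. For instance, if there is only one NULL record and you have the index (fiid, email), this should be much faster than an aggregation.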
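Whether a query really stays within the index is something to check in the query plan of the database actually in use. A minimal sketch using SQLite, purely as an assumption since the thread does not name an engine: it creates an index on (Email, FIID) under the hypothetical name idx_testfi_email_fiid and prints the plan for the aggregate query from the earlier answer. The plan text differs between engines and versions, so no particular output is claimed here.

```python
import sqlite3

ROWS = [
    (None, "a@a.com"), (1, "a@a.com"),
    (None, "b@b.com"), (2, "b@b.com"),
    (3, "c@c.com"), (4, "c@c.com"), (5, "c@c.com"),
    (None, "d@d.com"), (None, "d@d.com"),
]

QUERY = """
SELECT Email, MAX(FIID)
FROM TestFI
GROUP BY Email
HAVING COUNT(*) = 2 AND COUNT(FIID) = 1
ORDER BY Email
"""


def plan_and_rows(rows):
    """Return (query plan steps, query result) with the two-column index in place."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TestFI (FIID INTEGER, Email TEXT)")
    conn.executemany("INSERT INTO TestFI (FIID, Email) VALUES (?, ?)", rows)
    # Hypothetical index name; column order follows the answer's (email, fiid).
    conn.execute("CREATE INDEX idx_testfi_email_fiid ON TestFI (Email, FIID)")
    # The last column of each EXPLAIN QUERY PLAN row is the readable step text.
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return plan, result


plan, result = plan_and_rows(ROWS)
for step in plan:
    print(step)
print(result)
```

On other engines the equivalent is that engine's own EXPLAIN facility; comparing plans and timings on the real 10-million-row table is the only reliable way to pick between the join and the aggregation.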
Maybe something like ...
select
a.FIID,
a.Email
from
TestFI a
inner join TestFI b on (a.Email=b.Email)
where
a.FIID is not null
and b.FIID is null
;
And make sure Email and FIID are indexed.
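Note that this self-join only checks that an email has at least one null row and at least one non-null row; it does not check that the email appears exactly twice. A small sketch of the difference, again assuming SQLite for illustration: the e@e.com rows are made up and are not part of the question's data, and the extra IN filter is one possible way to add the "exactly twice" condition, not necessarily the fastest.

```python
import sqlite3

ROWS = [
    (None, "a@a.com"), (1, "a@a.com"),
    (None, "b@b.com"), (2, "b@b.com"),
    (3, "c@c.com"), (4, "c@c.com"), (5, "c@c.com"),
    (None, "d@d.com"), (None, "d@d.com"),
    # Hypothetical extra email: one NULL row plus two non-null rows.
    (None, "e@e.com"), (6, "e@e.com"), (7, "e@e.com"),
]

# The self-join from the answer, reduced to distinct emails.
SELF_JOIN = """
SELECT DISTINCT a.Email
FROM TestFI a
INNER JOIN TestFI b ON a.Email = b.Email
WHERE a.FIID IS NOT NULL AND b.FIID IS NULL
"""

# Same join, restricted to emails that occur exactly twice.
EXACTLY_TWO = SELF_JOIN + """
  AND a.Email IN (SELECT Email FROM TestFI GROUP BY Email HAVING COUNT(*) = 2)
"""


def emails(sql, rows):
    """Run sql against a fresh in-memory TestFI table and return sorted emails."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TestFI (FIID INTEGER, Email TEXT)")
    conn.executemany("INSERT INTO TestFI (FIID, Email) VALUES (?, ?)", rows)
    found = sorted(row[0] for row in conn.execute(sql))
    conn.close()
    return found


print(emails(SELF_JOIN, ROWS))    # ['a@a.com', 'b@b.com', 'e@e.com']
print(emails(EXACTLY_TWO, ROWS))  # ['a@a.com', 'b@b.com']
```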
I need records that appear exactly twice AND have 1 row with FIID is null and one is not
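In the innermost select, group by email having count = 2:
select email from TestFI
group by email having count(email) = 2
Then replace null FIIDs with a sentinel value (-1 here, which assumes real FIID values are never negative) and check that each of those emails has both the sentinel and a real value:
select email, max(AdjustedFIID) as FIID
from
(
select email, coalesce(fiid,-1) as AdjustedFIID from TestFI
where email in (select email from TestFI group by email having count(email) = 2)
) as X
group by email
having min(AdjustedFIID) = -1 and max(AdjustedFIID) > -1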
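The sentinel idea can be sketched as follows, written so that every selected column is either grouped or aggregated. It assumes SQLite for illustration and assumes that real FIID values are never negative, so -1 can safely stand in for NULL; the helper name find_pairs_sentinel is hypothetical, and the "exactly twice" check is folded into the HAVING clause as COUNT(*) = 2.

```python
import sqlite3

ROWS = [
    (None, "a@a.com"), (1, "a@a.com"),
    (None, "b@b.com"), (2, "b@b.com"),
    (3, "c@c.com"), (4, "c@c.com"), (5, "c@c.com"),
    (None, "d@d.com"), (None, "d@d.com"),
]


def find_pairs_sentinel(rows):
    """Return (Email, FIID) pairs using COALESCE(FIID, -1) as a stand-in for NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TestFI (FIID INTEGER, Email TEXT)")
    conn.executemany("INSERT INTO TestFI (FIID, Email) VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT Email, MAX(AdjustedFIID)
        FROM (
            -- Assumption: -1 never occurs as a real FIID.
            SELECT Email, COALESCE(FIID, -1) AS AdjustedFIID FROM TestFI
        ) AS X
        GROUP BY Email
        HAVING COUNT(*) = 2
           AND MIN(AdjustedFIID) = -1
           AND MAX(AdjustedFIID) > -1
        ORDER BY Email
        """
    ).fetchall()
    conn.close()
    return result


print(find_pairs_sentinel(ROWS))  # [('a@a.com', 1), ('b@b.com', 2)]
```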