So I have a table, my_table, with a primary key, id (INT), and further columns foo (VARCHAR) and bar (DOUBLE). Each foo should appear once in my table, with an associated bar value, but I know that I have several rows with identical foos associated with different bars. How do I get a list of those rows containing the same foo value, but which have different bars (say, differing by more than 10)? I tried:
SELECT t1.id, t1.bar, t2.id, t2.bar, t1.foo FROM my_table t1, my_table t2 WHERE t1.foo=t2.foo AND t1.bar - t2.bar > 10.;
But I get lots and lots of results (more than the total number of rows in my_table). I feel I must be doing something very obviously stupid, but can't see my mistake.
Ah - thanks SWeko: I think I understand why I'm getting so many results, then. Is there a way in SQL of counting, for each foo, the number of rows with that foo but bars differing by more than 10?
To answer your latest question:
Is there a way in SQL of counting, for each foo, the number of rows with that foo but bars differing by more than 10?
A query like this should work:
select t1.id, t1.foo, t1.bar, count(t2.id) as dupes
from my_table t1
left outer join my_table t2 on t1.foo=t2.foo and (t1.bar - t2.bar) > 10
group by t1.id, t1.foo, t1.bar;
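To see what this query returns, here is a minimal sketch that runs it against a small in-memory SQLite table from Python. The sample rows and the choice of SQLite are assumptions for illustration only; the original question does not say which database is in use. Note that `dupes` counts, for each row, the rows with the same foo whose bar is lower by more than 10, so the comparison is one-directional.

```python
import sqlite3

# Hypothetical sample data (not from the original question):
# foo 'A' has two close bar values, foo 'B' has bar values far apart.
rows = [(1, "A", 1.0), (2, "A", 2.0), (3, "B", 5.0), (4, "B", 20.0), (5, "B", 40.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo VARCHAR, bar DOUBLE)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

# The grouped self-join from the answer, with ORDER BY added for stable output.
# count(t2.id) is 0 when the left outer join finds no matching t2 row.
dupes = conn.execute(
    """
    select t1.id, t1.foo, t1.bar, count(t2.id) as dupes
    from my_table t1
    left outer join my_table t2 on t1.foo = t2.foo and (t1.bar - t2.bar) > 10
    group by t1.id, t1.foo, t1.bar
    order by t1.id
    """
).fetchall()

for row in dupes:
    print(row)
```

With this sample data, only the two larger B-rows get a non-zero count, because only they have a same-foo row more than 10 below them.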
If, for example, you have 5 rows with foo='A' and 10 rows with foo='B', the self-join will join each A-row with every A-row (including itself) and each B-row with every B-row, so a simple
SELECT t1.id, t1.bar, t2.id, t2.bar, t1.foo
FROM my_table t1, my_table t2
WHERE t1.foo=t2.foo
will return 5*5+10*10=125 rows. Filtering the values will cut that number down, but you might still have (significantly) more rows than you started with. E.g., if we presume that the ten B-rows have bar values of 5 through 50 in steps of 5, each of them will be matched with the following number of rows:
bar = 5: 0 rows (those with bar less than -5)
bar = 10: 0 rows (those with bar less than 0)
bar = 15: 0 rows (those with bar less than 5)
bar = 20: 1 row (those with bar less than 10)
bar = 25: 2 rows (those with bar less than 15)
bar = 30: 3 rows (those with bar less than 20)
bar = 35: 4 rows (those with bar less than 25)
bar = 40: 5 rows (those with bar less than 30)
bar = 45: 6 rows (those with bar less than 35)
bar = 50: 7 rows (those with bar less than 40)
so you will have 28 results for the B-rows alone, and that number grows with the square of the number of rows that share the same value of foo.
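The arithmetic above can be checked with a short sketch using an in-memory SQLite table from Python. SQLite is an assumption here, and so are the bar values given to the A-rows, which the example does not specify; they do not affect either count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo VARCHAR, bar DOUBLE)")

# 5 A-rows (bar values are an arbitrary assumption) and 10 B-rows with bar = 5, 10, ..., 50.
a_rows = [("A", float(i)) for i in range(1, 6)]
b_rows = [("B", float(b)) for b in range(5, 55, 5)]
conn.executemany("INSERT INTO my_table (foo, bar) VALUES (?, ?)", a_rows + b_rows)

# Unfiltered self-join: every row pairs with every row sharing its foo, itself included.
unfiltered = conn.execute(
    "SELECT COUNT(*) FROM my_table t1, my_table t2 WHERE t1.foo = t2.foo"
).fetchone()[0]

# Same join restricted to the B-rows, with the bar filter from the question applied.
filtered_b = conn.execute(
    """
    SELECT COUNT(*)
    FROM my_table t1, my_table t2
    WHERE t1.foo = t2.foo AND t1.foo = 'B' AND t1.bar - t2.bar > 10
    """
).fetchone()[0]

print(unfiltered, filtered_b)  # 125 28
```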
Have you tried the same thing with the "new" JOIN syntax?
SELECT t1.*,
t2.*
FROM my_table t1
JOIN my_table t2 ON t1.foo = t2.foo
WHERE (t1.bar - t2.bar) > 10
I don't expect that to fix your problem, but that's at least where I would start.
I might also try this:
SELECT t1.*,
t2.*
FROM my_table t1
JOIN my_table t2 ON t1.foo = t2.foo AND t1.id != t2.id
WHERE (t1.bar - t2.bar) > 10
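As a possible variation on the id filter, the sketch below counts, for each foo, the pairs of rows whose bar values differ by more than 10 in either direction. Two things here are not in the queries above and are my own assumptions: using `t1.id < t2.id` instead of `!=` so each pair appears once, and using `ABS` so the comparison is symmetric. The sample rows and the use of SQLite from Python are also purely illustrative.

```python
import sqlite3

# Hypothetical sample data: foo 'A' is consistent, foo 'B' has conflicting bar values.
rows = [(1, "A", 1.0), (2, "A", 2.0), (3, "B", 5.0), (4, "B", 20.0), (5, "B", 40.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo VARCHAR, bar DOUBLE)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

# t1.id < t2.id keeps each unordered pair once and excludes a row paired with itself.
# ABS(...) > 10 catches the difference regardless of which row has the larger bar.
pair_counts = conn.execute(
    """
    SELECT t1.foo, COUNT(*) AS conflicting_pairs
    FROM my_table t1
    JOIN my_table t2 ON t1.foo = t2.foo AND t1.id < t2.id
    WHERE ABS(t1.bar - t2.bar) > 10
    GROUP BY t1.foo
    ORDER BY t1.foo
    """
).fetchall()

print(pair_counts)  # [('B', 3)]
```

A foo with no conflicting pairs, such as 'A' here, simply does not appear in the result, which makes the output a direct list of the foo values that need attention.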