MySQL correlated subquery in JOIN syntax

Question

I would like to provide a WHERE condition on an inner query by specifying innertable.id = outertable.id. However, MySQL (5.0.45) reports "Unknown column 'outertable.id' in 'where clause'". Is this type of query possible?

The inner query is pivoting rows to columns using a GROUP BY. This could be entirely be performed in the outer query, but would possibly incur additional overhead due to the extra joins.

Alternatively, I can leave off the WHERE condition in the inner query and instead specify an ON outertable.id = innerquery.id, but it would then fetch the entire inner query rowset to join again the outer, which is inefficient.

The actual SQL appears below:

select t.ticketid, u.userid, t.fullname, u.loginapi_userid, t.email, tp.subject, tp.contents, a.PhoneNumber, a.Location, a.Extension, a.BusinessUnit, a.Department
from swtickets t
inner join swticketposts tp on t.ticketid = tp.ticketid
inner join swusers u on t.userid = u.userid
left join
  (
  select
  cfv.typeid,
  min(case cfv.customfieldid when 1 then cfv.fieldvalue end) as 'PhoneNumber',
  min(case cfv.customfieldid when 3 then cfv.fieldvalue end) as 'Location',
  min(case cfv.customfieldid when 5 then cfv.fieldvalue end) as 'Extension',
  min(case cfv.customfieldid when 8 then cfv.fieldvalue end) as 'BusinessUnit',
  min(case cfv.customfieldid when 9 then cfv.fieldvalue end) as 'Department'
  from swcustomfieldvalues cfv
  where cfv.typeid = t.ticketid
  group by cfv.typeid
  ) as a on 1 = 1
where t.ticketid = 2458;

Answer 1

The answer to your question is no, it is not possible to reference correlation names as you are doing. The derived table is produced by your inner query before the outer query starts evaluating joins. So the correlation names like t , tp , and u are not available to the inner query.

To solve this, I'd recommend using the same constant integer value in the inner query, and then join the derived table in the outer query using a real condition instead of 1=1 .

SELECT t.ticketid, u.userid, t.fullname, u.loginapi_userid, t.email,
  tp.subject, tp.contents, a.PhoneNumber, a.Location, a.Extension,
  a.BusinessUnit, a.Department
FROM swtickets t
 INNER JOIN swticketposts tp ON (t.ticketid = tp.ticketid)
 INNER JOIN swusers u ON (t.userid = u.userid)
 LEFT OUTER JOIN (
  SELECT cfv.typeid,
    MIN(CASE cfv.customfieldid WHEN 1 THEN cfv.fieldvalue END) AS 'PhoneNumber',
    MIN(CASE cfv.customfieldid WHEN 3 THEN cfv.fieldvalue END) AS 'Location',
    MIN(CASE cfv.customfieldid WHEN 5 THEN cfv.fieldvalue END) AS 'Extension',
    MIN(CASE cfv.customfieldid WHEN 8 THEN cfv.fieldvalue END) AS 'BusinessUnit',
    MIN(CASE cfv.customfieldid WHEN 9 THEN cfv.fieldvalue END) AS 'Department'
  FROM swcustomfieldvalues cfv
  WHERE cfv.typeid = 2458
  GROUP BY cfv.typeid
  ) AS a ON (a.typeid = t.ticketid)
WHERE t.ticketid = 2458;

Answer 2

You're using the Entity-Attribute-Value design, and there's ultimately no way to make this scalable if you try to generate conventional result sets. Don't try to do this in one query.

Instead, query your normalized tables first:

SELECT t.ticketid, u.userid, t.fullname, u.loginapi_userid, t.email, 
  tp.subject, tp.contents
FROM swtickets t
 INNER JOIN swticketposts tp ON (t.ticketid = tp.ticketid)
 INNER JOIN swusers u ON (t.userid = u.userid)
WHERE t.ticketid = 2458;

Then query your custom fields, with the result on multiple rows of the result set:

SELECT cfv.customfieldid, cfv.fieldvalue
FROM swcustomfieldvalues cfv
WHERE cfv.typeid = 2458;

You'll get multiple rows in the result set, one row for each custom field:

+---------------+--------------+
| customfieldid | fieldvalue   |
+---------------+--------------+
|             1 | 415-555-1234 |
|             3 | Third office |
|             5 | 123          |
|             8 | Support      |
|             9 | Engineering  |
+---------------+--------------+

You then need to write application code to map the result-set fields to the application object fields, in a loop.

Using an Entity-Attribute-Value table in this way is more scalable both in terms of performance and code maintenance.

Answer 3

I would write it with multiple joins. When you say that it "would possibly incur additional overhead" that tells me that you haven't tested it to be sure. If you have decent indexes the joins should be pretty trivial.

This also shows just one of the pitfalls of the generic "hold everything" table design pattern.

Answer 4

My suggestion was going to be what you ruled out on the grounds of efficiency. Eg leaving out the where clause and using a join (as per t.ticketid = a.ticketid)

Have you been able to prove your thoughts on inefficiency by some concrete examples? I know what you are saying but whatever method you use every row in the outer query is being joined to every row in the inner query so depending on the execution plan it may not be as inefficient as you suspect?

Answer 5

I imagine the problem is 'cfv.typeid = t.ticketid' then? My thinking about that would be that, while MySQL supports correlated subqueries, what you're trying to do seems like it could fail in a join because the 'inner' query isn't really 'inside' the rest of the query like it is in a WHERE clause. But it looks like you could just take the where clause out of the subquery and make your join condition on a.typeid = t.ticketid.

MySQL correlated subquery in JOIN syntax

Question

5 answers

solution1
30 ACCPTED 2009-02-12 18:57:24

solution2
2 2009-02-10 03:04:00

solution3
1 2009-02-10 01:37:27

solution4
0 2009-02-09 22:55:50

solution5
0 2009-02-09 23:09:36

MySQL correlated subquery in JOIN syntax

Question

5 answers

solution1 30 ACCPTED 2009-02-12 18:57:24

solution2 2 2009-02-10 03:04:00

solution3 1 2009-02-10 01:37:27

solution4 0 2009-02-09 22:55:50

solution5 0 2009-02-09 23:09:36

solution1
30 ACCPTED 2009-02-12 18:57:24

solution2
2 2009-02-10 03:04:00

solution3
1 2009-02-10 01:37:27

solution4
0 2009-02-09 22:55:50

solution5
0 2009-02-09 23:09:36