I have a hierarchical table in MySQL: the parent field of each item points to the id field of its parent item. For each item I can get the list of all its parents, regardless of depth, using the query described here. With GROUP_CONCAT I get the full path as a single string:
SELECT GROUP_CONCAT(_id SEPARATOR ' > ') FROM (
    SELECT @r AS _id,
           (
               SELECT @r := parent
               FROM t_hierarchy
               WHERE id = _id
           ) AS parent,
           @l := @l + 1 AS lvl
    FROM (
             SELECT @r := 200,
                    @l := 0
         ) vars,
         t_hierarchy h
    WHERE @r <> 0
    ORDER BY lvl DESC
) x
I can make this work only if the id of the item is fixed (it's 200 in this case).
I want to do the same for all rows: retrieve the whole table with one additional field (path) that displays the full path. The only solution that comes to mind is to wrap this query in another SELECT, set a temporary variable @id, and use it inside the subquery. But it doesn't work: I get NULLs in the path field.
SELECT @id := id, parent, (
    SELECT GROUP_CONCAT(_id SEPARATOR ' > ') FROM (
        SELECT @r AS _id,
               (
                   SELECT @r := parent
                   FROM t_hierarchy
                   WHERE id = _id
               ) AS parent,
               @l := @l + 1 AS lvl
        FROM (
                 SELECT @r := @id,
                        @l := 0
             ) vars,
             t_hierarchy h
        WHERE @r <> 0
        ORDER BY lvl DESC
    ) x
) as path
FROM t_hierarchy
PS: I know I can store the paths in a separate field and update them when inserting/updating, but I need a solution based on the linked list technique.
UPDATE: I would like to see a solution that does not use recursion or constructs like for and while. The above method for finding paths doesn't use any loops or functions. I want a solution that follows the same logic. Or, if that's impossible, please explain why!
Consider the difference between the following two queries:
SELECT @id := id as id, parent, (
    SELECT concat(id, ': ', @id)
) as path
FROM t_hierarchy;

SELECT @id := id as id, parent, (
    SELECT concat(id, ': ', _id)
    FROM (SELECT @id as _id) as x
) as path
FROM t_hierarchy;
They look nearly identical, but give dramatically different results. On my version of MySQL, _id in the second query is the same for each row in its result set, and equal to the id of the last row. However, that last bit is only true because I executed the two queries in the order given; after SET @id := 1, for example, I can see that _id is always equal to the value in the SET statement.
So what's going on here? An EXPLAIN yields a clue:
mysql> explain SELECT @id := id as id, parent, (
    ->     SELECT concat(id, ': ', _id)
    ->     FROM (SELECT @id as _id) as x
    -> ) as path
    -> FROM t_hierarchy;
+----+--------------------+-------------+--------+---------------+------------------+---------+------+------+----------------+
| id | select_type        | table       | type   | possible_keys | key              | key_len | ref  | rows | Extra          |
+----+--------------------+-------------+--------+---------------+------------------+---------+------+------+----------------+
|  1 | PRIMARY            | t_hierarchy | index  | NULL          | hierarchy_parent | 9       | NULL | 1398 | Using index    |
|  2 | DEPENDENT SUBQUERY | <derived3>  | system | NULL          | NULL             | NULL    | NULL |    1 |                |
|  3 | DERIVED            | NULL        | NULL   | NULL          | NULL             | NULL    | NULL | NULL | No tables used |
+----+--------------------+-------------+--------+---------------+------------------+---------+------+------+----------------+
3 rows in set (0.00 sec)
That third row, the DERIVED table with no tables used, indicates to MySQL that it can be calculated exactly once, at any time. The server doesn't notice that the derived table uses a variable defined elsewhere in the query, and has no clue that you want it to be run once per row. You're being bitten by a behavior mentioned in the MySQL documentation on user-defined variables:
As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed. The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement; in addition, this order is not guaranteed to be the same between releases of the MySQL Server.
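The one-time evaluation can be modelled outside the database. The Python sketch below is only an illustration, not MySQL's actual executor: the function names and sample ids are hypothetical, and it assumes the first query happens to see each row's freshly assigned @id (which, per the quote above, is not guaranteed either). It contrasts a subquery that reads @id while each row is produced with a derived table whose value is captured once, before the outer scan.

```python
# Toy model of the two queries above; not MySQL's real execution plan.
def per_row_subquery(ids, at_id):
    """First query: the scalar subquery reads @id while each row is produced."""
    paths = []
    for row_id in ids:
        at_id = row_id                           # @id := id
        paths.append(f"{row_id}: {at_id}")       # concat(id, ': ', @id)
    return paths


def derived_once(ids, at_id):
    """Second query: (SELECT @id AS _id) is computed once, before the scan."""
    derived_id = at_id                           # snapshot of @id taken up front
    paths = []
    for row_id in ids:
        at_id = row_id                           # @id := id, too late to matter
        paths.append(f"{row_id}: {derived_id}")  # concat(id, ': ', _id)
    return paths


sample_ids = [1, 2, 3]                           # hypothetical ids
print(per_row_subquery(sample_ids, at_id=7))
print(derived_once(sample_ids, at_id=7))
```

With @id set to 7 beforehand, the second function reports 7 on every row, which mirrors the SET @id := 1 experiment described earlier.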
In my case, MySQL chooses to calculate that table first, before @id is (re)defined by the outer SELECT. In fact, that's exactly why the original hierarchical data query works: the @r definition is computed by MySQL before anything else in the query, precisely because it's that kind of derived table. Here, however, we need a way to reset @r once per table row, not just once for the whole query. To do that, we need a query that looks like the original one but resets @r by hand:
SELECT @r := if(
           @c = th1.id,
           if(
               @r is null,
               null,
               (
                   SELECT parent
                   FROM t_hierarchy
                   WHERE id = @r
               )
           ),
           th1.id
       ) AS parent,
       @l := if(@c = th1.id, @l + 1, 0) AS lvl,
       @c := th1.id as _id
FROM (
         SELECT @c := 0,
                @r := 0,
                @l := 0
     ) vars
     left join t_hierarchy as th1 on 1
     left join t_hierarchy as th2 on 1
HAVING parent is not null
This query uses the second t_hierarchy the same way the original query does: to ensure there are enough rows in the result for the parent subquery to loop over. It also adds a row for each _id that includes itself as a parent; without that, any root objects (with NULL in the parent field) would fail to appear in the results at all.
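To see which rows this produces, here is a rough Python model of the query. It is a sketch under several assumptions that are mine, not MySQL guarantees: the three-row sample table is hypothetical, th1 is scanned in id order as the outer loop of the join, and the three assignments are evaluated left to right for each joined row.

```python
from itertools import product

# Hypothetical sample data: id -> parent (None marks a root).
T_HIERARCHY = {1: None, 2: 1, 3: 2}


def inner_query(table):
    """Model the per-row variable updates, assuming left-to-right evaluation."""
    c, r, l = 0, 0, 0                           # the "vars" derived table
    rows = []
    ids = sorted(table)                         # assumed scan order of th1
    for th1_id, _th2_id in product(ids, ids):   # th1 joined to th2 ON 1
        if c == th1_id:
            # Same item as the previous row: step up to the parent of @r.
            r = None if r is None else table.get(r)
        else:
            # New item: restart the walk at the item itself.
            r = th1_id
        l = l + 1 if c == th1_id else 0         # @l := if(@c = th1.id, @l + 1, 0)
        c = th1_id                              # @c := th1.id
        if r is not None:                       # HAVING parent is not null
            rows.append((r, l, c))              # (parent, lvl, _id)
    return rows


for row in inner_query(T_HIERARCHY):
    print(row)
```

Each item contributes one row for itself at lvl 0, then one row per ancestor; the remaining th2 iterations yield NULL and are dropped by the HAVING clause.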
Oddly, running the result through GROUP_CONCAT seems to disrupt ordering. Fortunately, that function has its own ORDER BY clause:
SELECT _id,
       GROUP_CONCAT(parent ORDER BY lvl desc SEPARATOR ' > ') as path,
       max(lvl) as depth
FROM (
    SELECT @r := if(
               @c = th1.id,
               if(
                   @r is null,
                   null,
                   (
                       SELECT parent
                       FROM t_hierarchy
                       WHERE id = @r
                   )
               ),
               th1.id
           ) AS parent,
           @l := if(@c = th1.id, @l + 1, 0) AS lvl,
           @c := th1.id as _id
    FROM (
             SELECT @c := 0,
                    @r := 0,
                    @l := 0
         ) vars
         left join t_hierarchy as th1 on 1
         left join t_hierarchy as th2 on 1
    HAVING parent is not null
    ORDER BY th1.id
) as x
GROUP BY _id;
Fair warning: these queries implicitly rely on the @r and @l updates happening before the @c update. That order is not guaranteed by MySQL, and may change with any version of the server.
Define a getPath function (written below in SQL Server's T-SQL dialect, so it would need porting to run on MySQL) and run the following query:
select id, parent, dbo.getPath(id) as path from t_hierarchy
Defining the getPath function:
create function dbo.getPath(@id int)
returns varchar(400)
as
begin
    declare @path varchar(400)
    declare @term int
    declare @parent varchar(100)
    set @path = ''
    set @term = 0
    -- walk up the parent links until a root (or a self-reference) is reached
    while (@term <> 1)
    begin
        select @parent = parent from t_hierarchy where id = @id
        if (@parent is null or @parent = '' or @parent = @id)
            set @term = 1
        else
            -- prepend with a separator so the ancestors read root-first, e.g. '1 > 2'
            set @path = @parent + case when @path = '' then '' else ' > ' + @path end
        set @id = @parent
    end
    return @path
end