Categorize records of a table in SQL?

I have a table whose number of records keeps growing. It contains several columns, including id1 and id2. I want to add a column that categorizes these records as follows:

For example, if id1 = 1 is related to id2 = 2, both belong to one category.

If id1 = 3 is also related to id2 = 2, then all three ids (1, 2, 3) are grouped in the same category.

Pk      | id1     | id2     | category
--------+---------+---------+-----------
1       | 1111    | 2222    | 1
2       | 2222    | 3333    | 1
3       | 3333    | 1111    | 1
4       | 4444    | 5555    | 1
5       | 2222    | 1111    | 1
6       | 5555    | 1111    | 1
7       | 6666    | 8888    | 2
8       | 7777    | 9999    | 3

When a new record is added to the table, it should get a group, and the old groups should be updated. For example, if a new record like row 9 below is added, the category of the 7th row changes to 1:

Pk      | id1     | id2     | category
--------+---------+---------+-----------
7       | 6666    | 8888    | 1
8       | 7777    | 9999    | 3
9       | 8888    | 1111    | 1

Alternatively, instead of adding a column to this table, I could create another table with id and category that records each id's category.

This way, I want to understand the networks between different IDs.

General graph walking is a bit painful using CTEs -- but possible. And there really aren't alternatives.

In SQL Server, you can maintain a list of visited nodes. This prevents infinite recursion. Unfortunately, this list is stored using a string.

So, this calculates the categories:

with t as (
      select v.*
      from (values (1, 1111, 2222),
                   (2, 2222, 3333),
                   (3, 3333, 1111),
                   (4, 4444, 5555),
                   (5, 2222, 1111),
                   (6, 5555, 1111),
                   (7, 6666, 8888),
                   (8, 7777, 9999)
           ) v(pk, id1, id2)
    ),
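    cte as (
     -- anchor: every row starts at its own id1, which is the only visited node so far
     select pk, id1, id1 as id2, convert(varchar(max), concat(',', id1, ',')) as visited
     from t
     union all
     -- follow an edge forward (id1 -> id2), skipping nodes already visited
     select cte.pk, cte.id1, t.id2, convert(varchar(max), concat(visited, t.id2, ','))
     from cte join
          t
          on cte.id2 = t.id1
     where cte.visited not like concat('%,', t.id2, ',%')  
     union all
     -- follow an edge backward (id2 -> id1), skipping nodes already visited
     select cte.pk, cte.id1, t.id1, convert(varchar(max), concat(visited, t.id1, ','))
     from cte join
          t
          on cte.id2 = t.id2
     where cte.visited not like concat('%,', t.id1, ',%')  
    )   
-- the smallest reachable id identifies the connected subgraph; rank those ids to get categories
select pk, id1, min(id2) as min_id, dense_rank() over (order by min(id2)) as category
from cte
group by pk, id1;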

You can adapt this code to do an update (via a join on the primary key).
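
As a rough sketch of such an update, assume the rows live in a hypothetical table dbo.records(pk, id1, id2, category), with id1 and id2 both of type int. The table name and the maxrecursion hint are assumptions added here for illustration. They are not part of the query above.

```sql
-- Assumption: dbo.records(pk int, id1 int, id2 int, category int) is a hypothetical table.
with cte as (
     select pk, id1, id1 as id2, convert(varchar(max), concat(',', id1, ',')) as visited
     from dbo.records
     union all
     select cte.pk, cte.id1, r.id2, convert(varchar(max), concat(cte.visited, r.id2, ','))
     from cte join
          dbo.records r
          on cte.id2 = r.id1
     where cte.visited not like concat('%,', r.id2, ',%')
     union all
     select cte.pk, cte.id1, r.id1, convert(varchar(max), concat(cte.visited, r.id1, ','))
     from cte join
          dbo.records r
          on cte.id2 = r.id2
     where cte.visited not like concat('%,', r.id1, ',%')
    ),
    cat as (
     -- one row per pk: rank the smallest reachable id to get the category
     select pk, dense_rank() over (order by min(id2)) as category
     from cte
     group by pk
    )
update r
    set category = cat.category
    from dbo.records r join
         cat
         on cat.pk = r.pk
option (maxrecursion 0);  -- lifts the default limit of 100 recursion levels; the visited list still stops cycles
```

Because the categories are recomputed for every row, rerunning this statement after an insert also renumbers older rows whose groups have merged.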

You can also incorporate this into a trigger or application to adjust the categories when new edges are added.

However, you should revise your data structure. You have a graph data structure, so you should have a table of ids and a table of edges. The categories represent disconnected subgraphs, and should be applied to the nodes, not the edges.

Here is a db<>fiddle with the above code.
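
The nodes-and-edges layout suggested above might look roughly like this. All table and column names here (nodes, edges, dbo.records) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical schema: one row per id, with the category stored on the node.
create table nodes (
    id       int not null primary key,
    category int null                 -- connected-subgraph label, filled in later
);

-- One row per relationship between two ids.
create table edges (
    pk  int not null primary key,
    id1 int not null references nodes (id),
    id2 int not null references nodes (id)
);

-- Populate both tables from the existing data (assumed to be in dbo.records).
insert into nodes (id)
select id1 from dbo.records
union                                 -- union (not union all) removes duplicate ids
select id2 from dbo.records;

insert into edges (pk, id1, id2)
select pk, id1, id2 from dbo.records;
```

With this layout, each id's category is stored once, so merging two groups only touches rows in nodes.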

I think this pattern will help you; please adapt it to your own tables:

declare @t table(id int,parentId int,name varchar(20))
insert @t select 1,  0, 'Category1'
insert @t select 2,  0, 'Category2'
insert @t select 3,  1, 'Category3'
insert @t select 4 , 2, 'Category4'
insert @t select 5 , 1, 'Category5'
insert @t select 6 , 2, 'Category6'
insert @t select 7 , 3, 'Category7'
;

WITH tree (id, parentid, level, name, rn) as 
(
   SELECT id, parentid, 0 as level, name,
       convert(varchar(max),right(row_number() over (order by id),10)) rn
   FROM @t
   WHERE parentid = 0

   UNION ALL

   SELECT c2.id, c2.parentid, tree.level + 1, c2.name,
       rn
   FROM @t c2 
     INNER JOIN tree ON tree.id = c2.parentid
)
SELECT *
FROM tree
order by RN
