简体   繁体   中英

How to aggregate values in two columns in multiple records into one

I am on MS SQL Server.

I have a few records like this:

id   |   firstname     | startpoint   | endpoint   |    lastname     
---------------------------------------------------------------------
1    |   james         |  A           |   B        |    bond         
2    |   james         |  B           |   C        |    bond         
3    |   james         |  X           |   Z        |    bond         

the only difference between these records are startpoint and endpoint

I would like to see something like this:

firstname      | startAndEndPoint    | lastname
-------------------------------------------------
james          | A, B, C, X, Z       | bond

I've tried select concat(startpoint, ', ', endpoint) as startAndEndPoint from table where lastname = 'bond' but it still gives me the result like this:

firstname     | startAndEndPoint    | lastname
james         | A, B                | bond
james         | B, C                | bond
james         | X, Z                | bond

I believe array_agg() function can help me according to this post but it appears that function is available in postgres, not sql server.

You can unpivot and aggregate:

select firstname, lastname, string_agg(pt, ', ') as points
from (select t.*, v.pt,
             row_number() over (partition by firstname, lastname, pt order by pt) as seqnum
      from t cross apply
           (values (t.startpoint), (t.endpoint)) as v(pt)
     ) t
where seqnum = 1
group by firstname, lastname;

Unfortunately, string_agg() doesn't support distinct . However, this is easily remedied by using row_number() .

Edit:

If you wanted to identify each separate connected component, then you can use a recursive CTE:

with cte as (
      select id, firstname, lastname,
             convert(varchar(max), concat(startpoint, ', ', endpoint)) as points,
             endpoint
      from t
      where not exists (select 1 from t t2 where t2.endpoint = t.startpoint)
      union all
      select cte.id, cte.firstname, cte.lastname,
             concat(cte.point, ', ', cte.endpoint), t.endpoint
      from cte join
           t
           on t.startpoint = cte.endpoint and t.id = cte.id
    )
select *
from cte;

Here is a db<>fiddle.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM