简体   繁体   中英

How to extract multiple rows from a table based on values from multiple columns from another table and then concatenate in SQL?

I have two tables, Table 1 and Table 2. Table 1 have columns "start" and "end". Table 2 has column "position" and "Sequence". I would like to extract the sequences from Table 2 from position = start to position = end and the create a new column with the concatenated string.

Table 1

Start End
100 104
105 109

Table 2

Position Seq
100 A
101 T
102 C
103 T
104 G
105 T
106 T
107 G
108 T
109 G

My final result needs to be

Start End Sequence
100 104 ATCTG
105 109 TTGTG

I tried concatenating the values in the Table 2 using the below statement

 SELECT Sequence = (Select '' + Seq 
 from Table2
 where Position >= 100 and Position <= 104
 order by Position FOR XML PATH('')

)

You don't state what DBMS you are using so here is a SQL Server solution using a CTE and FOR XML to perform the transpose:

; WITH SequenceCTE AS
(
    SELECT  [Start],
            [End],
            Seq
    FROM    Table1 a
            JOIN Table2 b
                ON b.Position >= a.[Start] AND
                  b.Position <= a.[End]
)
SELECT  DISTINCT
        a.[Start],
        a.[End],
        (
            SELECT  STUFF(',' + Seq,1,1,'')
            FROM    SequenceCTE b
            WHERE   a.[Start] = b.[Start] AND
                    a.[End] = b.[end]
            FOR XML PATH ('') 
        )
FROM    SequenceCTE a

In standard SQL, you can do something like this:

select t1.start, t1.end,
       listagg(t2.position, '') within group (order by t2.seq) as sequence
from table1 t1 join
     table2 t2
     on t2.position between t1.start and t2.end
group by t1.start, t1.end;

Most databases support aggregate string concatenation, but the function may have a different name and slightly different syntax.

Note that start and end are poor names for columns because they are SQL keywords -- as is sequence in most databases.

You can generate row numbers for your first table which can later be used to group the ranges after joining on those numbers:

with to_id as (select row_number(*) over (order by t1.start) id, t1.* from table1 t1), 
     ranges as (select t3.id, t2.* from table2 t2 join to_id t3 on t3.start <= t2.position and t2.position <= t3.end)
select t3.start, t3.end, group_concat(r1.seq, '') from ranges r1 join to_id t3 on r1.id = t3.id group by r1.id;

Look into how crosstab queries are done.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM