How to extract multiple rows from a table based on values from multiple columns from another table and then concatenate in SQL?
I have two tables, Table 1 and Table 2. Table 1 has columns "Start" and "End". Table 2 has columns "Position" and "Seq". I would like to extract the sequences from Table 2 from Position = Start to Position = End and then create a new column with the concatenated string.
Table 1
Start | End |
---|---|
100 | 104 |
105 | 109 |
Table 2
Position | Seq |
---|---|
100 | A |
101 | T |
102 | C |
103 | T |
104 | G |
105 | T |
106 | T |
107 | G |
108 | T |
109 | G |
My final result needs to be
Start | End | Sequence |
---|---|---|
100 | 104 | ATCTG |
105 | 109 | TTGTG |
I tried concatenating the values in Table 2 using the statement below, but it only handles one hard-coded range:
SELECT Sequence = (SELECT '' + Seq
FROM Table2
WHERE Position >= 100 AND Position <= 104
ORDER BY Position FOR XML PATH('')
)
You don't state what DBMS you are using, so here is a SQL Server solution using a CTE and FOR XML to concatenate the rows:
; WITH SequenceCTE AS
(
SELECT a.[Start],
a.[End],
b.Position,
b.Seq
FROM Table1 a
JOIN Table2 b
ON b.Position >= a.[Start] AND
b.Position <= a.[End]
)
SELECT DISTINCT
a.[Start],
a.[End],
(
-- '' + Seq is an unnamed column, so FOR XML PATH('') emits bare text
SELECT '' + b.Seq
FROM SequenceCTE b
WHERE a.[Start] = b.[Start] AND
a.[End] = b.[End]
ORDER BY b.Position
FOR XML PATH ('')
) AS Sequence
FROM SequenceCTE a
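The range join inside the CTE is not specific to SQL Server. As a rough illustration only, the sketch below runs the same join in SQLite through Python's `sqlite3` module and returns the matched rows before any concatenation. The column names `start_pos` and `end_pos` and the helper name `joined_rows` are stand-ins chosen for this example, not names from the question.

```python
import sqlite3

def joined_rows():
    """Return (start_pos, end_pos, position, seq) for each position inside a range."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table1 (start_pos INTEGER, end_pos INTEGER);
        CREATE TABLE table2 (position INTEGER PRIMARY KEY, seq TEXT);
        INSERT INTO table1 VALUES (100, 104), (105, 109);
    """)
    # Sample data from the question: positions 100..109 with one letter each.
    con.executemany("INSERT INTO table2 VALUES (?, ?)",
                    list(enumerate("ATCTGTTGTG", start=100)))
    rows = con.execute("""
        SELECT a.start_pos, a.end_pos, b.position, b.seq
        FROM table1 a
        JOIN table2 b
          ON b.position >= a.start_pos AND b.position <= a.end_pos
        ORDER BY a.start_pos, b.position
    """).fetchall()
    con.close()
    return rows
```

Each Table 2 row is paired with the range that contains it, which is the input the string aggregation step then collapses to one row per range.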
In standard SQL, you can do something like this:
select t1.start, t1.end,
listagg(t2.seq, '') within group (order by t2.position) as sequence
from table1 t1 join
table2 t2
on t2.position between t1.start and t1.end
group by t1.start, t1.end;
Most databases support aggregate string concatenation, but the function may have a different name and slightly different syntax.
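For instance, SQLite spells the aggregate `group_concat`. The sketch below is one way to try it from Python's `sqlite3` module, under some assumptions: the columns are renamed to the hypothetical `start_pos` and `end_pos`, the helper name `concat_ranges` is invented for this example, and a correlated subquery with an inner ORDER BY is used so that rows reach `group_concat` in position order. SQLite does not formally promise that order for this form, so treat it as a sketch rather than a guaranteed recipe.

```python
import sqlite3

def concat_ranges():
    """Return one (start_pos, end_pos, sequence) row per range in table1."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table1 (start_pos INTEGER, end_pos INTEGER);
        CREATE TABLE table2 (position INTEGER PRIMARY KEY, seq TEXT);
        INSERT INTO table1 VALUES (100, 104), (105, 109);
    """)
    # Sample data from the question: positions 100..109 with one letter each.
    con.executemany("INSERT INTO table2 VALUES (?, ?)",
                    list(enumerate("ATCTGTTGTG", start=100)))
    rows = con.execute("""
        SELECT t1.start_pos, t1.end_pos,
               (SELECT group_concat(seq, '')
                FROM (SELECT t2.seq
                      FROM table2 t2
                      WHERE t2.position BETWEEN t1.start_pos AND t1.end_pos
                      ORDER BY t2.position)) AS sequence
        FROM table1 t1
        ORDER BY t1.start_pos
    """).fetchall()
    con.close()
    return rows
```

On the sample data this should produce the two rows from the expected result in the question.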
Note that "start" and "end" are poor names for columns because they are SQL keywords, as is "sequence" in most databases.
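If renaming the columns is not an option, quoting the identifiers is the usual workaround: double quotes in standard SQL, square brackets in SQL Server. A minimal sketch in SQLite via Python's `sqlite3` module follows; the helper name `quoted_keyword_columns` is made up for this example.

```python
import sqlite3

def quoted_keyword_columns():
    """Create and query columns literally named start and end by quoting them."""
    con = sqlite3.connect(":memory:")
    # Double quotes mark "start" and "end" as identifiers rather than keywords.
    con.execute('CREATE TABLE table1 ("start" INTEGER, "end" INTEGER)')
    con.execute('INSERT INTO table1 ("start", "end") VALUES (100, 104)')
    row = con.execute('SELECT "start", "end" FROM table1').fetchone()
    con.close()
    return row
```

Quoting works, but every query then has to carry the quotes, which is why more descriptive names are usually less trouble.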
You can generate row numbers for your first table, which can later be used to group the ranges after joining on those numbers:
with to_id as (
select row_number() over (order by t1.start) id, t1.* from table1 t1
),
ranges as (
select t3.id, t2.* from table2 t2
join to_id t3 on t3.start <= t2.position and t2.position <= t3.end
)
select t3.start, t3.end, group_concat(r1.seq, '')
from ranges r1 join to_id t3 on r1.id = t3.id
group by t3.id, t3.start, t3.end;
Look into how crosstab queries are done.