SQL Server query for SSIS transformation timing out due to 174 UNION ALL statements

I have a table in Hive and SQL Server with data stored as below. I am using SSIS to move this data into SQL Server. The query is taking too long. There are about 175 separate values in the Description column, which results in 174 UNION ALL statements, and because of that the query times out after about 2 hours.

SQL Error [08S01]: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out

Is there a better way to write this query?

Thanks!

Hive:

ID  | Description
----+------------------------------
 1  | Desc1;Desc2;Desc3;Desc4
 2  | Desc1;Desc3;Desc4;Desc5;Desc6
 ...
230 | Desc8;Desc163;Desc9;Desc2;Desc172

SQL Server:

CaseID | GroupID | Description
-------+---------+--------------
   1   |    63   | Desc1
   1   |    44   | Desc2
   1   |    57   | Desc3
   1   |    78   | Desc4
   ...
   2   |    78   | Desc1
   2   |    57   | Desc3

Query:

select 
       case 
             when cas.description like '%Desc1%' then 63 
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
union all 
select 
       case 
             when cas.description like '%Desc2%' then 44
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
union all
select 
       case 
             when cas.description like '%Desc3%' then 57 
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
union all
select 
       case 
             when cas.description like '%Desc4%' then 78 
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
...
select 
       case 
             when cas.description like '%Desc175%' then 12 
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'

This is a stab in the dark, but there are 2 things you can do to improve this query. Firstly, let's address all those UNION ALLs. If I understand your query correctly, you can unpivot your data to achieve the same thing:

SELECT V.groupid,
       cas.id AS caseid,
       current_timestamp as INSERT_DT
FROM dbo.svc_case cas
     JOIN dbo.account acc on acc.id = cas.id
     CROSS APPLY (VALUES(CASE WHEN cas.description LIKE '%Desc1%' THEN 63 END),
                        (CASE WHEN cas.description LIKE '%Desc2%' THEN 44 END),
                        (CASE WHEN cas.description LIKE '%Desc3%' THEN 57 END),
                        (CASE WHEN cas.description LIKE '%Desc4%' THEN 78 END),
                        --I assume there are 174 more of these
                        (CASE WHEN cas.description LIKE '%Desc178%' THEN 1 END))V(groupid) --The last one isn't correct, but to show how the `APPLY` ends

Then you have your WHERE, which isn't SARGable due to the LENGTH. LENGTH isn't actually a T-SQL function (the T-SQL equivalent is LEN), so I hope you are actually using SQL Server (if you're not, this is a waste of an answer, as the above is T-SQL specific). Considering that LEN(NULL) returns NULL, you can use <> '' instead. And considering you already have <> 'NULL', you can combine the two with NOT IN:

WHERE cas.description NOT IN('NULL','')
  AND acc.recordid = '03443FGT'

I do, however, suggest against storing the literal string value 'NULL' in your column. You should fix that and actually store NULL, not 'NULL'; the 2 are different values and behave very differently.
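One caveat with the LIKE patterns: '%Desc1%' also matches Desc10, Desc17, Desc172 and so on. A possible variation, not part of the approach above, is to list the (code, groupid) pairs in the VALUES clause and compare against the delimited string. This is only a sketch under the assumption that the codes are separated by semicolons with no surrounding spaces, as in the sample data; the alias M and its column names are made up for illustration, and only the four mappings given in the question are shown.

```sql
-- Sketch only (T-SQL). M(code, groupid) is a hypothetical inline mapping;
-- the remaining pairs would follow the same pattern.
SELECT M.groupid,
       cas.id AS caseid,
       CURRENT_TIMESTAMP AS INSERT_DT
FROM dbo.svc_case cas
     JOIN dbo.account acc ON acc.id = cas.id
     CROSS APPLY (VALUES ('Desc1', 63),
                         ('Desc2', 44),
                         ('Desc3', 57),
                         ('Desc4', 78)) M(code, groupid)
WHERE acc.recordid = '03443FGT'
  -- Wrap both sides in the delimiter so 'Desc1' cannot match 'Desc10'.
  AND ';' + cas.description + ';' LIKE '%;' + M.code + ';%';
```

Unlike the CASE version, this returns only the rows where a code actually matched, so there are no rows with a NULL groupid. Descriptions that are NULL, empty, or the literal 'NULL' never match a code, so the NOT IN filter is not needed in this form.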

Only run the query one time. So no UNION ALL, and leave out the CASE. Use a Multicast and split it in SSIS.
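As a rough sketch of what that single source query might look like, here is one branch of the original query with the CASE removed and the raw description passed through, so the mapping to groupid can happen downstream in the SSIS data flow. Selecting cas.description is an assumption about what the downstream split needs.

```sql
-- Sketch only: one pass over the data, no UNION ALL and no CASE.
select
       cas.id as caseid,          -- maps to caseid
       cas.description,           -- split and mapped to groupid in SSIS
       current_timestamp as INSERT_DT
from
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
```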

You can expand the codes and use case to convert them to numbers:

select (case when code = 'Desc1' then 63
             when code = 'Desc2' then 44
             . . .
        end) as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from svc_case cas join
     account acc
     on acc.id = cas.id lateral view
     explode(split(cas.description, ';')) codes as code
where acc.recordid = '03443FGT';

I don't know why you have description <> 'NULL'. I am guessing that you really want IS NOT NULL, and that is unnecessary with the lateral join.

Also, if you have a reference table, with one row per code and groupid, then the code can be further simplified by joining to that.
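A minimal sketch of that join in Hive, assuming a reference table exists. The table name group_map and its columns code and groupid are hypothetical. The explode is done in a subquery first so that the joins are applied to plain row sets.

```sql
-- Sketch only (HiveQL). group_map(code, groupid) is a hypothetical
-- reference table with one row per code.
select ref.groupid,                  -- maps to groupid
       x.caseid,                     -- maps to caseid
       current_timestamp as INSERT_DT
from (
       -- One output row per code in the semicolon-separated description.
       select cas.id as caseid,
              codes.code as code
       from svc_case cas
            lateral view explode(split(cas.description, ';')) codes as code
     ) x
join account acc on acc.id = x.caseid
join group_map ref on ref.code = x.code
where acc.recordid = '03443FGT';
```

Because this is an inner join, any exploded value without a row in the reference table is dropped, which should also cover the literal 'NULL' string and empty descriptions.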
