Update table using with temp CTE
%sql
with temp1 as
(
select req_id from table1 order by timestamp desc limit 8000000
)
update table1 set label = '1' where req_id in temp1 and req_query like '%\<\/script\>%'
update table1 set label = '1' where req_id in temp1 and req_query like '%aaaaa%'
update table1 set label = '1' where req_id in temp1 and req_query like '%bbbb%'
I get this error:
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'in' expecting {<EOF>, ';'}(line 6, pos 93)
Can anyone advise? Also, would it be cheaper to ask the database the same question each time, i.e. this query:
select req_id from table1 order by timestamp desc limit 8000000
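This is not allowed: where req_id in temp1. temp1 is not a list; it is the result of a query, and it should be used like another table (that is, selected from inside a subquery).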
You might rather write something like this:
update table1
set label = '1'
where req_id in (select req_id from table1 order by timestamp desc limit 8000000)
and req_query like '%\<\/script\>%'
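On the cost question: since all three statements in the question set the same label, they could probably be folded into one UPDATE, so the 8,000,000-row subquery is evaluated once rather than three times. A minimal sketch, assuming your Delta/Databricks runtime accepts an IN subquery in the WHERE clause of UPDATE (older versions may not), and reusing the table and column names from the question:

```sql
-- Sketch only: one pass instead of three separate UPDATE statements.
-- Assumes IN (subquery) is supported in UPDATE on your runtime.
UPDATE table1
SET label = '1'
WHERE req_id IN (
        SELECT req_id FROM table1 ORDER BY timestamp DESC LIMIT 8000000
      )
  -- Parentheses matter: without them AND binds tighter than OR.
  AND (   req_query LIKE '%\<\/script\>%'
       OR req_query LIKE '%aaaaa%'
       OR req_query LIKE '%bbbb%')
```

Note that a CTE is scoped to the single statement that follows it, so the original layout (one WITH followed by three UPDATEs) would only make temp1 visible to the first UPDATE anyway.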
You can use an update with a join instead of the IN clause. Delta does not support updating a table with an inner join, but I think you can use MERGE. Something like this:
WITH temp1 AS (
SELECT req_id FROM table1 ORDER BY timestamp DESC LIMIT 8000000
)
MERGE INTO table1 a
USING temp1 b
ON (a.req_id = b.req_id)
WHEN MATCHED THEN
UPDATE SET a.label = CASE
    WHEN a.req_query LIKE '%\<\/script\>%' THEN '1'
    WHEN a.req_query LIKE '%aaaaa%' THEN '2'
    WHEN a.req_query LIKE '%bbbb%' THEN '3'
    ELSE a.label
  END
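If your runtime rejects a WITH clause in front of MERGE, the same idea can be written with the subquery inlined in USING. The sketch below is untested and makes two assumptions that are not in the question: that req_id may repeat in table1 (hence the DISTINCT, because Delta's MERGE fails when several source rows match the same target row), and that recent is just a hypothetical alias for the inner subquery.

```sql
-- Sketch only: MERGE with the source inlined instead of a CTE.
MERGE INTO table1 a
USING (
  -- DISTINCT guards against duplicate req_id values in the source,
  -- which would otherwise make several source rows match one target row.
  SELECT DISTINCT req_id
  FROM (
    SELECT req_id FROM table1 ORDER BY timestamp DESC LIMIT 8000000
  ) AS recent  -- hypothetical alias
) b
ON a.req_id = b.req_id
WHEN MATCHED THEN
  UPDATE SET a.label = CASE
      WHEN a.req_query LIKE '%\<\/script\>%' THEN '1'
      WHEN a.req_query LIKE '%aaaaa%' THEN '2'
      WHEN a.req_query LIKE '%bbbb%' THEN '3'
      ELSE a.label  -- leave non-matching rows unchanged
    END
```

Because the join is on req_id, every row in table1 that shares a req_id with one of the most recent 8,000,000 rows is touched, which matches the behaviour of the IN version above.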