%sql
with temp1 as
(
select req_id from table1 order by timestamp desc limit 8000000
)
update table1 set label = '1' where req_id in temp1 and req_query like '%\<\/script\>%'
update table1 set label = '1' where req_id in temp1 and req_query like '%aaaaa%'
update table1 set label = '1' where req_id in temp1 and req_query like '%bbbb%'
I am getting this error:
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'in' expecting {, ';'}(line 6, pos 93)
Can someone advise? Also, what would be a less costly way to ask the database the same question:
select req_id from table1 order by timestamp desc limit 8000000
This is not allowed: where req_id in temp1. Here temp1 is not a list but the result of a query, so it should be used like another table.
You might rather write something like this:
update table1
set label = '1'
where req_id in (select req_id from table1 order by timestamp desc limit 8000000)
and req_query like '%\<\/script\>%'
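On the cost question: the three updates in the question all set the same label, so they could probably be folded into a single statement that evaluates the subquery once instead of three times. This is only a minimal sketch, and it assumes your Databricks runtime accepts an IN subquery in the WHERE clause of an UPDATE on a Delta table (older runtimes may reject it). The parentheses around the OR group matter, since AND binds more tightly than OR.

```sql
-- Sketch only: one UPDATE instead of three, same label for every pattern.
-- Assumes IN (subquery) is supported in UPDATE on this runtime.
UPDATE table1
SET label = '1'
WHERE req_id IN (
        SELECT req_id FROM table1 ORDER BY timestamp DESC LIMIT 8000000
      )
  AND (   req_query LIKE '%\<\/script\>%'
       OR req_query LIKE '%aaaaa%'
       OR req_query LIKE '%bbbb%')
```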
You can use an update with a join instead of the IN clause. Delta does not support updating a table through an inner join, but I think you can use MERGE instead. Something like this:
WITH temp1 AS (
  SELECT req_id FROM table1 ORDER BY timestamp DESC LIMIT 8000000
)
MERGE INTO table1 a
USING temp1 b
ON (a.req_id = b.req_id)
WHEN MATCHED THEN
  UPDATE SET a.label = CASE WHEN a.req_query LIKE '%\<\/script\>%' THEN '1'
                            WHEN a.req_query LIKE '%aaaaa%' THEN '2'
                            WHEN a.req_query LIKE '%bbbb%' THEN '3'
                            ELSE a.label
                       END
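Note that the MERGE above assigns '2' and '3' to the second and third patterns, whereas the question sets '1' for all three, and that its ELSE branch still updates every matched row to its current value. A possible variant, offered as a sketch rather than a tested solution, keeps the single label and moves the pattern check into the WHEN MATCHED condition so that only rows matching a pattern are updated. It makes two assumptions: the source is written as an inline subquery in case your runtime does not accept a WITH clause in front of MERGE, and DISTINCT is added because Delta's MERGE raises an error when several source rows match the same target row, which would happen if req_id repeats. The alias latest is introduced only for illustration.

```sql
-- Sketch only: same label for all patterns, update restricted to matching rows.
-- DISTINCT guards against duplicate req_id values in the source (assumption).
MERGE INTO table1 AS a
USING (
  SELECT DISTINCT req_id
  FROM (
    SELECT req_id FROM table1 ORDER BY timestamp DESC LIMIT 8000000
  ) AS latest
) AS b
ON a.req_id = b.req_id
WHEN MATCHED AND (   a.req_query LIKE '%\<\/script\>%'
                  OR a.req_query LIKE '%aaaaa%'
                  OR a.req_query LIKE '%bbbb%') THEN
  UPDATE SET a.label = '1'
```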