Update a column from the result of a count subquery

Question

I have the following query:

SELECT count(distinct document_key), etl_telco_cycle.customer_number
FROM telco_document_header
inner join etl_telco_cycle on (telco_document_header.customer_number like '%' || etl_telco_cycle.customer_number)
where telco_document_header.document_cycle = substring(cast(now() - interval '1 month' as varchar) from 1 for 4) || substring(cast(now() - interval '1 month' as varchar) from 6 for 2)
and telco_document_header.customer_number like '%' || etl_telco_cycle.customer_number
group by etl_telco_cycle.customer_number

which returns this:

Now I want to use that result to update the count in a table where customer_number matches. I tried this:

update etl_telco_cycle set amount_mobilephone_numbers = (SELECT count(distinct document_key), etl_telco_cycle.customer_number FROM telco_document_header inner join etl_telco_cycle on  (telco_document_header.customer_number like '%' || etl_telco_cycle.customer_number) where telco_document_header.document_cycle = substring(cast(now() - interval '1 month' as varchar) from 1 for 4) || substring(cast(now() - interval '1 month' as varchar) from 6 for 2) group by etl_telco_cycle.customer_number)

which results in this:

Answer 1

Use the FROM clause of the UPDATE command:

UPDATE etl_telco_cycle e
SET    amount_mobilephone_numbers = c.ct
FROM  (
   SELECT e.customer_number, count(distinct document_key) AS ct
   FROM   telco_document_header t
   JOIN   etl_telco_cycle       e ON  t.customer_number like '%' || e.customer_number
   WHERE  t.document_cycle = substring(cast(now() - interval '1 month' as varchar) from 1 for 4)
                          || substring(cast(now() - interval '1 month' as varchar) from 6 for 2)
   GROUP  BY 1
   ) c
WHERE e.customer_number = c.customer_number
AND   e.amount_mobilephone_numbers IS DISTINCT FROM c.ct;  --optional optimization

While you can also use a correlated subquery, that is typically much slower: it runs one aggregation query per target row, while this query runs a single aggregation query. There is also a minor difference: if no related rows are found, a correlated subquery like the one Gordon demonstrates still overwrites the column, while my query does nothing and keeps the old value. With count() the value written is 0; a subquery that returns no row at all would write NULL, which fails for columns defined NOT NULL. You'll have to define the desired behavior.
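To make that difference concrete, here is a minimal sketch on two throwaway tables. The names cycle_demo, header_demo and amount are hypothetical stand-ins, not the asker's schema, and the match uses plain equality only to keep the example short. Note that an aggregate such as count() without GROUP BY returns 0, not NULL, when no rows match.

```sql
-- Hypothetical demo tables; names and columns are assumptions for illustration only.
CREATE TEMP TABLE cycle_demo  (customer_number text PRIMARY KEY, amount int);
CREATE TEMP TABLE header_demo (customer_number text, document_key int);

INSERT INTO cycle_demo  VALUES ('100', 7), ('200', 7);
INSERT INTO header_demo VALUES ('100', 1), ('100', 2);   -- no documents for '200'

-- UPDATE ... FROM: '200' has no row in the derived table, so it is not touched.
UPDATE cycle_demo c
SET    amount = h.ct
FROM  (
   SELECT customer_number, count(DISTINCT document_key) AS ct
   FROM   header_demo
   GROUP  BY 1
   ) h
WHERE  c.customer_number = h.customer_number;

SELECT * FROM cycle_demo ORDER BY customer_number;   -- '100' -> 2, '200' keeps 7

-- Correlated subquery: every target row is written, including '200'.
UPDATE cycle_demo c
SET    amount = (SELECT count(DISTINCT document_key)
                 FROM   header_demo h
                 WHERE  h.customer_number = c.customer_number);

SELECT * FROM cycle_demo ORDER BY customer_number;   -- '100' -> 2, '200' -> 0
```

Which outcome is right depends on whether a customer without documents in the cycle should be reset or left alone.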

The added AND e.amount_mobilephone_numbers IS DISTINCT FROM c.ct prevents empty updates, that is, rewriting rows with the value they already hold. Related:

How do I (or can I) SELECT DISTINCT on multiple columns?
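As a small illustration of why IS DISTINCT FROM is used for that check rather than <>, the sketch below compares a few literal values. It treats NULL as a comparable value, so a row whose column is still NULL is picked up for the update, whereas <> would yield NULL and silently skip it. The literals are arbitrary examples.

```sql
SELECT 1         IS DISTINCT FROM 1         AS same_value,       -- false: update would be skipped
       1         IS DISTINCT FROM 2         AS different_value,  -- true:  row gets updated
       NULL::int IS DISTINCT FROM 1         AS null_vs_value,    -- true:  row gets updated
       NULL::int IS DISTINCT FROM NULL::int AS both_null,        -- false: update would be skipped
       NULL::int <> 1                       AS plain_inequality; -- NULL:  row would be filtered out
```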

You could optimize the performance of the counting subquery some more. You may need neither DISTINCT nor the JOIN in the subquery; I would need to see exact table definitions and constraints to tell. In any case, it looks like you can replace this:

   substring(cast(now() - interval '1 month' as varchar) from 1 for 4)
|| substring(cast(now() - interval '1 month' as varchar) from 6 for 2)

with:

to_char(now() - interval '1 month', 'YYYYMM')

Either expression depends on the current timezone setting, which may be undesirable in corner cases.
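If the session setting is a concern, one option is to pin the zone explicitly. The sketch below assumes 'UTC' purely as an example; the zone your cycles are actually defined in is something only you can know. The two columns differ only for sessions that are in a different month than the pinned zone at the moment the query runs.

```sql
SELECT to_char(now() - interval '1 month', 'YYYYMM')                      AS session_zone,
       -- AT TIME ZONE converts to a timestamp without time zone in the named zone
       to_char((now() AT TIME ZONE 'UTC') - interval '1 month', 'YYYYMM') AS pinned_zone;
```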

And document_cycle should be a date or integer, not a string type.
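For illustration only, and assuming document_cycle were stored as an integer like 201810 or as a date holding the first day of the month (neither is confirmed by the question), the comparison value could be computed along these lines:

```sql
-- Assumed integer representation, e.g. 201810
SELECT to_char(now() - interval '1 month', 'YYYYMM')::int AS cycle_as_int;

-- Assumed date representation: first day of the previous month
SELECT date_trunc('month', now() - interval '1 month')::date AS cycle_as_date;
```

Both expressions still follow the session's timezone setting.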

Answer 2

You can just use a correlated subquery:

update etl_telco_cycle
    set amount_mobilephone_numbers = (SELECT count(distinct document_key)
                                      FROM telco_document_header tdh
                                      WHERE tdh.customer_number = etl_telco_cycle.customer_number AND 
                                            tdh.document_cycle = substring(cast(now() - interval '1 month' as varchar) from 1 for 4) || substring(cast(now() - interval '1 month' as varchar) from 6 for 2) 
                                    );

I'm not sure why your version uses LIKE for the match on customer numbers. That seems awkward, so I removed it.
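Note that the two predicates are not interchangeable if the header table stores the number with a prefix. The literals below are made-up values, not taken from the question; they only show that LIKE '%' || x is a suffix match, which finds prefixed numbers that = misses but can also match unrelated numbers sharing the same ending.

```sql
SELECT '0049123' LIKE '%' || '123' AS suffix_match,     -- true:  prefixed number matches
       '0049123' = '123'           AS exact_match,      -- false: equality misses it
       '5123'    LIKE '%' || '123' AS unrelated_match;  -- true:  a different customer also matches
```

If the numbers in both tables are stored identically, plain equality is the safer choice and can use an index.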

I also think the date logic can be written more concisely using TO_CHAR():

update etl_telco_cycle
    set amount_mobilephone_numbers = (SELECT count(distinct document_key)
                                      FROM telco_document_header tdh
                                      WHERE tdh.customer_number = etl_telco_cycle.customer_number AND 
                                            tdh.document_cycle = TO_CHAR(now() - interval '1 month', 'YYYYMM')
                                     );


Answer 1: 1, 2018-11-07 16:04:00
Answer 2: 1, accepted, 2018-11-07 16:38:30
