
Druid SQL query - distinct count for a multi-value field across records

Is there a way in Druid SQL to do a distinct count on a multi-value field across rows, so that a given value is counted only once per array, no matter how many times it repeats inside that array? For example, suppose I have the records below:

shippingSpeed 
[standard, standard, standard, ground]
[standard,ground]
[ground,ground]

Expected Result:

standard 2
ground 3

I tried the query below, but it counts every occurrence of a value inside each array and then sums those counts across all records:

SELECT
"shippingSpeed", count(*)
FROM orders
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '30' DAY
GROUP BY 1
ORDER BY 2 ASC

Result:

standard 4
ground 4

This is because a GROUP BY on a multi-value column implicitly unnests each array into one row per element. Every element, including repeats within the same array, is then counted as a separate instance, so the query is counting correctly under those semantics.
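To make the difference concrete, here is a small sketch in plain Python (not Druid itself) that mimics the two counting behaviours on the sample records from the question. The function names `count_unnested` and `count_once_per_row` are made up for illustration.

```python
from collections import Counter

# Sample records from the question, one list per row.
rows = [
    ["standard", "standard", "standard", "ground"],
    ["standard", "ground"],
    ["ground", "ground"],
]

def count_unnested(rows):
    # Mimics GROUP BY on a multi-value column: every element is its own row.
    return Counter(value for row in rows for value in row)

def count_once_per_row(rows):
    # Deduplicates each array first, so a value counts once per record.
    return Counter(value for row in rows for value in set(row))

print(dict(count_unnested(rows)))
print(dict(count_once_per_row(rows)))
```

The first function reproduces the observed result (standard 4, ground 4); the second reproduces the expected one (standard 2, ground 3).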

If you want duplicates removed within each row, define the "shippingSpeed" dimension at ingestion time with the property "multiValueHandling": "SORTED_SET".
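As a rough sketch, the relevant part of the ingestion spec might look like the fragment below. Only the dimension entry is shown; the rest of the spec (data source, timestamp, input format, and so on) is omitted and depends on your setup. Since this setting is applied at ingestion time, I would expect existing data to need re-ingesting before the query results change.

```json
{
  "dimensionsSpec": {
    "dimensions": [
      {
        "type": "string",
        "name": "shippingSpeed",
        "multiValueHandling": "SORTED_SET"
      }
    ]
  }
}
```

With this in place, a row such as [standard, standard, standard, ground] should be stored as [ground, standard], so the original GROUP BY query counts each value once per record.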

You can find more details here: https://druid.apache.org/docs/latest/querying/multi-value-dimensions.html#overview
