Finding Largest Number in Array of Structs

I currently have a table that contains an organization name, org_name, and an array of structs called types, where each struct has the properties name and count. I am attempting to use BigQuery to figure out the "dominant" type by finding the type with the highest count and appending that type's name to the organization's row. My code to get the organization name and the array of structs is as follows:

CREATE TEMP FUNCTION GetNamesAndCounts(elements ARRAY<STRING>) AS (
  ARRAY(
    SELECT AS STRUCT elem AS name, COUNT(*) AS count
    FROM UNNEST(elements) AS elem
    GROUP BY elem
    ORDER BY count
  )
);

select org_name, GetNamesAndCounts(types_of_professionals) as types from table

[Image: results of that query]

For context, I would like there to be another column, dominant type, that displays the name of the type with the highest count.

As far as I can see, you order your structs ascending by count, so the last element is the one you want (provided no other name has the same count). So you can just take GetNamesAndCounts(types_of_professionals)[ordinal(array_length(GetNamesAndCounts(types_of_professionals)))], which returns the whole struct; append .name to get only the name.
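The last-element approach can be tried on inline data. The following is a minimal sketch, not the asker's actual table: the `sample` and `counted` CTEs and their rows are made up for illustration, and the counting logic is inlined instead of calling the UDF so the query runs on its own.

```sql
-- Hypothetical sample rows standing in for the real table.
WITH sample AS (
  SELECT 'org_a' AS org_name,
         ['doctor', 'nurse', 'nurse', 'doctor', 'nurse'] AS types_of_professionals
  UNION ALL
  SELECT 'org_b', ['lawyer']
),
-- Same logic as GetNamesAndCounts: one struct per type, ascending by count.
counted AS (
  SELECT
    org_name,
    ARRAY(
      SELECT AS STRUCT elem AS name, COUNT(*) AS count
      FROM UNNEST(types_of_professionals) AS elem
      GROUP BY elem
      ORDER BY count
    ) AS types
  FROM sample
)
SELECT
  org_name,
  types,
  -- The array is sorted ascending, so the last element has the highest count.
  types[ORDINAL(ARRAY_LENGTH(types))].name AS dominant_type
FROM counted;
-- org_a -> nurse, org_b -> lawyer
```

Computing the array once in a CTE avoids calling the function twice per row. If an organization can have an empty array, ORDINAL raises an out-of-range error; SAFE_ORDINAL returns NULL in that case instead.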

Or here is a full script:

select
    org_name,

    -- every type with its count, ascending by count
    array(
        select as struct
            elem as name,
            count(*) as count
        from
            unnest(types_of_professionals) as elem
        group by
            elem
        order by
            count
    ) as types,

    -- a single dominant type (ties are broken arbitrarily)
    (   select
            elem
        from
            unnest(types_of_professionals) as elem
        group by
            elem
        order by
            count(*) desc
        limit
            1
    ) as dominant_type,

    -- every type tied for the highest count
    array(
        select
            elem
        from
            (   select
                    elem,
                    count(*) as count,
                    -- no partition by: rank across all types of this row
                    rank() over(order by
                                    count(*) desc) as rnk
                from
                    unnest(types_of_professionals) as elem
                group by
                    elem
            )
        where
            rnk = 1
    ) as all_dominant_types
from
    table
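To see how ties behave, the ranking part can be run against a row where two types share the highest count. This is only a sketch: the `sample` CTE and its values are invented, and the inner `ORDER BY name` is added here just to make the output order predictable.

```sql
-- Hypothetical row in which 'doctor' and 'nurse' both appear twice.
WITH sample AS (
  SELECT 'org_c' AS org_name,
         ['doctor', 'nurse', 'nurse', 'doctor', 'clerk'] AS types_of_professionals
)
SELECT
  org_name,
  ARRAY(
    SELECT name
    FROM (
      SELECT
        elem AS name,
        -- Rank all types of this row together; tied counts share a rank.
        RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
      FROM UNNEST(types_of_professionals) AS elem
      GROUP BY elem
    )
    WHERE rnk = 1
    ORDER BY name
  ) AS all_dominant_types
FROM sample;
-- org_c -> [doctor, nurse]
```

Partitioning the window by the type itself would put every type alone in its own partition, so each would get rank 1 and the filter would keep all of them.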
