
Too many columns in GROUP BY

I'm trying to aggregate some data, but I have a problem. Here is my query (using 3 tables):

SELECT
    ufc.counter_id,
    gcrvf.goal_id,
    gcrvf.date_of_visit,
    ufc.utm_campaign,
    ufc.utm_source,
    ufc.utm_medium,
    ufc.utm_content,
    ufc.utm_term,
    ufc.original_join_id,
    max(gcrvf.last_update_time) AS last_update_time,
    sum(gcrvf.conversions) AS conversions,
    c.name AS counter_name,
    c.owner_login AS owner_login,
    c.status AS counter_status,
    concat(g.goal_source, CAST('Goal', 'text')) AS metric_type,
    multiIf(g.is_retargeting = 0, 'non-retargeting', g.is_retargeting = 1, 'retargeting', NULL) AS metric_key,
    concat(g.name, ' (', CAST(gcrvf.goal_id, 'String'), ')') AS metric_name
FROM connectors_yandex_metrika.goal_conversions_report_v_final AS gcrvf
INNER JOIN connectors_yandex_metrika.utm_for_collect AS ufc ON gcrvf.counter_id = ufc.counter_id
LEFT JOIN connectors_yandex_metrika.counter AS c ON gcrvf.counter_id = c.id
LEFT JOIN connectors_yandex_metrika.goal AS g ON gcrvf.goal_id = g.id
WHERE
    ((gcrvf.utm_campaign = ufc.utm_campaign) OR (ufc.utm_campaign IS NULL))
    AND ((gcrvf.utm_source = ufc.utm_source) OR (ufc.utm_source IS NULL))
    AND ((gcrvf.utm_medium = ufc.utm_medium) OR (ufc.utm_medium IS NULL))
    AND ((gcrvf.utm_content = ufc.utm_content) OR (ufc.utm_content IS NULL))
    AND ((gcrvf.utm_term = ufc.utm_term) OR (ufc.utm_term IS NULL))
GROUP BY
    ufc.counter_id,
    gcrvf.date_of_visit,
    gcrvf.goal_id,
    ufc.utm_campaign,
    ufc.utm_source,
    ufc.utm_medium,
    ufc.utm_content,
    ufc.utm_term,
    ufc.original_join_id,
    c.name,
    c.owner_login,
    c.status,
    metric_type,
    metric_key,
    metric_name

I have to GROUP BY almost all columns. Is that a real problem?

The columns ufc.original_join_id, c.name, c.owner_login, c.status, metric_type, metric_key, and metric_name are not needed for the grouping itself. I added them to GROUP BY only because I need these columns in the result. So I want to ask: is there any way to make the query shorter and avoid listing unnecessary columns in GROUP BY? Or is it okay as it is?

And my second question: does ClickHouse cache the right table when we use JOINs? Should I therefore always put the huge table on the left?

All columns are required in the GROUP BY. It is not possible to leave out columns that are listed in the SELECT, unless they are wrapped in an aggregate function.
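One way to shorten the GROUP BY list, offered as a sketch rather than a definitive rewrite: ClickHouse has an any() aggregate function that returns one value from each group. If the descriptive columns are constant within a group, wrapping them in any() lets you group by the real keys only. This assumes that counter.id and goal.id are unique and that original_join_id is determined by the counter and UTM columns; neither assumption is confirmed by the question, so check them against your data first.

```sql
-- Sketch only. Assumes c.id and g.id are unique and that original_join_id
-- does not vary within a (counter, date, goal, utm_*) group, so any()
-- returns the single value each group has.
SELECT
    ufc.counter_id,
    gcrvf.goal_id,
    gcrvf.date_of_visit,
    ufc.utm_campaign,
    ufc.utm_source,
    ufc.utm_medium,
    ufc.utm_content,
    ufc.utm_term,
    any(ufc.original_join_id) AS original_join_id,
    max(gcrvf.last_update_time) AS last_update_time,
    sum(gcrvf.conversions) AS conversions,
    any(c.name) AS counter_name,
    any(c.owner_login) AS owner_login,
    any(c.status) AS counter_status,
    any(concat(g.goal_source, 'Goal')) AS metric_type,
    any(multiIf(g.is_retargeting = 0, 'non-retargeting', g.is_retargeting = 1, 'retargeting', NULL)) AS metric_key,
    any(concat(g.name, ' (', toString(gcrvf.goal_id), ')')) AS metric_name
FROM connectors_yandex_metrika.goal_conversions_report_v_final AS gcrvf
INNER JOIN connectors_yandex_metrika.utm_for_collect AS ufc ON gcrvf.counter_id = ufc.counter_id
LEFT JOIN connectors_yandex_metrika.counter AS c ON gcrvf.counter_id = c.id
LEFT JOIN connectors_yandex_metrika.goal AS g ON gcrvf.goal_id = g.id
WHERE
    ((gcrvf.utm_campaign = ufc.utm_campaign) OR (ufc.utm_campaign IS NULL))
    AND ((gcrvf.utm_source = ufc.utm_source) OR (ufc.utm_source IS NULL))
    AND ((gcrvf.utm_medium = ufc.utm_medium) OR (ufc.utm_medium IS NULL))
    AND ((gcrvf.utm_content = ufc.utm_content) OR (ufc.utm_content IS NULL))
    AND ((gcrvf.utm_term = ufc.utm_term) OR (ufc.utm_term IS NULL))
GROUP BY
    ufc.counter_id,
    gcrvf.date_of_visit,
    gcrvf.goal_id,
    ufc.utm_campaign,
    ufc.utm_source,
    ufc.utm_medium,
    ufc.utm_content,
    ufc.utm_term
```

If the assumptions do not hold, any() will silently pick one of several different values per group, so in that case keeping the columns in GROUP BY is the safer choice.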

Depending on which columns are indexed, you can improve the speed of the query. You should try to create an index on the key columns used for joining and filtering.

The database will handle the caching logic for you, depending on how often you execute the query.
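To add some detail on the JOIN question: as far as I know, with ClickHouse's default hash join the right-hand table is read into memory and a hash table is built from it each time the query runs, which is not the same as a cache kept between queries. Because of that, the smaller table usually belongs on the right and the large one on the left. A minimal sketch, where big_facts and small_dim are hypothetical table names used only for illustration:

```sql
-- Hypothetical tables: big_facts (many rows) and small_dim (few rows).
-- The right-hand side of the JOIN is the one loaded into memory for the
-- hash table, so the small table goes there.
SELECT
    f.counter_id,
    d.name,
    sum(f.conversions) AS conversions
FROM big_facts AS f
INNER JOIN small_dim AS d ON f.counter_id = d.id
GROUP BY
    f.counter_id,
    d.name
```

The query in the question already follows this pattern if goal_conversions_report_v_final is the largest of the tables involved.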

