In BigQuery, compute difference between two rows in group by

with
  my_stats as (
    select 24996 as competitionId, 17 as playerId, 'on' as onOff, 8 as fga, 4 as fgm, 0.50 as fgPct union all
    select 24996 as competitionId, 17 as playerId, 'off' as onOff, 5 as fga, 3 as fgm, 0.60 as fgPct union all
    select 24996 as competitionId, 24 as playerId, 'on' as onOff, 9 as fga, 6 as fgm, 0.67 as fgPct union all
    select 24996 as competitionId, 24 as playerId, 'off' as onOff, 3 as fga, 1 as fgm, 0.33 as fgPct union all
    select 24996 as competitionId, 27 as playerId, 'on' as onOff, 5 as fga, 4 as fgm, 0.8 as fgPct
  ),
  
  my_output as (
    select 24996 as competitionId, 17 as playerId, 'diff' as onOff, 3 as fga, 1 as fgm, -0.1 as fgPct union all
    select 24996 as competitionId, 24 as playerId, 'diff' as onOff, 6 as fga, 5 as fgm, 0.34 as fgPct
  )
  

-- Run one of these at a time; a WITH clause takes a single final query.
select * from my_stats
-- select * from my_output

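This is a simple example to demonstrate the problem we are trying to solve. We have a table my_stats whose primary key is the combination of competitionId, playerId, onOff, and the onOff column can only be 'on' or 'off'. For a single competitionId, playerId pair (which then has two rows, one for 'on' and one for 'off'), we want to subtract the values (on - off) in all the other columns.

Hopefully the my_output table makes clear what output we need. In the case of playerId = 27, that player has no 'off' row, so it can simply be dropped from the output, since there is nothing to compute.

You can do conditional aggregation: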

select
    competitionId,
    playerId,
    'diff' as onOff,
    sum(case when onOff = 'on' then fga   else - fga   end) fga,
    sum(case when onOff = 'on' then fgm   else - fgm   end) fgm,
    sum(case when onOff = 'on' then fgpct else - fgpct end) fgpct
from my_stats
where onOff in ('on', 'off')
group by competitionId, playerId
having count(*) = 2

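This groups the data by competition and player; the conditional sum() then computes the difference between the 'on' and 'off' values for each column. The having clause filters out groups that do not have both records available.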
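To try the conditional aggregation without a real table, the sample rows from the question can be inlined as a CTE. The following is a minimal sketch under that assumption; the round() call and the order by are additions for readability and are not part of the original answer.

```sql
-- Sample rows copied from the question, inlined so the query runs standalone.
with my_stats as (
  select 24996 as competitionId, 17 as playerId, 'on' as onOff, 8 as fga, 4 as fgm, 0.50 as fgPct union all
  select 24996, 17, 'off', 5, 3, 0.60 union all
  select 24996, 24, 'on',  9, 6, 0.67 union all
  select 24996, 24, 'off', 3, 1, 0.33 union all
  select 24996, 27, 'on',  5, 4, 0.80
)
select
    competitionId,
    playerId,
    'diff' as onOff,
    -- 'on' rows count positively, 'off' rows negatively, so each sum is (on - off)
    sum(case when onOff = 'on' then fga   else -fga   end) as fga,
    sum(case when onOff = 'on' then fgm   else -fgm   end) as fgm,
    -- round() is an addition here, only to hide floating-point noise in the difference
    round(sum(case when onOff = 'on' then fgPct else -fgPct end), 2) as fgPct
from my_stats
where onOff in ('on', 'off')
group by competitionId, playerId
having count(*) = 2   -- drops playerId 27, which has only an 'on' row
order by playerId
```

With these rows the result should match my_output from the question: (3, 1, -0.1) for playerId 17 and (6, 5, 0.34) for playerId 24.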

Another solution, based on a self-join:

select
    t1.competitionId,
    t1.playerId,
    'diff' as onOff,
    t1.fga - t2.fga as fga,
    t1.fgm - t2.fgm as fgm,
    t1.fgpct - t2.fgpct as fgpct
from my_stats as t1
join my_stats as t2
  on t1.competitionId = t2.competitionId
 and t1.playerId = t2.playerId
where t1.onOff = 'on'
  and t2.onOff = 'off'

You should check which of the two approaches is more efficient on your data.

Below is for BigQuery Standard SQL

#standardSQL
SELECT competitionId, playerId, 'diff' AS onOff,
  SUM(onOffSign * fga) AS fga,
  SUM(onOffSign * fgm) AS fgm,
  SUM(onOffSign * fgPct) AS fgPct  
FROM my_stats, 
  UNNEST([IF(onOff = 'on', 1, -1)]) onOffSign
GROUP BY competitionId, playerId
HAVING COUNT(1) = 2  
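In the query above, the UNNEST of a single-element array only serves to attach a computed +1 / -1 column (onOffSign) to every row. As a sketch of the same idea without UNNEST, the sign can be computed in a subquery instead. The inlined my_stats rows are the sample data from the question, included so the query runs standalone. Like the original, this treats any value other than 'on' as -1, relying on the question's guarantee that onOff is only ever 'on' or 'off'.

```sql
#standardSQL
WITH my_stats AS (
  SELECT 24996 AS competitionId, 17 AS playerId, 'on' AS onOff, 8 AS fga, 4 AS fgm, 0.50 AS fgPct UNION ALL
  SELECT 24996, 17, 'off', 5, 3, 0.60 UNION ALL
  SELECT 24996, 24, 'on',  9, 6, 0.67 UNION ALL
  SELECT 24996, 24, 'off', 3, 1, 0.33 UNION ALL
  SELECT 24996, 27, 'on',  5, 4, 0.80
)
SELECT competitionId, playerId, 'diff' AS onOff,
  SUM(onOffSign * fga) AS fga,
  SUM(onOffSign * fgm) AS fgm,
  SUM(onOffSign * fgPct) AS fgPct
FROM (
  -- +1 for the 'on' row, -1 for the 'off' row, so each SUM yields (on - off)
  SELECT *, IF(onOff = 'on', 1, -1) AS onOffSign
  FROM my_stats
)
GROUP BY competitionId, playerId
HAVING COUNT(1) = 2  -- keep only players that have both rows
```

The fgPct differences are computed in floating point, so they may show tiny rounding noise unless wrapped in ROUND.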

