
Group Data Hourly and Insert Into a Summary Table in Postgres

I have a database with approximately 50 tables (for now). Because of the amount of data that gets poured into a few of those tables, I have been tasked with creating an hourly summarization of the data and dumping it into another table. Running reports on the original data takes quite long, as the new database (2 weeks old) is already up to 200k records in one of the two tables I'm pulling data from.

The query gets three possible results for customers, "cust1", "cust2" and "cust3", each with a chosen card (a mail response to a product quality questionnaire, with a potential prize) that is one of 13 choices. Cards are represented by letter ("A" for Ace, "K" for King, and so on for the associated values).

Here is one of the sub-queries and the corresponding result:

select sp_cust_card_sequence(cards.cust1) as seq, cards.cust1 as card, count(cards.cust1) as card_count, rtt.game_id as game_id, gt.promo_id as promo_id, gt.choice_id as choice_id, extract(hour from header.start_timestamp) as hour, header.start_timestamp::timestamp::date as date
    from game_bac_cards cards
    inner join card_cust_resp header ON (header.id = cards.game_id)
    inner join game_table gt ON (header.promo_id = gt.promo_id)
    inner join ref_table_type rtt ON (gt.table_type_id = rtt.id)
    where result <> 'undef' 
    group by date, hour, card, rtt.game_id, gt.promo_id, gt.choice_id, cards.cust1

Result:

Essentially, I would like to group, for example, all values of "K" from the card column by hour. With the query below I can almost achieve that, but the result snippet below shows that the value "K" for hour "19" on "12/4/2014" has two entries instead of just one. I'm sure there is a more elegant way of doing this.

Final Query:

select date, hour, card, seq, sum(card_count) as card_count, game_id, choice_id, promo_id 
from (
select date, hour, card, seq, sum(card_count) as card_count, game_id, choice_id, promo_id from (
    select sp_cust_card_sequence(cards.cust1) as seq, cards.cust1 as card, count(cards.cust1) as card_count, rtt.game_id as game_id, gt.promo_id as promo_id, gt.choice_id as choice_id, extract(hour from header.start_timestamp) as hour, header.start_timestamp::timestamp::date as date
    from game_bac_cards cards
    inner join card_cust_resp header ON (header.id = cards.game_id)
    inner join game_table gt ON (header.promo_id = gt.promo_id)
    inner join ref_table_type rtt ON (gt.table_type_id = rtt.id)
    where result <> 'undef' 
    group by date, hour, card, rtt.game_id, gt.promo_id, gt.choice_id, cards.cust1
) as cust1_table
where cust1_table.card is not null and cust1_table.card <> ''
group by date, hour, card_count, card, seq, game_id, choice_id, promo_id

union all

select date, hour, card, seq, sum(card_count) as card_count, game_id, choice_id, promo_id from (
    select sp_cust_card_sequence(cards.cust2) as seq, cards.cust2 as card, count(cards.cust2) as card_count, rtt.game_id as game_id, gt.promo_id as promo_id, gt.choice_id as choice_id, extract(hour from header.start_timestamp) as hour, header.start_timestamp::timestamp::date as date
    from game_bac_cards cards
    inner join card_cust_resp header ON (header.id = cards.game_id)
    inner join game_table gt ON (header.promo_id = gt.promo_id)
    inner join ref_table_type rtt ON (gt.table_type_id = rtt.id)
    where result <> 'undef' 
    group by date, hour, card, rtt.game_id, gt.promo_id, gt.choice_id, cards.cust2
) as cust2_table
where cust2_table.card is not null and cust2_table.card <> ''
group by date, hour, card_count, card, seq, game_id, choice_id, promo_id

union all

select date, hour, card, seq, sum(card_count) as card_count, game_id, choice_id, promo_id from (
    select sp_cust_card_sequence(cards.cust3) as seq, cards.cust3 as card, count(cards.cust3) as card_count, rtt.game_id as game_id, gt.promo_id as promo_id, gt.choice_id as choice_id, extract(hour from header.start_timestamp) as hour, header.start_timestamp::timestamp::date as date
    from game_bac_cards cards
    inner join card_cust_resp header ON (header.id = cards.game_id)
    inner join game_table gt ON (header.promo_id = gt.promo_id)
    inner join ref_table_type rtt ON (gt.table_type_id = rtt.id)
    where result <> 'undef' 
    group by date, hour, card, rtt.game_id, gt.promo_id, gt.choice_id, cards.cust3
) as cust3_table
where cust3_table.card is not null and cust3_table.card <> ''
group by date, hour, card_count, card, seq, game_id, choice_id, promo_id
) as card_details
where card_details.card is not null and card_details.card <> ''
group by date, hour, card, card_count, seq, game_id, choice_id, promo_id
order by date, hour, seq

Results showing only card K | hour 19 | date 12/4/2014. This should be only ONE row, not TWO.


Any help is greatly appreciated!

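Try removing card_count from the group by of the outer query. That query already sums card_count, so also grouping by it keeps rows for the same date, hour and card apart whenever the three union all branches report different counts. That is, change

group by date, hour, card_count, card, seq, game_id, choice_id, promo_id

to

group by date, hour, card, seq, game_id, choice_id, promo_id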
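
The effect can be reproduced in isolation. Below is a minimal sketch for PostgreSQL, assuming a hypothetical temp table card_details_demo with made-up counts standing in for the rows produced by the three union all branches. The table name, the values and the total alias are illustrative and not part of the original schema.

```sql
-- Hypothetical toy data: one pre-counted row per customer column,
-- standing in for the three "union all" branches (values are made up).
create temp table card_details_demo (
    date       date,
    hour       int,
    card       text,
    card_count int
);

insert into card_details_demo values
    ('2014-12-04', 19, 'K', 3),  -- as if from cust1
    ('2014-12-04', 19, 'K', 5),  -- as if from cust2
    ('2014-12-04', 19, 'K', 3);  -- as if from cust3

-- card_count in the group by: rows with different counts stay apart,
-- so this returns two rows for K at hour 19 (totals 6 and 5).
select date, hour, card, sum(card_count) as total
from card_details_demo
group by date, hour, card, card_count;

-- card_count only inside sum(): one row for K at hour 19 (total 11).
select date, hour, card, sum(card_count) as total
from card_details_demo
group by date, hour, card;
```

The first query mirrors the shape of the outer query in the question. The second is the same query with card_count dropped from the group by, which is the only change suggested here.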

