
Insert table statistics in another table

I have the following table in a database (Teradata/Hive):

CREATE MULTISET TABLE DP_S.temp_a ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      id INTEGER,
      val VARCHAR(200) CHARACTER SET LATIN CASESPECIFIC
      )
NO PRIMARY INDEX;

I want to generate the following stats and insert them, one row per statistic, into a stats table:

SEL 'temp_a' AS testing_table,
                CAST(COUNT(*) AS NUMBER(30)) AS noofrows
                , MIN(id) AS min_id
                , MAX(id) AS max_id
                , SUM(CASE WHEN id IS NULL THEN 1 ELSE 0 END) AS null_check_id
                ,SUM(CAST(id AS DECIMAL(30,8))) AS sumnum_id
                , MIN(val) AS min_val
                , MAX(val) AS max_val
                ,SUM(CAST(LENGTH(TRIM(val)) AS DECIMAL(30,8))) AS sumlen_val
                FROM DP_S.temp_a;

DDL for stats table:

CREATE MULTISET TABLE DP_S.SANITY_STATS ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      stats_date DATE DEFAULT CURRENT_DATE FORMAT 'YYYY-MM-DD',
      TBL_name VARCHAR(200) CHARACTER SET LATIN CASESPECIFIC,
      COL_name VARCHAR(200) CHARACTER SET LATIN CASESPECIFIC,
      stats_type VARCHAR(200) CHARACTER SET LATIN CASESPECIFIC,
      stats_value BIGINT
     )
NO PRIMARY INDEX;

For example, if temp_a is the following table:

id | val
1  | 'abc'
2  | 'def'

the stats table should contain the following information:

Date       | Table  | Col | StatType | Value
2014-12-24 | temp_a | all | noofrows | 2
2014-12-24 | temp_a | id  | min      | 1
2014-12-24 | temp_a | id  | max      | 2
2014-12-24 | temp_a | id  | sumnum   | 3
2014-12-24 | temp_a | id  | nullchk  | 0
2014-12-24 | temp_a | val | min      | 'abc'
2014-12-24 | temp_a | val | max      | 'def'
2014-12-24 | temp_a | val | sumlen   | 6

Is it possible to achieve this in a single query? (I know I can run multiple inserts, each selecting one stat from the table, but I suspect that would be slow and cumbersome.)

I think the following should work. Apologies for any typos; I don't have a database available to test it.

select
    b.testing_table
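    -- min_val/max_val are VARCHAR while the other branches are numeric;
    -- cast the branches to one common type if the engine rejects the mix
,   case b.fld
        when 1 then a.noofrows
        when 2 then a.min_id
        when 3 then a.max_id
        when 4 then a.null_check_id
        when 5 then a.sumnum_id
        when 6 then a.min_val
        when 7 then a.max_val
        when 8 then a.sumlen_val
    end as stats_value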
from
(
    select
        CAST(COUNT(*) AS NUMBER(30)) AS noofrows
    ,   MIN(id) AS min_id
    ,   MAX(id) AS max_id
    ,   SUM(CASE WHEN id IS NULL THEN 1 ELSE 0 END) AS null_check_id
    ,   SUM(CAST(id AS DECIMAL(30,8))) AS sumnum_id
    ,   MIN(val) AS min_val
    ,   MAX(val) AS max_val
    ,   SUM(CAST(LENGTH(TRIM(val)) AS DECIMAL(30,8))) AS sumlen_val
    FROM DP_S.temp_a
) as a
cross join  -- pairs the single aggregate row with each fld value; a full outer join would need an ON clause
(
    select 'temp_a' AS testing_table, 1 as fld
    union all select 'temp_a' AS testing_table, 2 as fld
    union all select 'temp_a' AS testing_table, 3 as fld
    union all select 'temp_a' AS testing_table, 4 as fld
    union all select 'temp_a' AS testing_table, 5 as fld
    union all select 'temp_a' AS testing_table, 6 as fld
    union all select 'temp_a' AS testing_table, 7 as fld
    union all select 'temp_a' AS testing_table, 8 as fld
) as b
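
Since I can't test against Teradata, here is a minimal sketch of the same cross-join idea run against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here, so the types and DDL options differ from the Teradata ones above. The lowercase sanity_stats table, the extra col_name/stats_type columns in the lookup subquery, and the TEXT type for stats_value are my own assumptions (a BIGINT column cannot hold values such as 'abc').

```python
import sqlite3

# In-memory SQLite database standing in for Teradata (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp_a (id INTEGER, val TEXT);
    INSERT INTO temp_a VALUES (1, 'abc'), (2, 'def');

    -- stats_value is TEXT here (assumption) so numeric and string stats share one column
    CREATE TABLE sanity_stats (
        stats_date  TEXT DEFAULT CURRENT_DATE,
        tbl_name    TEXT,
        col_name    TEXT,
        stats_type  TEXT,
        stats_value TEXT
    );
""")

# One pass over temp_a computes all aggregates (subquery a); the cross join
# with the 8-row lookup (subquery b) turns that single row into 8 rows.
conn.execute("""
    INSERT INTO sanity_stats (tbl_name, col_name, stats_type, stats_value)
    SELECT 'temp_a', b.col_name, b.stats_type,
           CASE b.fld
               WHEN 1 THEN CAST(a.noofrows AS TEXT)
               WHEN 2 THEN CAST(a.min_id AS TEXT)
               WHEN 3 THEN CAST(a.max_id AS TEXT)
               WHEN 4 THEN CAST(a.sumnum_id AS TEXT)
               WHEN 5 THEN CAST(a.null_check_id AS TEXT)
               WHEN 6 THEN a.min_val
               WHEN 7 THEN a.max_val
               WHEN 8 THEN CAST(a.sumlen_val AS TEXT)
           END
    FROM (
        SELECT COUNT(*) AS noofrows,
               MIN(id) AS min_id,
               MAX(id) AS max_id,
               SUM(id) AS sumnum_id,
               SUM(CASE WHEN id IS NULL THEN 1 ELSE 0 END) AS null_check_id,
               MIN(val) AS min_val,
               MAX(val) AS max_val,
               SUM(LENGTH(TRIM(val))) AS sumlen_val
        FROM temp_a
    ) AS a
    CROSS JOIN (
        SELECT 1 AS fld, 'all' AS col_name, 'noofrows' AS stats_type
        UNION ALL SELECT 2, 'id',  'min'
        UNION ALL SELECT 3, 'id',  'max'
        UNION ALL SELECT 4, 'id',  'sumnum'
        UNION ALL SELECT 5, 'id',  'nullchk'
        UNION ALL SELECT 6, 'val', 'min'
        UNION ALL SELECT 7, 'val', 'max'
        UNION ALL SELECT 8, 'val', 'sumlen'
    ) AS b
""")

rows = conn.execute(
    "SELECT col_name, stats_type, stats_value FROM sanity_stats"
).fetchall()
stats = {(col, kind): value for col, kind, value in rows}

for (col, kind), value in sorted(stats.items()):
    print(col, kind, value)
```

The point of the design is that temp_a is scanned once, and the small lookup subquery decides both which aggregate lands in each output row and how that row is labelled.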
