
Getting table names and row counts for all tables in an Athena database

I have an AWS Athena database with multiple tables, and I am trying to get the row counts for all of them in a single query.

The ideal query output would be:

table_name row_count
table2_name row_count
etc...

So far I've been able to get either all the table names from the database or all the row counts of the tables (in random order), but not both in the same query.

This query returns a column of all the table names that exist in the database:

SELECT table_name FROM information_schema.tables WHERE table_schema = '<database_name>';

This query returns all the row counts for the tables:

SELECT COUNT(*) FROM table_name
UNION ALL
SELECT COUNT(*) FROM table2_name
UNION ALL
-- ...and so on for the rest of the tables

The issue with this query is that it displays the row counts in a random order that doesn't correspond to the order of the tables in the query, so I don't know which row count goes with which table. That is why I need both the table names and the row counts.

Simply add the names of the tables as literals in your queries:

SELECT 'table_name' AS table_name, COUNT(*) AS row_count FROM table_name
UNION ALL
SELECT 'table_name2' AS table_name, COUNT(*) AS row_count FROM table_name2
UNION ALL
…
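If the list of tables is long, the statement above can be generated instead of typed by hand. The following is a minimal Python sketch of that idea, not part of the original answer: the helper name `build_count_query` is hypothetical, and it assumes you already have the table names as a list (for example, from the information_schema query in the question).

```python
def build_count_query(table_names):
    """Return one UNION ALL statement that labels each COUNT(*) with its table name."""
    parts = []
    for name in table_names:
        label = name.replace("'", "''")   # escape single quotes inside the string literal
        ident = name.replace('"', '""')   # escape double quotes inside the quoted identifier
        parts.append(
            f"SELECT '{label}' AS table_name, COUNT(*) AS row_count FROM \"{ident}\""
        )
    # Join the per-table SELECTs so that no UNION ALL trails the last one
    return "\nUNION ALL\n".join(parts)


# Hypothetical table names, matching the placeholders used in the answer
query = build_count_query(["table_name", "table_name2"])
print(query)
```

The printed text can then be pasted into the Athena query editor. Double-quoting the identifiers is an assumption made here so that unusual table names still parse.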

The following query generates the text of the UNION ALL query that produces the record counts for all tables; run the generated text as a second query. One problem to solve is that (as of December 2022) INFORMATION_SCHEMA.TABLES incorrectly reports every table and view as a BASE TABLE, so you will need some logic to eliminate the views.

In data warehousing it is common practice to record snapshots of the record counts of landing tables at frequent intervals. Any unexpected deviations from the expected counts can be used for reporting or alerting.

WITH Table_List AS (
    -- One SELECT per table, labelled with the run date and the table name
    SELECT table_schema, table_name,
           CONCAT('SELECT CURRENT_DATE AS run_date, ''', table_name, ''' AS table_name, COUNT(*) AS Records FROM "', table_schema, '"."', table_name, '"') AS BaseSQL
    FROM INFORMATION_SCHEMA.TABLES
    WHERE
            table_schema = 'YOUR_DB_NAME' -- Change this
        AND table_name LIKE 'YOUR TABLE PATTERN%' -- Change or remove this line
)
, Total_Records AS (
    -- Number of generated lines, used to detect the last one
    SELECT COUNT(*) AS Table_Count
    FROM Table_List
)
SELECT
    -- Append UNION ALL to every generated line except the last
    CASE WHEN ROW_NUMBER() OVER (ORDER BY table_name) = Table_Count
        THEN BaseSQL
        ELSE CONCAT(BaseSQL, ' UNION ALL') END AS All_Table_Record_count_SQL
FROM Table_List CROSS JOIN Total_Records
ORDER BY table_name;
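The snapshot idea mentioned above (recording counts at intervals and flagging unexpected deviations) can be sketched in a few lines of Python. This is an illustration only, not part of the original answer: the function name `find_deviations`, the sample table names, and the 20% default tolerance are all assumptions, and the two dictionaries stand in for two runs of the generated count query.

```python
def find_deviations(previous, current, tolerance=0.2):
    """Return tables whose row count changed by more than `tolerance` (a fraction)."""
    flagged = {}
    for table, old_count in previous.items():
        new_count = current.get(table)
        if new_count is None:
            # Table is missing from the newer snapshot
            flagged[table] = (old_count, None)
            continue
        if old_count == 0:
            # Avoid dividing by zero: any rows appearing counts as a change
            changed = new_count != 0
        else:
            changed = abs(new_count - old_count) / old_count > tolerance
        if changed:
            flagged[table] = (old_count, new_count)
    return flagged


# Hypothetical snapshots: table name -> row count
previous = {"orders": 1000, "customers": 200, "events": 50}
current = {"orders": 1010, "customers": 90}
alerts = find_deviations(previous, current)
print(alerts)
```

Here `orders` stays within the tolerance, while `customers` (a 55% drop) and the missing `events` table are flagged. A relative threshold is only one possible choice; an absolute threshold or a per-table expectation may suit some landing tables better.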

