We are streaming real-time data to Redshift. The bottleneck is the number of table loads that can run concurrently; at present we run more than 1,000 loads every 15 minutes.
We want to reduce this number based on how frequently users actually query these tables. How can we get this information in Redshift?
The view below, open-sourced by awslabs (one of the admin views in the amazon-redshift-utils repository), can be used to find the most frequently scanned tables.
-- Note: stl_scan retains only a few days of log history,
-- so the counts reflect recent activity only.
CREATE OR REPLACE VIEW admin.v_get_table_scan_frequency
AS
SELECT t.database,
       t.schema AS schemaname,
       t.table_id,
       t."table" AS tablename,
       t.size,
       t.sortkey1,
       NVL(s.num_qs, 0) AS num_qs
FROM svv_table_info t
LEFT JOIN (SELECT tbl,
                  perm_table_name,
                  COUNT(DISTINCT query) AS num_qs
           FROM stl_scan s
           -- userid > 1 excludes internal system queries
           WHERE s.userid > 1
             AND s.perm_table_name NOT IN ('Internal Worktable', 'S3')
           GROUP BY tbl, perm_table_name) s
       ON s.tbl = t.table_id
      AND t."schema" NOT IN ('pg_internal')
ORDER BY 7 DESC;
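Once the view exists, tables that are loaded every 15 minutes but are rarely or never scanned by users are the first candidates for a slower refresh schedule. A minimal sketch (the threshold is illustrative; raise it to also catch rarely used tables):

```sql
-- Tables with no user scans at all in the retained stl_scan history:
-- candidates to drop from the 15-minute load cycle
SELECT database, schemaname, tablename, size
FROM admin.v_get_table_scan_frequency
WHERE num_qs = 0
ORDER BY size DESC;
```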
\d admin.v_get_table_scan_frequency
Column | Type | Modifiers
------------+--------+-----------
database | text |
schemaname | text |
table_id | oid |
tablename | text |
size | bigint |
sortkey1 | text |
num_qs | bigint |
select * from admin.v_get_table_scan_frequency order by num_qs desc;
database | schemaname | table_id | tablename | size | sortkey1 | num_qs
-----------------+------------+----------+------------------------------------------+-------+---------------+--------
db | product | 1 | table1 | 92 | AUTO(SORTKEY) | 13448
db | product | 2 | table2 | 180 | AUTO(SORTKEY) | 13389
Keeping a time series of this query's output in Prometheus can reveal the scan rate and frequency trend over time for each table. Based on that, we can decide how frequently each table's data needs to be refreshed in Redshift.
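If you would rather keep the history inside Redshift than in Prometheus, a sketch along these lines captures periodic snapshots of the view (the admin.table_scan_history table name and the schedule are hypothetical):

```sql
-- Hypothetical history table: one row per table per capture
CREATE TABLE IF NOT EXISTS admin.table_scan_history (
    captured_at TIMESTAMP,
    table_id    BIGINT,
    tablename   VARCHAR(128),
    num_qs      BIGINT
);

-- Run on a schedule (e.g. hourly) to build the time series,
-- since stl_scan itself only retains a few days of history
INSERT INTO admin.table_scan_history
SELECT GETDATE(), table_id, tablename, num_qs
FROM admin.v_get_table_scan_frequency;
```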