[英]how to find size of database, schema, table in redshift
Team,团队,
my redshift version is:我的红移版本是:
PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.735
how to find out database size, tablespace, schema size & table size ?如何找出数据库大小、表空间、模式大小和表大小?
but below are not working in redshift ( for above version )但以下不适用于红移(对于以上版本)
SELECT pg_database_size('db_name');
SELECT pg_size_pretty( pg_relation_size('table_name') );
Is there any alternate to find out like oracle ( from DBA_SEGMENTS )是否有任何替代方法可以找到像 oracle (来自 DBA_SEGMENTS )
for tble size, i have below query, but not sure about exact menaing of MBYTES.对于 tble 大小,我有以下查询,但不确定 MBYTES 的确切含义。 FOR 3rd row, MBYTES = 372. it means 372 MB ?
对于第三行,MBYTES = 372。这意味着 372 MB ?
select trim(pgdb.datname) as Database, trim(pgn.nspname) as Schema,
trim(a.name) as Table, b.mbytes, a.rows
from ( select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name ) as a
join pg_class as pgc on pgc.oid = a.id
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
join pg_database as pgdb on pgdb.oid = a.db_id
join (select tbl, count(*) as mbytes
from stv_blocklist group by tbl) b on a.id=b.tbl
order by a.db_id, a.name;
database | schema | table | mbytes | rows
---------------+--------------+------------------+--------+----------
postgres | public | company | 8 | 1
postgres | public | table_data1_1 | 7 | 1
postgres | proj_schema1 | table_data1 | 372 | 33867540
postgres | public | table_data1_2 | 40 | 2000001
(4 rows)
The above answers don't always give correct answers for table space used.上述答案并不总是为使用的表空间提供正确答案。 AWS support have given this query to use:
AWS 支持已提供此查询使用:
SELECT TRIM(pgdb.datname) AS Database,
TRIM(a.name) AS Table,
((b.mbytes/part.total::decimal)*100)::decimal(5,2) AS pct_of_total,
b.mbytes,
b.unsorted_mbytes
FROM stv_tbl_perm a
JOIN pg_database AS pgdb
ON pgdb.oid = a.db_id
JOIN ( SELECT tbl,
SUM( DECODE(unsorted, 1, 1, 0)) AS unsorted_mbytes,
COUNT(*) AS mbytes
FROM stv_blocklist
GROUP BY tbl ) AS b
ON a.id = b.tbl
JOIN ( SELECT SUM(capacity) AS total
FROM stv_partitions
WHERE part_begin = 0 ) AS part
ON 1 = 1
WHERE a.slice = 0
ORDER BY 4 desc, db_id, name;
Yes, mbytes in your example is 372Mb.是的,您示例中的 MB 为 372Mb。 Here's what I've been using:
这是我一直在使用的:
select
cast(use2.usename as varchar(50)) as owner,
pgc.oid,
trim(pgdb.datname) as Database,
trim(pgn.nspname) as Schema,
trim(a.name) as Table,
b.mbytes,
a.rows
from
(select db_id, id, name, sum(rows) as rows
from stv_tbl_perm a
group by db_id, id, name
) as a
join pg_class as pgc on pgc.oid = a.id
left join pg_user use2 on (pgc.relowner = use2.usesysid)
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
and pgn.nspowner > 1
join pg_database as pgdb on pgdb.oid = a.db_id
join
(select tbl, count(*) as mbytes
from stv_blocklist
group by tbl
) b on a.id = b.tbl
order by mbytes desc, a.db_id, a.name;
I'm not sure about grouping by database and scheme, but here's a short way to get usage by table,我不确定是否按数据库和方案分组,但这里有一种按表获取使用情况的简短方法,
SELECT tbl, name, size_mb FROM
(
SELECT tbl, count(*) AS size_mb
FROM stv_blocklist
GROUP BY tbl
)
LEFT JOIN
(select distinct id, name FROM stv_tbl_perm)
ON id = tbl
ORDER BY size_mb DESC
LIMIT 10;
you can checkout this repository, i'm sure you'll find useful stuff there.你可以查看这个存储库,我相信你会在那里找到有用的东西。
https://github.com/awslabs/amazon-redshift-utils https://github.com/awslabs/amazon-redshift-utils
to answer your question you can use this view: https://github.com/awslabs/amazon-redshift-utils/blob/master/src/AdminViews/v_space_used_per_tbl.sql要回答您的问题,您可以使用此视图: https : //github.com/awslabs/amazon-redshift-utils/blob/master/src/AdminViews/v_space_used_per_tbl.sql
and then query as you like.然后随意查询。 eg:
select * from admin.v_space_used_per_tbl;
例如:
select * from admin.v_space_used_per_tbl;
Modified versions of one of the other answers.其他答案之一的修改版本。 This includes database name, schema name, table name, total row count, size on disk and unsorted size:
这包括数据库名称、模式名称、表名称、总行数、磁盘大小和未排序大小:
-- sort by row count
select trim(pgdb.datname) as Database, trim(pgns.nspname) as Schema, trim(a.name) as Table,
c.rows, ((b.mbytes/part.total::decimal)*100)::decimal(5,3) as pct_of_total, b.mbytes, b.unsorted_mbytes
from stv_tbl_perm a
join pg_class as pgtbl on pgtbl.oid = a.id
join pg_namespace as pgns on pgns.oid = pgtbl.relnamespace
join pg_database as pgdb on pgdb.oid = a.db_id
join (select tbl, sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
join (select id, sum(rows) as rows from stv_tbl_perm group by id) c on a.id=c.id
join (select sum(capacity) as total from stv_partitions where part_begin=0) as part on 1=1
where a.slice=0
order by 4 desc, db_id, name;
-- sort by space used
select trim(pgdb.datname) as Database, trim(pgns.nspname) as Schema, trim(a.name) as Table,
c.rows, ((b.mbytes/part.total::decimal)*100)::decimal(5,3) as pct_of_total, b.mbytes, b.unsorted_mbytes
from stv_tbl_perm a
join pg_class as pgtbl on pgtbl.oid = a.id
join pg_namespace as pgns on pgns.oid = pgtbl.relnamespace
join pg_database as pgdb on pgdb.oid = a.db_id
join (select tbl, sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
join (select id, sum(rows) as rows from stv_tbl_perm group by id) c on a.id=c.id
join (select sum(capacity) as total from stv_partitions where part_begin=0) as part on 1=1
where a.slice=0
order by 6 desc, db_id, name;
This query is much easier:这个查询要容易得多:
-- List the Top 30 largest tables on your cluster -- 列出集群中最大的 30 个表
SELECT
"schema"
,"table" AS table_name
,ROUND((size/1024.0),2) AS "Size in Gigabytes"
,pct_used AS "Physical Disk Used by This Table"
FROM svv_table_info
ORDER BY pct_used DESC
LIMIT 30;
SVV_TABLE_INFO
is a Redshift systems table that shows information about user-defined tables (not other system tables) in a Redshift database. SVV_TABLE_INFO
是一个 Redshift 系统表,它显示有关 Redshift 数据库中用户定义表(而不是其他系统表)的信息。 The table is only visible to superusers.该表仅对超级用户可见。
To get the size of each table, run the following command on your Redshift cluster:要获取每个表的大小,请在您的 Redshift 集群上运行以下命令:
SELECT "table", size, tbl_rows
FROM SVV_TABLE_INFO
table
column is the table name. table
列是表名。size
column is the size of the table in MB. size
列是表的大小(以 MB 为单位)。tbl_rows
column is the total number of rows in the table, including rows that have been marked for deletion but not yet vacuumed. tbl_rows
列是表中的总行数,包括已标记为删除但尚未tbl_rows
的行。Look at SVV_TABLE_INFO
Redshift documentation for other interesting columns to retrieve from this system table.查看
SVV_TABLE_INFO
Redshift 文档以了解要从此系统表中检索的其他有趣列。
This is what I am using(please change the databasename from 'mydb' to your database name) :这就是我正在使用的(请将数据库名称从“mydb”更改为您的数据库名称):
SELECT CAST(use2.usename AS VARCHAR(50)) AS OWNER
,TRIM(pgdb.datname) AS DATABASE
,TRIM(pgn.nspname) AS SCHEMA
,TRIM(a.NAME) AS TABLE
,(b.mbytes) / 1024 AS Gigabytes
,a.ROWS
FROM (
SELECT db_id
,id
,NAME
,SUM(ROWS) AS ROWS
FROM stv_tbl_perm a
GROUP BY db_id
,id
,NAME
) AS a
JOIN pg_class AS pgc ON pgc.oid = a.id
LEFT JOIN pg_user use2 ON (pgc.relowner = use2.usesysid)
JOIN pg_namespace AS pgn ON pgn.oid = pgc.relnamespace
AND pgn.nspowner > 1
JOIN pg_database AS pgdb ON pgdb.oid = a.db_id
JOIN (
SELECT tbl
,COUNT(*) AS mbytes
FROM stv_blocklist
GROUP BY tbl
) b ON a.id = b.tbl
WHERE pgdb.datname = 'mydb'
ORDER BY mbytes DESC
,a.db_id
,a.NAME;
src: https://aboutdatabases.wordpress.com/2015/01/24/amazon-redshift-how-to-get-the-sizes-of-all-tables/源代码: https : //aboutdatabases.wordpress.com/2015/01/24/amazon-redshift-how-to-get-the-sizes-of-all-tables/
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.