A BigQuery best practice is to split timeseries in daily tables (as "NAME_yyyyMMdd") and then use Table Wildcards to query one or more of these tables.
Sometimes it is useful to get the last update time on a certain set of data (ie to check correctness of the ingestion procedure). How do I get the last update time over a set of tables organized like that?
A good way to achieve that is to use the __TABLES__
meta-table. Here is a generic query I use in several projects:
SELECT
MAX(last_modified_time) LAST_MODIFIED_TIME,
IF(REGEXP_MATCH(RIGHT(table_id,8),"[0-9]{8}"),LEFT(table_id,LENGTH(table_id) - 8),table_id) AS TABLE_ID
FROM
[my_dataset.__TABLES__]
GROUP BY
TABLE_ID
It will return the last update time of every table in my_dataset
. For tables organized with a daily-split structure, it will return a single value (the update time of the latest table), with the initial part of their name as TABLE_ID
.
SELECT * FROM project_name.data_set_name
.INFORMATION_SCHEMA.PARTITIONS where table_name='my_table';
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.