
How to create partition and cluster on an existing table in BigQuery?

In SQL Server, we can create an index like this. How do we create an index after the table already exists? What is the syntax for creating a clustered index in BigQuery?

CREATE INDEX abcd ON `abcd.xxx.xxx`(columnname )

In BigQuery, we can create a table as shown below. But how do we add partitioning and clustering to an existing table?

CREATE TABLE rep_sales.orders_tmp PARTITION BY DATE(created_at) CLUSTER BY created_at AS SELECT * FROM rep_sales.orders

As @Sergey Geron mentioned in the comments, BigQuery doesn't support indexes. For more information, please refer to this doc.

An existing table cannot be partitioned in place, but you can create a new partitioned table and then load the data into it from the unpartitioned table.
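The create-and-load approach described above can be sketched with the bq command-line tool. This is only a minimal sketch: it assumes the table names from the question (rep_sales.orders and rep_sales.orders_tmp) and a created_at timestamp column, and the DRY_RUN variable is a convention of this script, not a bq feature. By default it prints the command instead of running it.

```shell
#!/bin/sh
# Table and column names are taken from the question; adjust for your project.
SRC="rep_sales.orders"        # existing unpartitioned table
DST="rep_sales.orders_tmp"    # new partitioned and clustered table
PARTITION_COL="created_at"    # assumed to be a TIMESTAMP column

# DDL that creates the new table and loads it from the old one in one step.
SQL="CREATE TABLE ${DST}
PARTITION BY DATE(${PARTITION_COL})
CLUSTER BY ${PARTITION_COL}
AS SELECT * FROM ${SRC}"

# DRY_RUN is a script-local switch (an assumption, not a bq flag).
# Default: print the command. Set DRY_RUN=0 to actually run it.
if [ "${DRY_RUN:-1}" = "1" ]; then
  printf 'bq query --use_legacy_sql=false "%s"\n' "$SQL"
else
  bq query --use_legacy_sql=false "$SQL"
fi
```

Run it once as is to inspect the statement, then with DRY_RUN=0 to execute it. After checking that the new table holds the expected rows, the original table can be retired in whatever way fits your workflow.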

As for clustering, BigQuery supports changing an existing non-clustered table to a clustered table and vice versa. You can also update the set of clustered columns of a clustered table. This method of updating the clustering column set is useful for tables that use continuous streaming inserts, because those tables cannot be easily swapped by other methods.

You can change the clustering specification in the following ways:

  • Call the tables.update or tables.patch API method.

  • Call the bq command-line tool's bq update command with the --clustering_fields flag.
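As an illustration of the second option, the following sketch builds a bq update call that sets the clustering columns on an existing table. The table name and column are borrowed from the question and are assumptions about your schema; the DRY_RUN switch is again just a script-local convention, so nothing is changed unless you opt in.

```shell
#!/bin/sh
# Placeholder values based on the question; replace with your own.
TABLE="rep_sales.orders"       # dataset.table to modify
CLUSTER_FIELDS="created_at"    # comma-separated list of clustering columns

# Build the command as a string so it can be reviewed before running.
CMD="bq update --clustering_fields=${CLUSTER_FIELDS} ${TABLE}"

# DRY_RUN is a script-local switch (an assumption, not a bq flag).
# Default: print the command. Set DRY_RUN=0 to actually run it.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$CMD"
else
  $CMD
fi
```

To cluster on several columns, list them in CLUSTER_FIELDS separated by commas, in the order you want them applied.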

Note: When a table is converted from non-clustered to clustered, or the clustered column set is changed, automatic re-clustering only works from that time onward. For example, a non-clustered 1 PB table that is converted to a clustered table using tables.update still has 1 PB of non-clustered data. Automatic re-clustering only applies to new data committed to the table after the update.
