How to add columns in BigQuery to a table with no schema without deleting its current labels in SQL?

I run BigQuery jobs from Python code that first creates an empty results table in BigQuery with specific labels and a description. Later, in the BigQuery SQL, I insert the results into that empty table. The only problem is that I can't use ALTER to add columns to a table with no schema. I can't add the schema beforehand because the SQL query is created dynamically by the Python code. The only workaround I found was to create the table with a column called 'x' and then remove it at the end of the SQL query.

Here is an idea of what the code looks like:

CREATE TEMP FUNCTION 
    ... very_complicated_function ...;

CREATE TEMP TABLE features AS    
    ... very_clever_code ...;


ALTER TABLE `table.created.by_python`
ADD COLUMN IF NOT EXISTS key INT64,
ADD COLUMN IF NOT EXISTS feature1 INT64;

ALTER TABLE `table.created.by_python` DROP COLUMN x;



INSERT INTO `table.created.by_python`
SELECT * EXCEPT (nearest_centroids_distance)
FROM
    ML.PREDICT(MODEL `brilliant.genius.amazing`,
               (SELECT * FROM features)) AS M;
    

The best option would be to simply insert the data into the empty table and let BigQuery create the schema itself if one doesn't exist.

You can add an empty column to an existing table by:

  • Using the Cloud Console

  • Using the bq command-line tool's bq update command

  • Calling the tables.patch API method

  • Using the ALTER TABLE ADD COLUMN data definition language (DDL) statement

  • Using the client libraries.
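Since the query text in the question is generated by Python anyway, the DDL option from the list above could be generated the same way. The following is only a minimal sketch: `build_add_columns_sql` is a hypothetical helper name, and the column names and types are taken from the question's example. It does not validate identifiers, so it assumes the names come from trusted code.

```python
def build_add_columns_sql(table_id, columns):
    """Return one ALTER TABLE statement that adds every (name, type) pair.

    Hypothetical helper: `columns` is an ordered list of
    (column_name, bigquery_sql_type) tuples produced by the calling code.
    """
    if not columns:
        raise ValueError("at least one column is required")
    clauses = ",\n".join(
        f"ADD COLUMN IF NOT EXISTS {name} {sql_type}"
        for name, sql_type in columns
    )
    return f"ALTER TABLE `{table_id}`\n{clauses};"


print(build_add_columns_sql(
    "table.created.by_python",
    [("key", "INT64"), ("feature1", "INT64")],
))
```

The generated statement can then be prepended to the rest of the dynamically built script.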

Here's some Python code using the client library that you can try, to see if it helps in your case.

from google.cloud import bigquery
 
# Construct a BigQuery client object.
client = bigquery.Client()
 
# TODO(developer): Set table_id to the ID of the table
#                  to add an empty column.
table_id = "your-project.your_dataset.your_table_name"
 
table = client.get_table(table_id)  # Make an API request.
 
original_schema = table.schema
new_schema = original_schema[:]  # Creates a copy of the schema.
new_schema.append(bigquery.SchemaField("phone", "STRING"))
 
table.schema = new_schema
table = client.update_table(table, ["schema"])  # Make an API request.
 
if len(table.schema) == len(original_schema) + 1 == len(new_schema):
    print("A new column has been added.")
else:
    print("The column has not been added.")
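For the case in the question, where several columns are only known at run time, the same `get_table` / `update_table` calls might be wrapped as below. This is a sketch under assumptions, not a definitive implementation: `append_columns` is a hypothetical name, `client` is expected to be a `bigquery.Client`, and `new_fields` a list of `bigquery.SchemaField` objects. Because only `"schema"` is listed in the update, the table's labels and description are not part of the request and should stay as they are.

```python
def append_columns(client, table_id, new_fields):
    """Add any fields from new_fields that the table does not have yet.

    Hypothetical helper. `client` is assumed to be a bigquery.Client and
    `new_fields` a list of bigquery.SchemaField objects.
    """
    table = client.get_table(table_id)  # Make an API request.
    existing_names = {field.name for field in table.schema}
    to_add = [f for f in new_fields if f.name not in existing_names]
    if not to_add:
        return table  # Nothing to change, so skip the update call.

    table.schema = list(table.schema) + to_add
    # Only the schema property is sent; labels and description are untouched.
    return client.update_table(table, ["schema"])  # Make an API request.


# Example usage (requires the google-cloud-bigquery package and credentials):
# from google.cloud import bigquery
# client = bigquery.Client()
# append_columns(
#     client,
#     "your-project.your_dataset.your_table_name",
#     [bigquery.SchemaField("key", "INT64"),
#      bigquery.SchemaField("feature1", "INT64")],
# )
```

Run before the query job starts, this would give the INSERT statement a schema to write into without needing the placeholder column.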

Also, here's some documentation that can help you add the new column to a table.
