
Convert BigQuery DDL to Snowflake DDL

There is a BigQuery table with nested columns; the DDL for the table is:

CREATE TABLE stg.raw_data.user_events
(
pipeline_metadata STRUCT<uuid STRING, timestamp TIMESTAMP, restream_count INT64, pubsub_subscription_name STRING>,
event_name STRING,
insertId STRING,
eventID STRING,
timestamp TIMESTAMP,
metadata STRUCT<userID INT64, uuid STRING, sessionID STRING, platform STRING, prime_status STRING, created_at TIMESTAMP, city STRING, country STRING, embed_url STRING, embed_domain STRING, deviceID STRING, device_type STRING, state STRING, metadata_schema_version STRING, schema_version STRING, ipAddress STRING>,
properties STRING
)
PARTITION BY DATE(timestamp);

Sample data from this table looks like this:

[{
  "pipeline_metadata": {
    "uuid": "d2aae738-ddbe-43fc-b1a0-a0a3d7f4a66c",
    "timestamp": "2022-10-06 00:44:53.804000 UTC",
    "restream_count": "0",
    "pubsub_subscription_name": "raw-user-events-data-ingestor-subscription"
  },
  "event_name": "search",
  "insertId": "5751196922008325",
  "eventID": "rCz_ZG70T-CYlJ7uAgWhvxOeRKqoevLb",
  "timestamp": "2022-10-06 00:44:52.792000 UTC",
  "metadata": {
    "userID": "2235338",
    "uuid": "3cf13e3f499339f45bc1a344bd5c83866ebae85b6c89255ff203467552157aaa",
    "sessionID": "140b56af96c6331e8e5cc88721143d788a30b3c35fe28dac2c2e3ceeb1ee4948",
    "platform": "web-app",
    "prime_status": "subscription",
    "created_at": "2022-10-06 00:44:52.767000 UTC",
    "city": "Woodside",
    "country": "United States",
    "embed_url": null,
    "embed_domain": null,
    "deviceID": "db0f3955624d5541b587ac4661a25dfd",
    "device_type": "desktop",
    "state": "New York",
    "metadata_schema_version": "latest",
    "schema_version": "latest",
    "ipAddress": null
  },
  "properties": "{\"action\":\"focus-on-input\",\"source\":\"dashboard\",\"page_path\":\"/home/dashboard\"}"
}]

I am trying to create a similar table in Snowflake. Since this is a migration from BigQuery to Snowflake, the nested column structure should be retained so that the structure is the same in both warehouses.

I was able to prepare a table DDL for Snowflake, but the columns are not nested like they are in the BigQuery table.

CREATE OR REPLACE TABLE test.dbt."USER_EVENTS"
(
pipeline_metadata variant,
event_name STRING,
insertId STRING,
eventID STRING,
time_stamp TIMESTAMP,
metadata variant,
properties STRING
)
CLUSTER BY (DATE(time_stamp));
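For context, the VARIANT columns in this DDL do keep the nested values reachable at query time through Snowflake's colon path syntax and `::` casts — a sketch, assuming the table above has been created and loaded:

```sql
-- Nested fields inside a VARIANT column are addressed with the colon
-- path syntax and cast with :: (hypothetical query against the table above)
SELECT
  metadata:userID::INT           AS user_id,
  metadata:sessionID::STRING     AS session_id,
  pipeline_metadata:uuid::STRING AS pipeline_uuid
FROM test.dbt."USER_EVENTS";
```

But the nesting only exists inside the VARIANT values, not in the column definitions themselves.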

How can I build a DDL for a Snowflake table where the structure of the table is similar to that of the BigQuery table?

Depending on how you define "similar" (between DBMSes there is not always a direct correspondence for a given feature), you could simply flatten out the STRUCTs:

CREATE TABLE stg.raw_data.user_events
(
  -- flattened from pipeline_metadata STRUCT
  uuid STRING,
  meta_timestamp TIMESTAMP,
  restream_count INT,
  pubsub_subscription_name STRING,
  event_name STRING,
  insertId STRING,
  eventID STRING,
  time_stamp TIMESTAMP,
  -- flattened from metadata STRUCT (uuid renamed to avoid a collision)
  userID INT,
  user_uuid STRING,
  sessionID STRING,
  platform STRING,
  prime_status STRING,
  created_at TIMESTAMP,
  city STRING,
  country STRING,
  embed_url STRING,
  embed_domain STRING,
  deviceID STRING,
  device_type STRING,
  state STRING,
  metadata_schema_version STRING,
  schema_version STRING,
  ipAddress STRING,
  properties VARIANT
)
CLUSTER BY (time_stamp);

That is certainly somewhat similar; whether it is similar enough is not for us to decide.
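If you go the flattened route, the nested source rows can be unpacked at load time with the same path syntax. A minimal sketch, assuming the raw JSON rows have been landed in a single VARIANT column `v` of a hypothetical staging table `raw_landing` (only a few of the target columns are shown; the rest follow the same pattern):

```sql
-- Hypothetical load from a staging table with one VARIANT column "v"
-- holding the original nested JSON rows
INSERT INTO stg.raw_data.user_events
  (uuid, meta_timestamp, event_name, time_stamp, userID, properties)
SELECT
  v:pipeline_metadata:uuid::STRING,
  v:pipeline_metadata:timestamp::TIMESTAMP,
  v:event_name::STRING,
  v:timestamp::TIMESTAMP,
  v:metadata:userID::INT,
  PARSE_JSON(v:properties::STRING)  -- properties arrives as a JSON string
FROM raw_landing;
```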
