How to store a dynamically generated JSON object in a BigQuery table?

I have a use case to store dynamic JSON objects in a column in BigQuery. The schema of each object is generated dynamically by the source and is not known beforehand. The number of key-value pairs in the object can differ as well, as shown below.

Example JSON objects:
{"Fruit":"Apple","Price":"10","Sale":"No"}
{"Movie":"Avatar","Genre":"Fiction"}

I could achieve the same in Hive by defining the column as a map<string, string> and querying it with col_name["Fruit"] or col_name["Movie"] for the corresponding row.

Is there an equivalent of the above usage in BigQuery? I came across the 'RECORD' data type, but its schema needs to be the same for all the objects in the column.

Note: Storing the column as a string data type is not an option, because the users need to query the data by key directly, without parsing it after retrieval.

At the moment, storing the data as a JSON string seems to be the only way to implement your requirement. As a workaround, you can create a JavaScript UDF that parses the JSON string and extracts the necessary information. Below is a sample UDF.

-- Temporary JavaScript UDF: parse the JSON string and return the value stored under `key`.
CREATE TEMP FUNCTION extract_from_json(json STRING, key STRING)
RETURNS STRING
LANGUAGE js AS """
  const obj = JSON.parse(json);
  return obj[key];
""";

-- Sample rows whose JSON objects have different keys.
WITH json_table AS (
  SELECT '{"Fruit":"Apple","Price":"10","Sale":"No"}' AS json_data UNION ALL
  SELECT '{"Movie":"Avatar","Genre":"Fiction"}' AS json_data
)
SELECT extract_from_json(json_data, 'Movie') AS movie
FROM json_table;
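
For comparison, here is a minimal sketch of the same lookup without a UDF. It swaps in BigQuery's built-in JSON_VALUE function, which is not part of the answer above. The column name json_data and the sample rows are reused from the example, and the path '$.Movie' is written as a literal.

```sql
-- Sketch only: the built-in JSON_VALUE is used in place of the JavaScript UDF.
WITH json_table AS (
  SELECT '{"Fruit":"Apple","Price":"10","Sale":"No"}' AS json_data UNION ALL
  SELECT '{"Movie":"Avatar","Genre":"Fiction"}' AS json_data
)
SELECT
  json_data,
  JSON_VALUE(json_data, '$.Movie') AS movie  -- NULL when the key is absent
FROM json_table;
```

The data is still stored as a string, so this does not lift the restriction in the question. It only moves the parsing into the query.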

You can also check out the newly introduced JSON data type in BigQuery. The data type offers more flexibility when handling JSON data, but please note that, at the time of writing, it is still in preview and is not recommended for production. You will have to enroll in the preview. For more information on working with JSON data, refer to the BigQuery documentation.
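
As a rough illustration of what the JSON data type allows, the sketch below builds the two sample objects as JSON literals and reads a key with the subscript operator, which is close to the Hive col_name["Movie"] style from the question. It assumes the JSON type is enabled for your project. The names json_table and json_data are reused from the UDF example for illustration only.

```sql
-- Sketch only: assumes the JSON data type is available in your project.
WITH json_table AS (
  SELECT JSON '{"Fruit":"Apple","Price":"10","Sale":"No"}' AS json_data UNION ALL
  SELECT JSON '{"Movie":"Avatar","Genre":"Fiction"}' AS json_data
)
SELECT
  json_data['Movie'] AS movie_json,                  -- result is of type JSON
  JSON_VALUE(json_data, '$.Movie') AS movie_string   -- result is of type STRING
FROM json_table;
```

Rows whose object has no "Movie" key should come back as NULL in both columns, so objects with different keys can share one column.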
