Writing to a JSON column type in BigQuery using Spark

Original post: 2022-12-02 12:34:09. Tags: java, apache-spark, google-bigquery, apache-spark-sql

I have a column of type JSON in my BigQuery schema definition. I want to write to this column from a Java Spark pipeline, but I cannot find a way to do it.

If I create a struct from the JSON, the column ends up as a RECORD type. If I use to_json as shown below, it is converted into a STRING type.

dataframe = dataframe.withColumn("JSON_COLUMN", functions.to_json(functions.col("JSON_COLUMN")));
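The two outcomes described above can be checked locally by inspecting the Spark schema before anything is written. The following is a minimal sketch, assuming Spark SQL is on the classpath; the class name JsonColumnTypes, the helper names, and the one-row sample data are hypothetical and only for illustration. A struct column has the Spark type StructType, which is what ends up as RECORD, while to_json produces StringType, which ends up as STRING.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.DataType;

public class JsonColumnTypes {
    // Builds a one-row DataFrame whose JSON_COLUMN is a struct (made-up sample data).
    public static Dataset<Row> sample(SparkSession spark) {
        return spark.sql("SELECT named_struct('id', 1, 'name', 'a') AS JSON_COLUMN");
    }

    // Applies the to_json conversion from the question to the given column.
    public static Dataset<Row> asJsonString(Dataset<Row> df, String column) {
        return df.withColumn(column, functions.to_json(functions.col(column)));
    }

    // Returns the Spark data type of a column as recorded in the DataFrame schema.
    public static DataType typeOf(Dataset<Row> df, String column) {
        return df.schema().apply(column).dataType();
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("json-column-types")
                .getOrCreate();

        Dataset<Row> structDf = sample(spark);
        Dataset<Row> stringDf = asJsonString(structDf, "JSON_COLUMN");

        // Print the Spark type name of JSON_COLUMN before and after to_json.
        System.out.println(typeOf(structDf, "JSON_COLUMN").typeName());
        System.out.println(typeOf(stringDf, "JSON_COLUMN").typeName());

        spark.stop();
    }
}
```

Neither Spark type is a JSON type, because Spark SQL has no native JSON data type. The connector therefore needs some extra signal to know that a string column should be stored as JSON.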

I know BigQuery supports JSON columns, but is there currently any way to write to them from Java Spark?

1 Answer

As @DavidRabinowitz mentioned in a comment, a feature for inserting JSON type data into BigQuery using spark-bigquery-connector will be released soon.

All updates regarding BigQuery features will be published in this document.
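For readers arriving after that release: to the best of my knowledge, the connector's README describes JSON writes as requiring the indirect write method, the Avro intermediate format, and a String field whose metadata contains sqlType=JSON. The sketch below is written under that assumption and has not been checked against a specific connector version, so verify it against the README of the version you use. The class name JsonColumnWrite, the helper names, and the table and bucket arguments are hypothetical placeholders.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.MetadataBuilder;

public class JsonColumnWrite {
    // Re-aliases a string column with sqlType=JSON metadata.
    // Assumption: the connector reads this metadata entry to pick the JSON type.
    public static Dataset<Row> tagAsJson(Dataset<Row> df, String column) {
        Metadata jsonMeta = new MetadataBuilder().putString("sqlType", "JSON").build();
        return df.withColumn(column, functions.col(column).as(column, jsonMeta));
    }

    // Writes the DataFrame to BigQuery; JSON_COLUMN must already hold JSON text,
    // for example the output of functions.to_json.
    public static void write(Dataset<Row> df, String table, String gcsBucket) {
        tagAsJson(df, "JSON_COLUMN")
                .write()
                .format("bigquery")
                .option("writeMethod", "indirect")      // assumed requirement for JSON
                .option("intermediateFormat", "avro")   // assumed requirement for JSON
                .option("temporaryGcsBucket", gcsBucket)
                .mode("append")
                .save(table);
    }
}
```

A call would look like JsonColumnWrite.write(dataframe, "my_dataset.my_table", "my-temp-bucket"), where both names are placeholders. Keeping the metadata tagging in its own method lets you inspect the schema locally before attempting a real write.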

Posting this answer as a community wiki for the benefit of others who might encounter this use case in the future.

Feel free to edit this answer to add information.


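Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please cite this site's URL or the original post's address. For any questions, contact yoyou2525@163.com.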
