Writing to a JSON column type in BigQuery using Spark
I have a column of type JSON in my BigQuery schema definition. I want to write to this from a Java Spark pipeline but I cannot seem to find a way that this is possible.
If I create a Struct of the JSON it results in a RECORD type. And if I use to_json like below, it converts into a STRING type.
dataframe = dataframe.withColumn("JSON_COLUMN", functions.to_json(functions.col("JSON_COLUMN")));
I know BigQuery has support for JSON columns, but is there any way to write to them with Java Spark currently?
As @DavidRabinowitz mentioned in the comment, the feature to insert JSON type data into BigQuery using the spark-bigquery-connector will be released soon.
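In the meantime, a common workaround is to serialize the struct to a JSON string with to_json and load it as a STRING column, which BigQuery can later convert with PARSE_JSON. A minimal sketch in Java, assuming hypothetical input path, table name, and GCS bucket (these placeholders, and the availability of the connector on the classpath, are assumptions, not part of the original question):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;

public class WriteJsonColumn {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("write-json-column")
                .getOrCreate();

        // Hypothetical input source containing a struct column "JSON_COLUMN"
        Dataset<Row> dataframe = spark.read().parquet("gs://my-bucket/input");

        // Serialize the struct to a JSON string; it lands in BigQuery as
        // STRING until the connector supports writing the JSON type directly
        dataframe = dataframe.withColumn("JSON_COLUMN",
                functions.to_json(functions.col("JSON_COLUMN")));

        dataframe.write()
                .format("bigquery")
                .option("table", "my_dataset.my_table")          // hypothetical table
                .option("temporaryGcsBucket", "my-temp-bucket")  // hypothetical bucket
                .mode("append")
                .save();
    }
}
```

On the BigQuery side, a query such as SELECT PARSE_JSON(JSON_COLUMN) can then turn the stored string back into a JSON value.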
All the updates regarding the BigQuery features will be updated in this document.
Posting the answer as community wiki for the benefit of the community that might encounter this use case in the future.

Feel free to edit this answer for additional information.