BigQuery saves Float value 0.0 and Boolean false as null

Question

I am using a Cloud Function (Python 3.10 runtime) to receive the following JSON payload, encode it against a protobuf schema, and publish it to a Pub/Sub topic that writes the data to BigQuery.

Payload

{
  "data": [
    {
      "user_id": "XY25999A",
      "firstname": "John",
      "lastname": "Doe",
      "fee": 20.00,
      "is_active": false
    },
    {
      "user_id": "XY26999B",
      "firstname": "Sam",
      "lastname": "Foo",
      "fee": 0.00,
      "is_active": true
    },
    {
      "user_id": "XY27999C",
      "firstname": "Kay",
      "lastname": "Bent",
      "fee": 20.00,
      "is_active": true
    }
  ]
}

JSON schema

{
    "type":"object",
    "properties":{
       "user_id":{
          "type":"string"
       },
       "firstname":{
          "type":"string"
       },
       "lastname":{
          "type":"string"
       },
       "fee":{
          "type":"number"
       },
       "is_active":{
          "type":"boolean"
       }
    }
}

Protobuf schema

message ProtoSchema {
    string user_id = 1;
    string firstname = 2;
    string lastname = 3;
    double fee = 4;
    bool is_active = 5;
}

When the data lands in BigQuery, John's is_active and Sam's fee both show null instead of false and 0.0, respectively.

user_id	firstname	lastname	fee	is_active
XY25999A	John	Doe	20.00	null
XY26999B	Sam	Foo	null	true
XY27999C	Kay	Bent	20.00	true

Is there a reason or explanation for this behavior?

Answer 1

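proto3 has no required label; singular scalar fields such as fee and is_active have implicit presence unless they are explicitly declared optional. To keep the encoded message small, a field whose value equals its type's default (false for bool, 0 for double) is not written into the message at all. I guess this is what the Cloud Function documentation leaves out: fields holding their default value are simply never set in the published message, so they arrive as missing and end up as null. The data processor should therefore substitute the default values for missing fields.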
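As a rough illustration of the above, here is a minimal sketch in plain Python. It does not use the protobuf library; `PROTO3_DEFAULTS`, `drop_default_fields`, and `fill_missing_fields` are hypothetical names made up for this example, and the first function only imitates the way a proto3 encoder leaves out default-valued fields.

```python
# Hypothetical stand-in for ProtoSchema: field name -> proto3 default value.
PROTO3_DEFAULTS = {
    "user_id": "",
    "firstname": "",
    "lastname": "",
    "fee": 0.0,
    "is_active": False,
}


def drop_default_fields(record):
    """Imitate proto3 implicit presence: omit fields equal to their default."""
    return {
        name: value
        for name, value in record.items()
        if value != PROTO3_DEFAULTS[name]
    }


def fill_missing_fields(record):
    """Consumer side: restore the proto3 default for every absent field."""
    return {
        name: record.get(name, default)
        for name, default in PROTO3_DEFAULTS.items()
    }


sam = {
    "user_id": "XY26999B",
    "firstname": "Sam",
    "lastname": "Foo",
    "fee": 0.00,
    "is_active": True,
}

encoded = drop_default_fields(sam)        # "fee" is no longer present
restored = fill_missing_fields(encoded)   # "fee" is 0.0 again

print(encoded)
print(restored)
```

Applying the second step (or its equivalent) wherever the message is decoded, before the row reaches BigQuery, should give 0.0 and false instead of null, assuming a missing field really does mean "default" in your data.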

Solution 1
0 2022-12-08 15:06:35
