Extract data from complex JSON in BigQuery

I have data that looks like this:

{aa_validation: null 
 propensity_overlap: {auc pscore overlap: 0.5993614555297898 
                      auc pscore treated: 1.000000000000001 
                      auc pscore control: 1.0000000000000004
                      auc pscore ROC: 0.7524618788923345} 
 feature_balance: {% features with post matching SMD < 0.1: 100.0 
                   % features with post matching SMD < 0.25: 100.0 
                   % features with SMD improved after matching: 84.21052631578947 
                   % features with SMD not significantly worsened: 100.0}}

I want to use BigQuery to create a column for each of these keys, so that I get a result like this:

auc pscore overlap   auc pscore overlap...   % features with post matching SMD < 0.1   % features with post matching SMD < 0.25 ....

      0.32                    1                        50.0                      50.0

I have been going crazy trying REGEXP_EXTRACT but can't seem to make it work. Can anyone help me extract this using BigQuery?

This JSON schema is not suitable for BigQuery. You need to change the keys in order to be able to extract them properly.

A key such as "% features with post matching SMD < 0.1", for example, is not going to work in a plain dot-notation path passed to the JSON_EXTRACT function, as you can see here:

[Screenshot: invalid key on sample query]
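
For reference, the BigQuery documentation describes an escaping syntax for keys that contain characters not valid in JSONPath: single quotes inside brackets for the JSON_EXTRACT family, and double quotes for JSON_VALUE. The following is only a minimal sketch, assuming the stored value is valid JSON with quoted keys (the sample in the question is not); the inline row is made up for illustration. Renaming the keys, as suggested next, avoids the escaping altogether.

```sql
-- Hypothetical inline row holding one of the awkward keys from the question.
WITH sample AS (
  SELECT '{"feature_balance": {"% features with post matching SMD < 0.1": 100.0}}' AS json_field
)
SELECT
  -- JSON_EXTRACT family: escape the key with brackets and single quotes.
  JSON_EXTRACT_SCALAR(
    json_field,
    "$.feature_balance['% features with post matching SMD < 0.1']"
  ) AS smd_lt_0_1_bracket_form,
  -- JSON_VALUE: escape the key with double quotes.
  JSON_VALUE(
    json_field,
    '$.feature_balance."% features with post matching SMD < 0.1"'
  ) AS smd_lt_0_1_quoted_form
FROM sample;
```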

Use different keys and then you will be able to run queries like this one:

    -- Assumes json_field is a STRING column holding valid JSON with the renamed keys.
    SELECT
      JSON_EXTRACT(PARSE_JSON(json_field), "$.aa_validation") AS aa_validation,
      JSON_EXTRACT(PARSE_JSON(json_field), "$.feature_balance") AS feature_balance,
      JSON_EXTRACT(PARSE_JSON(json_field), "$.feature_balance.features_with_smd_improved_after_matching") AS smd_improved_after_matching
    FROM `qwiklabs-gcp-03-5570739e32d7.data.test2`

This takes advantage of PARSE_JSON and JSON_EXTRACT combined with JSONPath expressions.
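
To get one column per key, as the question asks, the same pattern can be repeated for every field. Below is a minimal sketch, not a definitive implementation. It assumes the keys have been renamed to snake_case; apart from features_with_smd_improved_after_matching, the key names are hypothetical. It reads from an inline sample row rather than a real table, and it calls JSON_EXTRACT_SCALAR directly on the string instead of PARSE_JSON plus JSON_EXTRACT, so each value comes back as a STRING that can be cast to FLOAT64.

```sql
-- Inline sample row; the renamed keys are hypothetical, chosen for illustration.
WITH sample AS (
  SELECT '''
  {
    "aa_validation": null,
    "propensity_overlap": {
      "auc_pscore_overlap": 0.5993614555297898,
      "auc_pscore_treated": 1.000000000000001,
      "auc_pscore_control": 1.0000000000000004,
      "auc_pscore_roc": 0.7524618788923345
    },
    "feature_balance": {
      "features_with_post_matching_smd_lt_0_1": 100.0,
      "features_with_post_matching_smd_lt_0_25": 100.0,
      "features_with_smd_improved_after_matching": 84.21052631578947,
      "features_with_smd_not_significantly_worsened": 100.0
    }
  }
  ''' AS json_field
)
SELECT
  -- JSON_EXTRACT_SCALAR returns a STRING, so each value is cast to FLOAT64.
  SAFE_CAST(JSON_EXTRACT_SCALAR(json_field, '$.propensity_overlap.auc_pscore_overlap') AS FLOAT64) AS auc_pscore_overlap,
  SAFE_CAST(JSON_EXTRACT_SCALAR(json_field, '$.propensity_overlap.auc_pscore_treated') AS FLOAT64) AS auc_pscore_treated,
  SAFE_CAST(JSON_EXTRACT_SCALAR(json_field, '$.propensity_overlap.auc_pscore_control') AS FLOAT64) AS auc_pscore_control,
  SAFE_CAST(JSON_EXTRACT_SCALAR(json_field, '$.propensity_overlap.auc_pscore_roc') AS FLOAT64) AS auc_pscore_roc,
  SAFE_CAST(JSON_EXTRACT_SCALAR(json_field, '$.feature_balance.features_with_post_matching_smd_lt_0_1') AS FLOAT64) AS features_with_post_matching_smd_lt_0_1,
  SAFE_CAST(JSON_EXTRACT_SCALAR(json_field, '$.feature_balance.features_with_post_matching_smd_lt_0_25') AS FLOAT64) AS features_with_post_matching_smd_lt_0_25,
  SAFE_CAST(JSON_EXTRACT_SCALAR(json_field, '$.feature_balance.features_with_smd_improved_after_matching') AS FLOAT64) AS features_with_smd_improved_after_matching,
  SAFE_CAST(JSON_EXTRACT_SCALAR(json_field, '$.feature_balance.features_with_smd_not_significantly_worsened') AS FLOAT64) AS features_with_smd_not_significantly_worsened
FROM sample;
```

To run this against real data, replace the `sample` CTE with the actual table and column.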
