Extracting data from a column

I have data that looks like this:

id            a_json
111       {key:A,values:[123,2345,2345,456,78,9]}
222       {key:A,values:[1112, 323, 11, 11]}

I want to extract the unique numbers inside the square brackets (values). This is what I tried:

SELECT
  id,
  REGEXP_EXTRACT_ALL(a_json, r'([0-9]+)*(,[0-9]+)*'),
  a_json
FROM 
`project.dataset.table`
WHERE
  a_json like  "%values%"
GROUP BY
  id,
  a_json

But this gives me the following error:

Regular expression passed to REGEXP_EXTRACT_ALL must not have more than 1 capturing group

I want the result to look like this:

id            a_json                                  numbers
111       {key:A,values:[123,2345,2345,456,78,9]}   123,2345,456,78,9
222       {key:A,values:[1112,323,11,11]}           1112,323,11

Is this doable?

Use the regular expression below instead. It has no capturing group, so each full match is returned:

REGEXP_EXTRACT_ALL(a_json, r'\d+') as numbers  

In this case the output will be:

Row id  a_json                              numbers  
1   111 {key:A,values:[123,2345,456,78,9]}  123  
                                            2345     
                                            456  
                                            78   
                                            9    
2   222 {key:A,values:[1112, 323, 11]}      1112     
                                            323  
                                            11     
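
For reference, here is a minimal self-contained sketch of that suggestion in BigQuery standard SQL. The CTE is only an assumption that stands in for the real table, reusing the sample rows from the question. REGEXP_EXTRACT_ALL returns an ARRAY<STRING>, which the console displays one element per line, as in the output above.

```sql
#standardSQL
-- Inline sample rows standing in for the real table (assumed to have the same shape as in the question).
WITH `project.dataset.table` AS (
  SELECT 111 id, '{key:A,values:[123,2345,2345,456,78,9]}' a_json UNION ALL
  SELECT 222, '{key:A,values:[1112, 323, 11, 11]}'
)
SELECT
  id,
  a_json,
  -- The pattern has no capturing group, so every full match of \d+ is returned.
  REGEXP_EXTRACT_ALL(a_json, r'\d+') AS numbers
FROM `project.dataset.table`
WHERE a_json LIKE '%values%'
```

Duplicates are still present at this stage, and digits anywhere else in the string (for example in the key) would be picked up as well.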

As an alternative, you can use the version below. In this case you can omit WHERE a_json like "%values%":

SPLIT(REGEXP_EXTRACT(a_json, r'values:\[(.*)]')) numbers    

with exactly the same output.

I want to see the result in comma-separated format. Also, I forgot to mention in the question that I need to keep only the unique values while doing this.

The simple adjustments below will do the trick:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 111 id, '{key:A,values:[123,2345,2345,456,78,9]}' a_json UNION ALL
  SELECT 222, '{key:A,values:[1112, 323, 11, 11]}' 
)
SELECT id, a_json,
  (SELECT STRING_AGG(DISTINCT number) FROM UNNEST(SPLIT(REGEXP_EXTRACT(a_json, r'values:\[(.*)]'))) number) numbers
FROM `project.dataset.table`   

with output:

Row id      a_json                                      numbers  
1   111     {key:A,values:[123,2345,2345,456,78,9]}     123,2345,456,78,9    
2   222     {key:A,values:[1112, 323, 11, 11]}          1112, 323, 11   
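
Note that SPLIT keeps the space after each comma, which is why the second row shows "1112, 323, 11". It also means that '11' and ' 11' would count as different values if only some items were preceded by a space. A hedged variant, assuming the padding is not wanted, trims each item before deduplicating. The CTE is again only stand-in sample data:

```sql
#standardSQL
-- Inline sample rows standing in for the real table (assumption, copied from the question).
WITH `project.dataset.table` AS (
  SELECT 111 id, '{key:A,values:[123,2345,2345,456,78,9]}' a_json UNION ALL
  SELECT 222, '{key:A,values:[1112, 323, 11, 11]}'
)
SELECT id, a_json,
  -- TRIM removes the spaces left by SPLIT so that DISTINCT compares the bare numbers.
  (SELECT STRING_AGG(DISTINCT TRIM(number))
   FROM UNNEST(SPLIT(REGEXP_EXTRACT(a_json, r'values:\[(.*)]'))) number) numbers
FROM `project.dataset.table`
```

STRING_AGG does not guarantee the order of the concatenated values unless an ORDER BY clause is given, so the numbers may not appear in their original order.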
