
How to extract all numerical values from a string column in a BigQuery table and insert them into new numerical columns?

Let's say I have a table like temp_table:

CREATE TABLE `YOUR_DATASET.temp_table` (
  `F1` STRING,
  `F2` INT64,
  `F3` STRING,
);

And this table includes some data:

INSERT `YOUR_DATASET.temp_table` (F1, F2, F3)
VALUES('45FG67', 10, 'This stri98ng includes 10/15 numbers .9'),
      ('45FG67', 10, 'This string includes 100 and 0'),
      ('95pp7', 30, 'This string includes .8 and 1_number'),
      ('45FG67', 12, '45'),
      ('45FG67', 12,NULL),
      ('95pp7', 30, NULL),
      ('95pp7', 5, '10 & 54.2')

This would create the temp_table as:

SELECT * FROM `YOUR_DATASET.temp_table`

I would like to write a BigQuery script to extract all numerical values in F3 and append them as new numerical columns to temp_table. The number of new numerical columns should equal the maximum number of numerical values found in any single F3 value. In this example table, temp_table, 4 new numerical columns should be added, because F3 for row number 5 is 'This stri98ng includes 10/15 numbers .9' and it includes 4 numerical values: 98, 10, 15, 0.9. As another example, the values of these 4 numerical columns for row number 6 would be 45, null, null, null.

Note: I asked a similar question here. That solution works for the general question I asked there, but it doesn't work for the problem described above.

Use the query below:

select * from (
  select F1, F2, F3, offset + 1 as offset, num
  from your_table 
  left join unnest(regexp_extract_all(F3, r'([\d\.]+)')) num with offset
)
pivot (min(num) as numerical_val for offset in (1,2,3,4))     

Applied to the sample data in your question, the output is:

[image: query output]
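One thing to keep in mind: regexp_extract_all returns strings, so the pivoted columns from the query above are STRING, not numeric. Below is a minimal sketch of a variant that yields FLOAT64 columns, assuming FLOAT64 is an acceptable numeric type for your case. The SAFE_CAST and the tighter pattern are my additions for illustration, not part of the original query. The pattern is changed so that a lone '.' (for example a full stop at the end of a sentence) is not picked up as a value.

```sql
select * from (
  select F1, F2, F3,
    offset + 1 as offset,                    -- 1-based position of the match
    safe_cast(num as float64) as num         -- NULL instead of an error on a bad match
  from `YOUR_DATASET.temp_table`
  -- assumed pattern: optional integer part, optional dot, at least one digit
  left join unnest(regexp_extract_all(F3, r'\d*\.?\d+')) num with offset
)
pivot (min(num) as numerical_val for offset in (1, 2, 3, 4))
```

The new columns come out as numerical_val_1 to numerical_val_4. Since PIVOT groups by the remaining columns (F1, F2, F3), rows that are identical in all three are collapsed into one row, so add a unique key to the inner select if the real table can contain such duplicates.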

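The IN list (1,2,3,4) is hard-coded, while the question asks for as many columns as the largest number of values found in F3. One way to handle that is to build the list in a script and run the query with EXECUTE IMMEDIATE. This is only a sketch under assumptions: pos_list is a name I made up, the table name is taken from the question, and it assumes at least one row of F3 contains a number (otherwise pos_list is NULL and the dynamic statement fails).

```sql
-- Build the list "1, 2, ..., N", where N is the largest number of
-- matches found in any single F3 value.
declare pos_list string default (
  select string_agg(cast(pos as string), ', ' order by pos)
  from unnest(generate_array(1, (
    select max(array_length(regexp_extract_all(F3, r'([\d\.]+)')))
    from `YOUR_DATASET.temp_table`
  ))) as pos
);

-- Same query as in the answer, with the IN list filled in at run time.
-- The outer string is a raw string so the backslashes in the regex survive.
execute immediate format(r"""
  select * from (
    select F1, F2, F3, offset + 1 as offset, num
    from `YOUR_DATASET.temp_table`
    left join unnest(regexp_extract_all(F3, r'([\d\.]+)')) num with offset
  )
  pivot (min(num) as numerical_val for offset in (%s))
""", pos_list);
```

For the sample data, pos_list evaluates to '1, 2, 3, 4', so the dynamic statement is the same query as the hard-coded one.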