I have a Hive query that uses str_to_map(), and I need to convert it to BigQuery. Since BigQuery has no map type, I am looking for another way to build a map-like (key-value) structure and then extract values by key name.
Example:
Select str_to_map('cars:0,kids:143,cats:1,lost:0,win:1,chances:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0,missed:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0',',',':')
If I call the key 'cars' I get the value '0'. If I call the key 'chances' I should get '0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0'.
It's necessary for me to have a type like the 'map' type (key-value).
Thank you
Google provides some useful UDFs for BigQuery in bigquery-utils, so there is no need to reinvent the wheel. Two of them answer this question:
get_value: given a key and a list of key-value maps in the form [{'key': 'a', 'value': 'aaa'}], returns the SCALAR type value.
cw_map_parse: converts a string to a map.
With these, you can write a query like the one below:
SELECT get_value('kids', cw_map_parse(str, ',', ':')) kids,
       get_value('chances', cw_map_parse(str, ',', ':')) chances,
FROM UNNEST(['cars:0,kids:143,cats:1,lost:0,win:1,chances:0,missed:0']) str;
+------+---------+
| kids | chances |
+------+---------+
| 143  | 0       |
+------+---------+
However, because of the requirement below, the cw_map_parse implementation needs to be customized a little:
If I call the key 'cars' I get the value '0'. If I call the key 'chances' I should get '0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0'.
Below is a query with customized UDFs. str_to_map() is a customized version of cw_map_parse() that splits each pair only on the first key-value delimiter.
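-- Parse 'k1:v1,k2:v2' into an array of key/value structs.
-- m: input string, pd: pair delimiter, kvd: key-value delimiter.
CREATE TEMP FUNCTION str_to_map(m STRING, pd STRING, kvd STRING)
RETURNS ARRAY<STRUCT<key STRING, value STRING>> AS (
  ARRAY(
    SELECT AS STRUCT kv[SAFE_OFFSET(0)] AS key, kv[SAFE_OFFSET(1)] AS value
    FROM (
      -- Replace only the first kvd in each pair with '|', then split on '|',
      -- so any later kvd characters stay inside the value.
      -- Assumes the data contains no '|' and kvd has no regex metacharacters.
      SELECT SPLIT(REGEXP_REPLACE(kv, r'^(.*?)' || kvd, r'\1|'), '|') AS kv
      FROM UNNEST(SPLIT(m, pd)) AS kv
    )
  )
);
-- Return the value stored under get_key in an array of key/value structs.
CREATE TEMP FUNCTION get_value(get_key STRING, arr ANY TYPE) AS (
  (SELECT value FROM UNNEST(arr) WHERE key = get_key)
);
-- Build the map once per row, then look up each key.
SELECT get_value('cars', map) cars,
       get_value('kids', map) kids,
       get_value('chances', map) chances,
       get_value('missed', map) missed,
FROM UNNEST(['cars:0,kids:143,cats:1,lost:0,win:1,chances:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0,missed:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0']) str,
     UNNEST([STRUCT(str_to_map(str, ',', ':') AS map)]);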
+------+------+-------------------------------------+-------------------------------------+
| cars | kids | chances | missed |
+------+------+-------------------------------------+-------------------------------------+
| 0 | 143 | 0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0 | 0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0 |
+------+------+-------------------------------------+-------------------------------------+
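The regex-based split in str_to_map() uses '|' as a temporary marker and passes kvd into the pattern unescaped. If either of those assumptions is a concern for your data, one possible alternative is to find the first delimiter with STRPOS and cut the string with SUBSTR. The following is only a sketch: the name str_to_map_strpos is hypothetical, and it is meant to give the same result as str_to_map() on the sample string.

```sql
-- Hypothetical variant of str_to_map(): splits each pair on the FIRST
-- occurrence of kvd using STRPOS/SUBSTR instead of a regex.
CREATE TEMP FUNCTION str_to_map_strpos(m STRING, pd STRING, kvd STRING)
RETURNS ARRAY<STRUCT<key STRING, value STRING>> AS (
  ARRAY(
    SELECT AS STRUCT
      -- STRPOS returns 0 when kvd is absent: the whole pair becomes the key
      -- and the value is NULL.
      IF(pos = 0, kv, SUBSTR(kv, 1, pos - 1)) AS key,
      IF(pos = 0, NULL, SUBSTR(kv, pos + LENGTH(kvd))) AS value
    FROM (
      SELECT kv, STRPOS(kv, kvd) AS pos
      FROM UNNEST(SPLIT(m, pd)) AS kv
    )
  )
);

-- List the parsed pairs for a shortened sample string.
SELECT kv.key, kv.value
FROM UNNEST(str_to_map_strpos('cars:0,chances:0:0:0', ',', ':')) AS kv;
```

On this shortened input the query should list the pairs (cars, 0) and (chances, 0:0:0). The function can be passed to get_value() in the same way as str_to_map().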
Another very simple option for this particular case, where the values consist only of digits and colons, is to rewrite the string as JSON:
select
  json_value(json, '$.cars') cars,
  json_value(json, '$.kids') kids,
  json_value(json, '$.cats') cats,
  json_value(json, '$.lost') lost,
  json_value(json, '$.win') win,
  json_value(json, '$.chances') chances,
  json_value(json, '$.missed') missed
from your_table,
unnest([format('{%s}', regexp_replace(str, r'([^:,]+):([\d:]*\d)', r'"\1":"\2"'))]) json
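For the sample string from the question, the output has one column per key: cars = 0, kids = 143, cats = 1, lost = 0, win = 1, while chances and missed each return their full colon-separated string.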
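If only a few keys are needed, the value can also be pulled out with a single REGEXP_EXTRACT per key, without building a map first. This is a sketch that assumes keys and values never contain the pair delimiter ','. The sample string is shortened for readability.

```sql
-- Extract a value by key name directly from the raw string.
-- (?:^|,) anchors the key at the start of a pair; ([^,]*) captures
-- everything up to the next pair delimiter, including any extra ':'.
SELECT
  REGEXP_EXTRACT(str, r'(?:^|,)kids:([^,]*)') AS kids,
  REGEXP_EXTRACT(str, r'(?:^|,)chances:([^,]*)') AS chances
FROM UNNEST(['cars:0,kids:143,chances:0:0:0,missed:0:0']) AS str;
```

For this input, kids should come back as 143 and chances as 0:0:0. A key that is not present returns NULL.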