How to extract a JSON value in Hive
I have a JSON string stored in a single database cell, keyed by a parent ID:
{"profileState":"ACTIVE","isDefault":"true","joinedOn":"2019-03-24T15:19:52.639Z","profileType":"ADULT","id":"abc","signupDeviceId":"1"}||{"profileState":"ACTIVE","isDefault":"true","joinedOn":"2021-09-05T07:47:00.245Z","imageId":"19","profileType":"KIDS","name":"Kids","id":"efg","signupDeviceId":"1"}
I want to extract the id of each profile from the JSON above. Suppose the data looks like this:
Parent ID | Profile JSON
1 | {profile_json} (see above string)
I want the output to look like this:
Parent ID | ID
1 | abc
1 | efg
I have tried a couple of approaches to solve this.
First approach:
select
  get_json_object(p.profile, '$$.id') as id,
  test.parent_id
from (
  select
    split(
      regexp_replace(
        regexp_extract(profiles, '^\\[(.+)\\]$$', 1),
        '\\}\\,\\{', '\\}\\|\\|\\{'),
      '\\|\\|') as profile_list,
    parent_id
  from source_table
) test
lateral view explode(test.profile_list) p as profile
But the id column this returns contains only NULL values. Am I missing something here?
Second approach:
with profiles as(
select regexp_replace(
regexp_extract(profiles, '^\\[(.+)\\]$$',1),
'\\}\\,\\{', '\\}\\|\\|\\{') as profile_list,
parent_id
from source_table
)
SELECT
get_json_object (t1.profile_list,'$.id')
FROM profiles t1
The second approach returns only the first id (abc) from the JSON string above.
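The second approach never splits the string, so get_json_object is called once per row and can return at most one id per parent. A minimal sketch of how it could be adapted follows. It assumes, as the regexp_extract pattern implies, that the stored value is a bracketed JSON array of objects; the column alias profile and the table alias p are illustrative names only.

```sql
-- Sketch only: assumes source_table.profiles holds a string like [{...},{...}].
with profiles as (
  select
    split(
      regexp_replace(
        -- strip the surrounding [ and ]
        regexp_extract(profiles, '^\\[(.+)\\]$', 1),
        -- rewrite the },{ boundary between objects as }||{
        '\\}\\,\\{', '\\}\\|\\|\\{'),
      -- split on the literal || separator into an array of JSON strings
      '\\|\\|') as profile_list,
    parent_id
  from source_table
)
select
  t1.parent_id,
  get_json_object(p.profile, '$.id') as id
from profiles t1
-- one output row per element of profile_list
lateral view explode(t1.profile_list) p as profile;
```

The only change from the original second approach is the added split and lateral view explode, so each JSON object is parsed separately.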
I tried to reproduce this in Apache Hive v4.
Data:
+----------------------------------------------------+------------------+
| data | parent_id |
+----------------------------------------------------+------------------+
| {"profileState":"ACTIVE","isDefault":"true","joinedOn":"2019-03-24T15:19:52.639Z","profileType":"ADULT","id":"abc","signupDeviceId":"1"}||{"profileState":"ACTIVE","isDefault":"true","joinedOn":"2021-09-05T07:47:00.245Z","imageId":"19","profileType":"KIDS","name":"Kids","id":"efg","signupDeviceId":"1"} | 1.0 |
+----------------------------------------------------+------------------+
SQL:
select pid,get_json_object(expl_jid,'$.id') json_id from
(select parent_id pid,split(data,'\\|\\|') jid from tabl1)a
lateral view explode(jid) exp_tab as expl_jid;
+------+----------+
| pid | json_id |
+------+----------+
| 1.0 | abc |
| 1.0 | efg |
+------+----------+
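To rerun the reproduction above, the test table could be set up roughly as follows. The table and column names come from the query; the column types are assumptions (parent_id prints as 1.0, so a DOUBLE is used here).

```sql
-- Hypothetical setup for tabl1; column types are assumed, not taken from the answer.
create table tabl1 (
  data string,
  parent_id double
);

-- The two JSON objects are separated by a literal ||, as in the data shown above.
insert into tabl1 values (
  '{"profileState":"ACTIVE","isDefault":"true","joinedOn":"2019-03-24T15:19:52.639Z","profileType":"ADULT","id":"abc","signupDeviceId":"1"}||{"profileState":"ACTIVE","isDefault":"true","joinedOn":"2021-09-05T07:47:00.245Z","imageId":"19","profileType":"KIDS","name":"Kids","id":"efg","signupDeviceId":"1"}',
  1.0
);
```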
Solved. In the first approach, use a single $ in the JSON path passed to get_json_object ('$.id' instead of '$$.id'):
select
  get_json_object(p.profile, '$.id') as id,
  test.parent_id
from (
  select
    split(
      regexp_replace(
        regexp_extract(profiles, '^\\[(.+)\\]$$', 1),
        '\\}\\,\\{', '\\}\\|\\|\\{'),
      '\\|\\|') as profile_list,
    parent_id
  from source_table
) test
lateral view explode(test.profile_list) p as profile;
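The effect of the path prefix can be checked in isolation with a literal JSON string. This is a small sketch: the column aliases are illustrative, and the inline JSON is a shortened stand-in for one profile object.

```sql
-- A JSON path for get_json_object must start with a single $ (the root object).
select
  get_json_object('{"id":"abc"}', '$$.id') as with_double_dollar,  -- invalid path
  get_json_object('{"id":"abc"}', '$.id')  as with_single_dollar;  -- valid path
```

Consistent with the behavior reported in the question, the first column should come back as NULL and the second as abc.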