I have a column which contains a large JSON object. For example, let's call the column Column1, and this is a typical element:
{"key1":value,"key2":[{"subK11":val,"subK12":val},{"subK21":val,"subK22":val}]}
So, I can extract a normal element fine using:
select get_json_object(Column1,'$.key1') as key1
But I have been unable to figure out how to extract the ARRAY in a usable form, as this:
select get_json_object(Column1,'$.key2') as key2
Returns a STRING type. So I can't select elements from the array like normal. That is, this query will fail:
select key2[1] as first_element
from
(select get_json_object(Column1,'$.key2') as key2)
OR
select explode(key2)
from
(select get_json_object(Column1,'$.key2') as key2 )
Both give errors; the latter says "explode() requires array type". So the issue, I think, is that get_json_object returns a string. I need it to recognize that key2 contains an ARRAY, but I have no idea how to do that.
I'm new to Hive SQL, mainly an SQL user, so please let me know if there's anything crazy obvious I'm missing. I have not found a solution to this type of problem on any of the other questions.
You can use hive-third-functions. It provides a json_array_extract function, so you can extract JSON array contents like this:
json_array_extract("[{\"a\":{\"b\":\"13\"}}, {\"a\":{\"b\":\"18\"}}, {\"a\":{\"b\":\"12\"}}]", "$.a.b") => ["\"13\"","\"18\"","\"12\""]
json_array_extract_scalar("[{\"a\":{\"b\":\"13\"}}, {\"a\":{\"b\":\"18\"}}, {\"a\":{\"b\":\"12\"}}]", "$.a.b") => ["13","18","12"]
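If installing a UDF library isn't an option, here is a rough sketch using only built-in functions. It assumes a hypothetical table name (my_table), that key2 is a flat array of objects with no nested objects or arrays inside the elements, and that the string "~~" never appears in the data (it is an arbitrary delimiter picked for this example). Treat it as a workaround under those assumptions, not a general JSON parser.

```sql
-- my_table is a hypothetical table name; Column1 holds the JSON string from the question.

-- 1) Pick a single element by position. The JSON path accepted by
--    get_json_object supports [n] subscripts (0-based), so no ARRAY type is needed.
SELECT get_json_object(Column1, '$.key2[0]')        AS first_element,
       get_json_object(Column1, '$.key2[0].subK11') AS first_subk11
FROM my_table;

-- 2) One row per array element: strip the outer [ ], mark the "},{" boundaries
--    with the assumed delimiter "~~", split on it, then explode the resulting ARRAY<STRING>.
SELECT elem
FROM my_table
LATERAL VIEW explode(
  split(
    regexp_replace(
      regexp_replace(get_json_object(Column1, '$.key2'), '^\\[|\\]$', ''),
      '\\}\\s*,\\s*\\{', '}~~{'),
    '~~')
) t AS elem;
```

Each elem is still a STRING holding one JSON object, so its fields can be read with get_json_object(elem, '$.subK11') in the same SELECT.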