
How to apply the regexp_extract function more than once in Hive SQL

I have a query like this:

select 
name,
split(name, ' ')[0] as name_1
from table

I would like to apply a regular expression twice on the same column. Is that possible?

What I have tried, which did not work, is:

select 
name,
split(name, ' ')[0] as name_1,
split(regexp_extract(name, "^(.*?)\\s(.*)",2), ' ')[0] as name_1,
from table

For example, given Name: Mark Bill Gates Potter, the expected result is: Gates Potter

Given Address: Gate 294 st.1.4 Arizona, the expected result is: st. 1.4 Arizona

The name column holds the full name (e.g. Mark Bill Gates Potter). My idea is to keep everything from the 3rd word to the end of the string (e.g. Gates Potter). How would this be possible?

I'm working in Hive.

"My idea is to keep everything from the 3rd word to the end of the string (e.g. Gates Potter)"

That could be:

split(name, ' ')[2] || ' ' || split(name, ' ')[3] as name_3_4
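
On Hive releases that predate the || operator (as far as I know it arrived in 2.2.0), the same expression could be written with concat. A minimal sketch, where the inline subquery and its sample row are made up for illustration and stand in for the real table:

```sql
-- Sketch: join words 3 and 4 with concat() instead of ||.
-- The subquery "t" is a hypothetical one-row stand-in for the real table.
select
  name,
  concat(split(name, ' ')[2], ' ', split(name, ' ')[3]) as name_3_4
from (select 'Mark Bill Gates Potter' as name) t;
-- name_3_4 -> Gates Potter
```

Note that this keeps exactly two words. For a name with fewer than four words the out-of-range array index should yield NULL, which makes the whole concat NULL.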

If you really want everything from the third word, a regex is better:

regexp_replace(name, '^\\S+\\s*\\S+\\s*', '') as name_3_and_more

The idea here is to replace the first two words (and the whitespace after them) with the empty string, so that only the third word onward remains.
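
Put together as a full query, this might look like the sketch below. It applies the same pattern to both columns from the question. The inline subquery with the two sample values is an assumption for illustration, and the alias address_3_and_more is a made-up name:

```sql
-- Sketch: drop the first two words of each column and keep the rest.
-- The subquery "t" is a hypothetical stand-in for the real table.
select
  name,
  regexp_replace(name,    '^\\S+\\s*\\S+\\s*', '') as name_3_and_more,
  address,
  regexp_replace(address, '^\\S+\\s*\\S+\\s*', '') as address_3_and_more
from (
  select 'Mark Bill Gates Potter'  as name,
         'Gate 294 st.1.4 Arizona' as address
) t;
-- name_3_and_more    -> Gates Potter
-- address_3_and_more -> st.1.4 Arizona
```

Because the pattern uses \\s* rather than \\s+, a value with only one or two words is removed entirely. Switching to \\s+ would leave such short values untouched, if that behaviour is preferred. Calls can also be nested (one regexp function wrapped around another) when a second pass over the result is really needed.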
