
Extracting text after certain characters in a string in Hive

Each row contains multiple IDs, and I want to extract the number that follows a certain piece of text in each row. Example values in the id column:

test123; tghy876; 8906; TT-1234
best123; tghy8656; 88706; TT-5678

I want the output to contain only the number that comes after TT-, so it should look like this:

1234
5678

What Hive code should be used to achieve this?

Use regexp_extract to return the first capture group: one or more digits following '; TT-'.

select regexp_extract(src_string,'\; TT-(\\d+)',1);

Demo:

hive> select regexp_extract('best123\; tghy8656\; 88706\; TT-5678','\; TT-(\\d+)',1);
OK
5678
Time taken: 0.056 seconds, Fetched: 1 row(s)
hive>
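
If the prefix can show up in either case (tt- or TT-) or is not always preceded by '; ', a more permissive pattern may be easier to work with. This is only a sketch: my_table is a hypothetical table name, id is the column from the question, and it assumes the Java inline flag (?i) is honored, which should hold because Hive's regexp functions are built on Java regular expressions.

```sql
-- my_table is a hypothetical table name; id is the column described in the question.
-- (?i) makes the match case-insensitive, so both 'tt-' and 'TT-' are accepted.
-- (\\d+) captures one or more digits; the final argument 1 selects that capture group.
select id,
       regexp_extract(id, '(?i)tt-(\\d+)', 1) as tt_number
from my_table;
```

Leaving the semicolon out of the pattern also sidesteps the \; escaping used in the statements above.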

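I was able to do it using nested split calls. The split pattern is case-sensitive, so it has to match the case used in the data (TT- in the sample rows above).

split(split(id,"TT-")[1],";")[0]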

