I have data in a column that looks like this:
Countryside_Video_-_A18-49_Pub_- Q3 -_Flight_7_18_49_BOTH
Countryside Video - M18-25 Validated -Q4 - Flight 1
PremiumBrand_2019_Upfront_Video_-_W18-49_Validated_-_Q4_Flight_1_18_49_FEMALE
Travel Around the World - W25-54 Validated - Q3 25-54_FEMALE
I need to extract the age and gender value from each string:
- A18-49
- M18-25
- W18-49
- W25-54
It's tricky, because there could be any combination of one of the letters A, M, or W and a number range. The letters signify Age, Male, or Female (W in the samples above). The number range is the age range.
From some googling, it looks like I might be able to use a regexp_extract function, but I'm a novice to Hive. Any help on this would be greatly appreciated!
I don't have Hive on hand to test this, but it might work:
select regexp_extract(col, '([AMW][0-9]{2}[-][0-9]{2})', 1)
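Since the query above is untested, here is a quick sanity check of the same pattern against the four sample strings from the question. It is only a sketch in Python rather than Hive: Hive uses Java regular expressions, and I am assuming that a pattern this simple behaves the same in both engines. The helper name extract_age_gender is made up for illustration.

```python
import re

# Same pattern as in the Hive query; group 1 is the letter plus age range.
PATTERN = re.compile(r"([AMW][0-9]{2}-[0-9]{2})")

# The four sample values from the question.
samples = [
    "Countryside_Video_-_A18-49_Pub_- Q3 -_Flight_7_18_49_BOTH",
    "Countryside Video - M18-25 Validated -Q4 - Flight 1",
    "PremiumBrand_2019_Upfront_Video_-_W18-49_Validated_-_Q4_Flight_1_18_49_FEMALE",
    "Travel Around the World - W25-54 Validated - Q3 25-54_FEMALE",
]


def extract_age_gender(text):
    """Return the first match of group 1, or an empty string if none (hypothetical helper)."""
    match = PATTERN.search(text)
    return match.group(1) if match else ""


if __name__ == "__main__":
    for s in samples:
        print(extract_age_gender(s))
```

The pattern is case-sensitive and requires a hyphen between the two numbers, so the trailing parts such as 18_49 or a bare 25-54 with no leading letter are not picked up.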