
Hive SQL regexp_extract (number)_(number)

I'm new to Hive SQL and I'm trying to extract a value from the column col_a in the data df, which is in this format: \\\"id\\\":\\\"101_12345\\\". I only need to extract 101_12345, but the underscore makes it hard to get what I need. I tried regexp_extract(col_a, '(\\d+)[_](\\d+)'), but it only outputs 101. Could I get some help with the regexp? Thank you.

Simple solution: you don't need the two pairs of parentheses.

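Here's a working pattern: '\\d+[_]\\d+'. If you pass an explicit group index as the third argument, use 0, as in regexp_extract(col_a, '\\d+[_]\\d+', 0), which returns the complete match.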

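When you put tokens in parentheses, the regex engine captures what they match as a numbered group, separate from the complete match. So your pattern produces three results: the complete match 101_12345 (group 0), the digits before the underscore (group 1), and the digits after it (group 2). regexp_extract returns only one of these, selected by its index argument, and your call returned group 1, which is why you saw 101. To avoid this, remove the parentheses, since you don't really need them, or keep them and ask for group 0.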
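To make the group numbering concrete, here is a minimal sketch in Python. Its `re` module numbers groups the same way as the Java regex engine behind Hive's regexp_extract for a pattern like this one. The sample string and variable names are made up for illustration, and this is not Hive code.

```python
import re

# Hypothetical sample value, modeled on the column content in the question.
sample = '"id":"101_12345"'

# Same pattern as in the question; single backslashes because this is a Python raw string.
match = re.search(r'(\d+)[_](\d+)', sample)

whole = match.group(0)   # the complete match
first = match.group(1)   # digits before the underscore
second = match.group(2)  # digits after the underscore

print(whole, first, second)
```

In Hive, the third argument of regexp_extract plays the role of the number passed to `group()` here, so regexp_extract(col_a, '(\\d+)[_](\\d+)', 0) asks for the complete match.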

In the future, if you want to group part of a regex without capturing it as a separate result, use a non-capturing group, written (?:...).
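As a rough illustration of the difference, the Python sketch below wraps the whole expression in a single capturing group and uses non-capturing groups inside it, so group 1 is the full value. The sample string and names are assumptions for the example; the pattern syntax is shared with Java regex, but the snippet itself is not Hive code.

```python
import re

# Hypothetical sample value, modeled on the column content in the question.
sample = '"id":"101_12345"'

# The inner (?:...) groups are not numbered; only the outer parentheses capture.
pattern = re.compile(r'((?:\d+)_(?:\d+))')
match = pattern.search(sample)

group_count = pattern.groups  # number of capturing groups in the pattern
value = match.group(1)        # the only numbered group: the full id

print(group_count, value)
```

With only one capturing group, asking for group 1 and asking for the complete match give the same text.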

Here's a demo of what your code resulted in, hosted at regex101.com
