Extract substring with a specific pattern in Hive SQL
I have a column with the sample data below. I need to extract every substring that starts with "M6". Is there a way to do it with regexp_extract?
Data Column |
---|
HEY01230328_M6K21SG_UNO_NYC_241 |
M6EW2BJ_UNO_NYC_251 |
M6HW2WL_UNO_NYC_251 |
HEY08460329_NA_M6LAB3D_UNO_NYC_241 |
Desired Output |
---|
M6K21SG |
M6EW2BJ |
M6HW2WL |
M6LAB3D |
Try using:
SELECT REGEXP_EXTRACT(colname, ".*(M6[^_]*).*", 1) FROM tableName
Regex used:
.*(M6[^_]*).*
Explanation:
.* - matches 0+ occurrences of any character that is not a newline character
(M6[^_]*) - matches M6 followed by 0+ occurrences of any character that is not a _. So, after M6, it keeps on matching everything until it finds the next _. The parentheses store this sub-match in Group 1
.* - matches 0+ occurrences of any character that is not a newline character
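If you want to sanity-check the pattern outside Hive, here is a minimal sketch in Python. It assumes Python's `re` module treats this simple pattern the same way as the Java regex engine Hive uses; the helper name `extract_m6` is made up for illustration and is not part of Hive.

```python
import re

# Same pattern as above; group 1 holds the "M6..." token.
PATTERN = re.compile(r".*(M6[^_]*).*")

def extract_m6(value):
    """Return the captured M6 token, or None when the value has no "M6"."""
    match = PATTERN.search(value)
    return match.group(1) if match else None

# Sample rows copied from the question.
rows = [
    "HEY01230328_M6K21SG_UNO_NYC_241",
    "M6EW2BJ_UNO_NYC_251",
    "M6HW2WL_UNO_NYC_251",
    "HEY08460329_NA_M6LAB3D_UNO_NYC_241",
]

for row in rows:
    print(row, "->", extract_m6(row))
```

One thing to keep in mind: the leading .* is greedy, so if a value contains several "M6" tokens, Group 1 ends up holding the last one, not the first.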