Need the Hive equivalent of Oracle's regexp_extract to extract text between patterns

Question

My data looks like this:

bizunit
nam-bu1-us-credit
nam-bu2-us-debit
latam-bu3-mx-debit

Now I want to extract nam and latam into a separate column called region; bu1, bu2, and bu3 into a separate column called business unit; and us and mx into a separate column called country.

What Hive function and SQL would I use? Please share a sample Hive SQL query that splits the above data into these 3 columns.

Answer 1

-- Sample table holding the bizunit strings
CREATE TABLE myTable (
  myText string
);

INSERT INTO TABLE myTable VALUES
  ('nam-bu1-us-credit'),
  ('nam-bu2-us-debit'),
  ('latam-bu3-mx-debit');

Here's the query to extract region, business unit, and country:

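-- Each ([^-]+) group captures one hyphen-delimited segment;
-- the last argument selects which group to return.
select
  regexp_extract(myText, '([^-]+)-([^-]+)-([^-]+)-', 1) as region,
  regexp_extract(myText, '([^-]+)-([^-]+)-([^-]+)-', 2) as business_unit,
  regexp_extract(myText, '([^-]+)-([^-]+)-([^-]+)-', 3) as country
from myTable;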
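
If the segments are always separated by a single hyphen, as in the sample rows, Hive's built-in split function may be a simpler alternative to repeating the regular expression. This is only a sketch: it assumes the table is named myTable with a string column myText, and the column aliases are my own choice, not something from the question.

```sql
-- split() returns an array of the hyphen-delimited segments;
-- Hive array indexes start at 0.
select
  split(myText, '-')[0] as region,         -- e.g. nam, latam
  split(myText, '-')[1] as business_unit,  -- e.g. bu1, bu2, bu3
  split(myText, '-')[2] as country         -- e.g. us, mx
from myTable;
```

The fourth segment (credit or debit) would be at index 3 if it is needed later. The second argument to split is itself a regular expression, so a delimiter with special meaning in a regex, such as a dot or a pipe, would need escaping.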

Solution 1
0 2021-10-14 18:44:34
