
teradata sql - regexp_substr to split on ' - '

I am somewhat new to Teradata. I am more familiar with Presto SQL, where split_part is available.

I'm looking to split a string on a space, hyphen, space (' - ').

Example: 'Wal-Mart - Target - Best Buy - K-Mart - Staples'

I'm used to using split_part(split_part(COLUMN, ' - ', 2), ' - ', 1) to get Target, which ignores the hyphens in Wal-Mart and K-Mart because those hyphens are not preceded and followed by a space.

But I can't figure out how to get 'Target' with Teradata.
strtok() only seems to work with a single-character delimiter, which isn't sufficient since I want to split on a three-character sequence (' - ').

Any help would be appreciated!

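Depending on your release (14.0 or later), you can use strtok to parse it, after first using oreplace to turn each ' - ' into a single delimiter character: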

select strtok(oreplace('Wal-Mart - Target - Costco - K-Mart - Staples',' - ','|'),'|',2)
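The replace-then-tokenize idea only holds if the placeholder character ('|' in the query above) never occurs in the data. Here is a minimal Python sketch of the same logic, useful for checking sample values outside the database. The helper name `nth_part` and its parameters are hypothetical, and Python's `str.split` is only a stand-in for strtok, so edge cases such as empty fields may behave differently in Teradata.

```python
from typing import Optional

def nth_part(text: str, n: int, sep: str = " - ", placeholder: str = "|") -> Optional[str]:
    """Return the n-th (1-based) field of text split on sep, or None if out of range."""
    # The trick breaks if the placeholder is already present in the data.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the data")
    # Step 1 (oreplace in the query): collapse the 3-character separator to one character.
    collapsed = text.replace(sep, placeholder)
    # Step 2 (strtok in the query): tokenize on that single character.
    tokens = collapsed.split(placeholder)
    return tokens[n - 1] if 1 <= n <= len(tokens) else None

row = "Wal-Mart - Target - Costco - K-Mart - Staples"
print(nth_part(row, 2))  # Target
```

If '|' can appear in your column, pick a character you know is absent from the data before relying on this approach.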

While not a direct answer, maybe this will help with the logic, so at the risk of getting flamed I'm throwing it out here anyway. With a regex, you should first be able to describe your pattern in plain language, which helps you analyze and define what you really need to get. I.e., you want the 2nd occurrence of a string that is surrounded by the pattern space-dash-space. But what if the string is at the start or end of the line? Let's revise: you want a specified occurrence of a string that is preceded by (optionally) the start of the line OR by the pattern space-dash-space, and is followed by space-dash-space OR the end of the line.

In Oracle it would look like this, where the first '2' in the argument list means get the 2nd occurrence of the pattern and the second '2' means return the 2nd remembered group (in parentheses). The WITH clause just sets up the data. You would have to translate this regex to Teradata.

WITH tbl(str) AS (
  SELECT 'Wal-Mart - Target - Best Buy - K-Mart - Staples' FROM dual
)
SELECT REGEXP_SUBSTR(str, '(^?| - )(.*?)( - |$)', 1, 2, NULL, 2) retailer
FROM tbl;

RETAILER
--------
Target  
1 row selected.
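Before translating the regex to Teradata, the plain-language rule ("preceded by the start of the line or space-dash-space, followed by space-dash-space or the end of the line") can be sanity-checked in any regex engine. A minimal Python sketch follows. The name `nth_field` is hypothetical, and the pattern is an adaptation rather than the Oracle expression itself: it uses a lookahead for the trailing delimiter instead of a consuming group. Python's regex dialect also differs from Teradata's, so treat this as a check of the logic only.

```python
import re
from typing import Optional

# Field = text preceded by start-of-line or ' - ',
# and followed by ' - ' or end-of-line (lookahead, so the delimiter is not consumed).
FIELD = re.compile(r"(?:^| - )(.*?)(?= - |$)")

def nth_field(text: str, n: int) -> Optional[str]:
    """Return the n-th (1-based) field, or None if there are fewer than n fields."""
    fields = [m.group(1) for m in FIELD.finditer(text)]
    return fields[n - 1] if 1 <= n <= len(fields) else None

row = "Wal-Mart - Target - Best Buy - K-Mart - Staples"
print(nth_field(row, 2))  # Target
print(nth_field(row, 4))  # K-Mart
```

The hyphens inside Wal-Mart and K-Mart are left alone because neither is surrounded by spaces, which is the behavior the question asks for.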

