SUBSTRING_INDEX() in Snowflake

Question

What is the exact equivalent of MySQL's SUBSTRING_INDEX() in Snowflake?

I found SPLIT_PART() in Snowflake, but it is not exactly the same as SUBSTRING_INDEX().

E.g. SUBSTRING_INDEX('www.abc.com', '.', 2); returns www.abc

i.e. everything to the left of the 2nd occurrence of the delimiter '.'

but

SPLIT_PART('www.abc.com', '.', 2); returns abc

It splits the string first and then returns only the requested part.

How can I get the same behaviour as MySQL's SUBSTRING_INDEX() in Snowflake?

Answer 1

A similar effect can be achieved using ARRAY operations:

SELECT s.c, ARRAY_TO_STRING(ARRAY_SLICE(STRTOK_TO_ARRAY(s.c, '.'), 0, 2), '.')
FROM (VALUES ('www.abc.com')) AS s(c);

How does it work?

STRTOK_TO_ARRAY - makes an array from the string
ARRAY_SLICE - takes the elements from index 0 up to (but not including) n
ARRAY_TO_STRING - converts the array back to a string using '.' as the delimiter

In steps:

SELECT 
  s.c,
  STRTOK_TO_ARRAY(s.c, '.')   AS arr,
  ARRAY_SLICE(arr, 0, 2)      AS slice,
  ARRAY_TO_STRING(slice, '.') AS result
FROM (VALUES ('www.abc.com')) AS s(c);
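One caveat with the query above: STRTOK_TO_ARRAY treats each character of its delimiter argument as a separate delimiter, so it only matches MySQL for single-character delimiters. A variation that may be closer to SUBSTRING_INDEX() is sketched below, assuming SPLIT (which splits on the whole separator string) is acceptable in your case. The column aliases parts, first_two and last_two are made up for illustration, and the negative-count branch relies on ARRAY_SLICE interpreting a negative position as counted from the end of the array.

```sql
-- Sketch only: emulates SUBSTRING_INDEX(c, '.', 2) and SUBSTRING_INDEX(c, '.', -2).
SELECT
  s.c,
  SPLIT(s.c, '.')                                                 AS parts,
  -- first two parts, like a count of 2
  ARRAY_TO_STRING(ARRAY_SLICE(parts, 0, 2), '.')                  AS first_two,
  -- last two parts, like a count of -2
  ARRAY_TO_STRING(ARRAY_SLICE(parts, -2, ARRAY_SIZE(parts)), '.') AS last_two
FROM (VALUES ('www.abc.com')) AS s(c);
```

For 'www.abc.com' the two result columns should correspond to what MySQL returns for counts of 2 and -2. Behaviour when the string has fewer parts than the requested count is worth checking against your own data before relying on it.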

Answer 2

You may use REGEXP_SUBSTR here:

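-- The backslash is doubled because Snowflake treats \ as an escape character
-- inside single-quoted strings; a single \. would reach the regex engine as a bare dot.
SELECT REGEXP_SUBSTR('www.abc.com', '^[^.]+\\.[^.]+');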

Here is a demo showing that the regex pattern works as expected.
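The pattern above is hard-wired to two parts. A possible generalisation, offered as a sketch rather than a tested drop-in, uses a bounded repetition: for a count of n, allow at most n - 1 further ".part" groups. It assumes a single-character delimiter and a positive count; the literal {0,1} below is simply n - 1 for n = 2.

```sql
-- Sketch only: first n dot-separated parts, here with n = 2, so the group
-- may repeat at most n - 1 = 1 time.
-- [^.]* (rather than [^.]+) lets empty parts such as in 'a..b' be kept.
-- The backslash is doubled because \ is an escape character in
-- Snowflake single-quoted strings.
SELECT REGEXP_SUBSTR('www.abc.com', '^[^.]*(\\.[^.]*){0,1}');
```

Because the repeated group is optional, a string with fewer delimiters than requested is matched in full, which is in line with MySQL returning the entire string in that case.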

Answer 3

The SUBSTRING_INDEX function in MySQL returns the entire string if the delimiter isn't found or if the supplied count is greater than the number of occurrences. Assuming you want to preserve that behavior, and that you'd also find it helpful to be able to extract non-contiguous parts of the string, consider this approach.

with cte as (select 'www.abc.com' as txt)
select a.txt, listagg(b.value, '.') within group (order by b.index)
from cte a, lateral split_to_table(a.txt, '.') b
where b.index <= 2 -- e.g. use b.index in (1, 3) instead to get 'www.com'
group by a.txt;

3 answers

Answer 1
1 accepted 2021-10-18 14:44:49

Answer 2
0 2021-10-18 05:43:10

Answer 3
0 2021-10-18 21:59:30
