How to parse string from one column into delimited values in SQL

This is my column in Redshift:

SHIPMENT_ID
-----------------------------------------
FBA15KS66741, FBA15KS6673D
FBA15NHV7PXX (Oct 20th)
FBA15XNW0SWY 27 balance 2 of 2
FBA15M575MDL &  FBA15M59W1Y5
FBA15NHV7PXX (Oct 20th)
FBA15D7WPZVR /FBA15D7WWTPK/FBA15D7WW1GL

I would like to turn it into:

SHIPMENT_ID
-----------------------------------------
FBA15KS66741, FBA15KS6673D
FBA15NHV7PXX
FBA15XNW0SWY
FBA15M575MDL, FBA15M59W1Y5
FBA15NHV7PXX
FBA15D7WPZVR, FBA15D7WWTPK, FBA15D7WW1GL

Using SQL only, what is the best way to handle this?

This works in PostgreSQL, so it may work in Redshift, depending on which features are available in the PostgreSQL 8 base that Redshift is built on.

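WITH items AS
(
  -- One output row per match; the 'g' flag returns every match, not just the first.
  -- ARRAY_TO_STRING flattens each single-element text[] match into plain text.
  SELECT shipment_id,
         ARRAY_TO_STRING(REGEXP_MATCHES(shipment_id,'FBA15[0-9a-zA-Z]{7}','g'),'') AS unique_shipment_ids
  FROM dat
)
-- Rejoin the extracted IDs for each original value as a comma-separated list.
SELECT shipment_id,
       STRING_AGG(unique_shipment_ids,',') AS shipment_id_csv
FROM items
GROUP BY shipment_id;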

I've assumed:

  • Each item begins with the characters 'FBA15'
  • There are exactly 7 alphanumeric characters after the first 5

You can edit the regexp pattern if my assumptions are incorrect.
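
One way to check those assumptions before relying on the result is to look for values that contain no match at all. This matters because REGEXP_MATCHES returns no rows for such a value, so it silently disappears from the output. The following is a minimal sketch that reuses the table name dat and the pattern from the query above; it has only been reasoned through for PostgreSQL, not run on Redshift.

```sql
-- Sketch: list source values that contain no match for the assumed pattern.
-- `dat` and the pattern come from the answer's query; adjust both as needed.
SELECT shipment_id
FROM dat
WHERE shipment_id !~ 'FBA15[0-9a-zA-Z]{7}';
```

An empty result means every value contains at least one ID of the assumed shape. It does not detect values where some IDs match and others do not.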

The approach is:

  1. Use REGEXP_MATCHES to capture each item within each row. This produces one row per match, so a shipment_id value that contains several items yields several rows.
  2. Use ARRAY_TO_STRING to convert those values to text, rather than text[]
  3. Use STRING_AGG to join them back together with a comma separator
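
To see the first two steps in isolation, they can be run against a single literal taken from the sample data. This is a sketch for PostgreSQL only; the column alias is arbitrary.

```sql
-- Steps 1 and 2 on one sample value: REGEXP_MATCHES with the 'g' flag
-- returns one text[] per match, and ARRAY_TO_STRING turns each into text.
SELECT ARRAY_TO_STRING(
         REGEXP_MATCHES('FBA15M575MDL &  FBA15M59W1Y5',
                        'FBA15[0-9a-zA-Z]{7}',
                        'g'),
         '') AS unique_shipment_id;
```

For this value the query should return two rows, one per shipment ID, which step 3 then joins back into a single comma-separated string.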

I found that I could not wrap STRING_AGG directly around REGEXP_MATCHES, because it raises the error "aggregate function calls cannot contain set-returning function calls", so I opted for a CTE. I assume a subquery would work as well.
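
The subquery form might look like the sketch below. It is the same logic with the CTE moved into the FROM clause, assuming the same table dat and the same pattern, and it is written for PostgreSQL rather than verified on Redshift.

```sql
-- Sketch: same extraction as the CTE version, expressed as a derived table.
SELECT shipment_id,
       STRING_AGG(unique_shipment_ids, ',') AS shipment_id_csv
FROM (
  -- The set-returning REGEXP_MATCHES call stays in the inner query,
  -- so the aggregate in the outer query only ever sees plain text rows.
  SELECT shipment_id,
         ARRAY_TO_STRING(
           REGEXP_MATCHES(shipment_id, 'FBA15[0-9a-zA-Z]{7}', 'g'),
           '') AS unique_shipment_ids
  FROM dat
) AS items
GROUP BY shipment_id;
```

Like the CTE version, this groups on the raw shipment_id text, so two source rows with identical text (such as the repeated "FBA15NHV7PXX (Oct 20th)" row) end up in a single group rather than as two output rows.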
