
Extract Substring from a string using SQL

Body
Beta M yy-Tz
Beta M zz-Ox
Beta W yy-Tz
Tiger W Dash
Tiger M Dash
Tiger White Stone W Pearl
Tiger White Stone M Pearl

I have a column 'Body' in a Snowflake table xyz that contains the products being ordered. The 'Body' field is composed of three parts:
1st part: product name (e.g. Beta, Tiger, Tiger White Stone)

2nd part: gender, M or W

3rd part: internal product naming (e.g. yy-Tz, zz-Ox, Dash, Pearl)

I want to extract the M or W into a separate column named Gender so that I can count how many men and women customers I have.

Expected output:

Gender
M
M
W
W
M
W
M

I tried the SPLIT function, but it does not feel like an efficient approach. Is there a better way to implement this?

Because the parts of the string can themselves contain spaces, splitting on spaces to find the 2nd part is probably not reliable. I'm going to assume that the patterns you are looking for are ' M ' and ' W '; obviously, if these patterns can also occur in the first or third part, this won't work.

Try this:

CASE WHEN POSITION(' M ', BODY) > 0 THEN 'M'
     WHEN POSITION(' W ', BODY) > 0 THEN 'W'
     ELSE 'X' END
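
To get the counts the question asks for, the expression can be wrapped in a grouped query. This is only a minimal sketch: the CTE named xyz is a stand-in built from the sample rows in the question, the output column name customers is made up for illustration, and 'X' is kept as the fallback for rows with no marker.

```sql
-- Sample rows copied from the question; drop this CTE to run against the real table xyz.
WITH xyz (Body) AS (
    SELECT * FROM VALUES
        ('Beta M yy-Tz'),
        ('Beta M zz-Ox'),
        ('Beta W yy-Tz'),
        ('Tiger W Dash'),
        ('Tiger M Dash'),
        ('Tiger White Stone W Pearl'),
        ('Tiger White Stone M Pearl')
)
SELECT
    CASE WHEN POSITION(' M ', Body) > 0 THEN 'M'
         WHEN POSITION(' W ', Body) > 0 THEN 'W'
         ELSE 'X'                 -- no ' M ' or ' W ' marker found
    END AS Gender,
    COUNT(*) AS customers         -- hypothetical column name
FROM xyz
GROUP BY 1                        -- group by the first select expression (Gender)
ORDER BY 1;
```

With the seven sample rows this should give 4 for M and 3 for W. Note that 'Tiger White Stone ...' is not misread as W, because the W in 'White' is not followed by a space.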

Yet another option is the Snowflake REGEXP_SUBSTR function, with a regular expression that matches the single character right before your third part:

SELECT REGEXP_SUBSTR(Body, ' ([A-Z]) [\\w\\-]+$', 1, 1, 'e') AS Gender
FROM tab 

The regex looks for:

  • ' ([A-Z]) ' : space + one uppercase character (Group 1) + space
  • [\\w\\-]+ : one or more word characters (letters, digits, underscore) or dashes
  • $ : end of string

The three trailing arguments 1, 1, 'e' stand for:

  • 1, : start from the first char
  • , 1, : get the first occurrence
  • 'e' : extract the (first) group
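
The same idea can be turned into a count per gender. The sketch below makes two assumptions that are not in the answer above: the character class is narrowed from [A-Z] to [MW] so that only the two expected markers match, and the output column is named customers. The CTE named xyz again just stands in for the real table.

```sql
-- Sample rows copied from the question; drop this CTE to run against the real table xyz.
WITH xyz (Body) AS (
    SELECT * FROM VALUES
        ('Beta M yy-Tz'),
        ('Beta M zz-Ox'),
        ('Beta W yy-Tz'),
        ('Tiger W Dash'),
        ('Tiger M Dash'),
        ('Tiger White Stone W Pearl'),
        ('Tiger White Stone M Pearl')
)
SELECT
    -- [MW] instead of [A-Z] is an assumption: accept only the two known markers.
    REGEXP_SUBSTR(Body, ' ([MW]) [\\w\\-]+$', 1, 1, 'e') AS Gender,
    COUNT(*) AS customers         -- hypothetical column name
FROM xyz
GROUP BY 1                        -- group by the extracted marker
ORDER BY 1;
```

With the sample rows this should also give 4 for M and 3 for W. Rows that do not match the pattern come back with a NULL Gender, and because [\\w\\-]+ does not allow spaces, a third part that contains a space would not match either.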

Does it solve your problem?
