
Regular expression to remove a portion of text from each entry in a comma-separated list

I have a string of comma-separated values that I want to trim down for display purposes.

Both the length of each value and the number of entries in the list vary.

Each entry in the list starts with a five-character prefix in the format "##-NX", followed by some text.

e.g., "01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc..."

Is there a regular expression function I can use to remove the text after the five-character prefix of each entry in the list, returning "01-NX, 02-NX, 09-NX, 12-NX,..."?

I am a novice with regular expressions and I haven't been able to figure out how to code the pattern.

I think what you need is

regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1')

The inner REGEXP_REPLACE looks for nn-NX (two digits followed by "-NX") and then any characters up to the next comma. It replaces each match with the first and third capture groups, which drops the text in between and keeps the comma.

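The outer REGEXP_REPLACE handles the last entry, which has no comma after it and so is not matched by the inner call. It matches from the first two digits through the last "NX" (the ".*" before "NX" is greedy) and keeps only that part, discarding whatever follows. This relies on the trailing text not containing "NX" itself.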

Here is the Oracle code I used for testing:

with a as (
  select '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' as myString
  from dual
)
select mystring
, regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1') as output
from a
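
To see what each call contributes, the two steps can be split into separate columns. This is only a sketch using the same sample string; the CTE name b and the column aliases step1 and step2 are illustrative names, not part of the original query.

```sql
-- Sketch: run the inner and outer REGEXP_REPLACE separately so the
-- intermediate string can be inspected. The names b, step1 and step2 are illustrative.
WITH a AS (
  SELECT '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' AS mystring
  FROM dual
), b AS (
  -- Inner step: only entries followed by a comma can match,
  -- so the last entry keeps its trailing text here.
  SELECT mystring,
         REGEXP_REPLACE(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3') AS step1
  FROM a
)
-- Outer step: keep everything through the last "NX" and drop the rest.
SELECT step1,
       REGEXP_REPLACE(step1, '(\d{2}.*NX).*', '\1') AS step2
FROM b;
```

Comparing step1 with step2 shows why the second call is needed: the last entry is not followed by a comma, so the inner pattern cannot trim it.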

This alternative calls REGEXP_REPLACE() once.

The pattern matches two digits, a dash and 'NX' (group 1), then zero or more characters, non-greedily (group 2), up to a comma or the end of the string (group 3). Each match is replaced with the first and third groups, so the text in between is dropped and the comma, where there is one, is kept.

EDIT: Took dougp's advice and eliminated the RTRIM by adding the 3rd capture group. Thanks for that!

WITH tbl(str) AS (
  SELECT '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' FROM dual
)
SELECT 
REGEXP_REPLACE(str, '(\d{2}-NX)(.*?)(,|$)', '\1\3') str
from tbl;
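
Another possible variant, not from the answers above, swaps the non-greedy match for a negated character class. The sketch below assumes the descriptive text never contains a comma: '[^,]*' consumes everything up to, but not including, the next comma or the end of the string, so the separator is never part of the match and a single capture group is enough.

```sql
-- Sketch of an alternative technique (negated character class), not the
-- answer's own pattern. Assumes the text after the prefix contains no commas.
WITH tbl(str) AS (
  SELECT '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' FROM dual
)
SELECT
  -- Keep the five-character prefix (group 1); drop the non-comma text after it.
  REGEXP_REPLACE(str, '(\d{2}-NX)[^,]*', '\1') AS str
FROM tbl;
-- Result for the sample row: 01-NX, 02-NX, 09-NX, 12-NX
```

Because the commas and the spaces after them are never matched, they pass through unchanged.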
