
Using Regex to Match Pattern

I am trying to use a regex to retrieve Title:Code pairs. My current pattern is:

(.*?\(CPT-.*?\)|.*?\(ICD-.*?\))

Data:

SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18) RIGHT WRIST GANGLION CYST (ICD-727.41) S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)

I would like to capture:

  • SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18)
  • RIGHT WRIST GANGLION CYST (ICD-727.41)
  • S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)

What is the proper regex to use?

What about a pattern like this:

.*?\((CPT|ICD)-[A-Z0-9.]+\)

This matches zero or more of any character, non-greedily, followed by a ( and then either CPT or ICD, a hyphen, one or more uppercase Latin letters, decimal digits, or periods, and finally a ).

Note that I picked [A-Z0-9.]+ because, to my understanding, all current ICD-9 codes, ICD-10 codes, and CPT codes conform to that pattern.

The C# code might look a bit like this:

var result = Regex.Matches(input, @".*?\((CPT|ICD)-[A-Z0-9.]+\)")
                  .Cast<Match>()
                  .Select(m => m.Value);
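To see what the pattern returns on the data from the question, here is a minimal, self-contained sketch. It assumes a C# 9+ project with top-level statements; the variable names and the ToList() call are only for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sample data copied from the question.
string input = "SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18) RIGHT WRIST GANGLION CYST (ICD-727.41) S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)";

// Same pattern as above; ToList() materializes the results so they can be indexed.
var matches = Regex.Matches(input, @".*?\((CPT|ICD)-[A-Z0-9.]+\)")
                   .Cast<Match>()
                   .Select(m => m.Value)
                   .ToList();

// Brackets make any leading whitespace visible.
foreach (var m in matches)
    Console.WriteLine($"[{m}]");

// Output:
// [SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18)]
// [ RIGHT WRIST GANGLION CYST (ICD-727.41)]
// [ S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)]
```

The second and third matches start with the space that separated them from the previous code, because the lazy prefix .*? begins wherever the previous match ended.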

If you want to avoid any surrounding whitespace, you can simply trim the result strings (m => m.Value.Trim()), or ensure that the match starts with a non-whitespace character by putting a \S in front, like this:

var result = Regex.Matches(input, @"\S.*?\((CPT|ICD)-[A-Z0-9.]+\)")
                  .Cast<Match>()
                  .Select(m => m.Value);

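Or use a negative lookahead if you need to handle inputs like (ICD-100)(ICD-200), where a code has no title in front of it. \S has to consume a character, so it would merge the two codes into one match, whereas (?!\s) consumes nothing: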

var result = Regex.Matches(input, @"(?!\s).*?\((CPT|ICD)-[A-Z0-9.]+\)")
                  .Cast<Match>()
                  .Select(m => m.Value);

You can see a working demonstration here.
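The difference between the \S prefix and the (?!\s) lookahead only shows up when a code has no title in front of it. The following sketch compares the two on the (ICD-100)(ICD-200) input; it assumes C# 9+ top-level statements, and the variable names are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Two codes back to back, with no title text and no whitespace between them.
string adjacent = "(ICD-100)(ICD-200)";

// \S must consume one character, so it swallows the first "(" and the lazy
// prefix then runs on to the second code: a single merged match.
var withNonSpace = Regex.Matches(adjacent, @"\S.*?\((CPT|ICD)-[A-Z0-9.]+\)")
                        .Cast<Match>()
                        .Select(m => m.Value)
                        .ToList();

// (?!\s) only asserts that the next character is not whitespace, without
// consuming anything, so the prefix can be empty: two separate matches.
var withLookahead = Regex.Matches(adjacent, @"(?!\s).*?\((CPT|ICD)-[A-Z0-9.]+\)")
                         .Cast<Match>()
                         .Select(m => m.Value)
                         .ToList();

Console.WriteLine(string.Join(" | ", withNonSpace));   // (ICD-100)(ICD-200)
Console.WriteLine(string.Join(" | ", withLookahead));  // (ICD-100) | (ICD-200)
```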

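You can use the Regex.Split() method, splitting after each closing parenthesis that is followed by the start of a new title (a character that is neither whitespace nor an opening parenthesis):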

string input = "SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18) RIGHT WRIST GANGLION CYST (ICD-727.41) S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)";
string pattern = @"(?<=\))\s*(?=[^\s(])";
string[] result = Regex.Split(input, pattern);
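Since the question asks for Title:Code pairs, each piece returned by Regex.Split can be broken down further. The sketch below is one possible way to do that, not part of the original answer: the pairPattern regex and its group names title and code are assumptions chosen for illustration, and the code assumes C# 9+ top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = "SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18) RIGHT WRIST GANGLION CYST (ICD-727.41) S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)";

// Split pattern from the answer above: yields one "title (code)" string per entry.
string[] pieces = Regex.Split(input, @"(?<=\))\s*(?=[^\s(])");

// Hypothetical pattern: a lazy title, optional whitespace, then the
// parenthesized CPT/ICD code at the very end of the piece.
var pairPattern = new Regex(@"^(?<title>.*?)\s*\((?<code>(?:CPT|ICD)-[A-Z0-9.]+)\)$");

var pairs = new List<string>();
foreach (string piece in pieces)
{
    Match m = pairPattern.Match(piece);
    if (!m.Success)
        continue; // skip anything that does not end in a recognizable code

    string title = m.Groups["title"].Value;
    string code = m.Groups["code"].Value;
    pairs.Add(title + " : " + code);
}

foreach (string pair in pairs)
    Console.WriteLine(pair);

// Output:
// SENSORINEURAL HEARING LOSS BILATERAL (MILD) : ICD-389.18
// RIGHT WRIST GANGLION CYST : ICD-727.41
// S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT : CPT-20600
```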

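Consider the following regex, which lazily matches everything up to the first digit that is immediately followed by a closing parenthesis: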

.*?\d\)
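A quick sketch of how this shorter pattern behaves, assuming C# 9+ top-level statements. The second input string is hypothetical (it is not from the question) and is only there to show where relying on "a digit followed by )" can split too early.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18) RIGHT WRIST GANGLION CYST (ICD-727.41) S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)";

// On the question's data the pattern finds the three expected entries
// (trimmed here, because matches after the first begin with a space).
var simple = Regex.Matches(input, @".*?\d\)")
                  .Cast<Match>()
                  .Select(m => m.Value.Trim())
                  .ToList();
Console.WriteLine(simple.Count); // 3

// Hypothetical title that itself contains a digit right before ")".
string tricky = "BURN (GRADE 2) (ICD-100.0)";
var tooEarly = Regex.Matches(tricky, @".*?\d\)")
                    .Cast<Match>()
                    .Select(m => m.Value.Trim())
                    .ToList();

// The title is cut off at "(GRADE 2)" and the code becomes its own match.
Console.WriteLine(string.Join(" | ", tooEarly)); // BURN (GRADE 2) | (ICD-100.0)
```

If titles in your data can contain parenthesized numbers, a pattern that names the CPT- or ICD- prefix explicitly is the safer choice.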

Good Luck!


 