
How do you find a delimited/isolated substring with string.Contains?

I am trying to parse out and identify some values from strings that I have in a list.

I am using string.Contains to identify the value I'm looking for, but I get hits even when the value is surrounded by other text. How can I make sure I only get a hit if the value is isolated?

Example parse:

Looking for value = "302"

string sale = 
  "199708. (30), italiano, delim fabricata modella, serialNumber302. tnr F18529302E.";

var result = sale.ToLower().Contains("302"); // true, although "302" never appears on its own

In this example I get a hit for "serialNumber302" and "F18529302E", which is incorrect in this context. I only want a hit when "302" stands on its own: in "dontfind302 shouldfind 302", only the final "302" should match.

Any ideas on how to do this?

If you use Regex (from the System.Text.RegularExpressions namespace), you can define a word boundary using \b:

string sale = 
  "199708. (30), italiano, delim fabricata modella, serialNumber302. tnr F18529302E.";

bool result = Regex.IsMatch(sale, @"\b302\b"); // false

sale = "A string with 302 isolated";

result = Regex.IsMatch(sale, @"\b302\b"); // true

So 302 is only found when each side of it is either the start or end of the string or a non-word character, i.e. anything other than a-z, A-Z, 0-9 or _. (By default, .NET also treats other Unicode letters and digits as word characters.)
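When the value to look for is not a fixed literal, the same idea can be wrapped in a small helper. The following is only a sketch: the class and method names (IsolatedSearch, ContainsIsolated) are made up for illustration, and it assumes the value starts and ends with a word character, since \b behaves differently next to punctuation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IsolatedSearch
{
    // Hypothetical helper (not part of the original answer): returns true when
    // `value` occurs in `text` with a word boundary on both sides.
    public static bool ContainsIsolated(string text, string value)
    {
        // Regex.Escape keeps characters such as '.' or '+' in the value literal.
        string pattern = @"\b" + Regex.Escape(value) + @"\b";
        return Regex.IsMatch(text, pattern);
    }
}
```

It would be called as IsolatedSearch.ContainsIsolated(sale, "302"), in place of sale.Contains("302").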

EDIT: From the comments I realised that it wasn't clear whether or not "serialNum302" should get a hit. I assumed so in this answer.

I see a few easy ways you could do this:

1) If the input is always a number as in the example, one option would be to only search for substrings not surrounded by more numbers, by examining all the results of an initial search and comparing their neighboring characters against the string "0123456789". I really don't think this is the best option though, because sooner or later it's going to break when it misinterprets one of the other bits of data.

2) If the string sale always has the serial number in the format "serialNumber[Num]", instead of just looking for Num, look for "serialNumber" + Num, as this is less likely to be messed up with the other data.

3) From your string, it looks like you have a standardized format that's being introduced to the system. In this case, parse it in a standardized way, e.g. by splitting it into substrings at the commas, then parsing each substring differently as it requires.
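The three options could be sketched roughly as follows. All names here (SaleParsing, ContainsNotInsideDigits, ContainsSerialNumber, SplitFields) are hypothetical, and the "serialNumber" label and comma-separated layout are assumptions taken from the single sample string in the question.

```csharp
using System;
using System.Linq;

public static class SaleParsing
{
    private static bool IsAsciiDigit(char c) => c >= '0' && c <= '9';

    // Option 1: a hit counts only if neither neighbouring character is a digit.
    public static bool ContainsNotInsideDigits(string text, string value)
    {
        if (string.IsNullOrEmpty(value)) return false;
        int index = text.IndexOf(value, StringComparison.Ordinal);
        while (index >= 0)
        {
            int end = index + value.Length;
            bool digitBefore = index > 0 && IsAsciiDigit(text[index - 1]);
            bool digitAfter = end < text.Length && IsAsciiDigit(text[end]);
            if (!digitBefore && !digitAfter) return true;
            // Keep scanning: a later occurrence may still qualify.
            index = text.IndexOf(value, index + 1, StringComparison.Ordinal);
        }
        return false;
    }

    // Option 2: search for the label plus the number, so that other numeric
    // fields cannot match. The label is assumed from the sample string.
    public static bool ContainsSerialNumber(string text, string number)
    {
        return ContainsNotInsideDigits(text, "serialNumber" + number);
    }

    // Option 3: split the record at the commas and trim each field, so that
    // every field can then be parsed by its own rules.
    public static string[] SplitFields(string sale)
    {
        return sale.Split(',').Select(field => field.Trim()).ToArray();
    }
}
```

Note that option 1 deliberately reports a hit for "serialNumber302", in line with the assumption stated in the EDIT above, but not for "F18529302E", where the match is preceded by another digit.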
