
Use regex in C# to remove a specific combination of characters

I would like to keep only the following characters in my string:

  • numeric characters: 0-9
  • alphabetic characters: a-z, A-Z
  • the apostrophe character, but only when it is surrounded by alphanumeric characters, i.e. "x'x", where x is any alphanumeric character.

At this point, I am able to keep all the alphanumeric characters. The problem is the apostrophe: my code keeps every apostrophe, whereas I would like to keep only those surrounded by alphanumeric characters. This is my code:

Regex rgx = new Regex("[^a-zA-Z0-9' -]");
string newString = rgx.Replace(oldString, "");

Example: for the string "abc'd1*%'", I would like to get "abc'd1".
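To make the problem concrete, here is a minimal sketch that runs the pattern from the question on the example string. The class and method names (`QuestionDemo`, `StripWithOriginalPattern`) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuestionDemo
{
    // Removes every character that is not a letter, a digit,
    // an apostrophe, a space, or a hyphen.
    public static string StripWithOriginalPattern(string input)
    {
        return Regex.Replace(input, "[^a-zA-Z0-9' -]", "");
    }

    public static void Main()
    {
        // The trailing apostrophe survives because the character class
        // allows every apostrophe, regardless of its neighbors.
        Console.WriteLine(StripWithOriginalPattern("abc'd1*%'")); // abc'd1'
    }
}
```

A negated character class looks at one character at a time, so it cannot express "only when surrounded by alphanumerics"; that condition needs lookarounds or a different matching strategy.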

You could use the below regex and then replace the matched characters with an empty string.

@"(?<![A-Za-z])'|'(?![A-Za-z])|[^A-Za-z0-9']"


Explanation:

  • (?<![A-Za-z])' matches every single quote that is not preceded by a letter.
  • | OR
  • '(?![A-Za-z]) matches every single quote that is not followed by a letter. Together, these two patterns leave alone any single quote that is both preceded and followed by a letter.
  • | OR
  • [^A-Za-z0-9'] matches any remaining character that is neither alphanumeric nor a single quote.

Code:

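using System;
using System.Text.RegularExpressions;

string str = "abc'd1*%'";
// Remove stray apostrophes and every character that is neither alphanumeric nor an apostrophe.
string result = Regex.Replace(str, @"(?<![A-Za-z])'|'(?![A-Za-z])|[^A-Za-z0-9']", "");
Console.WriteLine(result); // abc'd1
Console.ReadLine();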
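The pattern above checks only for letters ([A-Za-z]) around the apostrophe, while the question asks for alphanumeric neighbors. If digits should count too, one possible variant widens the lookarounds to [A-Za-z0-9]. The sketch below compares both; the names `LookaroundDemo`, `LettersOnly`, and `Alphanumeric` are hypothetical, and the widened pattern is an adaptation rather than part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Pattern from the answer: apostrophes must sit between letters.
    public static string LettersOnly(string input)
    {
        return Regex.Replace(input, @"(?<![A-Za-z])'|'(?![A-Za-z])|[^A-Za-z0-9']", "");
    }

    // Assumed variant: apostrophes may also sit next to digits.
    public static string Alphanumeric(string input)
    {
        return Regex.Replace(input, @"(?<![A-Za-z0-9])'|'(?![A-Za-z0-9])|[^A-Za-z0-9']", "");
    }

    public static void Main()
    {
        Console.WriteLine(LettersOnly("abc'd1*%'"));  // abc'd1
        Console.WriteLine(Alphanumeric("abc'd1*%'")); // abc'd1
        Console.WriteLine(LettersOnly("1'2"));        // 12
        Console.WriteLine(Alphanumeric("1'2"));       // 1'2
    }
}
```

Both versions agree on the example from the question; they differ only when a digit is adjacent to the apostrophe.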


[a-zA-Z0-9 -]+|(?<=[a-zA-Z])'(?=[a-zA-Z])

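Try this. Unlike the pattern above, it matches the text to keep, so the result is built by joining the matches. See the demo: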

https://regex101.com/r/dU7oN5/13
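Because this pattern describes what to keep, it cannot be passed to Regex.Replace with an empty replacement. A minimal sketch of one way to use it, assuming the matches are simply concatenated (the names `KeepDemo` and `KeepMatches` are hypothetical):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeepDemo
{
    // Runs of allowed characters, or an apostrophe between two letters.
    private static readonly Regex Keep =
        new Regex(@"[a-zA-Z0-9 -]+|(?<=[a-zA-Z])'(?=[a-zA-Z])");

    // Joins every match in order; everything between matches is dropped.
    public static string KeepMatches(string input)
    {
        return string.Concat(Keep.Matches(input).Cast<Match>().Select(m => m.Value));
    }

    public static void Main()
    {
        Console.WriteLine(KeepMatches("abc'd1*%'")); // abc'd1
    }
}
```

The lookbehind and lookahead are evaluated against the original input, so the final apostrophe is rejected because it follows "%" in the source string.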

If you are matching whitespace, try this:

[\w\s-]+|(?<=[\w\s])'(?=[\w\s])

If you are not matching whitespace, try this:

[\w-]+|(?<=[\w])'(?=[\w])
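One caveat with these two patterns: in .NET, \w also matches the underscore and, by default, non-ASCII letters and digits, so it is broader than [a-zA-Z0-9]. The sketch below applies the no-whitespace pattern by concatenating its matches; the names `WordClassDemo` and `KeepWordChars` are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordClassDemo
{
    // Runs of word characters or hyphens, or an apostrophe between word characters.
    private static readonly Regex Keep = new Regex(@"[\w-]+|(?<=[\w])'(?=[\w])");

    // Joins every match in order; everything between matches is dropped.
    public static string KeepWordChars(string input)
    {
        return string.Concat(Keep.Matches(input).Cast<Match>().Select(m => m.Value));
    }

    public static void Main()
    {
        Console.WriteLine(KeepWordChars("abc'd1*%'")); // abc'd1
        // The underscore is kept because \w includes it.
        Console.WriteLine(KeepWordChars("a_b'c*"));    // a_b'c
    }
}
```

If underscores or non-ASCII letters must be removed, the explicit class [a-zA-Z0-9] from the earlier patterns is the safer choice.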
