C# Regex - Match certain char followed by number/identifier

I'm having trouble with a regex, and I couldn't find this question asked here before. I need to replace the character a, which may or may not be followed by whitespace but must necessarily be followed by a number (the number itself must not be replaced).

I have this regex: [aA]\s.(?<=\d)* and this is the result:

[screenshot of the match result]

With (?<=\d)* I was trying to match, but not capture, the number that comes right after the character (with or without a space in between). It obviously doesn't work, partly because \d does not cover identifiers. An identifier is a numeric or alphanumeric string of no fixed length, and when it is alphanumeric the letters can appear in any position. Examples: A54N3, Z4G78, 8454, 4AZ7, 7, A1, 1A. The combinations always vary.

I want to match ONLY the a before the number 8 (or any other number, or an identifier such as N574A) and replace that character with art, leaving the number/identifier as it is. The result should be agricoltura n 6 sensi dell'art8 or agricoltura n 6 sensi dell'artN574A. If the target string were agricoltura n 6 sensi dell'a8 or agricoltura n 6 sensi dell'aN574A (that is, without the whitespace), the result should be the same: agricoltura n 6 sensi dell'art8 or agricoltura n 6 sensi dell'artN574A.

So the generic rule should be: match [aA], followed by an optional space, which must in turn be followed by a number or an identifier that must not be captured.

Is it possible to do such a thing? What could be the solution? Thank you so much!

UPDATE

The \\b([aA])\\s*([A-Za-z]*\\d[\\dA-Za-z]*)\\b pattern seems to replace the correct values; here is the demo

You may use

\b([aA])\s*([A-Za-z]*\d[\dA-Za-z]*)\b

Replace with $1rt$2. See the regex demo.
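As a minimal sketch of how that pattern and replacement could be applied in C#, the snippet below calls Regex.Replace on the sample strings from the question. The class name ArticleFixer and the method name Fix are made up for illustration; only the pattern and the $1rt$2 replacement come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class ArticleFixer
{
    // Pattern from the answer, as a verbatim string so backslashes are not doubled.
    private static readonly Regex Pattern =
        new Regex(@"\b([aA])\s*([A-Za-z]*\d[\dA-Za-z]*)\b");

    // $1 keeps the original a/A, "rt" is appended, the optional whitespace
    // is dropped, and $2 puts the number/identifier back unchanged.
    public static string Fix(string input) => Pattern.Replace(input, "$1rt$2");

    public static void Main()
    {
        Console.WriteLine(Fix("agricoltura n 6 sensi dell'a 8"));
        Console.WriteLine(Fix("agricoltura n 6 sensi dell'aN574A"));
    }
}
```

Note that the number or identifier is captured here (as Group 2) and written back by the replacement, rather than being excluded from the match. The "n 6" part is left alone because the pattern only starts at a or A.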

Details

  • \b - a word boundary
  • ([aA]) - Group 1 (referred to with $1 from the replacement pattern): a or A
  • \s* - 0 or more whitespaces
  • ([A-Za-z]*\d[\dA-Za-z]*) - Group 2 (referred to with $2 from the replacement pattern): an alphanumeric whole word that contains at least one digit:
    • [A-Za-z]* - zero or more ASCII letters
    • \d - a digit
    • [\dA-Za-z]* - 0+ digits or ASCII letters (replace \d with 0-9 to match ASCII digits only, or pass RegexOptions.ECMAScript flag to Regex constructor)
  • \b - word boundary.
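
The note about \d and RegexOptions.ECMAScript can be illustrated with a short sketch. By default, \d in .NET matches any Unicode decimal digit, while under RegexOptions.ECMAScript it is limited to 0-9. The class and method names below are hypothetical, and the sample string with an Arabic-Indic digit is an invented test input, not something from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names and sample input are assumptions for illustration.
public static class DigitScopeDemo
{
    private const string Pattern = @"\b([aA])\s*([A-Za-z]*\d[\dA-Za-z]*)\b";

    // Default options: \d matches any Unicode decimal digit.
    public static string FixDefault(string input) =>
        Regex.Replace(input, Pattern, "$1rt$2");

    // ECMAScript option: \d is restricted to the ASCII digits 0-9.
    public static string FixAsciiOnly(string input) =>
        Regex.Replace(input, Pattern, "$1rt$2", RegexOptions.ECMAScript);

    public static void Main()
    {
        // \u0668 is ARABIC-INDIC DIGIT EIGHT, a decimal digit outside 0-9.
        string sample = "dell'a \u0668";
        Console.WriteLine(FixDefault(sample) == sample);   // false: the default \d matched it
        Console.WriteLine(FixAsciiOnly(sample) == sample); // true: left unchanged
    }
}
```

If the input is known to contain only ASCII digits, both variants behave the same on it, so the option only matters when non-ASCII digits might appear.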


 