Example input:
'Please find the ref AB45676785567XYZ. which is used to identify reference number'
Example output:
'AB45676785567XYZ'
I need a RegExp that returns only the substring matching my requirements, i.e. the substring whose first 2 and last 3 characters are letters.
The first 2 and last 3 letters are unknown.
I've tried this RegExp:
[a-zA-Z]{2}[^\s]*?[a-zA-Z]{3}
But it is not matching as intended.
Your current RegExp matches not only the reference number but also (at least part of) every ordinary word of five or more letters in the input:
Please
AB45676785567XYZ
which
identify
reference
number
This is because your RegExp, [a-zA-Z]{2}[^\s]*?[a-zA-Z]{3}, is asking for:
[a-zA-Z]{2}
Begins with 2 letters (either case)
[^\s]*?
Contains anything that isn't whitespace
[a-zA-Z]{3}
Ends with 3 letters (either case)
In your current example, restricting the letters to uppercase only would produce only the match you seek:
[A-Z]{2}[^\s]+[A-Z]{3}
Alternatively, requiring numbers between the 2 beginning and 3 ending letters would also produce the match you want:
[a-zA-Z]{2}\d+[a-zA-Z]{3}
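To see the difference between the three patterns, here is a minimal sketch in Python (chosen only for illustration; the question does not say which regex engine is in use). The variable names are made up for the example. Because the middle part of the original pattern is lazy, its matches on ordinary words stop after the first five letters.

```python
import re

text = ("Please find the ref AB45676785567XYZ. which is used to "
        "identify reference number")

# The pattern from the question: any 5+ letter word satisfies it.
original = re.compile(r"[a-zA-Z]{2}[^\s]*?[a-zA-Z]{3}")
# The two narrower patterns suggested in this answer.
uppercase_only = re.compile(r"[A-Z]{2}[^\s]+[A-Z]{3}")
digits_between = re.compile(r"[a-zA-Z]{2}\d+[a-zA-Z]{3}")

original_matches = original.findall(text)
uppercase_matches = uppercase_only.findall(text)
digit_matches = digits_between.findall(text)

print(original_matches)   # ['Pleas', 'AB45676785567XYZ', 'which', 'ident', 'refer', 'numbe']
print(uppercase_matches)  # ['AB45676785567XYZ']
print(digit_matches)      # ['AB45676785567XYZ']
```

Both narrower patterns rely on assumptions about the data (uppercase letters, or digits in the middle) that the question does not confirm.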
Start with 2 letters :
[a-zA-Z]{2}
Digits in the middle :
\d+
Finish with 3 letters :
[a-zA-Z]{3}
Full Regex :
[a-zA-Z]{2}\d+[a-zA-Z]{3}
If the middle text is alphanumeric (and the surrounding letters are uppercase), you can use this:
[A-Z]{2}[^\s]+[A-Z]{3}
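The three-part breakdown of the digit-only pattern can also be kept next to the pattern itself. The following is a sketch in Python using its verbose regex mode, which ignores whitespace and allows comments inside the pattern; the name REFERENCE is hypothetical, and other regex engines may not offer an equivalent flag.

```python
import re

# Hypothetical name; each line mirrors one part of the breakdown.
REFERENCE = re.compile(
    r"""
    [a-zA-Z]{2}   # start with 2 letters
    \d+           # digits in the middle
    [a-zA-Z]{3}   # finish with 3 letters
    """,
    re.VERBOSE,
)

text = ("Please find the ref AB45676785567XYZ. which is used to "
        "identify reference number")

match = REFERENCE.search(text)
reference = match.group(0) if match else None
print(reference)  # AB45676785567XYZ
```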
What is really important here is word boundaries, \b. Try: \b[a-zA-Z]{2}\w+[a-zA-Z]{3}\b
Explanation:
\b - word boundary
[a-zA-Z]{2} - match any letter, 2 times
\w+ - match one or more word characters
[a-zA-Z]{3} - match any letter, 3 times
\b - word boundary
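CAUTION: your requirements are ambiguous, as any word consisting of 5 or more letters satisfies them. With the pattern above, which needs at least one character in the middle, any word of 6 or more letters matches.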
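The caution can be checked directly. This sketch, in Python and with illustrative variable names, runs the word-boundary pattern on the example input and then a stricter variant that borrows the digits-in-the-middle assumption from the other answers; that assumption is not confirmed by the question.

```python
import re

text = ("Please find the ref AB45676785567XYZ. which is used to "
        "identify reference number")

# Word-boundary pattern from this answer: matches whole words only,
# but still accepts any ordinary word of 6 or more letters.
bounded = re.compile(r"\b[a-zA-Z]{2}\w+[a-zA-Z]{3}\b")
# Stricter variant (assumption: the middle part is digits only).
bounded_digits = re.compile(r"\b[a-zA-Z]{2}\d+[a-zA-Z]{3}\b")

bounded_matches = bounded.findall(text)
strict_matches = bounded_digits.findall(text)

print(bounded_matches)  # ['Please', 'AB45676785567XYZ', 'identify', 'reference', 'number']
print(strict_matches)   # ['AB45676785567XYZ']
```

The boundaries make each match a whole word, but only an extra constraint on the middle part separates the reference from plain words.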