
Need regex to ignore a specific string of only numbers

I'm using a third-party application that uses regex to find matches. By default it returns only the first match, since it is only looking for one piece of information per page. The only way to change this is to have it return all matches as an array, which I rarely want and which is not what I want for this match.

What I want it to find are ID codes. It just so happens that all the IDs start with 10 and are followed by four more digits.

Example:

104230

So I wrote this regex

10[0-9]{4}

The only problem with this is that there is a .js file in the header named 10022008.js, and since the application automatically takes the first match, every ID gets set from that file name (100220) instead of the real ID.

How do I get the regex to ignore that string of numbers, and that string only? I have searched, but none of the similar "ignore"-type patterns I found have worked.

Add the "word boundary" regex \b to each end of your regex:

\b10[0-9]{4}\b

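The word boundary matches between any "word" character (i.e. \w, which is [0-9a-zA-Z_]) and any non-word character, or vice versa, and is zero-width, so it won't add any characters to your capture. In 10022008.js the six digits 100220 are followed by another digit, so there is no boundary at that point and the file name is skipped.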
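A quick way to check this is a small script. This is only a sketch in Python's re module; the third-party application's regex flavor is not stated in the question, and the page string below is a made-up sample, so treat both as assumptions.

```python
import re

# Hypothetical page text: the script file name appears before the real ID.
page = '<script src="10022008.js"></script> ... ID: 104230'

# Without boundaries, the first match comes from inside the file name.
plain = re.search(r"10[0-9]{4}", page)

# With \b on both ends, the six digits must not touch other word characters.
bounded = re.search(r"\b10[0-9]{4}\b", page)

print(plain.group())    # 100220
print(bounded.group())  # 104230
```

Note that \b also rejects IDs glued to letters or underscores (for example id_104230), since those are word characters too.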

A negative lookahead is one solution. It may not be the most efficient, but I think it is the most readable.

10\d{4}(?!08\.js)

This will match 10 followed by any four digits, provided that those digits are not immediately followed by 08.js.
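Here is a small sketch of how that lookahead behaves, again using Python's re module as a stand-in for the application's engine (an assumption). The header and body strings are invented sample input.

```python
import re

pattern = re.compile(r"10\d{4}(?!08\.js)")

# Hypothetical input: the script tag comes first, the real ID later.
header = '<script src="10022008.js"></script>'
body = "ID: 104230"

# 100220 is followed by "08.js", so the lookahead rejects it
# and the search moves on to the real ID.
first = pattern.search(header + body)
print(first.group())  # 104230

# Only that exact file name is excluded; a different suffix still matches.
other = pattern.search("10022009.js")
print(other.group())  # 100220
```

This matches the "that string only" requirement in the question, but it will not protect against other long digit strings that happen to start with 10.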

I'm not sure what the input data looks like, but could you anchor the pattern to the beginning and end of the line?

^10[0-9]{4}$
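Whether this works depends on the ID sitting alone on its own line, and on how the engine treats ^ and $. As a rough illustration in Python (an assumed engine; the sample text is made up), the anchors only match at line boundaries when the MULTILINE flag is set:

```python
import re

# Hypothetical input where the ID sits alone on its own line.
text = "10022008.js\n104230\n"

# re.MULTILINE makes ^ and $ match at the start and end of each line.
alone = re.search(r"^10[0-9]{4}$", text, flags=re.MULTILINE)
print(alone.group())  # 104230

# If the ID shares a line with other text, the anchored pattern finds nothing.
embedded = re.search(r"^10[0-9]{4}$", "ID: 104230", flags=re.MULTILINE)
print(embedded)  # None
```

If the IDs are embedded in surrounding markup, the word-boundary or lookahead approaches are likely a better fit.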
