I am attempting to write a regular expression to match certain patterns except for those with a preceding pattern. In other words given the following sentence:
Don't want to match paragraph 1.2.3.4 but this instead 5.6.7.8
I would like to match all XXXX that do not have the word paragraph in front of them, i.e. it should only match 5.6.7.8. My current regex, shown below, seems to match both 1.2.3.4 and 5.6.7.8. I have switched around the lookaheads, but that doesn't seem to fit my use case.
(?<!paragraph)(?:[\(\)0-9a-zA-Z]+\.)+[\(\)0-9a-zA-Z]+
I code in JavaScript.
EDIT: Note that XXXX is not fixed at 4 Xs. It ranges from XX to XXXXX.
Your pattern matches because "paragraph" is not the same as "paragraph[space]". Your pattern doesn't have a space. Your text does.
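To make that concrete, here is a small sketch (the variable names are mine, and the g flag is added only so that every match is collected). It prints the nine characters sitting directly in front of 1.2.3.4, which end in a space, so the lookbehind for "paragraph" alone never rejects that position:

```javascript
// The asker's original pattern, with the g flag added so every match is collected.
const original = /(?<!paragraph)(?:[\(\)0-9a-zA-Z]+\.)+[\(\)0-9a-zA-Z]+/g;
const sentence = "Don't want to match paragraph 1.2.3.4 but this instead 5.6.7.8";

// The text immediately before "1.2.3.4" ends with a space, so it is not "paragraph".
const start = sentence.indexOf("1.2.3.4");
const precedingText = sentence.slice(start - 9, start);
console.log(JSON.stringify(precedingText)); // "aragraph "

const originalMatches = Array.from(sentence.matchAll(original), (m) => m[0]);
console.log(originalMatches); // [ '1.2.3.4', '5.6.7.8' ]
```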
You may want to add the space (perhaps conditionally?) to your lookbehind. Because you want to match a varying number of Xs (you've said XX through XXXXX), we need to include "X." in the lookbehind as well:
const rex = /(?<!paragraph *(?:[\(\)0-9a-zA-Z]+\.)*)(?:[\(\)0-9a-zA-Z]+\.){1,4}[\(\)0-9a-zA-Z]/i;
Live Example:
function test(str) {
    // Reject a match that is preceded by "paragraph", optional spaces, and any earlier "X." terms
    const rex = /(?<!paragraph *(?:[\(\)0-9a-zA-Z]+\.)*)(?:[\(\)0-9a-zA-Z]+\.){1,4}[\(\)0-9a-zA-Z]/i;
    const match = rex.exec(str);
    console.log(match ? match[0] : "No match");
}

console.log("Testing four 'digits':");
test("Don't want to match paragraph 1.2.3.4 but this instead 5.6.7.8 blah");
console.log("Testing two 'digits':");
test("Don't want to match paragraph 1.2.3.4 but this instead 5.6 blah");
console.log("Testing two 'digits' again:");
test("Don't want to match paragraph 1.2 but this instead 5.6 blah");
console.log("Testing five 'digits':");
test("Don't want to match paragraph 1.2 but this instead 5.6.7.8.9 blah");
That expression requires that:
- "paragraph" followed by zero or more spaces, possibly followed by "X." zero or more times, is not immediately prior to the match; and
- "X." is repeated one to four times ( {1,4} ); and
- X immediately follows those repetitions.
X in my example is built from A-Z and 0-9 (plus the parentheses from your original character class), and I've made the expression case-insensitive, but you can tweak as needed.
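As a rough illustration of why the lookbehind has to cover the earlier "X." terms and not just the space, the sketch below compares a lookbehind containing only "paragraph " with the full one. The names spaceOnly, withTerms and collect are hypothetical, and the g flag is added so that all matches are listed:

```javascript
const sentence = "Don't want to match paragraph 1.2.3.4 but this instead 5.6.7.8";

// Lookbehind covers only "paragraph " (with the space).
const spaceOnly = /(?<!paragraph )(?:[\(\)0-9a-zA-Z]+\.)+[\(\)0-9a-zA-Z]+/gi;
// Lookbehind also covers any "X." terms that follow "paragraph".
const withTerms = /(?<!paragraph *(?:[\(\)0-9a-zA-Z]+\.)*)(?:[\(\)0-9a-zA-Z]+\.){1,4}[\(\)0-9a-zA-Z]/gi;

// Helper (hypothetical name): collect the full text of every match.
const collect = (rex, str) => Array.from(str.matchAll(rex), (m) => m[0]);

// With only the space, the engine skips "1" but then starts a match at "2".
console.log(collect(spaceOnly, sentence)); // [ '2.3.4', '5.6.7.8' ]
console.log(collect(withTerms, sentence)); // [ '5.6.7.8' ]
```

One caveat, as far as I can tell: the sample text uses single-character terms. With longer terms such as 12.34, a match can still begin in the middle of a term after "paragraph", so the pattern may need further tightening for that kind of input.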
Note that lookbehind was only added to JavaScript in ES2018, so it requires an up-to-date JavaScript environment. If you need lookbehind in older environments, you might check out Steven Levithan's excellent XRegExp library.
Also note that variable-length lookbehind like the above is not supported in all languages (but is supported in JavaScript...in engines that are up-to-date).
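If support is uncertain, one possible approach is to feature-detect lookbehind and fall back to a pattern that does not need it. The sketch below is only an illustration under assumptions: supportsLookbehind and findUnprefixed are hypothetical names, and the fallback assumes the terms are plain digits. It captures an optional "paragraph" prefix and discards any match where that prefix was present:

```javascript
// Engines without lookbehind throw a SyntaxError when compiling the pattern.
function supportsLookbehind() {
  try {
    new RegExp("(?<!a)b");
    return true;
  } catch (err) {
    return false;
  }
}

// Fallback without lookbehind: group 1 is set only when "paragraph" precedes the numbers.
function findUnprefixed(str) {
  var rex = /(paragraph\s*)?((?:\d+\.)+\d+)/gi;
  var results = [];
  var m;
  while ((m = rex.exec(str)) !== null) {
    if (m[1] === undefined) {
      results.push(m[2]); // keep only matches with no "paragraph" prefix
    }
  }
  return results;
}

console.log(supportsLookbehind());
console.log(findUnprefixed("Don't want to match paragraph 1.2.3.4 but this instead 5.6.7.8")); // [ '5.6.7.8' ]
```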
If you always want to match a group of 4 items, you can do this:
(?<!paragraph )([0-9]+\.?){4}
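A quick way to try this fixed-length pattern is sketched below. It uses an escaped dot (\.) so that only a literal period separates the items, adds the g flag, and uses hypothetical names (fourItems, collect). The second call shows that it does not cover the shorter groups mentioned in the question's EDIT:

```javascript
// Four digit groups, each optionally followed by a literal dot, not preceded by "paragraph ".
const fourItems = /(?<!paragraph )([0-9]+\.?){4}/g;

// Helper (hypothetical name): collect the full text of every match.
const collect = (rex, str) => Array.from(str.matchAll(rex), (m) => m[0]);

const fourResult = collect(fourItems, "Don't want to match paragraph 1.2.3.4 but this instead 5.6.7.8");
console.log(fourResult); // [ '5.6.7.8' ]

// A two-item group is not matched, because exactly four items are required.
const twoResult = collect(fourItems, "Don't want to match paragraph 1.2 but this instead 5.6");
console.log(twoResult); // []
```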
You can build the regex iteratively:
const inputData = 'Don\'t want to match paragraph 1.2.3.4 but this instead 5.6.7.8 and 12.2.333.2';
// Match four dot-separated numbers that are not preceded by "paragraph" and whitespace
const re = /(?<!paragraph\s+)(\d{1,}\.\d{1,}\.\d{1,}\.\d{1,})/ig;
const matchedGroups = inputData.matchAll(re);
for (const matchedGroup of matchedGroups) {
    console.log(matchedGroup);
}
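One way to read "iteratively" is to assemble the pattern from smaller named pieces. The following is a sketch of that idea, not necessarily what the answer's author had in mind; the piece names (number, fourNumbers, notAfterParagraph) are hypothetical, and String.raw is used so the backslashes reach the RegExp constructor unchanged:

```javascript
const inputData = "Don't want to match paragraph 1.2.3.4 but this instead 5.6.7.8 and 12.2.333.2";

// Step 1: a single number.
const number = String.raw`\d+`;
// Step 2: one number followed by three more ".number" parts.
const fourNumbers = String.raw`${number}(?:\.${number}){3}`;
// Step 3: the guard that rejects matches directly after "paragraph" plus whitespace.
const notAfterParagraph = String.raw`(?<!paragraph\s+)`;

// Step 4: combine the pieces into the final expression.
const re = new RegExp(`${notAfterParagraph}(${fourNumbers})`, "gi");

const found = Array.from(inputData.matchAll(re), (m) => m[1]);
console.log(found); // [ '5.6.7.8', '12.2.333.2' ]
```

Each piece can be tested on its own before the next one is added, which tends to make it easier to see which part of a long pattern is misbehaving.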