
regex to match strings not ending with a pattern?

I am trying to write a regular expression that matches strings that do NOT end with a dot followed by a number.

For example:

abcd1
abcdf12
abcdf124
abcd1.0
abcd1.134
abcdf12.13
abcdf124.2
abcdf124.21

I want to match the first three.
I tried adapting this post, but it didn't work for me because the number may have variable length.

Can someone help?

You can use something like this:

^((?!\.[\d]+)[\w.])+$

It is anchored at the start and end of the line. It basically says:

Anchor at the start of the line
At each position, DO NOT continue if the pattern .NUMBERS comes next
Otherwise take the next letter, digit, underscore or dot
Anchor at the end of the line

So, this pattern matches this (no dot then number):

This.Is.Your.Pattern or This.Is.Your.Pattern2012

However it won't match this (dot before the number):

This.Is.Your.Pattern.2012
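
As a quick check, here is a minimal Python sketch that runs the pattern above against the strings from the question. Python is only an assumption here (the question does not name a language), and the names PATTERN, SAMPLES and matched are made up for illustration.

```python
import re

# Pattern from the answer above: the negative lookahead is tested before every
# character, so a dot followed by digits is rejected anywhere in the string.
PATTERN = re.compile(r'^((?!\.[\d]+)[\w.])+$')

# Sample strings taken from the question.
SAMPLES = [
    "abcd1", "abcdf12", "abcdf124",
    "abcd1.0", "abcd1.134", "abcdf12.13", "abcdf124.2", "abcdf124.21",
]

matched = [s for s in SAMPLES if PATTERN.match(s)]
print(matched)  # ['abcd1', 'abcdf12', 'abcdf124']
```

Note that the character class [\w.] also limits the match to word characters and dots, so a string containing a space or a hyphen would be rejected regardless of how it ends.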

EDIT: In response to Wiseguy's comment, you can use this:

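^((?!\\.[\\d]+$)[\\w.])+$ - which adds an end anchor after the number, so only a dot followed by digits at the very end of the string is rejected (not that you specified that in your question). The backslashes are doubled here as they would be inside a string literal (for example in Java); as a bare regex this is ^((?!\.[\d]+$)[\w.])+$.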
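
To see what the extra $ inside the lookahead changes, the following Python sketch compares the two versions. It assumes Python raw strings (so the backslashes are single), and the test string v1.2beta is a hypothetical example that is not from the question.

```python
import re

# First version: forbids a dot followed by digits anywhere in the string.
ANYWHERE = re.compile(r'^((?!\.[\d]+)[\w.])+$')

# Edited version: forbids a dot followed by digits only at the very end.
ANCHORED = re.compile(r'^((?!\.[\d]+$)[\w.])+$')

for text in ["abcd1", "abcd1.0", "v1.2beta"]:
    print(text, bool(ANYWHERE.match(text)), bool(ANCHORED.match(text)))
# abcd1 True True
# abcd1.0 False False
# v1.2beta False True
```

Only the last string is treated differently: it contains a dot followed by a digit, but does not end with one.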

If you can relax your restrictions a bit, you may try using this (extended) regular expression: ^[^.]*\.?[^0-9]*$

You may omit the anchoring metacharacters ^ and $ if you're using a function/tool that matches against the whole string.

Explanation: this regex allows any characters except a dot until an (optional) dot is found, after which only non-digit characters are allowed. It won't work for numbers in an improper format, as in abcd1...3 or abcd1.fdfd2. It also won't work correctly for some strings with multiple dots, such as abcd.ab123cd.a (the problem description is a bit ambiguous).
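
The behaviour described above can be checked with a short Python sketch. It assumes the optional dot in the pattern is meant literally (written \.?), as the explanation suggests; the name RELAXED is made up for illustration.

```python
import re

# Relaxed pattern: non-dot characters, an optional literal dot,
# then only non-digit characters up to the end.
RELAXED = re.compile(r'^[^.]*\.?[^0-9]*$')

# Strings from the question: only the first three match.
for text in ["abcd1", "abcdf12", "abcdf124", "abcd1.0", "abcdf124.21"]:
    print(text, bool(RELAXED.match(text)))

# Limitations mentioned above: neither string ends with a dot followed
# by a number, yet both are rejected because a digit follows a dot somewhere.
for text in ["abcd1.fdfd2", "abcd.ab123cd.a"]:
    print(text, bool(RELAXED.match(text)))  # False for both
```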

Philosophical explanation: when using regular expressions, you often don't need to solve exactly the task as stated, so even a simple regex will do the job. An abstract example: you have a file whose lines are either numbers or some complicated names (without digits), and, say, you want to filter out all the numbers; then simple filtering by [^0-9] - grep '^[0-9]' will do the job.

But if your task is more complex and requires validating the format and doing other fancy processing on the data, why not use a simple script (say, in awk, python, perl or another language)? Or a short "hand-written" function, if you're implementing a stand-alone application. Regexes are cool, but they are often not the right tool to use.
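
As one possible reading of the "hand-written function" idea, here is a minimal Python sketch that avoids regular expressions altogether. The function name ends_with_dot_number is hypothetical, and it assumes "number" means one or more ASCII digits.

```python
def ends_with_dot_number(text: str) -> bool:
    """Return True if text ends with a dot followed by one or more ASCII digits."""
    # rpartition splits on the LAST dot; if there is no dot, 'dot' is empty.
    _head, dot, tail = text.rpartition(".")
    return dot == "." and tail.isascii() and tail.isdigit()


samples = ["abcd1", "abcdf12", "abcdf124", "abcd1.0", "abcd1.134"]
wanted = [s for s in samples if not ends_with_dot_number(s)]
print(wanted)  # ['abcd1', 'abcdf12', 'abcdf124']
```

Because the check looks only at what follows the last dot, a string such as abcd.ab123cd.a is accepted.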

I would use a simple negative look-behind at the end:

.*(?<!\\.\\d+)$
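
Not every engine accepts a variable-length look-behind like this one; Python's built-in re module, for instance, requires look-behind to be fixed-width. The sketch below therefore swaps in a different technique, a negative lookahead at the start of the string, which expresses the same condition. Python and the names used are assumptions for illustration, not part of the answer.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# The look-behind form is rejected by Python's re (variable-width look-behind).
print(compiles(r'.*(?<!\.\d+)$'))  # False

# Swapped-in alternative: a lookahead that rejects a trailing ".digits".
NO_DOT_NUMBER_AT_END = re.compile(r'^(?!.*\.\d+$).*$')

samples = ["abcd1", "abcdf12", "abcdf124", "abcd1.0", "abcdf12.13"]
print([s for s in samples if NO_DOT_NUMBER_AT_END.match(s)])
# ['abcd1', 'abcdf12', 'abcdf124']
```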
