
Regex - How do you match everything except four digits in a row?

Using Regex, how do you match everything except four digits in a row? Here is a sample text that I might be using:

foo1234bar
baz      1111bat
asdf 0000 fdsa
a123b

Matches might look something like the following:

"foo", "bar", "baz      ", "bat", "asdf ", " fdsa", "a123b"

Here are some regular expressions I've come up with on my own that have failed to capture everything I need:

[^\d]+            (this one splits a123b into a and b)
^.*(?=[\d]{4})    (this one does not include the line after the 4 digits)
^.*(?=[\d]{4}).*  (this one includes the numbers)

Any ideas on how to get matches before and after a four digit sequence?

You haven't specified your application language, but practically every language has a split function, and you'll get what you want if you split on the regex \d{4} (written as "\\d{4}" inside a Java string literal).

For example, in Java:

String[] stuffToKeep = input.split("\\d{4}");
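
Since another answer in this thread uses Python, here is a rough equivalent of the split approach in that language. It is only a sketch: the helper name split_on_four_digits is made up for illustration, and processing the text line by line and dropping empty pieces are assumptions about what the asker wants.

```python
import re

sample = """foo1234bar
baz      1111bat
asdf 0000 fdsa
a123b"""


def split_on_four_digits(text):
    """Split each line on runs of four digits and drop empty pieces.

    Hypothetical helper, not part of any library.
    """
    pieces = []
    for line in text.splitlines():
        # re.split returns the text found between matches of \d{4}.
        pieces.extend(part for part in re.split(r"\d{4}", line) if part)
    return pieces


print(split_on_four_digits(sample))
```

Going line by line keeps "bar" and "baz" from being joined across the newline. Note that a run of five or more digits is still split after its first four digits, so "ab12345cd" would yield "ab" and "5cd".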

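You can use a negative lookahead. Note that this pattern works on whole words: it skips a word only when it consists of exactly four digits (such as 0000), so foo1234bar is still matched as a single word: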

(?!\b\d{4}\b)(\b\w+\b)
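
To see what this pattern does on the sample text from the question, it can be run through Python's re.findall. This is just a quick check, not part of the original answer; the variable names are arbitrary.

```python
import re

sample = "foo1234bar\nbaz      1111bat\nasdf 0000 fdsa\na123b"

# The pattern from the answer: match a word unless it is exactly four digits.
word_pattern = re.compile(r"(?!\b\d{4}\b)(\b\w+\b)")

# With a single capturing group, findall returns the group contents.
words = word_pattern.findall(sample)
print(words)
# ['foo1234bar', 'baz', '1111bat', 'asdf', 'fdsa', 'a123b']
```

Only the standalone 0000 is skipped. The digits embedded in foo1234bar and 1111bat stay in the matches, and the surrounding whitespace is not captured.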


In Python the following is very close to what you want:

In [1]: import re

In [2]: sample = '''foo1234bar
   ...: baz      1111bat
   ...: asdf 0000 fdsa
   ...: a123b'''

In [3]: re.findall(r"([^\d\n]+\d{0,3}[^\d\n]+)", sample)
Out[3]: ['foo', 'bar', 'baz      ', 'bat', 'asdf ', ' fdsa', 'a123b']
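
The "very close" caveat matters for a few inputs. The sketch below compares the findall pattern with a plain split on \d{4}. The compare helper is hypothetical and exists only for this illustration.

```python
import re

findall_pattern = re.compile(r"([^\d\n]+\d{0,3}[^\d\n]+)")


def compare(line):
    """Return (findall result, split result) for one line of text."""
    by_findall = findall_pattern.findall(line)
    by_split = [part for part in re.split(r"\d{4}", line) if part]
    return by_findall, by_split


# The pattern needs at least two non-digit characters per match,
# so single-character pieces are missed.
print(compare("x1234y"))   # ([], ['x', 'y'])

# Only one short digit run is allowed inside a match,
# so text after a second run can be dropped.
print(compare("a12b34c"))  # (['a12b'], ['a12b34c'])
```

For input shaped like the sample in the question both approaches agree. The differences show up only in edge cases like these.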
