Python: How can I apply two exceptions to my regular expression?

I have this regular expression code for finding the anomaly pattern.

if not re.match(r'(-?[01]\.[0-9]{1,8})\s(-?[01]\.[0-9]{1,8})', text):
    print("anomaly!!")

I want to flag anything that does not match the expected pattern, using if not.

My code usually works well, but I found cases where it doesn't work:

0.00000000e+00 // It should be error (included non-numeric strings)
0.000000  // It should be error (complete zero cannot exist)
0.00  0000  // It should be error (included non-numeric strings)

The values must follow these rules:

1. There must be between one and eight digits after the decimal point.
2. Non-numeric characters must never appear.
3. A complete zero (0.0 to 0.00000000) cannot exist.
4. A value must exist after the decimal point.

I think my regular expression can't detect complete zeros (0.0 to 0.00000000) or non-numeric values.

How can I apply two exceptions to my regular expression?

Please give me some advice.

These are my test cases:

[-0.19666128 -0.0000]  # It should be error (complete zero cannot exist)
[-1.09666128 -0.16812956]  # It should be correct.
[-0.180045 -0.22017317]  # It should be correct.
[1.00000786 -0.24855652]  # It should be correct.
[0.1766060 -1.]  # It should be error (A value must exist after the decimal point)
[1.16797414 0.00000000e+00]  # It should be error (included non-numeric strings)
[-0. 0.]  # It should be error (A value must exist after the decimal point)
[1.1223297 -0.2840327]  # It should be correct.
[1. -0.       ]  # It should be error (A value must exist after the decimal point and included non-numeric strings)
[-0.11070672 -0.20553467]  # It should be correct.
[1.04924586 -0.16772696]  # It should be correct.
[0.06169098 -0.15855075]  # It should be correct.
[-0.11988816 1.20512903]  # It should be correct.
[-0.180045   -1.22017317]  # It should be correct.
[-0.18486786 -0.24855652]  # It should be correct.

Add a $ character to the end of the pattern. Otherwise the re.match() function will accept any string that begins with the pattern.

example:

print(re.match(r'a', 'abc'))
print(re.match(r'a$', 'abc'))

result:

   <re.Match object; span=(0, 1), match='a'>
   None
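
As a side note, $ also matches just before a single trailing newline, so if the text may end with "\n", re.fullmatch() is a stricter alternative. A minimal sketch of the difference (the variable names are only illustrative):

```python
import re

pattern = r'a'

# '$' still matches just before a single trailing newline
dollar_hit = re.match(pattern + r'$', 'a\n')
# fullmatch requires the whole string to match, newline included
full_hit = re.fullmatch(pattern, 'a\n')

print(dollar_hit is not None)
print(full_hit is not None)
```

If the input is stripped of whitespace beforehand, both approaches behave the same.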

In your case:

>>> print(re.match(r'(-?[01]\.[0-9]{1,8})\s(-?[01]\.[0-9]{1,8})$', '1.16797414 0.00000000e+00'))
None
>>> print(re.match(r'(-?[01]\.[0-9]{1,8})\s(-?[01]\.[0-9]{1,8})', '1.16797414 0.00000000e+00'))
<re.Match object; span=(0, 21), match='1.16797414 0.00000000'>

(This will not filter out the "exact zero" case. You will need to either force a non-zero digit at the end, e.g. [0-9]{0,7}[1-9] (note that this filters out 0.0000 but also 0.50000), or check the value of the matched number afterward.)
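
The second option, checking the value after matching, could look roughly like the sketch below. It assumes the text holds just the two numbers with no surrounding brackets, uses \s+ instead of \s so that runs of spaces are tolerated, and is_valid_pair is a hypothetical helper name, not something from the question.

```python
import re

# Same per-number pattern as above; \s+ is an assumption to allow several spaces
PAIR = re.compile(r'(-?[01]\.[0-9]{1,8})\s+(-?[01]\.[0-9]{1,8})')

def is_valid_pair(text):
    """Return True if text is exactly two well-formed, non-zero numbers."""
    m = PAIR.fullmatch(text)  # the whole string must match, so "e+00" is rejected
    if m is None:
        return False
    # Reject "complete zero" values such as 0.0000 or -0.0
    return all(float(g) != 0.0 for g in m.groups())

for sample in ['-1.09666128 -0.16812956', '-0.19666128 -0.0000',
               '1.16797414 0.00000000e+00', '0.1766060 -1.']:
    print(sample, '->', is_valid_pair(sample))
```

Unlike forcing a non-zero last digit, this keeps values such as 0.50000 valid.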

You need to use re.search with the following regex:

\[(-?(?!0(?:\.0+)?\s)[01](?:\.[0-9]{1,8})?)\s+(-?(?!0(?:\.0+)?])[01](?:\.[0-9]{1,8})?)]

The (?!0(?:\.0+)?\s) and (?!0(?:\.0+)?]) lookaheads will cancel the match if either of the numbers is all zeros.
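
To see what the negative lookahead does on its own, here is a small sketch that applies the second-number part of the regex (a number followed by the closing bracket) to a few standalone strings. The sample strings are made up for illustration.

```python
import re

# One number followed by "]"; the lookahead rejects values that are all zeros
number = re.compile(r'-?(?!0(?:\.0+)?])[01](?:\.[0-9]{1,8})?]')

for s in ['0.0000]', '-0.0]', '0]', '0.0100]', '-1.22017317]']:
    # The first three are all-zero values and fail; the last two pass
    print(s, bool(number.fullmatch(s)))
```

The lookahead only fails when the zeros run all the way to the bracket, so 0.0100 is still accepted.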

Python demo:

import re
n = r'[01](?:\.[0-9]{1,8})?' # Number matching part declared as a variable
# Each number: optional minus, a lookahead that rejects all-zero values, then the number itself
rx = re.compile(fr"\[(-?(?!0(?:\.0+)?\s){n})\s+(-?(?!0(?:\.0+)?]){n})]")
test_strs = ["[-0.19666128 -0.0000]","[-1.09666128 -0.16812956]","[-0.180045 -0.22017317]", "[1.00000786 -0.24855652]", "[0.1766060 -1.]", "[1.16797414 0.00000000e+00]",
"[-0. 0.]", "[1.1223297 -0.2840327]","[1. -0.       ]", "[-0.11070672 -0.20553467]","[1.04924586 -0.16772696]",
"[0.06169098 -0.15855075]","[-0.11988816 1.20512903]","[-0.180045   -1.22017317]","[-0.18486786 -0.24855652]"]
for text in test_strs:
    if rx.search(text):
        print(f'{text}: Valid')
    else:
        print(f'{text}: Invalid')

Output:

[-0.19666128 -0.0000]: Invalid
[-1.09666128 -0.16812956]: Valid
[-0.180045 -0.22017317]: Valid
[1.00000786 -0.24855652]: Valid
[0.1766060 -1.]: Invalid
[1.16797414 0.00000000e+00]: Invalid
[-0. 0.]: Invalid
[1.1223297 -0.2840327]: Valid
[1. -0.       ]: Invalid
[-0.11070672 -0.20553467]: Valid
[1.04924586 -0.16772696]: Valid
[0.06169098 -0.15855075]: Valid
[-0.11988816 1.20512903]: Valid
[-0.180045   -1.22017317]: Valid
[-0.18486786 -0.24855652]: Valid
