I have this regular expression code for detecting anomalous input:
if not re.match(r'(-?[01]\.[0-9]{1,8})\s(-?[01]\.[0-9]{1,8})', text):
    print("anomaly!!")
I want to flag anything that does not fit the pattern by using if not.
My code usually works well, but I found cases where it doesn't work:
0.00000000e+00 // It should be error (included non-numeric strings)
0.000000 // It should be error (complete zero cannot exist)
0.00 0000 // It should be error (included non-numeric strings)
The values must comply with the following rules:
1. There must be between one and eight digits after the decimal point.
2. Non-numeric characters are never allowed.
3. A complete zero (0.0 to 0.00000000) cannot exist.
4. A value must exist after the decimal point.
I think my regular expression can't detect a complete zero (0.0 to 0.00000000) or non-numeric values.
How can I add these two checks to my regular expression?
Please give me some advice.
These are my test cases:
[-0.19666128 -0.0000] # It should be error (complete zero cannot exist)
[-1.09666128 -0.16812956] # It should be correct.
[-0.180045 -0.22017317] # It should be correct.
[1.00000786 -0.24855652] # It should be correct.
[0.1766060 -1.] # It should be error (A value must exist after the decimal point)
[1.16797414 0.00000000e+00] # It should be error (included non-numeric strings)
[-0. 0.] # It should be error (A value must exist after the decimal point)
[1.1223297 -0.2840327] # It should be correct.
[1. -0. ] # It should be error (A value must exist after the decimal point and included non-numeric strings)
[-0.11070672 -0.20553467] # It should be correct.
[1.04924586 -0.16772696] # It should be correct.
[0.06169098 -0.15855075] # It should be correct.
[-0.11988816 1.20512903] # It should be correct.
[-0.180045 -1.22017317] # It should be correct.
[-0.18486786 -0.24855652] # It should be correct.
Add a $ character to the end of the pattern. Otherwise the re.match() function will accept any string that begins with the pattern.
example:
print(re.match(r'a', 'abc'))
print(re.match(r'a$', 'abc'))
result:
<re.Match object; span=(0, 1), match='a'>
None
In your case:
>>> print(re.match(r'(-?[01]\.[0-9]{1,8})\s(-?[01]\.[0-9]{1,8})$', '1.16797414 0.00000000e+00'))
None
>>> print(re.match(r'(-?[01]\.[0-9]{1,8})\s(-?[01]\.[0-9]{1,8})', '1.16797414 0.00000000e+00'))
<re.Match object; span=(0, 21), match='1.16797414 0.00000000'>
(This will not filter out the "exact zero" case. You will need to either force a non-zero digit at the end, e.g. [0-9]{0,7}[1-9] (note that this filters out 0.0000 but also 0.50000), or check the value of the matched number afterward.)
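A minimal sketch of the second option, checking the value after the match. The helper name is_valid_pair is hypothetical, and re.fullmatch is used here in place of the trailing $ (my own substitution; both anchor the end of the string, although $ also tolerates a single trailing newline):

```python
import re

# One number: optional sign, a leading 0 or 1, then one to eight decimals.
NUMBER = r'-?[01]\.[0-9]{1,8}'
# Two such numbers separated by a single whitespace character.
PAIR = re.compile(rf'({NUMBER})\s({NUMBER})')


def is_valid_pair(text):
    """Return True if text is exactly two well-formed, non-zero numbers."""
    m = PAIR.fullmatch(text)
    if m is None:
        return False
    # Reject "complete zero" values such as 0.0 or -0.0000.
    return all(float(g) != 0.0 for g in m.groups())


samples = [
    '-1.09666128 -0.16812956',
    '1.16797414 0.00000000e+00',
    '-0.19666128 -0.0000',
    '0.1766060 -1.',
    '0.50000 0.5',
]
for sample in samples:
    print(sample, '->', 'ok' if is_valid_pair(sample) else 'anomaly!!')
```

Unlike the [0-9]{0,7}[1-9] variant, this keeps values with trailing zeros such as 0.50000, because the zero test is done on the parsed number rather than on the digits.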
You need to use re.search (since re.match only looks for a match at the start of the string) and use the following regex:
\[(-?(?!0(?:\.0+)?\s)[01](?:\.[0-9]{1,8})?)\s+(-?(?!0(?:\.0+)?])[01](?:\.[0-9]{1,8})?)]
See the regex demo.
The (?!0(?:\.0+)?\s) and (?!0(?:\.0+)?]) lookaheads will cancel the match if either of the numbers is all zeros.
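To see what the lookahead does in isolation, here is a small sketch that applies the first one to a single number followed by a space. The name FIRST and the sample strings are made up for illustration:

```python
import re

# (?!0(?:\.0+)?\s) cancels the match when the text at this position is
# 0, 0.0, 0.00, ... immediately followed by whitespace.
FIRST = re.compile(r'-?(?!0(?:\.0+)?\s)[01](?:\.[0-9]{1,8})?\s')

samples = ['0.0000 ', '-0.0 ', '0 ', '0.0500 ', '0.5 ', '1.0 ']
results = {s: bool(FIRST.match(s)) for s in samples}

for s, ok in results.items():
    print(repr(s), 'matches' if ok else 'rejected')
```

The first three samples are rejected because they are all zeros; the last three match, since a non-zero digit (or a leading 1) makes the lookahead's inner pattern fail.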
See the Python demo:
import re
n = r'[01](?:\.[0-9]{1,8})?'  # Number matching part declared as a variable
# Each lookahead rejects an all-zero number; {n} then matches the number itself
rx = re.compile(fr"\[(-?(?!0(?:\.0+)?\s){n})\s+(-?(?!0(?:\.0+)?]){n})]")
test_strs = ["[-0.19666128 -0.0000]", "[-1.09666128 -0.16812956]", "[-0.180045 -0.22017317]", "[1.00000786 -0.24855652]", "[0.1766060 -1.]", "[1.16797414 0.00000000e+00]",
             "[-0. 0.]", "[1.1223297 -0.2840327]", "[1. -0. ]", "[-0.11070672 -0.20553467]", "[1.04924586 -0.16772696]",
             "[0.06169098 -0.15855075]", "[-0.11988816 1.20512903]", "[-0.180045 -1.22017317]", "[-0.18486786 -0.24855652]"]
for text in test_strs:
    if rx.search(text):
        print(f'{text}: Valid')
    else:
        print(f'{text}: Invalid')
Output:
[-0.19666128 -0.0000]: Invalid
[-1.09666128 -0.16812956]: Valid
[-0.180045 -0.22017317]: Valid
[1.00000786 -0.24855652]: Valid
[0.1766060 -1.]: Invalid
[1.16797414 0.00000000e+00]: Invalid
[-0. 0.]: Invalid
[1.1223297 -0.2840327]: Valid
[1. -0. ]: Invalid
[-0.11070672 -0.20553467]: Valid
[1.04924586 -0.16772696]: Valid
[0.06169098 -0.15855075]: Valid
[-0.11988816 1.20512903]: Valid
[-0.180045 -1.22017317]: Valid
[-0.18486786 -0.24855652]: Valid