
How to check if the whole input string (real numbers separated by a space) matches a regex in Python?

I have an input string consisting of a sequence of real numbers separated by a single space. It is also acceptable for the string to contain only one real number (no spaces). My goal is to check whether the string structure matches the following (in this order):

  • optional (0/1): minus (-)
  • 1/more digits
  • optional (0/1): a period followed by 1/more digits
  • optional (0+): a group consisting of a space and the first group (the first three bullet points)

It should describe the string completely. If not, it should print an error message and exit.
My current regular expression is ^(-?\d+(\.?\d)*)( \1)*$ which I thought would be okay, but even the first group doesn't match all the real numbers individually. And I need it to check the string from the beginning to the end, including the spaces.

My code for this function looks like this:

import re
def structure_check(string):
    structure = r"^(-?\d+(\.?\d)*)( \1)*$"
    if re.match(structure,string):
        return("OK")
    else:
        print("Input error")
        exit()

It should accept strings like: 15 35 -45 8 -2.3 4564.18 56 etc., but it doesn't respond to changes in the input at all (it never matches). It shouldn't match if there are too many spaces, an incorrectly placed . or -, or any characters other than digits, periods, dashes (-) and spaces.

I could also do this with just the first group while iterating over a list created by splitting the input string on spaces, but I would prefer to check it according to my main goal. That way I wouldn't have to split the input in the validation function, and I would save some lines of code by checking the input all at once (e.g. for excess spaces or unsupported characters, which I'd otherwise have to check separately).

Sorry if I missed any answered questions, I couldn't find any appropriate for my problem in Python. If you know about any, feel free to link them, please. And thank you, I am a beginner and started learning regex for a project just about yesterday.

You can use:

^((?:[+-]?\d+(?:[.]\d+)?)(?:[ \t]|$))*$ 

Demo and explanation

I added + to the optional sign. If you only want to allow no sign or -, just remove the + from the character class.
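To see what this pattern accepts in practice, here is a small sketch using the standard re module. The helper name matches is made up for illustration. Note that, because each number may be followed by a space or the end of the string and the outer group is repeated zero or more times, the pattern also accepts a trailing space and the empty string, which may or may not be what you want.

```python
import re

# Pattern copied from the answer above; `matches` is a hypothetical helper name.
PATTERN = re.compile(r"^((?:[+-]?\d+(?:[.]\d+)?)(?:[ \t]|$))*$")

def matches(text):
    """Return True if the pattern accepts the text."""
    return PATTERN.match(text) is not None

samples = ["15 35 -45 8 -2.3 4564.18 56", "+7", "15  35", "1.2.3", "15 ", ""]
for sample in samples:
    # The last two samples (trailing space, empty string) are accepted too.
    print(repr(sample), matches(sample))
```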

You could also use an unrolled version to prevent matching a space at the end.

^-?\d+(?:\.\d+)?(?: -?\d+(?:\.\d+)?)*$
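As a rough sketch, the unrolled pattern could be dropped into the function from the question like this. The split into is_valid and structure_check is my own assumption to keep the check testable, and re.fullmatch is used in place of the ^ and $ anchors so that a trailing newline is rejected as well.

```python
import re
import sys

# Unrolled pattern from above, without ^ and $ because fullmatch anchors both ends.
NUMBERS = re.compile(r"-?\d+(?:\.\d+)?(?: -?\d+(?:\.\d+)?)*")

def is_valid(string):
    """Return True if the whole string is numbers separated by single spaces."""
    return NUMBERS.fullmatch(string) is not None

def structure_check(string):
    """Same contract as the function in the question: return "OK" or exit."""
    if is_valid(string):
        return "OK"
    print("Input error")
    sys.exit(1)

print(structure_check("15 35 -45 8 -2.3 4564.18 56"))
```

With this version, inputs such as "15 ", "15  35", "1.2.3", "5." and the empty string are all rejected.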

Regex demo


The backreference \1 matches exactly the text that group 1 matched, not the group's pattern, so your pattern matches, for example, 123 123 123.
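A quick way to convince yourself of this is to run the pattern from the question against a repeated number and against two different numbers:

```python
import re

# The original pattern from the question, with the \1 backreference.
pattern = re.compile(r"^(-?\d+(\.?\d)*)( \1)*$")

# \1 only accepts a repeat of the exact text captured by group 1.
same = pattern.match("123 123 123") is not None
different = pattern.match("123 456") is not None

print(same)       # True
print(different)  # False
```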

If you want to repeat the group's pattern rather than its matched text, you could recurse into the first group using the PyPI regex module and (?1):

^(-?\d+(?:\.\d+)?)(?: (?1))*$

See a Python example
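If installing a third-party module is not an option, a similar effect can be had with the standard re module by building the pattern from a string, so the number sub-pattern is written only once. This is a different technique from recursion (plain string composition), and the names NUMBER, SEQUENCE and is_number_sequence are only illustrative.

```python
import re

# Sub-pattern for one number: optional minus, digits, optional fractional part.
NUMBER = r"-?\d+(?:\.\d+)?"

# Reuse the sub-pattern by interpolating it, instead of recursing with (?1).
SEQUENCE = re.compile(rf"{NUMBER}(?: {NUMBER})*")

def is_number_sequence(text):
    """Return True if text is one or more numbers separated by single spaces."""
    return SEQUENCE.fullmatch(text) is not None

print(is_number_sequence("15 35 -45 8 -2.3 4564.18 56"))
print(is_number_sequence("15 15.5.5"))
```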

In JavaScript you can use the test method of a regex. The same regex should work in Python.

let ok = /^(([+\-]?\d+(\.\d+)?)( |$))+$/.test("15 35 -45 8 -2.3 4564.18 56");
console.log(ok);

Explanation: the decimal part (\.\d+)? must be optional as a whole group. Each number can be followed by a space or the end of the string, ( |$). The pattern repeats throughout the string, so I wrapped the entire expression in a group. Put ^ at the beginning of the regex and $ at the end to force it to check the whole string.
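For reference, a Python port of the snippet above might look like the sketch below (the function names are made up). One difference worth knowing: in Python, $ also matches just before a newline at the very end of the string, so with re.search a string like "15\n" is accepted. Using fullmatch avoids that, although a trailing space is still accepted because of the ( |$) alternative.

```python
import re

# Same regex as in the JavaScript snippet above.
JS_STYLE = re.compile(r"^(([+\-]?\d+(\.\d+)?)( |$))+$")

def js_style_ok(text):
    """Closest equivalent of JavaScript's regex.test(text)."""
    return JS_STYLE.search(text) is not None

def strict_ok(text):
    """Like js_style_ok, but the match must reach the real end of the string."""
    return JS_STYLE.fullmatch(text) is not None

print(js_style_ok("15 35 -45 8 -2.3 4564.18 56"))
print(js_style_ok("15\n"), strict_ok("15\n"))
```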

The problem is in your regexp, specifically in the ( \1)* part. It means: a space followed by the exact string that was matched by group 1, zero or more times. Thus, your regexp will match the following, for example:
15 15 15
-5.3 -5.3 -5.3 -5.3

And so on.

To fix the regexp, I would replace the group reference with the actual group, like so:
^(-?\d+(\.?\d)*)( -?\d+(\.?\d)*)*$

I would also point out that this regexp allows numbers with multiple decimal points (e.g. 1.2.3 passes); however, I'm not sure whether that's intended.
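If multiple decimal points are not intended, one possible tightening is to replace (\.?\d)* with (\.\d+)?, which allows at most one fractional part per number. The sketch below compares the two; the names LOOSE and STRICT are just labels for this example.

```python
import re

# Pattern from this answer: (\.?\d)* lets a number contain several dots.
LOOSE = re.compile(r"^(-?\d+(\.?\d)*)( -?\d+(\.?\d)*)*$")

# Assumed tightening: (\.\d+)? allows at most one ".digits" part per number.
STRICT = re.compile(r"^(-?\d+(\.\d+)?)( -?\d+(\.\d+)?)*$")

for sample in ["15 35 -45 8 -2.3 4564.18 56", "1.2.3", "5."]:
    loose_ok = LOOSE.match(sample) is not None
    strict_ok = STRICT.match(sample) is not None
    print(repr(sample), loose_ok, strict_ok)
```

Both versions accept the sample input from the question and reject "5.", but only the loose one accepts "1.2.3".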
