How to check if the whole input string (real numbers separated by a space) matches a regex in Python?

I have an input string consisting of a sequence of real numbers separated by a single space. It is also acceptable for the string to contain only one real number (no spaces). My goal is to check whether the string structure matches the following (in this order):

  • optional (0/1): a minus sign (-)
  • one or more digits
  • optional (1+): a period and one or more digits
  • optional (0+): a group consisting of a space and the first group (the first three bullet points)

It should describe the string completely. If not, it should print an error message and exit.
My current regular expression is ^(-?\d+(\.?\d)*)( \1)*$ which I thought would be okay, but even the first group doesn't match all the real numbers individually. And I need it to check the string from the beginning to the end, including the spaces.

My code for this function looks like this:

import re
def structure_check(string):
    structure = r"^(-?\d+(\.?\d)*)( \1)*$"
    if re.match(structure,string):
        return("OK")
    else:
        print("Input error")
        exit()

It should accept strings like: 15 35 -45 8 -2.3 4564.18 56 etc., but it doesn't respond to changes in the input at all (it never matches). It shouldn't match if there are too many spaces, an incorrectly placed . or - , or characters other than digits, periods, dashes ( - ) and spaces.

I could also do this with just the first group while iterating over a list created by splitting the input string on spaces, but I would prefer to check it according to my main goal: I wouldn't have to split the input in the validation function, and checking the input all together would save some lines of code (e.g. for excess spaces or unsupported characters, which I'd otherwise have to check separately).

Sorry if I missed any answered questions; I couldn't find any that fit my problem in Python. If you know of any, please feel free to link them. And thank you, I am a beginner and started learning regex for a project just yesterday.

You can use:

^((?:[+-]?\d+(?:[.]\d+)?)(?:[ \t]|$))*$ 

Demo and explanation

I added + to the optional sign. If you only want to match no sign or - , just remove it from the optional character class.

You could also use an unrolled version to prevent matching a space at the end.

^-?\d+(?:\.\d+)?(?: -?\d+(?:\.\d+)?)*$

Regex demo
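As a minimal sketch of how the unrolled pattern could be plugged into the validation function, assuming the function should return a boolean instead of printing and exiting (that return convention and the name NUMBER_LIST are choices made for this example, not part of the question):

```python
import re

# Unrolled pattern from above: one number, then zero or more " number"
# repetitions, so a leading or trailing space cannot match.
# NUMBER_LIST is a name chosen for this sketch.
NUMBER_LIST = re.compile(r"-?\d+(?:\.\d+)?(?: -?\d+(?:\.\d+)?)*")


def structure_check(string):
    """Return True if the whole string is a space-separated list of numbers."""
    # fullmatch anchors at both ends of the string, so ^ and $ are not
    # needed; unlike "$" with re.match, it also rejects a trailing newline.
    return NUMBER_LIST.fullmatch(string) is not None


if __name__ == "__main__":
    for text in ["15 35 -45 8 -2.3 4564.18 56", "42", "15  35", "15 35 ", "1.2.3"]:
        print(repr(text), structure_check(text))
```

The caller can then decide whether to print "Input error" and exit when the function returns False.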


The backreference \1 matches exactly the text that was matched by group 1, so your pattern will match, for example, 123 123 123
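A short check with the standard re module illustrates this; the helper name matches_original is made up for the example:

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"^(-?\d+(\.?\d)*)( \1)*$")


def matches_original(text):
    """Return True if the question's pattern matches the text."""
    return ORIGINAL.match(text) is not None


print(matches_original("123 123 123"))  # True: every item repeats group 1's text
print(matches_original("15 35"))        # False: "35" is not the text "15"
```

The backreference repeats the matched text, not the sub-pattern, which is why a list of different numbers is rejected.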

If you want to repeat the group, you could recurse the first group using the PyPI regex module and (?1)

^(-?\d+(?:\.\d+)?)(?: (?1))*$

See a Python example

In JavaScript you can use the .test method of a regex. The same regex should also work in Python.

 let ok = /^(([+\-]?\d+(\.\d+)?)( |$))+$/.test("15 35 -45 8 -2.3 4564.18 56"); console.log(ok);

Explanation: in (\.\d+)? the whole group must be optional, so that the period and the digits after it appear together or not at all. The number can be followed by a space or the end of the string, ( |$). The pattern repeats throughout the string, so I wrapped the entire expression in a group. Insert ^ at the beginning of the regex and $ at the end to force it to check the whole string.
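A rough Python port of the JavaScript test, as a sketch; the function name check is hypothetical. Because the separator alternative ( |$) also accepts a space after the last number, a string ending in a single space passes as well:

```python
import re

# Same regex as in the JavaScript snippet, written as a Python raw string.
PATTERN = re.compile(r"^(([+\-]?\d+(\.\d+)?)( |$))+$")


def check(text):
    """Return True if the pattern matches the text."""
    return PATTERN.match(text) is not None


print(check("15 35 -45 8 -2.3 4564.18 56"))  # True
print(check("15  35"))                       # False: two consecutive spaces
print(check("15 35 "))                       # True: one trailing space is accepted
```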

The problem is in your regexp, specifically in the ( \1)* part. Described in words, it means: a space followed by the exact string that was matched by group 1, zero or more times. Thus, your regexp will match the following, for example:
15 15 15
-5.3 -5.3 -5.3 -5.3

And so on.

To fix the regexp, I would replace the group reference with the actual group, like so:
^(-?\d+(\.?\d)*)( -?\d+(\.?\d)*)*$

I would also point out that this regexp allows numbers to have multiple decimal dots (e.g. 1.2.3 passes); however, I'm not sure whether that's intended.
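To make the difference concrete, here is a small comparison. The STRICT pattern is an assumption about what is wanted (at most one fractional part per number) and has the same shape as the unrolled pattern given in another answer; the names PERMISSIVE and STRICT are made up for this sketch:

```python
import re

# Pattern from this answer: "(\.?\d)*" can repeat, so several dots pass.
PERMISSIVE = re.compile(r"^(-?\d+(\.?\d)*)( -?\d+(\.?\d)*)*$")

# Assumed stricter variant: each number has at most one ".digits" part.
STRICT = re.compile(r"^-?\d+(\.\d+)?( -?\d+(\.\d+)?)*$")

for text in ["15 35 -45 8 -2.3 4564.18 56", "1.2.3"]:
    print(repr(text),
          PERMISSIVE.match(text) is not None,
          STRICT.match(text) is not None)
```

Both patterns accept the sample input from the question; only the permissive one accepts 1.2.3.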
