Regex to capture numbers after a variable number of spaces

Question

I am trying to use a non-capturing group to match the spaces before the numbers I need, without including those spaces in my result, so I use

(?: 1+)\d*.?\d*

to process my text:

 input: kMPCV/epS4SgFoNdLo3LOuClO/URXS/5         134.686356921  2018-06-14 21:50:35.494
 input: pRVh7kPpFbtmuwS1NILiCzwHUVwJ4NcK         839.680408921  2018-06-14 22:13:39.996
 input: Ga7MIXmXAsrbaEc1Yj60qYYblcRQpnpz         4859.688276920  2018-06-14 23:02:11.125
 input: 4mqdb5njytfDOFpgeG3XS0Iv1OXFPEnb        1400.684675920  2018-06-14 23:33:42.031

and try to get the numbers.

But lines 2 and 3 return None, and lines 1 and 4 return the number with one space before it: " 134.686356921"

Why do I get different results? The code is below:

import re
def calcprice(filename):

    try:
        print ('ok')
        f = open(filename, 'r')
        data = f.read()
        rows = data.split('\n')

        for row in rows:
            print (re.search("[(?: 1+)\d*\.?\d*][1]",row))


    except Exception as e:
        print(e)


if __name__ == "__main__": ## If we are not importing this:
    calcprice('dfk balance.txt')

Result:

<_sre.SRE_Match object; span=(52, 66), match=' 134.686356921'>

None

<_sre.SRE_Match object; span=(51, 66), match=' 1400.684675920'>

Answer 1

Your current regex is basically one big character set:

[(?: 1+)\d*\.?\d*]

which doesn't make much sense and looks like a misunderstanding of how regex works. If you want to match the numbers, it would probably make more sense to look behind for a couple of spaces, match digits and periods, and look ahead for another couple of spaces:

(?<=  )[\d.]+(?=  )

https://regex101.com/r/NRnXWb/1

for row in rows:
    print (re.search(r"(?<=  )[\d.]+(?=  )",row))
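As a rough check of the lookaround approach, the sketch below runs that pattern over the four sample rows from the question. The `ROWS` list and the `extract_number` helper are names made up for this illustration, and the " input: " prefix is left out on the assumption that it is not part of the real file.

```python
import re

# Sample rows copied from the question (the " input: " prefix is omitted here).
ROWS = [
    "kMPCV/epS4SgFoNdLo3LOuClO/URXS/5         134.686356921  2018-06-14 21:50:35.494",
    "pRVh7kPpFbtmuwS1NILiCzwHUVwJ4NcK         839.680408921  2018-06-14 22:13:39.996",
    "Ga7MIXmXAsrbaEc1Yj60qYYblcRQpnpz         4859.688276920  2018-06-14 23:02:11.125",
    "4mqdb5njytfDOFpgeG3XS0Iv1OXFPEnb        1400.684675920  2018-06-14 23:33:42.031",
]

# Two spaces behind, a run of digits/periods, two spaces ahead.
NUMBER_BETWEEN_SPACES = re.compile(r"(?<=  )[\d.]+(?=  )")


def extract_number(row):
    """Return the first run of digits/periods flanked by two spaces, or None."""
    match = NUMBER_BETWEEN_SPACES.search(row)
    return match.group() if match else None


numbers = [extract_number(row) for row in ROWS]
print(numbers)
```

Because the spaces are only asserted by the lookarounds and never consumed, `match.group()` is the number itself with no leading space.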

Answer 2

Your regex [(?: 1+)\d*\.?\d*][1] consists of two character classes.
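To make the character-class point concrete, here is a small sketch that applies the pattern exactly as quoted above to the first sample row (without the " input: " prefix, which is an assumption about the file). Inside square brackets, characters such as "(", "?", ":" and "+" lose their special meaning, so the whole pattern can only ever match two characters: one from the first set, followed by a literal 1.

```python
import re

# The pattern as quoted in the question: two character classes back to back.
pattern = re.compile(r"[(?: 1+)\d*\.?\d*][1]")

row = "kMPCV/epS4SgFoNdLo3LOuClO/URXS/5         134.686356921  2018-06-14 21:50:35.494"

match = pattern.search(row)
print(repr(match.group()))  # one character from the set, then "1"

# "(", "?", ":" and "+" are treated as ordinary characters inside [...].
literal_hits = pattern.findall("(1 ?1 :1 +1")
print(literal_hits)
```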

If the number you want to match always contains a dot, you could use a word boundary and a positive lookahead to assert that what follows is a space:

\b\d+\.\d+(?= )

If the number could also appear without a dot, you could check for a leading and a trailing space using lookarounds, and make the part that matches a dot followed by one or more digits optional with (?:\.\d+)?:

(?<= )\d+(?:\.\d+)?(?= )

Demo
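The difference between the two patterns in this answer can be sketched as follows. The first row is taken from the question (minus the " input: " prefix); `integer_row` is a hypothetical row invented here to show a number without a decimal part.

```python
import re

row = "Ga7MIXmXAsrbaEc1Yj60qYYblcRQpnpz         4859.688276920  2018-06-14 23:02:11.125"
integer_row = "abc   42  2018-06-14 21:50:35.494"  # hypothetical row, not from the question

# Variant 1: the number must contain a dot and be followed by a space.
with_dot = re.compile(r"\b\d+\.\d+(?= )")

# Variant 2: a space on both sides, decimal part optional.
optional_dot = re.compile(r"(?<= )\d+(?:\.\d+)?(?= )")

first = with_dot.search(row).group()
second = optional_dot.search(row).group()
print(first, second)

# On the hypothetical integer row only the second variant finds the number.
no_match = with_dot.search(integer_row)
integer_match = optional_dot.search(integer_row).group()
print(no_match, integer_match)
```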

Answer 3

Try the regex \b(\d+[\d\.]*)\b

Your regex doesn't match what you're trying to do. It's pretty erroneous.
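A quick check of this pattern against two rows from the question (again without the " input: " prefix, as an assumption) suggests a caveat worth knowing. A word boundary also holds between "/" and a digit, so with `re.search` the first hit in row 1 is the trailing 5 of the identifier rather than the number column. This is a sketch of the behavior, not a judgment on the answer; requiring surrounding spaces would be one way to narrow the match.

```python
import re

pattern = re.compile(r"\b(\d+[\d\.]*)\b")

row1 = "kMPCV/epS4SgFoNdLo3LOuClO/URXS/5         134.686356921  2018-06-14 21:50:35.494"
row2 = "pRVh7kPpFbtmuwS1NILiCzwHUVwJ4NcK         839.680408921  2018-06-14 22:13:39.996"

# Row 1: the identifier ends in "/5", and "5" sits between two word boundaries.
first_hit_row1 = pattern.search(row1).group(1)

# Row 2: every digit in the identifier is surrounded by letters, so the
# first standalone number is the one in the second column.
first_hit_row2 = pattern.search(row2).group(1)

print(first_hit_row1, first_hit_row2)
```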

Answer 4

Try this pattern: " +(\d+(\.\d+)?) +" (note the leading and trailing space).

Explanation: the pattern matches a number preceded and followed by one or more spaces ( +). It also accepts an optional decimal part, (\.\d+)?, which becomes the second capturing group in a match (but you won't need it anyway).

In every match, the first capturing group (\1) will be your number.

Demo
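In Python, the capturing-group approach might look like the sketch below. The row is the fourth sample from the question with the " input: " prefix dropped, which is an assumption about the real file; the variable names are illustrative only.

```python
import re

# One or more spaces, the number (group 1) with an optional decimal part
# (group 2), then one or more spaces.
pattern = re.compile(r" +(\d+(\.\d+)?) +")

row = "4mqdb5njytfDOFpgeG3XS0Iv1OXFPEnb        1400.684675920  2018-06-14 23:33:42.031"

match = pattern.search(row)
whole = match.group(0)     # includes the surrounding spaces
number = match.group(1)    # the number without any spaces
fraction = match.group(2)  # the decimal part, or None if there is none

print(repr(whole), number, fraction)
```

Here the spaces are consumed by the match, so `group(0)` still contains them; reading `group(1)` is what keeps them out of the result.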

Answer 1
0 accepted 2018-08-06 06:01:29

Answer 2
0 2018-08-06 06:03:19

Answer 3
0 2018-08-06 06:05:30

Answer 4
0 2018-08-06 06:14:48
