How to fix this regular expression in Python?
I want to process some string data that prints out like this:
'node0, node1 0.04, node8 11.11, node14 72.21\n'
'node1, node46 1247.25, node6 20.59, node13 64.94\n'
I want to find all the floating-point numbers here. This is the code I use:
for node in nodes:
    pattern = re.compile('(?<!node)\d+.\d+')
    distance = pattern.findall(node)
However, the result looks like this:
['0.04', '11.11', '4 72']
while what I want is this:
['0.04', '11.11', '72.21']
Any suggestions on fixing this regular expression?
The . in your expression is unescaped.
for node in nodes:
    pattern = re.compile(r"(?<!node)\d+\.\d+")
    distance = pattern.findall(node)
In regular expressions, a . character is interpreted as a wildcard and can match (almost) any character; by default, that means any character except a newline. Thus your search pattern actually allows a digit or set of digits, followed by any character, followed by another digit or set of digits. To stop this interpretation of the dot character, escape it with a backslash: \.
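To see the difference concretely, here is a small sketch that runs both patterns on the first sample line from the question. The variable names `line`, `unescaped`, and `escaped` are illustrative and do not come from the original post.

```python
import re

# First sample line from the question.
line = 'node0, node1 0.04, node8 11.11, node14 72.21\n'

# Unescaped dot: "." matches any character except a newline, including a space.
unescaped = re.compile(r'(?<!node)\d+.\d+')
# Escaped dot: "\." matches only a literal period.
escaped = re.compile(r'(?<!node)\d+\.\d+')

print(unescaped.findall(line))  # ['0.04', '11.11', '4 72']
print(escaped.findall(line))    # ['0.04', '11.11', '72.21']
```

The stray '4 72' arises because the lookbehind only blocks a match that starts immediately after "node". A match can still start at the "4" in "node14", where the unescaped dot then consumes the space before "72".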
(An aside: you don't need to compile your regex pattern inside your loop. In fact, doing so will slow your code down.)
import re

pattern = re.compile(r'(?<!node)\d+\.\d+')  # compile once, outside the loop
for node in nodes:
    distance = pattern.findall(node)
    print(distance)
Output:
['0.04', '11.11', '72.21']
['1247.25', '20.59', '64.94']
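If the distances are needed as numbers rather than strings, the same pattern can be wrapped in a small helper. This is only a minimal sketch: `extract_distances` and `FLOAT_PATTERN` are names invented here for illustration, and converting to `float` is an assumption about how the values will be used.

```python
import re

# Compiled once at module level, outside any loop.
FLOAT_PATTERN = re.compile(r'(?<!node)\d+\.\d+')


def extract_distances(line):
    """Return the decimal numbers found in one line as floats (hypothetical helper)."""
    return [float(match) for match in FLOAT_PATTERN.findall(line)]


# The two sample lines from the question.
nodes = [
    'node0, node1 0.04, node8 11.11, node14 72.21\n',
    'node1, node46 1247.25, node6 20.59, node13 64.94\n',
]

for node in nodes:
    print(extract_distances(node))
```

Note that the pattern only matches numbers containing a decimal point, so a bare integer distance such as "node3 5" would be skipped.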