Python regex: extract multiple patterns from a series containing decimals
Say I have these strings:
1. ACTNOWQUICK3 1234.56 1234.98 HYE912630964589376 PLUTO THEATRE OTHER WUN Cool Beans KIng
2. Cash WithdrawalATM 50.00 ABC 1111.22 23523455A
3. ACTNOWQUICK 76.53 653.24 HYE91234234589376 WiN OTHR JOHNKLING
I need to extract parts of each string such that I get everything before the first numerical value, everything after the second one, and the two numerical values themselves. Note that it is guaranteed that each string contains only 2 numeric (int/decimal) values, each with a space before and after.
This is what I have tried, but it's not giving me the expected output:
pattern = '(.*)([0-9]*[,.][0-9]*).*([0-9]*[,.][0-9]*)(.*)'
What was expected:
1. "ACTNOWQUICK3", 1234.56, 1234.98, "HYE912630964589376 PLUTO THEATRE OTHER WUN Cool Beans KIng"
2. "Cash WithdrawalATM", 50.00, 1111.22, "23523455A"
3. "ACTNOWQUICK", 76.53, 653.24, "HYE91234234589376 WiN OTHR JOHNKLING"
You're using a greedy quantifier. As Michael recommends, just change the first two .* to lazy by adding a ? after each of them. Also add a space after the first group and before the last one, and use + instead of * so that each number needs at least one digit on both sides of the separator.
pattern = '(.*?) ([0-9]+[,.][0-9]+).*?([0-9]+[,.][0-9]+) (.*)'
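To see the pattern applied to the three sample strings, here is a minimal sketch. The helper name split_line is hypothetical, and converting the captured numbers to float (treating a comma as a decimal separator) is an assumption, not something the question specifies. Note that the pattern requires a decimal separator, so a bare integer such as 50 would not match.

```python
import re

# Pattern from the answer: lazy prefix, two decimal numbers, greedy remainder.
PATTERN = re.compile(r'(.*?) ([0-9]+[,.][0-9]+).*?([0-9]+[,.][0-9]+) (.*)')

lines = [
    "ACTNOWQUICK3 1234.56 1234.98 HYE912630964589376 PLUTO THEATRE OTHER WUN Cool Beans KIng",
    "Cash WithdrawalATM 50.00 ABC 1111.22 23523455A",
    "ACTNOWQUICK 76.53 653.24 HYE91234234589376 WiN OTHR JOHNKLING",
]


def split_line(line):
    """Hypothetical helper: return (prefix, first, second, suffix), or None if no match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    prefix, first, second, suffix = m.groups()
    # Assumption: a comma, if present, is a decimal separator.
    return (prefix,
            float(first.replace(',', '.')),
            float(second.replace(',', '.')),
            suffix)


for line in lines:
    print(split_line(line))
```

In the second string the text between the two numbers ("ABC") is consumed by the uncaptured .*? and so is dropped, which matches the expected output in the question.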
This works because a lazy quantifier repeats as few times as possible, so the first group stops just before the first number instead of consuming part of it.
Test here: https://regex101.com/r/PVR6bd/1
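The difference between the greedy and the lazy version can be checked directly on the first sample string. This is only an illustrative sketch; the variable names are made up for the comparison.

```python
import re

line = ("ACTNOWQUICK3 1234.56 1234.98 HYE912630964589376 "
        "PLUTO THEATRE OTHER WUN Cool Beans KIng")

# Original attempt: the greedy (.*) grabs as much as it can, and because
# [0-9]* may match nothing, the integer part of each number is lost.
greedy = re.match(r'(.*)([0-9]*[,.][0-9]*).*([0-9]*[,.][0-9]*)(.*)', line)

# Fixed pattern: lazy .*? plus [0-9]+ keeps each number whole.
lazy = re.match(r'(.*?) ([0-9]+[,.][0-9]+).*?([0-9]+[,.][0-9]+) (.*)', line)

greedy_head = greedy.groups()[:3]
lazy_head = lazy.groups()[:3]

print(greedy_head)  # ('ACTNOWQUICK3 1234', '.56', '.98')
print(lazy_head)    # ('ACTNOWQUICK3', '1234.56', '1234.98')
```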