Python regex find/match one or more in a string
I've searched Google and this site until I can barely see straight, without finding a solution to my problem.
I want to pick out one or more occurrences of two different kinds of text sequence from a string:
e.g. 'aSATMPA23.37aSAAWAKE----aSABATT2.05-aSASLEEPING-'
So I'd like to be able to pick out the 'aSATMPA23.37' and, if it's there, also the 'aSABATT2.05'.
I've tried the following:
import re
serialdata = 'aSATMPA18.5-----aSBBATT2.97-aSBSLEEPING-'
def regex_serialdata(data):
    GrandRegex = re.compile(r'(aS(.)(TMPA)(\d+\.\d+))|(aS(.)(BATT)(\d+\.\d+))')
    match = GrandRegex.match(data)
but this stops after only the first match, 'aSATMPA18.5'.
Next I tried using the 'findall' method:
def regex_serialdata(data):
    GrandRegex = re.compile(r'(aS(.)(TMPA)(\d+\.\d+))|(aS(.)(BATT)(\d+\.\d+))')
    match = GrandRegex.findall(data)
    print(match)
Which resulted in: [('aSATMPA18.5', 'A', 'TMPA', '18.5', '', '', '', ''), ('', '', '', '', 'aSBBATT2.97', 'B', 'BATT', '2.97')]
Is there a better way to do this?
Can I access the values within the list of tuples easily?
Please note, I have spent hours on this and don't ask for help lightly.
Much appreciated,
Paul
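For readers comparing the two attempts above, here is a minimal sketch of how match(), search(), and finditer() differ on the question's sample string. The simplified pattern and the variable names are illustrative assumptions, not the asker's code.

```python
import re

serialdata = 'aSATMPA18.5-----aSBBATT2.97-aSBSLEEPING-'
# Simplified pattern (an assumption for this sketch): no capturing groups.
pattern = re.compile(r'aS.(?:TMPA|BATT)\d+\.\d+')

# match() only tries the pattern at position 0, so it yields at most one hit.
first = pattern.match(serialdata)
print(first.group(0))

# search() scans forward from the given position but still returns one hit.
second = pattern.search(serialdata, first.end())
print(second.group(0))

# finditer() walks the whole string and yields every non-overlapping hit.
all_hits = [m.group(0) for m in pattern.finditer(serialdata)]
print(all_hits)
```

This is why the first attempt reports only one reading: match() never looks past the start of the string.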
>>> import re
>>> a = 'aSATMPA23.37aSAAWAKE----aSATMPA15.14-aSASLEEPING-'
>>> re.findall(r'aSATMPA\d+\.\d+',a)
['aSATMPA23.37', 'aSATMPA15.14']
If you place the parentheses as shown below, you get a list of tuples holding the values you want from every match:
>>> a
'aSATMPA23.37aSAAWAKE----aSBBATT2.05-aSASLEEPING-'
>>> b = re.findall(r'(aS)(ATMPA|BBATT)(\d+\.\d+)',a)
>>> b
[('aS', 'ATMPA', '23.37'), ('aS', 'BBATT', '2.05')]
>>> b[0][0]
'aS'
>>> b[0][1]
'ATMPA'
>>> b[0][2]
'23.37'
>>> b[1][0]
'aS'
>>> b[1][1]
'BBATT'
>>> b[1][2]
'2.05'
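As an alternative to indexing tuples by position, named groups let each value be looked up by name. The sketch below is only one possible arrangement; the group names device, kind, and value are hypothetical labels chosen for this example.

```python
import re

a = 'aSATMPA23.37aSAAWAKE----aSBBATT2.05-aSASLEEPING-'

# Group names (device, kind, value) are made up for illustration.
reading = re.compile(r'aS(?P<device>.)(?P<kind>TMPA|BATT)(?P<value>\d+\.\d+)')

# groupdict() turns each match into a dict keyed by group name.
readings = [m.groupdict() for m in reading.finditer(a)]
for r in readings:
    print(r['device'], r['kind'], r['value'])
```

With this shape, r['value'] reads more clearly than b[0][2], at the cost of a slightly longer pattern.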
Is there a better way to do this?
Yes. Get rid of all of your parentheses:
import re
serialdata = 'aSATMPA18.5-----aSBBATT2.97-aSBSLEEPING-'
def regex_serialdata(data):
    GrandRegex = re.compile(r'aS.TMPA\d+\.\d+|aS.BATT\d+\.\d+')
    match = GrandRegex.findall(data)
    print(match)
regex_serialdata(serialdata)
Can I access the values within the list of tuples easily?
Yes. From your second example, try print(match[0][0], match[1][4]).
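The reason dropping the parentheses changes the result is that findall() returns a different shape depending on how many capturing groups the pattern has. A small sketch on the same sample string (the variable names are assumptions for illustration):

```python
import re

serialdata = 'aSATMPA18.5-----aSBBATT2.97-aSBSLEEPING-'

# No capturing groups: a list of whole-match strings.
no_groups = re.findall(r'aS.(?:TMPA|BATT)\d+\.\d+', serialdata)

# One capturing group: a list of that group's strings.
one_group = re.findall(r'aS.(?:TMPA|BATT)(\d+\.\d+)', serialdata)

# Two or more capturing groups: a list of tuples, one item per group.
two_groups = re.findall(r'aS.(TMPA|BATT)(\d+\.\d+)', serialdata)

print(no_groups)
print(one_group)
print(two_groups)
```

The (?:...) form groups the alternatives without capturing, so it does not affect the shape of the result.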
Try the following regex:
r'(aSA(?:TMPA|BATT))(\d+(?:\.\d+)?)'
Full code:
import re
p = re.compile(r'(aSA(?:TMPA|BATT))(\d+(?:\.\d+)?)')
test_str = """
aSATMPA23.37aSAAWAKE----aSABATT2.05-aSASLEEPING-aSATMPA23.37aSAAWAKE--
--aSABATT2.05-aSASLEEPING-aSATMPA23.37aSAAWAKE---
-aSABATT2.05-aSASLEEPING-aSATMPA23.37aSAAWAKE-
"""
for m in p.finditer(test_str):
    print('{0:<15}{1}'.format(m.group(1), m.group(2)))
It will print:
aSATMPA        23.37
aSABATT        2.05
aSATMPA        23.37
aSABATT        2.05
aSATMPA        23.37
aSABATT        2.05
aSATMPA        23.37
Based on your input, it will capture
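Two properties of this pattern are worth checking before reusing it: the fractional part is optional, and the device letter is fixed to 'A'. The sketch below uses a made-up sample string (not from the original post) to show both effects.

```python
import re

pattern = re.compile(r'(aSA(?:TMPA|BATT))(\d+(?:\.\d+)?)')

# Hypothetical input: an integer reading, a decimal reading,
# and a reading from device 'B'.
sample = 'aSATMPA23-aSABATT2.05-aSBBATT2.97-'

pairs = [m.groups() for m in pattern.finditer(sample)]
print(pairs)
```

The integer reading is matched because (?:\.\d+)? allows the decimals to be absent, while the device 'B' reading is skipped because the pattern spells out aSA literally. Replacing that A with . would accept any device letter.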
Thanks to everyone who replied and contributed. With your help I've come up with the following:
import re
serialdata = 'aSATMPA18.5-----aSBBATT2.97-aSBSLEEPING-'
def regex_serialdata(data):
    GrandRegex = re.compile(r'aS(.)(TMPA|BATT)(\d+\.\d+)')
    match = GrandRegex.findall(data)
    print(match)
    for x, y, z in match:
        if y == 'TMPA':
            print('Temp is %s' % z)
        elif y == 'BATT':
            print('Battery is %sv' % z)
This produced the following output, which is exactly what I want:
[('A', 'TMPA', '18.5'), ('B', 'BATT', '2.97'), ('B', 'TMPA', '24.18')]
Temp is 18.5
Battery is 2.97v
I'm delighted, it even looks pretty :)
Many thanks,
Paul
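If the readings are needed as numbers rather than printed strings, one possible variation on the final code is sketched below. The function name parse_serialdata and the LABELS lookup table are hypothetical, introduced only for this example.

```python
import re

READING = re.compile(r'aS(.)(TMPA|BATT)(\d+\.\d+)')

# Hypothetical lookup table replacing the if/elif chain.
LABELS = {'TMPA': 'Temp is {}', 'BATT': 'Battery is {}v'}

def parse_serialdata(data):
    """Return (device, kind, value) tuples with the value as a float."""
    return [(dev, kind, float(val)) for dev, kind, val in READING.findall(data)]

parsed = parse_serialdata('aSATMPA18.5-----aSBBATT2.97-aSBSLEEPING-')
for dev, kind, val in parsed:
    print(LABELS[kind].format(val))
```

Separating parsing from printing keeps the numeric values available for later use, and adding a new reading type then only needs a new alternative in the pattern and a new LABELS entry.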