
Python: get a substring from a regex match

I want to extract a substring from a string that conforms to a certain regex. The regex is:

(\\[\\s*(\\d)+ byte(s)?\\s*\\](\\s*|\\d|[A-F]|[a-f])+)

This effectively means that all of these strings are accepted:

[4 bytes] 66 74 79 70 33 67 70 35
[ 4 bytes ] 66 74 79 70 33 67 70 35
[1 byte] 66 74 79 70 33 67 70 35

I want to extract only the number of bytes (just the number) from this string. I thought of doing this with re.search, but I'm not sure whether that will work. What would be the cleanest and most performant way of doing this?

Use match.group to retrieve the groups your regular expression defines:

import re

s = """[4 bytes] 66 74 79 70 33 67 70 35
[ 4 bytes ] 66 74 79 70 33 67 70 35
[1 byte] 66 74 79 70 33 67 70 35"""
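# (\d+) captures the whole number; (\d)+ would keep only the last digit,
# so "[12 bytes]" would yield "2" instead of "12".
r = re.compile(r"(\[\s*(\d+) byte(s)?\s*\](\s*|\d|[A-F]|[a-f])+)")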

for line in s.split("\n"):
    m = r.match(line)
    if m:
        print(m.group(2))

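Group 1 is the outermost pair of parentheses, so it spans the whole match (the bracketed header plus the hex bytes). Group 2 captures only the number inside the brackets, which is 4 for the first line.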

Output:

4
4
1
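
Since the question also asks about re.search, here is a minimal sketch of that route. It assumes you only need the count and do not need to validate the hex bytes that follow. The pattern HEADER, the group name count, and the helper byte_count are made up for this illustration and are not part of the original answer. Note that it is slightly more lenient than the original regex, because it allows any whitespace between the number and "byte".

```python
import re

# Hypothetical, narrower pattern: it matches only the bracketed header
# and does not check the hex bytes that follow it.
HEADER = re.compile(r"\[\s*(?P<count>\d+)\s+bytes?\s*\]")

def byte_count(line):
    """Return the byte count from a '[N bytes]' header, or None if absent."""
    m = HEADER.search(line)
    return int(m.group("count")) if m else None

for line in ["[4 bytes] 66 74 79 70", "[ 12 bytes ] 66 74", "[1 byte] 66", "no header"]:
    print(byte_count(line))
```

A named group avoids counting parentheses, and compiling the pattern once keeps the per-line cost low when looping over many lines. If the header must sit at the start of the line, HEADER.match can be used in place of HEADER.search.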
