Regex not matching groups (Python)

Question

On my administration page I have a list of accounts with various values that I want to capture, such as id, name, type, etc. On Regex101 the pattern captures all the values perfectly with the "g" and "s" modifiers active. This is what I am trying to do:

def extract_accounts(src):
        list_accounts = []
        try:
            pattern = re.compile(r'''id=(?P<id>.*?)&serverzone=.\">(?P<name>[a-zA-Z].*?)<\/a>.*?75px;\">(?P<level>.*?)<\/td>.*?75px;.*?75px;\">(?P<type>.*?)<\/td>.*?Open!''', re.X)
            print type(pattern)
            match = pattern.match(src)
            print match, "type=", type(match)
            name = match.group("name")
            print "name", name
            ids = match.group("id")
            level = match.group("level")
            type = match.group("type")
            #list_accounts.append(name, ids, level, type)
            #print ("id=", ids, ", name=",name," level=", level, " type=", type)
        except Exception as e:
            print (e)

But somehow I get this:

<type '_sre.SRE_Pattern'>
None type= <type 'NoneType'>
'NoneType' object has no attribute 'group'

I don't have a clue what I'm doing wrong. Basically, what I want is to put the things I grab from each line into a list: [(name1, id1, level1, type1), (name2, id2, level2, type2), ...] and so on. Thanks in advance for any help.

Answer 1

You can retrieve the captured values by their group number. I have changed the regular expression completely and implemented it like so:

#!/usr/bin/env python
# -*- coding: utf-8 -*- 
import re

def main():
    sample_data = '''
    <tr style="background-color: #343222;">
        <td style="width: 20px;"><img src="/images/Star.png" style="border: 0px;" /></td>
        <td><a target="_top" href="adminzone.php?id=2478&serverid=1">Mike</a></td>
        <td style="text-align: center;width: 75px;">74</td>
        <td>•Evolu†ion•</td>
        <td style="text-align: center;width: 100px;">1635</td>
        <td style="text-align: center;width: 75px;">40,826</td>
        <td style="text-align: center;width: 75px;">User</td>
        <td style="width: 100px;"><a target="_top" href="href="adminzone.php"><strong>Open!</strong></a></td>
    </tr>
    <tr style="background-color: #3423323;">
        <td style="width: 20px;"><img src="/images/Star.png" style="border: 0px;" /></td>
        <td><a target="_top" href="adminzone.php?suid=24800565&serverid=1">John</a></td>
        <td style="text-align: center;width: 75px;">70</td>
        <td>•Evolu†ion•</td>
        <td style="text-align: center;width: 100px;">9167</td>
        <td style="text-align: center;width: 75px;">36,223</td>
        <td style="text-align: center;width: 75px;">Admin</td>
        <td style="width: 100px;"><a style="color: #00DD19;" target="_top" href="adminzone.php?id=248005&serverid=1"><strong>Open!</strong></a></td>

'''

    # The pattern has five groups: 1 = id, 2 = name, 3 = level,
    # 4 = the "40,826" column, 5 = type. re.search is used because it scans
    # the whole string, whereas re.match only matches at its start.
    matchObj = re.search(r'id=(.*)&serverid=.">(.*)</a></td>\n.*?75px;">(.+)</td>\n.*\n.*\n.*75px;">(.+)</td>\n.*75px;">(.+)</td>', sample_data)

    if matchObj:
        user_id = matchObj.group(1)
        name = matchObj.group(2)
        level = matchObj.group(3)
        user_type = matchObj.group(5)  # group(4) is the "40,826" column, not the type
        print user_id, name, level, user_type


if __name__ == '__main__':
    main()

Output: 2478 Mike 74 User
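
The search above stops at the first row. To get the list of (name, id, level, type) tuples the question asks for, one possible approach is re.finditer, which plays roughly the role of Regex101's "g" modifier, combined with re.DOTALL, the counterpart of the "s" modifier. The following is only a minimal sketch, not the original poster's code: the shortened sample rows, the pattern, and the name extract_accounts_sketch are assumptions made for illustration, and the pattern would likely need adjusting for the real page. It uses single-argument print calls, so it should run under both Python 2 and Python 3.

```python
import re

# Hypothetical, shortened sample rows modelled on the data above.
SAMPLE = '''
<tr>
    <td><a target="_top" href="adminzone.php?id=2478&serverid=1">Mike</a></td>
    <td style="text-align: center;width: 75px;">74</td>
    <td style="text-align: center;width: 100px;">1635</td>
    <td style="text-align: center;width: 75px;">40,826</td>
    <td style="text-align: center;width: 75px;">User</td>
</tr>
<tr>
    <td><a target="_top" href="adminzone.php?id=248005&serverid=1">John</a></td>
    <td style="text-align: center;width: 75px;">70</td>
    <td style="text-align: center;width: 100px;">9167</td>
    <td style="text-align: center;width: 75px;">36,223</td>
    <td style="text-align: center;width: 75px;">Admin</td>
</tr>
'''

# Assumed pattern: with re.DOTALL, ".*?" may cross line breaks, so each
# lazy gap skips ahead to the next cell whose style ends in 75px.
ROW_PATTERN = re.compile(
    r'id=(?P<id>\d+)&serverid=\d+">(?P<name>[^<]+)</a>'
    r'.*?75px;">(?P<level>[^<]*)</td>'   # first 75px cell: level
    r'.*?75px;">[^<]*</td>'              # second 75px cell: skipped
    r'.*?75px;">(?P<type>[^<]*)</td>',   # third 75px cell: type
    re.DOTALL,
)


def extract_accounts_sketch(src):
    """Return a list of (name, id, level, type) tuples, one per matched row."""
    return [
        (m.group("name"), m.group("id"), m.group("level"), m.group("type"))
        for m in ROW_PATTERN.finditer(src)
    ]


accounts = extract_accounts_sketch(SAMPLE)
for account in accounts:
    print(account)
```

Named groups work fine here; whether to read them by name or by number is mostly a matter of taste. For real pages, an HTML parser is usually more robust than a regular expression.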

The above should give you a basic idea. In case you are wondering, group(0) is the entire matched text, not the regular expression itself.
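
One likely reason the code in the question prints None is that re.match only tries to match at the very beginning of the string, whereas re.search scans the whole string, which is closer to what Regex101 does. The short sketch below uses a made-up one-line input to illustrate that difference, along with group(0), numbered groups, and named groups. The input and the variable names are assumptions for illustration only.

```python
import re

# Made-up one-line input for illustration only.
text = '<td><a href="adminzone.php?id=2478&serverid=1">Mike</a></td>'
pattern = re.compile(r'id=(?P<id>\d+)&serverid=\d+">(?P<name>[^<]+)</a>')

# re.match anchors at position 0, so it fails: text starts with "<td>".
anchored = pattern.match(text)
print(anchored)

# re.search scans forward until it finds a match.
found = pattern.search(text)
print(found.group(0))       # the whole matched text
print(found.group(1))       # first group, looked up by number
print(found.group("name"))  # a group looked up by name
```

Checking the result for None before calling group(), as the answer's `if matchObj:` does, avoids the "'NoneType' object has no attribute 'group'" error.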

Answer 1 (accepted, 2015-09-01 14:43:28)
