re.findall only returning the last match
I have the following HTML:
<tr>
<td style="text-align: left;" colspan="1">10:10</td>
<td style="text-align: left;" colspan="1">This is a description.</td>
</tr>
<tr>
<td colspan="1">10:30</td>
<td colspan="1">This is another description.</td>
</tr>
I want to return multiple matches, each consisting of two groups: group 1 is the timestamp and group 2 is the description.
When I run
re.findall(r'<td.*>(\d\d:\d\d)<\/td><td.*>(.*?)<\/td>', HTML)
I'm only getting the last match:
[('10:30', 'This is another description.')]
Can anyone tell me what's wrong with my regex?
Your first .* is greedy: it matches as many characters as it can, so you get exactly one match that spans everything from the first <td to the last </td>. The timestamp group therefore ends up capturing the last timestamp in the string. Using [^>]* instead of .* for the first two wildcards (the ones inside the <td tags) makes each of them match only what's inside a single tag.
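To see the difference, here is a minimal sketch comparing the two patterns. It assumes the HTML actually arrives as a single line with no whitespace between tags, which is what the reported one-match output implies (with the line breaks shown in the question, the original pattern would find nothing, because . does not match newlines by default). The \s* between the two cells is my own addition, not part of the answer, so the fixed pattern also tolerates line breaks between tags.

```python
import re

# Assumption: the real input is one line; the layout in the question is
# only for readability.
HTML = (
    '<tr><td style="text-align: left;" colspan="1">10:10</td>'
    '<td style="text-align: left;" colspan="1">This is a description.</td></tr>'
    '<tr><td colspan="1">10:30</td>'
    '<td colspan="1">This is another description.</td></tr>'
)

# Original pattern: the greedy .* inside the first <td ...> swallows
# everything up to the last timestamp, so only one match is found.
greedy = re.findall(r'<td.*>(\d\d:\d\d)<\/td><td.*>(.*?)<\/td>', HTML)

# Suggested fix: [^>]* cannot cross a '>', so each wildcard stays inside
# its own tag. The \s* is an extra (assumed) allowance for whitespace.
fixed = re.findall(r'<td[^>]*>(\d\d:\d\d)<\/td>\s*<td[^>]*>(.*?)<\/td>', HTML)

print(greedy)  # [('10:30', 'This is another description.')]
print(fixed)   # [('10:10', 'This is a description.'), ('10:30', 'This is another description.')]
```

For anything beyond a small, regular snippet like this, an HTML parser is generally a more robust choice than a regex.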