re.findall only returning the last match

I have the following HTML:

<tr>
<td style="text-align: left;" colspan="1">10:10</td>
<td style="text-align: left;" colspan="1">This is a description.</td>
</tr>
<tr>
<td colspan="1">10:30</td>
<td colspan="1">This is another description.</td>
</tr>

I want to return multiple matches, each consisting of two groups: group 1 is the timestamp and group 2 is the description.

When I run

re.findall(r'<td.*>(\d\d:\d\d)<\/td><td.*>(.*?)<\/td>', HTML)

I only get the last match:

[('10:30', 'This is another description.')]

Can anyone tell me what's wrong with my regex?

Your first .* is greedy: it matches as many characters as it can, so you get exactly one match, spanning everything from the first <td to the last </td>. The groups then capture only what follows the point the regex backtracked to, which is the last row. Using [^>]* instead of .* in the two <td.*> parts makes each one match only what's inside a single tag.
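Here is a small sketch comparing the two patterns. It assumes the HTML is a single string with no whitespace or line breaks between the tags, because both patterns require </td> to be followed immediately by <td. The names greedy and fixed are only for illustration.

```python
import re

# Assumption: the markup is one string with no whitespace between tags,
# since the pattern needs "</td>" to be directly followed by "<td".
HTML = (
    '<tr>'
    '<td style="text-align: left;" colspan="1">10:10</td>'
    '<td style="text-align: left;" colspan="1">This is a description.</td>'
    '</tr>'
    '<tr>'
    '<td colspan="1">10:30</td>'
    '<td colspan="1">This is another description.</td>'
    '</tr>'
)

# Original pattern: the greedy .* runs to the end of the string and
# backtracks only as far as needed, so the single match ends on the last row.
greedy = re.findall(r'<td.*>(\d\d:\d\d)<\/td><td.*>(.*?)<\/td>', HTML)

# Suggested pattern: [^>]* cannot cross a ">", so each <td ...> tag
# is matched on its own and every row produces a match.
fixed = re.findall(r'<td[^>]*>(\d\d:\d\d)<\/td><td[^>]*>(.*?)<\/td>', HTML)

print(greedy)  # [('10:30', 'This is another description.')]
print(fixed)   # [('10:10', 'This is a description.'), ('10:30', 'This is another description.')]
```

If the real markup does have line breaks or indentation between the cells, the patterns would also need something like \s* between <\/td> and <td; that case is not covered by this sketch.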


 