
How can I get more than one digit using parentheses in regular expressions

I was trying to extract values from HTML using urllib and regular expressions in Python 3. When I ran this code, it gave me only one digit of each number instead of both digits, even though I added a "+" sign, which means one or more times. What's wrong here?

import re
import urllib.error,urllib.parse,urllib.request
from bs4 import BeautifulSoup
finalnums=[]
sumn=0
urlfile = urllib.request.urlopen("http://py4e-data.dr-chuck.net/comments_42.html")

html=urlfile.read()
soup = BeautifulSoup( html,"html.parser" )
spantags = soup("span")
for span in spantags:
    span=span.decode()  
    numlist=re.findall(".+([0-9].*)<",span)
    print(numlist)
    finalnums.extend(numlist)
for anum in finalnums:
    sumn=sumn+int(anum)
print("Sum = ",sumn)

This is an example of the string I'm trying to extract the number from:

 <span class="comments">54</span>

Use numlist=re.findall(r"\d+",span) to search for all contiguous groups of digit characters. The r prefix makes the pattern a raw string, so the backslash reaches the regex engine unchanged.

\d is a character class that matches any decimal digit. For ASCII input like this it is equivalent to [0-9], so numlist=re.findall("[0-9]+",span) would also work.
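To see why the original pattern returns only one digit, it may help to run it next to the suggested ones on the sample string from the question. In the original, the "+" applies to the leading ".", which is greedy and consumes everything up to the last digit, so only "4" is left for the group. The last pattern below, which keeps a capture group, is my own illustrative variant and not part of the original answer.

```python
import re

# Sample string taken from the question.
span = '<span class="comments">54</span>'

# Original pattern: the greedy ".+" swallows the "5", so the group
# "([0-9].*)" can only capture the last digit before "<".
original = re.findall(".+([0-9].*)<", span)

# Patterns suggested in the answer: one or more contiguous digits.
digits = re.findall(r"\d+", span)
digits_class = re.findall(r"[0-9]+", span)

# Illustrative variant (an assumption, not from the answer): keep the
# parentheses but anchor the group between ">" and "<".
anchored = re.findall(r">([0-9]+)<", span)

print(original)      # ['4']
print(digits)        # ['54']
print(digits_class)  # ['54']
print(anchored)      # ['54']
```

With the "+" placed on the digit class instead of on ".", each match covers the whole number.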

Since there is only one number in each <span> tag, a single re.search per tag is enough:

sumn = 0
for span in spantags:
    sumn += int(re.search(r'\d+', span.decode()).group(0))
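The loop above assumes every tag contains a number; re.search returns None when nothing matches, and calling .group(0) on None raises an AttributeError. A minimal sketch of a more defensive version follows. The helper name sum_span_numbers and the sample strings are hypothetical, and it works on already-decoded tag strings so that it runs without network access or BeautifulSoup.

```python
import re

def sum_span_numbers(spans):
    """Sum the first run of digits found in each string, skipping strings with none.

    Hypothetical helper for illustration; `spans` is a list of decoded tag strings.
    """
    total = 0
    for text in spans:
        match = re.search(r"\d+", text)
        if match is None:
            # No digits in this tag, so there is nothing to add.
            continue
        total += int(match.group(0))
    return total

# Made-up sample data in the same shape as the question's example.
sample_spans = [
    '<span class="comments">54</span>',
    '<span class="comments">7</span>',
    '<span class="comments"></span>',
]

result = sum_span_numbers(sample_spans)
print("Sum =", result)  # Sum = 61
```

In the question's code, the same function could be called with [span.decode() for span in spantags].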

