What is the ideal way to work with XML data when parsing HTML in Python with Beautiful Soup?

Question

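What is the ideal way to convert XML to text when parsing HTML in Python with Beautiful Soup?

When I parse HTML with the BeautifulSoup library in Python 2.7, I can get as far as creating the "soup", but I do not know how to extract the data I need from it, so I tried converting everything to a string.

In the example below, I want to extract all the numbers inside the span tags and add them up. Is there a better way?

XML data: http://python-data.dr-chuck.net/comments_324255.html

Code: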

import urllib2
from BeautifulSoup import *
import re

url = 'http://python-data.dr-chuck.net/comments_324255.html'
html = urllib2.urlopen(url).read()
soup = BeautifulSoup(html)
spans = soup('span')
lis = list()
span_str = str(spans)
sp = re.findall('([0-9]+)', span_str)
count = 0
for i in sp:
    count = count + int(i)
print('Sum:', count)
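
A note on the approach above: matching digits across the whole stringified result also picks up any digits that appear inside attributes, not only the tag text. The sketch below illustrates this with the standard library alone. The `span_str` value is a hypothetical stand-in for `str(spans)`, and the `id` attributes are invented for illustration; the real page may not contain them.

```python
import re

# Hypothetical stand-in for str(spans); the id attributes are invented
# for illustration and are not taken from the real page.
span_str = (
    '[<span class="comments" id="row7">97</span>, '
    '<span class="comments" id="row8">90</span>]'
)

# Matching digit runs anywhere also captures the 7 and 8 in the id attributes.
all_digits = [int(n) for n in re.findall(r'[0-9]+', span_str)]

# Matching only digits that sit between ">" and "<" restricts the match to tag text.
text_digits = [int(n) for n in re.findall(r'>\s*([0-9]+)\s*<', span_str)]

print('Sum over every digit run:', sum(all_digits))
print('Sum over tag text only:', sum(text_digits))
```

On markup without numeric attributes the two sums agree, which is probably why the original code gives the expected result on this page.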

Answer 1

No regular expressions are needed:

from bs4 import BeautifulSoup
from requests import get

url = 'http://python-data.dr-chuck.net/comments_324255.html'
html = get(url).text
soup = BeautifulSoup(html, 'lxml')

count = sum(int(n.text) for n in soup.findAll('span'))
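
To try the same idea without network access, here is a minimal sketch that runs against an inline string. `SAMPLE_HTML` and `sum_spans` are hypothetical names, and the markup only imitates what the page is assumed to look like. It uses the built-in `html.parser` backend, so `lxml` does not have to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page is assumed to have a similar shape.
SAMPLE_HTML = """
<table>
  <tr><td>Ada</td><td><span class="comments">97</span></td></tr>
  <tr><td>Bob</td><td><span class="comments">90</span></td></tr>
  <tr><td>Cy</td><td><span class="comments"> 3 </span></td></tr>
</table>
"""

def sum_spans(html):
    """Return the sum of the integer text of every <span> in html."""
    soup = BeautifulSoup(html, "html.parser")
    # get_text(strip=True) trims surrounding whitespace before int() is applied.
    return sum(int(span.get_text(strip=True)) for span in soup.find_all("span"))

total = sum_spans(SAMPLE_HTML)
print("Sum:", total)
```

Note that `int()` raises `ValueError` if any span holds non-numeric text, so this form assumes every span on the page contains a number.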

Answer 2

import requests, bs4
r = requests.get("http://python-data.dr-chuck.net/comments_324255.html")
soup = bs4.BeautifulSoup(r.text, 'lxml')

sum(int(span.text) for span in soup.find_all(class_="comments"))
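
Filtering by class helps when a page contains other spans that should not be counted. A minimal sketch, assuming hypothetical markup in which some spans carry a different class or non-numeric text (`SAMPLE_HTML` is invented for illustration):

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one span has another class, one has non-numeric text.
SAMPLE_HTML = """
<span class="comments">12</span>
<span class="label">Total 99</span>
<span class="comments">n/a</span>
<span class="comments">30</span>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

values = []
# Restrict the search to <span> tags whose class is "comments".
for tag in soup.find_all("span", class_="comments"):
    text = tag.get_text(strip=True)
    # Skip entries that are not plain digit strings instead of letting int() raise.
    if text.isdigit():
        values.append(int(text))

total = sum(values)
print("Sum:", total)
```

Whether to skip bad entries silently or to fail loudly is a design choice; skipping is used here only to keep the sketch short.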


Answer 1
1 2017-01-19 14:15:48

Answer 2
0 2017-01-19 14:16:21
