
Extract text from an HTML string with Beautiful Soup

I wrote the following code to extract a price from a web page:

from urllib.request import urlopen
from bs4 import BeautifulSoup
url = "https://www.teleborsa.it/azioni/intesa-sanpaolo-isp-it0000072618-SVQwMDAwMDcyNjE4"
html = urlopen(url)
soup = BeautifulSoup(html,'lxml')
prize = soup.select('.h-price')
print(prize)

The output is:

<span class="h-price fc0" id="ctl00_phContents_ctlHeader_lblPrice">1,384</span>

I want to extract just the value 1,384.

Try this (JavaScript, run in the browser):

document.getElementById("ctl00_phContents_ctlHeader_lblPrice").innerText

Or, if the elements are dynamic, you can loop over each element and read its innerText.
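Since the question uses Python rather than browser JavaScript, here is a rough Beautiful Soup counterpart of the snippet above, offered as a sketch only. It looks the element up by its id and also loops over every element matching the class. The inline HTML stands in for the downloaded page, the second span (id `other_price`, value 2,050) is made up for illustration, and `html.parser` is used here only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# Inline HTML standing in for the fetched page. The first span is copied from
# the question's output; the second one is hypothetical, added to show looping.
html = """
<div>
  <span class="h-price fc0" id="ctl00_phContents_ctlHeader_lblPrice">1,384</span>
  <span class="h-price fc0" id="other_price">2,050</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Counterpart of document.getElementById(...).innerText
tag = soup.find(id="ctl00_phContents_ctlHeader_lblPrice")
price_by_id = tag.get_text(strip=True)

# Counterpart of looping over several elements and reading innerText from each
all_prices = [t.get_text(strip=True) for t in soup.select(".h-price")]

print(price_by_id)
print(all_prices)
```

Note that `soup.find(...)` returns `None` when nothing matches, so real code would want to check for that before calling `get_text()`.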

You can use the .text property to get the text you want.

For example:

from urllib.request import urlopen
from bs4 import BeautifulSoup
url = "https://www.teleborsa.it/azioni/intesa-sanpaolo-isp-it0000072618-SVQwMDAwMDcyNjE4"
html = urlopen(url)
soup = BeautifulSoup(html,'lxml')
prize = soup.select_one('.h-price') # <- change to .select_one() to get only one element
print(prize.text)                   # <- use the .text property to get text of the tag

Prints:

1,384
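The extracted value is still a string. If a number is needed, a conversion along these lines might work. This is a minimal sketch that assumes the page uses Italian number formatting (comma as the decimal separator, dot as the thousands separator); that is a guess about the site, not something stated in the question. The helper name `parse_price` is hypothetical.

```python
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Convert a string such as '1,384' to a Decimal.

    Assumes Italian formatting: '.' groups thousands, ',' marks decimals.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return Decimal(cleaned)


price = parse_price("1,384")
print(price)
```

`Decimal` is used instead of `float` so that the price keeps exactly the digits shown on the page.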

