I am trying to extract the text inside a span by its id, but I get blank output using Python BeautifulSoup
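I am trying to extract the text inside a span tag selected by its id, but the output is blank.
I also tried using the text of the parent div element, but the extraction failed. Could someone please help me? My code is below.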
import requests
from bs4 import BeautifulSoup
r = requests.get('https://www.paperplatemakingmachines.com/')
soup = BeautifulSoup(r.text,'lxml')
mob = soup.find('span',{"id":"tollfree"})
print(mob.text)
I want the text inside that span, which is a mobile number.
You have to use Selenium, because that text is not present in the initial response, or at least not without searching through the <script> tags.
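from bs4 import BeautifulSoup
from selenium import webdriver
import time
# Raw string so the backslashes in the Windows path are not read as escape sequences.
# Passing the driver path positionally is the Selenium 3 style; Selenium 4 takes a Service object instead.
driver = webdriver.Chrome(r'C:\chromedriver_win32\chromedriver.exe')
url='https://www.paperplatemakingmachines.com/'
driver.get(url)
# It's better to use Selenium's WebDriverWait, but I'm still learning how to use that correctly
time.sleep(5)
# Parse the rendered DOM, which now includes the text filled in by JavaScript
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.close()
mob = soup.find('span',{"id":"tollfree"})
print(mob.text)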
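The fixed `time.sleep(5)` above can be swapped for an explicit wait, as the code comment suggests. This is a minimal sketch, assuming Selenium 4 and a Chrome driver that Selenium can locate on its own. `wait_for_tollfree` is a hypothetical helper name. The Selenium imports sit inside the function only so that the snippet can be loaded on a machine without a browser.

```python
def wait_for_tollfree(url, timeout=10):
    """Open url and return the text of span#tollfree once it is non-empty."""
    # Imported lazily so that defining the helper does not require Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: chromedriver is found automatically
    try:
        driver.get(url)
        # until() polls the callable until it returns a truthy value;
        # NoSuchElementException is ignored by default while polling.
        return WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.ID, "tollfree").text.strip()
        )
    finally:
        # Always shut the browser down, even if the wait times out.
        driver.quit()

# Usage: print(wait_for_tollfree('https://www.paperplatemakingmachines.com/'))
```

If the span never receives text within `timeout` seconds, `until()` raises a `TimeoutException` instead of returning an empty string, so a missing number is easy to tell apart from a slow page.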
The data is actually filled in dynamically by a script. What you need to do is parse the data out of the script:
import requests
import re
from bs4 import BeautifulSoup
r = requests.get('https://www.paperplatemakingmachines.com/')
soup = BeautifulSoup(r.text,'lxml')
script= soup.find('script')
mob = re.search("(?<=pns_no = \")(.*)(?=\";)", script.text).group()
print(mob)
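Because the number sits in inline JavaScript rather than in the markup, an HTML parser is not strictly needed to reach it. The following is a minimal sketch, assuming the page assigns the value as `pns_no = "...";`, as the regex above implies. `extract_pns_no` and the sample HTML are made up for illustration, and the real page may differ.

```python
import re

# Non-greedy capture so the match stops at the first closing quote.
PNS_NO = re.compile(r'pns_no\s*=\s*"(.*?)"')

def extract_pns_no(html):
    """Return the value assigned to pns_no in the page source, or None."""
    match = PNS_NO.search(html)
    return match.group(1) if match else None

# Made-up sample standing in for r.text; the real page may differ.
SAMPLE_HTML = '''
<html><head>
<script>var tracker = "x";</script>
<script>var pns_no = "8000000000"; var other = "y";</script>
</head><body><span id="tollfree"></span></body></html>
'''

print(extract_pns_no(SAMPLE_HTML))
```

Searching the whole response text avoids depending on which `<script>` tag comes first, and returning `None` on a miss avoids the `AttributeError` that calling `.group()` on a failed search would raise.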
Another way to find the number using a regular expression
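import requests
import re
from bs4 import BeautifulSoup as bs
r = requests.get('https://www.paperplatemakingmachines.com/')
soup = bs(r.content, 'lxml')
# One pattern both selects the right <script> and captures the digits
pattern = re.compile(r'var pns_no = "(\d+)"')
# string= is the current name for the older text= argument
data = soup.find('script', string=pattern).text
number = pattern.findall(data)[0]
print('+91-' + number)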