How do I scrape hyperlink titles using BeautifulSoup?
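The website I'm trying to scrape is: https://viewyourdeal-gabrielsimone.com
The product name and price sit under each div class="info-wrapper". I can extract the price without any problem, but when I try to extract the product title, I can't convert it to text because it is an href link. Each product name is inside a link (href) under a div class. So my question is: how do I scrape the product names?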
import json
from bs4 import BeautifulSoup
import requests
import csv
from datetime import datetime

url = 'https://viewyourdeal-gabrielsimone.com'
gmaInfo=[]
response = requests.get(url, timeout=5)
content = BeautifulSoup(response.content, "html.parser")
for info in content.findAll('div', attrs={"class" : "wrapper ease-animation"}):
    gridObject = {
        "title" : info.find('div', attrs={"class" : "title animation allgrey"}),
        "price" : info.find('span', attrs={"class":"red-price"}).text
    }
    print(gridObject)
with open('index.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow([gridObject])
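For a few items the title lookup returns None. Add an if condition so that the text is read only when the element exists, as in the following code.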
from bs4 import BeautifulSoup
import requests
import csv
from datetime import datetime

url = 'https://viewyourdeal-gabrielsimone.com'
gmaInfo=[]
response = requests.get(url, timeout=5)
content = BeautifulSoup(response.content, "html.parser")
for info in content.findAll('div', attrs={"class" : "wrapper ease-animation"}):
    if info.find('div', attrs={"class": "title animation allgrey"}):
        gridObject = {
            "title" : info.find('div', attrs={"class" : "title animation allgrey"}).text.strip(),
            "price" : info.find('span', attrs={"class":"red-price"}).text
        }
        print(gridObject)
with open('index.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow([gridObject])
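The same guard can be sketched against an inline HTML string so it runs without a network request. This is only an illustration, not the answerer's code: the markup, product names, and prices are invented, the helper names `extract_rows` and `write_csv` are hypothetical, and unlike the code above it keeps items with a missing title (as None) and collects every row so the CSV is written once with `csv.DictWriter`.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the class names are
# copied from the code above, the products and prices are invented.
SAMPLE_HTML = """
<div class="wrapper ease-animation">
  <div class="title animation allgrey"><a href="/products/a">Product A</a></div>
  <span class="red-price">$10.00</span>
</div>
<div class="wrapper ease-animation">
  <span class="red-price">$20.00</span>
</div>
"""


def extract_rows(html):
    """Return one dict per product block, using None for any missing field."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for info in soup.find_all("div", attrs={"class": "wrapper ease-animation"}):
        title = info.find("div", attrs={"class": "title animation allgrey"})
        price = info.find("span", attrs={"class": "red-price"})
        rows.append({
            # get_text() also collects the text of the nested <a> link.
            "title": title.get_text(strip=True) if title else None,
            "price": price.get_text(strip=True) if price else None,
        })
    return rows


def write_csv(rows, csv_file):
    """Write a header and all rows in one pass."""
    writer = csv.DictWriter(csv_file, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(rows)


rows = extract_rows(SAMPLE_HTML)
buffer = io.StringIO()  # stands in for open('index.csv', 'w', newline='')
write_csv(rows, buffer)
print(buffer.getvalue())
```

To write a real file, the `io.StringIO()` buffer could be swapped for `open('index.csv', 'w', newline='')`.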
I was being too specific with my div class. I changed the class to just title and it works fine.
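Why the shorter class works can be shown with a small sketch. In Beautiful Soup, a multi-word class string is compared against the whole class attribute, while a single class name matches any element that has that class among its classes. The markup below is invented for illustration and is only assumed to resemble the real page, where some title divs presumably carry a different set of extra classes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second block carries a different set of extra classes.
SAMPLE_HTML = """
<div class="title animation allgrey"><a href="/products/a">Product A</a></div>
<div class="title animation"><a href="/products/b">Product B</a></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A multi-word class string is compared against the whole class attribute.
exact = soup.find_all("div", attrs={"class": "title animation allgrey"})

# A single class name matches any element that has it among its classes.
loose = soup.find_all("div", class_="title")

exact_titles = [div.get_text(strip=True) for div in exact]
loose_titles = [div.get_text(strip=True) for div in loose]

# The link text and target can also be read from the nested <a> directly.
links = [(div.a.get_text(strip=True), div.a["href"]) for div in loose]
print(exact_titles, loose_titles, links)
```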
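Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you repost this content, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.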