簡體   English   中英

如何使用 BeautifulSoup 抓取超鏈接標題?

[英]How do I scrape hyperlink titles using BeautifulSoup?

所以,我試圖從中抓取的網站是:https//viewyourdeal-gabrielsimone.com'

產品名稱和價格在每個 div class = "info-wrapper" 下我可以毫無問題地提取價格,但是當我嘗試提取產品標題時,它無法將其轉換為文本作為其 href 鏈接。 每個產品名稱都在 href 下的 div class 下。 所以我的問題是,我如何抓取產品名稱?

import json
from bs4 import BeautifulSoup
import requests 
import csv
from datetime import datetime

url = 'https://viewyourdeal-gabrielsimone.com'

gmaInfo=[]
response = requests.get(url, timeout=5)
content = BeautifulSoup(response.content, "html.parser")
for info in content.findAll('div', attrs={"class" : "wrapper ease-animation"}):
    gridObject = {
            "title" : info.find('div', attrs={"class" : "title animation allgrey"}),
            "price" : info.find('span', attrs={"class":"red-price"}).text
            }
    print(gridObject)
    with open('index.csv', 'w') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow([gridObject])

使用以下代碼,很少有項目返回為 None。如果元素存在,只需提供 If 條件即可獲取文本。

from bs4 import BeautifulSoup
import requests
import csv
from datetime import datetime

url = 'https://viewyourdeal-gabrielsimone.com'

gmaInfo=[]
response = requests.get(url, timeout=5)
content = BeautifulSoup(response.content, "html.parser")

for info in content.findAll('div', attrs={"class" : "wrapper ease-animation"}):
   if info.find('div', attrs={"class": "title animation allgrey"}):
     gridObject = {
            "title" : info.find('div', attrs={"class" : "title animation allgrey"}).text.strip(),
            "price" : info.find('span', attrs={"class":"red-price"}).text
            }
     print(gridObject)
     with open('index.csv', 'w') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow([gridObject])

我對我的 div class 過於具體,我將 class 更改為簡單的標題,它工作正常。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM