Beautiful Soup returning incomplete HTML script

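I am trying to parse the web page below so that I can iterate over all the diamonds listed on the site and save their details to a CSV file, but my code does not find all of the details present in the original HTML.

The diamonds variable ends up as an array with no elements. The code seems unable to find the catalog-view-offer-wrapper elements that appear in the original HTML.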

https://www.bluenile.com/uk/diamond-search?tag=none&track=NavDiaVAll

The code is as follows:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.bluenile.com/uk/diamond-search?tag=none&track=NavDiaVAll'

# Download the raw HTML exactly as the server sends it
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

# Look for the wrapper elements seen in the original HTML
diamonds = page_soup.find_all("div", {"class": "catalog-view-offer-wrapper"})

print(len(diamonds))  # prints 0: no matching elements are found
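
One way to narrow down a result like this is to check whether the class name appears at all in the HTML the server sent, as opposed to the DOM shown in the browser's inspector, which may include elements added later by JavaScript. The sketch below is an illustration only: `ClassCollector`, `classes_in` and the sample markup are hypothetical, not part of the original code, and it uses the standard library's `html.parser` rather than Beautiful Soup so that it runs without extra installs.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Count every CSS class name seen on any start tag."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.counts.update(value.split())


def classes_in(html):
    """Return a Counter mapping class name -> number of occurrences."""
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    return collector.counts


# Hypothetical stand-in for the decoded server response (page_html.decode()).
sample_html = """
<div class="grid">
  <a href="./diamond-details/LD1"><div class="row-cell shape">Round</div></a>
  <a href="./diamond-details/LD2"><div class="row-cell shape">Round</div></a>
</div>
"""

counts = classes_in(sample_html)
print(counts["catalog-view-offer-wrapper"])  # 0: the class is not in this markup
print(counts["row-cell"])                    # 2
```

If the class you are searching for has a count of zero in the real response, the selector has to be built from class names that are actually present.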

You can anchor the attribute selection on the grid div:

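import re
import requests
from bs4 import BeautifulSoup as soup

url = 'https://www.bluenile.com/uk/diamond-search?tag=none&track=NavDiaVAll'
d = soup(requests.get(url).text, 'html.parser')
# Use the last div with class "grid" as the anchor for the result rows
_base = d.find_all('div', {'class': 'grid'})[-1]
# One [text, last class name] pair per cell, one list of cells per diamond link
_results = [[[i.text, i.attrs['class'][-1]] for i in c.find_all('div', {'class': re.compile('^row-cell')})] for c in _base.find_all('a', {'href': re.compile(r'\./diamond-details/')})]
# Class names of the first row become the headers; cell texts become the rows
_headers, results = [h for _, h in _results[0]], [[c for c, _ in i] for i in _results]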

Output:

[['Round', '£235.20', '0.23', 'IdealIdeal', 'I', 'VS1', 'Excellent', 'Excellent', 'None', '58.8', '61.0', '1.01', '£1,023', 'Very Small', 'LD11887032', 'Mar 6'], ['Round', '£235.20', '0.23', 'IdealIdeal', 'I', 'VS1', 'Excellent', 'Excellent', 'None', '62.1', '59.0', '1.01', '£1,023', 'None', 'LD11887033', 'Mar 6'], ['Round', '£238.80', '0.23', 'Very GoodVery Good', 'I', 'VS2', 'Excellent', 'Excellent', 'Faint', '63.8', '58.0', '1.01', '£1,038', 'None', 'LD11887039', 'Mar 6'], ['Round', '£246.00', '0.24', 'Very GoodVery Good', 'I', 'VS1', 'Very Good', 'Excellent', 'None', '63.4', '59.0', '1.00', '£1,025', 'None', 'LD11887038', 'Mar 6'], ['Round', '£249.60', '0.24', 'Very GoodVery Good', 'J', 'VS1', 'Excellent', 'Excellent', 'Medium', '63.7', '58.0', '1.01', '£1,040', 'None', 'LD11887043', 'Mar 6'], ['Round', '£260.40', '0.23', 'GoodGood', 'H', 'SI1', 'Very Good', 'Good', 'Faint', '65.8', '60.0', '1.00', '£1,132', 'None', 'LD11590524', 'Mar 11'], ['Round', '£264.00', '0.23', 'GoodGood', 'I', 'VS2', 'Very Good', 'Very Good', 'None', '63.7', '61.0', '1.01', '£1,148', 'None', 'LD06936712', 'Mar 11'], ['Round', '£265.20', '0.24', 'GoodGood', 'E', 'SI2', 'Very Good', 'Very Good', 'None', '65.1', '57.0', '1.01', '£1,105', 'None', 'LD10176592', 'Mar 11'], ['Round', '£268.80', '0.23', 'Very GoodVery Good', 'J', 'VVS1', 'Excellent', 'Excellent', 'None', '63.4', '59.0', '1.01', '£1,169', 'None', 'LD11887040', 'Mar 6'], ['Round', '£268.80', '0.23', 'Very GoodVery Good', 'J', 'VVS1', 'Excellent', 'Excellent', 'None', '64.0', '59.0', '1.01', '£1,169', 'None', 'LD11887041', 'Mar 6'], ['Round', '£271.20', '0.24', 'GoodGood', 'E', 'SI2', 'Very Good', 'Very Good', 'None', '63.0', '58.0', '1.01', '£1,130', 'None', 'LD06936446', 'Mar 11'], ['Round', '£271.20', '0.23', 'GoodGood', 'H', 'SI1', 'Very Good', 'Good', 'Faint', '64.4', '58.0', '1.01', '£1,179', 'None', 'LD11590566', 'Mar 11'], ['Round', '£271.20', '0.23', 'IdealIdeal', 'G', 'SI1', 'Very Good', 'Very Good', 'None', '60.6', 
'58.0', '1.01', '£1,179', 'None', 'LD11677523', 'Mar 11'], ['Round', '£272.40', '0.23', 'GoodGood', 'D', 'SI1', 'Very Good', 'Very Good', 'Faint', '63.5', '59.0', '1.01', '£1,184', 'None', 'LD06933329', 'Mar 11'], ['Round', '£272.40', '0.24', 'GoodGood', 'H', 'SI1', 'Good', 'Very Good', 'Faint', '59.7', '57.0', '1.00', '£1,135', 'None', 'LD10319604', 'Mar 11'], ['Round', '£272.40', '0.23', 'Very GoodVery Good', 'E', 'SI2', 'Very Good', 'Good', 'None', '61.9', '58.0', '1.02', '£1,184', 'None', 'LD11812069', 'Mar 11'], ['Round', '£273.60', '0.24', 'Very GoodVery Good', 'G', 'SI1', 'Very Good', 'Good', 'None', '62.0', '60.0', '1.00', '£1,140', 'None', 'LD11590466', 'Mar 11'], ['Round', '£273.60', '0.24', 'GoodGood', 'H', 'SI1', 'Good', 'Good', 'None', '63.5', '58.0', '1.00', '£1,140', 'None', 'LD11590473', 'Mar 11'], ['Round', '£273.60', '0.23', 'Very GoodVery Good', 'J', 'VVS1', 'Very Good', 'Very Good', 'Faint', '63.9', '58.0', '1.01', '£1,190', 'None', 'LD11887042', 'Mar 6'], ['Round', '£277.20', '0.23', 'IdealIdeal', 'F', 'SI1', 'Very Good', 'Very Good', 'None', '58.6', '62.0', '1.02', '£1,205', 'None', 'LD11734906', 'Mar 11'], ['Round', '£277.20', '0.23', 'IdealIdeal', 'E', 'SI1', 'Very Good', 'Very Good', 'None', '59.8', '60.0', '1.01', '£1,205', 'None', 'LD11801566', 'Mar 11'], ['Round', '£278.40', '0.24', 'Very GoodVery Good', 'E', 'SI2', 'Very Good', 'Good', 'None', '61.1', '59.0', '1.01', '£1,160', 'None', 'LD10162176', 'Mar 11'], ['Round', '£278.40', '0.23', 'Very GoodVery Good', 'G', 'SI2', 'Very Good', 'Good', 'None', '62.4', '58.0', '1.01', '£1,210', 'None', 'LD10176571', 'Mar 11'], ['Round', '£279.60', '0.24', 'GoodGood', 'E', 'SI1', 'Very Good', 'Very Good', 'Medium', '64.3', '59.0', '1.01', '£1,165', 'None', 'LD06934326', 'Mar 11'], ['Round', '£280.80', '0.24', 'GoodGood', 'E', 'SI1', 'Very Good', 'Very Good', 'None', '64.5', '56.0', '1.01', '£1,170', 'None', 'LD06936476', 'Mar 11'], ['Round', '£280.80', '0.24', 'Very GoodVery Good', 'E', 'SI1', 'Very 
Good', 'Very Good', 'None', '62.9', '57.0', '1.01', '£1,170', 'None', 'LD10176527', 'Mar 11'], ['Round', '£280.80', '0.24', 'GoodGood', 'F', 'SI1', 'Very Good', 'Good', 'None', '63.6', '60.0', '1.01', '£1,170', 'None', 'LD11812134', 'Mar 11'], ['Round', '£280.80', '0.24', 'Very GoodVery Good', 'I', 'VVS1', 'Excellent', 'Excellent', 'None', '62.8', '61.0', '1.01', '£1,170', 'None', 'LD11887022', 'Mar 6'], ['Round', '£280.80', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Excellent', 'Excellent', 'None', '59.9', '60.0', '1.00', '£1,170', 'None', 'LD11887027', 'Mar 6'], ['Round', '£282.00', '0.23', 'Very GoodVery Good', 'D', 'SI1', 'Very Good', 'Good', 'None', '62.7', '56.0', '1.01', '£1,226', 'None', 'LD06935424', 'Mar 11'], ['Round', '£282.00', '0.23', 'GoodGood', 'G', 'SI2', 'Very Good', 'Very Good', 'Faint', '59.2', '63.0', '1.01', '£1,226', 'None', 'LD10176611', 'Mar 11'], ['Round', '£282.00', '0.30', 'IdealIdeal', 'J', 'SI2', 'Excellent', 'Excellent', 'None', '62.2', '57.0', '1.01', '£940', 'None', 'LD11566404', 'Mar 11'], ['Round', '£282.00', '0.24', 'GoodGood', 'F', 'SI1', 'Very Good', 'Good', 'None', '63.9', '58.0', '1.01', '£1,175', 'None', 'LD11590526', 'Mar 11'], ['Round', '£282.00', '0.23', 'Very GoodVery Good', 'E', 'SI2', 'Excellent', 'Excellent', 'None', '63.1', '57.0', '1.00', '£1,226', 'None', 'LD11590575', 'Mar 11'], ['Round', '£282.00', '0.23', 'Very GoodVery Good', 'H', 'SI1', 'Very Good', 'Excellent', 'None', '58.9', '63.0', '1.01', '£1,226', 'None', 'LD11682834', 'Mar 11'], ['Round', '£282.00', '0.23', 'Very GoodVery Good', 'E', 'SI2', 'Good', 'Very Good', 'None', '59.7', '62.0', '1.01', '£1,226', 'None', 'LD11812066', 'Mar 11'], ['Round', '£283.20', '0.23', 'GoodGood', 'D', 'SI1', 'Very Good', 'Good', 'Faint', '63.4', '57.0', '1.00', '£1,231', 'None', 'LD06933658', 'Mar 11'], ['Round', '£283.20', '0.24', 'Very GoodVery Good', 'G', 'SI1', 'Very Good', 'Very Good', 'Faint', '62.4', '60.0', '1.01', '£1,180', 'None', 'LD11590561', 'Mar 11'], 
['Round', '£284.40', '0.24', 'Very GoodVery Good', 'F', 'SI2', 'Good', 'Good', 'None', '62.8', '57.0', '1.01', '£1,185', 'None', 'LD08235414', 'Mar 11'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'F', 'SI2', 'Very Good', 'Very Good', 'None', '63.9', '57.0', '1.00', '£1,185', 'None', 'LD11590515', 'Mar 11'], ['Round', '£284.40', '0.23', 'GoodGood', 'E', 'SI1', 'Excellent', 'Very Good', 'None', '62.6', '57.0', '1.01', '£1,237', 'None', 'LD11590527', 'Mar 11'], ['Round', '£284.40', '0.23', 'GoodGood', 'J', 'SI1', 'Very Good', 'Good', 'Faint', '63.2', '60.0', '1.01', '£1,237', 'None', 'LD11647661', 'Mar 11'], ['Round', '£284.40', '0.23', 'IdealIdeal', 'F', 'SI1', 'Very Good', 'Excellent', 'None', '58.4', '60.0', '1.00', '£1,237', 'None', 'LD11677569', 'Mar 11'], ['Round', '£284.40', '0.23', 'IdealIdeal', 'E', 'SI1', 'Very Good', 'Very Good', 'None', '58.6', '62.0', '1.01', '£1,237', 'None', 'LD11735037', 'Mar 11'], ['Round', '£284.40', '0.23', 'IdealIdeal', 'F', 'SI1', 'Very Good', 'Very Good', 'None', '61.9', '57.0', '1.01', '£1,237', 'None', 'LD11755923', 'Mar 4'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'F', 'SI2', 'Very Good', 'Very Good', 'None', '63.4', '59.0', '1.01', '£1,185', 'None', 'LD11812132', 'Mar 11'], ['Round', '£284.40', '0.24', 'IdealIdeal', 'I', 'VVS2', 'Very Good', 'Excellent', 'Faint', '62.2', '58.0', '1.01', '£1,185', 'None', 'LD11887026', 'Mar 6'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Excellent', 'Excellent', 'Faint', '63.7', '58.0', '1.01', '£1,185', 'None', 'LD11887028', 'Mar 6'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Excellent', 'Excellent', 'Faint', '63.6', '58.0', '1.00', '£1,185', 'None', 'LD11887029', 'Mar 6'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Excellent', 'Excellent', 'Faint', '63.6', '57.0', '1.00', '£1,185', 'None', 'LD11887030', 'Mar 6'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Very Good', 'Very Good', 
'Faint', '63.8', '59.0', '1.01', '£1,185', 'None', 'LD11887031', 'Mar 6'], ['Round', '£285.60', '0.24', 'GoodGood', 'D', 'SI1', 'Very Good', 'Good', 'Faint', '64.1', '60.0', '1.01', '£1,190', 'None', 'LD06933698', 'Mar 11'], ['Round', '£285.60', '0.24', 'GoodGood', 'H', 'SI1', 'Very Good', 'Very Good', 'Faint', '59.7', '64.0', '1.02', '£1,190', 'None', 'LD08298647', 'Mar 11'], ['Round', '£285.60', '0.25', 'GoodGood', 'H', 'SI2', 'Very Good', 'Very Good', 'Faint', '65.3', '56.0', '1.01', '£1,142', 'None', 'LD10176567', 'Mar 11'], ['Round', '£285.60', '0.25', 'GoodGood', 'H', 'SI2', 'Very Good', 'Good', 'Faint', '64.1', '55.0', '1.01', '£1,142', 'None', 'LD11590553', 'Mar 11'], ['Round', '£285.60', '0.24', 'IdealIdeal', 'I', 'VVS1', 'Excellent', 'Excellent', 'Faint', '62.0', '58.0', '1.00', '£1,190', 'None', 'LD11887019', 'Mar 6'], ['Round', '£285.60', '0.24', 'IdealIdeal', 'I', 'VVS1', 'Very Good', 'Excellent', 'Faint', '62.1', '57.0', '1.01', '£1,190', 'None', 'LD11887020', 'Mar 6'], ['Round', '£285.60', '0.24', 'IdealIdeal', 'I', 'VVS1', 'Very Good', 'Very Good', 'Faint', '62.2', '58.0', '1.00', '£1,190', 'None', 'LD11887021', 'Mar 6'], ['Round', '£285.60', '0.24', 'Very GoodVery Good', 'I', 'VVS1', 'Excellent', 'Excellent', 'Faint', '63.6', '59.0', '1.01', '£1,190', 'None', 'LD11887023', 'Mar 6'], ['Round', '£286.80', '0.24', 'Very GoodVery Good', 'F', 'SI1', 'Excellent', 'Good', 'Strong', '63.2', '57.0', '1.01', '£1,195', 'None', 'LD06934089', 'Mar 11']]
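
The last line of the answer's code splits `_results` into a header row and data rows: each cell is stored as a `[text, class_name]` pair, so the class names of the first row serve as column headers and the texts form the rows. A minimal sketch of that step on made-up data follows; the class names `shape`, `price` and `carat` are assumptions for illustration, not taken from the site.

```python
# Hypothetical parsed rows: each cell is [text, last CSS class of the cell].
parsed = [
    [["Round", "shape"], ["£235.20", "price"], ["0.23", "carat"]],
    [["Round", "shape"], ["£238.80", "price"], ["0.23", "carat"]],
]

# Column headers come from the class names of the first row's cells.
headers = [class_name for _, class_name in parsed[0]]

# Data rows keep only the text of each cell.
rows = [[text for text, _ in row] for row in parsed]

print(headers)  # ['shape', 'price', 'carat']
print(rows[1])  # ['Round', '£238.80', '0.23']
```

This relies on every row listing its cells in the same order as the first row, and `parsed[0]` raises `IndexError` when no rows were found, so it may be worth checking that the list is non-empty first.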

Write the results to CSV:

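import csv
# newline='' stops the csv module's line endings from being doubled on Windows
with open('diamonds.csv', 'w', newline='', encoding='utf-8') as f:
  writer = csv.writer(f)
  writer.writerows([_headers, *results])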
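
To check what the `csv` module writes without touching the file system, the same `writerows` call can target an in-memory buffer. This is a sketch with made-up values; `headers` and `rows` are hypothetical stand-ins for `_headers` and `results`.

```python
import csv
import io

# Hypothetical stand-ins for _headers and results from the code above.
headers = ["shape", "price", "price_per_carat"]
rows = [["Round", "£235.20", "£1,023"], ["Round", "£238.80", "£1,038"]]

# newline="" keeps the writer's own line endings untouched in the buffer.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows([headers, *rows])

text = buffer.getvalue()
print(text)

# Reading the text back recovers the original values, including the
# comma inside "£1,023", which the writer quoted automatically.
read_back = list(csv.reader(io.StringIO(text, newline="")))
print(read_back == [headers, *rows])  # True
```

Values that contain a comma, such as the per-carat prices in the output above, are quoted by the writer, so they stay in a single column when the file is read back.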
