Beautiful Soup returning incomplete HTML script
I am trying to parse the webpage below so I can loop through all the diamonds listed on the site and save the details to a CSV file, but my code is not finding all the details in the raw HTML.
The diamonds variable comes back as an empty list. The code can't seem to find the catalog-view-offer-wrapper class that appears in the raw HTML.
https://www.bluenile.com/uk/diamond-search?tag=none&track=NavDiaVAll
Code below:
import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://www.bluenile.com/uk/diamond-search?tag=none&track=NavDiaVAll'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html,"html.parser")
diamonds = page_soup.findAll("div",{"class":"catalog-view-offer-wrapper"})
print(len(diamonds))
You can anchor the attribute selection on the grid div:
import requests, re
from bs4 import BeautifulSoup as soup
d = soup(requests.get('https://www.bluenile.com/uk/diamond-search?tag=none&track=NavDiaVAll').text, 'html.parser')
_base = d.find_all('div', {'class':'grid'})[-1]
_results = [[[i.text, i.attrs['class'][-1]] for i in c.find_all('div', {'class':re.compile(r'^row-cell')})] for c in _base.find_all('a', {'href':re.compile(r'\./diamond-details/')})]
_headers, results = [d for _, d in _results[0]], [[c for c, _ in i] for i in _results]
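The two comprehensions above are dense, so here is the same idea unrolled into a loop and run against a small inline document instead of the live site. This is only a sketch: the markup in SAMPLE_HTML, the class names shape, price and carat, and the function name parse_rows are made up for illustration and are not taken from the real page. It also uses get_text(strip=True) rather than .text, which trims surrounding whitespace.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, invented to mimic the structure the answer relies on:
# a "grid" div holding one <a> per listing, each with "row-cell" divs inside.
SAMPLE_HTML = """
<div class="grid">
  <a href="./diamond-details/LD001">
    <div class="row-cell shape">Round</div>
    <div class="row-cell price">£235.20</div>
    <div class="row-cell carat">0.23</div>
  </a>
  <a href="./diamond-details/LD002">
    <div class="row-cell shape">Round</div>
    <div class="row-cell price">£238.80</div>
    <div class="row-cell carat">0.24</div>
  </a>
  <a href="./about">Not a listing</a>
</div>
"""


def parse_rows(html):
    """Return (headers, rows) from the last div.grid in the document."""
    page = BeautifulSoup(html, "html.parser")
    # Anchor on the last grid div, as in the answer.
    grid = page.find_all("div", {"class": "grid"})[-1]
    # Keep only links that point at a diamond details page.
    links = grid.find_all("a", {"href": re.compile(r"\./diamond-details/")})
    headers, rows = [], []
    for link in links:
        cells = link.find_all("div", {"class": re.compile(r"^row-cell")})
        if not headers:
            # The last class on each cell doubles as the column name.
            headers = [cell["class"][-1] for cell in cells]
        rows.append([cell.get_text(strip=True) for cell in cells])
    return headers, rows


headers, rows = parse_rows(SAMPLE_HTML)
print(headers)
print(rows)
```

Running the selectors against a saved copy of the page in the same way makes it easier to see which step returns nothing when the count is zero.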
Output:
[['Round', '£235.20', '0.23', 'IdealIdeal', 'I', 'VS1', 'Excellent', 'Excellent', 'None', '58.8', '61.0', '1.01', '£1,023', 'Very Small', 'LD11887032', 'Mar 6'], ['Round', '£235.20', '0.23', 'IdealIdeal', 'I', 'VS1', 'Excellent', 'Excellent', 'None', '62.1', '59.0', '1.01', '£1,023', 'None', 'LD11887033', 'Mar 6'], ['Round', '£238.80', '0.23', 'Very GoodVery Good', 'I', 'VS2', 'Excellent', 'Excellent', 'Faint', '63.8', '58.0', '1.01', '£1,038', 'None', 'LD11887039', 'Mar 6'], ['Round', '£246.00', '0.24', 'Very GoodVery Good', 'I', 'VS1', 'Very Good', 'Excellent', 'None', '63.4', '59.0', '1.00', '£1,025', 'None', 'LD11887038', 'Mar 6'], ['Round', '£249.60', '0.24', 'Very GoodVery Good', 'J', 'VS1', 'Excellent', 'Excellent', 'Medium', '63.7', '58.0', '1.01', '£1,040', 'None', 'LD11887043', 'Mar 6'], ['Round', '£260.40', '0.23', 'GoodGood', 'H', 'SI1', 'Very Good', 'Good', 'Faint', '65.8', '60.0', '1.00', '£1,132', 'None', 'LD11590524', 'Mar 11'], ['Round', '£264.00', '0.23', 'GoodGood', 'I', 'VS2', 'Very Good', 'Very Good', 'None', '63.7', '61.0', '1.01', '£1,148', 'None', 'LD06936712', 'Mar 11'], ['Round', '£265.20', '0.24', 'GoodGood', 'E', 'SI2', 'Very Good', 'Very Good', 'None', '65.1', '57.0', '1.01', '£1,105', 'None', 'LD10176592', 'Mar 11'], ['Round', '£268.80', '0.23', 'Very GoodVery Good', 'J', 'VVS1', 'Excellent', 'Excellent', 'None', '63.4', '59.0', '1.01', '£1,169', 'None', 'LD11887040', 'Mar 6'], ['Round', '£268.80', '0.23', 'Very GoodVery Good', 'J', 'VVS1', 'Excellent', 'Excellent', 'None', '64.0', '59.0', '1.01', '£1,169', 'None', 'LD11887041', 'Mar 6'], ['Round', '£271.20', '0.24', 'GoodGood', 'E', 'SI2', 'Very Good', 'Very Good', 'None', '63.0', '58.0', '1.01', '£1,130', 'None', 'LD06936446', 'Mar 11'], ['Round', '£271.20', '0.23', 'GoodGood', 'H', 'SI1', 'Very Good', 'Good', 'Faint', '64.4', '58.0', '1.01', '£1,179', 'None', 'LD11590566', 'Mar 11'], ['Round', '£271.20', '0.23', 'IdealIdeal', 'G', 'SI1', 'Very Good', 'Very Good', 'None', '60.6', 
'58.0', '1.01', '£1,179', 'None', 'LD11677523', 'Mar 11'], ['Round', '£272.40', '0.23', 'GoodGood', 'D', 'SI1', 'Very Good', 'Very Good', 'Faint', '63.5', '59.0', '1.01', '£1,184', 'None', 'LD06933329', 'Mar 11'], ['Round', '£272.40', '0.24', 'GoodGood', 'H', 'SI1', 'Good', 'Very Good', 'Faint', '59.7', '57.0', '1.00', '£1,135', 'None', 'LD10319604', 'Mar 11'], ['Round', '£272.40', '0.23', 'Very GoodVery Good', 'E', 'SI2', 'Very Good', 'Good', 'None', '61.9', '58.0', '1.02', '£1,184', 'None', 'LD11812069', 'Mar 11'], ['Round', '£273.60', '0.24', 'Very GoodVery Good', 'G', 'SI1', 'Very Good', 'Good', 'None', '62.0', '60.0', '1.00', '£1,140', 'None', 'LD11590466', 'Mar 11'], ['Round', '£273.60', '0.24', 'GoodGood', 'H', 'SI1', 'Good', 'Good', 'None', '63.5', '58.0', '1.00', '£1,140', 'None', 'LD11590473', 'Mar 11'], ['Round', '£273.60', '0.23', 'Very GoodVery Good', 'J', 'VVS1', 'Very Good', 'Very Good', 'Faint', '63.9', '58.0', '1.01', '£1,190', 'None', 'LD11887042', 'Mar 6'], ['Round', '£277.20', '0.23', 'IdealIdeal', 'F', 'SI1', 'Very Good', 'Very Good', 'None', '58.6', '62.0', '1.02', '£1,205', 'None', 'LD11734906', 'Mar 11'], ['Round', '£277.20', '0.23', 'IdealIdeal', 'E', 'SI1', 'Very Good', 'Very Good', 'None', '59.8', '60.0', '1.01', '£1,205', 'None', 'LD11801566', 'Mar 11'], ['Round', '£278.40', '0.24', 'Very GoodVery Good', 'E', 'SI2', 'Very Good', 'Good', 'None', '61.1', '59.0', '1.01', '£1,160', 'None', 'LD10162176', 'Mar 11'], ['Round', '£278.40', '0.23', 'Very GoodVery Good', 'G', 'SI2', 'Very Good', 'Good', 'None', '62.4', '58.0', '1.01', '£1,210', 'None', 'LD10176571', 'Mar 11'], ['Round', '£279.60', '0.24', 'GoodGood', 'E', 'SI1', 'Very Good', 'Very Good', 'Medium', '64.3', '59.0', '1.01', '£1,165', 'None', 'LD06934326', 'Mar 11'], ['Round', '£280.80', '0.24', 'GoodGood', 'E', 'SI1', 'Very Good', 'Very Good', 'None', '64.5', '56.0', '1.01', '£1,170', 'None', 'LD06936476', 'Mar 11'], ['Round', '£280.80', '0.24', 'Very GoodVery Good', 'E', 'SI1', 'Very 
Good', 'Very Good', 'None', '62.9', '57.0', '1.01', '£1,170', 'None', 'LD10176527', 'Mar 11'], ['Round', '£280.80', '0.24', 'GoodGood', 'F', 'SI1', 'Very Good', 'Good', 'None', '63.6', '60.0', '1.01', '£1,170', 'None', 'LD11812134', 'Mar 11'], ['Round', '£280.80', '0.24', 'Very GoodVery Good', 'I', 'VVS1', 'Excellent', 'Excellent', 'None', '62.8', '61.0', '1.01', '£1,170', 'None', 'LD11887022', 'Mar 6'], ['Round', '£280.80', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Excellent', 'Excellent', 'None', '59.9', '60.0', '1.00', '£1,170', 'None', 'LD11887027', 'Mar 6'], ['Round', '£282.00', '0.23', 'Very GoodVery Good', 'D', 'SI1', 'Very Good', 'Good', 'None', '62.7', '56.0', '1.01', '£1,226', 'None', 'LD06935424', 'Mar 11'], ['Round', '£282.00', '0.23', 'GoodGood', 'G', 'SI2', 'Very Good', 'Very Good', 'Faint', '59.2', '63.0', '1.01', '£1,226', 'None', 'LD10176611', 'Mar 11'], ['Round', '£282.00', '0.30', 'IdealIdeal', 'J', 'SI2', 'Excellent', 'Excellent', 'None', '62.2', '57.0', '1.01', '£940', 'None', 'LD11566404', 'Mar 11'], ['Round', '£282.00', '0.24', 'GoodGood', 'F', 'SI1', 'Very Good', 'Good', 'None', '63.9', '58.0', '1.01', '£1,175', 'None', 'LD11590526', 'Mar 11'], ['Round', '£282.00', '0.23', 'Very GoodVery Good', 'E', 'SI2', 'Excellent', 'Excellent', 'None', '63.1', '57.0', '1.00', '£1,226', 'None', 'LD11590575', 'Mar 11'], ['Round', '£282.00', '0.23', 'Very GoodVery Good', 'H', 'SI1', 'Very Good', 'Excellent', 'None', '58.9', '63.0', '1.01', '£1,226', 'None', 'LD11682834', 'Mar 11'], ['Round', '£282.00', '0.23', 'Very GoodVery Good', 'E', 'SI2', 'Good', 'Very Good', 'None', '59.7', '62.0', '1.01', '£1,226', 'None', 'LD11812066', 'Mar 11'], ['Round', '£283.20', '0.23', 'GoodGood', 'D', 'SI1', 'Very Good', 'Good', 'Faint', '63.4', '57.0', '1.00', '£1,231', 'None', 'LD06933658', 'Mar 11'], ['Round', '£283.20', '0.24', 'Very GoodVery Good', 'G', 'SI1', 'Very Good', 'Very Good', 'Faint', '62.4', '60.0', '1.01', '£1,180', 'None', 'LD11590561', 'Mar 11'], 
['Round', '£284.40', '0.24', 'Very GoodVery Good', 'F', 'SI2', 'Good', 'Good', 'None', '62.8', '57.0', '1.01', '£1,185', 'None', 'LD08235414', 'Mar 11'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'F', 'SI2', 'Very Good', 'Very Good', 'None', '63.9', '57.0', '1.00', '£1,185', 'None', 'LD11590515', 'Mar 11'], ['Round', '£284.40', '0.23', 'GoodGood', 'E', 'SI1', 'Excellent', 'Very Good', 'None', '62.6', '57.0', '1.01', '£1,237', 'None', 'LD11590527', 'Mar 11'], ['Round', '£284.40', '0.23', 'GoodGood', 'J', 'SI1', 'Very Good', 'Good', 'Faint', '63.2', '60.0', '1.01', '£1,237', 'None', 'LD11647661', 'Mar 11'], ['Round', '£284.40', '0.23', 'IdealIdeal', 'F', 'SI1', 'Very Good', 'Excellent', 'None', '58.4', '60.0', '1.00', '£1,237', 'None', 'LD11677569', 'Mar 11'], ['Round', '£284.40', '0.23', 'IdealIdeal', 'E', 'SI1', 'Very Good', 'Very Good', 'None', '58.6', '62.0', '1.01', '£1,237', 'None', 'LD11735037', 'Mar 11'], ['Round', '£284.40', '0.23', 'IdealIdeal', 'F', 'SI1', 'Very Good', 'Very Good', 'None', '61.9', '57.0', '1.01', '£1,237', 'None', 'LD11755923', 'Mar 4'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'F', 'SI2', 'Very Good', 'Very Good', 'None', '63.4', '59.0', '1.01', '£1,185', 'None', 'LD11812132', 'Mar 11'], ['Round', '£284.40', '0.24', 'IdealIdeal', 'I', 'VVS2', 'Very Good', 'Excellent', 'Faint', '62.2', '58.0', '1.01', '£1,185', 'None', 'LD11887026', 'Mar 6'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Excellent', 'Excellent', 'Faint', '63.7', '58.0', '1.01', '£1,185', 'None', 'LD11887028', 'Mar 6'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Excellent', 'Excellent', 'Faint', '63.6', '58.0', '1.00', '£1,185', 'None', 'LD11887029', 'Mar 6'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Excellent', 'Excellent', 'Faint', '63.6', '57.0', '1.00', '£1,185', 'None', 'LD11887030', 'Mar 6'], ['Round', '£284.40', '0.24', 'Very GoodVery Good', 'I', 'VVS2', 'Very Good', 'Very Good', 
'Faint', '63.8', '59.0', '1.01', '£1,185', 'None', 'LD11887031', 'Mar 6'], ['Round', '£285.60', '0.24', 'GoodGood', 'D', 'SI1', 'Very Good', 'Good', 'Faint', '64.1', '60.0', '1.01', '£1,190', 'None', 'LD06933698', 'Mar 11'], ['Round', '£285.60', '0.24', 'GoodGood', 'H', 'SI1', 'Very Good', 'Very Good', 'Faint', '59.7', '64.0', '1.02', '£1,190', 'None', 'LD08298647', 'Mar 11'], ['Round', '£285.60', '0.25', 'GoodGood', 'H', 'SI2', 'Very Good', 'Very Good', 'Faint', '65.3', '56.0', '1.01', '£1,142', 'None', 'LD10176567', 'Mar 11'], ['Round', '£285.60', '0.25', 'GoodGood', 'H', 'SI2', 'Very Good', 'Good', 'Faint', '64.1', '55.0', '1.01', '£1,142', 'None', 'LD11590553', 'Mar 11'], ['Round', '£285.60', '0.24', 'IdealIdeal', 'I', 'VVS1', 'Excellent', 'Excellent', 'Faint', '62.0', '58.0', '1.00', '£1,190', 'None', 'LD11887019', 'Mar 6'], ['Round', '£285.60', '0.24', 'IdealIdeal', 'I', 'VVS1', 'Very Good', 'Excellent', 'Faint', '62.1', '57.0', '1.01', '£1,190', 'None', 'LD11887020', 'Mar 6'], ['Round', '£285.60', '0.24', 'IdealIdeal', 'I', 'VVS1', 'Very Good', 'Very Good', 'Faint', '62.2', '58.0', '1.00', '£1,190', 'None', 'LD11887021', 'Mar 6'], ['Round', '£285.60', '0.24', 'Very GoodVery Good', 'I', 'VVS1', 'Excellent', 'Excellent', 'Faint', '63.6', '59.0', '1.01', '£1,190', 'None', 'LD11887023', 'Mar 6'], ['Round', '£286.80', '0.24', 'Very GoodVery Good', 'F', 'SI1', 'Excellent', 'Good', 'Strong', '63.2', '57.0', '1.01', '£1,195', 'None', 'LD06934089', 'Mar 11']]
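Note that the fourth value in each row of the output comes through doubled ('IdealIdeal', 'Very GoodVery Good', 'GoodGood'). The cause is not visible from the output alone. A minimal sketch for cleaning it up after scraping, assuming the doubled field always sits at index 3; the names collapse_doubled, clean_row and DOUBLED_COLUMN are hypothetical helpers, not part of the answer:

```python
DOUBLED_COLUMN = 3  # assumption: position of the doubled field in each row


def collapse_doubled(text):
    """Return the first half of text if text is one string repeated twice."""
    half, remainder = divmod(len(text), 2)
    if text and remainder == 0 and text[:half] == text[half:]:
        return text[:half]
    return text


def clean_row(row, column=DOUBLED_COLUMN):
    """Return a copy of row with the doubled column collapsed."""
    cleaned = list(row)
    cleaned[column] = collapse_doubled(cleaned[column])
    return cleaned


sample = ['Round', '£238.80', '0.23', 'Very GoodVery Good', 'I', 'VS2']
print(clean_row(sample))
```

The helper is applied to one column only because a legitimate value that happens to repeat itself (for example '1111') would otherwise be halved as well.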
To write the results to csv:
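import csv
# newline='' stops the csv module's own line endings from being translated again
with open('diamonds.csv', 'w', newline='', encoding='utf-8') as f:
    write = csv.writer(f)
    # header row first, then one row per diamond
    write.writerows([_headers, *results])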
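To check the shape of the file without touching the disk or the live site, the same writerows call can be exercised against an in-memory buffer and read back. This is a sketch only: the header names and the two rows below are placeholders, not the site's real columns.

```python
import csv
import io

headers = ["shape", "price", "carat"]  # hypothetical header names
results = [["Round", "£1,023", "0.23"], ["Round", "£1,038", "0.24"]]

# newline="" keeps the csv module's "\r\n" row terminators untouched.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows([headers, *results])

# Rewind and parse the text again to confirm it round-trips.
buffer.seek(0)
read_back = list(csv.reader(buffer))
print(read_back)
```

Prices such as £1,023 contain a comma, so csv.writer quotes them on output and csv.reader restores them as a single field.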