
How do I web-scrape nested lists in Python?

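Website link: https://www.zivame.com/rosaline-chromaticity-knit-cotton-top-florida-key.html?trksrc=category&trkid=search&trkorder=relevance

What I want to scrape: "Short Sleeve Styling", "Relaxed fit for comfort" (basically the bullet points under the description).

This is the code I am using so far: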

from selenium import webdriver
import re
from bs4 import BeautifulSoup
import requests

result = requests.get("https://www.zivame.com/rosaline-chromaticity-knit-cotton-top-florida-key.html?trksrc=category&trkid=search&trkorder=relevance")

soup = BeautifulSoup(result.text, 'lxml')
page = soup.find('div', id="product-page")
description = page.find('div', id="product-basicdetail")
point1 = description.find('div', id="ff-rm text-size pd-b5")
print(point1)

The data arrives as JSON-like data embedded in the page, so you can scrape it directly from the page source.

import requests
from lxml import html

r = requests.get('https://www.zivame.com/rosaline-chromaticity-knit-cotton-top-florida-key.html?trksrc=category&trkid=search&trkorder=relevance')
source_page = html.fromstring(r.text)
# Grab the text of the inline <script> that assigns window.__product
json_value = source_page.xpath("//script[contains(.,'window.__product=')]/text()")[0]
# Keep only the text between the start of the feature list and the marker that follows it
json_value = json_value.split("{features:{values:[{list:[")[1].split("]}],count:1}}},modelMetaData:")[0]
# Split the remaining text into the individual bullet points
print(json_value.split(','))
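
One weakness of splitting on ',' is that a bullet point containing a comma would be cut in two. The sketch below reuses the two marker strings from the answer but pulls out each quoted item with a regular expression. It assumes the list items are double-quoted string literals, which the answer does not confirm. The helper name `extract_features` and the `sample` string are hypothetical stand-ins, not the real page content.

```python
import re

# Marker strings taken from the answer above
START = "{features:{values:[{list:["
END = "]}],count:1}}},modelMetaData:"


def extract_features(script_text):
    """Return the quoted items found between START and END, or [] if a marker is missing."""
    _, found_start, rest = script_text.partition(START)
    if not found_start:
        return []
    body, found_end, _ = rest.partition(END)
    if not found_end:
        return []
    # Match double-quoted strings, allowing backslash-escaped characters inside them
    return re.findall(r'"((?:[^"\\]|\\.)*)"', body)


# Hypothetical script text shaped like the markers suggest (not copied from the site)
sample = (
    'window.__product={details:{features:{values:[{list:['
    '"Short Sleeve Styling","Relaxed fit, for comfort"'
    ']}],count:1}}},modelMetaData:{}}'
)

print(extract_features(sample))
```

With the real page, `script_text` would be the `json_value` string returned by the XPath query. Returning an empty list when a marker is missing avoids the `IndexError` that chained `split(...)[1]` calls raise if the site changes its markup.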

