The goal is to get a clothing item's rating (stars) using Beautiful Soup.
For more detail, this is part of my Python code, which used to work:
import requests
from bs4 import BeautifulSoup

url = "https://www.wildberries.ru/catalog/18645227/detail.aspx?targetUrl=IN"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
print(soup)
rating = soup.find('span', {'data-link': 'text{: product^star}'})
In the Google Chrome inspector I can see this HTML:
<span data-link="text{: product^star}">5</span>
But if I look at the page via print (or via view-source in Chrome):
print(soup)
there is nothing like this:
<span data-link="text{: product^star}">5</span>
In the part of the HTML (via print(soup)) where the body content should be, I can only see something that looks like React markup:
<div id="mainContainer" class="main__container">
<div id="app">
</div>
<button class="btn-quick-nav j-quicknav" type="button">to the
top</button>
</div>
and a huge amount of JavaScript in the footer, so I can't extract that span.
A concrete URL, for example:
https://www.wildberries.ru/catalog/18645227/detail.aspx?targetUrl=IN
The concrete element to parse:
<span data-link="text{: product^star}">4</span>
Is this some new technology that camouflages the code to protect it from parsing? Is there any way to get the "old-school" HTML?
The short answer is you can't parse and/or get that data with bs4.
As you've noticed, all of the product's data is generated dynamically, which means you need a way to run JavaScript, which bs4 doesn't do.
If you want to get the old-school HTML, use a browser automation tool like selenium with, for example, ChromeDriver.
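A minimal sketch of that selenium route, assuming Selenium 4, a local Chrome installation, and that the rating span from the question still exists in the rendered page (the site's markup may have changed since). The names fetch_rating, RATING_DATA_LINK and RATING_SELECTOR are hypothetical helpers introduced for illustration:

```python
# Value of the data-link attribute on the rating span quoted in the question.
RATING_DATA_LINK = "text{: product^star}"
# Equivalent CSS attribute selector, used to wait for the element in the browser.
RATING_SELECTOR = f'span[data-link="{RATING_DATA_LINK}"]'


def fetch_rating(url, timeout=15):
    """Render the page in headless Chrome and return the rating text, or None."""
    # Third-party imports are kept inside the function so this module can be
    # loaded even where selenium and bs4 are not installed.
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumption: a recent Chrome build
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the page's JavaScript has inserted the rating element.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, RATING_SELECTOR))
        )
        html = driver.page_source  # the rendered DOM, not the raw HTTP response
    finally:
        driver.quit()

    # From here on, parsing works the same way as in the question.
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", {"data-link": RATING_DATA_LINK})
    return span.get_text(strip=True) if span else None
```

Calling fetch_rating with the product page URL from the question should return the rating as a string if the element appears within the timeout. If it never appears, WebDriverWait raises a TimeoutException, which this sketch deliberately does not catch.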
However, you can get the data without selenium, if you know the product's ID.
Here's an example (the product ID is the last query parameter in the URL, nm=51728993):
import requests
url = "https://wbxcatalog-ru.wildberries.ru/nm-2-card/catalog?spp=0&pricemarginCoeff=1.0&reg=0&appType=1&emp=0&locale=ru&lang=ru&curr=rub&nm=51728993"
headers = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:100.0) Gecko/20100101 Firefox/100.0"
}
data = requests.get(url, headers=headers).json()["data"]["products"][0]
print(f"{data['name']}\n{data['rating']} stars from {data['feedbacks']} reviews.")
Outputs:
Смартфон Poco M4 Pro / 6.6'' / 1080x2400 / IPS / 8 ГБ / 128 ГБ / 5000 мА*ч
5 stars from 414 reviews.
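To apply the same request to the product page from the question, the ID has to be taken from the page URL first. The sketch below assumes that the number in the /catalog/<id>/detail.aspx path is the same ID the endpoint expects as nm, and that the endpoint and its query parameters are still as shown in the answer; both are assumptions, and extract_product_id and build_api_url are hypothetical helper names:

```python
import re
from urllib.parse import urlencode, urlparse

# Endpoint taken from the answer's example; it may have changed since.
API_BASE = "https://wbxcatalog-ru.wildberries.ru/nm-2-card/catalog"


def extract_product_id(page_url):
    """Return the numeric ID from a /catalog/<id>/... URL, or None if absent."""
    match = re.search(r"/catalog/(\d+)/", urlparse(page_url).path)
    return match.group(1) if match else None


def build_api_url(product_id):
    """Build the JSON endpoint URL for one product ID (passed as nm)."""
    params = {
        "spp": 0, "pricemarginCoeff": "1.0", "reg": 0, "appType": 1, "emp": 0,
        "locale": "ru", "lang": "ru", "curr": "rub", "nm": product_id,
    }
    # urlencode keeps the dict's insertion order, so nm stays the last parameter.
    return f"{API_BASE}?{urlencode(params)}"


page = "https://www.wildberries.ru/catalog/18645227/detail.aspx?targetUrl=IN"
print(build_api_url(extract_product_id(page)))
```

The resulting URL can be passed to requests.get with the same headers as in the example above.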