I have been trying to scrape product names and prices from this website ( https://dentalspeed.com/?fbclid=IwAR1_gjjWAevu1pgikjwLUqeFXzjBRo7A93uXFSIAasxlvl97ptEorNP1fDo ), but I can't get the CSS selectors right. I have tried SelectorGadget, and I know HTML and CSS well enough to have read the page source myself. I think the selectors are correct, but for some reason no data is extracted.
def parse(self, response):
    items = DenItem()
    all_div = response.css('div.collection-product')
    for div in all_div:
        product_name = div.css(".collection-product-name font font::text").extract()
        _new_price = div.css('div.collection-product-price > a > font > font::text').extract()  # .replace("Rs", "")
        _new_price = [s.replace("$", "") for s in _new_price]
        _new_price = [s.replace(",", "") for s in _new_price]
        _old_price = div.css("main#setembro section:nth-child(5) > div > div > div > div > ul > div.owl-wrapper-outer > div > div:nth-child(3) > li > div > div.collection-product-price-content > p.collection-product-price > del > font > font::text").extract()  # .replace("Rs", "")
        _old_price = [n.replace("R $", "") for n in _old_price]
        _old_price = [n.replace(",", "") for n in _old_price]
        items['product_name'] = product_name
        items['_new_price'] = _new_price
        items['_old_price'] = _old_price
        if len(items['_new_price']) == 0:
            items['_new_price'] = '0'
        if len(items['_old_price']) == 0:
            items['_old_price'] = '0'
        yield items
The content is returned dynamically from another URL, so it is not in the HTML that your selectors run against. You can find the request in the browser's Network tab when you refresh the page with F5.
import requests
r = requests.get('https://dentalspeed.com/vitrines/app-vitrine__home--estetica').json()
print(r)
Depending on the full list of products you want, you may need to track down other URLs in the same way.
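Once you have the JSON, you can pull the fields out of it directly instead of using CSS selectors. The following is only a rough sketch: I have not checked the structure of that response, so the key names ("name", "price", "old_price") and the sample payload are made-up placeholders that you would replace after inspecting the printed output. The helper walks the parsed JSON and yields every object that has both keys, wherever it is nested.

```python
import json


def iter_products(node, name_key, price_key):
    """Recursively yield every dict that contains both name_key and price_key."""
    if isinstance(node, dict):
        if name_key in node and price_key in node:
            yield node
        for value in node.values():
            yield from iter_products(value, name_key, price_key)
    elif isinstance(node, list):
        for value in node:
            yield from iter_products(value, name_key, price_key)


# Hypothetical payload: the real response will have different keys and nesting.
sample = json.loads("""
{"vitrine": {"items": [
    {"name": "Product A", "price": "10,00", "old_price": "12,00"},
    {"name": "Product B", "price": "5,50"}
]}}
""")

# Map each match onto the item fields used in the question,
# defaulting the old price to '0' when it is missing.
products = [
    {
        "product_name": p["name"],
        "_new_price": p["price"],
        "_old_price": p.get("old_price", "0"),
    }
    for p in iter_products(sample, "name", "price")
]

for product in products:
    print(product)
```

With the real data you would pass in the result of requests.get(...).json() instead of sample. Inside a Scrapy callback, response.json() (available in recent Scrapy versions) or json.loads(response.text) gives you the same parsed structure.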