Scraping from a specific website has stopped working

Question

A couple of weeks ago I wrote this program, which successfully scraped some product info from an online store, but now it has stopped working even though I haven't changed the code.

Could something have changed on the website itself, or is there something wrong with my code?

import requests
from bs4 import BeautifulSoup

url = 'https://www.continente.pt/stores/continente/pt-pt/public/Pages/ProductDetail.aspx?ProductId=7104665(eCsf_RetekProductCatalog_MegastoreContinenteOnline_Continente)'

res = requests.get(url)
html_page = res.content
soup = BeautifulSoup(html_page, 'html.parser')

priceInfo = soup.find('div', class_='pricePerUnit').text

priceInfo = priceInfo.replace('\n', '').replace('\r', '').replace(' ', '')

productName = soup.find('div', class_='productTitle').text.replace('\n', ' ')

productInfo = (soup.find('div', class_='productSubtitle').text
               + ', ' + soup.find('div', class_='productSubsubtitle').text)

print('Nome do produto: ' + productName)
print('Detalhes: ' + productInfo)
print('Custo: ' + priceInfo)

I know for a fact that the elements I'm searching for exist and that the URL is still valid, so what could be the issue? I split the priceInfo assignment into two lines because the error occurs in the first one: soup.find returns None, which has no text attribute.

Answer 1

The solution takes a few steps.

1. Open the page you want to scrape in Firefox once.
2. Use the browser_cookie3 library to extract the cookies.
3. Ensure they are not expired.
4. Pass the cookies to the request: requests.get(url, cookies=browser_cookie3.firefox())
5. Send the headers shown below as well.
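
The expiry check from the steps above can be done in code rather than by eye. This is only a minimal sketch, assuming the jar returned by browser_cookie3.firefox() is a standard http.cookiejar.CookieJar. fresh_cookies and make_cookie are hypothetical helper names, not part of any library, and make_cookie is there only to build a cookie by hand for experimenting.

```python
import time
from http.cookiejar import Cookie, CookieJar


def fresh_cookies(jar, domain, now=None):
    """Return a new CookieJar holding only the unexpired cookies for `domain`."""
    now = time.time() if now is None else now
    kept = CookieJar()
    for cookie in jar:
        # Simple suffix match, so '.continente.pt' and 'www.continente.pt' both qualify.
        # Session cookies (expires is None) are never reported as expired.
        if cookie.domain.endswith(domain) and not cookie.is_expired(now):
            kept.set_cookie(cookie)
    return kept


def make_cookie(name, value, domain, expires):
    """Build a Cookie by hand (hypothetical helper, for experimenting only)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith('.'),
        path='/', path_specified=True,
        secure=False, expires=expires, discard=False,
        comment=None, comment_url=None, rest={},
    )
```

With real Firefox cookies this would be called as fresh_cookies(browser_cookie3.firefox(), 'continente.pt'). An empty result suggests the page needs to be opened in Firefox again before scraping.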

I tried this myself and it works. Happy scraping!

headers = {
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-User': '?1',
    'Sec-Fetch-Dest': 'document',
    'Accept-Language': 'en-US,en;q=0.9,de;q=0.8',
}
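
Putting it together, the request could look like the sketch below. It assumes the headers dict above and the class names from the question (productTitle, productSubtitle, pricePerUnit). text_of, parse_product and fetch_product are hypothetical helper names. Instead of calling .text on the result of soup.find directly, it returns None for a missing element, which makes it easier to tell "the site served a different page" from a bug in the parsing.

```python
import requests
from bs4 import BeautifulSoup


def text_of(soup, css_class):
    """Return the stripped text of the first <div> with this class, or None if absent."""
    tag = soup.find('div', class_=css_class)
    return tag.get_text(strip=True) if tag is not None else None


def parse_product(html):
    """Extract the fields the question looks for; missing ones come back as None."""
    soup = BeautifulSoup(html, 'html.parser')
    return {
        'name': text_of(soup, 'productTitle'),
        'subtitle': text_of(soup, 'productSubtitle'),
        'price': text_of(soup, 'pricePerUnit'),
    }


def fetch_product(url, headers, cookies=None, timeout=10):
    """Request the page with browser-like headers and cookies, then parse it."""
    res = requests.get(url, headers=headers, cookies=cookies, timeout=timeout)
    res.raise_for_status()  # an HTTP error status means the request itself was rejected
    return parse_product(res.content)


# Example call (needs network access and a Firefox profile that has visited the site):
#   import browser_cookie3
#   print(fetch_product(url, headers, browser_cookie3.firefox()))
```

If every value comes back as None even though the status code is 200, the server most likely returned a page other than the product page, and it is worth inspecting res.text to see what was actually served.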

1 answer

solution1
0 ACCEPTED 2020-09-27 17:42:29
