
What is causing the error in my web scraping code?

I am getting a "list index out of range" error on the last line. The containers variable is also empty: printing its length gives 0 instead of 12. It should contain the details of every product, but nothing is being fetched.

  from urllib.request import urlopen as uReq
  from bs4 import BeautifulSoup as soup

  my_url = 'https://www.newegg.com/global/in-en/p/pl?d=graphics+card'
  uClient = uReq(my_url)   # open the connection and request the page

  page_html = uClient.read()   # read the whole response body into page_html
  uClient.close()              # close the connection
  page_soup = soup(page_html, "html.parser")   # parse the HTML
  #print(page_soup.h1)   # prints the first <h1> element

  #print(page_soup.p)

  containers = page_soup.findAll("div", {"class": {"item-container"}})   # grab each product container
  print(len(containers))   # prints 0, expected 12
  containers[0]            # raises IndexError: list index out of range

Try fetching the page with requests instead of urllib, using the URL without the /global/in-en prefix:

  import requests
  from bs4 import BeautifulSoup as soup

  url = 'https://www.newegg.com/p/pl?d=graphics+card'
  r = requests.get(url)
  soup_page = soup(r.content, 'html.parser')
  containers = soup_page.find_all('div', {'class': 'item-container'})
  print(len(containers))
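
Whichever client is used, indexing containers[0] will still raise IndexError whenever the page comes back without any matching div. The following is only a rough sketch of a more defensive version, not a confirmed fix for this site: the helper names (fetch_html, find_containers, first_container), the User-Agent value, and the 10-second timeout are all assumptions made for illustration. It checks the HTTP status before parsing and returns None instead of indexing into an empty list.

```python
import requests
from bs4 import BeautifulSoup

# Assumed header value for illustration; the site may or may not require it.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_html(url):
    """Download a page and fail loudly on an HTTP error status."""
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return response.text


def find_containers(html):
    """Return every <div> whose class list includes "item-container"."""
    page = BeautifulSoup(html, "html.parser")
    return page.find_all("div", class_="item-container")


def first_container(html):
    """Return the first product container, or None if nothing matched."""
    containers = find_containers(html)
    return containers[0] if containers else None
```

With these helpers, first_container(fetch_html(url)) gives None when no products are found, so the caller can print a message or inspect the downloaded HTML rather than hitting the "list index out of range" error.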


 