I'm using BeautifulSoup in python, to scrape a website.
While the addrs
, a_earths
was crawled, points = soup.select('.addr_point')
at the end This section can't be crawled. I don't know the cause (the dashed red box in Image of webpage )
Following is code block I'm using:
import urllib.parse
from bs4 import BeautifulSoup
import re
url = 'http://www.dooinauction.com/auction/ca_list.php'
req = urllib.request.Request(url) #
html = urllib.request.urlopen(req).read()
soup = BeautifulSoup(html, 'html.parser')
tots = soup.select('div.title_left font') #total
tot = int(re.findall('\d+', tots[0].text)[0])
print(f'total : {tot}건')
url = f'http://www.dooinauction.com/auction/ca_list.php?total_record={tot}&search_fm_off=1&search_fm_off=1&start=0'
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')
addrs = soup.select('.addr') # crawling OK
a_earths = soup.select('.list_class.bold') #crawling OK
points = soup.select('.addr_point') #crawling NO
print()
I browse your website and it seems that I can't see the addr_points section. I think maybe this is the reason.
Screenshot:
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.