I've written a scraper in Python to get the category names from a webpage, but it fetches nothing from that page. I can't figure out where I'm going wrong. Any help would be greatly appreciated.
Here is the link to the webpage: URL
Here is what I've tried so far:
from bs4 import BeautifulSoup
import requests
res = requests.get("replace_with_above_url",headers={"User-Agent":"Mozilla/5.0"})
soup = BeautifulSoup(res.text,"lxml")
for items in soup.select('.slide_container .h3.standardTitle'):
    print(items.text)
Here is one of the elements containing a category name I'm after:
<div class="slide_container">
  <a href="/offers/furniture/" tabindex="0">
    <picture style="float: left; width: 100%;"><img style="width:100%" src="/_m4/9/8/1513184943_4413.jpg" data-w="270"></picture>
    <div class="floated-details inverted" style="height: 69px;">
      <div class="h3 margin-top-sm margin-bottom-sm standardTitle">
        Furniture Offers #This is the name I'm after
      </div>
      <p class="carouselDesc">
      </p>
    </div>
  </a>
</div>
from bs4 import BeautifulSoup
import requests
# Headers copied from a real browser request
headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    # 'br' (Brotli) is only decoded by requests if a Brotli package is installed
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'cache-control': 'max-age=0',
    'referer': 'https://www.therange.co.uk/',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36',
}
res = requests.get("https://www.therange.co.uk/", headers=headers)
soup = BeautifulSoup(res.text, 'html.parser')
# Print the text of every category title inside a slide container
for items in soup.select('.slide_container .h3.standardTitle'):
    print(items.text)
Try this
A User-Agent alone is not always enough, because the other request headers matter when scraping. If headers that a browser would normally send are missing, the server may treat you as a bot.
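If several pages need to be fetched, one way to avoid repeating the headers is to attach them to a requests.Session and inspect what would actually be sent before making the call. The following is only a sketch under assumptions: the header set is adapted from the answer above, and Accept-Encoding is deliberately left out so that requests keeps its own default for that header. Whether this particular set is what the site checks for is not confirmed.

```python
import requests

# Browser-like headers adapted from the answer above. Accept-Encoding is left
# out on purpose so requests keeps its own default value for it.
BROWSER_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://www.therange.co.uk/",
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/63.0.3239.84 Safari/537.36"
    ),
}

# Headers set on the session are merged into every request made through it.
session = requests.Session()
session.headers.update(BROWSER_HEADERS)

# Build the request without sending it, to see which headers would go out.
request = requests.Request("GET", "https://www.therange.co.uk/")
prepared = session.prepare_request(request)
for name, value in prepared.headers.items():
    print(f"{name}: {value}")

# To actually fetch the page: res = session.get("https://www.therange.co.uk/")
```

Printing the prepared headers makes it easy to compare them with what the browser's network tab shows for the same page.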
Use "html.parser" instead of "lxml":
soup = BeautifulSoup(res.text,"html.parser")
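To separate a selector problem from a response problem, the selector can be run offline against the markup posted in the question. This is a minimal sketch: SAMPLE_HTML is a trimmed copy of that snippet, and category_names is a hypothetical helper name, not something from the original code. If this prints the name but the live script prints nothing, the response body probably does not contain that markup in the form the parser expects.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (picture tag and note removed).
SAMPLE_HTML = """
<div class="slide_container">
  <a href="/offers/furniture/" tabindex="0">
    <div class="floated-details inverted" style="height: 69px;">
      <div class="h3 margin-top-sm margin-bottom-sm standardTitle">
        Furniture Offers
      </div>
      <p class="carouselDesc"></p>
    </div>
  </a>
</div>
"""


def category_names(html):
    """Return the whitespace-stripped text of every category title."""
    soup = BeautifulSoup(html, "html.parser")
    titles = soup.select(".slide_container .h3.standardTitle")
    return [title.get_text(strip=True) for title in titles]


names = category_names(SAMPLE_HTML)
print(names)
```

Using get_text(strip=True) instead of .text also drops the surrounding newlines and indentation that the original loop would print.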