How can I get an image when the code looks like this:
<div class="galery-images">
<div class="galery-images-slide" style="width: 760px;">
<div class="galery-item galery-item-selected" style="background-image: url(/images/photo/1/20130206/30323/136666697057736800.jpg);"></div>
I want to get 136666697057736800.jpg. I wrote:
images = soup.select("div.galery-item")
And I get a list:
[<div class="galery-item galery-item-selected" style="background-image: url(/images/photo/1/20130206/30323/136666697057736800.jpg);"></div>,
<div class="galery-item" style="background-image: url(/images/photo/1/20130206/30323/136013892671126300.jpg);" ></div>,
<div class="galery-item" style="background-image: url(/images/photo/1/20130206/30323/136666699218876700.jpg);"></div>]
I don't understand: how can I get all the images?
Use a regex or a CSS parser to extract the URL, prepend the host to it, and then download the image like this:
# Python 3; in Python 2 the same function is urllib.urlretrieve
from urllib.request import urlretrieve
urlretrieve("https://www.google.com/images/srpr/logo11w.png", "google.png")
To make your life easier, you should use a regex:
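import re

# Capture whatever sits inside url(...) in the inline style.
pattern = re.compile(r'.*background-image:\s*url\((.*)\);')
urls = []
for ele in soup.find_all('div', attrs={'class': 'galery-images-slide'}):
    # Check every child div that has a style attribute, not only the first one.
    for child in ele.find_all('div', style=True):
        match = pattern.match(child['style'])
        if match:
            urls.append(match.group(1))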
This works by finding the child divs of the parent div (the one with the class 'galery-images-slide'). A regex then parses each child div's style attribute, which contains the background-image URL.
So, from your above example, this will output:
[u'/images/photo/1/20130206/30323/136666697057736800.jpg']
Now, to download the image, prepend the site's address to the URL, and you should be able to download it.
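As a rough sketch of that last step, the snippet below joins a host with the extracted path and derives a local file name from it. The host `http://example.com/` and the helper names `to_absolute`, `local_name`, and `download_all` are placeholders made up for illustration, since the question does not say which site the images come from. It assumes Python 3.

```python
import os
from urllib.parse import urljoin, urlparse
from urllib.request import urlretrieve

BASE_URL = "http://example.com/"  # hypothetical host; replace with the real site


def to_absolute(base_url, path):
    """Join the site address with a path taken from the style attribute."""
    return urljoin(base_url, path)


def local_name(url):
    """Use the last path component as the local file name."""
    return os.path.basename(urlparse(url).path)


def download_all(base_url, paths, target_dir="."):
    """Download every image and return the list of files written."""
    saved = []
    for path in paths:
        url = to_absolute(base_url, path)
        destination = os.path.join(target_dir, local_name(url))
        urlretrieve(url, destination)  # performs the actual HTTP request
        saved.append(destination)
    return saved


urls = ['/images/photo/1/20130206/30323/136666697057736800.jpg']
print(to_absolute(BASE_URL, urls[0]))
print(local_name(urls[0]))
```

Calling `download_all(BASE_URL, urls)` would then fetch each file into the current directory, provided the placeholder host is replaced with the real one.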
NOTE: This requires Python's regex module (re) in addition to BeautifulSoup. The regex I used is quite naive, but you can adjust it to suit your needs.
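One possible adjustment, shown here only as a sketch, is a pattern that also accepts quotes around the URL, extra whitespace, and a missing trailing semicolon. The name `extract_image_urls` is made up for this example, and the last entry in `styles` is a hypothetical quoted variant, not something from the question. The function works on plain style strings, which with BeautifulSoup could be collected with something like `[div.get("style") for div in soup.select("div.galery-item")]`.

```python
import re

# Accepts optional quotes around the URL, surrounding whitespace,
# and does not require a semicolon after the closing parenthesis.
URL_IN_STYLE = re.compile(
    r"background-image\s*:\s*url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)",
    re.IGNORECASE,
)


def extract_image_urls(styles):
    """Return the URL found in each style string, skipping non-matching ones."""
    urls = []
    for style in styles:
        match = URL_IN_STYLE.search(style or "")
        if match:
            urls.append(match.group(1))
    return urls


styles = [
    # The three style values from the question.
    "background-image: url(/images/photo/1/20130206/30323/136666697057736800.jpg);",
    "background-image: url(/images/photo/1/20130206/30323/136013892671126300.jpg);",
    "background-image: url(/images/photo/1/20130206/30323/136666699218876700.jpg);",
    # Hypothetical variant: quoted URL, extra spaces, no semicolon.
    'width: 10px; background-image:url( "/images/example.jpg" )',
]

for url in extract_image_urls(styles):
    print(url)
```

Using `search` instead of `match` means the `background-image` declaration no longer has to come first in the style attribute.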