I need to extract all images from a website using Selenium. This should include images of any extension (png, jpg, svg, etc.) referenced from HTML, CSS, and JavaScript. This means that simply extracting all the <img> elements is not sufficient (e.g. any image loaded from a CSS style would be missed):
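from selenium.webdriver.common.by import By
images = driver.find_elements(By.TAG_NAME, 'img')  # not sufficient (find_elements_by_tag_name was removed in Selenium 4.3)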
Is there a smarter approach than downloading and parsing every CSS and JavaScript file the site requires and using regex to look for image files?
Ideally, there would be a way to just look at the resources downloaded after the page loads, similar to the Network tab in Chrome DevTools.
Any idea?
This answer is adapted from "How to access Network panel on google chrome developer tools with selenium?", with minor updates. It reads the browser's resource timing entries after the page has loaded:
resources = driver.execute_script("return window.performance.getEntriesByType('resource');")
for resource in resources:
    if resource['initiatorType'] == 'img':  # check for other types if needed
        print(resource['name'])  # this is the original link of the file
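Note that the 'img' initiator type only covers <img> elements; images requested from a stylesheet are typically reported with initiatorType 'css', and ones fetched by scripts with types such as 'fetch' or 'xmlhttprequest'. A minimal sketch of a broader filter follows, assuming `resources` is the list returned by the snippet above. The helper names (`is_image_url`, `image_urls`) and the extension list are illustrative assumptions, not part of Selenium or the original answer.

```python
import posixpath
from urllib.parse import urlparse

# Assumed list of image extensions; extend it as needed.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".ico", ".bmp", ".avif"}
# Initiator types that always denote an image element (<img>, SVG <image>).
IMAGE_INITIATORS = {"img", "image"}


def is_image_url(url):
    """Return True if the URL path ends with a known image extension."""
    path = urlparse(url).path  # drops the query string and fragment
    return posixpath.splitext(path)[1].lower() in IMAGE_EXTENSIONS


def image_urls(resources):
    """Collect unique image URLs from resource timing entries, in order."""
    seen = set()
    urls = []
    for resource in resources:
        name = resource.get("name", "")
        by_initiator = resource.get("initiatorType") in IMAGE_INITIATORS
        if (by_initiator or is_image_url(name)) and name not in seen:
            seen.add(name)
            urls.append(name)
    return urls
```

With a live driver this would be called as image_urls(resources). The extension check is only a heuristic: an image loaded from CSS or JavaScript through a URL with no image extension would still be missed. Browsers also cap the number of stored resource timing entries (commonly 250 by default), so very heavy pages may need the buffer enlarged via performance.setResourceTimingBufferSize before the resources load.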