I'm trying to download some images (let's say the first 10) from a website. The problem is that I don't know how HTML works.
What I did so far:
from selenium import webdriver
import time
driver = webdriver.Chrome("C:\web_driver\chromedriver")
url = "https://9gag.com/"
driver.get(url)
time.sleep(5)
driver.find_element_by_xpath("/html/body/div[7]/div[1]/div[2]/div/div[3]/button[2]/span").click()
images = driver.find_elements_by_tag_name('img')
list = []
for image in images:
    print(image.get_attribute('src'))
    list.append(image.get_attribute('src'))
I want to download the images in the center of the page, but the program only retrieves the images in the left sidebar. My attempt to solve this problem is:
from selenium import webdriver
import time
driver = webdriver.Chrome("C:\web_driver\chromedriver")
url = "https://9gag.com/"
driver.get(url)
time.sleep(5)
# this part is to close the cookies pop up
driver.find_element_by_xpath("/html/body/div[7]/div[1]/div[2]/div/div[3]/button[2]/span").click()
images = driver.find_element_by_class_name("page").get_attribute("img")
list = []
for image in images:
    print(image.get_attribute('src'))
    # list.append(image.get_attribute('src'))
# print("list:", list)
time.sleep(1)
But I got the following error:
Traceback (most recent call last):
  File "C:/Users/asus/PycharmProjects/project1/36.py", line 14, in <module>
    for image in images:
TypeError: 'NoneType' object is not iterable
Process finished with exit code 1
The <div class=page> element doesn't have an img attribute, so get_attribute("img") returns None, and looping over None raises the TypeError. What you have to look for is the <img> tag. Also note that find_element_by_ only returns one element; to get a list of elements you have to use find_elements_by_.
The images in the center of the page can be matched with this XPath:
//div[contains(@id,'stream-')]//div[@class='post-container']//picture/img
GIFs are not images and are not inside an <img> tag, so you will only be able to get the still images with this method. Try this:
images = driver.find_elements_by_xpath("//div[contains(@id,'stream-')]//div[@class='post-container']//picture/img")
list = []
for image in images:
    print(image.get_attribute('src'))
    list.append(image.get_attribute('src'))
This puts the src URL of every image it finds into the list.
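If it helps to see the difference between an attribute and a child tag without launching a browser, here is a minimal sketch using Python's standard xml.etree.ElementTree instead of Selenium. The markup is a made-up, simplified stand-in (not 9gag's real HTML), and ElementTree's get / find / findall only play the roles of Selenium's get_attribute / find_element_by_ / find_elements_by_ by analogy:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page markup (NOT 9gag's real HTML).
html = """
<div class="page">
  <div id="stream-0">
    <div class="post-container"><picture><img src="a.jpg"/></picture></div>
    <div class="post-container"><picture><img src="b.jpg"/></picture></div>
  </div>
</div>
"""

page = ET.fromstring(html)

# "img" is a child tag, not an attribute of the div, so the attribute lookup gives None.
print(page.get("img"))

# Looping over that None is what produces the TypeError from the question.
try:
    for image in page.get("img"):
        pass
except TypeError as exc:
    print(exc)

# find() returns only the first match; findall() returns every match as a list.
first = page.find(".//picture/img")
all_imgs = page.findall(".//div[@class='post-container']//picture/img")
sources = [img.get("src") for img in all_imgs]
print(first.get("src"), sources)
```

The same pattern applies in Selenium: select the <img> elements first, then read the src attribute from each one.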
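Since the goal was to download the first 10 images, here is a rough sketch of that last step using only the standard library. The function name download_images, the image_N file naming, and the .jpg fallback extension are my own choices for illustration, not something from the question. It assumes you already have the list of src URLs collected above:

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_images(urls, dest_dir, limit=10):
    """Save up to `limit` of the given image URLs into dest_dir; return the saved paths.

    Hypothetical helper for illustration only.
    """
    os.makedirs(dest_dir, exist_ok=True)
    # Skip empty entries (an <img> without a src gives None), then keep the first `limit`.
    wanted = [url for url in urls if url][:limit]
    saved = []
    for index, url in enumerate(wanted):
        # Reuse the extension from the URL path; fall back to .jpg if there is none.
        ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
        path = os.path.join(dest_dir, f"image_{index}{ext}")
        with urllib.request.urlopen(url) as response, open(path, "wb") as out:
            out.write(response.read())
        saved.append(path)
    return saved
```

With the list from the code above, calling download_images(list, "images") should save the files into an images folder next to the script. Some servers may refuse requests that don't look like they come from a browser, so this may need a User-Agent header added depending on the site.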