
Download a photo from Instagram with Python

I have a problem that I can't seem to find an answer to. What I want to achieve: download the first photo on a person's page.

I was going to do it using chromedriver and then get the HTML tag for the scontent-link. After that I was probably going to write some code to download the photo to a specific folder on my PC using the link.

The code I wanted to use for reference is:

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import os
import pandas as pd

#set up chromedriver
chromedriver = "E:/Instabot/chromedriver.exe"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)
actions = ActionChains(driver)

base_url = "https://www.instagram.com/"

#go to a picture; images[n] is the n-th picture in their feed
images = driver.find_elements_by_class_name("_bz0w")
image_curr = images[1].find_element_by_tag_name("a").get_attribute("href")

#Find the HTML class that has the like count
likes = driver.find_elements_by_class_name("Nm9Fw")
Like_list =[]

for l in likes:
   # read the like count text and collect it
   like_text = l.find_element_by_css_selector('span').get_attribute("textContent")
   Like_list.append(like_text)

listToStr = ' '.join([str(elem) for elem in Like_list])

df = pd.DataFrame({"Likes:": Like_list})
df.to_csv("instagram_likes.txt", index=False)

I used this code to extract the like count from a post. I am not a skilled or advanced programmer so my code may be messy...

I hope someone can help me with this problem!

You can use Selenium to get the image's src, but you then need requests or urllib to download the file itself.

import requests

# ... selenium code ... 

img_src = driver.find_element_by_xpath('//div/img').get_attribute("src")
print('img:', img_src)

r = requests.get(img_src)

# the file has to be opened in `bytes` mode;
# `with` closes it once the data is written
with open('image.jpg', 'wb') as fp:
    fp.write(r.content)  # it has to be `r.content`, not `r.text`
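
Since the question asks for the photo to end up in a specific folder, the download step could also be wrapped in a small helper. This is only a sketch using urllib from the standard library instead of requests; the name `save_image`, its parameters, and the 10-second timeout are assumptions made for illustration and do not come from the original code.

```python
import os
import shutil
import urllib.request


def save_image(img_src, target_dir, filename="image.jpg", timeout=10):
    """Download img_src into target_dir/filename and return the saved path.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # create the target folder if it does not exist yet
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, filename)

    # stream the response body straight into a file opened in bytes mode
    with urllib.request.urlopen(img_src, timeout=timeout) as response, \
            open(path, "wb") as fp:
        shutil.copyfileobj(response, fp)

    return path
```

It would be called with the `src` that Selenium returned, for example `save_image(img_src, "E:/Instabot/photos")`, where the folder name is just a placeholder.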

EDIT: Full code which I used to test it.

from selenium import webdriver
import requests

#set up chromedriver
#chromedriver = "E:/Instabot/chromedriver.exe"
#os.environ["webdriver.chrome.driver"] = chromedriver
#driver = webdriver.Chrome(chromedriver)
driver = webdriver.Firefox()

base_url = "https://www.instagram.com/"
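handle = "nobody"  # this is a real account name

# open the profile page before looking for elements on it
driver.get(base_url + handle)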

images = driver.find_elements_by_class_name("_bz0w")

# first get all `href` values as text,
# because after calling `driver.get()` the element objects from this page are no longer accessible
images_href = []
for img in images:
    href = img.find_element_by_tag_name("a").get_attribute("href")
    images_href.append(href)

# now we can get all images
for number, href in enumerate(images_href):
    # open the post page so its image can be located
    driver.get(href)
    img_src = driver.find_element_by_xpath('//div/img').get_attribute("src")
    print('img:', img_src)

    r = requests.get(img_src)
    filename = f'image-{number}.jpg' 
    with open(filename, 'wb') as fp:
        fp.write(r.content)
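
As a possible variation on the `image-{number}.jpg` naming, the file name could be taken from the image URL itself. The sketch below uses only the standard library; `filename_from_url` and its `fallback` parameter are hypothetical names, and it assumes the `src` URL ends in a usable file name before any query string.

```python
import posixpath
from urllib.parse import urlparse


def filename_from_url(img_src, fallback="image.jpg"):
    """Return the last path segment of img_src, ignoring the query string.

    Hypothetical helper: falls back to `fallback` when the path has no name.
    """
    name = posixpath.basename(urlparse(img_src).path)
    return name or fallback
```

Inside the loop this would replace the f-string, for example `filename = filename_from_url(img_src, f'image-{number}.jpg')`.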

