Download a photo from Instagram with Python
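I have a problem that I can't seem to find an answer to. What I want to achieve: download the first photo on a person's page.
I plan to do this with chromedriver and then grab the HTML tag that holds the scontent link. After that, I would probably write some code that uses the link to download the photo into a specific folder on my computer.
The code I want to use as a reference is: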
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import os
import pandas as pd  # needed for the DataFrame export at the end
#set up chromedriver
chromedriver = "E:/Instabot/chromedriver.exe"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)
actions = ActionChains(driver)
base_url = "https://www.instagram.com/"
handle="username"
driver.get(base_url+handle)
# go to a picture; images[n] is the n-th picture in their feed (0-based)
images = driver.find_elements_by_class_name("_bz0w")
image_curr = images[1].find_element_by_tag_name("a").get_attribute("href")
driver.get(image_curr)
#Find the HTML class that has the like count
likes = driver.find_elements_by_class_name("Nm9Fw")
Like_list =[]
for l in likes:
    # use a separate name so the `likes` list being iterated is not overwritten
    like_text = l.find_element_by_css_selector('span').get_attribute("textContent")
    #print(str(like_text))
    Like_list.append(like_text)
listToStr = ' '.join([str(elem) for elem in Like_list])
#print(listToStr)
df = pd.DataFrame({"Likes:": Like_list})
df.to_csv("instagram_likes.txt", index=False)
I use this code to extract the like count from a post. I'm not a skilled or advanced programmer, so my code may be messy...
I hope someone can help me with this!
You can use Selenium to get the image src, but you will then need requests or urllib to download it.
import requests
# ... selenium code ...
img_src = driver.find_element_by_xpath('//div/img').get_attribute("src")
print('img:', img_src)
r = requests.get(img_src)
fp = open('image.jpg', 'wb') # it has to be `bytes` mode
fp.write(r.content) # it has to be `r.content`, not `r.text`
fp.close()
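The answer names urllib as an alternative to requests but only shows the requests version. A minimal sketch of the urllib variant might look like this, assuming img_src holds the URL that Selenium returned. The function name download_image and the default filename are made up for illustration and are not part of the original answer.

```python
import shutil
import urllib.request


def download_image(img_src, filename="image.jpg"):
    """Save the resource at img_src to filename using only the standard library.

    `download_image` is a hypothetical helper, not part of the original answer.
    """
    # urlopen returns a binary file-like response object
    with urllib.request.urlopen(img_src) as response:
        # the target file has to be opened in bytes mode, as with requests
        with open(filename, "wb") as fp:
            # copy the response body to disk in chunks
            shutil.copyfileobj(response, fp)
    return filename
```

It would be called as download_image(img_src, 'image.jpg') in place of the requests.get(...) and open(...) lines.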
EDIT: The full code I used to test it.
from selenium import webdriver
import requests
#set up chromedriver
#chromedriver = "E:/Instabot/chromedriver.exe"
#os.environ["webdriver.chrome.driver"] = chromedriver
#driver = webdriver.Chrome(chromedriver)
driver = webdriver.Firefox()
base_url = "https://www.instagram.com/"
handle = "nobody"  # this is a real account name
driver.get(base_url+handle)
images = driver.find_elements_by_class_name("_bz0w")
# first get all `href` values as text,
# because after `driver.get()` the element objects from the previous page are no longer accessible
images_href = []
for img in images:
    href = img.find_element_by_tag_name("a").get_attribute("href")
    images_href.append(href)
# now we can get all images
for number, href in enumerate(images_href):
    driver.get(href)
    img_src = driver.find_element_by_xpath('//div/img').get_attribute("src")
    print('img:', img_src)
    r = requests.get(img_src)
    filename = f'image-{number}.jpg'
    with open(filename, 'wb') as fp:
        fp.write(r.content)
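The question also asks for the photo to end up in a specific folder, which the code above does not cover: it writes to the current working directory with a fixed .jpg extension. One way to build the target path is sketched below. The helper name build_image_path, the default folder name, and the .jpg fallback are assumptions for illustration only.

```python
import os
from urllib.parse import urlparse


def build_image_path(img_src, number, folder="downloads"):
    """Return a path like <folder>/image-<number><ext> for an image URL.

    Hypothetical helper: the extension is taken from the URL path and
    falls back to .jpg when the URL path has none.
    """
    # create the target folder if it does not exist yet
    os.makedirs(folder, exist_ok=True)
    # urlparse(...).path drops any query string after the file name
    ext = os.path.splitext(urlparse(img_src).path)[1] or ".jpg"
    return os.path.join(folder, f"image-{number}{ext}")
```

Inside the download loop, filename = build_image_path(img_src, number, folder) could replace the f-string line, with folder set to whatever directory the photos should go into.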