How can I find the URL of an image from Google Images (or Bing) using Python (and preferably BS4)?
How to scrape high-resolution images from Google Images using BS4 in Python
We made a program that takes input through a tkinter GUI, goes to Google Images, and downloads images based on that input. Here is the code:
import requests
import bs4
import random
from PIL import Image
from tkinter import messagebox as msgbox
i=0
import os
from tkinter import *
from tkinter import filedialog
ac=str(random.randint(1,20))
b=str(random.randint(20,38))
y=Tk()
def find_file():
    aaa=filedialog.askdirectory()
    return aaa
def create_folder():
    ad=find_file()
    global ac
    global b
    ads=os.path.join(ad,f"Img{ac}{b}")
    os.mkdir(ads)
    return ads
defe=Entry(bg="white")
defe.grid(row=2,column=2)
adj=Label(text="Enter the name of the photo(s) you want to download :")
adj.grid(row=2,column=1)
ack=Label(text="How many photos you want to download?")
ack.grid(row=3,column=1)
dee=Entry(bg="white")
dee.grid(row=3,column=2)
def download_images():
    defei=defe.get()
    deee=int(dee.get())
    aadgc=[]
    play=True
    if " gif" in defei or ".gif" in defei:
        msgbox.showerror("GIF not supported",".gif format is not supported by this software.Sorry for the inconvenience")
        play=False
    while play:
        asd=create_folder()
        for start in range(0,400,20):
            bararara=f"https://www.google.co.in/search?q={defei}&source=lnms&tbm=isch&start={start}#imgrc=fTslNdnf0RRRxM"
            a=requests.get(bararara).text
            soup=bs4.BeautifulSoup(a,"lxml")
            ab=soup.find_all("img",{"class":"n3VNCb"},limit=deee)
            aadgc.extend(ab)
        aa=[abb["src"] for abb in aadgc]
        for source in aa:
            r=random.randint(0,100)
            ra=random.randint(0,1000)
            raa=asd+"\\"+str(r)+str(ra)+".png"
            try:
                binary=requests.get(source).content
            except requests.exceptions.MissingSchema:
                binary=requests.get("http:"+source).content
            except:
                binary=requests.get("https:"+source).content
            with open(raa,"wb") as saaho:
                saaho.write(binary)
                saaho.close()
            global i
            i+=1
            if i==int(deee):
                break
        asd=asd.replace("/","\\")
        os.system(f"explorer \"{asd}\"")
        break
aadg=Button(y,bg="red",text="Download!",command=lambda:download_images(),activebackground="dark red",activeforeground="grey")
aadg.grid(row=4,column=1)
y.mainloop()
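However, what we get are thumbnails rather than the actual images: the program only returns low-resolution photos, and it does not support .gif images.
We also could not find the class that the main (full-size) image belongs to. Thanks.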
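Using Selenium:
Click an image in the search results.
Wait until the full-size image is visible.
# find_element_by_css_selector was removed in Selenium 4; this form needs: from selenium.webdriver.common.by import By
image_link = driver.find_element(By.CSS_SELECTOR, ".tvh9oe.BIB1wf.eHAdSb>img").get_attribute("src")
You can use the same locator with bs4.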
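The click-and-wait steps above could be sketched roughly as follows. This is only an illustration under assumptions: `first_full_image_url`, `is_full_resolution_src`, and the `thumbnail_css` parameter are hypothetical names, the preview selector is the one quoted in this answer and may no longer match Google's markup, and treating `encrypted-tbn` hosts as thumbnails is a heuristic rather than a documented rule.

```python
# Selector quoted in the answer above; Google changes its class names often,
# so treat it as a placeholder that may need updating.
PREVIEW_CSS = ".tvh9oe.BIB1wf.eHAdSb>img"


def is_full_resolution_src(src):
    """Heuristic: True if src looks like a remote image and not a thumbnail."""
    if not src or not src.startswith(("http://", "https://")):
        return False  # covers None, "" and inline data: URIs
    # Assumption: Google serves cached thumbnails from encrypted-tbn*.gstatic.com.
    return "encrypted-tbn" not in src


def first_full_image_url(driver, query_url, thumbnail_css,
                         preview_css=PREVIEW_CSS, timeout=10):
    """Open the results page, click the first thumbnail, and wait for the preview.

    thumbnail_css is a caller-supplied selector for a result thumbnail
    (hypothetical parameter; no particular value is assumed here).
    """
    # Imported lazily so the helper above works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(query_url)
    driver.find_element(By.CSS_SELECTOR, thumbnail_css).click()

    def preview_src(d):
        # The preview <img> may briefly show the thumbnail before the
        # original loads, so keep polling until a non-thumbnail src appears.
        for img in d.find_elements(By.CSS_SELECTOR, preview_css):
            src = img.get_attribute("src")
            if is_full_resolution_src(src):
                return src
        return False

    # until() returns the first truthy value and raises TimeoutException otherwise.
    return WebDriverWait(driver, timeout).until(preview_src)
```

Polling on the `src` value rather than on visibility alone is a deliberate choice in this sketch: an image element can already be visible while it still shows the low-resolution placeholder.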
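To find the original, full-resolution image, you first have to get the image's data-tbnid. In this case it is: sd7iKvYzujke_M. Once you have the ID, you can use a regular expression to extract the URL of the full original image from the page source.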
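A minimal sketch of that regular-expression step, run against a made-up excerpt. The layout of `SAMPLE` (the ID followed by `["url",height,width]` entries, thumbnail first) is an assumption about how the page source might look, not a documented format, and `find_original_url` is a hypothetical helper name. The real page source changes over time, so the pattern would need checking against an actual response.

```python
import json
import re

# Made-up excerpt for illustration only; the real page source may differ.
SAMPLE = (
    r'["sd7iKvYzujke_M",'
    r'["https://encrypted-tbn0.gstatic.com/images?q\u003dtbn:abc",183,275],'
    r'["https://example.com/full/photo.jpg?w\u003d1200",800,1200]]'
)

# Matches entries of the form ["<url>",<number>,<number>].
URL_TRIPLE = re.compile(r'\["(https?://[^"]+)",(\d+),(\d+)\]')


def find_original_url(page_source, tbnid):
    """Return the first non-thumbnail URL listed after the given data-tbnid."""
    start = page_source.find(tbnid)
    if start == -1:
        return None
    for url, _height, _width in URL_TRIPLE.findall(page_source, start):
        if "encrypted-tbn" in url:
            continue  # assumed to be the cached thumbnail, skip it
        # URLs embedded in the page's JavaScript carry escapes such as \u003d;
        # parsing the text as a JSON string literal decodes them.
        return json.loads('"' + url + '"')
    return None


print(find_original_url(SAMPLE, "sd7iKvYzujke_M"))
```

The data-tbnid values themselves can be read with bs4 from the result elements that carry that attribute, for example `soup.select("[data-tbnid]")`.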
Alternatively, you can use a third-party solution such as SerpApi. It is a paid API with a free trial.
from serpapi import GoogleSearch
params = {
"api_key": "secret_api_key",
"engine": "google",
"q": "inception",
"tbm": "isch"
}
search = GoogleSearch(params)
results = search.get_dict()
Example JSON output:
"images_results": [
{
"position": 1,
"thumbnail": "https://serpapi.com/searches/60e70bf0e815af01fd163d6a/images/39eac787b1522b4ccc71382ac53fc933e15aa52342a5d06fafca53990897f2f9.jpeg",
"source": "rottentomatoes.com",
"title": "Inception (2010) - Rotten Tomatoes",
"link": "https://www.rottentomatoes.com/m/inception",
"original": "https://flxt.tmsimg.com/assets/p7825626_p_v10_af.jpg"
},
{
"position": 2,
"thumbnail": "https://serpapi.com/searches/60e70bf0e815af01fd163d6a/images/39eac787b1522b4ce9c6c6de6d244df2098ae0b7da6856fba545b4376e01d075.jpeg",
"source": "imdb.com",
"title": "Inception (2010) - IMDb",
"link": "https://www.imdb.com/title/tt1375666/",
"original": "https://m.media-amazon.com/images/M/MV5BMjAxMzY3NjcxNF5BMl5BanBnXkFtZTcwNTI5OTM0Mw@@._V1_.jpg"
},
{
"position": 3,
"thumbnail": "https://serpapi.com/searches/60e70bf0e815af01fd163d6a/images/39eac787b1522b4ca9d2321fc39291764c3894c9816af84a5af355f2d54f6921.jpeg",
"source": "screenrant.com",
"title": "Inception: What Each Character Represents (Confirmed By Christopher Nolan)",
"link": "https://screenrant.com/inception-movie-christopher-nolan-characters-actors-meaning-confirmed/",
"original": "https://static2.srcdn.com/wordpress/wp-content/uploads/2020/03/Inception-characters-film-crew.jpg?q=50&fit=crop&w=960&h=500&dpr=1.5"
},
...
]
Check the documentation for more details.
Disclaimer: I work at SerpApi.
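As a rough sketch of what could be done with a response shaped like the example JSON above: collect the `original` URLs and save each file under the extension found in its URL, so a `.gif` stays a `.gif` instead of being written as `.png`. The helper names (`original_urls`, `filename_for`, `download_originals`), the numbered file names, and the `.jpg` fallback are assumptions made for this illustration, not part of the SerpApi client.

```python
import os
import urllib.request
from urllib.parse import urlparse


def original_urls(results, limit=None):
    """Collect the 'original' URL of every entry in results['images_results']."""
    urls = [item["original"]
            for item in results.get("images_results", [])
            if item.get("original")]
    return urls if limit is None else urls[:limit]


def filename_for(url, index, default_ext=".jpg"):
    """Build a numbered file name that keeps the extension from the URL path."""
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    # Falling back to .jpg when the URL has no extension is an assumption.
    return f"{index:03d}{ext or default_ext}"


def download_originals(results, folder, limit=None):
    """Download each original image into folder and return the saved paths."""
    os.makedirs(folder, exist_ok=True)
    saved = []
    for index, url in enumerate(original_urls(results, limit), start=1):
        target = os.path.join(folder, filename_for(url, index))
        # Some hosts reject requests without a browser-like User-Agent.
        request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urllib.request.urlopen(request, timeout=30) as response, \
                open(target, "wb") as out:
            out.write(response.read())
        saved.append(target)
    return saved
```

With the `results` dictionary from the snippet above, a call such as `download_originals(results, "inception", limit=5)` would save up to five files named `001.jpg`, `002.jpg`, and so on.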
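I need to scrape 40,000 search results to build our pharmacy product catalog. I have a CSV file with more than 40,000 drug names and barcodes, and for each product I need to download 5-10 image results and name them according to our software's index. I'm not a coder, so I need a ready-made program or any help you can offer.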
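One possible starting point for a batch job like this is to turn the CSV into a list of search queries and target file names first, and only then fetch images for each query. The sketch below covers just that planning step. The column names `name` and `barcode`, the barcode-based naming scheme, and the sample row are all assumptions, since the post does not describe the CSV layout or the software's index.

```python
import csv
import io


def plan_downloads(csv_file, per_product=5):
    """Yield (search query, list of target file stems) for each CSV row.

    Assumes columns called 'name' and 'barcode' (hypothetical; adjust to the
    real header) and names files <barcode>_01, <barcode>_02, ...
    """
    for row in csv.DictReader(csv_file):
        name = (row.get("name") or "").strip()
        barcode = (row.get("barcode") or "").strip()
        if not name or not barcode:
            continue  # skip incomplete rows rather than guessing
        stems = [f"{barcode}_{n:02d}" for n in range(1, per_product + 1)]
        yield name, stems


# Made-up sample data, only to show the shape of the output.
sample_csv = io.StringIO("name,barcode\nAspirin 100 mg,0001234567890\n")
for query, stems in plan_downloads(sample_csv, per_product=2):
    print(query, stems)
```

For a real file, `open("products.csv", newline="", encoding="utf-8")` would replace the `io.StringIO` sample, where `products.csv` is a placeholder path.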
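Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you republish them, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.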