Download web images by URL from Excel and save to folders in Python
I have an Excel file that looks like this:
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
pd.options.display.max_colwidth
df = pd.read_excel("./test.xlsx")
print(df)
Output:
city buildingName buildingID imgType imgUrl
0 bj LG tower 123456 inside http://pic3.nipic.com/20090629/827780_144001014_2.jpg
1 bj LG tower 123456 outside http://pic.baike.soso.com/p/20140321/20140321160157-391052318.jpg
2 sh LXD 123457 inside http://pic10.nipic.com/20101008/2634566_104534032717_2.jpg
3 gz GM 123458 inside http://pic1.to8to.com/case/day_120720/20120720_fb680a57416b8d16bad2kO1kOUIzkNxO.jpg
I need to download the images by reading and iterating over the imgUrl column, then save each image to a path built from the columns city, buildingName, buildingID and imgType.
The final folder and subfolder structure should look like this, saved inside a folder named output:
├── bj
│   └── LG tower_123456
│       ├── inside
│       │   └── 827780_144001014_2.jpg
│       └── outside
│           └── 20140321160157-391052318.jpg
├── gz
│   └── GM_123458
│       └── inside
│           └── 20120720_fb680a57416b8d16bad2kO1kOUIzkNxO.jpg
└── sh
    └── LXD_123457
        └── inside
            └── 2634566_104534032717_2.jpg
How can I do this in Python? Thanks for your help.
I tried downloading a single image:
import requests
r = requests.get("http://pic1.to8to.com/case/day_120720/20120720_fb680a57416b8d16bad2kO1kOUIzkNxO.jpg")
if r.status_code == 200:
    with open("test.jpg", "wb") as f:
        f.write(r.content)
Assuming the DataFrame is already loaded, you can do something like this.
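import requests
from os.path import join
for index, row in df.iterrows():
    url = row['imgUrl']  # the URL column in the question is named imgUrl
    file_name = url.split('/')[-1]
    r = requests.get(url)
    # Target path: city/buildingName_buildingID/imgType/file_name
    abs_file_name = join(row['city'], row['buildingName'] + '_' + str(row['buildingID']), row['imgType'], file_name)
    if r.status_code == 200:
        # The target folder must already exist, otherwise open() fails
        with open(abs_file_name, "wb") as f:
            f.write(r.content)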
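The loop above builds each target path from the row's columns plus the last segment of the URL. As a minimal sketch of just that step, assuming the `buildingName_buildingID` folder naming shown in the desired tree, it could be isolated like this. The helper names `file_name_from_url` and `target_path` are hypothetical, and `PurePosixPath` is used only so the printed result looks the same on every platform.

```python
import posixpath
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def file_name_from_url(url):
    """Return the last path segment of a URL, ignoring any query string."""
    return posixpath.basename(urlsplit(url).path)


def target_path(root, city, building_name, building_id, img_type, url):
    """Build root/city/buildingName_buildingID/imgType/<file name>."""
    building_dir = f"{building_name}_{building_id}"
    return PurePosixPath(root, city, building_dir, img_type, file_name_from_url(url))


# Example using the first row of the table in the question.
example = target_path(
    "output", "bj", "LG tower", 123456, "inside",
    "http://pic3.nipic.com/20090629/827780_144001014_2.jpg",
)
print(example)
```

Keeping the path logic free of any network or disk access makes it easy to check against the expected tree before downloading anything.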
Edited code:
import os
from os.path import join, expanduser
import pandas as pd
import requests
home = expanduser("~")  # images are saved under the user's home directory
df = pd.read_excel("./test.xlsx")  # the Excel file from the question
for index, row in df.iterrows():
    url = row['imgUrl']
    file_name = url.split('/')[-1]
    r = requests.get(url)
    filepath = join(home, row['city'], row['buildingName'] + '_' + str(row['buildingID']), row['imgType'])
    # Create the folder hierarchy if it does not exist yet
    if not os.path.exists(filepath):
        os.makedirs(filepath)
    filepath = join(filepath, file_name)
    # print(filepath)
    if r.status_code == 200:
        with open(filepath, "wb") as f:
            f.write(r.content)
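The same idea can be wrapped in a function that writes under an `output` folder, as the question asks, and skips rows whose download fails. This is only a sketch under a few assumptions: the rows are plain dicts keyed by the question's column names, the standard library's `urllib.request` stands in for `requests`, and the names `fetch_bytes` and `save_images` are hypothetical.

```python
import os
import urllib.request


def fetch_bytes(url, timeout=10):
    """Default fetcher (an assumption): return the response body, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except OSError:  # URLError and socket timeouts are subclasses of OSError
        return None


def save_images(rows, root="output", fetch=fetch_bytes):
    """Save each row's image to root/city/buildingName_buildingID/imgType/.

    `rows` is an iterable of dicts keyed by the column names from the question.
    Returns the list of file paths that were written.
    """
    written = []
    for row in rows:
        url = row["imgUrl"]
        data = fetch(url)
        if data is None:
            continue  # skip rows whose download failed
        folder = os.path.join(
            root,
            str(row["city"]),
            f"{row['buildingName']}_{row['buildingID']}",
            str(row["imgType"]),
        )
        os.makedirs(folder, exist_ok=True)  # create missing parent folders too
        path = os.path.join(folder, url.split("/")[-1])
        with open(path, "wb") as f:
            f.write(data)
        written.append(path)
    return written
```

With a DataFrame loaded as in the question, it could be called as `save_images(df.to_dict("records"))`. Passing the fetcher as a parameter is a design choice that lets the folder logic be tried without network access.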
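import os
import pandas as pd
import requests
def download_urls(csv_path):
    # on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0
    df = pd.read_csv(csv_path, encoding='utf-8', on_bad_lines='skip')
    for index, row in df.iterrows():
        # Columns are read by position; adjust the indexes to your file
        folder = row.iloc[0]
        sub_folder = row.iloc[1]
        url = row.iloc[3]
        r = requests.get(url)
        if r.status_code == 200:
            # The folder/sub_folder directory must already exist
            with open(os.path.join(str(folder), str(sub_folder), url.split("/")[-1]), "wb") as f:
                f.write(r.content)
path = r"C:\path\your_csv_path"
download_urls(path)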
Try this, assuming your input is a CSV file. There is no elegant way to iterate over rows with pandas, so you could use the csv library instead.
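For the csv-library route mentioned here, a minimal sketch might read the rows with `csv.DictReader`, so each row becomes a dict keyed by the header line. It assumes the CSV uses the same column names as the Excel sheet in the question. The function name `read_image_rows` is hypothetical, and the in-memory sample stands in for a real file.

```python
import csv
import io


def read_image_rows(csv_file):
    """Yield one dict per CSV row, keyed by the header line."""
    reader = csv.DictReader(csv_file)
    for row in reader:
        # Skip rows that have no URL (blank or missing imgUrl value).
        if row.get("imgUrl"):
            yield row


# Small in-memory sample standing in for a real file opened with
# open(csv_path, newline="", encoding="utf-8").
sample = io.StringIO(
    "city,buildingName,buildingID,imgType,imgUrl\n"
    "bj,LG tower,123456,inside,http://pic3.nipic.com/20090629/827780_144001014_2.jpg\n"
    "sh,LXD,123457,inside,\n"
)
rows = list(read_image_rows(sample))
```

Note that `csv.DictReader` returns every value as a string, so `buildingID` arrives as `"123456"` rather than an integer.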
import os
import pandas as pd
import requests
def download_urls(csv_path):
    df = pd.read_csv(csv_path, encoding='utf-8', on_bad_lines='skip')
    for index, row in df.iterrows():
        folder = row.iloc[0]
        sub_folder = row.iloc[1]
        url = row.iloc[3]
        r = requests.get(url)
        if r.status_code == 200:
            # Create folder/sub_folder (including parents) before writing
            target_dir = os.path.join(str(folder), str(sub_folder))
            if not os.path.exists(target_dir):
                os.makedirs(target_dir)
            with open(os.path.join(target_dir, url.split("/")[-1]), "wb") as f:
                f.write(r.content)
path = r"C:\path\your_csv_path"
download_urls(path)
Try this version, which creates the folders first if they do not exist.
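The create-if-missing step can also be done in one call: `os.makedirs` with `exist_ok=True` creates the whole nested path and does nothing if it is already there. A small sketch follows, with a hypothetical helper name `ensure_folder` and a temporary directory so that nothing is left on disk.

```python
import os
import tempfile


def ensure_folder(*parts):
    """Create the nested folder made of `parts` if needed and return its path."""
    folder = os.path.join(*parts)
    os.makedirs(folder, exist_ok=True)  # no error if it already exists
    return folder


# Demonstration inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as base:
    first = ensure_folder(base, "bj", "LG tower_123456")
    second = ensure_folder(base, "bj", "LG tower_123456")  # second call is a no-op
    created = os.path.isdir(first)
    same_path = first == second
```

This avoids the separate `os.path.exists` check, which could otherwise race with another process creating the same folder.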
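Notice: Technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.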