concurrent.futures multithreading with 2 lists as variables

I want to multithread the following working code with concurrent.futures, but nothing I have tried so far seems to work.

def download(song_filename_list, song_link_list):

    with requests.Session() as s:
    
        login_request = s.post(login_url, data= payload, headers= headers)

        for x in range(len(song_filename_list)):

            download_request = s.get(song_link_list[x], headers= download_headers, stream=True)

            if download_request.status_code == 200:
                print(f"Downloading {x+1} out of {len(song_filename_list)}!\n")
                pass
            else:
                print(f"\nStatus Code: {download_request.status_code}!\n")
                sys.exit()

            
            with open (song_filename_list[x], "wb") as file:
                file.write(download_request.content)

The two main variables are song_filename_list and song_link_list.

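The first list contains the name of each file and the second list contains their respective download links.
So the name and the link of each file sit at the same index.
For example: name_of_file1 = song_filename_list[0] and link_of_file1 = song_link_list[0]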


This is my latest attempt at multithreading:

def download(song_filename_list, song_link_list):

    with requests.Session() as s:
    
        login_request = s.post(login_url, data= payload, headers= headers)

        x = []
        for i in range(len(song_filename_list)):
            x.append(i)


        with concurrent.futures.ThreadPoolExecutor() as executor:
            executor.submit(get_file, x)


def get_file(x):
    
    download_request = s.get(song_link_list[x], headers= download_headers, stream=True)

    if download_request.status_code == 200:
        print(f"Downloading {x+1} out of {len(song_filename_list)}!\n")
        pass
    else:
        print(f"\nStatus Code: {download_request.status_code}!\n")
        sys.exit()

        
    with open (song_filename_list[x], "wb") as file:
        file.write(download_request.content)

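Could someone explain what I am doing wrong?
Nothing happens after the get_file function call.
It skips all the code and exits without any error, so where is my logic wrong?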


Edit 1:

After adding prints:

        print(song_filename_list, song_link_list)
        with concurrent.futures.ThreadPoolExecutor() as executor:
            print("Before executor.map")
            executor.map(get_file, zip(song_filename_list, song_link_list))
            print("After executor.map")
            print(song_filename_list, song_link_list)

as well as at the start and end of get_file and around its file.write

The output is as follows:


Succesfully logged in!

["songs names"] ["songs links"]    <- These are correct.
Before executor.map
After executor.map
["songs names"] ["songs links"]    <- These are correct.

Exiting.

In other words, the values are correct, but get_file never appears to run inside executor.map.


Edit 2:

These are the values used.

  • song_filename_list = ['100049 Himeringo - Yotsuya-san ni Yoroshiku.osz', '1001507 ZUTOMAYO - Kan Saete Kuyashiiwa.osz']

  • song_link_list = ['https://osu.ppy.sh/beatmapsets/100049/download', 'https://osu.ppy.sh/beatmapsets/1001507/download']


Edit 3:

After some tinkering, this seems to work.

for i in range(len(song_filename_list)):
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.submit(get_file, song_filename_list, song_link_list, i, s)
def get_file(song_filename_list, song_link_list, i, s):
    
    download_request = s.get(song_link_list[i], headers= download_headers, stream=True)

    if download_request.status_code == 200:
        print("Downloading...")
        pass
    else:
        print(f"\nStatus Code: {download_request.status_code}!\n")
        sys.exit()
    
    with open (song_filename_list[i], "wb") as file:
        file.write(download_request.content)

In your download() function you submit the whole list, whereas you should submit each item separately:

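def download(song_filename_list, song_link_list):

    with requests.Session() as s:
    
        login_request = s.post(login_url, data=payload, headers=headers)

        # Create the pool once. A pool created inside the loop waits for its
        # single task to finish before the next iteration, so nothing overlaps.
        with concurrent.futures.ThreadPoolExecutor() as executor:
            for i in range(len(song_filename_list)):
                # Pass everything get_file needs, as in the signature from Edit 3.
                executor.submit(get_file, song_filename_list, song_link_list, i, s)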
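
As for why the original attempt exited without any error: an exception raised inside a submitted task is not printed. It is stored on the returned Future and only re-raised when result() is called. In the attempt from the question, get_file received a whole list as x and had no access to s, so it most likely failed immediately and silently. Here is a minimal sketch of that behaviour, with a hypothetical broken_task function standing in for get_file:

```python
import concurrent.futures


def broken_task(index):
    # Hypothetical stand-in for get_file: indexing a list with a list
    # raises TypeError, much like song_link_list[x] with x = [0, 1, ...].
    items = ["a", "b"]
    return items[index]


with concurrent.futures.ThreadPoolExecutor() as executor:
    # Nothing is printed here even though the task fails.
    future = executor.submit(broken_task, [0, 1])

# The error only surfaces when the result is requested.
try:
    future.result()
    outcome = "ok"
except TypeError as exc:
    outcome = f"task failed: {type(exc).__name__}"

print(outcome)
```

Keeping the futures and calling result() on each of them is a simple way to make such failures visible while debugging.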

You can simplify this with the executor's .map() method:

def download(song_filename_list, song_link_list):
  with concurrent.futures.ThreadPoolExecutor() as executor:
    # map() pairs the two lists element by element; wrapping it in list()
    # consumes the results, so an exception raised in a worker is re-raised here.
    list(executor.map(get_file, song_filename_list, song_link_list))

where the get_file function is:

def get_file(song_name, song_link):
  # Each call has its own session, so it logs in itself: a fresh session
  # does not carry over cookies from a login made elsewhere.
  with requests.Session() as session:
    session.post(login_url, data=payload, headers=headers)
    download_request = session.get(song_link, headers=download_headers, stream=True)

    if download_request.status_code != 200:
      print(f"\nStatus Code: {download_request.status_code}!\n")
      return

    # Write the body while the session is still open.
    with open(song_name, "wb") as file:
      file.write(download_request.content)
    print(f"Downloaded {song_name}")
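
To see how executor.map pairs the two lists, here is a small sketch with a hypothetical fake_download function in place of the real network call (the file names and example.com links are made up). Passing the lists as two separate arguments calls the function with two parameters per item, whereas passing zip(...) as in Edit 1 hands each call a single tuple.

```python
import concurrent.futures


def fake_download(song_name, song_link):
    # Hypothetical stand-in for get_file: no network access, it only
    # reports which name was paired with which link.
    return f"{song_name} <- {song_link}"


names = ["first.osz", "second.osz"]
links = ["https://example.com/1", "https://example.com/2"]

with concurrent.futures.ThreadPoolExecutor() as executor:
    # One call per (name, link) pair; results come back in input order.
    results = list(executor.map(fake_download, names, links))

for line in results:
    print(line)
```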

This avoids sharing state between threads, which in turn avoids potential data races.
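
If creating a new session for every download turns out to be too costly, one possible middle ground is one session per worker thread, kept in threading.local. The sketch below is only an illustration under that assumption: FakeSession, get_session and fetch are hypothetical names, and FakeSession replaces requests.Session so the example runs without network access.

```python
import concurrent.futures
import threading

_thread_state = threading.local()


class FakeSession:
    # Hypothetical stand-in for requests.Session, so the sketch runs offline.
    def __init__(self):
        self.owner = threading.get_ident()

    def get(self, link):
        return f"content of {link}"


def get_session():
    # Create the session the first time a thread asks for one,
    # then reuse it for every later call on the same thread.
    if not hasattr(_thread_state, "session"):
        # Real code would create the session and log in here.
        _thread_state.session = FakeSession()
    return _thread_state.session


def fetch(link):
    session = get_session()
    # Also report whether the session belongs to the calling thread.
    return session.get(link), session.owner == threading.get_ident()


links = [f"https://example.com/{n}" for n in range(8)]
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(fetch, links))

print(all(same_thread for _, same_thread in results))
```

Each worker thread logs in at most once this way, and no session object is ever used by two threads.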

If you need to monitor how many songs have been downloaded, you can use tqdm, which has a thread_map iterator wrapper that does exactly that.
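
If you would rather not add a dependency, a rough equivalent can be sketched with the standard library's as_completed, which yields each future as it finishes. This is not tqdm's thread_map, only a plain counter, and fake_download is again a hypothetical stand-in for get_file.

```python
import concurrent.futures


def fake_download(song_name, song_link):
    # Hypothetical stand-in for get_file; returns the name once "done".
    return song_name


names = ["first.osz", "second.osz", "third.osz"]
links = ["https://example.com/1", "https://example.com/2", "https://example.com/3"]

finished = []
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [
        executor.submit(fake_download, name, link)
        for name, link in zip(names, links)
    ]
    # as_completed yields futures in completion order, not submission order.
    for count, future in enumerate(concurrent.futures.as_completed(futures), start=1):
        # result() re-raises any exception from the worker.
        finished.append(future.result())
        print(f"Downloaded {count} out of {len(futures)}")
```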
