Python threading - unexpected output
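I'm new to Python and have written the threaded script below. It takes each line of a file and passes it to the get_result function, which should output the URL and the status code (if it is 200 or 301).
The code is as follows: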
import requests
import Queue
import threading
import re
import time

start_time = int(time.time())
regex_to_use = re.compile(r"^")

def get_result(q, partial_url):
    partial_url = regex_to_use.sub("%s" % "http://www.domain.com/", partial_url)
    r = requests.get(partial_url)
    status = r.status_code
    #result = "nothing"
    if status == 200 or status == 301:
        result = str(status) + " " + partial_url
        print(result)

#need list of urls from file
file_list = [line.strip() for line in open('/home/shares/inbound/seo/feb-404s/list.csv', 'r')]
q = Queue.Queue()
for url in file_list:
    #for each partial. send to the processing function get_result
    t = threading.Thread(target=get_result, args=(q, url))
    t.start()

end_time = int(time.time())
exec_time = end_time - start_time
print("execution time was " + str(exec_time))
I used Queue and threading, but the "execution time was x" line is printed before the threads have finished outputting their data.
A typical output is:
200 www.domain.com/ok-url
200 www.domain.com/ok-url-1
200 www.domain.com/ok-url-2
execution time was 3
200 www.domain.com/ok-url-4
200 www.domain.com/ok-ur-5
200 www.domain.com/ok-url-6
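What is going on here, and how can I get the execution time to print at the end of the script, i.e. once all the URLs have been processed and output?
Thanks to the answer given by utdemir, here is the updated code with the joins added.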
import requests
import Queue
import threading
import re
import time

start_time = int(time.time())
regex_to_use = re.compile(r"^")

def get_result(q, partial_url):
    partial_url = regex_to_use.sub("%s" % "http://www.domain.com/", partial_url)
    r = requests.get(partial_url)
    status = r.status_code
    #result = "nothing"
    if status == 200 or status == 301:
        result = str(status) + " " + partial_url
        print(result)

#need list of urls from file
file_list = [line.strip() for line in open('/home/shares/inbound/seo/feb-404s/list.csv', 'r')]
q = Queue.Queue()
threads_list = []
for url in file_list:
    #for each partial. send to the processing function get_result
    t = threading.Thread(target=get_result, args=(q, url))
    threads_list.append(t)
    t.start()

for thread in threads_list:
    thread.join()

end_time = int(time.time())
exec_time = end_time - start_time
print("execution time was " + str(exec_time))
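You should join the threads to wait for them. Otherwise the main thread moves straight on to the final print while the workers keep running in the background.
Like this: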
threads = []
for url in file_list:
    ...
    threads.append(t)

for thread in threads:
    thread.join()  # Wait until each thread terminates

end_time = int(time.time())
...
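To see the effect of join() without depending on the network or the CSV file, here is a minimal self-contained sketch. It is written for Python 3, and the names fake_fetch and run_all are hypothetical: fake_fetch stands in for the requests.get() call by sleeping briefly, so this illustrates the pattern rather than the original script.

```python
import threading
import time


def fake_fetch(url, results, lock):
    # Hypothetical stand-in for requests.get(): sleep instead of
    # touching the network, then record a line like the script prints.
    time.sleep(0.1)
    with lock:
        results.append("200 " + url)


def run_all(urls):
    results = []
    lock = threading.Lock()
    threads = []
    for url in urls:
        t = threading.Thread(target=fake_fetch, args=(url, results, lock))
        threads.append(t)
        t.start()
    # Block here until every worker has finished.
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    start = time.time()
    done = run_all(["ok-url", "ok-url-1", "ok-url-2"])
    print("\n".join(sorted(done)))
    # Runs only after all threads have been joined.
    print("execution time was %.1f" % (time.time() - start))
```

Because the timing line comes after the join loop, it can only run once every worker has returned. Removing the join loop brings back the interleaved output described in the question.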