subprocess.Popen in Threads
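I have a large number of files (more than 4,000) that I want to load into PostgreSQL concurrently. I have split them into 4 separate file lists, and I want one thread per list to iterate over it and load the data.
The problem is that when I use os.system to call the loader program, it prevents the other threads from running at the same time. If I use subprocess.Popen instead, the loads do run concurrently, but each thread thinks it has finished executing, so the script moves on to its next part.
Am I doing this the right way? Or is there a better way to call subprocesses from within a thread?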
class Loader(object):
    def thread1Load(self, thread1fileList):
        connectionstring = settings.connectionstring
        postgreshost = settings.postgreshost
        postgresdatabase = settings.postgresdatabase
        postgresport = settings.postgresport
        postgresusername = settings.postgresusername
        postgrespassword = settings.postgrespassword
        tablename = None
        encoding = None
        connection = psycopg2.connect(connectionstring)
        for filename in thread1fileList:
            load_cmd = #load command
            run = subprocess.Popen(load_cmd, shell=True)
        print("finished loading thread 1")

    def thread2Load(self, thread2fileList):
        connectionstring = settings.connectionstring
        postgreshost = settings.postgreshost
        postgresdatabase = settings.postgresdatabase
        postgresport = settings.postgresport
        postgresusername = settings.postgresusername
        postgrespassword = settings.postgrespassword
        tablename = None
        connection = psycopg2.connect(connectionstring)
        for filename in thread2fileList:
            load_cmd = #load command
            run = subprocess.Popen(load_cmd, shell=True)
        print("finished loading thread 2")

    def thread3Load(self, thread3fileList):
        connectionstring = settings.connectionstring
        postgreshost = settings.postgreshost
        postgresdatabase = settings.postgresdatabase
        postgresport = settings.postgresport
        postgresusername = settings.postgresusername
        postgrespassword = settings.postgrespassword
        tablename = None
        connection = psycopg2.connect(connectionstring)
        for shapefilename in thread3fileList:
            load_cmd = #load command
            run = subprocess.Popen(load_cmd, shell=True)
        print("finished loading thread 3")

    def thread4Load(self, thread4fileList):
        connectionstring = settings.connectionstring
        postgreshost = settings.postgreshost
        postgresdatabase = settings.postgresdatabase
        postgresport = settings.postgresport
        postgresusername = settings.postgresusername
        postgrespassword = settings.postgrespassword
        tablename = None
        connection = psycopg2.connect(connectionstring)
        for filename in thread4fileList:
            load_cmd = #load command
            run = subprocess.Popen(load_cmd, shell=True)
        print("finished loading thread 4")

    def finishUp(self):
        print('finishing up')

def main():
    load = Loader()
    thread1 = threading.Thread(target=(load.thread1Load), args=(thread1fileList, ))
    thread2 = threading.Thread(target=(load.thread2Load), args=(thread2fileList, ))
    thread3 = threading.Thread(target=(load.thread3Load), args=(thread3fileList, ))
    thread4 = threading.Thread(target=(load.thread4Load), args=(thread4fileList, ))
    threads = [thread1, thread2, thread3, thread4]
    for thread in threads:
        thread.start()
        thread.join()
    load.finishUp(connectionstring)

if __name__ == '__main__':
    main()
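You don't need 4 separate methods: a single threadLoad method is enough. That way, if you need to change something in the method, you don't have to make the same change in 4 different places.
Use run.communicate() to block until the subprocess has finished.
The following starts one thread, blocks until that thread finishes, then starts the next thread, and so on:
    for thread in threads:
        thread.start()
        thread.join()
Instead, start all the threads first, then join all of them:
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()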
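The two behaviours described here can be checked in isolation. The following is a minimal sketch for Python 3, not the asker's loader: `CHILD_CMD`, `popen_then_wait` and `run_all` are hypothetical names, and the child process is just a Python interpreter that sleeps briefly, standing in for the real load command. It shows that `Popen` returns while the child is still running, that `communicate()` blocks until it exits, and that starting every thread before joining any of them lets the children overlap.

```python
import subprocess
import sys
import threading

# Stand-in for the real load command (an assumption for illustration):
# a child Python process that sleeps briefly and then exits.
CHILD_CMD = [sys.executable, "-c", "import time; time.sleep(0.3)"]


def popen_then_wait():
    """Return (poll() right after Popen, exit code after communicate())."""
    proc = subprocess.Popen(CHILD_CMD)
    state_after_start = proc.poll()  # None while the child is still running
    proc.communicate()               # blocks until the child exits
    return state_after_start, proc.returncode


def run_all(n_threads=4):
    """Start every thread first, then join them all."""
    exit_codes = [None] * n_threads

    def worker(i):
        proc = subprocess.Popen(CHILD_CMD)
        proc.communicate()           # keep this thread busy until the child is done
        exit_codes[i] = proc.returncode

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return exit_codes


if __name__ == "__main__":
    print(popen_then_wait())
    print(run_all())
```

Because each worker waits on its own child, a thread only reports completion once its subprocess has really finished, which is the behaviour the question was missing.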
import subprocess
import threading

import psycopg2
# `settings` is the asker's own configuration module

class Loader(object):
    def threadLoad(self, threadfileList):
        connectionstring = settings.connectionstring
        ...
        connection = psycopg2.connect(connectionstring)
        for filename in threadfileList:
            load_cmd = # load command
            run = subprocess.Popen(load_cmd, shell=True)
            # block until subprocess is done
            run.communicate()
        name = threading.current_thread().name
        print("finished loading {n}".format(n=name))

    def finishUp(self):
        print('finishing up')

def main():
    load = Loader()
    # one thread per file list, all sharing the same threadLoad method
    threads = [threading.Thread(target=load.threadLoad, args=(fileList, ))
               for fileList in (thread1fileList, thread2fileList,
                                thread3fileList, thread4fileList)]
    # start every thread before joining any of them
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # finishUp takes no arguments besides self
    load.finishUp()

if __name__ == '__main__':
    main()
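As an alternative to the answer's approach, and not part of it, a thread pool can hand files to 4 workers without splitting them into 4 lists beforehand. This is a rough sketch for Python 3.7+ using `concurrent.futures.ThreadPoolExecutor`. The names `build_load_cmd`, `load_file` and `load_all` are hypothetical, and because the real load command is elided in the original, the stand-in child process only echoes the filename it receives.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_load_cmd(filename):
    # Hypothetical stand-in: the real load command is elided in the original.
    # Here the child process just echoes the filename it was given.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])", filename]


def load_file(filename):
    # subprocess.run blocks until the child exits, so a worker thread only
    # picks up its next file once this one has finished.
    result = subprocess.run(build_load_cmd(filename),
                            capture_output=True, text=True)
    return filename, result.returncode, result.stdout.strip()


def load_all(filenames, workers=4):
    # Leaving the with-block waits for every submitted task to complete;
    # pool.map yields results in the same order as the input filenames.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load_file, filenames))


if __name__ == "__main__":
    for name, code, output in load_all(["a.shp", "b.shp", "c.shp"]):
        print(name, code, output)
```

Passing the command as a list avoids `shell=True`, so filenames containing spaces or shell metacharacters need no quoting. Whether that suits the real loader depends on how its command line is built.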