Parallel Program in Python produces no output
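I have a simple task: I need to run a particular function on a large number of files. This task can easily be parallelized.
Here is the code I am working with: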
# filelist is the directory containing two files, a.txt and b.txt.
# a.txt is the first file, b.txt is the second file.
# I pass a file that lists the names of the two files to the main program.
from concurrent.futures import ProcessPoolExecutor, as_completed
from pathlib import Path
import sys

def translate(filename):
    print(filename)
    f = open(filename, "r")
    g = open(filename + ".x", "w")
    for line in f:
        g.write(line)

def main(path_to_file_with_list):
    futures = []
    with ProcessPoolExecutor(max_workers=8) as executor:
        for filename in Path(path_to_file_with_list).open():
            executor.submit(translate, "filelist/" + filename)
        for future in as_completed(futures):
            future.result()

if __name__ == "__main__":
    main(sys.argv[1])
However, no new files are created: the folder does not contain a.txt.x or b.txt.x.
What is wrong with the code above, and how can I make it work?
Thanks.
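This should get you on the right path. If it doesn't work and the cause isn't an obvious bug, then I suspect your file paths are not all set up correctly.

I should point out that writing files would benefit from threads rather than processes, because threads carry less overhead. File I/O should release the GIL, so you would still get a speedup (noticeably more if you copy more than one line at a time). That said, if you are only copying files, you really should just use shutil.copy or shutil.copy2.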
from concurrent.futures import ProcessPoolExecutor, wait
from pathlib import Path
import sys

def translate(filename):
    print(filename)
    with open(filename, "r") as f, open(filename + ".x", "w") as g:
        for line in f:
            g.write(line)

def main(path_to_file_with_list):
    futures = []
    with ProcessPoolExecutor(max_workers=8) as executor:
        for filename in Path(path_to_file_with_list).open():
            # strip() drops the trailing newline that each line read from the list file carries
            futures.append(executor.submit(translate, "filelist/" + filename.strip()))
        wait(futures)  # simplify waiting on processes if you don't need the result
        for future in futures:
            if future.exception() is not None:
                # ProcessPoolExecutor swallows exceptions unless you ask the future for them
                raise future.exception()
    print("done")

if __name__ == "__main__":
    main(sys.argv[1])
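
The remark about threads and shutil.copy2 could look roughly like the sketch below. It is only an illustration under a few assumptions: the helper names copy_with_suffix and copy_all are made up here, the list file is assumed to hold one file name per line, and the files are assumed to live under a filelist directory as in the question. Reading the names with splitlines() also avoids the trailing newline that each line keeps when you iterate over an open file.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_with_suffix(source, suffix=".x"):
    """Copy source to a sibling file named source + suffix and return the new path."""
    source = Path(source)
    target = source.with_name(source.name + suffix)
    shutil.copy2(source, target)  # copies the contents and, where possible, the metadata
    return target


def copy_all(list_file, base_dir="filelist", max_workers=8):
    """Copy every file named in list_file (one name per line) found under base_dir."""
    # splitlines() removes the line endings; blank lines are skipped.
    lines = Path(list_file).read_text().splitlines()
    names = [line.strip() for line in lines if line.strip()]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map re-raises a worker's exception when its result is consumed,
        # so a failed copy is not silently ignored.
        return list(executor.map(copy_with_suffix, (Path(base_dir) / n for n in names)))
```

Calling copy_all("list.txt") would then take the place of main(sys.argv[1]). Because threads are used instead of processes, the if __name__ == "__main__" guard is no longer required for the pool to work, although it is still good practice in a script.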