Python GIL - Inherent File I/O Protection?

Tomorrow I need to write a threaded program that reads all lines from a src file, does some network-bound I/O processing on them, and then writes the output to a dest file. My worry is that two (or more) threads might try to write to disk at the same time, leaving me with jumbled or corrupted data.

Is this something I even need to worry about? I've never multithreaded anything in my life before. Will the global interpreter lock (GIL) effectively protect me in this scenario? (The reason I'm doing this is that a single-threaded implementation would take 30 days.)

Many thanks

Does your program really need to do the I/O multithreaded? If you are reading from and writing to a single file, I would do that part of the program single-threaded: it will be faster and you avoid locking problems altogether. Not to mention that disk I/O is so slow that parallel processes will not really help you there. I see two possible scenarios for you:

1. Single file: do the I/O in one thread and start N processes to do the heavy computations.
2. N files: start one thread that manages a queue, which then starts N processes to do the work.
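To make the "do the I/O in one thread" idea concrete, here is a minimal sketch, assuming Python 3 and using threads for the workers because the work in the question is network-bound. The names `process_line`, `run`, and `num_workers` are hypothetical, and `process_line` is only a stand-in for the real network call. Only the writer thread ever touches the dest file, so no lock around the file is needed.

```python
import queue
import threading


def process_line(line):
    # Hypothetical stand-in for the network-bound step from the question.
    return line.strip().upper()


def run(src_path, dest_path, num_workers=4):
    tasks = queue.Queue(maxsize=1000)  # bounded so a huge src file is not loaded at once
    results = queue.Queue()
    done = object()  # sentinel that tells a thread to stop

    def worker():
        while True:
            line = tasks.get()
            if line is done:
                break
            results.put(process_line(line))

    def writer():
        # The only thread that writes to the dest file.
        with open(dest_path, "w") as dest:
            while True:
                item = results.get()
                if item is done:
                    break
                dest.write(item + "\n")

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    writer_thread = threading.Thread(target=writer)
    for t in workers:
        t.start()
    writer_thread.start()

    # The calling thread is the only reader of the src file.
    with open(src_path) as src:
        for line in src:
            tasks.put(line)

    for _ in workers:
        tasks.put(done)  # one sentinel per worker
    for t in workers:
        t.join()
    results.put(done)  # all workers finished, so the writer can stop
    writer_thread.join()
```

Note that the output lines come out in completion order, not in src order; if order matters, the results would have to carry their line number and be sorted or buffered before writing.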

See https://docs.python.org/2/library/multiprocessing.html for "pool", which might be really helpful in your case.
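As a rough illustration of the pool approach, the sketch below assumes Python 3 and uses `multiprocessing.pool.ThreadPool`, the thread-backed variant with the same interface as `multiprocessing.Pool`. That swap is my assumption, chosen because the work described in the question is network-bound; for CPU-heavy work, `multiprocessing.Pool` can be used in the same way, provided the worker function is defined at module level. `process_line` and `run_with_pool` are hypothetical names.

```python
from multiprocessing.pool import ThreadPool


def process_line(line):
    # Hypothetical stand-in for the network-bound step from the question.
    return line.strip().upper()


def run_with_pool(src_path, dest_path, num_workers=4):
    with open(src_path) as src, open(dest_path, "w") as dest:
        with ThreadPool(num_workers) as pool:
            # imap yields results in input order; only this thread writes to dest.
            for result in pool.imap(process_line, src):
                dest.write(result + "\n")
```

Because all writes happen in the calling thread, the dest file needs no locking. One caveat: the pool reads ahead from the input iterable, so for a very large src file it may be worth feeding lines in batches.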

As a word of advice: multithreading is damned hard. Many things can and will go wrong in different ways when you switch from single-threading to multi-threading. So start off with a simple single-threaded model, build your way up, and make sure that each step still does the correct thing.
