Correct Approach to Threading in Python

I have a script that takes a text file as input and runs tests on its contents. I want to create two threads, split the input file into two parts, and process the parts concurrently to reduce the execution time. Is there a way to do this?

Thanks

import subprocess
import threading
import time


class myThread(threading.Thread):
    def __init__(self, ip_list):
        threading.Thread.__init__(self)
        self.input_list = ip_list

    def run(self):
        command = "python Audit.py " + ",".join(self.input_list)
        # Get lock to synchronize threads
        threadLock.acquire()
        print(command)
        p = subprocess.Popen(command, shell=True)
        # Free lock to release next thread
        threadLock.release()
        while p.poll() is None:
            print('Test Execution in Progress ....')
            time.sleep(60)

        print('Not sleeping any longer.  Exited with returncode %d' % p.returncode)


def split_list(input_list, split_count):
    for i in range(0, len(input_list), split_count):
        yield input_list[i:i + split_count]

if __name__ == '__main__':

    THREAD_COUNT = 2  # number of threads to split the input across
    threadLock = threading.Lock()
    threads = []
    input_list = []

    with open("inputList.txt", "r") as Ptr:
        for i in Ptr:
            try:
                id = str(i).rstrip('\n').rstrip('\r')
                input_list.append(id)
            except Exception as err:
                print(err)
                print("Exception occurred...")
    try:
        # Ceiling division (at least 1) so there are at most THREAD_COUNT chunks
        chunk_size = max(1, -(-len(input_list) // THREAD_COUNT))
        list_of_lists = list(split_list(input_list, chunk_size))
    except Exception as err:
        print(err)
        print("Exception caught in splitting list")

    try:
        # Create Threads & Start
        for chunk in list_of_lists:
            thread = myThread(chunk)
            threads.append(thread)
            thread.start()
            time.sleep(1)

        # Wait for all threads to complete
        for thread in threads:
            thread.join()
        print("Exiting Main Thread..!")
    except Exception as err:
        print(err)
        print("Exception caught during THREADING...")

Some notes, in no particular order:

In Python, multithreading is not a good fit for computationally intensive tasks. A better approach is multiprocessing; see "Python: what are the differences between the threading and multiprocessing modules?"

For resources that are not shared (in your case, each line is used exclusively by a single process), you do not need locks. A better approach is the map function.

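import multiprocessing
import subprocess

def processing_function(lines):
    # Audit.py receives its share of the input as one comma-separated argument
    return subprocess.call(["python", "Audit.py", ",".join(lines)])

if __name__ == '__main__':
    with open('file.txt', 'r') as f:
        lines = [line.strip() for line in f if line.strip()]

    # One half of the input per worker process
    to_process = [lines[:len(lines)//2], lines[len(lines)//2:]]
    p = multiprocessing.Pool(2)
    results = p.map(processing_function, to_process)
    p.close()
    p.join()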
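
The two-way slice above can be generalized if more than two workers are wanted. The following is a minimal sketch, assuming a hypothetical helper named `split_evenly` (not part of the original answer) that returns exactly `parts` contiguous chunks whose sizes differ by at most one:

```python
def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(len(items), parts)
    chunks = []
    start = 0
    for index in range(parts):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if index < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    lines = ["id1", "id2", "id3", "id4", "id5"]
    print(split_evenly(lines, 2))
```

The resulting list of chunks could be passed to the pool's map call in place of the hand-written pair of slices, with the pool size set to the same number of parts.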

If the processing time varies from line to line, using queues to move data between processes instead of mapping can help balance the load.
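
The load-balancing idea can be sketched with a shared work queue: each worker pulls the next line as soon as it finishes the previous one, so a few slow lines do not hold up a whole pre-assigned half. This is only a sketch under stated assumptions, not the answer's own code. It uses worker threads with `queue.Queue` rather than multiprocessing queues, on the assumption that each line's work runs in its own child process (as the Audit.py call does), so the Python-side workers mostly wait. The names `build_command`, `worker`, and `run_all` are hypothetical, and the child command is a harmless stand-in for the real Audit.py invocation.

```python
import queue
import subprocess
import sys
import threading


def build_command(line):
    # Stand-in for ["python", "Audit.py", line] (assumption): a child Python
    # process that echoes its argument and exits with status 0.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])", line]


def worker(tasks, results):
    # Keep pulling lines until the queue is empty, so a worker that gets
    # quick items simply ends up processing more of them.
    while True:
        try:
            line = tasks.get_nowait()
        except queue.Empty:
            return
        returncode = subprocess.call(build_command(line))
        results.put((line, returncode))


def run_all(lines, worker_count=2):
    tasks = queue.Queue()
    results = queue.Queue()
    # Fill the queue before starting the workers.
    for line in lines:
        tasks.put(line)
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Every line produced exactly one (line, returncode) pair.
    return [results.get() for _ in range(len(lines))]


if __name__ == "__main__":
    for line, code in run_all(["id1", "id2", "id3"]):
        print("%s exited with %d" % (line, code))
```

Results arrive in completion order, not input order, so the caller should sort or index them by line if order matters.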

You are trying to do two things at the same time, which is the definition of parallelism. The problem is that in CPython, threads cannot run Python code in parallel because of the GIL (Global Interpreter Lock). The GIL ensures that only one thread executes Python bytecode at a time, because the interpreter's internals are not thread-safe.

If you really want to run two operations in parallel, use the multiprocessing module (import multiprocessing).
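
The effect of the GIL on CPU-bound code can be observed with a small timing experiment. The sketch below is illustrative only: `busy_sum`, `timed_sequential`, and `timed_threaded` are hypothetical names, and the workload size is arbitrary. On a standard CPython build, the two-thread run is expected to take roughly as long as the sequential run, since only one thread executes bytecode at a time; exact timings depend on the machine and interpreter build.

```python
import threading
import time


def busy_sum(n):
    # Pure-Python CPU-bound loop; it needs the GIL for every step.
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_sequential(n, repeats=2):
    # Run the workload `repeats` times, one after another.
    start = time.perf_counter()
    results = [busy_sum(n) for _ in range(repeats)]
    return results, time.perf_counter() - start


def timed_threaded(n, repeats=2):
    # Run the same workload in `repeats` threads at once.
    results = [None] * repeats

    def task(index):
        results[index] = busy_sum(n)

    threads = [threading.Thread(target=task, args=(i,)) for i in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    seq_results, seq_time = timed_sequential(2_000_000)
    thr_results, thr_time = timed_threaded(2_000_000)
    print("sequential: %.2fs, two threads: %.2fs" % (seq_time, thr_time))
```

Running the same workload through a multiprocessing pool instead of threads is the comparison that would show the benefit of separate processes on a multi-core machine.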

Read this: Multiprocessing vs Threading Python
