Python threading.Thread can be stopped only with the private method self._Thread__stop()

Question

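I have a function that accepts a large array of x,y pairs as input, does some elaborate curve fitting using numpy and scipy, and then returns a single value. To try to speed things up, I feed the data to two threads using Queue.Queue. Once the data is done, I want the threads to terminate, then end the calling process and return control to the shell.

I am trying to understand why I have to use a private method of threading.Thread to stop my threads and return control to the command line.

self.join() does not end the program. The only way to get control back is to use the private stop method.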

        def stop(self):
            print "STOP CALLED"
            self.finished.set()
            print "SET DONE"
            # self.join(timeout=None) does not work
            self._Thread__stop()

Here is an approximation of my code:

    import threading
    from Queue import Queue

    class CalcThread(threading.Thread):
        def __init__(self,in_queue,out_queue,function):
            threading.Thread.__init__(self)
            self.in_queue = in_queue
            self.out_queue = out_queue
            self.function = function
            self.finished = threading.Event()

        def stop(self):
            print "STOP CALLED"
            self.finished.set()
            print "SET DONE"
            self._Thread__stop()

        def run(self):
            while not self.finished.isSet():
                params_for_function = self.in_queue.get()
                try:
                    tm = self.function(**params_for_function)
                    self.in_queue.task_done()
                    self.out_queue.put(tm)
                except ValueError as v:
                    #modify params and reinsert into queue
                    window = params_for_function["window"]
                    params_for_function["window"] = window + 1
                    self.in_queue.put(params_for_function)

    def big_calculation(well_id,window,data_arrays):
            # do some analysis to calculate tm
            return tm

    if __name__ == "__main__":
        NUM_THREADS = 2
        workers = []
        in_queue = Queue()
        out_queue = Queue()

        for i in range(NUM_THREADS):
            w = CalcThread(in_queue,out_queue,big_calculation)
            w.start()
            workers.append(w)

        if options.analyze_all:
              for i in well_ids:
                  in_queue.put(dict(well_id=i,window=10,data_arrays=my_data_dict))

        in_queue.join()
        print "ALL THREADS SEEM TO BE DONE"
        # gather data and report it from out_queue
        for i in well_ids:
            p = out_queue.get()
            print p
            out_queue.task_done()
            # I had to do this to get the out_queue to proceed
            if out_queue.qsize() == 0:
                out_queue.join()
                break
        # Calling this stop method does not seem to return control to the command line unless I use threading.Thread private method

        for aworker in workers:
            aworker.stop()

Answer 1

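In general, it is a bad idea to kill threads that modify a shared resource.

CPU-intensive tasks in multiple threads are worse than useless in Python unless the GIL is released while the computation runs. Many numpy functions do release the GIL.

ThreadPoolExecutor example from the docs: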
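One cooperative alternative to killing the workers is to tell them that no more work is coming. In the question's run() loop, a worker blocked in in_queue.get() never gets to recheck self.finished, which is why setting the event and then calling join() appears to hang. The sketch below is written for Python 3. The STOP sentinel and the helper names worker and run_workers are hypothetical, not from the original code, and it assumes the worker function does not raise.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel meaning "no more work"


def worker(in_queue, out_queue, function):
    """Consume items until the STOP sentinel arrives, then return."""
    while True:
        item = in_queue.get()
        if item is STOP:
            in_queue.task_done()
            break  # leaving the loop lets the thread end on its own
        out_queue.put(function(item))  # assumption: function does not raise
        in_queue.task_done()


def run_workers(items, function, num_threads=2):
    """Run function over items in num_threads threads; return sorted results."""
    items = list(items)
    in_queue, out_queue = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(in_queue, out_queue, function))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for item in items:
        in_queue.put(item)
    for _ in threads:
        in_queue.put(STOP)  # one sentinel per worker
    for t in threads:
        t.join()  # returns because every worker consumes exactly one sentinel
    return sorted(out_queue.get() for _ in items)
```

Calling run_workers([1, 2, 3, 4], lambda x: x * x) returns [1, 4, 9, 16], and every worker thread has exited by the time it returns, so no private stop method is needed.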

    import concurrent.futures  # on Python 2.x: pip install futures

    calc_args = []
    if options.analyze_all:
        calc_args.extend(dict(well_id=i,...) for i in well_ids)

    with concurrent.futures.ThreadPoolExecutor(max_workers=NUM_THREADS) as executor:
        # Map each pending future to the keyword arguments it was submitted with.
        future_to_args = dict((executor.submit(big_calculation, **args), args)
                              for args in calc_args)

        while future_to_args:
            # Iterate over a snapshot, because the dict is modified inside the loop.
            for future in concurrent.futures.as_completed(list(future_to_args)):
                args = future_to_args.pop(future)
                if future.exception() is not None:
                    print('%r generated an exception: %s' % (args,
                                                             future.exception()))
                    if isinstance(future.exception(), ValueError):
                        # modify params and resubmit
                        args["window"] += 1
                        future_to_args[executor.submit(big_calculation, **args)] = args
                else:
                    print('%r returned %r' % (args, future.result()))

    print("ALL work SEEMs TO BE DONE")

If there is no shared state, you can replace ThreadPoolExecutor with ProcessPoolExecutor. Put the code in a main() function.
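As a rough illustration of that last suggestion, here is a minimal Python 3 sketch. The two-argument big_calculation is a hypothetical stand-in for the real curve fit, and the executor_class parameter is an assumption added only so the same code can be run with either executor.

```python
import concurrent.futures


def big_calculation(well_id, window):
    # Hypothetical stand-in for the real curve fit: a pure function of its inputs.
    return well_id * window


def main(executor_class=concurrent.futures.ProcessPoolExecutor):
    calc_args = [dict(well_id=i, window=10) for i in range(4)]
    with executor_class(max_workers=2) as executor:
        futures = [executor.submit(big_calculation, **args) for args in calc_args]
        # Collect the results in submission order.
        return [f.result() for f in futures]


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block when they
    # import the module, which happens with the "spawn" start method.
    print(main())
```

The function handed to a ProcessPoolExecutor and its arguments must be picklable, which is why big_calculation is defined at module level rather than as a lambda or a nested function.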

Answer 2

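To elaborate on my comment: if the sole purpose of your threads is to consume values from a queue and run a function on them, you are better off doing something like this, IMHO: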

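    from Queue import Queue  # Python 3: from queue import Queue
    from threading import Thread

    q = Queue()
    results = []

    def worker():
        # Runs forever: take a pair, compute, record the result, mark the item done.
        while True:
            x, y = q.get()
            results.append(x ** y)
            q.task_done()

    # workerCount and listOfXYs are supplied by the caller.
    for _ in range(workerCount):
        t = Thread(target=worker)
        t.daemon = True  # daemon threads do not keep the interpreter alive
        t.start()

    for tup in listOfXYs:
        q.put(tup)

    q.join()  # blocks until task_done() has been called for every item

    # Some more code here with the results list.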

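q.join() blocks until every item put on the queue has been retrieved and marked done, that is, until the queue is empty again. The worker threads keep trying to retrieve values but find none, so once the queue is empty they wait indefinitely. When your script later finishes executing, the worker threads die because they are marked as daemon threads.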
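The behaviour described above can be checked with a small, self-contained Python 3 variant of the same pattern. The worker count and input pairs are made-up values for illustration, and appending to a shared list from several threads relies on list.append being thread-safe in CPython.

```python
import queue
import threading

q = queue.Queue()
results = []


def worker():
    while True:
        x, y = q.get()          # blocks forever once the queue is drained
        results.append(x ** y)
        q.task_done()


# Two daemon workers; the count is an arbitrary choice for this sketch.
threads = [threading.Thread(target=worker, daemon=True) for _ in range(2)]
for t in threads:
    t.start()

for pair in [(2, 3), (3, 2), (4, 1)]:
    q.put(pair)

q.join()  # returns once task_done() has been called for all three items

print(sorted(results))                  # [4, 8, 9]
print([t.is_alive() for t in threads])  # [True, True]
```

Both workers are still alive after q.join() returns; they are simply blocked in q.get(). They end only when the main thread exits, which is what the daemon flag allows.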

Answer 3

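I tried gddc's method and it produced an interesting result. I could get his exact x ** y calculation to scale nicely across threads.

When I called my own function inside the while True loop, the calculations were spread across multiple threads only if I put a time.sleep(1) in the for loop that calls each thread's start() method.

So in my code, without the time.sleep(1), the program either exited cleanly with no output or, in some cases, printed

"Exception in thread Thread-2 (most likely raised during interpreter shutdown): Exception in thread Thread-1 (most likely raised during interpreter shutdown):"

Once I added the time.sleep(), everything ran fine.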

    for aworker in range(5):
        t = Thread(target = worker)
        t.daemon = True
        t.start()
        # This sleep was essential or results for my specific function were None
        time.sleep(1)
        print "Started"

3 answers

Answer 1: score 5, accepted, 2011-10-06 22:43:38
Answer 2: score 4, 2011-10-06 21:32:31
Answer 3: score 0, 2011-10-07 01:38:09
