Multiprocessing/Threading - Overrated, or am I missing something?

I have a program that I am in the process of making multithreaded and multiprocessed. But in my initial tests, I find that old-fashioned serial processing beats both multiprocessing (using more CPU cores) and threading (using more threads per CPU core, but only one core at a time) hands down!

To illustrate this I have made a simple code snippet. In the main section of the script I have marked the 3 types of processing, so you can easily pick the one you want and comment out the other 2 options (comment out the full sections to disable them):

My script iterates through a list of 2 strategies, and for each strategy it iterates through a list of 193 tickers (stock_list).

In the main section you may select which type of processing you want to test/employ:

  1. Normal serial processing, with only one CPU core and one thread.
  2. Multiprocessing, using all the available CPU cores in the system.
  3. Threading, using a queue and 40 threads to process the list.

I do not do anything fancy with them in this simple test script; each iteration only sleeps for 0.01 s, so I can get a feel for which approach is fastest.

Blimey, but it seems that processing the list the old-fashioned serial way is slightly faster than either of the other approaches...! My test results show these run times:

  1. Serial: 3.86 s
  2. Multiprocessing (cores): 4.03 s
  3. Multithreading (threads): 3.90 s

I must be missing the point, or I must have made a mistake in the code below. Could someone with multiprocessing experience please shed some light on this conundrum?

How do I speed up processing the stock_list through the strategies and make this example code run as fast as it possibly can?

import time
import threading
from threading import Thread, Lock
from queue import Queue
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool

start = time.time()     # Start script timer

stock_list = ['aan', 'anf', 'ancx', 'ace', 'atvi', 'aet', 'agco', 'atsg', 'awh', 'all', 'afam', 'alj', 'dox', 'acas', 'afg', 'arii', 'asi', 'crmt', 'amkr', 'nly', 'anh', 'acgl', 'arw', 'aiz', 'atw', 'avt', 'axll', 'axs', 'blx', 'bkyf', 'bmrc', 'bku', 'banr', 'b', 'bbt', 'bbcn', 'bhlb', 'bokf', 'cjes', 'caci', 'cap', 'cof', 'cmo', 'cfnl', 'cacb', 'csh', 'cbz', 'cnbc', 'cpf', 'cvx', 'cb', 'cnh', 'cmco', 'cnob', 'cop', 'cpss', 'glw', 'crox', 'do', 'dds', 'dcom', 'dyn', 'ewbc', 'eihi', 'ebix', 'exxi', 'efsc', 'ever', 're', 'ezpw', 'ffg', 'fisi', 'fdef', 'fibk', 'nbcb', 'banc', 'frc', 'frf', 'fcx', 'gm', 'gco', 'gsol', 'gs', 'glre', 'hbhc', 'hafc', 'hdng', 'hcc', 'htlf', 'hele', 'heop', 'hes', 'hmn', 'hum', 'im', 'irdm', 'joy', 'jpm', 'kalu', 'kcap', 'kmpr', 'kss', 'lbai', 'lf', 'linta', 'lmca', 'lcut', 'lnc', 'lmia', 'mtb', 'mgln', 'mant', 'mpc', 'mcgc', 'mdc', 'taxi', 'mcc', 'mw', 'mofg', 'mrh', 'mur', 'mvc', 'myrg', 'nov', 'nci', 'navg', 'nni', 'nmfc', 'nnbr', 'nwpx', 'oln', 'ovti', 'olp', 'pccc', 'pre', 'pmc', 'psx', 'phmd', 'pjc', 'ptp', 'pnc', 'bpop', 'pfbc', 'pri', 'pl', 'rf', 'rnr', 'regi', 'rcii', 'rjet', 'rbcaa', 'sybt', 'saft', 'sasr', 'sanm', 'sem', 'skh', 'skyw', 'sfg', 'stt', 'sti', 'spn', 'sya', 'tayc', 'tecd', 'tsys', 'ticc', 'twi', 'titn', 'tol', 'tmk', 'twgp', 'trv', 'tcbk', 'trn', 'trmk', 'tpc', 'ucbi', 'unm', 'urs', 'usb', 'vlo', 'vr', 'voxx', 'vsec', 'wd', 'wres', 'wbco', 'wlp', 'wfc', 'wibc', 'xrx', 'xl']

def do_multiproces_work(ticker):
    # 'strategy' is a global, set in the strategies loop of the main section.
    print(threading.current_thread().name, strategy, ticker)
    time.sleep(0.01)  # pretend to do some lengthy work


#==============================================================================
# Threading

# lock to serialize console output
lock = Lock()

def do_work(item):
    try:
        with lock:  # this is where the work is done; the lock serializes it
            print(threading.current_thread().name, strategy, item)
            time.sleep(0.01)  # pretend to do some lengthy work
    except Exception as e:
        print(str(e))

# The worker thread pulls an item from the queue and processes it
def worker():
    try:
        while True:
            item = q.get()
            do_work(item)
            q.task_done()
    except Exception as e:
        print(str(e))

#==============================================================================

if __name__ == '__main__':

    strategies = ['strategy0', 'strategy1']
    #==============================================================================
    # Strategies iteration
    #==============================================================================
    try:
        for strategy in strategies:
            ##=========================================================================
            ## Tickers iteration
            ##=========================================================================
            # 1. Normal serial processing
            for ticker in stock_list:
                do_multiproces_work(ticker)

            #==============================================================================
#            # 2. Pure multiprocessing (multiple processes, one thread each)
#
#            # Make the pool of worker processes; Pool() defaults to the
#            # number of CPU cores in the machine.
#            pool = Pool()
#
#            # Do the work. Note that the 'strategy' global is only inherited
#            # by the workers where processes are forked; on Windows it would
#            # have to be passed to them explicitly.
#            pool.map(do_multiproces_work, stock_list)
#
#            # Close the pool and wait for the work to finish
#            pool.close()
#            pool.join()

            #==============================================================================

#            # 3. Threading (multiple threads, within only one process)
#            # Create the queue and thread pool.
#            global q
#            q = Queue()
#            for i in range(40):  # number of parallel worker threads; 40 works well here
#                t = threading.Thread(target=worker)  # runs the worker function above
#                t.daemon = True  # thread dies when main thread (only non-daemon thread) exits
#                t.start()
#
#            # Put the work items on the queue (in this case, just the tickers).
#            for item in stock_list:  # the input list to analyse
#                q.put(item)
#
#            q.join()       # block until all tasks are done
            #==============================================================================
    except Exception as e:
        print(str(e))

    # Stop the script timer and print the result
    seconds = time.time() - start
    print('Script finished in %.2f seconds' % seconds)

By using the lock in the work function, you make the code more or less serial instead of multithreaded. Is it needed?

If analysing the tickers with one strategy does not affect the others, then there is no need to use a lock.

A lock should only be used when threads access a shared resource, for example a file or a printer.
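
For illustration, here is a minimal lock-free sketch of do_work (assuming, as in the question, that the work is just a sleep). Without the lock, the 40 worker threads overlap their sleeps instead of taking turns, and only the readability of the interleaved console output suffers:

def do_work(item):
    try:
        # No lock: the threads now sleep concurrently. Console output may
        # interleave, which hurts readability, not correctness.
        print(threading.current_thread().name, strategy, item)
        time.sleep(0.01)  # pretend to do some lengthy work
    except Exception as e:
        print(str(e))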

You didn't specify, so I'm going to assume you are using the C implementation of Python 2.7 (CPython).

In this case, Python threads do not run in parallel: something called the Global Interpreter Lock (GIL) prevents it. Hence, no speedup for you, since you now do the same waiting as in the single-threaded case, plus a lot of task switching and waiting for threads to finish.
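
As a minimal illustration of the GIL's effect (a made-up CPU-bound function, not the asker's code), threads give no speedup when the work is pure Python computation:

import time
from multiprocessing.dummy import Pool as ThreadPool

def burn(n):
    # Pure-Python CPU work: the thread holds the GIL while computing.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == '__main__':
    work = [500000] * 8

    t0 = time.time()
    for n in work:
        burn(n)
    print('serial:   %.2fs' % (time.time() - t0))

    t0 = time.time()
    ThreadPool(8).map(burn, work)  # about the same, or slower, under the GIL
    print('threaded: %.2fs' % (time.time() - t0))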

I'm not sure what the deal with the multiprocessing variant is, but I suspect it is overhead. Your example is rather contrived, not only in that you wait instead of doing work, but also because you don't wait for very long. One typically wouldn't use multiprocessing for something that only takes four seconds in total...
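
To get a rough feel for that fixed cost (a hypothetical micro-benchmark, not the asker's code), you can time the pool machinery itself against the roughly four seconds of total sleeping:

import time
from multiprocessing import Pool

def tiny_task(x):
    time.sleep(0.01)  # the same pretend work as in the question
    return x

if __name__ == '__main__':
    t0 = time.time()
    pool = Pool()  # starting the worker processes alone has a fixed cost
    pool.map(tiny_task, range(386))  # 2 strategies x 193 tickers
    pool.close()
    pool.join()
    print('pool total: %.2fs' % (time.time() - t0))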

As a footnote, I don't understand what your claim that threading uses only one core is based on. As far as I know, it's not a Python constraint.
