
Multiprocessing in Python (going from a for loop to a multiprocessing for loop)

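I have a script that works. It contains a for loop that I would like to speed up by incorporating multiprocessing.

The code without multiprocessing is as follows: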

    Symbol= Symbol[0:]   #slicing to choose which stocks to look at
    ################################for loop
    for item in Symbol:
        print item
        try:
            serious=web.DataReader([item], 'yahoo', start, end)['Adj Close']
            serious2=serious.loc[:, item].tolist()   #extract the column of 'Adj Close' 
            tickerlistori.append(item)
            valuemax = max(serious2)
            indexmax = serious2.index(max(serious2))
            valuemin = min(serious2)
            indexmin = serious2.index(min(serious2))         
            pricecurrent = serious2[-1]
            if valuemax>30 and valuemin<2 and pricecurrent<2.5:
                tickerlist.append(item)
                maxpricelist.append(valuemax)
                minpricelist.append(valuemin)
        except RemoteDataError: 
            pass
print tickerlist

The second code block below is the version "with parallel processing":

    Symbol= Symbol[0:]   #slicing to choose which stocks to look at
    ############ multi processing before the for loop
    def search1(Symbol):

        for item in Symbol:
            print item  #trying to see why the tickers are messed up
            try:
                serious=web.DataReader([item], 'yahoo', start, end)['Adj Close']
                serious2=serious.loc[:, item].tolist()   #extract the column of 'Adj Close' 
                tickerlistori.append(item)
                valuemax = max(serious2)
                indexmax = serious2.index(max(serious2))

                valuemin = min(serious2)
                indexmin = serious2.index(min(serious2))         


                pricecurrent = serious2[-1]

                if valuemax>30 and valuemin<2 and pricecurrent<2.5:
                    tickerlist.append(item)
                    maxpricelist.append(valuemax)
                    minpricelist.append(valuemin)
            except RemoteDataError: 
                pass


    pool = Pool(processes=4) 
    tickerlist = pool.map(search1, Symbol)
print tickerlist

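The first one works fine. The second one runs without errors, but the input passed via pool.map(search1, Symbol) does not look right.

Thanks in advance.

(Symbol is just a list of stock tickers.)

--------------- After making the changes suggested by tdelaney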

import matplotlib.pyplot as plt
import csv
import pandas as pd
import datetime
import pandas.io.data as web
from pandas.io.data import DataReader, SymbolWarning, RemoteDataError
from filesortfunct import filesort
from scipy import stats
from scipy.stats.stats import pearsonr
import numpy as np
import math
from multiprocessing import Pool
import warnings
warnings.filterwarnings("ignore")


#decide the two dates between which to look at stock prices
start = datetime.datetime.strptime('2/10/2015', '%m/%d/%Y')
end = datetime.datetime.strptime('2/25/2016', '%m/%d/%Y')

#intended to collect indices and min/max prices
#global tickerlist, maxpricelist, minpricelist, tickerlistori
tickerlistori=[]    #list of stocks available from google finance
tickerlist=[]      
maxpricelist = []
minpricelist =[]


datanamelist= ['NYSE.csv']#,'NASDAQ.csv','AMEX.csv']
for each in datanamelist:


    #print each   #print out which stock exchange is being looked at
    dataname= each  #csv file from which to extract stock tickers
    new = 'new'


    df = pd.read_csv(dataname, sep=',')
    df = df[['Symbol']]

    df.to_csv(new+dataname, sep=',', index=False)

    x=open(new+dataname,'rb')    #convert it into a form more manageable
    f = csv.reader(x) # csv is binary

    Symbol = zip(*f) 

    #print type(Symbol)   #list format

    Symbol=Symbol[0]   #pick out the first column

   # Symbol = Symbol[1:len(Symbol)]  #remove the first row "symbol" header
    Symbol = Symbol[3210:len(Symbol)] 


    Symbol= Symbol[0:]   #slicing to choose which stocks to look at
    #print Symbol


    def search1(item):
        print item  #trying to see why the tickers are messed up
        try:
            serious=web.DataReader([item], 'yahoo', start, end)['Adj Close']
            serious2=serious.loc[:, item].tolist()   #extract the column of 'Adj Close' 
            valuemax = max(serious2)
            indexmax = serious2.index(max(serious2))
            valuemin = min(serious2)
            indexmin = serious2.index(min(serious2))         
            pricecurrent = serious2[-1]

            if valuemax>30 and valuemin<2 and pricecurrent<2.5:
                return item, valuemax, valuemin
        except RemoteDataError: 
            pass


    pool = Pool(processes=4)  # Pool launches its workers on creation; it has no start() method
    for result in pool.map(search1, Symbol):

        if result:
            tickerlist.append(result[0])
            maxpricelist.append(result[1])
            minpricelist.append(result[2])

print tickerlist

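You have a couple of problems:

  • map enumerates Symbol and runs the worker once for each item. The worker does not need to enumerate it again in a for loop.
  • You update global lists... but those lists are global to the subprocesses. The parent never sees them.

Here is an update: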
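
Both points can be seen in a small experiment. This is only a sketch and not part of the original script: the names `collected`, `loop_inside_worker`, `append_to_global` and `run_demo` are made up for illustration, and the ticker strings are arbitrary. It should run on Python 2 or 3.

```python
from __future__ import print_function  # lets the sketch run on Python 2 and 3
from multiprocessing import Pool

collected = []  # module-level list, playing the role of tickerlist


def loop_inside_worker(symbol):
    # pool.map already hands the worker ONE element of the input, so
    # looping again walks over the characters of that single string.
    return [ch for ch in symbol]


def append_to_global(symbol):
    # This append happens in the worker process's own copy of `collected`.
    collected.append(symbol)
    return symbol


def run_demo(symbols):
    pool = Pool(processes=2)
    try:
        per_char = pool.map(loop_inside_worker, symbols)
        returned = pool.map(append_to_global, symbols)
    finally:
        pool.close()
        pool.join()
    # list(collected) is the parent's view of the global list
    return per_char, returned, list(collected)


if __name__ == '__main__':
    per_char, returned, parent_copy = run_demo(['AA', 'IBM'])
    print(per_char)     # each ticker was split into single characters
    print(returned)     # return values do travel back to the parent
    print(parent_copy)  # the parent's list is still empty
```

The worker that loops over its argument sees single characters, which would explain printed tickers looking "messed up". Whatever the worker appends to the global list stays in the child process; only the return value comes back through `pool.map`.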

Symbol= Symbol[0:]   #slicing to choose which stocks to look at
############ multi processing before the for loop
def search1(item):
    print item  #trying to see why the tickers are messed up
    try:
        serious=web.DataReader([item], 'yahoo', start, end)['Adj Close']
        serious2=serious.loc[:, item].tolist()   #extract the column of 'Adj Close' 
        valuemax = max(serious2)
        indexmax = serious2.index(max(serious2))
        valuemin = min(serious2)
        indexmin = serious2.index(min(serious2))         
        pricecurrent = serious2[-1]

        if valuemax>30 and valuemin<2 and pricecurrent<2.5:
            return item, valuemax, valuemin
    except RemoteDataError: 
        pass


pool = Pool(processes=4) 
for result in pool.map(search1, Symbol):
    if result:
        tickerlist.append(result[0])
        maxpricelist.append(result[1])
        minpricelist.append(result[2])
print tickerlist
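
For trying the same pattern without network access, here is a self-contained sketch. It rests on a few assumptions: `FAKE_PRICES` and `fetch_prices` are hypothetical stand-ins for the `web.DataReader` download, `KeyError` stands in for `RemoteDataError`, and `screen` is just a made-up wrapper name. The pool is created under an `if __name__ == '__main__':` guard, which is needed on platforms where child processes re-import the main module (Windows, for example).

```python
from __future__ import print_function  # lets the sketch run on Python 2 and 3
from multiprocessing import Pool

# Hypothetical stand-in for downloaded 'Adj Close' price lists.
FAKE_PRICES = {
    'AAA': [35.0, 12.0, 1.5, 2.0],  # meets all three conditions
    'BBB': [10.0, 9.0, 8.0],        # maximum is too low
    'CCC': [40.0, 1.0, 3.0],        # current price is too high
}


def fetch_prices(item):
    # Hypothetical replacement for web.DataReader(...); an unknown
    # ticker raises KeyError here instead of RemoteDataError.
    return FAKE_PRICES[item]


def search1(item):
    # Handles ONE ticker and returns a tuple, or None if it is rejected.
    try:
        prices = fetch_prices(item)
    except KeyError:
        return None
    valuemax, valuemin, pricecurrent = max(prices), min(prices), prices[-1]
    if valuemax > 30 and valuemin < 2 and pricecurrent < 2.5:
        return item, valuemax, valuemin
    return None


def screen(symbols, processes=4):
    pool = Pool(processes=processes)
    try:
        results = pool.map(search1, symbols)  # one result per symbol, in order
    finally:
        pool.close()
        pool.join()
    hits = [r for r in results if r]  # drop the None entries
    tickerlist = [r[0] for r in hits]
    maxpricelist = [r[1] for r in hits]
    minpricelist = [r[2] for r in hits]
    return tickerlist, maxpricelist, minpricelist


if __name__ == '__main__':
    print(screen(['AAA', 'BBB', 'CCC', 'ZZZ']))
```

The lists are built in the parent from the values the workers return, so nothing depends on state shared between processes.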
