Python Multiprocessing within Jupyter Notebook does not work
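I am new to the multiprocessing module in Python and I work in Jupyter notebooks. When I try to run the code below, I keep getting AttributeError: Can't get attribute 'load' on <module '__main__' (built-in)>
When I run the file there is no output; it just keeps loading.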
import pandas as pd
import datetime
import urllib
import requests
from pprint import pprint
import time
from io import StringIO
from multiprocessing import Process, Pool
symbols = ['AAP']
start = time.time()
dflist = []
def load(date):
    if date is None:
        return
    url = "http://regsho.finra.org/FNYXshvol{}.txt".format(date)
    try:
        df = pd.read_csv(url, delimiter='|')
        if any(df['Symbol'].isin(symbols)):
            stocks = df[df['Symbol'].isin(symbols)]
            print(stocks.to_string(index=False, header=False))
            # Save stocks to mysql
        else:
            print(f'No stock found for {date}')
    except urllib.error.HTTPError:
        pass
pool = []
numdays = 365
start_date = datetime.datetime(2019, 1, 15 ) #year - month - day
datelist = [
    (start_date - datetime.timedelta(days=x)).strftime('%Y%m%d') for x in range(0, numdays)
]
pool = Pool(processes=16)
pool.map(load, datelist)
pool.close()
pool.join()
print(time.time() - start)
How can I run this code directly from the notebook without running into this problem?
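Some background that may explain the error: multiprocessing hands the function to its worker processes by pickling it, and pickle stores a function by reference (module name plus function name), not by value. Each worker then has to import that module and look the name up. A function defined in a notebook cell lives in __main__, and when workers are started as fresh interpreters (the "spawn" start method, the default on Windows and macOS), their __main__ does not contain the notebook's cells, which would be consistent with the AttributeError above. The sketch below only illustrates the pickling-by-reference part, using json.dumps as a stand-in for any importable function.

```python
import json
import pickle

# Pickle a function that lives in an importable module.
payload = pickle.dumps(json.dumps)

# The payload carries the module and function names, not the function body.
print(b"json" in payload, b"dumps" in payload)

# Unpickling imports the module and looks the name up again,
# so the very same function object comes back.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

A worker process can repeat that lookup for anything defined in a regular module on its import path, which is why moving the function into its own file tends to help.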
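One approach:
1. Move the load function into a separate file, for example worker.py, together with the imports and the symbols list it uses.
2. In the notebook, import worker and refer to the function as worker.load.
3. Run the pool from the notebook: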
from multiprocessing import Pool
import datetime
import worker
if __name__ == '__main__':
    numdays = 365
    start_date = datetime.datetime(2019, 1, 15)  # year, month, day
    datelist = [
        (start_date - datetime.timedelta(days=x)).strftime('%Y%m%d')
        for x in range(0, numdays)
    ]
    pool = Pool(processes=16)
    pool.map(worker.load, datelist)
    pool.close()
    pool.join()
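For completeness, here is a rough sketch of what worker.py might contain. The answer does not show this file, so the layout is an assumption: build_url and make_datelist are hypothetical helpers added for illustration, and load returns the matching rows as text instead of printing them, which differs from the question's code.

```python
# worker.py: hypothetical module saved in the same folder as the notebook
import datetime
import urllib.error

# The workers cannot see variables defined in notebook cells,
# so the symbol list has to live in this module as well.
symbols = ['AAP']


def build_url(date):
    """Return the download URL for a date string such as '20190115'."""
    return "http://regsho.finra.org/FNYXshvol{}.txt".format(date)


def make_datelist(start_date, numdays):
    """Return numdays date strings, counting backwards from start_date."""
    return [
        (start_date - datetime.timedelta(days=x)).strftime('%Y%m%d')
        for x in range(numdays)
    ]


def load(date):
    """Return the rows for the chosen symbols as text, or None."""
    if date is None:
        return None
    # Imported here so that importing worker.py itself stays cheap;
    # a top-level import would work just as well.
    import pandas as pd
    try:
        df = pd.read_csv(build_url(date), delimiter='|')
    except urllib.error.HTTPError:
        return None  # the file for that date could not be fetched
    stocks = df[df['Symbol'].isin(symbols)]
    if stocks.empty:
        return None
    return stocks.to_string(index=False, header=False)
```

With this variant the notebook call becomes results = pool.map(worker.load, datelist), and the entries of results that are not None can be printed in the notebook itself. Returning values is a design choice made here because text printed inside worker processes may not show up in the notebook output on every platform.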