
python multiprocessing struct.error


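I am iterating over a set of large files and using multiprocessing to manipulate and write them. I create an iterable from my dataframe and pass it to multiprocessing's map function. Processing works fine for the smaller files, but when I hit the larger files (~10 GB) I get this error: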

python struct.error: 'i' format requires -2147483648 <= number <= 2147483647

Code:

    data = np.array_split(data, 10)
    with mp.Pool(processes=5, maxtasksperchild=1) as pool1:
        pool1.map(write_in_parallel, data)
        pool1.close()
        pool1.join()

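Based on this answer, I thought the problem was that the data I pass to map is too large. So I tried first splitting the dataframe into 1.5 GB chunks and passing each chunk to map separately, but I still get the same error.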

Full traceback:

Traceback (most recent call last):
  File "_FNMA_LLP_dataprep_final.py", line 51, in <module>
    write_files()
  File "_FNMA_LLP_dataprep_final.py", line 29, in write_files
    '.txt')
  File "/DATAPREP/appl/FNMA_LLP/code/FNMA_LLP_functions.py", line 116, in write_dynamic_columns_fannie
    pool1.map(write_in_parallel, first)
  File "/opt/Python364/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/opt/Python364/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/opt/Python364/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
    put(task)
  File "/opt/Python364/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/opt/Python364/lib/python3.6/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
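
The last two frames of the traceback show where the limit comes from: in this Python 3.6 build, `connection.py` pickles each task and prefixes it with its length, packed as a signed 32-bit integer (`struct.pack("!i", n)`). Any single pickled task larger than 2**31 - 1 bytes (about 2 GiB) therefore cannot be sent to a worker. Below is a minimal sketch that reproduces just the header check; `fits_in_header` is a hypothetical helper name, not part of multiprocessing.

```python
import struct

# Largest payload length a signed 32-bit "!i" header can describe.
INT32_MAX = 2**31 - 1


def fits_in_header(n_bytes):
    """Return True if a payload of n_bytes fits the "!i" length header.

    Hypothetical helper for illustration; it only mirrors the
    struct.pack("!i", n) call visible in the traceback.
    """
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        return False


if __name__ == "__main__":
    print(fits_in_header(INT32_MAX))      # just under the limit
    print(fits_in_header(INT32_MAX + 1))  # one byte over the limit
```

A chunk that occupies 1.5 GB in memory can still exceed this limit once pickled, which may be why splitting into 1.5 GB pieces did not help.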

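The answer you mention makes another point: the data should be loaded by the child function. In your case that function is write_in_parallel. I suggest changing the child function as follows: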

import pandas as pd

def write_in_parallel(path):
    """Load the data inside the worker; assumes it is stored in a CSV file at `path`."""

    data = pd.read_csv(path)
    ...

Then your Pool code should look like this:

with mp.Pool(processes=(mp.cpu_count() - 1)) as pool:
    chunks = pool.map(write_in_parallel, ('/path/to/your/data',))
df = pd.concat(chunks)
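
To spread the work over several workers with the same idea, each task can carry only a path and a row range, so the pickled message stays tiny no matter how big the file is. The sketch below is one way to do that under some assumptions: the file is a CSV with a header row, the total row count is known in advance, and the names `row_ranges`, `process_slice` and `run_in_pool` are hypothetical. It uses the standard-library `csv` module so it runs without extra dependencies; with pandas, the worker could instead call `pd.read_csv(path, skiprows=..., nrows=...)`.

```python
import csv
import itertools
import multiprocessing as mp


def row_ranges(total_rows, rows_per_task):
    """Split [0, total_rows) into (start, stop) pairs of at most rows_per_task rows."""
    return [(start, min(start + rows_per_task, total_rows))
            for start in range(0, total_rows, rows_per_task)]


def process_slice(task):
    """Worker: open the file itself and handle only data rows [start, stop).

    Only the small (path, start, stop) tuple is pickled and sent to the
    worker, and only a small result (a row count here) is sent back.
    """
    path, start, stop = task
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row (assumes the file has one)
        rows = itertools.islice(reader, start, stop)
        # Placeholder for the real manipulation/writing step.
        return sum(1 for _ in rows)


def run_in_pool(path, total_rows, rows_per_task, processes=2):
    """Dispatch one small task per row range and collect the small results."""
    tasks = [(path, start, stop)
             for start, stop in row_ranges(total_rows, rows_per_task)]
    with mp.Pool(processes=processes) as pool:
        return pool.map(process_slice, tasks)
```

Call `run_in_pool(...)` from under an `if __name__ == "__main__":` guard. Return values are pickled on the way back too, so having each worker write its own output and return only a short status keeps both directions well under the limit. Each worker here still scans the file from the top to reach its range, which trades some repeated reading for small messages.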

Hope this helps.

