MemoryError: std::bad_alloc: rapids.ai Dask-cuDF
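I want to load a 5.9 GB CSV file without using the Pandas library. I have 4 GPUs. I am using Rapids.ai to load this large dataset faster, but every attempt fails with the error below, even though my other GPUs still have free memory. GPU memory usage (in bytes) at the start is: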
GPU 0
total : 11554717696
free : 11126046720
used : 428670976
GPU 1
total : 11554717696
free : 11542331392
used : 12386304
GPU 2
total : 11554717696
free : 11542331392
used : 12386304
GPU 3
total : 11551440896
free : 11113070592
used : 438370304
The code is:
import cudf
import pandas as pd
import time
import subprocess as sp
import os
import dask_cudf
name = 'T100'
path = '/media/mo/2438a3d1-29fe-4c6f-aafb-f906acd5140d/AIMD/c1/trajs/'+name+'.CSV'
start = time.time()
data = dask_cudf.from_cudf(cudf.read_csv(path),
npartitions=4).compute()
done = time.time()
elapsed = done - start
print(elapsed)
The error is:
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-3-1fff5fb4e9b4> in <module>
2
3
----> 4 data = dask_cudf.from_cudf(cudf.read_csv(path),
5 npartitions=4).compute()
6 done = time.time()
~/anaconda3/envs/machineLearning/lib/python3.7/contextlib.py in inner(*args, **kwds)
72 def inner(*args, **kwds):
73 with self._recreate_cm():
---> 74 return func(*args, **kwds)
75 return inner
76
~/anaconda3/envs/machineLearning/lib/python3.7/site-packages/cudf/io/csv.py in read_csv(filepath_or_buffer, lineterminator, quotechar, quoting, doublequote, header, mangle_dupe_cols, usecols, sep, delimiter, delim_whitespace, skipinitialspace, names, dtype, skipfooter, skiprows, dayfirst, compression, thousands, decimal, true_values, false_values, nrows, byte_range, skip_blank_lines, parse_dates, comment, na_values, keep_default_na, na_filter, prefix, index_col, **kwargs)
82 na_filter=na_filter,
83 prefix=prefix,
---> 84 index_col=index_col,
85 )
86
cudf/_lib/csv.pyx in cudf._lib.csv.read_csv()
MemoryError: std::bad_alloc: CUDA error at: /conda/conda-bld/librmm_1591196551527/work/include/rmm/mr/device/cuda_memory_resource.hpp66: cudaErrorMemoryAllocation out of memory
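Answer: see the answer to the question "cuDF error processing a large number of Parquet files".
It explains how to read large files with dask_cudf: https://stackoverflow.com/a/58123478/13887495
In the code above, cudf.read_csv(path) parses the entire file on a single GPU before dask_cudf partitions anything, and .compute() then gathers the result back into a single cuDF DataFrame, so the free memory on the other GPUs is never used. Following the instructions in that answer should help you resolve MemoryError: std::bad_alloc: CUDA error at: /conda/conda-bld/librmm_1591196551527/work/include/rmm/mr/device/cuda_memory_resource.hpp66: cudaErrorMemoryAllocation out of memory
The code should be: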
data = dask_cudf.read_csv(path,
npartitions=4)
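To illustrate why reading through dask_cudf helps, here is a minimal sketch of how a large CSV can be divided into byte ranges that are parsed one at a time instead of all at once. It uses only the Python standard library. The helper name plan_byte_ranges and the 256 MiB chunk size are assumptions made for this example, not part of cuDF or dask_cudf. The (offset, length) pairs have the same shape as the byte_range parameter that appears in the cudf.read_csv signature in the traceback above. A real reader also has to align each range to row boundaries, which this sketch ignores.

```python
from typing import List, Tuple


def plan_byte_ranges(file_size: int, chunk_bytes: int) -> List[Tuple[int, int]]:
    """Split a file of `file_size` bytes into (offset, length) pairs.

    Hypothetical helper for illustration only. Every pair covers at most
    `chunk_bytes` bytes, and together the pairs cover the file exactly once.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")

    ranges = []
    offset = 0
    while offset < file_size:
        # The last chunk may be shorter than chunk_bytes.
        length = min(chunk_bytes, file_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    csv_bytes = 5_900_000_000       # the 5.9 GB CSV from the question (decimal GB assumed)
    chunk_bytes = 256 * 1024 ** 2   # 256 MiB per partition (assumed value)

    ranges = plan_byte_ranges(csv_bytes, chunk_bytes)
    print("partitions:", len(ranges))
    print("first range:", ranges[0])
    print("last range:", ranges[-1])
```

Each range is small compared with the roughly 11 GB available on every GPU in the question, so a partition can be parsed on whichever GPU has room, as long as the result is not immediately collected back onto one device with .compute().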