
Memory Error Python 64bits

I am getting a "Memory Error" in Python when trying to sort a Pandas dataframe and then save it to disk.

import pandas as pd

df = pd.read_hdf('big_df_file.h5')
df.sort_values(by='opt', inplace=True, kind='quicksort')
df.to_hdf('sorted.h5', key='df')  # to_hdf requires a key; 'df' is a placeholder name

My computer has 16 GB of RAM and the data file is 8 GB. Shouldn't I be able to do this without getting a "Memory Error"?

PS: I am using quicksort because it is the sorting algorithm that allocates the least memory.

Versions:
python: 2.7.11.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_GB

pandas: 0.17.1

We need more information: at what stage is the MemoryError raised? What type of data is loaded? How much of it is really necessary to perform the sort?
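One way to answer the "what type of data" question is to compare the shallow and deep memory footprint of each column. The snippet below is only a small sketch on a made-up three-row frame (the `label` column is hypothetical, not from the question); the point is that object columns holding Python strings can occupy far more RAM than their size on disk suggests.

```python
import pandas as pd

# Tiny stand-in frame; 'label' is a hypothetical string column.
df = pd.DataFrame({
    "opt": pd.Series([3, 1, 2], dtype="int64"),
    "label": pd.Series(["alpha", "beta", "gamma"], dtype=object),
})

# Shallow usage counts only the array buffers (8 bytes per value here).
shallow = df.memory_usage(index=False, deep=False)
# Deep usage also counts the Python objects that object columns point to.
deep = df.memory_usage(index=False, deep=True)

print(shallow)
print(deep)
```

Running the same two calls on a slice of the real data should show which columns dominate memory.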

However, I will try to address the issue.

If the error is raised during read_hdf, I would suggest limiting the number of columns you load from the file: for example, load only the index column (or infer it by row enumeration) and the value column, then perform the sort. After that you can (perhaps) incrementally write the new data to a file.
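A rough sketch of that idea, under some assumptions: the store was written with format='table' (column selection is not available for the default 'fixed' format), and the helper names `load_sort_key` and `sorted_row_batches` are made up for illustration. Only the sort key is loaded, the row order is computed from it, and the order is handed out in batches so the full rows can be fetched and written piece by piece.

```python
import numpy as np
import pandas as pd


def load_sort_key(path, key, column):
    """Read a single column from an HDF5 store.

    Assumes the store uses format='table'; this helper is not called
    in the demo below because it needs the real file.
    """
    return pd.read_hdf(path, key, columns=[column])[column]


def sorted_row_batches(keys, batch_size):
    """Yield arrays of row positions in sorted-key order, one batch at a time."""
    order = np.argsort(np.asarray(keys), kind="quicksort")
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


# Small in-memory stand-in for the 'opt' column of the real file.
opt = pd.Series([30, 10, 20, 50, 40])
batches = [batch.tolist() for batch in sorted_row_batches(opt, batch_size=2)]
print(batches)
```

Each batch of positions could then be used to pull the matching full rows from the source file and append them to the output. Reading rows in scattered order is likely to be slow, so this trades speed for memory.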

An even more "hardcore" approach would be divide and conquer: load only half of the file (or a quarter; you decide what works best; according to the docs this is possible by passing the start and stop arguments), sort each part in memory, and then combine the sorted parts with a merge step, as in merge sort or an external sort.
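The divide-and-conquer idea can be sketched with the standard library alone (Python 3 assumed). This is not a tuned implementation: rows are plain `(sort_key, payload)` tuples, the function names are invented for the example, and in the real case each chunk would come from something like `pd.read_hdf(path, key, start=..., stop=...)` rather than a list.

```python
import heapq
import os
import pickle
import tempfile


def write_sorted_run(rows, directory, run_id):
    """Sort one chunk in memory and spill it to disk as a sorted run."""
    path = os.path.join(directory, "run_%d.pkl" % run_id)
    with open(path, "wb") as handle:
        for row in sorted(rows):
            pickle.dump(row, handle)
    return path


def read_run(path):
    """Stream rows back from a sorted run, one at a time."""
    with open(path, "rb") as handle:
        while True:
            try:
                yield pickle.load(handle)
            except EOFError:
                return


def external_sort(chunks, directory):
    """Return an iterator over all rows of `chunks` in sorted order.

    Phase 1 holds one chunk in memory at a time; phase 2 (the merge)
    holds roughly one row per run.
    """
    paths = [write_sorted_run(chunk, directory, i)
             for i, chunk in enumerate(chunks)]
    return heapq.merge(*[read_run(path) for path in paths])


# Toy data: three unsorted chunks of (sort_key, payload) rows.
chunks = [[(5, "e"), (1, "a")], [(4, "d"), (2, "b")], [(3, "c")]]
with tempfile.TemporaryDirectory() as tmp:
    result = list(external_sort(chunks, tmp))
print(result)
```

In practice the merged rows would be written out incrementally instead of collected into a list, otherwise the memory saving is lost at the last step.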

More information on the general problem is provided by this answer.
