How to save the HDF5 binary of a pandas DataFrame in memory?

Question

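I want to export the byte content of a pandas DataFrame as HDF5, ideally without actually saving a file (that is, entirely in memory).

On python>=3.6, <3.9 (with pandas==1.2.4 and pytables==3.6.1), the following used to work: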

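import pandas as pd

# df is an existing DataFrame.
with pd.HDFStore(
    "in-memory-save-file",  # name only; nothing is written to disk
    mode="w",
    driver="H5FD_CORE",  # HDF5 core driver: keep the file in memory
    driver_core_backing_store=0,  # do not write the image to disk on close
) as store:
    store.put("my_key", df, format="table")
    binary_data = store._handle.get_file_image()  # raw bytes of the HDF5 file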

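Here df is the DataFrame to convert to HDF5, and the last line calls the pytables function File.get_file_image.

However, starting with Python 3.9, the snippet above fails with the following error: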

File "tables/hdf5extension.pyx", line 523, in tables.hdf5extension.File.get_file_image
tables.exceptions.HDF5ExtError: Unable to retrieve the size of the buffer for the file image.  Plese note that not all drivers provide support for image files.

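The error is raised by the same pytables function mentioned above, apparently because of a problem while retrieving the size of the buffer for the file image. However, I do not understand its root cause.

I have tried other alternatives, such as saving to a BytesIO file object, but without success so far.

How can I save the HDF5 binary of a pandas DataFrame in memory on Python 3.9?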

Answer 1

The fix was to run conda install -c conda-forge pytables instead of pip install pytables. However, I still do not understand the root cause of the error.
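If switching to the conda-forge build is not an option, one possible workaround is to try the in-memory approach first and fall back to a short-lived temporary file when get_file_image fails. The sketch below is only an illustration under that assumption: the helper name dataframe_to_hdf5_bytes and the temporary file name frame.h5 are made up for this example, and the fallback does touch the disk briefly, so it is not a pure in-memory solution.

```python
import os
import tempfile

import pandas as pd
import tables


def dataframe_to_hdf5_bytes(df: pd.DataFrame, key: str = "my_key") -> bytes:
    """Return df serialized as HDF5 bytes (hypothetical helper, not a pandas API)."""
    try:
        # Same approach as in the question: CORE driver, no backing store.
        with pd.HDFStore(
            "in-memory-save-file",
            mode="w",
            driver="H5FD_CORE",
            driver_core_backing_store=0,
        ) as store:
            store.put(key, df, format="table")
            return store._handle.get_file_image()
    except tables.exceptions.HDF5ExtError:
        # Fallback for builds where the file image cannot be retrieved:
        # write to a temporary file, read its bytes, then let it be deleted.
        with tempfile.TemporaryDirectory() as tmp_dir:
            path = os.path.join(tmp_dir, "frame.h5")
            df.to_hdf(path, key=key, mode="w", format="table")
            with open(path, "rb") as fh:
                return fh.read()


df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})
binary_data = dataframe_to_hdf5_bytes(df)
```

Either path should yield the bytes of a complete HDF5 file, so the result can be stored or sent over the network in the same way as the original binary_data.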

Solution 1
0 Accepted 2021-05-18 12:59:44
