Why does pandas add an extra decimal point when I convert a .txt file to .hdf5?

Whenever a part of my data is exactly equal to 0.05, it sometimes turns into 0.05.1 when I go from a .txt file to a .hdf5 file. Here's the code:

h_charge = pd.read_csv('/path/to/file.txt').to_hdf('/path/to/file.hdf5', key='data')

In the images you can see that it goes from .05 in the .txt to .05.1 in the .hdf5, but earlier in the same file the .05 stays .05, and I'm having the same problem in other files converted with this code. Is this something I should just search and replace, or is there a way to fix why it's happening? Thanks!

Edit: Here's my code for loading it in Jupyter using h5py:

import h5py as h5

ch = h5.File('/path/to/file.hdf5', 'r')
c = []
for n in ch['data']['axis0']:
    c.append(n.decode())

This gives the error: "ValueError: could not convert string to float: '0.05.1'"

Start by verifying the values in the Pandas dataframe. Assuming those are correct, you have to use HDFView (from The HDF Group) if you want to "see" the data in the h5 file.
Checking the h5 file contents with h5py is complicated because Pandas' default schema is complicated. Your key (data) is a group with multiple datasets: axis0, axis1, block#_items, block#_values (where # goes from 0 to N; it is the dataframe column counter). So, to get the data you want, you need to read from ch['data']['block#_values'], where # is the appropriate column number.
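As a quick first check, a minimal sketch along these lines (the path is just the placeholder from the question) prints the dtypes and the first rows straight from the dataframe before anything is written:

import pandas as pd

df = pd.read_csv('/path/to/file.txt')   # placeholder path from the question
print(df.dtypes)   # an 'object' column here means the values were read as strings
print(df.head())   # eyeball the rows for anything that already looks like '0.05.1'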

The simple example below demonstrates the process.
Create some data with Pandas:

import pandas as pd
dates = ['2021-08-01','2021-08-02','2021-08-03','2021-08-04','2021-08-05',
         '2021-08-06','2021-08-07','2021-08-08','2021-08-09','2021-08-10']
precip = [0.0, 0.02, 0.0, 0.12, 0.0,
          0.0, 1.11, 0.0, 0.0, 0.05]
df = pd.DataFrame({'dates': dates, 'precip': precip})

df.to_hdf('file_1.h5', key='data')
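
As an optional sanity check (not part of the demonstration above, just an extra step), the same file can be read back with pandas itself to confirm the values survive the round trip:

df_check = pd.read_hdf('file_1.h5', key='data')
print(df_check['precip'].tolist())   # should print the original floats, including 0.05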

Reading data with h5py:

import h5py
with h5py.File('file_1.h5','r') as h5f:
    print(h5f['data']['axis0'][:])  # prints names
    print(h5f['data']['block0_values'][:]) # prints data for column 0

Output:

[b'dates' b'precip']
[[0.  ]
 [0.02]
 [0.  ]
 [0.12]
 [0.  ]
 [0.  ]
 [1.11]
 [0.  ]
 [0.  ]
 [0.05]]
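
If you want to see the full set of datasets Pandas creates under the data group (axis0, axis1, block#_items, block#_values), a short sketch with h5py's visititems lists them with their shapes and dtypes (using the file_1.h5 written above):

import h5py

def show(name, obj):
    # print every object's path; add shape and dtype for datasets
    if isinstance(obj, h5py.Dataset):
        print(f'{name}  shape={obj.shape}  dtype={obj.dtype}')
    else:
        print(name)

with h5py.File('file_1.h5', 'r') as h5f:
    h5f.visititems(show)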
