How to extract data from an HDF5 file to fill a PyTables table?

Question

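I am trying to write a Discord bot in Python. The goal of the bot is to fill a table with entries from users, storing the user name, the game name and the game password (gamepswd). Later, the bot should extract that data for a specific user and delete the entries that have been resolved. I picked the first table-management tool I found on Google, which was PyTables. I can fill a table in an HDF5 file, but I cannot retrieve the entries.

It may be important to mention that I have never coded in Python before.

This is how I declare my object and create a file to store the entries.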

from tables import IsDescription, StringCol, open_file

# Row layout: three fixed-width byte-string columns
class DCParties(IsDescription):
    user_name = StringCol(32)
    game_name = StringCol(16)
    game_pswd = StringCol(16)


# mode="w" creates the file, replacing any existing one
h5file = open_file("DCloneTable.h5", mode="w", title="DClone Table")
group = h5file.create_group("/", 'DCloneEntries', "Entries for DClone runs")
table = h5file.create_table(group, 'Entries', DCParties, "Entrées")
h5file.close()

This is how I fill in an entry:

# Open the existing file in append mode and add one row
h5file = open_file("DCloneTable.h5", mode="a")
table = h5file.root.DCloneEntries.Entries

particle = table.row
particle['user_name'] = member.author
particle['game_name'] = game_name
particle['game_pswd'] = game_pswd
particle.append()

table.flush()
h5file.close()

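All of this works: with an HDF5 viewer I can see that my entries fill the table in the file. However, when I try to read the table stored in the file to extract the data, it does not work.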

h5file = open_file("DCloneTable.h5", mode="a")
table = h5file.root.DCloneEntries.Entries

particle = table.row

"""???"""

h5file.close()

I tried using particle["user_name"] (because 'user_name' on its own is not defined), and it gave me "b''" as output.

h5file = open_file("DCloneTable.h5", mode="a")
table = h5file.root.DCloneEntries.Entries

particle = table.row
print(f'{particle["user_name"]}')

h5file.close()

b''

If I do this:

h5file = open_file("DCloneTable.h5", mode="a")
table = h5file.root.DCloneEntries.Entries

particle = table.row
print(f'{particle["user_name"]} - {particle["game_name"]} - {particle["game_pswd"]}')

h5file.close()

b'' - b'' - b''

Where am I going wrong? Thanks in advance :)

Answer 1

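Here is a simple way to iterate over the table rows and print them one at a time. Your attempt prints b'' because table.row is not positioned on any stored row: it is the accessor you fill in before calling append(), so reading from it directly only returns the empty column defaults. PyTables stores StringCol values as byte strings rather than Unicode, which is why you see the 'b' prefix. To get rid of the 'b', convert back to Unicode with .decode('utf-8'). The code below uses your hard-coded field names; you could use the values in table.colnames to handle any column names. Also, I recommend using Python's file context manager (with/as:) so that the file is never left open.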

import tables as tb

with tb.open_file("DCloneTable.h5", mode="r") as h5file:
    table = h5file.root.DCloneEntries.Entries
    print(f'Table Column Names: {table.colnames}')

    # Method to iterate over rows
    for row in table:
        print(f"{row['user_name'].decode('utf-8')} - " +
              f"{row['game_name'].decode('utf-8')} - " +
              f"{row['game_pswd'].decode('utf-8')}")

    # Method to only read the first row, aka table[0]
    print(f"{table[0]['user_name'].decode('utf-8')} - " +
          f"{table[0]['game_name'].decode('utf-8')} - " +
          f"{table[0]['game_pswd'].decode('utf-8')}")
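
The table.colnames idea mentioned above could look roughly like the sketch below. It is only an illustration, not code from the original answer: decode_row and print_all_entries are hypothetical helper names, and the file and node names are assumed to match the question.

```python
def decode_row(row, colnames):
    """Map each column name to its value, decoding byte strings as UTF-8.

    `row` only needs to support row[name], so a PyTables Row or a plain
    dict both work. Hypothetical helper, not part of PyTables.
    """
    decoded = {}
    for name in colnames:
        value = row[name]
        # String columns come back as bytes; other types are left untouched.
        decoded[name] = value.decode("utf-8") if isinstance(value, bytes) else value
    return decoded


def print_all_entries(path="DCloneTable.h5"):
    """Print every stored entry without hard-coding the field names."""
    # Imported here so decode_row stays usable where PyTables is not installed.
    import tables as tb

    with tb.open_file(path, mode="r") as h5file:
        table = h5file.root.DCloneEntries.Entries
        for row in table:
            fields = decode_row(row, table.colnames)
            print(" - ".join(str(value) for value in fields.values()))
```

Calling print_all_entries() prints one line per stored entry, with the fields in the order given by table.colnames rather than the order written in the class definition.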

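If you prefer to read all the data at once, you can use the table.read() method to load it into a NumPy structured array. You still have to convert from bytes to Unicode. That makes it "slightly more complicated", so I didn't post that method.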
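
For readers who want to try the table.read() route anyway, here is a minimal sketch of one way it might be done. It is not the answer author's code: decode_columns and load_all_entries are hypothetical names, and the file layout is assumed to be the one from the question.

```python
def decode_columns(data, colnames):
    """Return {column name: list of values}, decoding byte strings as UTF-8.

    `data` only needs to support data[name] returning an iterable, so a
    NumPy structured array or a plain dict of lists both work.
    Hypothetical helper, not part of PyTables or NumPy.
    """
    return {
        name: [v.decode("utf-8") if isinstance(v, bytes) else v for v in data[name]]
        for name in colnames
    }


def load_all_entries(path="DCloneTable.h5"):
    """Read the whole table into memory and return its decoded columns."""
    # Imported here so decode_columns stays usable where PyTables is not installed.
    import tables as tb

    with tb.open_file(path, mode="r") as h5file:
        table = h5file.root.DCloneEntries.Entries
        data = table.read()  # NumPy structured array, one field per column
        return decode_columns(data, table.colnames)
```

The list comprehension keeps the sketch free of a direct NumPy dependency; NumPy's np.char.decode can do the same conversion on a whole column at once if you would rather stay with arrays.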

Solution 1
0 Accepted 2022-05-25 01:45:38
