
Adding my own description attribute to a Pandas DataFrame

I am retrieving some web data, parsing it, and storing the output as a Pandas DataFrame in an HDF5 file. Right before I write the DataFrame to the H5 file, I add my own description string to record some metadata about where the data came from and whether anything went wrong while parsing it.

In [1]: my_data_frame.desc = "Some string about the data"

In [2]: my_data_frame.desc
Out[2]: 'Some string about the data'

In [3]: print(type(my_data_frame))
<class 'pandas.core.frame.DataFrame'>

However, after loading the same data with pandas.io.pytables.HDFStore(), my added desc attribute is missing and I get the error AttributeError: 'DataFrame' object has no attribute 'desc', as if I had never added the attribute.
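The same loss can be reproduced without HDF5 at all, which suggests the attribute lives only on the one Python object it was set on. This is a minimal sketch, not a statement about HDFStore internals: the `price` column and the name `rebuilt` are made up for illustration, and `copy()` stands in for any operation that builds a new DataFrame from the same data.

```python
import pandas as pd

my_data_frame = pd.DataFrame({"price": [1, 2, 3]})
my_data_frame.desc = "Some string about the data"

# copy() builds a new DataFrame object from the same data; the ad hoc
# attribute was set on the original instance only and is not carried over.
rebuilt = my_data_frame.copy()

print(hasattr(my_data_frame, "desc"))
print(hasattr(rebuilt, "desc"))
```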

How can I get my metadata descriptions to persist as an extra attribute of the DataFrame object? (Or is there some existing, recognized attribute of a DataFrame that I can hijack for my metadata purposes?)

Adding DataFrame metadata or per-column metadata is on the roadmap but hasn't been implemented yet. I'm open to ideas about what the API should look like, though.
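In the meantime, one possible workaround is to keep the description in the HDF5 file itself rather than on the DataFrame, using the attribute set of the stored node. The sketch below assumes PyTables is installed (HDFStore needs it anyway); the helper names `save_with_desc` and `load_with_desc` and the attribute name `desc` are hypothetical, chosen to match the question.

```python
import pandas as pd


def save_with_desc(path, key, frame, desc):
    """Write frame under key and attach desc to that node's HDF5 attributes."""
    with pd.HDFStore(path) as store:
        store.put(key, frame)
        # The storer's attrs object is the PyTables attribute set of the
        # node, so values assigned here are written into the file.
        store.get_storer(key).attrs.desc = desc


def load_with_desc(path, key):
    """Read the frame and its description back from the HDF5 file."""
    with pd.HDFStore(path, mode="r") as store:
        frame = store[key]
        desc = store.get_storer(key).attrs.desc
    return frame, desc
```

With these helpers, `save_with_desc("data.h5", "my_data_frame", my_data_frame, "Some string about the data")` followed by `load_with_desc("data.h5", "my_data_frame")` should return both the frame and the string. The description travels with the file, not with the DataFrame object, so it has to be read back explicitly after loading.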
