Adding my own description attribute to a Pandas DataFrame
I am retrieving some web data, parsing it, and storing the output as a Pandas DataFrame into an HDF5 file. Right before I write the DataFrame into the H5 file, I add my own description string to annotate some metadata about where the data came from and whether anything went wrong while parsing it.
In [1]: my_data_frame.desc = "Some string about the data"
In [2]: my_data_frame.desc
Out[2]: "Some string about the data"
In [3]: print(type(my_data_frame))
<class 'pandas.core.frame.DataFrame'>
However, after loading the same data with pandas.io.pytables.HDFStore(), my added desc attribute is missing and I get the error: AttributeError: 'DataFrame' object has no attribute 'desc', as if I had never added this new attribute.
How can I get my metadata descriptions to persist as an extra attribute of the DataFrame object? (Or is there some existing, recognized attribute of a DataFrame that I can hijack for my metadata purposes?)
Adding DataFrame metadata or per-column metadata is on the roadmap but hasn't been implemented yet. I'm open to ideas about what the API should look like, though.
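In the meantime, one possible workaround is to attach the description to the HDF5 node that holds the DataFrame rather than to the DataFrame object itself. The sketch below assumes PyTables is installed and uses HDFStore.get_storer(key).attrs for this. The helper names save_with_desc and load_with_desc, the key "my_data", and the file name are made up for illustration and are not part of pandas.

```python
import os
import tempfile

import pandas as pd


def save_with_desc(path, key, df, desc):
    """Write df to an HDF5 file and attach desc as an attribute of its node."""
    with pd.HDFStore(path) as store:
        store.put(key, df)
        # 'desc' is an attribute name chosen for this example. It is stored
        # on the HDF5 node, not on the DataFrame object.
        store.get_storer(key).attrs.desc = desc


def load_with_desc(path, key):
    """Read the DataFrame and its description back from the HDF5 file."""
    with pd.HDFStore(path, mode="r") as store:
        df = store[key]
        desc = store.get_storer(key).attrs.desc
    return df, desc


# Usage: round-trip through a temporary file (hypothetical data and names).
tmp_dir = tempfile.mkdtemp()
h5_path = os.path.join(tmp_dir, "example.h5")
my_data_frame = pd.DataFrame({"a": [1, 2, 3]})

save_with_desc(h5_path, "my_data", my_data_frame, "Some string about the data")
loaded_frame, loaded_desc = load_with_desc(h5_path, "my_data")
print(loaded_desc)
```

The description travels with the file instead of the in-memory object, so it has to be read back explicitly alongside the DataFrame, as load_with_desc does here.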