Object representation in Pandas.DataFrame
Assume I have the following class, 'MyClass'.
class MyClass:
    def __repr__(self):
        return 'Myclass()'

    def __str__(self):
        return 'Meh'

instances = [MyClass() for i in range(5)]
Some instances are created and stored in the instances variable. Now, we check its content.
>>> instances
[Myclass(), Myclass(), Myclass(), Myclass(), Myclass()]
To represent the object, Python calls the __repr__ method. However, when the same instances variable is passed to a pandas.DataFrame, the representation of the object changes and the __str__ method seems to be called.
import pandas as pd
df = pd.DataFrame(data=instances)
>>> df
0
0 Meh
1 Meh
2 Meh
3 Meh
4 Meh
Why has the object's representation changed? Can I determine which representation is used in the DataFrame?
The data is indeed stored as objects. It seems pandas just calls the __str__ method (implicitly) when it displays the DataFrame.
You can verify that by calling:
df[0].map(type)
It calls type for each element in the column and returns:
Out[572]:
0 <class '__main__.MyClass'>
1 <class '__main__.MyClass'>
2 <class '__main__.MyClass'>
3 <class '__main__.MyClass'>
4 <class '__main__.MyClass'>
Name: 0, dtype: object
# Likewise, you get the representation string of the objects with:
df[0].map(repr)
Out[578]:
0 Myclass()
1 Myclass()
2 Myclass()
3 Myclass()
4 Myclass()
Name: 0, dtype: object
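To check both claims in one place (the column still holds the original objects, while the rendered text comes from __str__), here is a minimal self-contained sketch. It redefines the class from the question and uses df.to_string() as a stand-in for what the console shows; treating the two as equivalent is an assumption, and the variable name rendered is only illustrative.

```python
import pandas as pd


class MyClass:
    def __repr__(self):
        return 'Myclass()'

    def __str__(self):
        return 'Meh'


instances = [MyClass() for _ in range(5)]
df = pd.DataFrame(data=instances)

# The column keeps the original objects; nothing was converted to text.
print(df[0].dtype)
print(all(isinstance(value, MyClass) for value in df[0]))

# Render the frame as text (assumed to match the console display).
rendered = df.to_string()
print(rendered)

# The rendered text contains the __str__ result, not the __repr__ result.
print('Meh' in rendered, 'Myclass()' in rendered)
```

If the last line prints True followed by False, the display went through __str__ even though the stored values are still MyClass instances.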
By the way, if you want to create a DataFrame with an explicitly named column that contains the data, use this instead:
df = pd.DataFrame({'my_instances': instances})
This way, you assign a column name.
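As for choosing which representation appears, one possible approach (a sketch, not necessarily the only or recommended way) is to pass a per-column formatter when rendering. The example below assumes the named column from above and uses the formatters parameter of DataFrame.to_string; the variable names text and as_repr are illustrative.

```python
import pandas as pd


class MyClass:
    def __repr__(self):
        return 'Myclass()'

    def __str__(self):
        return 'Meh'


instances = [MyClass() for _ in range(5)]
df = pd.DataFrame({'my_instances': instances})

# Option 1: keep the objects in the frame and apply repr only at render time.
text = df.to_string(formatters={'my_instances': repr})
print(text)

# Option 2: build a separate Series of repr strings (the objects are replaced
# by plain str values in this new Series, the original frame is untouched).
as_repr = df['my_instances'].map(repr)
print(as_repr)
```

Option 1 leaves the stored data unchanged and only affects the rendered text, whereas Option 2 produces strings, so it fits best when the objects themselves are no longer needed in that column.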