Pandas DataFrame constructor sorts rows, even with OrderedDict as input
I create an OrderedDict:
from collections import OrderedDict
od = OrderedDict([((2, 9), 0.5218),
((2, 0), 0.3647),
((3, 15), 0.3640),
((3, 8), 0.3323),
((2, 28), 0.3310),
((2, 15), 0.3281),
((2, 10), 0.2938),
((3, 9), 0.2719)])
Then I feed that into the pandas DataFrame constructor:
import pandas as pd
df = pd.DataFrame({'values': od})
The result is this:
Instead, it should give this:
What is going on here that I don't understand?
PS: I am not looking for an alternative way of solving the problem (though you are welcome to post one if you think it would help the community). All I want is to understand why this doesn't work. Is it a bug, or is there some logic to it? This is also not a duplicate of this link, because I am specifically using an OrderedDict and not a normal dict.
If you want to get the DataFrame in the same order as your dictionary, you can do this:
# list(...) materializes the dict views so this also works on Python 3
df = pd.DataFrame(list(od.values()), index=list(od.keys()), columns=['values'])
Output:
      values
2 9   0.5218
  0   0.3647
3 15  0.3640
  8   0.3323
2 28  0.3310
  15  0.3281
  10  0.2938
3 9   0.2719
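A minimal sketch of the same idea with the index built explicitly, so that the row order does not depend on how the constructor handles nested dictionaries. Using `pd.MultiIndex.from_tuples` here is my own choice for illustration and is not part of the answer above.

```python
from collections import OrderedDict

import pandas as pd

od = OrderedDict([((2, 9), 0.5218),
                  ((2, 0), 0.3647),
                  ((3, 15), 0.3640),
                  ((3, 8), 0.3323),
                  ((2, 28), 0.3310),
                  ((2, 15), 0.3281),
                  ((2, 10), 0.2938),
                  ((3, 9), 0.2719)])

# Build the two-level index directly from the keys, in insertion order.
index = pd.MultiIndex.from_tuples(list(od.keys()))

# Pass the values as a plain list so no key-based alignment takes place.
df = pd.DataFrame({'values': list(od.values())}, index=index)

print(df)
```

Because the values are passed as a list and the index is given explicitly, pandas has nothing to align or sort, so the rows come out in the order of the OrderedDict.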
The only mention of OrderedDict in the frame source code is in an example for df.to_dict(), so it is not useful here.
It seems that even though you are passing an ordered structure, it is parsed and re-ordered by default once you wrap it in a plain dictionary, {'values': od}, and pandas takes its index from the OrderedDict.
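To see whether your own installation re-orders the rows, a small check like the following may help. It is only a sketch: the variable name `order_preserved` is made up for illustration, and the result may well differ between pandas versions, so no particular outcome is assumed here.

```python
from collections import OrderedDict

import pandas as pd

od = OrderedDict([((2, 9), 0.5218),
                  ((2, 0), 0.3647),
                  ((3, 15), 0.3640),
                  ((3, 8), 0.3323)])

# Same construction as in the question: the OrderedDict is wrapped in a plain dict.
df = pd.DataFrame({'values': od})

# Compare the resulting row order with the insertion order of the OrderedDict.
order_preserved = df.index.tolist() == list(od.keys())

print("pandas version:", pd.__version__)
print("insertion order preserved:", order_preserved)
```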
This behavior seems to be overridden if you build your dictionary with the column labels as well (à la JSON).
od = OrderedDict([
((2, 9), {'values':0.5218}),
((2, 0), {'values':0.3647}),
((3, 15), {'values':0.3640}),
((3, 8), {'values':0.3323}),
((2, 28), {'values':0.3310}),
((2, 15), {'values':0.3281}),
((2, 10), {'values':0.2938}),
((3, 9), {'values':0.2719})
])
df = pd.DataFrame(od).T
print(df)
      values
2 9   0.5218
  0   0.3647
3 15  0.3640
  8   0.3323
2 28  0.3310
  15  0.3281
  10  0.2938
3 9   0.2719