Sort pandas MultiIndex
I have created a DataFrame with a MultiIndex from another DataFrame:
import pandas as pd

# df is the existing DataFrame holding the columns used below
arrays = [df['bus_uid'], df['bus_type'], df['type'],
          df['obj_uid'], df['datetime']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['bus_uid', 'bus_type', 'type',
                                                 'obj_uid', 'datetime'])
multindexed_df = pd.DataFrame(df['val'].values, index=index)
This worked fine, as described in the documentation: http://pandas.pydata.org/pandas-docs/stable/advanced.html
The documentation also says, under "The need for sortedness with MultiIndex", that the labels need to be sorted for indexing and slicing to work correctly.
But somehow
multindexed_df.sort_index(level=0)
or
multindexed_df.sort_index(level='bus_uid')
does not work anymore and throws TypeError: sort_index() got an unexpected keyword argument 'level'.
Looking up the docstring of sort_index(), it looks as if "by" is my new friend instead of "level":
by:object
Column name(s) in frame. Accepts a column name or a list for a nested sort. A tuple will be interpreted as the levels of a multi-index.
My question is the following: how can I sort my MultiIndex so that all functionality (slicing, etc.) works correctly?
The answer depends on the pandas version you are working with. With pandas 0.17.0 or later, you can indeed use the level keyword of sort_index() to specify which level of the MultiIndex to sort on:
df = df.sort_index(level=0)
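To make this concrete, here is a minimal sketch on a small two-level index. The sample data and the reduced set of level names are invented for illustration and are not the asker's real frame. It checks sortedness through the index's is_monotonic_increasing property.

```python
import pandas as pd

# Hypothetical sample data, deliberately stored out of order.
index = pd.MultiIndex.from_tuples(
    [("b2", "load"), ("b1", "gen"), ("b2", "gen"), ("b1", "load")],
    names=["bus_uid", "type"],
)
df = pd.DataFrame({"val": [4.0, 1.0, 3.0, 2.0]}, index=index)

# Sort on the first level. With the default sort_remaining=True the
# remaining levels are sorted as well, so the result is fully sorted.
sorted_df = df.sort_index(level=0)

print(df.index.is_monotonic_increasing)         # False
print(sorted_df.index.is_monotonic_increasing)  # True
print(sorted_df["val"].tolist())                # [1.0, 2.0, 3.0, 4.0]
```

Passing the level name (level='bus_uid') instead of the position should behave the same way here.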
With an older pandas (< 0.17.0), the level keyword is not yet available, but you can use the sortlevel method instead:
df = df.sortlevel(level=0)
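If the same script has to run on both sides of that version boundary, one option is to try the new call first and fall back to the old one. The helper below is only a sketch: the name sort_by_level is made up for this example and is not part of pandas.

```python
import pandas as pd


def sort_by_level(frame, level=0):
    """Hypothetical helper: sort a MultiIndex frame on one level."""
    try:
        # pandas >= 0.17.0: sort_index() accepts the `level` keyword.
        return frame.sort_index(level=level)
    except TypeError:
        # Older pandas: sort_index() rejects `level`, so use sortlevel().
        # As far as I know sortlevel() was removed again in later pandas
        # releases, so this branch only matters on old installations.
        return frame.sortlevel(level=level)


# Invented two-row frame to show the call.
example = pd.DataFrame(
    {"val": [2, 1]},
    index=pd.MultiIndex.from_tuples(
        [("b", "x"), ("a", "y")], names=["k1", "k2"]
    ),
)
result = sort_by_level(example, level="k1")
print(result["val"].tolist())  # [1, 2]
```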
Note that if you want to sort on all levels, you don't need the level keyword at all and can just do:
df = df.sort_index()
This works in both recent and older versions of pandas.
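Once the index is fully sorted, label-based slicing on the MultiIndex works as the question asks. The sketch below again uses invented sample data with only two levels; pd.IndexSlice is just a convenience for writing the per-level slices.

```python
import pandas as pd

# Hypothetical sample data, deliberately stored out of order.
index = pd.MultiIndex.from_tuples(
    [("b2", "load"), ("b1", "gen"), ("b2", "gen"), ("b1", "load")],
    names=["bus_uid", "type"],
)
df = pd.DataFrame({"val": [4.0, 1.0, 3.0, 2.0]}, index=index)

# Sort on all levels at once.
df = df.sort_index()

# All rows for a single label of the first level.
one_bus = df.loc["b1"]

# A range on the first level combined with one label on the second level.
idx = pd.IndexSlice
gen_rows = df.loc[idx["b1":"b2", "gen"], :]

print(one_bus["val"].tolist())   # [1.0, 2.0]
print(gen_rows["val"].tolist())  # [1.0, 3.0]
```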
For a summary of these changes in the sorting API, see http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#changes-to-sorting-api