
Sort column of booleans in Pandas DataFrame

I am attempting to learn time series. I want to find the dates that are linked with a boolean value that is True. I then assigned the boolean to the pd.DataFrame.

I assigned the boolean statements to the column named 50+.

How do I sort the True rows from the column 50+?

I have searched the internet and have not found a solution. Since I passed 50+ from a boolean into the DataFrame, doesn't that make it a normal string that can be sorted by the sort_values function?

You need to specify the column name:

>>> import pandas as pd
>>> import numpy as np
>>> np.random.seed(123)

>>> idx = pd.date_range('2018-10-05', periods=7, freq='D')

>>> df = pd.DataFrame({'data': np.random.randn(idx.size),
...                    '50+': np.random.choice([0, 1], size=idx.size).astype(bool)},
...                   index=idx)

>>> df
                data    50+
2018-10-05 -1.085631   True
2018-10-06  0.997345   True
2018-10-07  0.282978  False
2018-10-08 -1.506295  False
2018-10-09 -0.578600  False
2018-10-10  1.651437   True
2018-10-11 -2.426679  False

>>> df.sort_values('50+')
                data    50+
2018-10-07  0.282978  False
2018-10-08 -1.506295  False
2018-10-09 -0.578600  False
2018-10-11 -2.426679  False
2018-10-05 -1.085631   True
2018-10-06  0.997345   True
2018-10-10  1.651437   True

>>> df.sort_values('50+', ascending=False)
                data    50+
2018-10-05 -1.085631   True
2018-10-06  0.997345   True
2018-10-10  1.651437   True
2018-10-07  0.282978  False
2018-10-08 -1.506295  False
2018-10-09 -0.578600  False
2018-10-11 -2.426679  False

If you're uncertain, you can always check the docstring, for example with help(pd.DataFrame.sort_values).

The default is ascending=True, which puts the False rows first, because False is just 0 under the hood (while True is 1).
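As a quick check of the points above, here is a minimal sketch on a small frame with hand-picked values (not the random data shown earlier). It confirms that the column keeps the bool dtype rather than becoming a string, and it passes kind='mergesort', a stable algorithm, so that rows with the same flag keep their original date order; the default quicksort does not guarantee that.

```python
import pandas as pd

# Small illustrative frame; the values are made up for this example.
idx = pd.date_range('2018-10-05', periods=4, freq='D')
df = pd.DataFrame({'data': [1.0, 2.0, 3.0, 4.0],
                   '50+': [True, False, True, False]},
                  index=idx)

# The column is still boolean after assignment, not a string.
col_dtype = df['50+'].dtype

# True rows first; mergesort is stable, so ties keep their date order.
sorted_df = df.sort_values('50+', ascending=False, kind='mergesort')
sorted_dates = sorted_df.index.strftime('%Y-%m-%d').tolist()

print(col_dtype)
print(sorted_dates)
```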

If you'd like to filter to rows where this column is True, you can use:

>>> df[df['50+']]
                data   50+
2018-10-05 -1.085631  True
2018-10-06  0.997345  True
2018-10-10  1.651437  True
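The same mask can also be inverted with ~ to get the False rows, or passed to .loc to pick a single column at the same time. A small sketch, again using made-up values rather than the random data above:

```python
import pandas as pd

# Small illustrative frame; the values are made up for this example.
idx = pd.date_range('2018-10-05', periods=4, freq='D')
df = pd.DataFrame({'data': [1.0, 2.0, 3.0, 4.0],
                   '50+': [True, False, True, False]},
                  index=idx)

mask = df['50+']                  # boolean Series aligned with df's index

true_rows = df[mask]              # rows where the flag is True
false_rows = df[~mask]            # rows where the flag is False
true_data = df.loc[mask, 'data']  # only the 'data' column of the True rows

print(true_rows)
print(false_rows)
print(true_data)
```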

I want to find the dates that are linked with a boolean value that is True.

You don't need to sort anything for this. You need only use Boolean indexing; in other words, construct a Boolean series or array with the same length as your dataframe index and apply it via __getitem__, which is what the [] syntax calls.

So, instead of "assigning the Boolean to pd.DataFrame", just index the index!

index_filtered = df.index[df['50+']]
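For a self-contained version of that line, here is a minimal sketch on a small frame with hand-picked values. The conversion to plain strings at the end is an optional extra for readability, not part of the original answer.

```python
import pandas as pd

# Small illustrative frame; the values are made up for this example.
idx = pd.date_range('2018-10-05', periods=4, freq='D')
df = pd.DataFrame({'data': [1.0, 2.0, 3.0, 4.0],
                   '50+': [True, False, True, False]},
                  index=idx)

# Index the DatetimeIndex directly with the boolean column.
index_filtered = df.index[df['50+']]

# Optional: turn the matching dates into plain strings.
date_strings = index_filtered.strftime('%Y-%m-%d').tolist()

print(index_filtered)
print(date_strings)
```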
