pandas: strictly increasing column by group

Question

import pandas as pd

dates = ['2020-12-01', '2020-12-03', '2020-12-04', '2020-12-01', '2020-12-03', '2020-12-04']
symbols = ['ABC', 'ABC', 'ABC', 'DEF', 'DEF', 'DEF']
v = [1, 3, 5, 7, 9, 8]

df = pd.DataFrame({'date': dates, 'g': symbols, 'v': v})

         date    g  v
0  2020-12-01  ABC  1
1  2020-12-03  ABC  3
2  2020-12-04  ABC  5
3  2020-12-01  DEF  7
4  2020-12-03  DEF  9
5  2020-12-04  DEF  8

I want to create a new DataFrame, grouped by 'g', that tells me whether 'v' is strictly increasing within each group.
For example, the output should be:
     g  increasing
0  ABC           1
1  DEF           0
since ABC is always increasing whereas DEF is not.

I thought I could use diff() and then select the names that have negative values (these are the names I can exclude from the list). But I lose the grouping column when I use this function:

df.groupby(by='g')['v'].diff()
0    NaN
1    2.0
2    2.0
3    NaN
4    2.0
5   -1.0

What is the best way to do this?

The following looks good but is NOT what I want, since it returns True even if the value stays the same:

>>> df.groupby(by='g')['v'].is_monotonic_increasing.reset_index()
     g      v
0  ABC   True
1  DEF  False
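To make the difference concrete, here is a minimal sketch using a made-up group "XYZ" (not part of the original data) whose values contain a tie. It compares the plain monotonic check with a stricter one that also requires unique values; the variable names are only illustrative.

```python
import pandas as pd

# Hypothetical data: the value 1 repeats, so "XYZ" is non-decreasing
# but not strictly increasing.
tie_df = pd.DataFrame({"g": ["XYZ", "XYZ", "XYZ"], "v": [1, 1, 2]})

# is_monotonic_increasing accepts equal neighbours.
loose = tie_df.groupby("g")["v"].apply(lambda s: s.is_monotonic_increasing)

# Also requiring unique values rejects the tie.
strict = tie_df.groupby("g")["v"].apply(
    lambda s: s.is_monotonic_increasing and s.is_unique
)

print(loose)
print(strict)
```

Here the plain check reports True for "XYZ", while the stricter check reports False.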

Answer 1

You just need to check that the values are monotonically increasing and that all the elements are unique. For this you could use pandas is_monotonic_increasing and unique:

res = df.groupby('g', as_index=False)['v'].apply(lambda x: len(x) == len(x.unique()) and x.is_monotonic_increasing)
print(res)

Output

g
ABC     True
DEF    False
Name: v, dtype: bool

As an alternative, use duplicated to check that all the values are unique:

res = df.groupby('g', as_index=False)['v'].apply(lambda x: (~x.duplicated()).all() and x.is_monotonic_increasing)
print(res)

Output

     g      v
0  ABC   True
1  DEF  False

A third alternative is to use numpy and verify that all the differences between consecutive elements are greater than 0:

import numpy as np

res = df.groupby('g', as_index=False)['v'].apply(lambda x: np.all(np.diff(x) > 0))
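Putting the numpy variant together end to end, the following is a sketch that produces the 1/0 "increasing" column asked for in the question. The helper name strictly_increasing is an illustrative choice, and the sketch assumes the rows are already in the intended order within each group (sort by 'date' first if they are not).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "date": ["2020-12-01", "2020-12-03", "2020-12-04"] * 2,
    "g": ["ABC", "ABC", "ABC", "DEF", "DEF", "DEF"],
    "v": [1, 3, 5, 7, 9, 8],
})

def strictly_increasing(values):
    # Hypothetical helper: True when every consecutive difference is > 0.
    # A group with a single row has no differences and counts as increasing.
    return bool(np.all(np.diff(values.to_numpy()) > 0))

result = (
    df.groupby("g")["v"]
      .apply(strictly_increasing)  # one boolean per group
      .astype(int)                 # True/False -> 1/0
      .rename("increasing")
      .reset_index()               # bring 'g' back as a column
)
print(result)
```

With the sample data this gives 1 for ABC and 0 for DEF.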

Answer 2

Thank you, Dani, for the answer. I had to make a small change to make the 'g' column appear.

df.groupby('g', as_index=True)['v'].apply(lambda x: len(x) == len(x.unique()) and x.is_monotonic_increasing).reset_index()
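The diff() idea from the question can probably be made to work as well: the grouping column is lost in the result of diff(), but the result keeps the original row index, so it can be grouped again by df['g']. This is only a sketch; it assumes 'v' has no missing values, since a NaN difference is treated here as the first row of a group.

```python
import pandas as pd

df = pd.DataFrame({
    "g": ["ABC", "ABC", "ABC", "DEF", "DEF", "DEF"],
    "v": [1, 3, 5, 7, 9, 8],
})

# Per-group differences; the first row of each group is NaN.
diffs = df.groupby("g")["v"].diff()

# A row passes if it starts a group (NaN) or is larger than the previous row.
row_ok = diffs.isna() | (diffs > 0)

# Group the row-level flags by the original 'g' column and require all to pass.
result = (
    row_ok.groupby(df["g"])
          .all()
          .astype(int)
          .rename("increasing")
          .reset_index()
)
print(result)
```

Because the comparison is diffs > 0 rather than diffs >= 0, a repeated value fails the check, which is the strict behaviour the question asks for.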

Answer 1
1 2020-12-10 22:17:42

Answer 2
0 2020-12-10 22:36:04
