Closest non-equal row in a column in a Pandas dataframe

I have this df:

import pandas as pd
d={}
d['id']=['1','1','1','1','1','1','1','1','2','2','2','2','2','2','2','2']
d['qty']=[5,5,5,5,5,6,5,5,1,1,2,2,2,3,5,8]
df=pd.DataFrame(d)  # build the frame that the answer below refers to as df

I would like to create a column that holds the next non-equal value of column qty. That is, if qty is equal to 5 and the next row is also 5, I skip it and keep looking until I find the next value not equal to 5; in my case it is 6. All of this should be grouped by id.

Here is the desired dataframe:

d['id']=['1','1','1','1','1','1','1','1','2','2','2','2','2','2','2','2']
d['qty']=[5,5,5,5,5,6,5,5,1,1,2,2,2,3,5,8]
d['qty2']=[6,6,6,6,6,5,'NAN','NAN',2,2,3,3,3,5,8,'NAN']

Any help is very much appreciated.

You can use groupby.shift, mask the values that are identical to the current row, and then use groupby.bfill:

# shift up per group
s = df.groupby('id')['qty'].shift(-1)

# keep only the different values and bfill per group
df['qty2'] = s.where(df['qty'].ne(s)).groupby(df['id']).bfill()

Output:

   id  qty  qty2
0   1    5   6.0
1   1    5   6.0
2   1    5   6.0
3   1    5   6.0
4   1    5   6.0
5   1    6   5.0
6   1    5   NaN
7   1    5   NaN
8   2    1   2.0
9   2    1   2.0
10  2    2   3.0
11  2    2   3.0
12  2    2   3.0
13  2    3   5.0
14  2    5   8.0
15  2    8   NaN
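
To see why this works, here is a minimal self-contained sketch that rebuilds the frame from the question and keeps each intermediate step in its own variable. The names nxt and changed are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id':  ['1'] * 8 + ['2'] * 8,
    'qty': [5, 5, 5, 5, 5, 6, 5, 5, 1, 1, 2, 2, 2, 3, 5, 8],
})

# Step 1: the next row's qty within each id
# (NaN on the last row of every group, so values never leak across ids)
nxt = df.groupby('id')['qty'].shift(-1)

# Step 2: keep the next value only where it differs from the current qty;
# rows whose next value is identical become NaN
changed = nxt.where(df['qty'].ne(nxt))

# Step 3: back-fill within each id, so every row in a run of equal values
# picks up the first different value that follows the run
df['qty2'] = changed.groupby(df['id']).bfill()

print(df)
```

The result is float because NaN marks rows with no later different value. If integers are preferred, casting with df['qty2'].astype('Int64') should keep those rows as missing values, assuming a pandas version with nullable integer support.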
