Pandas check for row equivalence
I have a DataFrame with three columns and I would like to calculate how many of the three values were also contained in the previous row. The values are strings.
Original DF:
Date  num1  num2  num3
Y1    x     y     z
Y2    b     x     a
Y3    x     c     c
Y4    c     x     d
Y5    x     c     d
Needed output:
Date  num1
Y1    -
Y2    1   <- since only x in previous row
Y3    1   <- since only x in previous row
Y4    2   <- since both x and c in previous row
Y5    3   <- since all three in previous row
Any thoughts?
Typically, when comparing rows you want to use the shift method:
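In [90]:
# Index by Date so each row's label can be used to look up its shifted counterpart
rel = df.set_index('Date')
# shifted holds, for every Date, the values of the previous row (NaN for the first row)
shifted = rel.shift()
rel.apply(lambda x: x.isin(shifted.loc[x.name]).sum(), axis=1)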
Out[90]:
Date
Y1 0
Y2 1
Y3 1
Y4 2
Y5 3
dtype: int64
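For anyone who wants to run this outside the session above, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question and applies the same shift/isin approach. The names `counts` and `labelled` are just illustrative, and replacing the first row's 0 with "-" is one assumed way to match the requested output, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date": ["Y1", "Y2", "Y3", "Y4", "Y5"],
    "num1": ["x", "b", "x", "c", "x"],
    "num2": ["y", "x", "c", "x", "c"],
    "num3": ["z", "a", "c", "d", "d"],
})

rel = df.set_index("Date")
shifted = rel.shift()  # row i now holds the values of row i-1

# For each row, count how many of its values appear anywhere in the previous row
counts = rel.apply(lambda row: row.isin(shifted.loc[row.name]).sum(), axis=1)

# Optional (assumption): show "-" for the first row, which has no previous row
labelled = counts.astype(object)
labelled.iloc[0] = "-"

print(counts)
print(labelled)
```

Note that the count is per column of the current row, so a value repeated in the current row (such as the two c's in Y3) would be counted twice if it appeared in the previous row.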