
Pandas check for row equivalence

I have a DataFrame with three value columns, and I would like to calculate, for each row, how many of its three values were also contained in the previous row. The values are strings.

Original DF:

Date    num1    num2    num3
Y1      x       y       z
Y2      b       x       a
Y3      x       c       c
Y4      c       x       d
Y5      x       c       d

Needed output:

Date    num1    
Y1      -       
Y2      1       <- since only x in previous row
Y3      1       <- since only x in previous
Y4      2       <- since both x and c in previous 
Y5      3       <- since all three in previous row

Any thoughts?

Typically, when comparing rows, you want to use the shift method:

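In [90]:

rel = df.set_index('Date')   # index by Date so x.name is the row's date
shifted = rel.shift()        # each row now holds the previous row's values

# for each row, count how many of its values appear in the previous row
rel.apply(lambda x: x.isin(shifted.loc[x.name]).sum(), axis=1)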
Out[90]:
Date
Y1      0
Y2      1
Y3      1
Y4      2
Y5      3
dtype: int64
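
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question, typed in by hand. The names `counts` and `result` are illustrative, and turning the first row into a missing value (to mimic the `-` in the needed output) is an assumption about what the asker wants, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["Y1", "Y2", "Y3", "Y4", "Y5"],
    "num1": ["x", "b", "x", "c", "x"],
    "num2": ["y", "x", "c", "x", "c"],
    "num3": ["z", "a", "c", "d", "d"],
})

rel = df.set_index("Date")
shifted = rel.shift()  # row i now holds the values of row i-1 (first row is all NaN)

# For each row, count how many of its values occur anywhere in the previous row.
counts = rel.apply(lambda row: row.isin(shifted.loc[row.name]).sum(), axis=1)

# Assumption: the first row has no predecessor, so mark it as missing
# instead of 0. The nullable "Int64" dtype keeps the other counts as integers.
result = counts.astype("Int64")
result.iloc[0] = pd.NA

print(result)
```

Note that `isin` checks each value separately, so a value repeated within a row is counted once per occurrence if it appears in the previous row.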
