Find intersection between rows

If I want to find the difference between two consecutive rows in a pandas DataFrame, I can simply call the diff function.

I have rows that contain sets of characters. What I want to do now is compute the intersection of the sets in each pair of consecutive rows. In other words, I'd like to use diff, but supply my own function instead. Is there a way to accomplish this in pandas?

Example input:

 100118231     1               set([])           
               2            set([142.136.6])    
               3            set([142.136.6])    
               4            set([])             
               5            set([])             
               6            set([108.0.239])    

Desired output:

 100118231     1               set([])             NaN
               2            set([142.136.6])    set([])
               3            set([142.136.6])    set([142.136.6])
               4            set([])             set([])
               5            set([])             set([])
               6            set([108.0.239])    set([])

I've tried using shift, but it throws an error:

In [213]: type(tgr.head(1))
Out[213]: pandas.core.frame.DataFrame

In [214]: tt=tgr.apply(lambda x: x['value'].intersection((x['value'].shift(-1))))

AttributeError: 'Series' object has no attribute 'intersection'

& will run over all the items, so there's no need to involve lambdas and the like.

> import pandas as pd
> df = pd.DataFrame(['hi', set([142,136,6]), set([142, 137, 6]), set([0, 6])]).iloc[1:]
> df & df.shift(1)
               0
1            NaN
2  set([142, 6])
3       set([6])
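
How & treats object columns may vary between pandas versions, so the same row-wise intersection can also be spelled out explicitly. The following is only a minimal sketch under stated assumptions, not the method from the answer above: the Series s and the helper name pairwise_intersection are hypothetical, and the first row, which has no predecessor, is filled with None rather than NaN.

```python
import pandas as pd


def pairwise_intersection(s):
    """Intersect each set in s with the set in the previous row.

    Hypothetical helper: the first row has no predecessor, so it gets None.
    """
    values = s.tolist()
    # For every position after the first, intersect with the previous element.
    out = [None if i == 0 else values[i] & values[i - 1]
           for i in range(len(values))]
    # dtype=object keeps the Python set objects as they are.
    return pd.Series(out, index=s.index, dtype=object)


# Hypothetical data mirroring the sets used in the answer above.
s = pd.Series([{142, 136, 6}, {142, 137, 6}, {0, 6}], index=[1, 2, 3])
result = pairwise_intersection(s)
print(result)
```

Because the helper relies only on Python's own set intersection, any other pairwise function could be swapped in for & in the list comprehension, which is roughly what "diff with my own function" would look like.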
