
How to find the intersection of a list of dataframes with exactly the same columns and indexes but different values in pandas (Python)?

I have a list of pandas dataframes, as shown below:

lis = [df1, df2, df3, ... , dfn]

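I want to find the intersection of these dataframes, so that my final df, called the intersection df, contains only the values common to all of them. All the columns and rows should still be there, but filled with NaN wherever no intersection is found.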

My dataframes are all multi-indexed and have the same number of rows and columns, as shown below:

              1   2   3   4   5  
cat   cat     1   0   0   1   1  
      dog     1   0   0   1   1  
      fox     0   0   0   0   0  
      jumps   0   0   1   1   1  
      over    1   0   0   1   1  
      the     1   0   0   1   1
dog   cat     1   0   0   1   0  
      dog     1   0   0   1   0  
      fox     0   0   0   0   0  
      jumps   1   0   0   1   0  
      over    1   0   0   1   0  
      the     1   1   0   1   0 

I have tried different solutions found on Stack Overflow, but with no luck. Any ideas?

See if this works

from functools import reduce
import pandas as pd

lis = [df1, df2, df3, ... , dfn]

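# Pairwise alignment: 'inner' keeps labels shared by both frames, 'outer' keeps the union.
inner_align = lambda d1, d2: d1.align(d2, join='inner')[0]
outer_align = lambda d1, d2: d1.align(d2, join='outer')[0]

# Fold over the list: labels common to all frames, and the union of all labels.
inner_indcs = reduce(inner_align, lis)
outer_indcs = reduce(outer_align, lis)

# Restrict to the common labels, then expand to the full label set (NaN elsewhere).
innout = lambda d, i, o: d.reindex_like(i).reindex_like(o)

output = innout(lis[0], inner_indcs, outer_indcs)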
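
For readers unfamiliar with `DataFrame.align`, here is a minimal sketch of what the two join modes return for a single pair of frames. The frames `left` and `right` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Two hypothetical frames with partly overlapping labels.
left = pd.DataFrame(1, index=list("abc"), columns=list("xyz"))
right = pd.DataFrame(2, index=list("acd"), columns=list("wyz"))

# join="inner" keeps only the row and column labels present in both frames.
inner_left, inner_right = left.align(right, join="inner")

# join="outer" keeps the union of labels; cells a frame did not have become NaN.
outer_left, outer_right = left.align(right, join="outer")

print(inner_left)
print(outer_left)
```

`align` returns both frames reindexed to the same labels; the answer's lambdas keep only the first element of that pair, so `reduce` can keep folding over the list.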

Setup

lis = [
    pd.DataFrame(1, list('abc'), list('xyz')),
    pd.DataFrame(1, list('acd'), list('wyz')),
    pd.DataFrame(1, list('bec'), list('ysu')),
    pd.DataFrame(1, list('cef'), list('xgy')),
]

print(*lis, sep='\n'*2)

   x  y  z
a  1  1  1
b  1  1  1
c  1  1  1

   w  y  z
a  1  1  1
c  1  1  1
d  1  1  1

   y  s  u
b  1  1  1
e  1  1  1
c  1  1  1

   x  g  y
c  1  1  1
e  1  1  1
f  1  1  1

Demo

from functools import reduce
import pandas as pd

inner_align = lambda d1, d2: d1.align(d2, 'inner')[0]
outer_align = lambda d1, d2: d1.align(d2, 'outer')[0]

inner_indcs = reduce(inner_align, lis)
outer_indcs = reduce(outer_align, lis)

innout = lambda d, i, o: d.reindex_like(i).reindex_like(o)

output = innout(lis[0], inner_indcs, outer_indcs)

print(output)

    g   s   u   w   x    y   z
a NaN NaN NaN NaN NaN  NaN NaN
b NaN NaN NaN NaN NaN  NaN NaN
c NaN NaN NaN NaN NaN  1.0 NaN
d NaN NaN NaN NaN NaN  NaN NaN
e NaN NaN NaN NaN NaN  NaN NaN
f NaN NaN NaN NaN NaN  NaN NaN
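
Note that the approach above intersects row and column labels, not cell values. For frames that already share the same index and columns, as in the question, a cell-wise variant might look like the sketch below. It assumes "common to all" means every frame holds the same value in that cell; the helper name `cellwise_intersection` and the sample frames are hypothetical.

```python
from functools import reduce

import pandas as pd


def cellwise_intersection(frames):
    """Keep a cell only where every frame holds the same value; NaN elsewhere."""
    first = frames[0]
    # Start from an all-True mask and AND in each frame's agreement with the first.
    all_true = pd.DataFrame(True, index=first.index, columns=first.columns)
    agree = reduce(lambda mask, df: mask & df.eq(first), frames[1:], all_true)
    # where() keeps values where the mask is True and fills NaN where it is False.
    return first.where(agree)


# Hypothetical sample data: identical MultiIndex and columns, different values.
idx = pd.MultiIndex.from_tuples([("cat", "cat"), ("cat", "dog"), ("dog", "cat")])
cols = [1, 2]
frames = [
    pd.DataFrame([[1, 0], [1, 1], [0, 0]], index=idx, columns=cols),
    pd.DataFrame([[1, 0], [0, 1], [0, 1]], index=idx, columns=cols),
    pd.DataFrame([[1, 1], [1, 1], [0, 0]], index=idx, columns=cols),
]

result = cellwise_intersection(frames)
print(result)
```

All rows and columns are preserved, and integer columns become floats because NaN is introduced. If the intended meaning of "intersection" is different (for example, cells where every frame is non-zero), the comparison inside `reduce` would need to change accordingly.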
