Pandas MultiIndex matching on one index level

Question

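I have a pandas MultiIndex object whose first level is a regular increasing integer index and whose second level contains other integers, which may or may not repeat across different first-level values: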

import pandas as pd

# keep the (i, j) pairs where j % 5 == i or j % 4 == i
lst = list(filter(lambda x: x[1] % 5 == x[0] or x[1] % 4 == x[0], [(i, j) for i in range(5) for j in range(0, 20, 2)]))
mi = pd.MultiIndex.from_tuples(lst).rename(['frst', 'scnd'])
# mi = MultiIndex([(0,  0),(0,  4),(0,  8),(0, 10),(0, 12),(0, 16),(1,  6),(1, 16),(2,  2),(2,  6),(2, 10),(2, 12),(2, 14),(2, 18),(3,  8),(3, 18),(4,  4),(4, 14)], names=['frst', 'scnd'])

For a given first-level value (e.g. frst_idx = 0) and some shift, I need to find all index entries where frst equals frst_idx + shift and whose scnd value is shared between frst_idx and frst_idx + shift.

For example:

frst_idx = 0, shift = 3 should output [8], because the MultiIndex above contains both (0, 8) and (3, 8).
frst_idx = 1, shift = 1 should output [6], because both (1, 6) and (2, 6) are in the index.

So I would like a function that takes these arguments and returns a pd.Series of all matching scnd values:

my_func(multi_index=mi, frst_idx=0, shift=3) ==> pd.Series([8])

Doing this iteratively is very expensive (O(n^2)), and I am hoping there is some pandas magic that can do it faster.
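For comparison, the iterative approach mentioned above might look like the following sketch. The function name `shared_scnd_naive` is hypothetical and not part of the original question; it works on the plain list of (frst, scnd) tuples and compares every candidate entry against every other entry, which is where the quadratic cost comes from.

```python
def shared_scnd_naive(pairs, frst_idx, shift):
    """Quadratic baseline: for each entry under frst_idx, rescan all entries."""
    matches = []
    for f1, s1 in pairs:
        if f1 != frst_idx:
            continue
        # inner scan over the whole list looks for the same scnd under the shifted key
        for f2, s2 in pairs:
            if f2 == frst_idx + shift and s2 == s1:
                matches.append(s1)
    return matches


# the same (frst, scnd) pairs as in the example index
pairs = [(i, j) for i in range(5) for j in range(0, 20, 2)
         if j % 5 == i or j % 4 == i]

print(shared_scnd_naive(pairs, frst_idx=0, shift=3))  # [8]
print(shared_scnd_naive(pairs, frst_idx=1, shift=1))  # [6]
```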

Answer 1

I found the following solution:

# reminder: mi is a MultiIndex with mi.names == ['frst', 'scnd']
# assume some integer values for frst_idx1 and shift

# scnd values that appear under the first key
scnd_indices1 = mi[mi.get_level_values('frst') == frst_idx1].droplevel('frst')

# scnd values that appear under the shifted key
frst_idx2 = frst_idx1 + shift
scnd_indices2 = mi[mi.get_level_values('frst') == frst_idx2].droplevel('frst')

# keep the values common to both and return them as a Series with a fresh 0..n-1 index
result = scnd_indices1.intersection(scnd_indices2).to_series().reset_index(drop=True)
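As a minimal sketch, the steps above could be wrapped into the `my_func` signature requested in the question. The wrapper body is an assumption about how the pieces fit together, not code from the original answer, and it assumes the index levels are named 'frst' and 'scnd'.

```python
import pandas as pd


def my_func(multi_index: pd.MultiIndex, frst_idx: int, shift: int) -> pd.Series:
    """Return the 'scnd' values shared by frst_idx and frst_idx + shift."""
    frst = multi_index.get_level_values('frst')
    # 'scnd' values under each of the two first-level keys
    scnd_a = multi_index[frst == frst_idx].droplevel('frst')
    scnd_b = multi_index[frst == frst_idx + shift].droplevel('frst')
    # common values, returned as a Series with a fresh integer index
    return scnd_a.intersection(scnd_b).to_series().reset_index(drop=True)


# rebuild the example index from the question
lst = [(i, j) for i in range(5) for j in range(0, 20, 2)
       if j % 5 == i or j % 4 == i]
mi = pd.MultiIndex.from_tuples(lst, names=['frst', 'scnd'])

print(my_func(multi_index=mi, frst_idx=0, shift=3).tolist())  # [8]
print(my_func(multi_index=mi, frst_idx=1, shift=1).tolist())  # [6]
```

Each boolean mask is built in a single vectorized pass over the index, so the pairwise comparison of the iterative approach is avoided.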

Answer 1
0 | accepted | 2022-05-02 07:36:01
