How to get a list of (row, column) tuples from a DataFrame?
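I am trying to get a list of (row, column) tuples from a DataFrame for the cells that satisfy some condition.
I referred to this post: Get column and row index pairs of Pandas DataFrame matching some criteria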
A = pd.DataFrame([(1.0,0.8,0.6708203932499369,0.6761234037828132,0.7302967433402214),
(0.8,1.0,0.6708203932499369,0.8451542547285166,0.9128709291752769),
(0.6708203932499369,0.6708203932499369,1.0,0.5669467095138409,0.6123724356957946),
(0.6761234037828132,0.8451542547285166,0.5669467095138409,1.0,0.9258200997725514),
(0.7302967433402214,0.9128709291752769,0.6123724356957946,0.9258200997725514,1.0)
])
c2 = A.copy()
c2.values[np.tril_indices_from(c2)] = np.nan
[(c2.index[i], c2.columns[j]) for i, j in np.argwhere(c2 > 0.8)]
This fails with the error:
Shape of passed values is (2, 3), indices imply (5, 5)
What am I doing wrong?
You probably want to use the NumPy array, i.e. c2.values, rather than the DataFrame itself:
[(c2.index[i], c2.columns[j]) for i, j in np.argwhere(c2.values > 0.8)]
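The fix above can be sketched as a self-contained script. The (2, 3) in the error message matches the two index arrays of three hits each that NumPy computes, which suggests the failure happens when that result is wrapped back into a 5x5 DataFrame; comparing on the plain array avoids that step. As an assumption of this sketch (not part of the answer), the masked frame is built with DataFrame.where instead of writing through c2.values, since in-place writes through .values may not be permitted in every pandas configuration.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame([
    (1.0, 0.8, 0.6708203932499369, 0.6761234037828132, 0.7302967433402214),
    (0.8, 1.0, 0.6708203932499369, 0.8451542547285166, 0.9128709291752769),
    (0.6708203932499369, 0.6708203932499369, 1.0, 0.5669467095138409, 0.6123724356957946),
    (0.6761234037828132, 0.8451542547285166, 0.5669467095138409, 1.0, 0.9258200997725514),
    (0.7302967433402214, 0.9128709291752769, 0.6123724356957946, 0.9258200997725514, 1.0),
])

# Keep only the strict upper triangle; every other cell becomes NaN.
# (Assumption of this sketch: DataFrame.where replaces the in-place write.)
upper = np.triu(np.ones(A.shape, dtype=bool), k=1)
c2 = A.where(upper)

# Compare on the plain array so np.argwhere never receives a DataFrame.
hits = np.argwhere(c2.to_numpy() > 0.8)

# Translate positional hits into (row label, column label) tuples.
pairs = [(c2.index[i], c2.columns[j]) for i, j in hits]
print(pairs)
```

The resulting pairs are the same three cells, (1, 3), (1, 4) and (3, 4), that the other answers report.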
I would use np.column_stack(np.where(condition)) for this:
import pandas as pd
import numpy as np
A = pd.DataFrame([(1.0,0.8,0.6708203932499369,0.6761234037828132,0.7302967433402214),
(0.8,1.0,0.6708203932499369,0.8451542547285166,0.9128709291752769),
(0.6708203932499369,0.6708203932499369,1.0,0.5669467095138409,0.6123724356957946),
(0.6761234037828132,0.8451542547285166,0.5669467095138409,1.0,0.9258200997725514),
(0.7302967433402214,0.9128709291752769,0.6123724356957946,0.9258200997725514,1.0)
])
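c2 = A.copy()
# Blank out the lower triangle and the diagonal so each pair is reported once.
c2.values[np.tril_indices_from(c2)] = np.nan
# np.where returns the row and column positions; column_stack pairs them up.
np.column_stack(np.where(c2 > 0.8))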
Out[4]:
array([[1, 3],
       [1, 4],
       [3, 4]], dtype=int64)
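Note that the array above holds positions, which coincide with the labels here only because A uses the default 0..4 index and columns. A minimal sketch of turning positions into label tuples follows; the small matrix and the labels "a", "b", "c" are hypothetical and chosen only to make the difference visible.

```python
import numpy as np
import pandas as pd

# Hypothetical labelled matrix (not from the original post).
labels = ["a", "b", "c"]
df = pd.DataFrame(
    [[1.0, 0.9, 0.3],
     [0.9, 1.0, 0.85],
     [0.3, 0.85, 1.0]],
    index=labels,
    columns=labels,
)

arr = df.to_numpy()
# Restrict the condition to the strict upper triangle so each pair appears once.
upper = np.triu(np.ones(arr.shape, dtype=bool), k=1)
rows, cols = np.where((arr > 0.8) & upper)

# Positional pairs, as in the answer above.
positions = np.column_stack((rows, cols))

# The same hits expressed with the frame's own labels.
label_pairs = list(zip(df.index[rows], df.columns[cols]))
print(positions.tolist(), label_pairs)
```

Indexing df.index and df.columns with the position arrays keeps the lookup vectorized, so no Python-level loop over the hits is needed.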
You can mask the DataFrame and then stack it, which leaves you with a MultiIndex whose (index, column) tuples are the ones that satisfy the condition.
m = A.gt(0.8) & np.triu(np.ones(A.shape), k=1).astype('bool')
A[m].stack().index.tolist()
#[(1, 3), (1, 4), (3, 4)]
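One caveat worth checking: the approach relies on stack discarding the NaN cells left by the mask, and whether stack does that by default appears to depend on the pandas version. A hedged variant that calls dropna explicitly should behave the same either way; the small matrix below is hypothetical and only keeps the sketch short.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 matrix (not from the original post).
df = pd.DataFrame(
    [[1.0, 0.9, 0.3],
     [0.9, 1.0, 0.85],
     [0.3, 0.85, 1.0]]
)

# True only above the diagonal, and only where the value exceeds the threshold.
upper = np.triu(np.ones(df.shape, dtype=bool), k=1)
mask = df.gt(0.8) & upper

# Masked-out cells become NaN; dropna removes them explicitly in case
# stack keeps them (assumption: this differs between pandas versions).
pairs = df[mask].stack().dropna().index.tolist()
print(pairs)
```

The explicit dropna costs little and makes the intent clear to readers who do not know the stack default.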