Efficient way to populate a pandas dataframe based on conditions from another dataframe
Data
I have a dataframe that holds a rank score for each ID:
>>> ranks
  ID  rank
0  A     6
1  B     9
2  C     6
3  D     1
4  E     1
5  F     2
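I want to turn this into a square matrix with each ID as both index and column, based on a few conditions: if the rank of the ID on the index is higher than the rank of the ID in the column (that is, its rank value is smaller, so rank 1 is the best), set the cell to 1; if it is lower, set it to 0; if they are equal, set it to 0.5; and if the index and the column are the same ID, set it to np.nan. This is better described by looking at the matrix I want:
Desired result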
>>> mtrx
     A    B    C    D    E    F
A  NaN  1.0  0.5  0.0  0.0  0.0
B  0.0  NaN  0.0  0.0  0.0  0.0
C  0.5  1.0  NaN  0.0  0.0  0.0
D  1.0  1.0  1.0  NaN  0.5  1.0
E  1.0  1.0  1.0  0.5  NaN  1.0
F  1.0  1.0  1.0  0.0  0.0  NaN
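What I've done (works, but slow)
The following loop works, but it is slow for larger dataframes. If anyone can point me to a more Pythonic/pandorable way to achieve this, I'd appreciate the help: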
# Make an empty matrix as a dataframe
mtrx = pd.DataFrame(np.zeros((len(IDs), len(IDs))), index=IDs, columns=IDs)
# Populate it via for loop
for i in IDs:
    for j in IDs:
        i_rank = ranks.loc[ranks['ID'] == i].iloc[0]['rank']
        j_rank = ranks.loc[ranks['ID'] == j].iloc[0]['rank']
        if i == j:
            mtrx.loc[i, j] = np.nan
        elif i_rank < j_rank:
            mtrx.loc[i, j] = 1.
        elif i_rank == j_rank:
            mtrx.loc[i, j] = 0.5
Code to reproduce this toy example
import pandas as pd
import numpy as np
np.random.seed(1)
IDs = list('ABCDEF')
ranks = pd.DataFrame({'ID':IDs, 'rank':np.random.randint(1,10,len(IDs))})
numpy approach
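s = ranks['rank'].values
# s1[i, j] is 1.0 where the row's rank value is smaller than the column's
s1 = (s > s[:, None]).astype(float)
s1[s == s[:, None]] = 0.5        # ties
np.fill_diagonal(s1, np.nan)     # same ID on index and column
pd.DataFrame(s1, index=ranks.ID, columns=ranks.ID)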
Out[843]:
ID    A    B    C    D    E    F
ID
A   NaN  1.0  0.5  0.0  0.0  0.0
B   0.0  NaN  0.0  0.0  0.0  0.0
C   0.5  1.0  NaN  0.0  0.0  0.0
D   1.0  1.0  1.0  NaN  0.5  1.0
E   1.0  1.0  1.0  0.5  NaN  1.0
F   1.0  1.0  1.0  0.0  0.0  NaN
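For readers who want to reuse the broadcasting idea, here is a minimal sketch that wraps it in a function. The name pairwise_rank_matrix is hypothetical (it is not from the original answer), and np.select is swapped in for the step-by-step mask assignments. It assumes the rank values from the toy example and that a smaller rank value means a higher rank.

```python
import numpy as np
import pandas as pd


def pairwise_rank_matrix(ranks: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: 1.0 where the row ID outranks the column ID
    (smaller rank value), 0.5 on ties, 0.0 otherwise, NaN on the diagonal."""
    r = ranks['rank'].to_numpy()
    row = r[:, None]  # shape (n, 1): rank of the ID on the index
    col = r[None, :]  # shape (1, n): rank of the ID on the columns
    # Broadcasting row against col compares every pair at once.
    out = np.select([row < col, row == col], [1.0, 0.5], default=0.0)
    np.fill_diagonal(out, np.nan)  # an ID is never compared with itself
    ids = ranks['ID'].to_numpy()
    return pd.DataFrame(out, index=ids, columns=ids)


# Rank values taken from the toy example in the question.
ranks = pd.DataFrame({'ID': list('ABCDEF'), 'rank': [6, 9, 6, 1, 1, 2]})
mtrx = pairwise_rank_matrix(ranks)
```

The whole comparison is done on an n x n array in memory, so this should scale far better than the nested loop, although memory grows quadratically with the number of IDs.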
pandas approach
# Cross join via a constant key: one row per (ID_x, ID_y) pair
s = ranks.assign(key=1).merge(ranks.assign(key=1), on='key')
s['New'] = (s['rank_x'] < s['rank_y']).astype(float)  # float so 0.5 and NaN fit
s.loc[s['rank_x'] == s['rank_y'], 'New'] = 0.5
s.loc[s['ID_x'] == s['ID_y'], 'New'] = np.nan
s.set_index(['ID_x', 'ID_y']).New.unstack()
Out[854]:
ID_y    A    B    C    D    E    F
ID_x
A     NaN  1.0  0.5  0.0  0.0  0.0
B     0.0  NaN  0.0  0.0  0.0  0.0
C     0.5  1.0  NaN  0.0  0.0  0.0
D     1.0  1.0  1.0  NaN  0.5  1.0
E     1.0  1.0  1.0  0.5  NaN  1.0
F     1.0  1.0  1.0  0.0  0.0  NaN
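As a variation on the merge-based approach, the sketch below uses merge(how='cross'), which I believe is available from pandas 1.2 onward, instead of a temporary key column, and pivot in place of set_index plus unstack. The variable names pairs and result are illustrative only, and the rank values are those of the toy example.

```python
import numpy as np
import pandas as pd

# Rank values taken from the toy example in the question.
ranks = pd.DataFrame({'ID': list('ABCDEF'), 'rank': [6, 9, 6, 1, 1, 2]})

# Cross join: one row per (ID_x, ID_y) pair; the overlapping column
# names receive the _x / _y suffixes.
pairs = ranks.merge(ranks, how='cross', suffixes=('_x', '_y'))

# The first matching condition wins, so the same-ID check comes first.
pairs['New'] = np.select(
    [pairs['ID_x'] == pairs['ID_y'],
     pairs['rank_x'] < pairs['rank_y'],
     pairs['rank_x'] == pairs['rank_y']],
    [np.nan, 1.0, 0.5],
    default=0.0,
)

# Reshape the long table into the square ID-by-ID matrix.
result = pairs.pivot(index='ID_x', columns='ID_y', values='New')
```

This builds a long table with n squared rows before reshaping, so for large inputs the numpy broadcasting version is likely to be lighter on memory and time.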