Convert a square pandas dataframe to a tall format
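I have a dataframe that is a correlation matrix. I want to create a graph from it, and to do that I need to convert a correlation dataframe that looks like this: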
import numpy as np
import pandas as pd
df = pd.DataFrame([[1,0.2,0.4],[0.2,1,0.6],[0.4,0.6,1]])
df.columns = list('ABC')
df.index= list('ABC')
df
# result-
A B C
A 1.0 0.2 0.4
B 0.2 1.0 0.6
C 0.4 0.6 1.0
into this format:
df = pd.DataFrame({ 'from':['A', 'A', 'A', 'B', 'B', 'C'], 'to':['A', 'B', 'C', 'B', 'C', 'C'], 'value':[1, 0.2,0.4,1,0.6,1]})
df
# result-
from to value
0 A A 1.0
1 A B 0.2
2 A C 0.4
3 B B 1.0
4 B C 0.6
5 C C 1.0
How can I achieve this?
Use stack + rename_axis + reset_index:
df1 = df.stack().rename_axis(('from','to')).reset_index(name='value')
print (df1)
from to value
0 A A 1.0
1 A B 0.2
2 A C 0.4
3 B A 0.2
4 B B 1.0
5 B C 0.6
6 C A 0.4
7 C B 0.6
8 C C 1.0
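The stacked result lists every off-diagonal pair twice, while the question asks for six rows. One possible way to get there, sketched here under the assumption that the matrix is symmetric, is to blank out the lower triangle before stacking. The `np.triu` mask and the names `mask` and `upper` are illustrative additions, not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 0.2, 0.4], [0.2, 1, 0.6], [0.4, 0.6, 1]],
                  index=list('ABC'), columns=list('ABC'))

# Boolean mask that is True on and above the diagonal
mask = np.triu(np.ones(df.shape, dtype=bool))

# where() turns the lower triangle into NaN; dropna() removes those cells
# explicitly, so the result does not depend on whether stack() drops NaN itself
upper = (df.where(mask)
           .stack()
           .dropna()
           .rename_axis(['from', 'to'])
           .reset_index(name='value'))
print(upper)
```

Note that `dropna()` would also discard genuinely missing correlations, so this sketch only fits a matrix without NaN entries. Passing `k=1` to `np.triu` would exclude the diagonal as well.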
Another solution, using numpy:
# ravel() flattens row by row, so repeat the row labels and tile the column labels
a = np.repeat(df.index, len(df.columns))
b = np.tile(df.columns, len(df.index))
c = df.values.ravel()
df1 = pd.DataFrame({'from':a, 'to':b, 'value':c})
print (df1)
from to value
0 A A 1.0
1 A B 0.2
2 A C 0.4
3 B A 0.2
4 B B 1.0
5 B C 0.6
6 C A 0.4
7 C B 0.6
8 C C 1.0
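With a square matrix whose rows and columns share the same labels, it is hard to see which labels must be repeated and which tiled. The following sketch uses a hypothetical 2x3 frame (the data and the names `rows`, `cols`, `vals`, `tall` are made up for illustration) to show the ordering that goes with a row-major `ravel()`, and cross-checks it against the `stack` approach.

```python
import numpy as np
import pandas as pd

# Hypothetical non-square frame, so rows and columns cannot be confused
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=list('AB'), columns=list('XYZ'))

# ravel() walks the values row by row, so each row label is repeated
# once per column, and the column labels are cycled once per row
rows = np.repeat(df.index.to_numpy(), len(df.columns))   # A A A B B B
cols = np.tile(df.columns.to_numpy(), len(df.index))     # X Y Z X Y Z
vals = df.to_numpy().ravel()                             # 1 2 3 4 5 6

tall = pd.DataFrame({'from': rows, 'to': cols, 'value': vals})

# The stack-based approach should produce the same rows in the same order
stacked = df.stack().rename_axis(['from', 'to']).reset_index(name='value')
print(tall)
```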
Edit:
Another solution that removes duplicates:
df = pd.DataFrame([[1,0.2,0.4],[0.2,1,0.6],[0.4,0.6,1]])
df.columns = list('ACC')
df.index= list('ABC')
print (df)
A C C
A 1.0 0.2 0.4
B 0.2 1.0 0.6
C 0.4 0.6 1.0
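# ravel() flattens row by row, so repeat the row labels and tile the column labels
a = np.repeat(df.index, len(df.columns))
b = np.tile(df.columns, len(df.index))
c = df.values.ravel()
df1 = pd.DataFrame({'from':a, 'to':b, 'value':c})
# sort each pair so that (B, A) and (A, B) compare equal; note that 'value' is dropped here
df1 = (pd.DataFrame(np.sort(df1[['from','to']], axis=1), columns=['from','to'])
.drop_duplicates())
print (df1)
from to
0 A A
1 A C
3 A B
4 B C
7 C C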
I accepted jezrael's answer. For completeness, I am adding a few lines that drop the duplicates.
# from jezrael's solution
df1 = df.stack().rename_axis(('from','to')).reset_index(name='value')
# drop the dupes: sort each (from, to) pair so that (B, A) becomes (A, B)
df1[['from', 'to']] = np.sort(df1[['from', 'to']].to_numpy(), axis=1)
df1.drop_duplicates()
# result -
from to value
0 A A 1.0
1 A B 0.2
2 A C 0.4
4 B B 1.0
5 B C 0.6
8 C C 1.0