Turn pandas dataframe into series by iterating over columns
I have a dataframe of the following form and am trying to turn it into a series:
      col1  col2  col3
col1   1.0  0.20  0.70
col2   0.2  1.00  0.01
col3   0.7  0.01  1.00
Goal:
col1Xcol1 1.0
col1Xcol2 0.2
col1Xcol3 0.7
col2Xcol1 0.2
...
My code so far:
pvals2=pd.DataFrame({'col1': [1, .2,.7],
'col2': [.2, 1,.01],
'col3': [.7,.01,1]},
index = ['col1', 'col2', 'col3'])
print(pvals.transpose().join(pvals, how='outer',lsuffix='_left', rsuffix='_right'))
Output:
vote_left ballot1_left ballot1_x_left vote_right ballot1_right \
vote 0 0.0923 0.0521 0 0.0923
ballot1 0.0923 0 0.8213 0.0923 0
ballot1_x 0.0521 0.8213 0 0.0521 0.8213
ballot1_x_right
vote 0.0521
ballot1 0.8213
ballot1_x 0
Using concat and then setting a new index works:
>>> ser = pd.concat([pvals2[col] for col in pvals2.columns])
>>> ser.index = [pvals2[col].name + 'X' + x for col in pvals2.columns
for x in pvals2[col].index]
>>> ser
col1Xcol1 1.00
col1Xcol2 0.20
col1Xcol3 0.70
col2Xcol1 0.20
col2Xcol2 1.00
col2Xcol3 0.01
col3Xcol1 0.70
col3Xcol2 0.01
col3Xcol3 1.00
dtype: float64
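Concatenating the columns one after another gives the same values, in the same order, as DataFrame.unstack() on a frame with a flat index, so the explicit loop over columns can be skipped. The sketch below is an alternative to the concat call above, not the method the answer uses; it assumes the pvals2 frame from the question and builds the labels from the resulting (column, row) MultiIndex.

```python
import pandas as pd

pvals2 = pd.DataFrame({'col1': [1, .2, .7],
                       'col2': [.2, 1, .01],
                       'col3': [.7, .01, 1]},
                      index=['col1', 'col2', 'col3'])

# unstack() on a frame with a flat index returns a Series whose
# MultiIndex is (column label, row label), walked column by column
ser = pvals2.unstack()

# Flatten each (column, row) pair into a single 'colXrow' label
ser.index = [f"{col}X{row}" for col, row in ser.index]
print(ser)
```

The labels come out as column-then-row, matching the concat version; swap the two names in the f-string if row-then-column labels are wanted.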
The following code:
pvals = pd.DataFrame({'col1': [1, .2,.7],
'col2': [.2, 1,.01],
'col3': [.7,.01,1]},
index = ['row1', 'row2', 'row3'])
values = []
ind = []
# Walk the frame row by row, then column by column
for row in pvals.index:
    for col in pvals.columns:
        values.append(pvals.loc[row, col])
        ind.append("%sX%s" % (row, col))
newpvals = pd.Series(values, index=ind)
gives:
>>> newpvals
row1Xcol1 1.00
row1Xcol2 0.20
row1Xcol3 0.70
row2Xcol1 0.20
row2Xcol2 1.00
row2Xcol3 0.01
row3Xcol1 0.70
row3Xcol2 0.01
row3Xcol3 1.00
dtype: float64
Edit: I misread the question, so I changed this to produce a Series.
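Consider melt with a new index column assignment, then select the resulting value column, since a single column of a pandas DataFrame is a pandas Series: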
Data
from io import StringIO
import pandas as pd
txt = ''' col1 col2 col3
col1 1.0 0.20 0.70
col2 0.2 1.00 0.01
col3 0.7 0.01 1.00'''
df = pd.read_table(StringIO(txt), sep=r"\s+")
Series construction
mdf = pd.melt(df.reset_index(), id_vars='index')
mdf['s'] = mdf['index'] + 'X' + mdf['variable']
new_series = mdf.set_index('s').rename_axis(None)['value']
print(new_series)
# col1Xcol1 1.00
# col2Xcol1 0.20
# col3Xcol1 0.70
# col1Xcol2 0.20
# col2Xcol2 1.00
# col3Xcol2 0.01
# col1Xcol3 0.70
# col2Xcol3 0.01
# col3Xcol3 1.00
# Name: value, dtype: float64
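The melt output above runs down each column in turn (col1Xcol1, col2Xcol1, ...), whereas the goal in the question runs across each row. One possible way to get the row-wise order is to melt the transpose instead. This is only a sketch: it builds the frame directly rather than parsing it from text, and the name row_major is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1.0, 0.2, 0.7],
                   'col2': [0.2, 1.0, 0.01],
                   'col3': [0.7, 0.01, 1.0]},
                  index=['col1', 'col2', 'col3'])

# After transposing, 'index' holds the original column labels and
# 'variable' holds the original row labels
mdf = pd.melt(df.T.reset_index(), id_vars='index')

# Label each value as rowXcol; melt now walks one original row at a time
mdf['s'] = mdf['variable'] + 'X' + mdf['index']
row_major = mdf.set_index('s').rename_axis(None)['value']
print(row_major)
```

Because the example matrix is symmetric, the values look the same in either order; on an asymmetric frame only the transposed version pairs each rowXcol label with the row-wise traversal.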
First, stack the dataframe:
st = pvals2.stack()
Create a new index by joining the two levels of the MultiIndex:
newdex = st.index.get_level_values(0) + 'X' + st.index.get_level_values(1)
Set newdex as the index of the series:
st = st.set_axis(newdex)
All together:
st = pvals2.stack()
st = st.set_axis(st.index.get_level_values(0) + 'X' + st.index.get_level_values(1))
col1Xcol1 1.00
col1Xcol2 0.20
col1Xcol3 0.70
col2Xcol1 0.20
col2Xcol2 1.00
col2Xcol3 0.01
col3Xcol1 0.70
col3Xcol2 0.01
col3Xcol3 1.00
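As a variation on the steps above, the two index levels can also be joined in one call by mapping 'X'.join over the MultiIndex tuples. This is a minimal sketch that reuses the pvals2 frame from the question and assumes every row and column label is a string, since str.join raises on non-string labels.

```python
import pandas as pd

pvals2 = pd.DataFrame({'col1': [1, .2, .7],
                       'col2': [.2, 1, .01],
                       'col3': [.7, .01, 1]},
                      index=['col1', 'col2', 'col3'])

# stack() yields a Series with a (row label, column label) MultiIndex
st = pvals2.stack()

# Each index entry is a tuple such as ('col1', 'col2');
# 'X'.join turns it into the flat label 'col1Xcol2'
st.index = st.index.map('X'.join)
print(st)
```

Mapping over the tuples avoids naming each level explicitly, which may be convenient if the frame later gains more index levels.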