Reshaping a pandas DataFrame while doing a merge
I have a pandas.DataFrame (df) and some metadata with an ID, a Column and a Value that I want to combine with another df, for example:
import pandas as pd

df_map = pd.DataFrame({"ID" : [3, 7, 17], "Column" : ["A1", "B7", "C17"],
                       "Value" : ["ValA1", "ValB7", "ValC17"]})
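I want to combine the above (for lack of a better word) with the df below, where the column names match the row entries in Column above, and the row values in the df below match the ID row values above.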
df_main = pd.DataFrame({"A1" : [3, 6], "A5" : [5, 10], "B7" : [7, 14] ,
"C17" : [17, 34], "C19" : [19, 38] })
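So I would like to merge these two df's in such a way that the Values are added as an extra dimension for the IDs, reshaping on the Value column, i.e. df_result = combine(df_map, df_main)
I would basically expect the result to look like this: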
df_result = pd.DataFrame({"A1" : [3, 6], "A5" : [5, 10], "B7" : [7, 14] ,
"C17" : [17, 34], "C19" : [19, 38], "Value A1" : ["ValA1", None],
"Value B7" : ["ValB7", None], "Value C17" : ["ValC17", None ]})
Out[30]:
   A1  A5  B7  C17  C19 Value A1 Value B7 Value C17
0   3   5   7   17   19    ValA1    ValB7    ValC17
1   6  10  14   34   38     None     None      None
I am not sure what the best way to do this in pandas is.
First use DataFrame.melt, with the index converted to a column so that it is not lost in DataFrame.merge with a left join. Then reshape back with DataFrame.set_index and DataFrame.unstack, remove only the columns that are entirely missing with DataFrame.dropna, and finally flatten the MultiIndex in the columns with map:
df = (df_main.reset_index()
.melt('index',var_name='Column', value_name='ID')
.merge(df_map, how='left')
.set_index(['index', 'Column'])
.unstack()
.rename_axis(None)
.dropna(how='all', axis=1))
df.columns = df.columns.map('_'.join)
print(df)
   ID_A1  ID_A5  ID_B7  ID_C17  ID_C19 Value_A1 Value_B7 Value_C17
0      3      5      7      17      19    ValA1    ValB7    ValC17
1      6     10     14      34      38      NaN      NaN       NaN
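The melt/merge result prefixes the original columns with ID_, whereas the question asked for the original column names plus "Value <name>" columns. The following is a minimal sketch of one way to get that shape, wrapped in the combine function the question hints at. The function body, the use of DataFrame.pivot and DataFrame.join, and the intermediate names (long_df, values) are assumptions for illustration and not part of the original answer. It also assumes that each (Column, ID) pair occurs at most once in df_map; duplicates would make the pivot fail.

```python
import pandas as pd

df_map = pd.DataFrame({"ID": [3, 7, 17], "Column": ["A1", "B7", "C17"],
                       "Value": ["ValA1", "ValB7", "ValC17"]})
df_main = pd.DataFrame({"A1": [3, 6], "A5": [5, 10], "B7": [7, 14],
                        "C17": [17, 34], "C19": [19, 38]})


def combine(df_map, df_main):
    """Hypothetical helper: append 'Value <col>' columns to df_main."""
    # Long format: one row per (row index, column name, cell value).
    long_df = (df_main.reset_index()
                      .melt("index", var_name="Column", value_name="ID")
                      .merge(df_map, how="left", on=["Column", "ID"]))
    # Back to wide format, keeping only the looked-up Value entries.
    values = (long_df.pivot(index="index", columns="Column", values="Value")
                     .dropna(how="all", axis=1)
                     .add_prefix("Value ")
                     .rename_axis(index=None, columns=None))
    # df_main keeps its original column names; the new columns are appended.
    return df_main.join(values)


df_result = combine(df_map, df_main)
print(df_result)
```

Joining the pivoted values back onto df_main avoids renaming the ID_ columns afterwards; unmatched cells come out as NaN rather than None.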
An alternative solution with Series.map and pandas.concat:
id_to_value = df_map.set_index('ID')['Value']
# Map the IDs in each row to their Value, then restore the original orientation.
df2 = pd.concat([df_main.T[key].map(id_to_value) for key in df_main.index],
                axis=1).T.add_prefix('Value_')
df_main = pd.concat([df_main, df2], axis=1)
df_main = df_main.dropna(how='all', axis=1)
print(df_main)
   A1  A5  B7  C17  C19 Value_A1 Value_B7 Value_C17
0   3   5   7   17   19    ValA1    ValB7    ValC17
1   6  10  14   34   38      NaN      NaN       NaN
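One difference between the two approaches is worth noting: the merge-based solution matches on both Column and ID, while the Series.map solution looks up the ID alone, so an ID that happens to appear under a different column would still pick up a Value. The sketch below illustrates this with a hypothetical frame, df_clash, in which column A5 also contains the ID 3. The frame and the dictionary-based lookup are assumptions made for illustration and are not part of the original answers.

```python
import pandas as pd

df_map = pd.DataFrame({"ID": [3, 7, 17], "Column": ["A1", "B7", "C17"],
                       "Value": ["ValA1", "ValB7", "ValC17"]})
# Hypothetical data: the ID 3 appears under A5 as well as under A1.
df_clash = pd.DataFrame({"A1": [3, 6], "A5": [3, 10], "B7": [7, 14]})

# Lookup keyed on ID only: the 3 in column A5 also receives "ValA1".
id_to_value = df_map.set_index("ID")["Value"]
by_id = df_clash.apply(lambda col: col.map(id_to_value))

# Lookup keyed on (Column, ID): a cell matches only if its column name agrees.
pair_to_value = dict(zip(zip(df_map["Column"], df_map["ID"]), df_map["Value"]))
by_pair = df_clash.apply(
    lambda col: pd.Series([pair_to_value.get((col.name, v)) for v in col],
                          index=col.index)
)

print(by_id)
print(by_pair)
```

With the data from the question the two lookups agree, because no ID occurs under a column other than the one listed in df_map.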