
Python: Filling dataframe values based on column index present in another dataframe value

I have 2 dataframes like below. Dataframe df1:

id  val1    val2    val3    val4    val5
abc 0.0 1.0 4.0 3.0 4.0
dsssd   0.0 1.0 1.0 1.0 1.0
dsd 0.0 4.0 7.0 

Another dataframe df2:

id  val1    val2    val3    val4    val5
abc 88 76 55 43 21
dsssd   92.4 21.3 22 45 49
dsd 22.3 87.2 78.2

df1 contains column indexes as values. I want to create df3, which holds, for each cell, the value found in df2 at that column index in the same row. Expected result df3:

id  val1    val2    val3    val4    val5
abc 88  76  21  43  21
dsssd   92.4    21.3    21.3    21.3    21.3
dsd 22.3    nan nan 

I have explored df.lookup and iloc, but couldn't work out how this can be done. I am still looking for a solution; meanwhile I am posting it here in case anyone knows how it's done.

import pandas as pd
import numpy as np

df1= pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                   'val1': [0.0, 0.0, 0.0],
                   'val2': [1.0, 1.0, 4.0],
                   'val3': [4.0, 1.0, 7.0],
                   'val4': [3.0, 1.0, np.nan],
                   'val5': [4.0, 1.0, np.nan]})


df2= pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                   'val1': [88.0, 92.4, 22.3],
                   'val2': [76.0, 21.3, 87.2],
                   'val3': [55.0, 22.0, 78.2],
                   'val4': [43.0, 45.0, np.nan],
                   'val5': [21.0, 49.0, np.nan]})

Thanks!

You can use DataFrame.set_index with DataFrame.stack to reshape both frames, add a counter column with GroupBy.cumcount, left join with DataFrame.merge, and finally pivot with DataFrame.pivot, restoring the original order of id with DataFrame.reindex:

# Long format of df1: one row per (id, column) holding the position to look up
df11 = df1.set_index('id').stack().rename_axis(index=['id','v']).reset_index(name='idx')
# print(df11)

# Long format of df2, numbering each id's values 0, 1, 2, ...
df22 = df2.set_index('id').stack().rename_axis(index=['id','v']).reset_index(name='val')
df22['idx'] = df22.groupby('id').cumcount()
# print(df22)

# Match positions to values, then reshape back to the wide layout of df1
df = (df11.merge(df22, on=['id','idx'], how='left')
          .pivot(index='id', columns='v_x', values='val')
          .reindex(df1['id'])
          .rename_axis(None, axis=1)
          .reset_index()
          )
print(df)
     id  val1  val2  val3  val4  val5
0   abs  88.0  76.0  21.0  43.0  21.0
1  dssd  92.4  21.3  21.3  21.3  21.3
2   dsd  22.3   NaN   NaN   NaN   NaN
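
For comparison, the same lookup can probably also be done positionally with NumPy instead of reshaping. The sketch below is not part of the answer above: lookup_by_position is a hypothetical helper name, and it assumes the positions in df1 are whole numbers stored as floats, that both frames have unique id values, and that a missing or out-of-range position should give NaN.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [0.0, 0.0, 0.0],
                    'val2': [1.0, 1.0, 4.0],
                    'val3': [4.0, 1.0, 7.0],
                    'val4': [3.0, 1.0, np.nan],
                    'val5': [4.0, 1.0, np.nan]})

df2 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [88.0, 92.4, 22.3],
                    'val2': [76.0, 21.3, 87.2],
                    'val3': [55.0, 22.0, 78.2],
                    'val4': [43.0, 45.0, np.nan],
                    'val5': [21.0, 49.0, np.nan]})


def lookup_by_position(idx_df, val_df, key='id'):
    """Hypothetical helper: swap each position stored in idx_df for the
    value at that position in the row of val_df with the same key."""
    # Align the rows of val_df to the row order of idx_df.
    vals = val_df.set_index(key).reindex(idx_df[key]).to_numpy(dtype=float)
    idx = idx_df.drop(columns=key).to_numpy(dtype=float)

    # A position is usable only if it is not NaN and falls inside the row.
    valid = ~np.isnan(idx) & (idx >= 0) & (idx < vals.shape[1])
    # Unusable positions are temporarily pointed at column 0, then masked.
    safe_idx = np.where(valid, idx, 0).astype(int)

    picked = np.take_along_axis(vals, safe_idx, axis=1)
    picked[~valid] = np.nan

    out = pd.DataFrame(picked, columns=idx_df.columns.drop(key))
    out.insert(0, key, idx_df[key].to_numpy())
    return out


df3 = lookup_by_position(df1, df2)
print(df3)
```

Unlike the stack and cumcount version, this indexes df2 by actual column position, so a NaN in the middle of a df2 row would not shift the positions of the values after it.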

Use merge:

https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html

pd.merge(df1, df2, how='outer', left_on='id', right_on='id',
         left_index=False, right_index=False, sort=True,
         suffixes=('_x', '_y'), copy=True, indicator=False,
         validate=None)
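
As a rough illustration of what this call returns on the question's frames (a sketch only, using on='id' as shorthand for the left_on/right_on pair and leaving the other arguments at their defaults):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [0.0, 0.0, 0.0],
                    'val2': [1.0, 1.0, 4.0],
                    'val3': [4.0, 1.0, 7.0],
                    'val4': [3.0, 1.0, np.nan],
                    'val5': [4.0, 1.0, np.nan]})

df2 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [88.0, 92.4, 22.3],
                    'val2': [76.0, 21.3, 87.2],
                    'val3': [55.0, 22.0, 78.2],
                    'val4': [43.0, 45.0, np.nan],
                    'val5': [21.0, 49.0, np.nan]})

# One row per id; df1's columns get the _x suffix and df2's get _y.
merged = pd.merge(df1, df2, how='outer', on='id', sort=True,
                  suffixes=('_x', '_y'))
print(merged.columns.tolist())
```

The merge on its own only places the positions (val*_x) next to the values (val*_y) for each id; picking the value at each position still needs a separate lookup step.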


 