Python: fill dataframe values based on the column index present in another dataframe's values

Question

I have 2 dataframes like below. Dataframe df1:

id     val1  val2  val3  val4  val5
abc    0.0   1.0   4.0   3.0   4.0
dsssd  0.0   1.0   1.0   1.0   1.0
dsd    0.0   4.0   7.0

Another dataframe df2:

id     val1  val2  val3  val4  val5
abc    88    76    55    43    21
dsssd  92.4  21.3  22    45    49
dsd    22.3  87.2  78.2

df1 contains column positions (0-based) as its values. I want to create df3, where each cell holds the value found in the same row of df2 at the position given by df1. Expected result df3:

id     val1  val2  val3  val4  val5
abc    88    76    21    43    21
dsssd  92.4  21.3  21.3  21.3  21.3
dsd    22.3  nan   nan

I have explored df.lookup and iloc, but couldn't work out how to do this. I am still looking for a solution; meanwhile I am posting it here in case anyone knows how it's done.

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                   'val1': [0.0, 0.0, 0.0],
                   'val2': [1.0, 1.0, 4.0],
                   'val3': [4.0, 1.0, 7.0],
                   'val4': [3.0, 1.0, np.nan],
                   'val5': [4.0, 1.0, np.nan]})


df2 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                   'val1': [88.0, 92.4, 22.3],
                   'val2': [76.0, 21.3, 87.2],
                   'val3': [55.0, 22.0, 78.2],
                   'val4': [43.0, 45.0, np.nan],
                   'val5': [21.0, 49.0, np.nan]})

Thanks!

Answer 1

You can use DataFrame.set_index with DataFrame.stack to reshape both frames, add a counter column with GroupBy.cumcount, left join with DataFrame.merge, and finally pivot back with DataFrame.pivot, restoring the original order of id with DataFrame.reindex:

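# Reshape df1 to long form: one row per (id, column) holding the requested position
df11 = df1.set_index('id').stack().rename_axis(index=['id','v']).reset_index(name='idx')
# print (df11)

# Reshape df2 the same way, then number each id's values 0, 1, 2, ... by position
df22 = df2.set_index('id').stack().rename_axis(index=['id','v']).reset_index(name='val')
df22['idx'] = df22.groupby('id').cumcount()
# print (df22)

# Left join on (id, position), then pivot back to wide form in df1's original id order
df = (df11.merge(df22, on=['id','idx'], how='left')
          .pivot(index='id', columns='v_x', values='val')
          .reindex(df1['id'])
          .rename_axis(None, axis=1)
          .reset_index()
          )
print(df)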
     id  val1  val2  val3  val4  val5
0   abs  88.0  76.0  21.0  43.0  21.0
1  dssd  92.4  21.3  21.3  21.3  21.3
2   dsd  22.3   NaN   NaN   NaN   NaN
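
As a cross-check on the result above, here is a minimal sketch of a different technique: a direct positional lookup with NumPy's take_along_axis instead of the stack/merge/pivot approach. It is not part of the answer. The helper name lookup_by_position is hypothetical, and the sketch assumes that id is unique in both frames, that both frames have the same value columns, and that the positions in df1 are whole numbers. Missing or out-of-range positions give NaN.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [0.0, 0.0, 0.0],
                    'val2': [1.0, 1.0, 4.0],
                    'val3': [4.0, 1.0, 7.0],
                    'val4': [3.0, 1.0, np.nan],
                    'val5': [4.0, 1.0, np.nan]})

df2 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [88.0, 92.4, 22.3],
                    'val2': [76.0, 21.3, 87.2],
                    'val3': [55.0, 22.0, 78.2],
                    'val4': [43.0, 45.0, np.nan],
                    'val5': [21.0, 49.0, np.nan]})


def lookup_by_position(df1, df2, key='id'):
    """Hypothetical helper: pick df2 values by the 0-based positions stored in df1."""
    idx = df1.set_index(key)
    # Align df2's rows to df1's row order (assumes unique keys).
    vals = df2.set_index(key).reindex(idx.index)

    pos = idx.to_numpy(dtype=float)
    src = vals.to_numpy(dtype=float)

    # A position is usable only if it is not NaN and lies inside df2's columns.
    valid = ~np.isnan(pos) & (pos >= 0) & (pos < src.shape[1])
    # Replace unusable positions with 0 so the integer cast and lookup are safe.
    safe = np.where(valid, pos, 0).astype(int)

    picked = np.take_along_axis(src, safe, axis=1)
    out = np.where(valid, picked, np.nan)  # mask the placeholder lookups
    return pd.DataFrame(out, index=idx.index, columns=idx.columns).reset_index()


df3 = lookup_by_position(df1, df2)
print(df3)
```

On the sample frames this should give the same values as the pivot-based result. It also does not depend on how stack treats NaN, which may matter when missing values are not all at the end of a row.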

Answer 2

Use merge:

https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html

pd.merge(df1, df2, how='outer', left_on='id', right_on='id',
         left_index=False, right_index=False, sort=True,
         suffixes=('_x', '_y'), copy=True, indicator=False,
         validate=None)
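
For reference, a minimal sketch of what an outer merge on id returns for the question's sample frames, using only the on, how and suffixes arguments from the call above. As far as I can tell, the merge alone only places the two frames side by side, with df1's positions in the _x columns and df2's values in the _y columns, so a positional lookup step would still be needed afterwards to build df3.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [0.0, 0.0, 0.0],
                    'val2': [1.0, 1.0, 4.0],
                    'val3': [4.0, 1.0, 7.0],
                    'val4': [3.0, 1.0, np.nan],
                    'val5': [4.0, 1.0, np.nan]})

df2 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [88.0, 92.4, 22.3],
                    'val2': [76.0, 21.3, 87.2],
                    'val3': [55.0, 22.0, 78.2],
                    'val4': [43.0, 45.0, np.nan],
                    'val5': [21.0, 49.0, np.nan]})

# Outer merge on the shared key; overlapping column names get the suffixes.
merged = pd.merge(df1, df2, how='outer', on='id', suffixes=('_x', '_y'))

# One row per id: df1's positions end in _x, df2's values end in _y.
print(merged.columns.tolist())
print(merged.shape)
```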

2 answers

Answer 1: score 1, accepted, 2020-09-25 10:59:15

Answer 2: score 0, 2020-09-25 10:57:16
