Update table information based on the columns of another table

Question

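I'm new to Python and have two DataFrames: df1 holds every student with their group and score, and df2 holds updated information for the few students whose group and score changed. How can I update df1 with the values (group and score) from df2?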

df1

    +----+----------+-----------+----------------+
    |    |student No|   group   |       score    |
    |----+----------+-----------+----------------|
    |  0 |        0 |         0 |       0.839626 |
    |  1 |        1 |         0 |       0.845435 |
    |  2 |        2 |         3 |       0.830778 |
    |  3 |        3 |         2 |       0.831565 |
    |  4 |        4 |         3 |       0.823569 |
    |  5 |        5 |         0 |       0.808109 |
    |  6 |        6 |         4 |       0.831645 |
    |  7 |        7 |         1 |       0.851048 |
    |  8 |        8 |         3 |       0.843209 |
    |  9 |        9 |         4 |       0.84902  |
    | 10 |       10 |         0 |       0.835143 |
    | 11 |       11 |         4 |       0.843228 |
    | 12 |       12 |         2 |       0.826949 |
    | 13 |       13 |         0 |       0.84196  |
    | 14 |       14 |         1 |       0.821634 |
    | 15 |       15 |         3 |       0.840702 |
    | 16 |       16 |         0 |       0.828994 |
    | 17 |       17 |         2 |       0.843043 |
    | 18 |       18 |         4 |       0.809093 |
    | 19 |       19 |         1 |       0.85426  |
    +----+----------+-----------+----------------+

df2
+----+-----------+----------+----------------+
|    |   group   |student No|       score    |
|----+-----------+----------+----------------|
|  0 |         2 |        1 |       0.887435 |
|  1 |         0 |       19 |       0.81214  |
|  2 |         3 |       17 |       0.899041 |
|  3 |         0 |        8 |       0.853333 |
|  4 |         4 |        9 |       0.88512  |
+----+-----------+----------+----------------+

Desired result

df3

    +----+----------+-----------+----------------+
    |    |student No|   group   |       score    |
    |----+----------+-----------+----------------|
    |  0 |        0 |         0 |       0.839626 |
    |  1 |        1 |         2 |       0.887435 |
    |  2 |        2 |         3 |       0.830778 |
    |  3 |        3 |         2 |       0.831565 |
    |  4 |        4 |         3 |       0.823569 |
    |  5 |        5 |         0 |       0.808109 |
    |  6 |        6 |         4 |       0.831645 |
    |  7 |        7 |         1 |       0.851048 |
    |  8 |        8 |         0 |       0.853333 |
    |  9 |        9 |         4 |       0.88512  |
    | 10 |       10 |         0 |       0.835143 |
    | 11 |       11 |         4 |       0.843228 |
    | 12 |       12 |         2 |       0.826949 |
    | 13 |       13 |         0 |       0.84196  |
    | 14 |       14 |         1 |       0.821634 |
    | 15 |       15 |         3 |       0.840702 |
    | 16 |       16 |         0 |       0.828994 |
    | 17 |       17 |         3 |       0.899041 |
    | 18 |       18 |         4 |       0.809093 |
    | 19 |       19 |         0 |       0.81214  |
    +----+----------+-----------+----------------+

My code to update df1 from df2:

dfupdated = df1.merge(df2, how='left', on=['student No'], suffixes=('', '_new'))
dfupdated['group'] = np.where(pd.notnull(dfupdated['group_new']), dfupdated['group_new'],
                                         dfupdated['group'])
dfupdated['score'] = np.where(pd.notnull(dfupdated['score_new']), dfupdated['score_new'],
                                         dfupdated['score'])
dfupdated.drop(['group_new', 'score_new'],axis=1, inplace=True)
dfupdated.reset_index(drop=True, inplace=True)

But I get the following error:

KeyError: "['group'] not in index"

Answer 1

I'm not sure what is going wrong in your code.
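One thing that may be worth checking, purely as a guess: a KeyError like this usually means the label 'group' does not exactly match any column name, for example because the headers carry stray spaces. The sketch below uses a made-up frame with padded headers (an assumption, not your real data) to show how to spot and strip them.

```python
import pandas as pd

# Hypothetical frame whose headers carry stray spaces
# (an assumption for illustration, not the asker's real data).
df = pd.DataFrame({"student No": [0, 1], " group ": [0, 2], "score ": [0.84, 0.89]})

# repr() makes hidden whitespace in the labels visible.
print([repr(c) for c in df.columns])

has_group_before = "group" in df.columns

# Strip leading/trailing whitespace from every column label.
df.columns = df.columns.str.strip()

has_group_after = "group" in df.columns
```

If the printed labels already look clean, the cause is something else and this step can be skipped.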

Try this:

dfupdated = df1.merge(df2, on='student No', how='left')
dfupdated['group'] = dfupdated['group_y'].fillna(dfupdated['group_x'])
dfupdated['score'] = dfupdated['score_y'].fillna(dfupdated['score_x'])
dfupdated.drop(['group_x', 'group_y','score_x', 'score_y'], axis=1,inplace=True)

This will give you the result you want.
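For reference, here is a minimal self-contained sketch of the merge-then-fillna approach on a handful of rows copied from the tables in the question. The `merged` name and the `astype(int)` step are my additions: after a left merge the unmatched rows hold NaN, so `group_y` becomes a float column, and the cast restores integer group labels.

```python
import pandas as pd

# Small stand-ins for df1 and df2, using a few rows from the question.
df1 = pd.DataFrame({
    "student No": [0, 1, 8, 19],
    "group": [0, 0, 3, 1],
    "score": [0.839626, 0.845435, 0.843209, 0.85426],
})
df2 = pd.DataFrame({
    "group": [2, 0, 0],
    "student No": [1, 19, 8],
    "score": [0.887435, 0.81214, 0.853333],
})

# Left merge keeps every row of df1; overlapping columns get the
# default _x (from df1) and _y (from df2) suffixes.
merged = df1.merge(df2, on="student No", how="left")

# Prefer the df2 value (_y); fall back to the df1 value (_x) where
# df2 has no matching row.
merged["group"] = merged["group_y"].fillna(merged["group_x"]).astype(int)
merged["score"] = merged["score_y"].fillna(merged["score_x"])

dfupdated = merged.drop(columns=["group_x", "group_y", "score_x", "score_y"])
print(dfupdated)
```

Student 0 has no row in df2 and keeps its original values, while students 1, 8 and 19 pick up the new group and score.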

To get the maximum score in each group:

dfupdated.groupby(['group'], sort=False)['score'].max()
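As a small illustration of that groupby call, here is a sketch on a few rows taken from the result table in the question. The `best` and `top_rows` names are only for this example, and the `idxmax` lookup is an optional extra in case you want the whole row of the top scorer rather than just the score.

```python
import pandas as pd

# A few rows in the shape of the updated frame
# (values taken from the result table in the question).
dfupdated = pd.DataFrame({
    "student No": [0, 1, 2, 3, 17],
    "group": [0, 2, 3, 2, 3],
    "score": [0.839626, 0.887435, 0.830778, 0.831565, 0.899041],
})

# Highest score per group; sort=False keeps the groups in order of
# first appearance instead of sorting the group labels.
best = dfupdated.groupby("group", sort=False)["score"].max()
print(best)

# Optional: full rows of the top scorer in each group, looked up via
# the index labels of the per-group maxima.
top_rows = dfupdated.loc[dfupdated.groupby("group")["score"].idxmax()]
print(top_rows)
```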

Answer 1: accepted 2021-02-27 19:16:44
