
Overwriting cells from one pandas DataFrame to another based on row values

I have two datasets (blank and filled), shown below, and I need to copy information from filled into blank.

blank.head()
| Student name  | Student number    | Mark      | Grade     | Marked by     | Notes     |
|-------------- |----------------   |--------   |--------   |-----------    |--------   |
| John          |16                 | NaN       | NaN       | NaN           | NaN   |
| Mary          |19                 | NaN       | NaN       | NaN           | NaN   |
| Colm          |17                 | NaN       | NaN       | NaN           | NaN   |
| Ellen         |20                 | NaN       | NaN       | NaN           | NaN   |
| Fionna        |21                 | NaN       | NaN       | NaN           | NaN   |

filled.head()

| Student name  | Student number    | Mark  | Grade     | Marked by     | Notes                 |
|-------------- |----------------   |------ |-------    |-----------    |--------------------   |
| Tara          | 31                | 71    | B1        | JL            |        Good           |
| Leah          | 40                | 54    | C2        | CL            | Needs more dragons    |
| john          | 16                | 53    | C2        | MG            |        Good           |
| Aisling       | 200               | 60    | B3        | MOB           |    keep working       |
| Adam          | 88                | 74    | B1        | KOM           |  don't forget apa     |

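blank (df1) is the master document whose order I want to keep. filled (df2) contains the marks and other information entered for each student, but it is not in the same order as df1.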

I need to copy the 'Mark', 'Grade', 'Marked by' and 'Notes' columns from df2 to df1, keeping df1's index intact and copying the correct information for each student.

My first idea would be to use the student number as the index of both dataframes (I assume these numbers are unique) and then copy like this:

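# set_index returns a new frame, so assign the result back
blank = blank.set_index('Student number')
filled = filled.set_index('Student number')

cols = ['Mark', 'Grade', 'Marked by', 'Notes']  # avoid shadowing the built-in list
blank[cols] = filled[cols]  # rows are aligned on the shared index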

...not sure whether this works for you...
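To make the index-based idea concrete, here is a minimal self-contained sketch. It assumes the student numbers are unique in both frames. The small frames and the names `cols`, `blank_idx`, `filled_idx` and `result` are illustrative and not from the question.

```python
import numpy as np
import pandas as pd

# Small stand-ins for the question's frames (illustrative data only).
blank = pd.DataFrame({
    "Student name": ["John", "Mary", "Colm"],
    "Student number": [16, 19, 17],
    "Mark": [np.nan] * 3,
    "Grade": [np.nan] * 3,
})
filled = pd.DataFrame({
    "Student name": ["Tara", "colm", "john"],
    "Student number": [31, 17, 16],
    "Mark": [71, 60, 53],
    "Grade": ["B1", "B3", "C2"],
})

cols = ["Mark", "Grade"]

# Index both frames by the shared key; set_index returns a new frame.
blank_idx = blank.set_index("Student number")
filled_idx = filled.set_index("Student number")

# Assignment aligns on the index labels, so each student gets their own row;
# students missing from `filled` stay NaN.
blank_idx[cols] = filled_idx[cols]

# Restore the original layout: row order is unchanged, and selecting
# blank.columns puts "Student number" back in its original position.
result = blank_idx.reset_index()[blank.columns]
print(result)
```

Matching on the number rather than the name also sidesteps the "John" versus "john" spelling difference between the two frames.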

Edit: created the filled and blank dfs exactly as given in the question, because a key error with my previous code was mentioned in the comments.

Input:

import numpy as np
import pandas as pd
blank = pd.DataFrame({'Student name': ['John','Mary','Colm','Ellen','Fionna'], 'Student number': [16,19,17,20,21], 'Mark': [np.nan]*5, 'Grade': [np.nan]*5, 'Marked by': [np.nan]*5, 'Notes': [np.nan]*5})
filled = pd.DataFrame({'Student name': ['Tara','Leah','john','Aisling','Adam'], 'Student number': [31,40,16,200,88], 'Mark': [71,54,53,60,74], 'Grade': ['B1','C2','C2','B3','B1'], 'Marked by': ['JL','CL','MG','MOB','KOM'], 'Notes': ['Good','Needs more dragons','Good','keep working','dont forget apa']})

blank:

  Student name  Student number  Mark  Grade  Marked by  Notes
0         John              16   NaN    NaN        NaN    NaN
1         Mary              19   NaN    NaN        NaN    NaN
2         Colm              17   NaN    NaN        NaN    NaN
3        Ellen              20   NaN    NaN        NaN    NaN
4       Fionna              21   NaN    NaN        NaN    NaN

filled:

  Student name  Student number  Mark Grade Marked by               Notes
0         Tara              31    71    B1        JL                Good
1         Leah              40    54    C2        CL  Needs more dragons
2         john              16    53    C2        MG                Good
3      Aisling             200    60    B3       MOB        keep working
4         Adam              88    74    B1       KOM     dont forget apa

Assuming 'Student number' is the common key of both dataframes, the code is as follows:

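# how='left' keeps every row of blank in order, so the merged rows line up with blank's default index
blank[['Mark','Grade','Marked by','Notes']] = blank.merge(filled,on='Student number',how='left')[['Mark_y', 'Grade_y','Marked by_y','Notes_y']]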

Output:

  Student name  Student number  Mark Grade Marked by Notes
0         John              16  53.0    C2        MG  Good
1         Mary              19   NaN   NaN       NaN   NaN
2         Colm              17   NaN   NaN       NaN   NaN
3        Ellen              20   NaN   NaN       NaN   NaN
4       Fionna              21   NaN   NaN       NaN   NaN
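The assignment above only works when the merged rows line up with blank's index labels. A variant that avoids that dependency might look like the sketch below. The helper name `fill_from` and the small frames are hypothetical and not part of the original answer. It assumes each student number appears at most once in filled, and `validate="many_to_one"` makes pandas raise a `MergeError` if that assumption does not hold.

```python
import numpy as np
import pandas as pd

def fill_from(blank, filled, key, cols):
    """Return a copy of `blank` with `cols` taken from `filled`, matched on `key`.

    Hypothetical helper for illustration only.
    """
    # Keep only the key and the columns to copy from `filled`.
    source = filled[[key] + cols]
    # A left merge keeps every row of `blank` in its original order;
    # validate="many_to_one" raises MergeError if `key` repeats in `filled`.
    merged = blank.drop(columns=cols).merge(
        source, on=key, how="left", validate="many_to_one"
    )
    # Restore the original column order and row labels.
    merged = merged[blank.columns]
    merged.index = blank.index
    return merged

# Illustrative data: the only matching student sits in the middle of `blank`.
blank = pd.DataFrame({
    "Student name": ["Mary", "John", "Colm"],
    "Student number": [19, 16, 17],
    "Mark": [np.nan] * 3,
    "Grade": [np.nan] * 3,
})
filled = pd.DataFrame({
    "Student name": ["Tara", "john", "Leah"],
    "Student number": [31, 16, 40],
    "Mark": [71, 53, 54],
    "Grade": ["B1", "C2", "C2"],
})

result = fill_from(blank, filled, "Student number", ["Mark", "Grade"])
print(result)
```

Because the helper returns a new frame instead of assigning into blank, the original stays untouched until you decide to replace it.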
