Overwriting values in a pandas dataframe based on NA values from a second one
Overwriting cells from one pandas dataframe to another based on row values
I have two datasets (blank and filled), shown below, and I need to copy information from filled to blank.
blank.head()
| Student name | Student number | Mark | Grade | Marked by | Notes |
|-------------- |---------------- |-------- |-------- |----------- |-------- |
| John |16 | NaN | NaN | NaN | NaN |
| Mary |19 | NaN | NaN | NaN | NaN |
| Colm |17 | NaN | NaN | NaN | NaN |
| Ellen |20 | NaN | NaN | NaN | NaN |
| Fionna |21 | NaN | NaN | NaN | NaN |
filled.head()
| Student name | Student number | Mark | Grade | Marked by | Notes |
|-------------- |---------------- |------ |------- |----------- |-------------------- |
| Tara | 31 | 71 | B1 | JL | Good |
| Leah | 40 | 54 | C2 | CL | Needs more dragons |
| john | 16 | 53 | C2 | MG | Good |
| Aisling | 200 | 60 | B3 | MOB | keep working |
| Adam | 88 | 74 | B1 | KOM | don't forget apa |
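blank (df1) is the master document whose row order I want to keep. filled (df2) contains the marks and other information entered for each student, but its rows are not in the same order as df1.
I need to copy the "Mark", "Grade", "Marked by" and "Notes" columns from df2 to df1, keeping df1's index intact and copying the correct information for each student.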
My first thought is to use the student number as the index of both dataframes (I assume these numbers are unique) and then copy like this:
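# set_index returns a new DataFrame, so assign the result back
blank = blank.set_index('Student number')
filled = filled.set_index('Student number')
cols = ['Mark', 'Grade', 'Marked by', 'Notes']  # avoids shadowing the built-in list
blank[cols] = filled[cols]  # rows are aligned on the shared index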
...not sure whether this will work for you...
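The index-based idea above can be sketched end to end as follows. This is only a minimal sketch, assuming "Student number" is unique in both frames. The helper name `copy_by_index` is hypothetical, and the sample frames are the question's data trimmed to three rows and reordered so that the matching student is not in the first row.

```python
import numpy as np
import pandas as pd


def copy_by_index(blank, filled, key, cols):
    """Copy `cols` from `filled` into a copy of `blank`, matching rows on `key`.

    Hypothetical helper for illustration; assumes `key` is unique in both frames.
    """
    if not (blank[key].is_unique and filled[key].is_unique):
        raise ValueError(f"{key!r} must be unique in both frames")
    # set_index returns a new frame, so the results are kept in new variables.
    left = blank.set_index(key)
    right = filled.set_index(key)
    # Reorder filled's rows to follow blank's keys; keys missing from
    # filled become rows of NaN.
    left[cols] = right[cols].reindex(left.index)
    # Restore the key as a column, plus blank's column order and index.
    out = left.reset_index()[list(blank.columns)]
    out.index = blank.index
    return out


cols = ['Mark', 'Grade', 'Marked by', 'Notes']
# The question's data, trimmed and reordered so the match is not in row 0.
blank = pd.DataFrame({
    'Student name': ['Mary', 'John', 'Colm'],
    'Student number': [19, 16, 17],
    'Mark': np.nan, 'Grade': np.nan, 'Marked by': np.nan, 'Notes': np.nan,
})
filled = pd.DataFrame({
    'Student name': ['Tara', 'Leah', 'john'],
    'Student number': [31, 40, 16],
    'Mark': [71, 54, 53],
    'Grade': ['B1', 'C2', 'C2'],
    'Marked by': ['JL', 'CL', 'MG'],
    'Notes': ['Good', 'Needs more dragons', 'Good'],
})

result = copy_by_index(blank, filled, 'Student number', cols)
print(result)
```

Matching on the student number rather than the name also sidesteps the "John" vs "john" spelling difference in the sample data.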
Edit: I created the filled and blank dfs exactly as given in the question, because a comment on my earlier code mentioned a KeyError.
Input:
import numpy as np
import pandas as pd

blank = pd.DataFrame({
    'Student name': ['John', 'Mary', 'Colm', 'Ellen', 'Fionna'],
    'Student number': [16, 19, 17, 20, 21],
    'Mark': [np.nan] * 5,
    'Grade': [np.nan] * 5,
    'Marked by': [np.nan] * 5,
    'Notes': [np.nan] * 5,
})
filled = pd.DataFrame({
    'Student name': ['Tara', 'Leah', 'john', 'Aisling', 'Adam'],
    'Student number': [31, 40, 16, 200, 88],
    'Mark': [71, 54, 53, 60, 74],
    'Grade': ['B1', 'C2', 'C2', 'B3', 'B1'],
    'Marked by': ['JL', 'CL', 'MG', 'MOB', 'KOM'],
    'Notes': ['Good', 'Needs more dragons', 'Good', 'keep working', 'dont forget apa'],
})
blank:
   Student name  Student number  Mark  Grade  Marked by  Notes
0          John              16   NaN    NaN        NaN    NaN
1          Mary              19   NaN    NaN        NaN    NaN
2          Colm              17   NaN    NaN        NaN    NaN
3         Ellen              20   NaN    NaN        NaN    NaN
4        Fionna              21   NaN    NaN        NaN    NaN
filled:
   Student name  Student number  Mark  Grade  Marked by               Notes
0          Tara              31    71     B1         JL                Good
1          Leah              40    54     C2         CL  Needs more dragons
2          john              16    53     C2         MG                Good
3       Aisling             200    60     B3        MOB        keep working
4          Adam              88    74     B1        KOM     dont forget apa
Assuming "Student number" is the common key of the two dataframes, the code is as follows:
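# how='left' keeps all rows of blank in order (the default is an inner join)
merged = blank.merge(filled, on='Student number', how='left')
blank[['Mark', 'Grade', 'Marked by', 'Notes']] = merged[['Mark_y', 'Grade_y', 'Marked by_y', 'Notes_y']]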
Output:
   Student name  Student number  Mark  Grade  Marked by  Notes
0          John              16  53.0     C2         MG   Good
1          Mary              19   NaN    NaN        NaN    NaN
2          Colm              17   NaN    NaN        NaN    NaN
3         Ellen              20   NaN    NaN        NaN    NaN
4        Fionna              21   NaN    NaN        NaN    NaN
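The same merge idea can be wrapped in a small reusable function. The following is a hedged sketch, not a definitive implementation: it assumes "Student number" is unique in filled, uses a left join so every row of blank is kept in its original order, and selects only the needed columns from filled so that no `_x`/`_y` suffixes appear. The helper name `fill_from` is hypothetical.

```python
import numpy as np
import pandas as pd


def fill_from(blank, filled, key, cols):
    """Return a copy of `blank` with `cols` taken from `filled`, matched on `key`.

    Hypothetical helper for illustration; assumes `key` is unique in `filled`.
    """
    # Drop the empty target columns first so the merge adds no _x/_y suffixes.
    base = blank.drop(columns=cols)
    # how='left' keeps every row of blank in its original order;
    # validate raises pandas.errors.MergeError if key repeats in filled.
    merged = base.merge(filled[[key] + cols], on=key, how='left',
                        validate='many_to_one')
    # merge returns a fresh 0..n-1 index, so put blank's index back.
    merged.index = blank.index
    return merged[list(blank.columns)]


cols = ['Mark', 'Grade', 'Marked by', 'Notes']
blank = pd.DataFrame({
    'Student name': ['John', 'Mary', 'Colm', 'Ellen', 'Fionna'],
    'Student number': [16, 19, 17, 20, 21],
    'Mark': np.nan, 'Grade': np.nan, 'Marked by': np.nan, 'Notes': np.nan,
})
filled = pd.DataFrame({
    'Student name': ['Tara', 'Leah', 'john', 'Aisling', 'Adam'],
    'Student number': [31, 40, 16, 200, 88],
    'Mark': [71, 54, 53, 60, 74],
    'Grade': ['B1', 'C2', 'C2', 'B3', 'B1'],
    'Marked by': ['JL', 'CL', 'MG', 'MOB', 'KOM'],
    'Notes': ['Good', 'Needs more dragons', 'Good', 'keep working', 'dont forget apa'],
})

result = fill_from(blank, filled, 'Student number', cols)
# Same call on blank in reverse order: John is now the last row, not the first.
reversed_result = fill_from(blank.iloc[::-1], filled, 'Student number', cols)
print(result)
print(reversed_result)
```

Running it on the reversed frame is a quick way to check that the values follow the student number and not the row position.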