Overwriting cells from one pandas DataFrame to another based on row values
I have two data sets (blank and filled), as shown below, and I need to copy information from filled into blank.
blank.head()
| Student name | Student number | Mark | Grade | Marked by | Notes |
|-------------- |---------------- |-------- |-------- |----------- |-------- |
| John |16 | NaN | NaN | NaN | NaN |
| Mary |19 | NaN | NaN | NaN | NaN |
| Colm |17 | NaN | NaN | NaN | NaN |
| Ellen |20 | NaN | NaN | NaN | NaN |
| Fionna |21 | NaN | NaN | NaN | NaN |
filled.head()
| Student name | Student number | Mark | Grade | Marked by | Notes |
|-------------- |---------------- |------ |------- |----------- |-------------------- |
| Tara | 31 | 71 | B1 | JL | Good |
| Leah | 40 | 54 | C2 | CL | Needs more dragons |
| john | 16 | 53 | C2 | MG | Good |
| Aisling | 200 | 60 | B3 | MOB | keep working |
| Adam | 88 | 74 | B1 | KOM | don't forget apa |
blank (df1) is my main document, with the row order I want to maintain. filled (df2) contains the grades and other information filled in for each student, but it is not in the same order as blank.
I need to copy the columns 'Mark', 'Grade', 'Marked by' and 'Notes' from filled to blank, keeping the index of blank intact and copying the right information for each student.
My first thought was to use the student number as the index of both data frames (I'm guessing those are unique numbers) and then just copy the columns like this:
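# set_index returns a new DataFrame, so assign the result back
blank = blank.set_index('Student number')
filled = filled.set_index('Student number')
# avoid naming a variable "list", which shadows the built-in
cols = ['Mark', 'Grade', 'Marked by', 'Notes']
# the assignment aligns rows on the shared index (the student number)
blank[cols] = filled[cols]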
...not sure if this would work for you though...
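The index-based idea above can be sketched end to end as follows. This is a minimal sketch, assuming 'Student number' is unique in both frames; the frames are rebuilt from the tables in the question, and the names cols, blank_idx, filled_idx, original_columns, and result are illustrative, not from the original post. It works on indexed copies so that blank's original row order and column layout can be restored afterwards.

```python
import numpy as np
import pandas as pd

cols = ["Mark", "Grade", "Marked by", "Notes"]

# Frames rebuilt from the tables in the question.
blank = pd.DataFrame({
    "Student name": ["John", "Mary", "Colm", "Ellen", "Fionna"],
    "Student number": [16, 19, 17, 20, 21],
    **{c: [np.nan] * 5 for c in cols},
})
filled = pd.DataFrame({
    "Student name": ["Tara", "Leah", "john", "Aisling", "Adam"],
    "Student number": [31, 40, 16, 200, 88],
    "Mark": [71, 54, 53, 60, 74],
    "Grade": ["B1", "C2", "C2", "B3", "B1"],
    "Marked by": ["JL", "CL", "MG", "MOB", "KOM"],
    "Notes": ["Good", "Needs more dragons", "Good", "keep working", "don't forget apa"],
})

# Index both frames by the key (assumed unique), then copy the columns.
# pandas aligns each assigned column on the index, so row order in filled is irrelevant.
original_columns = blank.columns
blank_idx = blank.set_index("Student number")
filled_idx = filled.set_index("Student number")
blank_idx[cols] = filled_idx[cols]

# Restore the original column layout; the row order of blank is unchanged.
result = blank_idx.reset_index()[original_columns]
print(result)
```

Matching on the number rather than the name also sidesteps the "John" / "john" capitalization difference between the two tables.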
EDIT: Created the filled and blank DataFrames exactly as provided in the question, since a KeyError was mentioned in the comment on my earlier code.
Input:
import numpy as np
import pandas as pd
blank = pd.DataFrame({'Student name': ['John', 'Mary', 'Colm', 'Ellen', 'Fionna'], 'Student number': [16, 19, 17, 20, 21], 'Mark': [np.nan] * 5, 'Grade': [np.nan] * 5, 'Marked by': [np.nan] * 5, 'Notes': [np.nan] * 5})
filled = pd.DataFrame({'Student name': ['Tara', 'Leah', 'john', 'Aisling', 'Adam'], 'Student number': [31, 40, 16, 200, 88], 'Mark': [71, 54, 53, 60, 74], 'Grade': ['B1', 'C2', 'C2', 'B3', 'B1'], 'Marked by': ['JL', 'CL', 'MG', 'MOB', 'KOM'], 'Notes': ['Good', 'Needs more dragons', 'Good', 'keep working', 'dont forget apa']})
Blank:
   Student name  Student number  Mark  Grade  Marked by  Notes
0          John              16   NaN    NaN        NaN    NaN
1          Mary              19   NaN    NaN        NaN    NaN
2          Colm              17   NaN    NaN        NaN    NaN
3         Ellen              20   NaN    NaN        NaN    NaN
4        Fionna              21   NaN    NaN        NaN    NaN
Filled:
   Student name  Student number  Mark  Grade  Marked by               Notes
0          Tara              31    71     B1         JL                Good
1          Leah              40    54     C2         CL  Needs more dragons
2          john              16    53     C2         MG                Good
3       Aisling             200    60     B3        MOB        keep working
4          Adam              88    74     B1        KOM     dont forget apa
Assuming that 'Student number' is the common key for both data frames, the code is below:
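# how='left' keeps one row per row of blank, in blank's order; the default inner merge keeps only matches and renumbers them from 0
blank[['Mark', 'Grade', 'Marked by', 'Notes']] = blank.merge(filled, on='Student number', how='left')[['Mark_y', 'Grade_y', 'Marked by_y', 'Notes_y']]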
Output:
   Student name  Student number  Mark  Grade  Marked by  Notes
0          John              16  53.0     C2         MG   Good
1          Mary              19   NaN    NaN        NaN    NaN
2          Colm              17   NaN    NaN        NaN    NaN
3         Ellen              20   NaN    NaN        NaN    NaN
4        Fionna              21   NaN    NaN        NaN    NaN
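The merge-then-assign step lines rows up by index label, so it only puts values in the right rows when the merged frame has exactly one row per row of blank, in blank's order. Below is a minimal sketch of one way to make that explicit, assuming 'Student number' is unique within filled. The three-row frames are a deliberately reordered subset of the question's data (so the matching student is not in the first row), and the names cols, merged, and matched are illustrative, not from the original post.

```python
import numpy as np
import pandas as pd

cols = ["Mark", "Grade", "Marked by", "Notes"]

# Assumption for illustration: blank is reordered so John is NOT the first row.
blank = pd.DataFrame({
    "Student name": ["Mary", "John", "Colm"],
    "Student number": [19, 16, 17],
    **{c: [np.nan] * 3 for c in cols},
})
filled = pd.DataFrame({
    "Student name": ["Tara", "Leah", "john"],
    "Student number": [31, 40, 16],
    "Mark": [71, 54, 53],
    "Grade": ["B1", "C2", "C2"],
    "Marked by": ["JL", "CL", "MG"],
    "Notes": ["Good", "Needs more dragons", "Good"],
})

# Left merge: one output row per row of blank, in blank's order.
# validate="many_to_one" raises pandas.errors.MergeError if a student
# number appears more than once in filled.
merged = blank[["Student number"]].merge(
    filled[["Student number"] + cols],
    on="Student number",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# merged has the same length and order as blank, so reuse blank's index
# labels; the assignment then aligns correctly even if blank's index is
# not the default 0..n-1.
merged.index = blank.index
blank[cols] = merged[cols]

# Number of rows in blank that found a match in filled.
matched = int((merged["_merge"] == "both").sum())
print(blank)
print("matched rows:", matched)
```

Here matched counts the rows of blank that found a partner in filled, which is a quick sanity check before overwriting anything.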