How to fill null values in a dataframe based on a column in another dataframe?

Question

I have a dataframe called df1:

ID     Value       Name      Score
-1      10           A         -1
-1       5           B         -1
NaN     0.2       Track C     100
NaN     0.5       Track C     200
1        0           D        100
5        0           D        200

I want to fill the NaN values in column ID using dataframe df2: each row with a missing ID should be expanded into one row per ID that df2 lists for the same Score.

df2:

Score    ID
100      1
100      2
100      3
100      4
200      5
200      6
200      7

Ultimately, my final dataframe df3 should look like this:

ID     Value       Name      Score
-1      10           A         -1
-1       5           B         -1
1       0.2       Track C     100
2       0.2       Track C     100
3       0.2       Track C     100
4       0.2       Track C     100
5       0.5       Track C     200
6       0.5       Track C     200
7       0.5       Track C     200
1        0           D        100
5        0           D        200

How could I accomplish this?

Answer 1

I have a solution, but it is not elegant, so I would ask experienced users to take a look at it.

To make it easier for others, here is the code to set up the test case:

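import re

import numpy as np
import pandas as pd

# Columns are separated by two or more spaces, so "Track C" stays one value.
# Note that every value is read as a string.
df1 = pd.DataFrame(
    columns='ID     Value       Name      Score'.split(),
    data=[
        re.split(r'\s{2,}', line) for line in
        """
-1      10           A         -1
-1       5           B         -1
NaN     0.2       Track C     100
NaN     0.5       Track C     200
1        0           D        100
5        0           D        200
""".strip().split('\n')
    ],
)

# Turn the literal string 'NaN' into a real missing value.
df1 = df1.replace({'NaN': np.nan})

df2 = pd.DataFrame(
    columns='Score    ID'.split(),
    data=[
        re.split(r'\s{2,}', line) for line in
        """
100      1
100      2
100      3
100      4
200      5
200      6
200      7
""".strip().split('\n')
    ],
)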

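And my solution is:

# The natural first reaction is pd.merge().
# The hurdle is how to handle the fillna of the "ID" column afterwards.
# This works, but it is too hard coded.

# Both frames have an "ID" column, so the merge yields ID_x (df1) and ID_y (df2).
df = pd.merge(left=df1, right=df2, on='Score', how='left')

# Keep the existing ID where there is one, otherwise take the ID from df2.
df['ID'] = df['ID_x'].fillna(df['ID_y'])

# Rows that already had an ID were repeated by the merge; drop the repeats.
finalresult = df.drop(columns=['ID_x', 'ID_y']).drop_duplicates(subset=['ID', 'Name'])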

Output:

   Value     Name Score  ID
0     10        A    -1  -1
1      5        B    -1  -1
2    0.2  Track C   100   1
3    0.2  Track C   100   2
4    0.2  Track C   100   3
5    0.2  Track C   100   4
6    0.5  Track C   200   5
7    0.5  Track C   200   6
8    0.5  Track C   200   7
9      0        D   100   1
13     0        D   200   5
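
For anyone who wants to run the merge-and-fillna idea end to end, here is a minimal sketch. It assumes numeric columns rather than the string columns that the parsing-based setup produces, and the names merged and result are only illustrative. One caveat: drop_duplicates on ID and Name would also collapse any genuinely repeated (ID, Name) rows in df1, which is probably part of why the approach feels hard coded.

```python
import numpy as np
import pandas as pd

# Numeric versions of the question's frames (an assumption; the parsing-based
# setup produces string columns instead).
df1 = pd.DataFrame({
    "ID": [-1, -1, np.nan, np.nan, 1, 5],
    "Value": [10, 5, 0.2, 0.5, 0, 0],
    "Name": ["A", "B", "Track C", "Track C", "D", "D"],
    "Score": [-1, -1, 100, 200, 100, 200],
})
df2 = pd.DataFrame({
    "Score": [100, 100, 100, 100, 200, 200, 200],
    "ID": [1, 2, 3, 4, 5, 6, 7],
})

# Left merge on Score; the clashing ID columns become ID_x (df1) and ID_y (df2).
merged = pd.merge(df1, df2, on="Score", how="left")

# Prefer the ID that df1 already had, fall back to the one from df2.
merged["ID"] = merged["ID_x"].fillna(merged["ID_y"])

# Rows that already had an ID were repeated once per df2 match; drop the repeats.
result = (
    merged.drop(columns=["ID_x", "ID_y"])
    .drop_duplicates(subset=["ID", "Name"])
    .reset_index(drop=True)
)
print(result)
```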

Answer 2

You can first use pandas.merge, then use pandas.concat to concatenate both dataframes along axis=0:

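# df1 and df2 are the dataframes from the question.
# Note: drop_duplicates('ID') keeps the first match per ID, so this relies on
# the NaN rows coming before the "D" rows with the same Score in df1.
s = pd.merge(df2, df1, on='Score', how='left', suffixes=('', '_2'))\
      .drop('ID_2', axis=1)\
      .drop_duplicates('ID')

df3 = pd.concat([df1.dropna(), s], ignore_index=True)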

Output

print(df3)
     ID     Name  Score  Value
0  -1.0        A     -1   10.0
1  -1.0        B     -1    5.0
2   1.0        D    100    0.0
3   5.0        D    200    0.0
4   1.0  Track C    100    0.2
5   2.0  Track C    100    0.2
6   3.0  Track C    100    0.2
7   4.0  Track C    100    0.2
8   5.0  Track C    200    0.5
9   6.0  Track C    200    0.5
10  7.0  Track C    200    0.5

Answer 3

Split your df, then use merge and concat to put it back together:

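df1_1 = df1.loc[df1.ID.isnull()].copy()   # rows whose ID must be filled
df1_2 = df1.loc[df1.ID.notnull()].copy()  # rows that already have an ID
# drop(columns=...) replaces the positional axis argument, which pandas 2.0 removed.
df1_1 = df1_1.reset_index().drop(columns='ID').merge(df2, on='Score', how='left').set_index('index')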

yourdf=pd.concat([df1_1,df1_2],sort=False).sort_index()
yourdf
Out[645]: 
   Value    Name  Score   ID
0   10.0       A     -1 -1.0
1    5.0       B     -1 -1.0
2    0.2  TrackC    100  1.0
2    0.2  TrackC    100  2.0
2    0.2  TrackC    100  3.0
2    0.2  TrackC    100  4.0
3    0.5  TrackC    200  5.0
3    0.5  TrackC    200  6.0
3    0.5  TrackC    200  7.0
4    0.0       D    100  1.0
5    0.0       D    200  5.0
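
The same split, merge and concat idea can be wrapped in a small function. This is only a sketch under a few assumptions: the frames are numeric, fill_missing_ids and the temporary _row column are hypothetical names, and the result gets a fresh 0..n-1 index instead of the repeated labels shown in the output above.

```python
import numpy as np
import pandas as pd


def fill_missing_ids(df1, df2, key="Score", target="ID"):
    """Hypothetical helper: expand rows whose `target` is NaN using df2."""
    missing = df1[df1[target].isna()].drop(columns=target)
    present = df1[df1[target].notna()]
    # Merge only the rows that need filling, remembering each row's original
    # label in a temporary column so the original order can be restored.
    filled = (
        missing.assign(_row=missing.index)
        .merge(df2, on=key, how="left")
        .set_index("_row")
    )
    # mergesort is stable, so expanded rows keep the order they have in df2.
    out = pd.concat([filled, present], sort=False).sort_index(kind="mergesort")
    return out[df1.columns].reset_index(drop=True)


# Numeric versions of the question's frames (an assumption for this sketch).
df1 = pd.DataFrame({
    "ID": [-1, -1, np.nan, np.nan, 1, 5],
    "Value": [10, 5, 0.2, 0.5, 0, 0],
    "Name": ["A", "B", "Track C", "Track C", "D", "D"],
    "Score": [-1, -1, 100, 200, 100, 200],
})
df2 = pd.DataFrame({
    "Score": [100, 100, 100, 100, 200, 200, 200],
    "ID": [1, 2, 3, 4, 5, 6, 7],
})

result = fill_missing_ids(df1, df2)
print(result)
```

Because only the NaN rows are merged, rows that already have an ID are never duplicated, so no drop_duplicates step is needed. A NaN row whose Score does not appear in df2 simply keeps its NaN.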

Answer 1 3 2019-04-26 22:30:40

Answer 2 2 2019-04-26 22:30:16

Answer 3 0 Accepted 2019-04-26 22:30:21
