Impute a Pandas DataFrame column using the value from another row based on the ID column

Question

df:

id   name 
0    toto                    
1    tata
0    NaN

I would like to impute the missing value in the name column on the third row based on the id. The desired dataframe would be:

id   name 
0    toto                    
1    tata
0    toto

I did the following:

df.loc[df.name.isna(), "name"] = df["id"].map(df["name"])

but it is not working.

Answer 1

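import pandas as pd

df = pd.DataFrame({'id': [0, 1, 0],
                   'name': ['toto', 'tata', pd.NA]})

# Keep the rows with a known name (deduplicated), then left-merge them
# back onto the id column so every row picks up the name for its id.
df = df[['id']].merge(df[pd.notna(df['name'])].drop_duplicates(),
                      how='left',
                      on='id')
df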
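The attempt in the question passes `df["name"]` to `map`, which looks the ids up against the row index of that Series rather than against the ids themselves, so it only lines up when the two happen to coincide. A minimal sketch of a `map`-based variant follows, assuming each id has a single consistent name. The `lookup` variable is introduced here for illustration and is not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [0, 1, 0], "name": ["toto", "tata", np.nan]})

# One known name per id, indexed by id
# (assumes names are consistent within an id).
lookup = (
    df.dropna(subset=["name"])
      .drop_duplicates(subset="id")
      .set_index("id")["name"]
)

# Fill only the missing names; existing values are left untouched.
df["name"] = df["name"].fillna(df["id"].map(lookup))
print(df)
```

Unlike the merge, this keeps every other column of the frame in place and only touches rows where the name is missing.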

Answer 2

If only one value exists in each group, you can try:

df = df.groupby('id').apply(lambda g: g.ffill().bfill())

print(df)

   name
0  toto
1  tata
2  toto

Or sort the NaN values to the end:

df = (df.sort_values('name')
      .groupby('id').ffill()
      .sort_index())
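Both snippets above return a frame without the id column on some pandas versions. If the id column should be kept, one possible sketch is to compute the per-group value with `transform` and fill only the gaps. This again assumes a single distinct name per id; the `first_name` variable is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [0, 1, 0], "name": ["toto", "tata", np.nan]})

# First non-null name within each id group, broadcast back to every row.
first_name = df.groupby("id")["name"].transform("first")

# Fill only the gaps so the id column and the row order are preserved.
df["name"] = df["name"].fillna(first_name)
print(df)
```

Because `transform` returns a result aligned to the original index, no sorting or re-indexing is needed afterwards.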

Answer 1
2 accepted 2022-05-29 19:32:59

Answer 2
1 2022-05-29 19:39:50