Fill empty values with the same value from another column in a pandas DataFrame

Question

I have a pandas DataFrame like the following:

How do I fill the empty cells with the policy number that already exists for the same product type?

Any suggestion would be very much appreciated. Thank you.

Sorry for the confusion, I am adding my sample DataFrame now:

import pandas as pd

sample = [{'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': '433M49763', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': '433M86968', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '566D158635 ', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '655D158635', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '789D158635', 'PRODUCT TYPE': 'TED'}]

df = pd.DataFrame(sample)

Please note that the empty cells contain " " (a single space); they are not NaN anywhere in the DataFrame.
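A quick way to confirm that the blanks are whitespace rather than NaN, as a minimal sketch on a shortened copy of the sample (the variable names `n_nan`, `is_blank` and `n_blank` are only illustrative):

```python
import pandas as pd

# Shortened copy of the sample: blank cells hold a single space, not NaN.
df = pd.DataFrame({
    'POLICY NUMBER': [' ', ' ', '433M49763', '433M86968'],
    'PRODUCT TYPE': ['MED', 'MED', 'MED', 'MED'],
})

# isna() finds nothing, because ' ' is an ordinary string.
n_nan = int(df['POLICY NUMBER'].isna().sum())

# Stripping whitespace first turns ' ' into '', which can then be detected.
is_blank = df['POLICY NUMBER'].str.strip().eq('')
n_blank = int(is_blank.sum())

print(n_nan, n_blank)
```

This is why any fill based on `fillna`, `ffill` or `bfill` has no effect until the blanks are converted to NaN.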

Adding to the question above: given the altered DataFrame above, how do I get to the following DataFrame:

Answer 1

I think you need groupby + transform:

If each group should share a single value and the missing data are empty strings:

df['POLICY NUMBER'] = (df.groupby('PRODUCT TYPE')['POLICY NUMBER']
                         .transform(lambda x: x[x != ''].iat[0]))

print (df)
  POLICY NUMBER PRODUCT TYPE
0     433M86968          MED
1     433M86968          MED
2     433M86968          MED
3     433M86968          MED
4    566D158635          TED
5    566D158635          TED
6    566D158635          TED
7    566D158635          TED
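Note that `x[x != ''].iat[0]` raises an IndexError when a group has no non-empty value at all. A minimal sketch of a guarded variant follows; the helper name `first_non_empty` and the 'DEN' group are hypothetical and not part of the question's data:

```python
import numpy as np
import pandas as pd

def first_non_empty(group):
    """Return the first non-empty string in the group, or NaN if there is none."""
    non_empty = group[group != '']
    return non_empty.iat[0] if len(non_empty) else np.nan

# Hypothetical data: the 'DEN' group has no policy number at all.
df = pd.DataFrame({
    'POLICY NUMBER': ['', '433M86968', '', '566D158635', ''],
    'PRODUCT TYPE': ['MED', 'MED', 'TED', 'TED', 'DEN'],
})

# transform broadcasts the scalar returned for each group to every row of that group.
df['POLICY NUMBER'] = (df.groupby('PRODUCT TYPE')['POLICY NUMBER']
                         .transform(first_non_empty))
print(df)
```

Groups without any value end up as NaN instead of stopping the whole computation.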

Or, if the cells are not always empty strings but sometimes contain leading or trailing whitespace, you need str.strip first:

df['POLICY NUMBER'] = (df['POLICY NUMBER'].str.strip().groupby(df['PRODUCT TYPE'])
                                  .transform(lambda x: x[x != ''].iat[0]))

print (df)
  POLICY NUMBER PRODUCT TYPE
0     433M86968          MED
1     433M86968          MED
2     433M86968          MED
3     433M86968          MED
4    566D158635          TED
5    566D158635          TED
6    566D158635          TED
7    566D158635          TED
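As a sketch, here is the same strip-then-transform approach run on the sample list posted in the question, where blanks are single spaces and one policy number carries a trailing space. The intermediate name `cleaned` is only illustrative:

```python
import pandas as pd

sample = [{'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': '433M49763', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': '433M86968', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '566D158635 ', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '655D158635', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '789D158635', 'PRODUCT TYPE': 'TED'}]
df = pd.DataFrame(sample)

# Strip whitespace so ' ' becomes '' and '566D158635 ' loses its trailing space.
cleaned = df['POLICY NUMBER'].str.strip()

# Broadcast the first non-empty value of each product type to the whole group.
df['POLICY NUMBER'] = (cleaned.groupby(df['PRODUCT TYPE'])
                              .transform(lambda x: x[x != ''].iat[0]))
print(df)
```

Because this sample has several distinct policy numbers per product type, every row receives the first one in its group, so the other numbers are overwritten.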

A solution that sorts the values and then transforms with the last value of each group:

df['POLICY NUMBER'] = (df.sort_values(['PRODUCT TYPE','POLICY NUMBER'])
                         .groupby('PRODUCT TYPE')['POLICY NUMBER']
                         .transform('last'))
print (df)
  POLICY NUMBER PRODUCT TYPE
0     433M86968          MED
1     433M86968          MED
2     433M86968          MED
3     433M86968          MED
4    566D158635          TED
5    566D158635          TED
6    566D158635          TED
7    566D158635          TED
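Sorting places blank strings before any policy number, so `'last'` picks the lexicographically greatest value in each group. A minimal sketch on the sample list from the question, which has several distinct numbers per group, shows which value wins:

```python
import pandas as pd

sample = [{'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': '433M49763', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': '433M86968', 'PRODUCT TYPE': 'MED'},
          {'POLICY NUMBER': ' ', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '566D158635 ', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '655D158635', 'PRODUCT TYPE': 'TED'},
          {'POLICY NUMBER': '789D158635', 'PRODUCT TYPE': 'TED'}]
df = pd.DataFrame(sample)

# ' ' sorts before digits, so the last row of each sorted group is a real number.
# The result keeps the original index labels, so the assignment aligns row by row.
df['POLICY NUMBER'] = (df.sort_values(['PRODUCT TYPE', 'POLICY NUMBER'])
                         .groupby('PRODUCT TYPE')['POLICY NUMBER']
                         .transform('last'))
print(df)
```

This matches the intended result only when each group holds a single distinct policy number; otherwise the greatest string replaces the others.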

EDIT: You need to replace empty strings with NaNs, then use bfill to back fill the NaNs, followed by ffill to forward fill any that remain:

import numpy as np

df['POLICY NUMBER'] = (df['POLICY NUMBER'].str.strip()
                                          .replace('',np.nan)
                                          .groupby(df['PRODUCT TYPE'])
                                          .transform(lambda x: x.bfill().ffill()))

print (df)
  POLICY NUMBER PRODUCT TYPE
0     433M49763          MED
1     433M49763          MED
2     433M49763          MED
3     433M86968          MED
4    566D158635          TED
5    566D158635          TED
6    566D158635          TED
7    789D158635          TED
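To see what each fill direction contributes, here is a minimal sketch on hypothetical data (a single group with blanks at the start, in the middle and at the end; the policy numbers are reused from the question for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical single-group frame with leading, interior and trailing blanks.
df = pd.DataFrame({
    'POLICY NUMBER': [' ', '433M49763', ' ', '433M86968', ' '],
    'PRODUCT TYPE': ['MED'] * 5,
})

df['POLICY NUMBER'] = (df['POLICY NUMBER'].str.strip()
                                          .replace('', np.nan)
                                          .groupby(df['PRODUCT TYPE'])
                                          .transform(lambda x: x.bfill().ffill()))
print(df)
```

bfill gives leading and interior blanks the next policy number below them; the trailing blank has nothing below it, so ffill then copies the previous number into it. Both fills stay inside each product type because they run per group.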

Answer 1: score 2, accepted, 2017-11-14 18:29:26
