Forward or backward fill NA if the values in other columns are the same

Question

Given this example:

import pandas as pd
df = pd.DataFrame({
    "date": ["20180724", "20180725", "20180731", "20180723", "20180731"],
    "identity": [None, "A123456789", None, None, None],
    "hid": [12345, 12345, 12345, 54321, 54321],
    "hospital": ["A", "A", "A", "B", "B"],
    "result": [70, None, 100, 90, 78]
})

Because the first three rows have the same hid and hospital, the values in identity should also be identical. The other two rows also share the same hid and hospital, but no known identity was provided for them, so their identity values should remain missing. In other words, the desired output is:

       date    identity    hid hospital  result
0  20180724  A123456789  12345        A    70.0
1  20180725  A123456789  12345        A     NaN
2  20180731  A123456789  12345        A   100.0
3  20180723        None  54321        B    90.0
4  20180731        None  54321        B    78.0

I can loop through all combinations of hid and hospital with for hid, hospital in df[["hid", "hospital"]].drop_duplicates().itertuples(index=False), but I don't know what to do next.

Answer 1

Use groupby and apply in combination with ffill and bfill:

df['identity'] = df.groupby(['hid', 'hospital'])['identity'].apply(lambda x: x.ffill().bfill())

This fills missing values forward and then backward within each (hid, hospital) group, so a known identity never spreads from one group into another. A group with no known identity at all stays missing.
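One caveat, as far as I can tell: depending on the pandas version, apply on a grouped column may return a result whose index also carries the group keys, and that result would not line up with df for assignment. Below is a minimal sketch of a variant that uses transform instead, which returns a result aligned to the original index. It reuses the example frame from the question; the variable name filled is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["20180724", "20180725", "20180731", "20180723", "20180731"],
    "identity": [None, "A123456789", None, None, None],
    "hid": [12345, 12345, 12345, 54321, 54321],
    "hospital": ["A", "A", "A", "B", "B"],
    "result": [70, None, 100, 90, 78],
})

# transform returns a Series indexed like df, so it can be
# assigned straight back to the column.
filled = df.groupby(["hid", "hospital"])["identity"].transform(
    lambda s: s.ffill().bfill()  # fill forward, then backward, per group
)
df["identity"] = filled

print(df[["hid", "hospital", "identity"]])
```

The first three rows should now all carry A123456789, while the two rows for hid 54321 stay missing because their group has nothing to fill from.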

Answer 1: accepted, 2018-08-28 09:58:07
