Python Pandas: replace NaN in one column with the value from another column of the same row, as a list column
Input DataFrame:
import numpy as np
import pandas as pd

data = {
    'id': [70, 70, 1148, 557, 557, 104, 581, 69],
    'r_id': [[70, 34, 44, 23, 11, 71], [70, 53, 33, 73, 41],
             np.nan, np.nan, np.nan, np.nan, np.nan, [69, 68, 7]],
}
df = pd.DataFrame.from_dict(data)
print(df)
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                       NaN
3   557                       NaN
4   557                       NaN
5   104                       NaN
6   581                       NaN
7    69               [69, 68, 7]
Desired output DataFrame:
data = {
    'id': [70, 70, 1148, 557, 557, 104, 581, 69],
    'r_id': [[70, 34, 44, 23, 11, 71], [70, 53, 33, 73, 41],
             [1148], [557], [557], [104], [581], [69, 68, 7]],
}
df = pd.DataFrame.from_dict(data)
print(df)
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                    [1148]
3   557                     [557]
4   557                     [557]
5   104                     [104]
6   581                     [581]
7    69               [69, 68, 7]
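I want the target column r_id to be a list column; the source column id is not a list. I referred to the following Stack Overflow question, python-pandas-replace-nan-in-one-column, and also tried the following:
data_merge_rel.RELATED_DEVICE.fillna(data_merge_rel.DF0_Desc_Label_i.to_list(), inplace=True)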
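For context, here is a minimal sketch of why an attempt of that shape fails: fillna rejects a plain list as the fill value and raises a TypeError. The Series r_id and ids below are made-up stand-ins for the question's columns, not the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's two columns.
r_id = pd.Series([[70, 34], np.nan, np.nan])
ids = pd.Series([70, 1148, 557])

error = None
try:
    # A plain list is not accepted as the fill value.
    r_id.fillna(ids.to_list())
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

Passing a Series whose elements are lists, as the answers do, avoids this restriction.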
We can use a list comprehension + Series.fillna. First, we create a list in which every id value is wrapped in a list of its own. Then we use those lists to replace the NaN values:
df['temp'] = [[x] for x in df['id']]
df['r_id'] = df['r_id'].fillna(df['temp'])
df = df.drop(columns='temp')
Or in one line using apply (thanks to r.ook):
df['r_id'] = df['r_id'].fillna(df['id'].apply(lambda x: [x]))
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                    [1148]
3   557                     [557]
4   557                     [557]
5   104                     [104]
6   581                     [581]
7    69               [69, 68, 7]
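Put together as a self-contained sketch on the question's data, the two variants can be checked against each other. The names wrapped, filled_comp and filled_apply are illustrative only, and the list comprehension is wrapped in a Series here instead of a temporary column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [70, 70, 1148, 557, 557, 104, 581, 69],
    'r_id': [[70, 34, 44, 23, 11, 71], [70, 53, 33, 73, 41],
             np.nan, np.nan, np.nan, np.nan, np.nan, [69, 68, 7]],
})

# Variant 1: list comprehension, wrapped in a Series that shares df's index.
wrapped = pd.Series([[x] for x in df['id']], index=df.index)
filled_comp = df['r_id'].fillna(wrapped)

# Variant 2: apply, which already returns an index-aligned Series.
filled_apply = df['r_id'].fillna(df['id'].apply(lambda x: [x]))

# Both variants should agree row by row.
print(filled_comp.tolist() == filled_apply.tolist())
```

Neither call modifies df['r_id'] in place; assign the result back, as the answer does, to keep it.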
You can convert the id column to an array, add a dimension, turn the result into a list of lists, and pass it to Series.fillna, for example:
df['r_id'] = df['r_id'].fillna(pd.Series(df.id.to_numpy()[:,None].tolist(), index=df.index))
print(df)
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                    [1148]
3   557                     [557]
4   557                     [557]
5   104                     [104]
6   581                     [581]
7    69               [69, 68, 7]
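To make the two pieces of that expression concrete, here is a small sketch on made-up data. The [:, None] step turns the 1-D array into a column, so tolist() yields one single-element list per row, and index=df.index matters because fillna aligns a Series argument on index labels. The frame demo and its non-default index are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index to make alignment visible.
demo = pd.DataFrame(
    {'id': [70, 1148, 557], 'r_id': [[70, 34], np.nan, np.nan]},
    index=[10, 11, 12],
)

# Adding an axis gives shape (3, 1); tolist() then nests each value.
wrapped = demo['id'].to_numpy()[:, None].tolist()
print(wrapped)  # [[70], [1148], [557]]

# Without index=..., the fill Series is labelled 0..2 and matches no row.
misaligned = demo['r_id'].fillna(pd.Series(wrapped))
# With index=demo.index, every NaN finds its fill value.
aligned = demo['r_id'].fillna(pd.Series(wrapped, index=demo.index))

print(misaligned.isna().sum(), aligned.isna().sum())
```

On the question's frame the index is the default 0..7, so the explicit index only becomes important once rows have been filtered or re-labelled.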
Or, if you do not have many NaN values, it may be worth selecting only those rows before doing anything else:
mask_na = df.r_id.isna()
df.loc[mask_na, 'r_id'] = pd.Series(df.loc[mask_na, 'id'].to_numpy()[:, None].tolist(),
                                    index=df[mask_na].index)
I think anky_91's answer will be faster, but you can also try this:
df['r_id'] = np.where(df['r_id'].isnull(),
                      df['id'].apply(lambda x: [x]),
                      df['r_id'])
Output:
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                    [1148]
3   557                     [557]
4   557                     [557]
5   104                     [104]
6   581                     [581]
7    69               [69, 68, 7]
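As a cross-check that does not depend on fillna or np.where accepting list values, the same result can be built with a plain list comprehension over both columns. This is an alternative sketch, not one of the answers above; it assumes every non-missing r_id entry is already a list.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [70, 70, 1148, 557, 557, 104, 581, 69],
    'r_id': [[70, 34, 44, 23, 11, 71], [70, 53, 33, 73, 41],
             np.nan, np.nan, np.nan, np.nan, np.nan, [69, 68, 7]],
})

# Keep existing lists; wrap the row's id wherever r_id is not a list (NaN).
df['r_id'] = [r if isinstance(r, list) else [i]
              for i, r in zip(df['id'], df['r_id'])]

# Every entry should now be a list.
print(all(isinstance(v, list) for v in df['r_id']))
```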