Replace missing values in a row when another row matches on two columns, using Pandas

Question

I'm working on a data analysis project and have a dataframe that looks like this:

id	store	long	lat
1	A	1	-4
2	NaN	2	3
3	C	4	5
4	D	2	3

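I want to fill the missing value (NaN) in the "store" column with the value from the row with id 4, because the rows with id 2 and 4 have the same values in the "long" and "lat" columns. The output should look like this: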

id	store	long	lat
1	A	1	-4
2	D	2	3
3	C	4	5
4	D	2	3

I want to do this for a long dataframe (almost a million rows), so I don't know in advance which row ids share the same "long" and "lat" values.

I'm working in Python with Pandas. The only solution I have come up with uses for loops and iterrows(), which is very slow:

df_missing_names = df[df['store'].isna()]  # rows with a missing store name
df_with_names = df[df['store'].notna()]  # rows that have a store name

for indx, row in df_missing_names.iterrows():  # loop over the rows without a name

    for indx_j, row_j in df_with_names.iterrows():  # loop over the rows with a name

        if (row.lat == row_j.lat) and (row.long == row_j.long):  # both lat and long match
            df.loc[indx, 'store'] = row_j.store  # update the name in the original dataframe

Is there a faster way to do this using built-in Pandas functions? Thank you for your help.

Answer 1

You can use:

df['store'] = df.groupby(['long', 'lat'], sort=False).bfill()['store']

Output:

   id store  long  lat
0   1     A     1   -4
1   2     D     2    3
2   3     C     4    5
3   4     D     2    3
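Note that bfill() only copies a name from a later row in the same (long, lat) group, so a missing row that comes after its named match would stay NaN. A minimal sketch of a direction-independent variant is shown below, assuming each (long, lat) pair maps to a single store name. The fifth row (id 5) is a hypothetical addition to the question's data, included only to show the case bfill() would miss, and first_name is an illustrative variable name.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus a hypothetical row (id 5) whose
# named match (id 3) appears BEFORE it in the frame.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "store": ["A", np.nan, "C", "D", np.nan],
    "long": [1, 2, 4, 2, 4],
    "lat": [-4, 3, 5, 3, 5],
})

# First non-null store name in each (long, lat) group, aligned to df's index.
first_name = df.groupby(["long", "lat"], sort=False)["store"].transform("first")

# Fill only the gaps; names that are already present are left unchanged.
df["store"] = df["store"].fillna(first_name)

print(df)
```

Here the row with id 2 picks up "D" from the later row with id 4, and the row with id 5 picks up "C" from the earlier row with id 3. If a location could carry several different names, this sketch would simply take the first one encountered, which may or may not be what you want.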

Solution 1
0 2022-08-25 19:52:01