
Fill N/A data based on value in another column

I have a CSV file with two columns, store_name and store_location, in which some store_location values are missing. I want to fill the missing values with data from the same column, based on the value in another column.

Below is my CSV file:

import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/hoatranobita/app_to_cloud_4/main/store_location.csv')

[image omitted]

Here is my expected output:

[image omitted]

I tried to find a solution but could not work one out.

Thanks.

TL;DR: here are 3 different approaches, depending on whether you want to:

  1. ensure a unique value per group

  2. fill the NaN with the first available value

  3. fill the NaN with the previous/next non-NA row

  1. It looks like you may need a unique value per group. Use groupby.transform('first') to broadcast the first non-NA value of each group to all of its rows:

df['store_location'] = df.groupby('store_name')['store_location'].transform('first')

Output:

                             store_name                         store_location
0                       AJ's Liquor III           POINT (-93.648959 42.021456)
1                       AJ's Liquor III           POINT (-93.648959 42.021456)
2                Ambysure Inc / Clinton           POINT (-90.225022 41.833351)
3                Ambysure Inc / Clinton           POINT (-90.225022 41.833351)
4                 Bancroft Liquor Store               POINT (-94.218 43.29355)
5                 Bancroft Liquor Store               POINT (-94.218 43.29355)
6                                Bani's  POINT (-92.455801 42.518018000000005)
7                  Bani's / Cedar Falls  POINT (-92.455801 42.518018000000005)
8                  Bani's / Cedar Falls  POINT (-92.455801 42.518018000000005)
9                      Barrys Mini Mart            POINT (-91.38553 43.050183)
10                 Baxter Family Market           POINT (-93.151465 41.826715)
11             Beecher Liquor / Dubuque  POINT (-90.696886 42.500775000000004)
12           Beer on Floyd / Sioux City  POINT (-96.372185 42.531448000000005)
13                  Beer Thirty Denison           POINT (-95.360162 42.012412)
14  Beer Thirty Storm Lake / Storm Lake           POINT (-95.198584 42.646794)
15  Beer Thirty Storm Lake / Storm Lake           POINT (-95.198584 42.646794)
16  Beer Thirty Storm Lake / Storm Lake           POINT (-95.198584 42.646794)
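
Before overwriting a whole group with one value, it can help to check which stores actually have more than one distinct location. The following is a minimal sketch on hypothetical toy data (the `toy` frame and its `POINT` values are made up for illustration and are not taken from the asker's CSV):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: store "B" has two different non-null locations.
toy = pd.DataFrame({
    "store_name": ["A", "A", "B", "B", "B"],
    "store_location": [np.nan, "POINT (1 1)", "POINT (2 2)", "POINT (3 3)", np.nan],
})

# Count distinct non-null locations per store; more than 1 means a conflict.
distinct = toy.groupby("store_name")["store_location"].nunique()
conflicting = distinct[distinct > 1].index.tolist()
print(conflicting)

# transform('first') broadcasts the first non-null value to every row of the
# group, so the differing "POINT (3 3)" of store "B" is overwritten.
unified = toy.groupby("store_name")["store_location"].transform("first")
print(unified.tolist())
```

If `conflicting` is empty, this approach only fills gaps; otherwise it also replaces existing values, which may or may not be what you want.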

  2. If there are different values within a group and you want to preserve them, you can replace only the NaN with the first non-NA value of the group:
df['store_location'] = df['store_location'].fillna(df.groupby('store_name')['store_location'].transform('first'))

Output:

                             store_name                         store_location
0                       AJ's Liquor III           POINT (-93.648959 42.021456)
1                       AJ's Liquor III           POINT (-93.648959 42.021456)
2                Ambysure Inc / Clinton           POINT (-90.225022 41.833351)
3                Ambysure Inc / Clinton           POINT (-90.225022 41.833351)
4                 Bancroft Liquor Store               POINT (-94.218 43.29355)
5                 Bancroft Liquor Store               POINT (-94.218 43.29355)
6                                Bani's  POINT (-92.455801 42.518018000000005)
7                  Bani's / Cedar Falls  POINT (-92.455801 42.518018000000005)
8                  Bani's / Cedar Falls  POINT (-92.455801 42.518018000000005)
9                      Barrys Mini Mart            POINT (-91.38553 43.050183)
10                 Baxter Family Market           POINT (-93.151465 41.826715)
11             Beecher Liquor / Dubuque  POINT (-90.696886 42.500775000000004)
12           Beer on Floyd / Sioux City  POINT (-96.372185 42.531448000000005)
13                  Beer Thirty Denison           POINT (-95.360162 42.012412)
14  Beer Thirty Storm Lake / Storm Lake           POINT (-95.198584 42.646794)
15  Beer Thirty Storm Lake / Storm Lake   POINT (-95.19941700000001 42.647498)
16  Beer Thirty Storm Lake / Storm Lake           POINT (-95.198584 42.646794)
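
To see that `fillna` only touches the missing rows, here is a small sketch on hypothetical toy data (the `toy` frame is an assumption for illustration, not the asker's file):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: store "B" has two different non-null locations.
toy = pd.DataFrame({
    "store_name": ["A", "A", "B", "B", "B"],
    "store_location": [np.nan, "POINT (1 1)", "POINT (2 2)", "POINT (3 3)", np.nan],
})

# First non-null location of each store, aligned to the original rows.
first = toy.groupby("store_name")["store_location"].transform("first")

# fillna only replaces the NaN rows; "POINT (3 3)" is kept as it is.
filled = toy["store_location"].fillna(first)
print(filled.tolist())

# Rows that already had a value are unchanged.
had_value = toy["store_location"].notna()
unchanged = (filled[had_value] == toy.loc[had_value, "store_location"]).all()
print(unchanged)
```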

  3. Alternatively, use the previous/next non-NA value within each group with ffill + bfill:
df['store_location'] = df.groupby('store_name')['store_location'].transform(lambda g: g.ffill().bfill())

Output:

                             store_name                         store_location
0                       AJ's Liquor III           POINT (-93.648959 42.021456)
1                       AJ's Liquor III           POINT (-93.648959 42.021456)
2                Ambysure Inc / Clinton           POINT (-90.225022 41.833351)
3                Ambysure Inc / Clinton           POINT (-90.225022 41.833351)
4                 Bancroft Liquor Store               POINT (-94.218 43.29355)
5                 Bancroft Liquor Store               POINT (-94.218 43.29355)
6                                Bani's  POINT (-92.455801 42.518018000000005)
7                  Bani's / Cedar Falls  POINT (-92.455801 42.518018000000005)
8                  Bani's / Cedar Falls  POINT (-92.455801 42.518018000000005)
9                      Barrys Mini Mart            POINT (-91.38553 43.050183)
10                 Baxter Family Market           POINT (-93.151465 41.826715)
11             Beecher Liquor / Dubuque  POINT (-90.696886 42.500775000000004)
12           Beer on Floyd / Sioux City  POINT (-96.372185 42.531448000000005)
13                  Beer Thirty Denison           POINT (-95.360162 42.012412)
14  Beer Thirty Storm Lake / Storm Lake           POINT (-95.198584 42.646794)
15  Beer Thirty Storm Lake / Storm Lake   POINT (-95.19941700000001 42.647498)
16  Beer Thirty Storm Lake / Storm Lake   POINT (-95.19941700000001 42.647498)
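
One caveat that applies to all of these group-based fills: a store with no known location at all has nothing to copy from, so its rows stay NaN. A minimal sketch with hypothetical toy data (store "C" is made up to show this case):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: store "C" has no known location at all.
toy = pd.DataFrame({
    "store_name": ["A", "A", "B", "B", "B", "C"],
    "store_location": [np.nan, "POINT (1 1)", "POINT (2 2)", "POINT (3 3)", np.nan, np.nan],
})

# Forward-fill, then back-fill, separately within each store.
filled = toy.groupby("store_name")["store_location"].transform(
    lambda s: s.ffill().bfill()
)
print(filled.tolist())

# Stores that still have missing locations after filling.
still_missing = toy.loc[filled.isna(), "store_name"].unique().tolist()
print(still_missing)
```

Here the trailing NaN of store "B" takes the previous row's value ("POINT (3 3)"), while the leading NaN of store "A" is back-filled from the next row.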


 