Fill nulls in columns with non-null values from other columns

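Given a dataframe with similar columns that have null values scattered among them, how can I dynamically fill the nulls in one column with non-null values from the other columns, without explicitly stating the names of those other columns? For example, select the first column, category1, and fill its null rows with values from the other columns in the same row.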

import pandas as pd

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        'category1': [None, 21, None, 10, None, 30, 31, 45, 23, 56],
        'category2': [10, 21, 20, 10, None, 30, None, 45, 23, 56],
        'category3': [10, 21, 20, 10, None, 30, 31, 45, 23, 56]}


df = pd.DataFrame(data)
df = df.set_index('year')
df

      category1  category2  category3
year
2010        NaN         10         10
2011         21         21         21
2012        NaN         20         20
2013         10         10         10
2014        NaN        NaN        NaN
2015         30         30         30
2016         31        NaN         31
2017         45         45         45
2018         23         23         23
2019         56         56         56

After filling category1:

      category1  category2  category3
year
2010         10         10         10
2011         21         21         21
2012         20         20         20
2013         10         10         10
2014        NaN        NaN        NaN
2015         30         30         30
2016         31        NaN         31
2017         45         45         45
2018         23         23         23
2019         56         56         56

IIUC you can do it this way:

In [369]: df['category1'] = df['category1'].fillna(df['category2'])

In [370]: df
Out[370]:
      category1  category2  category3
year
2010       10.0       10.0       10.0
2011       21.0       21.0       21.0
2012       20.0       20.0       20.0
2013       10.0       10.0       10.0
2014        NaN        NaN        NaN
2015       30.0       30.0       30.0
2016       31.0        NaN       31.0
2017       45.0       45.0       45.0
2018       23.0       23.0       23.0
2019       56.0       56.0       56.0
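
The line above names category2 explicitly, while the question asks to avoid naming the other columns. One possible way to generalize it, as a sketch only, is to loop over every column except the target and call fillna with each in turn. The variable names target and other are illustrative and not from the original answer.

```python
import pandas as pd

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        'category1': [None, 21, None, 10, None, 30, 31, 45, 23, 56],
        'category2': [10, 21, 20, 10, None, 30, None, 45, 23, 56],
        'category3': [10, 21, 20, 10, None, 30, 31, 45, 23, 56]}
df = pd.DataFrame(data).set_index('year')

target = 'category1'  # illustrative name: the column whose nulls we want to fill

# Walk through every other column in order. fillna only touches cells that
# are still NaN, so earlier columns take priority over later ones.
for other in df.columns.drop(target):
    df[target] = df[target].fillna(df[other])

print(df[target])
```

Rows where every column is NaN (2014 here) stay NaN, since there is nothing to fill from.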

You can use first_valid_index, with a condition to handle rows where all values are NaN:

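def f(x):
    # Label of the first non-null value in this row, or None if the row is all NaN
    idx = x.first_valid_index()
    if idx is None:
        return None
    else:
        return x[idx]

# Apply f to each row (axis=1) and store the result in a new column 'a'
df['a'] = df.apply(f, axis=1)

print(df)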
      category1  category2  category3     a
year                                       
2010        NaN       10.0       10.0  10.0
2011       21.0       21.0       21.0  21.0
2012        NaN       20.0       20.0  20.0
2013       10.0       10.0       10.0  10.0
2014        NaN        NaN        NaN   NaN
2015       30.0       30.0       30.0  30.0
2016       31.0        NaN       31.0  31.0
2017       45.0       45.0       45.0  45.0
2018       23.0       23.0       23.0  23.0
2019       56.0       56.0       56.0  56.0
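
If the row-wise apply turns out to be slow on a large frame, a vectorized alternative that should give the same result here is to back-fill across columns and take the first column. This is a sketch under the assumption that all columns are numeric, not the method from the answer above. The name first_valid is illustrative.

```python
import pandas as pd

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        'category1': [None, 21, None, 10, None, 30, 31, 45, 23, 56],
        'category2': [10, 21, 20, 10, None, 30, None, 45, 23, 56],
        'category3': [10, 21, 20, 10, None, 30, 31, 45, 23, 56]}
df = pd.DataFrame(data).set_index('year')

# bfill(axis=1) copies, within each row, the next non-null value to the right
# into every NaN cell. After that, the first column holds the first valid
# value of each row, or NaN when the whole row is NaN.
first_valid = df.bfill(axis=1).iloc[:, 0]

df['a'] = first_valid
print(df)
```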

Try this:

df['category1'] = df['category1'].fillna(df.median(axis=1))
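
One thing to keep in mind with the median approach: it matches the expected output in the question because the non-null values in each row agree. When they differ, the median can be a number that appears in none of the columns. The small frame below is hypothetical data made up for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical values, chosen so that category2 and category3 disagree in 2011.
demo = pd.DataFrame({'category1': [float('nan'), float('nan'), 30.0],
                     'category2': [10.0, 10.0, 30.0],
                     'category3': [10.0, 20.0, 30.0]},
                    index=[2010, 2011, 2012])

# Fill with the row median (NaN cells are skipped when computing it).
by_median = demo['category1'].fillna(demo.median(axis=1))

# Fill with the first non-null value found to the right in the same row.
by_first = demo['category1'].fillna(demo.bfill(axis=1).iloc[:, 0])

print(by_median.tolist())  # [10.0, 15.0, 30.0]
print(by_first.tolist())   # [10.0, 10.0, 30.0]
```

For 2011 the median gives 15.0, which is not a value from any other column, so the median is only a safe substitute when the columns are known to agree.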

You can use pandas.DataFrame.fillna. Check the documentation, it is very clear.

