Fill null values in a column with non-null values from other columns

Question

Given a DataFrame with several similar columns that have null values scattered through them, how can I dynamically fill the nulls in one column with non-null values from the other columns, without explicitly naming those other columns? For example, select the first column, category1, and fill its null rows with values from the other columns in the same rows.

import pandas as pd

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        'category1': [None, 21, None, 10, None, 30, 31, 45, 23, 56],
        'category2': [10, 21, 20, 10, None, 30, None, 45, 23, 56],
        'category3': [10, 21, 20, 10, None, 30, 31, 45, 23, 56]}

df = pd.DataFrame(data)
df = df.set_index('year')
df

      category1  category2  category3
year
2010        NaN       10.0       10.0
2011       21.0       21.0       21.0
2012        NaN       20.0       20.0
2013       10.0       10.0       10.0
2014        NaN        NaN        NaN
2015       30.0       30.0       30.0
2016       31.0        NaN       31.0
2017       45.0       45.0       45.0
2018       23.0       23.0       23.0
2019       56.0       56.0       56.0

After filling category1:

      category1  category2  category3
year
2010       10.0       10.0       10.0
2011       21.0       21.0       21.0
2012       20.0       20.0       20.0
2013       10.0       10.0       10.0
2014        NaN        NaN        NaN
2015       30.0       30.0       30.0
2016       31.0        NaN       31.0
2017       45.0       45.0       45.0
2018       23.0       23.0       23.0
2019       56.0       56.0       56.0

Answer 1

IIUC, you can do it this way:

In [369]: df['category1'] = df['category1'].fillna(df['category2'])

In [370]: df
Out[370]:
      category1  category2  category3
year
2010       10.0       10.0       10.0
2011       21.0       21.0       21.0
2012       20.0       20.0       20.0
2013       10.0       10.0       10.0
2014        NaN        NaN        NaN
2015       30.0       30.0       30.0
2016       31.0        NaN       31.0
2017       45.0       45.0       45.0
2018       23.0       23.0       23.0
2019       56.0       56.0       56.0
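
The call above names category2 explicitly. As a minimal sketch of one way to avoid naming the other columns, assuming the question's data, you could back-fill across the remaining columns with bfill(axis=1) to get the first non-null value in each row and pass that to fillna. The names target, others and fallback are illustrative only and do not come from the original answer.

```python
import pandas as pd

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        'category1': [None, 21, None, 10, None, 30, 31, 45, 23, 56],
        'category2': [10, 21, 20, 10, None, 30, None, 45, 23, 56],
        'category3': [10, 21, 20, 10, None, 30, 31, 45, 23, 56]}
df = pd.DataFrame(data).set_index('year')

target = 'category1'              # column to fill (illustrative name)
others = df.columns.drop(target)  # every other column, discovered dynamically

# Back-fill along each row so that the first column of the result holds
# the first non-null value found among the other columns.
fallback = df[others].bfill(axis=1).iloc[:, 0]

# Only the null entries of the target column are replaced.
df[target] = df[target].fillna(fallback)
print(df)
```

Rows where every column is null, such as 2014, stay NaN because there is nothing to fill from.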

Answer 2

You can use first_valid_index, with a condition for rows where all values are NaN:

def f(x):
    if x.first_valid_index() is None:
        return None
    else:
        return x[x.first_valid_index()]

df['a'] = df.apply(f, axis=1)

print(df)
      category1  category2  category3     a
year                                       
2010        NaN       10.0       10.0  10.0
2011       21.0       21.0       21.0  21.0
2012        NaN       20.0       20.0  20.0
2013       10.0       10.0       10.0  10.0
2014        NaN        NaN        NaN   NaN
2015       30.0       30.0       30.0  30.0
2016       31.0        NaN       31.0  31.0
2017       45.0       45.0       45.0  45.0
2018       23.0       23.0       23.0  23.0
2019       56.0       56.0       56.0  56.0
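
The output above puts the result in a new column a and leaves category1 unchanged. If the goal is to fill category1 itself, one option is to use the row-wise result only where category1 is null. This is a sketch under the assumption that the question's data is used; the helper name first_valid is made up for illustration. Keep in mind that apply with axis=1 calls the Python function once per row, so it may be slow on large frames.

```python
import numpy as np
import pandas as pd

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        'category1': [None, 21, None, 10, None, 30, 31, 45, 23, 56],
        'category2': [10, 21, 20, 10, None, 30, None, 45, 23, 56],
        'category3': [10, 21, 20, 10, None, 30, 31, 45, 23, 56]}
df = pd.DataFrame(data).set_index('year')

def first_valid(row):
    """Return the first non-null value in the row, or NaN if there is none."""
    idx = row.first_valid_index()
    return np.nan if idx is None else row[idx]

# Compute the first non-null value per row, then use it only to fill
# the null entries of category1.
df['category1'] = df['category1'].fillna(df.apply(first_valid, axis=1))
print(df['category1'])
```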

Answer 3

Try this:

df['category1'] = df['category1'].fillna(df.median(axis=1))
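
One caveat worth checking before relying on the median: it only reproduces a value from another column when the non-null values in a row agree, as they do in the question's data. The sketch below uses made-up numbers (not from the question) to show what happens when they differ.

```python
import pandas as pd

# Hypothetical two-row frame: the other columns agree in the first row
# and disagree in the second.
demo = pd.DataFrame({'category1': [float('nan'), float('nan')],
                     'category2': [10, 10],
                     'category3': [10, 20]})

# median(axis=1) skips NaN by default, so it is computed from the
# non-null values in each row.
filled = demo['category1'].fillna(demo.median(axis=1))
print(filled.tolist())  # [10.0, 15.0]
```

In the second row the fill value 15.0 does not appear in any other column, so this approach fits only when the columns are known to hold matching values.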

Answer 4

You can use pandas.DataFrame.fillna. See the documentation, it is quite clear.

Answer 1    1    2016-06-16 15:35:53
Answer 2    0    accepted    2016-06-16 15:36:18
Answer 3    0    2016-06-16 16:15:59
Answer 4    0    2022-01-05 23:00:21
