Pandas dataframe fillna() only some columns in place

I am trying to fill None values in a Pandas dataframe with 0's for only some subset of columns.

When I do:

import pandas as pd
df = pd.DataFrame(data={'a':[1,2,3,None],'b':[4,5,None,6],'c':[None,None,7,8]})
print(df)
df.fillna(value=0, inplace=True)
print(df)

The output:

     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  NaN  7.0
3  NaN  6.0  8.0
     a    b    c
0  1.0  4.0  0.0
1  2.0  5.0  0.0
2  3.0  0.0  7.0
3  0.0  6.0  8.0

It replaces every None with 0. What I want to do is replace the None values in columns a and b only, but not in c.

What is the best way of doing this?

You can select your desired columns and do it by assignment:

df[['a', 'b']] = df[['a','b']].fillna(value=0)

The resulting output is as expected:

     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0

You can pass a dict to fillna to give each column its own fill value:

df.fillna({'a':0,'b':0})
Out[829]: 
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0

After assigning it back:

df=df.fillna({'a':0,'b':0})
df
Out[831]: 
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0
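
As a small illustration of the "different value for different column" part, here is a sketch using the question's dataframe. The fill value -1 for column b is an arbitrary choice made for this example only; it is not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Keys are column names, values are the fill value for that column.
# Column 'c' is not listed, so its NaN values are left alone.
# The -1 for 'b' is an illustrative value, not part of the original answer.
filled = df.fillna({'a': 0, 'b': -1})
print(filled)
```

Because inplace is not used here, df itself keeps its NaN values and only filled holds the result.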

You can avoid making a copy of the object using Wen's solution and inplace=True:

df.fillna({'a':0, 'b':0}, inplace=True)
print(df)

Which yields:

     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0

Using the top answer produces a warning about making changes to a copy of a df slice. Assuming that you have other columns, a better way to do this is to pass a dictionary:
df.fillna({'A': 'NA', 'B': 'NA'}, inplace=True)

Here's an attempt to do it all in one line:

df[['a', 'b']].fillna(value=0, inplace=True)

Breakdown: df[['a', 'b']] selects the columns you want to fill NaN values for, and value=0 tells it to fill NaNs with zero. Note, however, that df[['a', 'b']] returns a new DataFrame, so inplace=True modifies that temporary object and df itself is left unchanged (as reported further down for Pandas 0.25.1). To actually change df, assign the result back or pass a dict to df.fillna.

Or something like:

df.loc[df['a'].isnull(),'a']=0
df.loc[df['b'].isnull(),'b']=0

And if there are more columns:

for i in your_list:
    df.loc[df[i].isnull(),i]=0
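
For a longer list of columns, the same effect can be had without an explicit loop by building the fillna dict from the list. This is a minimal sketch: fill_columns is a hypothetical helper name introduced here, and your_list is assumed to hold the column names, as in the loop above.

```python
import pandas as pd

def fill_columns(df, columns, value=0):
    """Return a copy of df with NaN replaced by `value` in `columns` only.

    Hypothetical helper for illustration; not part of pandas.
    """
    # dict.fromkeys builds {'a': value, 'b': value, ...} for fillna.
    return df.fillna(dict.fromkeys(columns, value))

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})
your_list = ['a', 'b']  # assumed: the columns to fill
df = fill_columns(df, your_list)
print(df)
```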

For some odd reason this DID NOT work (using Pandas: '0.25.1'):

df[['col1', 'col2']].fillna(value=0, inplace=True)
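
A likely explanation is that selecting with a list of columns builds a new DataFrame, so the in-place fill lands on that temporary object. The sketch below checks this on a small frame; the variable names nan_after_inplace and nan_after_assign are introduced here for illustration, and the warnings filter is there only because some pandas versions emit a warning on this pattern.

```python
import warnings
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, None],
                   'col2': [1, None, 3],
                   'col3': [None, 2, 3]})

with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # some versions warn about this pattern
    # df[[...]] builds a new DataFrame; inplace=True fills that temporary object.
    df[['col1', 'col2']].fillna(value=0, inplace=True)

# Count NaN values still present in the two columns of the original df.
nan_after_inplace = int(df[['col1', 'col2']].isna().sum().sum())
print(nan_after_inplace)

# Assigning the filled selection back does update df.
df[['col1', 'col2']] = df[['col1', 'col2']].fillna(value=0)
nan_after_assign = int(df[['col1', 'col2']].isna().sum().sum())
print(nan_after_assign)
```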

Another solution:

subset_cols = ['col1', 'col2']
for col in subset_cols:
    df[col] = df[col].fillna(0)  # assign back rather than relying on inplace

Example:

import numpy as np
df = pd.DataFrame(data={'col1':[1,2,np.nan], 'col2':[1,np.nan,3], 'col3':[np.nan,2,3]})

Output:

   col1  col2  col3
0  1.00  1.00   nan
1  2.00   nan  2.00
2   nan  3.00  3.00

Loop over the subset to fill the values:

subset_cols = ['col1', 'col2']
for col in subset_cols:
    df[col] = df[col].fillna(0)  # assign back rather than relying on inplace

Output:

   col1  col2  col3
0  1.00  1.00   nan
1  2.00  0.00  2.00
2  0.00  3.00  3.00

This should work, and without the copy warning:

df[['a', 'b']] = df.loc[:,['a', 'b']].fillna(value=0)

Sometimes this syntax won't work:

df[['col1','col2']] = df[['col1','col2']].fillna()

Use the following instead:

df['col1','col2']
