Pandas dataframe fillna() only some columns in place
I am trying to fill None values in a Pandas dataframe with 0s, but only for a subset of columns.
When I do:
import pandas as pd
df = pd.DataFrame(data={'a':[1,2,3,None],'b':[4,5,None,6],'c':[None,None,7,8]})
print(df)
df.fillna(value=0, inplace=True)
print(df)
The output:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 NaN 7.0
3 NaN 6.0 8.0
a b c
0 1.0 4.0 0.0
1 2.0 5.0 0.0
2 3.0 0.0 7.0
3 0.0 6.0 8.0
It replaces every None with 0. What I want to do is replace None only in columns a and b, but not in c.
What is the best way of doing this?
You can select your desired columns and do it by assignment:
df[['a', 'b']] = df[['a','b']].fillna(value=0)
The resulting output is as expected:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 0.0 7.0
3 0.0 6.0 8.0
You can pass a dict to fillna, which lets you give a different fill value for each column:
df.fillna({'a':0,'b':0})
Out[829]:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 0.0 7.0
3 0.0 6.0 8.0
After assigning it back:
df=df.fillna({'a':0,'b':0})
df
Out[831]:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 0.0 7.0
3 0.0 6.0 8.0
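If the list of columns is long, the dictionary can be built from the list rather than typed out. A minimal sketch, assuming the same sample frame as in the question; cols_to_fill is a name introduced here purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Hypothetical name: the columns whose missing values should become 0.
cols_to_fill = ['a', 'b']

# dict.fromkeys builds {'a': 0, 'b': 0}; fillna only touches the listed columns.
df = df.fillna(dict.fromkeys(cols_to_fill, 0))
print(df)
```

Column c is not a key in the dictionary, so its NaN values are left alone.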
You can avoid making a copy of the object using Wen's solution and inplace=True:
df.fillna({'a':0, 'b':0}, inplace=True)
print(df)
Which yields:
a b c
0 1.0 4.0 NaN
1 2.0 5.0 NaN
2 3.0 0.0 7.0
3 0.0 6.0 8.0
Using the top answer produces a warning about making changes to a copy of a df slice. Assuming that you have other columns, a better way to do this is to pass a dictionary:
df.fillna({'A': 'NA', 'B': 'NA'}, inplace=True)
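That warning typically shows up when df is itself a slice of another DataFrame. A sketch of one way around it, assuming a hypothetical parent frame named full with a made-up keep column; taking an explicit copy first makes df an independent object before filling:

```python
import pandas as pd

# Hypothetical parent frame, only for illustration.
full = pd.DataFrame({'A': ['x', None, 'z', 'w'],
                     'B': [None, 'y', 'z', None],
                     'keep': [True, True, False, True]})

# .copy() makes df an independent object rather than a slice of full.
df = full[full['keep']].copy()

# Fill only columns A and B; other columns are left as they are.
df = df.fillna({'A': 'NA', 'B': 'NA'})
print(df)
```

The parent frame full keeps its missing values; only df is filled.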
Here's how you might try to do it all in one line:
df[['a', 'b']].fillna(value=0, inplace=True)
Breakdown: df[['a', 'b']] selects the columns you want to fill NaN values for, value=0 tells it to fill NaNs with zero, and inplace=True is meant to make the change permanent without making a copy of the object. Be aware, though, that df[['a', 'b']] returns a new DataFrame, so the fill is applied to that temporary object and df itself is left unchanged.
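A quick way to check what the one-liner does to df is to count the missing values afterwards. This is only a sketch on the sample frame from the question; the warnings filter is there because some pandas versions warn about this pattern, and remaining is a name made up for the demo:

```python
import warnings
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

with warnings.catch_warnings():
    # Some pandas versions warn about this pattern; silence it for the demo.
    warnings.simplefilter('ignore')
    # df[['a', 'b']] builds a new DataFrame, so the fill happens on that
    # temporary object, not on df.
    df[['a', 'b']].fillna(value=0, inplace=True)

# Number of missing values left in each column of the original frame.
remaining = df.isna().sum()
print(remaining)
```

Columns a and b still report a missing value each, which matches the report further down that this form did not work.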
Or something like:
df.loc[df['a'].isnull(),'a']=0
df.loc[df['b'].isnull(),'b']=0
And if there are more columns:
for i in your_list:
    df.loc[df[i].isnull(), i] = 0
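For reference, the loop can be run end to end on the sample frame from the question. A small sketch, where your_list is simply assumed to hold the column names to fill:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Stand-in for your_list above: the columns to fill.
your_list = ['a', 'b']

for col in your_list:
    # Boolean mask of the missing rows in this column, then assign 0 there.
    df.loc[df[col].isnull(), col] = 0

print(df)
```

Because the assignment goes through df.loc on the original frame, the change lands in df directly.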
For some odd reason this DID NOT work (using Pandas: '0.25.1'):
df[['col1', 'col2']].fillna(value=0, inplace=True)
Another solution:
subset_cols = ['col1','col2']
[df[col].fillna(0, inplace=True) for col in subset_cols]
Example:
import numpy as np
df = pd.DataFrame(data={'col1':[1,2,np.nan], 'col2':[1,np.nan,3], 'col3':[np.nan,2,3]})
Output:
col1 col2 col3
0 1.00 1.00 nan
1 2.00 nan 2.00
2 nan 3.00 3.00
Apply the list comprehension to fill the values:
subset_cols = ['col1','col2']
[df[col].fillna(0, inplace=True) for col in subset_cols]
Output:
col1 col2 col3
0 1.00 1.00 nan
1 2.00 0.00 2.00
2 0.00 3.00 3.00
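Calling fillna(..., inplace=True) on df[col] is a form of chained assignment. As far as I know, recent pandas versions (2.2 and later) warn about it, and with Copy-on-Write enabled it no longer modifies df. A sketch of the same idea written with plain assignment, which does not depend on that behaviour:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, np.nan],
                   'col2': [1, np.nan, 3],
                   'col3': [np.nan, 2, 3]})

subset_cols = ['col1', 'col2']

for col in subset_cols:
    # Assign the filled column back instead of relying on inplace=True.
    df[col] = df[col].fillna(0)

print(df)
```

A plain for loop is used here instead of a list comprehension because the loop is run only for its side effect.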
This should work without the copy warning:
df[['a', 'b']] = df.loc[:,['a', 'b']].fillna(value=0)
Sometimes this syntax won't work:
df[['col1','col2']] = df[['col1','col2']].fillna()
Use the following instead:
df['col1','col2']