Replace missing values in all columns except one in pandas dataframe
I have a pandas dataframe with 10 columns and I want to fill missing values for all columns except one (let's say that column is called test). Currently, if I do this:
df.fillna(df.median(), inplace=True)
It replaces NA values in all columns with the median value. How do I exclude specific column(s) without specifying all the other columns?
You can use pd.DataFrame.drop to help out:
df.drop(columns='unwanted_column').fillna(df.median())
Or pd.Index.difference:
df.loc[:, df.columns.difference(['unwanted_column'])].fillna(df.median())
Or just:
df.loc[:, df.columns != 'unwanted_column']
Note that the input to difference should be passed as an array-like, such as a list (edited).
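The selections above return a new frame, so the filled result has to be written back if the excluded column should stay in df untouched. A minimal sketch, assuming a small made-up frame whose excluded column is called test as in the question (the data and the other column names are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration; 'test' is the column to leave alone.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, 4.0, 6.0],
    "test": [np.nan, 1.0, np.nan],
})

# Every column label except 'test'.
cols = df.columns.difference(["test"])

# Fill only those columns with their own medians and assign the result back.
df[cols] = df[cols].fillna(df[cols].median())

print(df)
```

Index.difference returns the labels sorted, which does not matter here because the right-hand side is built from the same labels.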
Just select whatever columns you want using pandas' column indexing:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A': [np.nan, 5, 2, np.nan, 3], 'B': [np.nan, 4, 3, 5, np.nan], 'C': [np.nan, 4, 3, 2, 1]})
>>> df
     A    B    C
0  NaN  NaN  NaN
1  5.0  4.0  4.0
2  2.0  3.0  3.0
3  NaN  5.0  2.0
4  3.0  NaN  1.0
>>> cols = ['A', 'B']
>>> df[cols] = df[cols].fillna(df[cols].median())
>>> df
     A    B    C
0  3.0  4.0  NaN
1  5.0  4.0  4.0
2  2.0  3.0  3.0
3  3.0  5.0  2.0
4  3.0  4.0  1.0
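If the frame also holds non-numeric columns, the median is only defined for the numeric ones. The following is a sketch under that assumption, not part of the original answer: it adds a hypothetical text column called name and builds the column list by exclusion with select_dtypes instead of typing it out.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 5, 2, np.nan, 3],
    "B": [np.nan, 4, 3, 5, np.nan],
    "C": [np.nan, 4, 3, 2, 1],
    "name": ["x", None, "y", "z", None],  # hypothetical text column
})

# Numeric columns only, minus the one to leave untouched ('C' here).
cols = df.select_dtypes(include="number").columns.difference(["C"])

# Fill the selected columns with their own medians; 'C' and 'name' are unchanged.
df[cols] = df[cols].fillna(df[cols].median())

print(df)
```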
data_rnr['CO BORROWER NAME'] = data_rnr['CO BORROWER NAME'].fillna("NO")
This replaces the NA values in the column data_rnr['CO BORROWER NAME'] with the string "NO".
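For per-column replacement values, fillna also accepts a dict that maps column names to fill values, so columns not listed are left alone. A small sketch with made-up data (the frame contents and the AMOUNT column are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame; only the column name 'CO BORROWER NAME' comes from the post.
data_rnr = pd.DataFrame({
    "CO BORROWER NAME": ["Smith", None, "Lee"],
    "AMOUNT": [100.0, np.nan, 250.0],
})

# Only the listed column is filled; 'AMOUNT' keeps its NaN.
data_rnr = data_rnr.fillna({"CO BORROWER NAME": "NO"})

print(data_rnr)
```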