Selecting non-None values from a dataframe column

I would like to use the fillna function to fill the None values of a column with that column's most frequent value that is not None or NaN.

Input DF:

Col_A
a
None
None
c
c
d
d

The output DataFrame could be:

Col_A
a
c
c
c
c
d
d

Any suggestion would be much appreciated. Many thanks, best regards, Carlo

Prelude: If your None is actually a string, you can save yourself some headaches by getting rid of those strings first. Use replace:

import numpy as np

df = df.replace('None', np.nan)
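To see why the prelude matters, here is a minimal sketch. The frame is a made-up variant of the question's data that mixes a real None with the string 'None'. Only the real None counts as missing until the string is replaced.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: index 1 holds a real None, index 2 the string 'None'.
df = pd.DataFrame({"Col_A": ["a", None, "None", "c", "c", "d", "d"]})

# Before the replacement, pandas treats only the real None as missing.
missing_before = int(df["Col_A"].isna().sum())

# After the replacement, the string 'None' has become NaN as well.
cleaned = df.replace("None", np.nan)
missing_after = int(cleaned["Col_A"].isna().sum())

print(missing_before, missing_after)
```

If the two counts differ, the column contained 'None' strings that fillna would otherwise have skipped.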

I believe you could use fillna + value_counts:

df

  Col_A
0     a
1   NaN
2   NaN
3     c
4     c
5     d
6     d

df.fillna(df.Col_A.value_counts(sort=False).index[0])

  Col_A
0     a
1     c
2     c
3     c
4     c
5     d
6     d

Or, with Vaishali's suggestion, use idxmax to pick c:

df.fillna(df.Col_A.value_counts(sort=False).idxmax())

  Col_A
0     a
1     c
2     c
3     c
4     c
5     d
6     d

The fill value could be either c or d, since both occur twice. Which one you get depends on whether you include sort=False or not.

Details

df.Col_A.value_counts(sort=False)

c    2
a    1
d    2
Name: Col_A, dtype: int64
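With sort=False the counts are not ordered by frequency, so index[0] is not guaranteed to be the most frequent label. idxmax picks a label with the highest count regardless of the order. The sketch below rebuilds the example frame and fills by column name. The names counts, fill_value and filled are only illustrative, and this is one possible way to do it rather than the definitive one.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame with real missing values.
df = pd.DataFrame({"Col_A": ["a", np.nan, np.nan, "c", "c", "d", "d"]})

# value_counts ignores NaN by default, so only a, c and d are counted.
counts = df["Col_A"].value_counts()

# idxmax returns a label whose count is the maximum (c and d are tied here).
fill_value = counts.idxmax()

# Passing a dict restricts the fill to the named column.
filled = df.fillna({"Col_A": fill_value})
print(fill_value)
print(filled)
```

Because c and d are tied, either may be chosen. If the tie-break matters, it is safer to pick the value explicitly than to rely on ordering.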

fillna + mode

df.Col_A.fillna(df.Col_A.mode()[0])
Out[963]: 
0    a
1    c
2    c
3    c
4    c
5    d
6    d
Name: Col_A, dtype: object
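Series.mode returns every tied value in sorted order, which is why mode()[0] gives c here. It returns an empty Series when the column has no non-missing values, in which case [0] would fail. A minimal sketch of a guarded version follows. The helper name fill_with_mode is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd


def fill_with_mode(s: pd.Series) -> pd.Series:
    """Fill missing values with the first mode, if the Series has one."""
    modes = s.mode()  # missing values are ignored by default
    if modes.empty:
        # Nothing to fill with: every value is missing.
        return s
    return s.fillna(modes.iloc[0])


example = pd.Series(["a", np.nan, np.nan, "c", "c", "d", "d"], name="Col_A")
result = fill_with_mode(example)
print(result.tolist())

# An all-missing column is returned unchanged instead of raising.
all_missing = pd.Series([np.nan, np.nan], dtype=object)
unchanged = fill_with_mode(all_missing)
```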

To handle the string 'None', you need to use replace and then fillna, much like @COLDSPEED suggests:

dr = df.Col_A.replace('None',np.nan)
dr.fillna(dr.dropna().value_counts().index[0])

Output:

0    a
1    d
2    d
3    c
4    c
5    d
6    d
Name: Col_A, dtype: object
