
Fill consecutive NaNs in a column of a dataframe

I have a dataframe with a column C. I want to fill each run of consecutive blanks with the same number, because later I need to group those rows.

For example:

A B C
1 2 NaN
1 2 NaN
1 2 3
1 2 NaN
1 2 NaN

The output I want is something like this:

A B C
1 2 1
1 2 1
1 2 3
1 2 2
1 2 2

I tried using shift() to compare neighbouring rows, but couldn't get the desired output.

You can use fillna with a new Series built by applying cumsum to a boolean mask:

df['C'] = df['C'].fillna(df['C'].notnull().cumsum() + 1)

print (df)
   A  B    C
0  1  2  1.0
1  1  2  1.0
2  1  2  3.0
3  1  2  2.0
4  1  2  2.0

Detail:

print (df['C'].notnull().cumsum())
0    0
1    0
2    1
3    1
4    1
Name: C, dtype: int32
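
To make the intermediate steps easier to follow, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question, and the name run_id is only an illustrative choice, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "A": [1] * 5,
    "B": [2] * 5,
    "C": [np.nan, np.nan, 3, np.nan, np.nan],
})

# Count the non-null values seen so far; the count only increases at a
# non-null row, so every NaN in the same run gets the same number.
run_id = df["C"].notna().cumsum() + 1

# fillna aligns on the index and replaces only the NaN rows.
filled = df["C"].fillna(run_id)
print(filled.tolist())  # [1.0, 1.0, 3.0, 2.0, 2.0]
```

Note that the generated labels are not guaranteed to differ from values already in the column (a real value of 1 or 2 in C would be indistinguishable from a label here), so this is best treated as a sketch to adapt if that matters for the later grouping.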

The function fillna is your solution:

dataframe['yourColumn'] = dataframe['yourColumn'].fillna(1)

You can also substitute any other value for the NaN values. For instance, you could use the column mean:

dataframe['yourColumn'] = dataframe['yourColumn'].fillna(dataframe['yourColumn'].mean())
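
As a small runnable illustration of the mean fill, the sketch below assigns the result of fillna back to the column without passing inplace=True. In the pandas versions I am aware of, fillna(..., inplace=True) returns None, so combining it with an assignment would overwrite the column. The sample values are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name comes from the answer.
dataframe = pd.DataFrame({"yourColumn": [1.0, np.nan, 3.0, np.nan]})

# mean() skips NaN by default, so this is (1.0 + 3.0) / 2.
col_mean = dataframe["yourColumn"].mean()

# Assign the returned Series back instead of using inplace=True.
dataframe["yourColumn"] = dataframe["yourColumn"].fillna(col_mean)
print(dataframe["yourColumn"].tolist())  # [1.0, 2.0, 3.0, 2.0]
```

A constant fill like this gives every NaN the same value, so on its own it does not tell separate runs of NaNs apart.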
