Fill consecutive NaNs in a column of a dataframe
I have a dataframe with a column C. I want to fill each run of consecutive blanks (NaN) with the same number, because I later need to group those rows.
For example:
A B C
1 2 NaN
1 2 NaN
1 2 3
1 2 NaN
1 2 NaN
The output I want is something like this:
A B C
1 2 1
1 2 1
1 2 3
1 2 2
1 2 2
I tried using shift() to compare rows, but did not get the desired output.
You can use fillna with a new Series built by applying cumsum to a boolean mask:
df['C'] = df['C'].fillna(df['C'].notnull().cumsum() + 1)
print (df)
A B C
0 1 2 1.0
1 1 2 1.0
2 1 2 3.0
3 1 2 2.0
4 1 2 2.0
Detail:
print (df['C'].notnull().cumsum())
0 0
1 0
2 1
3 1
4 1
Name: C, dtype: int32
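To try this end to end, here is a minimal self-contained sketch. It rebuilds the example frame from the question (the construction with np.nan is my assumption about how the blanks are stored) and then groups on the filled column, which is the stated goal. The names counter and group_sizes are only illustrative.

```python
import numpy as np
import pandas as pd

# Example frame from the question; blanks are assumed to be real NaN values
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1],
    "B": [2, 2, 2, 2, 2],
    "C": [np.nan, np.nan, 3, np.nan, np.nan],
})

# Running count of non-null values seen so far: 0, 0, 1, 1, 1
counter = df["C"].notnull().cumsum()

# NaN rows take counter + 1; rows that already hold a value keep it
df["C"] = df["C"].fillna(counter + 1)

# Each run of former NaNs now shares one label and can be grouped
group_sizes = df.groupby("C").size()
print(df)
print(group_sizes)
```

One caveat: the labels 1, 2, ... come from the counter, so they can coincide with real values already present in C. If that matters for your data, it may be safer to write the labels to a separate column instead of overwriting C.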
The function fillna is your solution. It returns a new Series, so assign the result back and do not pass inplace=True as well (with inplace=True the call returns None, at least in pandas versions up to 2.x, and the assignment would overwrite the column):
dataframe['yourColumn'] = dataframe['yourColumn'].fillna(1)
You can substitute any value you want for the NaN values. For instance, you could use the column mean:
dataframe['yourColumn'] = dataframe['yourColumn'].fillna(dataframe['yourColumn'].mean())
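A small sketch of both variants, assuming a hypothetical one-column frame (the data values are made up for illustration). It assigns the result of fillna rather than relying on inplace=True. Note that this approach gives every NaN the same value, so separate runs of NaNs would not be distinguished from each other.

```python
import numpy as np
import pandas as pd

# Hypothetical data; 'yourColumn' is the placeholder name used above
dataframe = pd.DataFrame({"yourColumn": [np.nan, 4.0, np.nan, 8.0]})

# Fill every NaN with a constant
filled_const = dataframe["yourColumn"].fillna(1)

# Fill every NaN with the column mean (mean() skips NaN by default)
filled_mean = dataframe["yourColumn"].fillna(dataframe["yourColumn"].mean())

print(filled_const.tolist())
print(filled_mean.tolist())
```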