
Pandas Series: fill NaNs up to a certain limit only

I have a dataset "artwork.csv": https://gitlab.com/IEA_ML_LAB/test/-/blob/80713d4823c4778d11468bcaf4a5223f6a160c88/artwork.csv

The 'year' column contains integer values and NaN.


I want to replace the first 100 NaN values with the text 'no date'. I tried different methods but didn't succeed.

The 'year' column has 1279 NaN values. I want to set the first 100 of those 1279 to 'no date'.


[Screenshot: the first 100 NaN values]

I tried the commands below. They don't produce any errors, but they don't modify the Series either:

df.loc[df.year.isnull(), 'year'].iloc[:100] = 'no date'
(df.loc[df.year.isnull(), 'year'].iloc[:100]).replace('NaN', 'no date', inplace=True)
(df.loc[df.year.isnull(), 'year'].iloc[:100]).transform(lambda x: 'no date')

Thanks in advance.

fillna has a limit argument, which you can set to 100:

df['year'] = df['year'].fillna('no date', limit=100)
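To see how limit behaves, here is a minimal sketch on a small made-up Series rather than the real artwork.csv data (the values and the limit of 2 are assumptions for illustration only). Only the first two NaN values, in row order, should be filled; the third stays NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 'year' column: three NaN values in total
s = pd.Series([1990.0, np.nan, np.nan, 2001.0, np.nan])

# Fill at most 2 NaN values, counted from the top of the Series
filled = s.fillna("no date", limit=2)

print(filled)
# The NaN in the last row is beyond the limit, so it is left untouched
print("NaN remaining:", filled.isna().sum())
```

With the real data, limit=100 would play the role that limit=2 plays here.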

There is no need to call iloc beforehand, as that would create an additional copy of the data.
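For comparison, the attempts in the question most likely fail because the chained df.loc[...].iloc[...] selection returns a temporary copy, so the assignment never reaches df. If you do want to select the rows explicitly, one possible sketch is to collect the index labels of the first NaN rows and assign through a single .loc call. This assumes the index labels are unique; the small DataFrame and the count of 2 are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for artwork.csv
df = pd.DataFrame({"year": [1990.0, np.nan, np.nan, 2001.0, np.nan]})

# Cast to object first so the column can hold strings next to numbers
# (assigning a string into a float64 column warns or fails in recent pandas)
df["year"] = df["year"].astype(object)

# Index labels of the first 2 NaN rows (this would be 100 for the real data)
first_nan_idx = df.index[df["year"].isna()][:2]

# One .loc call writes into df itself rather than into a temporary copy
df.loc[first_nan_idx, "year"] = "no date"

print(df)
```

This is more verbose than fillna with limit, which remains the simpler choice for this task.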

Be aware, though, that mixing strings and floats in one column is probably not the best option here: the column becomes object dtype, which severely hurts performance when handling the data.
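The dtype change can be checked directly. The sketch below uses a made-up Series (an assumption, not the real data) and also shows one way to get a numeric column back later with pd.to_numeric, which turns the 'no date' strings into NaN again.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 'year' column
s = pd.Series([1990.0, np.nan, np.nan, 2001.0, np.nan])
print(s.dtype)  # numeric dtype before filling

# After filling with a string, the Series holds mixed Python objects
filled = s.fillna("no date", limit=2)
print(filled.dtype)

# Convert back to numbers when needed; non-numeric entries become NaN
numeric = pd.to_numeric(filled, errors="coerce")
print(numeric.dtype)
```

If the numeric values matter for later calculations, it may be worth keeping the original column numeric and storing the 'no date' label in a separate column instead.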
