
Pandas read_csv fillna

I am reading time series data from a CSV file, and one data frame column is recorded at a different timestamp interval from the others. I can't get df.fillna(method = 'ffill').fillna(method = 'bfill') to work on it.

If I don't read the CSV file with keep_default_na=False, pandas fills the gaps with NaN, but I would like the gaps to be blank so I can use df.fillna(method = 'ffill').

import pandas as pd
import numpy as np

#read CSV file
df_raw = pd.read_csv('C:\\desktop\\combinedSP.csv', index_col='Date', parse_dates=True, keep_default_na=False)

df_raw.head()

df_raw2 = df_raw.fillna(method = 'ffill').fillna(method = 'bfill')

df_raw2.head()

It seems like no matter what I attempt, I can't fix the gaps in the column labeled OAT :(


Any tips are greatly appreciated. I have uploaded the CSV data file to my GitHub account.

When you pass keep_default_na=False, read_csv no longer converts the values it would normally parse as NaN:

By default the following values are interpreted as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan', '1.#IND', '1.#QNAN', 'N/A', 'NA', 'NULL', 'NaN', 'n/a', 'nan', 'null'.

In this case, it's not parsing the empty string '' as NaN; it keeps each gap as an empty string. fillna only fills NaN values, so it sees nothing to fill and leaves the blanks untouched.
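To see the difference, here is a minimal sketch. The inline CSV text and its SP column are made up for illustration (they are not the asker's combinedSP.csv); only the OAT column name and the read_csv arguments come from the question.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV: OAT is logged less often than SP,
# so some rows have an empty OAT field.
csv_text = (
    "Date,SP,OAT\n"
    "2019-01-01 00:00,1.0,\n"
    "2019-01-01 00:05,2.0,30.5\n"
    "2019-01-01 00:10,3.0,\n"
)

# Same call as in the question: empty fields stay as empty strings.
df_blank = pd.read_csv(io.StringIO(csv_text), index_col="Date",
                       parse_dates=True, keep_default_na=False)

# Default behaviour: empty fields are parsed as NaN.
df_nan = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

blank_count = int((df_blank["OAT"] == "").sum())       # gaps stored as ''
nan_count_blank = int(df_blank["OAT"].isna().sum())    # NaNs seen by fillna
nan_count_default = int(df_nan["OAT"].isna().sum())    # NaNs with defaults

print(blank_count, nan_count_blank, nan_count_default)
```

With keep_default_na=False the OAT column holds strings rather than floats, which is why fillna finds no missing values in it.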

Drop that keyword argument and the fillna calls ought to work.
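A rough sketch of the fixed version, again using a made-up inline CSV in place of the real file (the SP column and the values are assumptions). It uses ffill() and bfill(), which do the same job as fillna(method='ffill') and fillna(method='bfill'); recent pandas releases deprecate the method argument of fillna, so the dedicated methods are the safer spelling.

```python
import io
import pandas as pd

# Hypothetical sample data: OAT has gaps at the start, middle, and end.
csv_text = (
    "Date,SP,OAT\n"
    "2019-01-01 00:00,1.0,\n"
    "2019-01-01 00:05,2.0,30.5\n"
    "2019-01-01 00:10,3.0,\n"
    "2019-01-01 00:15,4.0,31.0\n"
    "2019-01-01 00:20,5.0,\n"
)

# No keep_default_na=False, so empty fields are read as NaN.
df_raw = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# Forward-fill first, then back-fill whatever is still missing at the top.
df_filled = df_raw.ffill().bfill()

print(df_filled)
```

The back-fill only matters for leading gaps, since forward-fill has no earlier value to copy into the first rows.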

