pandas df.fillna() does not replace NaN values

Question

I have a dataframe that looks like this (for clarity: this represents a df with 5 rows and 8 columns):

            BTC-USD_close  BTC-USD_volume  LTC-USD_close  LTC-USD_volume  \
time                                                                       
1528968660    6489.549805        0.587100      96.580002        9.647200   
1528968720    6487.379883        7.706374      96.660004      314.387024   
1528968780    6479.410156        3.088252      96.570000       77.129799   
1528968840    6479.410156        1.404100      96.500000        7.216067   
1528968900    6479.979980        0.753000      96.389999      524.539978  

            BCH-USD_close  BCH-USD_volume  ETH-USD_close  ETH-USD_volume  
time                                                                      
1528968660     871.719971        5.675361            NaN             NaN  
1528968720     870.859985       26.856577      486.01001       26.019083  
1528968780     870.099976        1.124300      486.00000        8.449400  
1528968840     870.789978        1.749862      485.75000       26.994646  
1528968900     870.000000        1.680500      486.00000       77.355759

I would like to replace the NaN values in the ETH-USD_close and ETH-USD_volume columns. However, when I call df.fillna(method='ffill', inplace=True), nothing seems to happen; the missing values are still there, and nothing changes in the columns when I step through the program with a debugger.

When I use df.isna() to check whether pandas interprets my NaN values correctly, this does seem to be the case. Here is the output of print(df.isna()) for the first few rows:

            BTC-USD_close  BTC-USD_volume  LTC-USD_close  LTC-USD_volume  \
time                                                                       
1528968660          False           False          False           False   
1528968720          False           False          False           False   
1528968780          False           False          False           False   
1528968840          False           False          False           False   
1528968900          False           False          False           False  

            BCH-USD_close  BCH-USD_volume  ETH-USD_close  ETH-USD_volume  
time                                                                      
1528968660          False           False           True            True  
1528968720          False           False          False           False  
1528968780          False           False          False           False  
1528968840          False           False          False           False  
1528968900          False           False          False           False

A call like df.dropna(inplace=True) does remove the entire row, but this is not what I want. Any suggestions?

EDIT: In case anyone wants to reproduce the problem, you can download the data from https://pythonprogramming.net/static/downloads/machine-learning-data/crypto_data.zip, unzip it, and run the following code in the same directory:

import pandas as pd

#Initialize empty df
main_df = pd.DataFrame()

ratios = ["BTC-USD", "LTC-USD", "BCH-USD", "ETH-USD"]
for ratio in ratios:
    #SET CORRECT PATH HERE
    dataset = f'crypto_data/{ratio}.csv'
    #Use f-strings so we know which close/volume is which
    df_ratio = pd.read_csv(dataset, names=['time', 'low', 'high', 'open', f"{ratio}_close", f"{ratio}_volume"])
    #Set time as index so we can join them on this shared time
    df_ratio.set_index("time", inplace=True)

    #ignore the other columns besides price and volume
    df_ratio = df_ratio[[f"{ratio}_close", f"{ratio}_volume"]]

    if main_df.empty:
        main_df = df_ratio
    else:
        main_df = main_df.join(df_ratio)


main_df.fillna(method='ffill', inplace=True) #THIS DOESN'T SEEM TO WORK

Answer 1

Ah.

You cannot ffill a NaN value if it is the first value of a series: it has no previous value to copy from.
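To make this concrete, here is a minimal sketch on a toy frame. The values loosely mirror the ETH-USD_close column from the question, and the extra gap in the third row is an assumption added purely for illustration. It uses df.ffill(), which does the same forward fill as fillna(method='ffill'); recent pandas versions deprecate the method= argument.

```python
import numpy as np
import pandas as pd

# Toy data (illustrative): a leading NaN plus one NaN in the middle.
df = pd.DataFrame(
    {"ETH-USD_close": [np.nan, 486.01, np.nan, 485.75]},
    index=pd.Index([1528968660, 1528968720, 1528968780, 1528968840], name="time"),
)

# Forward fill: each NaN takes the last valid value above it.
filled = df.ffill()

# The gap in the middle is filled from the row above it...
print(filled.loc[1528968780, "ETH-USD_close"])  # 486.01
# ...but the leading NaN has nothing above it, so it stays NaN.
print(filled["ETH-USD_close"].isna().tolist())  # [True, False, False, False]
```

So the forward fill does run; it just cannot touch a missing value in the very first row.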

Using .ffill().bfill() could solve this, but it might create false data, because the leading gap is filled with a value taken from a later timestamp.
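A short sketch of that chain, again on a toy frame whose values are copied from the first rows of the ETH-USD columns in the question (the frame itself is an illustrative assumption, not the asker's full data):

```python
import numpy as np
import pandas as pd

# Toy data (illustrative): the first row of both ETH-USD columns is missing.
df = pd.DataFrame(
    {
        "ETH-USD_close": [np.nan, 486.01, 486.00],
        "ETH-USD_volume": [np.nan, 26.019083, 8.449400],
    },
    index=pd.Index([1528968660, 1528968720, 1528968780], name="time"),
)

# Forward-fill first, then back-fill whatever is still missing at the top.
filled = df.ffill().bfill()

print(filled.isna().any().any())  # False
# The first row now carries the values of 1528968720, i.e. data from the future.
print(filled.iloc[0].tolist())  # [486.01, 26.019083]
```

If such look-ahead values are not acceptable, one possible alternative is to drop only the rows that are still NaN after the forward fill, for example with df.ffill().dropna().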

Answer 1 (accepted 2020-05-05 07:47:15)
